What is statistics? – Skewness & Kurtosis

Skewness and Kurtosis
Skewness and Kurtosis describe aspects of how a dataset is dispersed and, therefore, aspects of the shape of its probability distribution in relation to its mean. Skewness concerns the asymmetry of the distribution, while Kurtosis concerns the shape of its peak.

Skewness: Definition
Skewness refers to the degree of asymmetry in the dispersion of a dataset of real numbers and, therefore, in the probability distribution of that dataset in relation to its mean. However, quantifying this degree of asymmetry is hard and no single rule exists. Instead, several formulas and rules exist, with many exceptions. Traditionally, skewness is divided into positive skewness and negative skewness (see a related example in Graph Figure I), which are themselves hard to define precisely.

Sometimes, when most of the data in a dataset of real numbers is concentrated on the left and the dispersion extends further to the right (a longer right tail), it can be said that Positive Skewness exists or that the dataset is Right skewed. However, note that this description does not cover all cases.

Sometimes, when most of the data in a dataset of real numbers is concentrated on the right and the dispersion extends further to the left (a longer left tail), it can be said that Negative Skewness exists or that the dataset is Left skewed. However, note that this description does not cover all cases.

Skewness: Statistical Definition
Statistically, several formulas exist to quantify skewness, and which one is appropriate depends on several factors. The following formula is the sample version of Pearson’s moment coefficient of skewness:

Skew(X)=(\frac{n\sqrt{(n-1)}}{(n-2)})(\frac{\sum_{i=1}^{n}(X_{i}-\bar{X})^3}{(\sum_{i=1}^{n}(X_{i}-\bar{X})^2)^{3/2}})
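The formula above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function name sample_skewness and both datasets are hypothetical and do not come from the original text.

```python
import math

def sample_skewness(data):
    """Sample skewness, computed term by term as in the formula above."""
    n = len(data)
    if n < 3:
        raise ValueError("the formula needs at least 3 observations")
    mean = sum(data) / n
    sum_sq = sum((x - mean) ** 2 for x in data)   # sum of squared deviations
    sum_cu = sum((x - mean) ** 3 for x in data)   # sum of cubed deviations
    if sum_sq == 0:
        raise ValueError("all observations are identical; skewness is undefined")
    return (n * math.sqrt(n - 1) / (n - 2)) * (sum_cu / sum_sq ** 1.5)

# Hypothetical data: most values are small, one value lies far to the right.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]
# Mirror image of the same data: the long tail is now on the left.
left_tailed = [-x for x in right_tailed]

print(sample_skewness(right_tailed))  # a positive value
print(sample_skewness(left_tailed))   # the same magnitude with a negative sign
```

A dataset that is symmetric about its mean, such as [1, 2, 3, 4, 5], gives a value of zero because the cubed deviations cancel out.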

Graph Figure I: Examples of Positive Skewness and Negative Skewness

The value of Skewness
i) If the Skewness value is less than -1 (Skew(X)<-1) or greater than 1 (Skew(X)>1), this suggests that the Sample data are Negatively or Positively skewed, respectively, in relation to the Sample mean. It also suggests that the Sample data do not come from a Population with normal distribution properties.

ii) If the Skewness value is zero, this suggests that the Sample data are neither Negatively nor Positively skewed in relation to the Sample mean. This is consistent with the Sample data coming from a Population with normal distribution properties, although zero Skewness alone does not establish normality.
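As a small illustration of case ii), the sketch below applies the sample skewness formula given earlier to a hypothetical dataset made of two equal clusters. The data are symmetric about their mean, so the skewness is zero, yet the data clearly do not look like a normal distribution. The function name and the dataset are assumptions made only for this example.

```python
import math

def sample_skewness(data):
    """Sample skewness as in the formula given earlier (needs n >= 3)."""
    n = len(data)
    mean = sum(data) / n
    sum_sq = sum((x - mean) ** 2 for x in data)   # sum of squared deviations
    sum_cu = sum((x - mean) ** 3 for x in data)   # sum of cubed deviations
    return (n * math.sqrt(n - 1) / (n - 2)) * sum_cu / sum_sq ** 1.5

# Hypothetical data: two equal clusters, symmetric about the mean of 5.
two_clusters = [1, 1, 1, 9, 9, 9]

value = sample_skewness(two_clusters)
print(value)            # zero, because the cubed deviations cancel out
print(abs(value) > 1)   # False: the rule in case i) does not flag this sample
```

This is why a Skewness value of zero is best read as a sign of symmetry, not as proof that the data are normally distributed.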

Kurtosis: Definition
Kurtosis is a concept that tries to describe the shape of the peak of a probability distribution. The word comes from the Greek “κυρτός”, which means curved. Kurtosis is also hard to quantify. Traditionally, it is divided into three types, whose names are Greek words (see Graph Figure II):

  • The platykurtic type, in which the peak of the probability distribution is flattened, so it can be described as a “flat curve”. It has negative “Excess Kurtosis”, e.g. -1.20
  • The leptokurtic type, in which the peak of the probability distribution is sharp. It has positive “Excess Kurtosis”, e.g. +2.10
  • The mesokurtic type, in which the peak of the probability distribution lies between platykurtic and leptokurtic. It is the peak that the Standard Normal Distribution has. Therefore, its Excess Kurtosis is zero


Kurtosis: Statistical Definition
Statistically, no single formula can quantify all types of Kurtosis, and which formula to choose depends on several factors. The following formula is the sample version of Pearson’s moment coefficient of Kurtosis, expressed as Sample Excess Kurtosis, that is, Kurtosis in relation to the Kurtosis of a Normal Distribution. Only this Pearson formula is given here:

ExKurt(X)=(\frac{n(n+1)}{(n-1)(n-2)(n-3)})(\frac{\sum_{i=1}^{n}(X_{i}-\bar{X})^4}{s^4})-\frac{3(n-1)^2}{(n-2)(n-3)}

For large samples this Excess Kurtosis formula can be simplified, because the 1st fraction approaches 1/n and the 3rd fraction approaches 3 as n grows. The approximation is still rough around n=20 and improves as the sample gets larger:

ExKurt(X)=(\frac{1}{n})(\frac{\sum_{i=1}^{n}(X_{i}-\bar{X})^4}{s^4})-3

Symbol Explanation
\sum: The sum of the results obtained by applying the operations inside \sum to every member X_{i}.
\bar{X}: The mean of the sample
s: The Standard Deviation of the sample
n: The size of the sample
X_{i}: The i-th member of the given dataset; it takes the value of each member in turn until all members have been used.
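The two Excess Kurtosis formulas can be compared with a short Python sketch. It is a sketch under stated assumptions, not a reference implementation: the function names and the evenly spaced test data are hypothetical, and s is taken to be the sample Standard Deviation computed with n-1 in the denominator.

```python
def _fourth_power_ratio(data):
    """Return n and the sum of (X_i - mean)^4 divided by s^4."""
    n = len(data)
    mean = sum(data) / n
    s2 = sum((x - mean) ** 2 for x in data) / (n - 1)   # sample variance, s^2
    if s2 == 0:
        raise ValueError("all observations are identical; kurtosis is undefined")
    return n, sum((x - mean) ** 4 for x in data) / s2 ** 2

def excess_kurtosis_full(data):
    """Sample Excess Kurtosis with the full small-sample correction."""
    if len(data) < 4:
        raise ValueError("the formula needs at least 4 observations")
    n, ratio = _fourth_power_ratio(data)
    first = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    third = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return first * ratio - third

def excess_kurtosis_simplified(data):
    """Large-sample simplification: (1/n) * ratio - 3."""
    n, ratio = _fourth_power_ratio(data)
    return ratio / n - 3

# Hypothetical flat (uniform-like) data: the integers 1..n.
for n in (20, 100, 1000):
    data = list(range(1, n + 1))
    print(n, excess_kurtosis_full(data), excess_kurtosis_simplified(data))
```

For evenly spaced data like this, the full formula gives approximately -1.2, which is a platykurtic value. The simplified formula gives a somewhat lower value, and the gap between the two shrinks as n grows.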


Graph Figure II: Kurtosis: Platykurtic, Mesokurtic, and Leptokurtic