Testing normality for statistical analysis

Statistical errors are common in scientific publications. It is estimated that about 50% of published research articles contain at least one error in the statistical methods used. Minimizing statistical errors starts with selecting the most suitable statistical test to analyse the data. One of the main criteria in selecting a test is the distribution of the data.

If the sample comes from a normal distribution, parametric statistical tests can be used. If not, non-parametric tests must be used. However, statistical power is generally higher in parametric tests. Also, according to the central limit theorem, in large samples (> 30 or 40) the sampling distribution of the mean tends to be normal regardless of the shape of the data: the means of random samples drawn from any distribution with finite variance will themselves be approximately normally distributed. Therefore, if the sample size is very large (more than 100), I strongly recommend proceeding with the parametric tests, disregarding the distribution of the data. Although true normality is considered to be a myth, if the sample size is less than 30, I would recommend testing for normality and using non-parametric tests if the distribution is not normal. For small sample sizes, normality tests have little power to reject the null hypothesis, and therefore small samples most often pass normality tests. Thus, graphical methods come in handy in such situations. After carefully inspecting the graphical distributions and the statistical test results, you should decide which tests to use (parametric or non-parametric).
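The central limit theorem argument above can be checked with a small simulation. The following is only a minimal sketch in Python using the standard library; the exponential distribution, the sample size of 40, the number of repetitions and the seed are all assumptions chosen for illustration, not values taken from any data set discussed here. It compares the skewness of strongly skewed raw data with the skewness of the means of repeated samples drawn from the same distribution.

```python
import random
import statistics


def skewness(values):
    """Moment-based skewness: third central moment over the cubed population SD."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(42)  # fixed seed (arbitrary), so the run is repeatable

# Strongly right-skewed raw data: exponential distribution with mean 1.
raw = [rng.expovariate(1.0) for _ in range(5000)]

# Means of 5000 random samples, each of size 40, from the same distribution.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(40)])
    for _ in range(5000)
]

print(f"skewness of raw data:     {skewness(raw):.2f}")
print(f"skewness of sample means: {skewness(sample_means):.2f}")
```

The raw data should show a clearly positive skewness, while the skewness of the sample means should be much closer to zero, which is what the theorem predicts for the distribution of the mean (not for the data themselves).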

I have explained some of the methods of testing normality below. You do not need to perform all of the tests for a data set. I have attached R scripts with some instructions here. I prepared them for a friend, so the text might be a little bit difficult to understand. If you are using them and need any help, drop me a message or an email. If you are using SPSS or Minitab, you will easily find how to do these tests in the integrated tutorials of the software.

Visual methods of testing normality

The frequency distribution (histogram) – Check whether the distribution is bell shaped.

Stem-and-leaf plot – Similar to the histogram, but instead of frequencies it shows the actual data values.

Boxplot – If the boxplot is symmetric, with the median line at approximately the center of the box and with symmetric whiskers slightly longer than the subsections of the center box, this suggests that the data may have come from a normal distribution.

P-P plot (probability-probability plot) – Plots the cumulative probability of a variable against the cumulative probability of a particular distribution (in this case, the normal distribution). If the data are normally distributed, the result is a straight diagonal line.

Q-Q plot (quantile-quantile plot) – Similar to the P-P plot, but it plots the quantiles (values that split the data set into equal parts) of the data against the corresponding quantiles of the normal distribution. Q-Q plots are easier to interpret for large sample sizes.
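To make the Q-Q plot idea concrete, here is a rough sketch in Python (standard library only) of how the points of a normal Q-Q plot can be computed. The function names qq_points and qq_correlation are hypothetical helpers written for this illustration, the plotting positions (i - 0.5) / n are one common convention among several, and the simulated data sets are assumptions. Statistical packages may use slightly different conventions.

```python
import random
import statistics
from statistics import NormalDist


def qq_points(data):
    """Return (theoretical_quantile, observed_value) pairs for a normal Q-Q plot."""
    n = len(data)
    # Normal distribution with the same mean and standard deviation as the sample.
    fitted = NormalDist(statistics.fmean(data), statistics.stdev(data))
    ordered = sorted(data)
    # Plotting positions (i - 0.5) / n stay strictly between 0 and 1.
    return [(fitted.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(ordered, start=1)]


def qq_correlation(points):
    """Pearson correlation between theoretical and observed quantiles."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(7)  # arbitrary fixed seed
normal_data = [rng.gauss(50, 10) for _ in range(300)]
skewed_data = [rng.expovariate(0.1) for _ in range(300)]

normal_r = qq_correlation(qq_points(normal_data))
skewed_r = qq_correlation(qq_points(skewed_data))
print(f"Q-Q correlation, normal data: {normal_r:.3f}")
print(f"Q-Q correlation, skewed data: {skewed_r:.3f}")
```

Plotting the pairs returned by qq_points gives the Q-Q plot itself: points close to the diagonal suggest normality. The correlation is only a numerical summary of how straight the plot is, and it should be closer to 1 for the normal data than for the skewed data.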

Statistical tests for normality

Kolmogorov-Smirnov (K-S) test

Lilliefors corrected K-S test

Shapiro-Wilk test

Anderson-Darling test  

Cramer-von Mises test  

D’Agostino skewness test

Anscombe-Glynn kurtosis test

D’Agostino-Pearson omnibus test

Jarque-Bera test

The above-mentioned tests compare the scores in the sample to a normally distributed set of scores with the same mean and standard deviation. The null hypothesis is that the "sample distribution is normal". If the test is significant (p < 0.05 at the 5% significance level), the distribution is considered non-normal. The most commonly used normality tests are the K-S test, the Lilliefors corrected K-S test, the Shapiro-Wilk test and the Anderson-Darling test. The K-S test has low power and should not be seriously considered for testing normality, especially if the subsequent statistical test is concerned with the means. Moreover, it is not recommended when the parameters are estimated from the data, regardless of sample size. The Shapiro-Wilk test is based on the correlation between the data and the corresponding normal scores, and it provides better power than the K-S test and the Lilliefors test. Thode (2002) suggests the Shapiro-Wilk test as the best choice for testing the normality of data.
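As an illustration of how such a test turns data into a p-value, here is a minimal sketch of the Jarque-Bera test from the list above, written in Python with only the standard library. It is not the Shapiro-Wilk test recommended above; Jarque-Bera was chosen here only because its statistic and p-value have simple closed forms. The function name jarque_bera and the simulated data are assumptions for the example, and the p-value relies on a large-sample chi-square approximation, so it should not be trusted for small samples.

```python
import math
import random
import statistics


def jarque_bera(data):
    """Return (statistic, p_value) for the Jarque-Bera normality test."""
    n = len(data)
    mean = statistics.fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    # Under the null hypothesis JB is asymptotically chi-square with 2 degrees
    # of freedom, whose upper-tail probability is exp(-x / 2).
    return jb, math.exp(-jb / 2)


rng = random.Random(1)  # arbitrary fixed seed
normal_data = [rng.gauss(0, 1) for _ in range(300)]
skewed_data = [rng.expovariate(1.0) for _ in range(300)]

for label, data in (("normal", normal_data), ("skewed", skewed_data)):
    jb, p = jarque_bera(data)
    verdict = "reject normality" if p < 0.05 else "no evidence against normality"
    print(f"{label}: JB = {jb:.2f}, p = {p:.4f} -> {verdict}")
```

A p-value below 0.05 leads to rejecting the null hypothesis of normality, exactly as described above. In practice the dedicated functions of your statistical software are preferable to hand-written code like this.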