Question
Asked 1st Jul, 2014

If you had a data set which exhibited both non-normally distributed and normally distributed data, which statistical test would you use?

A parametric or non parametric test?

All Answers (36)

Ambrina Qureshi
Dow University of Health Sciences
How can a single data set show both a normal and a skewed distribution?
1 Recommendation
Gheeseong Lim
Newcastle University
Hi Ambrina, thanks for your reply.
Hm, how should I put this... I have a few variables to compare, and the data for each variable showed a different distribution.
1 Recommendation
RC Castrejón-Pérez
Instituto Nacional De Geriatría
Hello Gheeseong Lim!
The analysis must be designed according to the distribution of the dependent variable. Of course you must take into account the distribution of the independent variables; however, the main objective is to find an explanation for the distribution of the dependent variable.
I hope you find this useful.
Cheers
2 Recommendations
Hany Mohamed Aly Ahmed
University of Malaya
Hi Lim,
Yes, this could happen. If you are comparing a number of groups, and the data of some groups are homogeneous and others are not, then you have to count which groups are homogeneous and which are not. If the number of homogeneous groups is more than or equal to the number of non-homogeneous groups, then use a parametric test; if not, use a non-parametric one. The same concept applies if you are comparing different groups on several variables (count, then choose the test).
Please note that this issue is normal, especially in cytotoxicity assays such as the MTT assay, but not acceptable in others such as real-time PCR, where the SD is very low.
I hope this helps!
Cheers,
HMA Ahmed. 
15 Recommendations
Tatiana Fidalgo
Rio de Janeiro State University
Hi
It is really common. I asked my biostatistics professor this question some time ago. If you have to choose, it is better and more realistic to use a non-parametric analysis, because it is safer to treat parametric data as non-parametric than the opposite.
I hope I helped you.
Tatiana 
 
19 Recommendations
Gheeseong Lim
Newcastle University
Thank you all for the input. It is very useful indeed :))
1 Recommendation
Aaro Turunen
University of Turku
I'd agree with Tatiana on this: isn't it generally harder to find significance with non-parametric analyses (such as the U-test vs the t-test), and as such, wouldn't the significant results gained be even more believable when acquired using such a method (especially if the Materials and Methods section states that the data were at times normally distributed)?
1 Recommendation
Sumit Acharya
College of Physicians and Surgeons Pakistan
I have the same issue too, thank you for the information.
mark this question
Namrata *
Maulana Azad National Institute of Technology, Bhopal
This happened in my case, where one variable's data were non-normally distributed and the other two were normally distributed. When I did the analysis using only the non-normally distributed variable, I used a non-parametric test, whereas when I used only the normally distributed data, I used a parametric test. I would like to know whether it is right or wrong to use both parametric and non-parametric tests in one study for different variables.
3 Recommendations
Leon Islas-Weinstein
Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán
Adding to Namrata's question, I have a similar situation. In my experiment there are three time points (three different days of the experiment) and two independent groups (it is important to mention that the subjects are always different for each time point). When I use the Shapiro-Wilk's test to evaluate normality, most groups pass the test but some do not. Therefore, should I employ a non-parametric test as recommended by Tatiana, a parametric test (because most groups pass the normality test) as recommended by Hany, or a mix of both parametric and non-parametric tests as suggested by Namrata? I am sure that only one of these three options should be the more appropriate one. But which of the three is it? Thanks in advance!
1 Recommendation
Eugene Appenteng Osae
University of Houston College of Optometry
I am in the same situation, Leon. I will use a mix of both, but I have also read that "true normality is actually a myth". Further, I learned that when you have large enough sample sizes (n > 30 or 40), a "violation" of the normality assumption should not cause major problems; this implies that we have some liberty to apply parametric procedures even when the data are not normally distributed. This is connected to the central limit theorem: with a large enough sample, the sampling distribution of the mean will be approximately normal even if the data themselves are not. On this note, please check your sample size. Hope this helps. :)
1 Recommendation
Leon Islas-Weinstein
Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán
Thanks for your response, Eugene. From what I have investigated thus far, the best option of the three decision criteria is to use a combination of both parametric and non-parametric tests, as suggested by Namrata and you. When you are dealing with small groups, as in my case, normality plays an important role in the statistical analysis. First, you run all your groups through a non-parametric test, as suggested by Tatiana. But you don't stop there. You can further test all your groups for normality and homogeneity of variance, and the groups that pass both tests can be further subjected to a parametric test. This allows an additional opportunity to find a statistically significant difference in groups that did not reveal a statistically significant effect with the non-parametric test. Finally, it is very important to disclose in the Materials and Methods section that some groups (specify which groups) were subjected to both parametric and non-parametric tests, while others were subjected only to non-parametric tests because they did not meet the criteria for a parametric test.
6 Recommendations
The Central Limit Theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size gets larger, no matter what the shape of the population distribution (provided its variance is finite). A common rule of thumb treats sample sizes over 30 as large enough for this approximation.
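The central limit theorem statement above can be checked with a small simulation. This is only a sketch under assumptions that are not from the thread: the population is exponential with mean 1 (strongly right-skewed), and the helper names `sample_skewness` and `simulate_sample_means` are made up for illustration. The skewness of the sample means should shrink toward 0 as the sample size grows.

```python
import random
import statistics


def sample_skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def simulate_sample_means(sample_size, n_samples=2000, seed=1):
    """Means of n_samples samples drawn from an exponential population (mean 1).

    The exponential population is an assumption chosen only because it is
    clearly non-normal.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


if __name__ == "__main__":
    for size in (2, 10, 50):
        means = simulate_sample_means(size)
        print(size, round(sample_skewness(means), 2))
```

How quickly the skewness falls depends on how skewed the population is, so n > 30 is a rule of thumb rather than a guarantee.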
Laya Heidari Darani
Islamic Azad University Falavarjan Branch
Dear Prof. Gheeseong Lim ,
Thanks for your interesting question. Today I faced this problem, and the answers helped me a lot. I had two groups, one of which had normally distributed data and the other non-normally distributed data. I work in the social sciences, and this is common in our field as well. My major is Applied Linguistics.
Thank you and best wishes,
Laya
4 Recommendations
Srini Vasan
University of New Mexico
Excellent discussion from our colleagues. All points are well taken, including the use of parametric vs. non-parametric tests, sample size, the central limit theorem, etc. May I add one more idea? When data are not normally distributed, one can transform them. A common transformation is taking the logarithm of the variable. This makes highly skewed distributions more normal, so they can then be analysed using parametric tests. What if you transform all the dependent-variable data using a log transformation and then uniformly apply a parametric test, provided all the data become normal? Do you think this is a feasible solution?
4 Recommendations
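To illustrate the log-transformation idea above, here is a minimal sketch. The data are simulated from a log-normal distribution (an assumption made only so that the effect is visible), and skewness is used as a rough indicator of asymmetry. In practice one would re-run a formal normality check such as Shapiro-Wilk on the transformed values (for example with SciPy's `scipy.stats.shapiro`, if that library is available).

```python
import math
import random


def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum(((v - mean) / sd) ** 3 for v in values) / n


def log_transform(values):
    """Natural log of each value; only valid for strictly positive data."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform needs strictly positive values")
    return [math.log(v) for v in values]


# Hypothetical right-skewed measurements (simulated, not real data).
rng = random.Random(42)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]
logged = log_transform(raw)

print("skewness before:", round(skewness(raw), 2))
print("skewness after: ", round(skewness(logged), 2))
```

Note that a test on log-transformed data compares means on the log scale, so results should be interpreted and reported accordingly.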
Ali Bahadoran-Baghbaderani
Neurovision Language Teaching and Research Center
I hope that the following links can provide you with the answers you are looking for.
Thank you all for this useful information. I was beginning to think something was wrong with my data. Ali thank you for the readings!
1 Recommendation
Dionysus Tafiadis
University of Ioannina
I would apply a logarithmic transformation in order to normalize the sample, unless the sample size satisfied the central limit theorem, in which case I would treat the distribution as normal.
Pawan Sharma
University of Louisville
I agree with checking the normality of the distribution and running parametric and non-parametric tests based on the type of distribution. My question is related to the reporting of such data. For example, suppose I have four data pairs, of which two are normally distributed and the others are not. Should I be plotting the mean or the median in such a scenario?
Thanks!
Revathy Mani
UNSW Sydney
For Pawan Sharma's question, I think reporting the mean (SD) for normally distributed data and the median (IQR) for non-normally distributed data may be better.
Revathy Mani
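The reporting convention suggested above could be sketched as follows. The function name `summarise` and its `normal` flag are hypothetical; the flag is assumed to come from whatever normality check was run beforehand.

```python
import statistics


def summarise(values, normal):
    """Return 'mean (SD)' if the data were judged normal, else 'median (IQR)'.

    `normal` is supplied by the analyst; this function does not test normality.
    """
    if normal:
        mean = statistics.mean(values)
        sd = statistics.stdev(values)  # sample standard deviation
        return f"mean {mean:.2f} (SD {sd:.2f})"
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    median = statistics.median(values)
    return f"median {median:.2f} (IQR {q1:.2f} to {q3:.2f})"


print(summarise([2, 4, 4, 4, 5, 5, 7, 9], normal=True))
print(summarise([1, 2, 3, 4, 5, 6, 7], normal=False))
```

Quartile definitions differ slightly between software packages, so the IQR bounds may not match SPSS or R exactly.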
Alessio Facchin
Università degli Studi di Milano-Bicocca
I agree with all the answers, but there are other considerations. With large samples (my last case had n > 300), some normality tests (e.g. Shapiro-Wilk) tend to reject normality for trivial deviations (the test says the data are not normal even though the plot looks normal), and typically both parametric and non-parametric comparisons give the same result. A possible solution is to use a Bayesian approach.
However, it depends on your analysis and its structure. For ANOVA, what matters is NOT the normal distribution of the data but the normal distribution of the residuals.
Shno Koiek
University of Southern Denmark
Excellent discussion.
I am in an almost same situation;
I have tested 19 participants with a measurement using 11 different methods in two sessions (test and retest). In order to check the effects of method and session on the test results, I wanted to use a repeated-measures ANOVA (parametric analysis). But the data from two of the testing methods in the retest session are not normally distributed. Can I still use a repeated-measures ANOVA?
Thanks in advance.
Alessio Facchin
Università degli Studi di Milano-Bicocca
I think YES. With parametric tests you have the possibility to build a correct factorial design.
In the case of ANOVA (which is regression-based), the most important factor is NOT the distribution of the data, but the distribution of the residuals. This point is even more important in rmANOVA.
3 Recommendations
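The point about residuals can be made concrete with a small sketch. It assumes a simple one-way, between-groups layout with made-up numbers; a repeated-measures model would also remove each subject's own effect, which this sketch does not do. The helper name `anova_residuals` is hypothetical.

```python
import statistics


def anova_residuals(groups):
    """One-way ANOVA residuals: each observation minus its own group mean."""
    residuals = []
    for values in groups.values():
        group_mean = statistics.fmean(values)
        residuals.extend(v - group_mean for v in values)
    return residuals


# Hypothetical data: two groups with very different means.
groups = {
    "control": [9.8, 10.1, 10.4, 9.9, 10.3],
    "treated": [19.7, 20.2, 20.5, 19.9, 20.1],
}

# Pooled raw values fall into two clusters (around 10 and around 20),
# so they would never look normal as a single variable.
pooled = [v for values in groups.values() for v in values]

# The residuals are centred on 0 with a small spread; these are the values
# whose distribution the normality assumption is about.
residuals = anova_residuals(groups)

print("pooled range:   ", min(pooled), "to", max(pooled))
print("residual range: ", round(min(residuals), 2), "to", round(max(residuals), 2))
```

A normality check (plot or test) would then be applied to `residuals` rather than to `pooled`.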
Mehrdad Rabiei
Microbial Treasure
The repeated-measures test is a longitudinal one. Was your experiment conducted over different time periods?
Shno Koiek
University of Southern Denmark
Thank you Alessio Facchin
And yes, Mehrdad Rabiei, the test and retest were performed at different times (with some time interval between them).
Samar Ahmad
University of Manitoba
As a follow-up to the above discussion: I have a small sample size. I checked my data and most of the variables are not normally distributed, so I ran a non-parametric test and found a significant difference in the mean for two variables. What should I do next?
1- Check the normality of these variables ----> if not normally distributed --> log transform the data ---> then run a parametric test only for those variables?
1 Recommendation
Indrashis Podder
College of Medicine & Sagore Dutta Hospital
I have two sets of independent, continuous data. One group shows a normal distribution, the other doesn't. What should I use? The independent t-test or Mann-Whitney?
1 Recommendation
Himel Mondal
All India Institute of Medical Sciences Deoghar
Indrashis Podder
College of Medicine & Sagore Dutta Hospital
Thanks a lot @ Dr. Himel Mondal, the blog is super useful. Cleared many of my doubts. Thanks a lot
1 Recommendation
Antonino Bianco
Università degli Studi di Palermo
non-parametric analysis for sure
3 Recommendations
Katerina Gospodinova
University of Oxford
Could you please point me to a reference backing up the statement below?
'If you are comparing between a number of groups, and the data of some groups are homogenous and others are not, then you have to count which groups are homogenous and which are not. If the number of homogenous groups is more than or equal to non-homogenous, then use parametric, if not, use non-parametric. The same concept is applied if you are comparing between different groups with some variables (count then choose the test).'
Thank you very much for your help.
Best wishes,
Katerina
1 Recommendation
Sandeep Shinde
Krishna Institute Of Medical Sciences "Deemed to be University" Karad.
Non parametric analysis
Eshna Ramdhany Moonwessur
University of Technology Mauritius
Hello,
Did you get a reference for backing your statement? Else, if not, what test did you do if some were normally distributed and other not normally distributed?
Thanks
Sahar Abdelwahab
De Montfort University
Hi,
In Andy Field's book "Discovering Statistics Using SPSS", in the chapter on differences between several related groups (page 575), the Shapiro–Wilk normality test showed that two groups were not normally distributed and one group was. The author proceeded with a non-parametric test, the Friedman test.
2 Recommendations
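For readers unfamiliar with the Friedman test mentioned above, here is a rough sketch of how its statistic is computed for several related groups. The scores are made up, no tie correction is applied, and the p-value helper only covers the three-condition case (where the chi-square approximation has 2 degrees of freedom). With small samples, exact tables or a statistics package (for example SciPy's `scipy.stats.friedmanchisquare`) would be preferable.

```python
import math


def average_ranks(row):
    """Rank one subject's values (1 = smallest), averaging ranks for ties."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and row[order[end + 1]] == row[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos..end
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks


def friedman_statistic(rows):
    """Friedman chi-square for n subjects (rows) by k conditions (columns).

    No tie correction is applied in this sketch.
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    total = sum(s * s for s in rank_sums)
    return 12.0 / (n * k * (k + 1)) * total - 3.0 * n * (k + 1)


def p_value_three_conditions(q):
    """Chi-square upper tail for 2 degrees of freedom (k = 3 only)."""
    return math.exp(-q / 2.0)


# Hypothetical scores: 5 subjects, each measured under 3 related conditions.
scores = [
    [4.1, 5.0, 6.2],
    [3.8, 4.4, 5.9],
    [4.5, 4.9, 6.8],
    [3.9, 5.2, 6.0],
    [4.0, 4.7, 6.5],
]
q = friedman_statistic(scores)
print("Friedman statistic:", round(q, 2))
print("approximate p-value:", round(p_value_three_conditions(q), 4))
```

Because the test works on within-subject ranks, it does not require any of the groups to be normally distributed.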