Discussion
Started 12th Nov, 2022
Does anyone know how to find correlation between 3 variables?
I have 3 variables, consisting of:
- Duration (< 2 hours [short], 3 hours [mid], > 3 hours [long])
- Body posture (1 [safe], 2 [small risk], 3 [middle risk], 4 [very high risk])
- Pain (1 [low], 2 [mid], 3 [high], 4 [very high])
I want to analyze the correlation between duration and pain, and between posture and pain. So I presume that duration and posture are the independent variables. Should I use chi-square or Spearman? Thank you.
All replies (34)
University of Nevada, Las Vegas
Are you wanting a single value for the association among the three, or just the three pairwise ones?
Independent researcher
You have rank-ordered variables, so Spearman will be appropriate. For Duration and Body Posture, you may compute it within strata defined by the pain levels (adjusted). Alternatively, you may aggregate across the pain strata (a single value).
2 Recommendations
Fakultas Kedokteran Universitas Udayana
Jochen Wilhelm Hanna Mielniczuk Thank you very much. May I ask when I should use chi-square?
Fakultas Kedokteran Universitas Udayana
Daniel Wright I want to find the pairwise correlations, but I only need two of them:
- between duration and pain
- between posture and pain
1 Recommendation
Justus-Liebig-Universität Gießen
Chi² can be used with categorical variables carrying no ordinal information. It measures how much the frequency distributions of the categories differ. Spearman's rank correlation measures the monotone association of two at least ordinal variables (strictly, the variables should be continuous so that there are no ties in the ranks, but it is still not too wrong in the presence of ties). The monotone association gives you information about the direction of the correlation, which is available for ordinal or continuous variables and which is not analyzed by comparing frequency distributions (Chi²).
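For illustration, here is a minimal R sketch with made-up ordinal scores (purely hypothetical, not Nata's data) showing the two approaches side by side:
posture = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4)
pain    = c(1, 1, 2, 1, 2, 2, 2, 3, 3, 3, 4, 4)
# Chi-square test of independence: compares the frequency distributions
# and ignores the ordering of the categories (expect a warning here,
# because this toy table is small and sparse).
chisq.test(table(posture, pain))
# Spearman's rank correlation: uses the ordering and gives a signed
# measure of monotone association (R warns that the p-value is not
# exact because of ties).
cor.test(posture, pain, method = "spearman")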
2 Recommendations
Fakultas Kedokteran Universitas Udayana
Jochen Wilhelm Thank you very much.
Lakehead University Thunder Bay Campus
I noticed that Nata Vitha included SPPS as one of the relevant topics. If that was a typo for SPSS, then Nata will see in the output from the CROSSTABS command a Chi-square test for "linear-by-linear" association. It does assume/require that both variables are ordinal. What is less well known is that the actual values used for the categories matter. Why? Because this particular Chi-square is equal to (N-1)r². See these notes by the late David Howell for more info:
See also these pages from Sal Mangiafico's R Companion site for more info and additional options.
HTH.
3 Recommendations
Rutgers, The State University of New Jersey
Because you have ordinal variables that will have many ties, I think I would recommend using Kendall tau-b or tau-c correlation.
tau-c may also be called Stuart's tau-c.
Because the number of categories in e.g. Duration and Pain are not the same (3 vs. 4), tau-c is usually recommended.
Spearman correlation probably won't be much different, so if this is easier, it's probably fine to use that also. Likewise, tau-b and tau-c shouldn't be terribly different in this case.
I don't know how easy Spearman's rho, Kendall's tau-b, or Stuart's tau-c is to obtain in SPSS.
The following is example code in R. It can be run at https://rdrr.io/snippets/ without installing software. It produces Kendall's tau-b, Stuart's tau-c, and Spearman's rho, the 95% confidence interval for each, and the p-values for hypothesis tests of Kendall's tau-b and Spearman's rho.
Data = read.table(header=TRUE, stringsAsFactors=TRUE, text="
Posture Pain
Safe Low
Safe Low
Safe Low
Safe Low
Safe Mid
Safe High
SmallRisk Low
SmallRisk Low
SmallRisk Mid
SmallRisk Mid
SmallRisk Mid
SmallRisk High
MiddleRisk Low
MiddleRisk Mid
MiddleRisk Mid
MiddleRisk Mid
MiddleRisk High
MiddleRisk VeryHigh
HighRisk Low
HighRisk Mid
HighRisk High
HighRisk High
HighRisk VeryHigh
HighRisk VeryHigh
")
Data$Posture = factor(Data$Posture, levels=c('Safe', 'SmallRisk', 'MiddleRisk', 'HighRisk'))
Data$Pain = factor(Data$Pain, levels=c('Low', 'Mid', 'High', 'VeryHigh'))
Table = xtabs(~ Posture + Pain, data=Data)
Table
library(DescTools)   # for KendallTauB, StuartTauC, SpearmanRho
KendallTauB(Table, conf.level=0.95)     # Kendall's tau-b with 95% CI
StuartTauC(Table, conf.level=0.95)      # Stuart's tau-c with 95% CI
cor.test(as.numeric(Data$Posture), as.numeric(Data$Pain), method="kendall")   # p-value for tau-b
SpearmanRho(Table, conf.level=0.95)     # Spearman's rho with 95% CI
cor.test(as.numeric(Data$Posture), as.numeric(Data$Pain), method="spearman")  # p-value for rho
1 Recommendation
Lakehead University Thunder Bay Campus
Here is SPSS code for Sal's example. The NONPAR CORR command computes both tau-B and Spearman's rho. For CROSSTABS, the CHISQ option is needed to see the so-called test of linear-by-linear association. The other key words on the STATISTICS subcommand are (likely) self-explanatory.
NEW FILE.
DATASET CLOSE ALL.
DATA LIST LIST / Posture Pain (2F1).
BEGIN DATA
0 0
0 0
0 0
0 0
0 1
0 2
1 0
1 0
1 1
1 1
1 1
1 2
2 0
2 1
2 1
2 1
2 2
2 3
3 0
3 1
3 2
3 2
3 3
3 3
END DATA.
VALUE LABELS
Posture 0 "Safe" 1 "Low risk" 2 "Medium risk" 3 "High risk" /
Pain 0 "Low" 1 "Mid" 2 "High" 3 "Very high".
CROSSTABS
/TABLES=Posture BY Pain
/FORMAT=AVALUE TABLES
/STATISTICS=CHISQ GAMMA D BTAU CTAU
/CELLS=COUNT
/COUNT ROUND CELL.
NONPAR CORR
/VARIABLES=Posture Pain
/PRINT=BOTH TWOTAIL SIG LOWER
/MISSING=PAIRWISE.
1 Recommendation
Rutgers, The State University of New Jersey
Thanks, Bruce Weaver .
It turns out I have access to SPSS through the university (they keep changing this), so I was able to compare results.
At the time of writing, PSPP will give you the values for tau-c and tau-b without the p-values, but it does give you the T values and the standard error, so you could easily calculate the p-values and confidence intervals. (Not sure why the last step isn't implemented.)
For R users, there's an answer here that will give you the p-value for tau-c. It's pretty simple. (But, as written, you have to take the absolute value of tau and its confidence interval limits if tau is negative).
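For what it's worth, one possible way to get an approximate p-value for tau-c in R (not necessarily the same as the linked answer) is to back out a standard error from the Wald confidence interval that DescTools::StuartTauC() returns and form a z test; a minimal sketch, reusing the Table object from my earlier code:
library(DescTools)
res  = StuartTauC(Table, conf.level = 0.95)   # returns tauc, lwr.ci, upr.ci
tauc = res[1]
se   = (res[3] - res[2]) / (2 * qnorm(0.975)) # standard error backed out of the CI
z    = tauc / se
p    = 2 * pnorm(-abs(z))                     # two-sided p-value
c(tauc = unname(tauc), z = unname(z), p.value = unname(p))
Using abs(z) here also sidesteps the sign issue mentioned above when tau is negative.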
P.S.
For PSPP, I had to change the first couple of lines to the following. I really don't understand SPSS language, but this worked.
DATA LIST / Posture 1 (A) Pain 3 (A).
BEGIN DATA.
And then you can change the labels command.
VALUE LABELS
Posture "0" "Safe" "1" "Low risk" "2" "Medium risk" "3" "High risk" /
Pain "0" "Low" "1" "Mid" "2" "High" "3" "Very high".
1 Recommendation
University of Calabar
There are many ways to go about it. First, you could find the average or sum of all items per variable to obtain continuous-level data. Pearson correlation could then be performed on the aggregated data.
1 Recommendation
Rutgers, The State University of New Jersey
Hi, Raid Amin . Probably this is what Agresti calls the "linear-by-linear" test. It is called "Linear-by-Linear Association" by SPSS and in some packages in R. Confusingly, I think SAS and some R packages use terms like "Cochran-Mantel-Haenszel" or "Mantel-Haenszel" for this test.
1 Recommendation
Justus-Liebig-Universität Gießen
If the data are paired, the McNemar test can be used. The CMH-test is a generalization where the pairing is done via stratification (within each stratum an odds ratio is calculated from a 2x2 table, and these are weighted and combined). I think McNemar's test is the one that applies to Nata's problem.
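For readers who want to see what these two tests look like in practice, here is a minimal R sketch with made-up 2x2 counts (purely hypothetical, not Nata's data):
# McNemar: paired binary data, e.g. pain (yes/no) before vs. after an
# intervention on the same subjects.
Paired = matrix(c(20,  5,
                  12, 13),
                nrow = 2, byrow = TRUE,
                dimnames = list(Before = c("Pain", "NoPain"),
                                After  = c("Pain", "NoPain")))
mcnemar.test(Paired)
# Cochran-Mantel-Haenszel: a 2x2 exposure-outcome table within each
# stratum, combined across strata (two made-up strata here).
Strata = array(c(10, 4, 6, 12,
                  8, 5, 7, 11),
               dim = c(2, 2, 2),
               dimnames = list(Exposure = c("Long", "Short"),
                               Outcome  = c("Pain", "NoPain"),
                               Stratum  = c("S1", "S2")))
mantelhaen.test(Strata)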
1 Recommendation
Lakehead University Thunder Bay Campus
Raid Amin, in my Nov 13 post, I gave this link to the late David Howell's notes regarding the ordinal Chi-square you are talking about:
Notice that this statistic = (N-1)r². Therefore, the numeric values used for the different categories matter. E.g., suppose you have a 5-point item with labels Strongly disagree, Disagree, Neutral, Agree, Strongly agree. People often use numeric values 1-5 to code these categories. But suppose you believe that these numbers better reflect the "distances" between the categories: SD=1, D=2, N=3.5, A=5, SA=6. Pearson's Chi-square would be the same for both sets of numeric codes, but the ordinal Chi-square would not. Make up an example and try it! HTH.
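In that spirit, here is a small simulated R example (entirely made-up data, just to "make up an example and try it"): the Pearson chi-square computed from the table does not change when the category scores change, but the ordinal statistic (N-1)r² does.
set.seed(1)
item  = sample(1:5, 80, replace = TRUE, prob = c(0.1, 0.2, 0.3, 0.25, 0.15))
group = rbinom(80, 1, plogis(-1 + 0.5 * item))   # binary variable associated with the item
# Pearson chi-square uses only the table of counts, so recoding the
# categories changes nothing (it may warn about small expected counts).
chisq.test(table(item, group))
# Ordinal (linear-by-linear) chi-square = (N - 1) * r^2, where r is the
# Pearson correlation of the numeric scores.
N = length(item)
scores1 = c(1, 2, 3, 4, 5)[item]      # default 1-5 coding
scores2 = c(1, 2, 3.5, 5, 6)[item]    # alternative spacing
(N - 1) * cor(scores1, group)^2       # ordinal chi-square under the first coding
(N - 1) * cor(scores2, group)^2       # a different value under the second coding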
1 Recommendation
Fakultas Kedokteran Universitas Udayana
Thank you very much for the information. Really appreciate the feedback and it really helps Bruce Weaver Raid Amin Sal Mangiafico Valentine Joseph Owan Jochen Wilhelm
Fakultas Kedokteran Universitas Udayana
Bruce Weaver I apologize, I didn't realize until I read your answer. Yes, what I meant is SPSS. Once again, thank you very much for your help!
Lakehead University Thunder Bay Campus
Hi Sal Mangiafico. I found an SPSS syntax file from 2016 in which I did a demonstration of the effect of different coding schemes on the linear-by-linear Chi². I cleaned up the code a bit to make it more presentable--see attached. For those who do not have SPSS, the output can be viewed in the attached PDF.
And before anyone busts my chops, yes, I know it is a bad idea to use age groups when one has the actual ages. Age group was just a convenient variable to use to demonstrate the effect of coding on the ordinal (or linear-by-linear) Chi².
Necmettin Erbakan Üniversitesi
In my opinion, correlation is not a suitable word when used for categorical variables. It suits, and was historically introduced for, continuous variables.
When talking about categorical variables, it would be better to use the word association. I also believe that what the poster has in mind is a question of association rather than correlation.
Lakehead University Thunder Bay Campus
Hi Mehmet Sinan Iyisoy. I do agree with you that it is generally better practice to only say correlation when one is talking about Pearson r or one of the other common correlation coefficients. This can help to prevent confusion. However, there is a common expression most of us know that uses the word correlation in a much more general sense (IMO):
- Correlation does not imply causation.
IMO, this statement translates to:
- Statistical covariation on its own does not logically imply causation.
In other words, it is not referring specifically and only to specific correlation coefficients that measure linear association. YMMV. ;-)
Rutgers, The State University of New Jersey
Probably for categorical data, "association" is a more typical term than "correlation". Though Kendall's tau-b or tau-c is often used for categorical data and is often called "correlation". In contrast, the linear-by-linear test is often called "association". For a 2 x 2 table, phi is sometimes called a "correlation" coefficient, but is usually called a measure of "association". Upshot: "association" is probably a safer term for categorical data, and won't confuse the reader into thinking that Pearson correlation was used.
Eskisehir Osmangazi University
Hi,
There is a PhD thesis titled “Essays on Market Reaction and Cryptocurrency” by Toan Luu Duc Huynh (January 25, 2022). This thesis may be useful for your needs.
S. Ç.
Rutgers, The State University of New Jersey
Hi, Salih Celebioglu . Can you let us know how that thesis applies to this question? I looked at it briefly. I didn't find anything related to ordinal variables. There is the use of wavelet correlation, but I don't see how this would apply to this situation.
Eskisehir Osmangazi University
Dear Mangiafico,
I just saw the title of the question on ResearchGate today. Sorry, I don't know the details of your discussion. I have shown this source (thesis) in case it may be of use to you. I can't be of any further help as I'm busy these days. But sometimes I also face such problems. As an example, I can give the question of how I could calculate a correlation between the rankings of the champion teams in the top football leagues of different countries over time. For now, I will not be able to continue this discussion due to other commitments. I hope you don't mind.
Rutgers, The State University of New Jersey
Hi, Salih Celebioglu . No worries. ~ Sal
Texas A&M University-Commerce
To find the correlation between three variables, you can use multiple correlation analysis. Multiple correlation analysis is used to determine the relationship between a dependent variable and two or more independent variables.
Justus-Liebig-Universität Gießen
Chuck A Arize , in a correlation analysis there is no "dependent" and "independent" variable. The question remains: what is the correlation between three (or more) variables? And shouldn't your "multiple correlation analysis" (whatever this is meant to be!) take care of the kind of values (nominal, ordered, metric) and functional relationships between the variables (monotone, linear, periodic, etc.)?
2 Recommendations
Similar questions and discussions
Does the two-sample t-test provide a valid solution to practical problems?
Hening Huang
Due to growing concerns about the replication crisis in the scientific community in recent years, many scientists and statisticians have proposed abandoning the concept of statistical significance and null hypothesis significance testing procedure (NHSTP). For example, the international journal Basic and Applied Social Psychology (BASP) has officially banned the NHSTP (p-values, t-values, and F-values) and confidence intervals since 2015 [1]. Cumming [2] proposed ‘New Statistics’ that mainly includes (1) abandoning the NHSTP, and (2) using the estimation of effect size (ES).
The t-test, especially the two-sample t-test, is the most commonly used NHSTP. Therefore, abandoning the NHSTP means abandoning the two-sample t-test. In my opinion, the two-sample t-test can be misleading; it may not provide a valid solution to practical problems. To understand this, consider a well-posed example originally given in a textbook by Roberts [3]. Two manufacturers, denoted by A and B, are suppliers for a component. We are concerned with the lifetime of the component and want to choose the manufacturer that affords the longer lifetime. Manufacturer A supplies 9 units for a lifetime test. Manufacturer B supplies 4 units. The test data give sample means of 42 and 50 hours, and sample standard deviations of 7.48 and 6.87 hours, for the units of manufacturers A and B respectively. Roberts [3] discussed this example with a two-tailed t-test and concluded that, at the 90% level, the samples afford no significant evidence in favor of either manufacturer over the other. Jaynes [4] discussed this example with a Bayesian analysis. He argued that our common sense tells us immediately, without any calculation, that the test data constitute fairly substantial (although not overwhelming) evidence in favor of manufacturer B.
For this example, in order to choose between the two manufacturers, what we really care about is (1) how likely it is that the lifetime of manufacturer B's components (individual units) is greater than the lifetime of manufacturer A's components, and (2) on average, how much greater the lifetime of manufacturer B's components is than the lifetime of manufacturer A's components. However, according to Roberts' two-sample t-test, the difference between the two manufacturers' components is labeled as "insignificant". This label does not answer these two questions. Moreover, the true meaning of the p-value associated with Roberts' t-test is not clear.
I recently revisited this example [5]. I calculated the exceedance probability (EP), i.e. the probability that the lifetime of manufacturer B's components (individual units) is greater than the lifetime of manufacturer A's components. The result is EP(XB > XA) = 77.8%. In other words, the lifetime of manufacturer B's components is greater than the lifetime of manufacturer A's components at odds of 3.5:1. I also calculated the relative mean effect size (RMES). The result is RMES = 17.79%. That is, the mean lifetime of manufacturer B's components is greater than the mean lifetime of manufacturer A's components by 17.79%. Based on the values of the EP and RMES, we should have a preference for manufacturer B. In my opinion, the meaning of the exceedance probability (EP) is clear and without confusion; even a person not trained in statistics can understand it. The exceedance probability (EP) analysis, in conjunction with the relative mean effect size (RMES), provides a valid solution to this example.
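As a rough illustration of where a number of that order comes from (this is not the method of [5], which treats the sampling uncertainty more carefully): if the two lifetimes are treated as independent, approximately normal variables with the reported sample means and standard deviations simply plugged in, a back-of-envelope calculation in R already lands in the same ballpark.
mA = 42; sA = 7.48   # manufacturer A: sample mean and SD (hours)
mB = 50; sB = 6.87   # manufacturer B: sample mean and SD (hours)
EP = pnorm((mB - mA) / sqrt(sA^2 + sB^2))
EP                   # about 0.78, close to the 77.8% reported in [5]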
[1] Trafimow D and Marks M 2015 Editorial Basic and Applied Social Psychology 37 1-2
[2] Cumming G 2014 The New Statistics Psychological Science 25(1) DOI: 10.1177/0956797613504966
[3] Roberts N A 1964 Mathematical Methods in Reliability Engineering McGraw-Hill Book Co. Inc. New York
[4] Jaynes E T 1976 Confidence intervals vs Bayesian intervals in Foundations of Probability Theory, Statistical Inference and Statistical Theories of Science, eds. Harper and Hooker, Vol. II, 175-257, D. Reidel Publishing Company Dordrecht-Holland
[5] Huang H 2022 Exceedance probability analysis: a practical and effective alternative to t-tests. Journal of Probability and Statistical Science, 20(1), 80-97. https://journals.uregina.ca/jpss/article/view/513
Can the use of a parametric test (ANOVA) after a failed normality test be justified?
Kevin Harris
I am using Prism to carry out some two-way ANOVA tests. I have a series of fairly small datasets that I want to analyse the same way. But a few of the datasets do not pass the Shapiro-Wilk normality test. Reading the Prism Guide, it seems like I would be justified in still carrying out an ANOVA, seeing as the Q-Q plots look like they do not deviate too far from a normal distribution.
However I am struggling to find examples in the literature of this being done. I am wondering how I would justify this in a thesis/paper and if the following would be acceptable:
"Before undergoing an ANOVA, the each dataset was tested for Gaussian distribution using the Shapiro-Wilk normality test (ɑ = 0.05). Although not all datasets passed, majority of datasets in the series were normally distributed. Of those that failed, Q-Q was assessed and no major violations were detected. As statistical tests are generally robust to mild violations, and to maintain consistency across datasets, two-way ANOVA was carried out."
Although many manuals and websites state that ANOVA is robust, there don't seem to be any peer-reviewed references for two- or three-way ANOVA, but I can find a couple of references for one-way and RM ANOVA (PMID 36695847 & 29048317). If someone could supply a reference to justify my use, that would be great.
Excerpt from Prism Guide (https://www.graphpad.com/guides/prism/latest/statistics/stat_interpreting_results_normality.htm):
What should I conclude if the P value from the normality test is low?
The null hypothesis is that the data are sampled from a Gaussian distribution. If the P value is small enough, you reject that null hypothesis and so accept the alternative hypothesis that the data are not sampled from a Gaussian population. The distribution could be close to Gaussian (with large data sets) or very far from it. The normality test tells you nothing about the alternative distributions.
If your P value is small enough to declare the deviations from the Gaussian ideal to be "statistically significant", you then have four choices:
- The data may come from another identifiable distribution. If so, you may be able to transform your values to create a Gaussian distribution. For example, if the data come from a lognormal distribution, transform all values to their logarithms.
- The presence of one or a few outliers might be causing the normality test to fail. Run an outlier test. Consider excluding the outlier(s).
- If the departure from normality is small, you may choose to do nothing. Statistical tests tend to be quite robust to mild violations of the Gaussian assumption.
- Switch to nonparametric tests that don’t assume a Gaussian distribution. But the decision to use (or not use) nonparametric tests is a big decision. It should not be based on a single normality test and should not be automated.
Don't use this approach: First perform a normality test. If the P value is low, demonstrating that the data do not follow a Gaussian distribution, choose a nonparametric test. Otherwise choose a conventional test.
Prism does not use this approach, because the choice of parametric vs. nonparametric is more complicated than that.
- Often, the analysis will be one of a series of experiments. Since you want to analyze all the experiments the same way, you cannot rely on the results from a single normality test.
- Many biological variables follow lognormal distributions. If your data are sampled from a lognormal distribution, the best way to analyze the data is to first transform to logarithms and then analyze the logs. It would be a mistake to jump right to nonparametric tests, without considering transforming.
- Other transforms can also be useful (reciprocal) depending on the distribution of the data.
- Data can fail a normality test because of the presence of an outlier. Removing that outlier can restore normality.
- The decision of whether to use a parametric or nonparametric test is most important with small data sets (since the power of nonparametric tests is so low). But with small data sets, normality tests have little power to detect non-gaussian distributions, so an automatic approach would give you false confidence.
- With large data sets, normality tests can be too sensitive. A low P value from a normality test tells you that there is strong evidence that the data are not sampled from an ideal Gaussian distribution. But you already know that, as almost no scientifically relevant variables form an ideal Gaussian distribution. What you want to know is whether the distribution deviates enough from the Gaussian ideal to invalidate conventional statistical tests (that assume a Gaussian distribution). A normality test does not answer this question. With large data sets, trivial deviations from the ideal can lead to a small P value.
The decision of when to use a parametric test and when to use a nonparametric test is a difficult one, requiring thinking and perspective. This decision should not be automated.
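To illustrate the transform-first advice from the excerpt, here is a small simulated R sketch (a one-way layout for simplicity rather than the two-way design described above, and entirely made-up data): lognormal responses tend to fail Shapiro-Wilk on the raw scale but look acceptably Gaussian after taking logs, and the ANOVA is then run on the logs.
set.seed(42)
group = factor(rep(c("A", "B", "C"), each = 15))
y     = rlnorm(45, meanlog = c(1.0, 1.3, 1.6)[group], sdlog = 0.8)
shapiro.test(residuals(aov(y ~ group)))        # typically a small p-value: raw scale is suspect
shapiro.test(residuals(aov(log(y) ~ group)))   # typically a large p-value: logs look Gaussian
summary(aov(log(y) ~ group))                   # ANOVA on the log scale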