# Which test do I use to estimate the correlation between an independent categorical variable and a dependent continuous variable?

Is it fair to assume that, if an ANOVA or Kruskal-Wallis test with an independent categorical variable and a dependent continuous variable shows no significance, there is no "correlation" between the two variables? For two continuous variables you can perform a Pearson or Spearman correlation test, but I am not sure which test to use in the situation described above.

## Popular Answers

Emmanuel Curis · Université René Descartes - Paris 5

Let's say X is your independent categorical variable and Y your dependent, continuous variable.

First of all, strictly speaking, a test will not estimate anything; it just gives a kind of yes/no answer, here "there is / there is not an association/correlation between X and Y".

Second, if X is categorical, speaking of correlation is something of an abuse of language, since correlation is defined in terms of means and categorical variables do not have a mean. Speaking of association is better.

To answer your question specifically: for ANOVA and Kruskal-Wallis, the null hypothesis is that the two variables are independent (ANOVA: Y is Gaussian and has the same variance and mean for each X value; KW: Y has the same distribution function for each X value), not forgetting the tests' assumptions!
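As a rough illustration of running both tests, here is a minimal sketch in Python. It assumes SciPy is available; the data and the group labels `a`, `b`, `c` are made up for the example and are not from the question.

```python
from scipy import stats

# Hypothetical data: Y (continuous) measured within three categories of X.
groups = {
    "a": [4.1, 5.0, 4.8, 5.3, 4.6],
    "b": [6.2, 6.8, 7.1, 6.5, 6.9],
    "c": [5.1, 5.5, 4.9, 5.8, 5.2],
}

# One-way ANOVA: compares the group means (assumes Gaussian Y, equal variances).
f_stat, p_anova = stats.f_oneway(*groups.values())

# Kruskal-Wallis: rank-based, compares the distributions of Y across groups.
h_stat, p_kw = stats.kruskal(*groups.values())

print(f"ANOVA:          F = {f_stat:.2f}, p = {p_anova:.4g}")
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_kw:.4g}")
```

Both calls return only a test statistic and a p-value, which is the point made above: they answer a yes/no question and do not, by themselves, estimate the strength of the association.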

Hence, a significant result proves that Y and X are dependent.

However, a non-significant result may not be enough to prove independence, since not rejecting the null hypothesis does not prove that it is true in any way; in fact, it does not prove anything at all.
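A small simulation can make this concrete. The sketch below assumes NumPy and SciPy; the group means, sample size and number of replicates are arbitrary choices for illustration. Y truly depends on X by construction, yet with small samples the ANOVA rejects the null hypothesis in only a minority of the simulated data sets.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Assumed setup: three groups, and Y really does depend on X (one mean is shifted).
true_means = [0.0, 0.3, 0.0]
n_per_group = 10
n_sim = 2000
alpha = 0.05

rejections = 0
for _ in range(n_sim):
    # Draw one small data set under the (true) alternative hypothesis.
    samples = [rng.normal(loc=m, scale=1.0, size=n_per_group) for m in true_means]
    _, p_value = stats.f_oneway(*samples)
    if p_value < alpha:
        rejections += 1

rejection_rate = rejections / n_sim
print(f"Share of simulated data sets with p < {alpha}: {rejection_rate:.3f}")
```

Every non-significant result in this simulation comes from data where X and Y are in fact associated, so a non-significant test on its own says little about independence.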

To _estimate_ the correlation/association, I think you should first define your question more precisely.

Both ANOVA and the K-W test are basically _tests_; an estimate of the strength of the association can be derived from them, but which one is most useful depends strongly, I think, on the exact problem you are working on.
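One possible such estimate, given here only as an example and not necessarily the one best suited to your problem, is eta squared from the ANOVA decomposition: the share of the total variability of Y that lies between the categories of X. A minimal sketch in plain Python, with made-up data and a helper name (`eta_squared`) chosen for this example:

```python
def eta_squared(groups):
    """Between-group sum of squares divided by total sum of squares."""
    all_values = [y for g in groups for y in g]
    grand_mean = sum(all_values) / len(all_values)
    # Variability of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Total variability of Y around the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)
    return ss_between / ss_total


# Hypothetical data: one list of Y values per category of X.
groups = [
    [4.1, 5.0, 4.8, 5.3, 4.6],
    [6.2, 6.8, 7.1, 6.5, 6.9],
    [5.1, 5.5, 4.9, 5.8, 5.2],
]

eta2 = eta_squared(groups)
print(f"eta squared = {eta2:.3f}")
```

The value lies between 0 (group means all equal) and 1 (no variability within groups). It relies on the same assumptions as the ANOVA itself.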

The tetrachoric/polychoric correlation is made for two categorical variables (read the link given in Luis's post), hence it cannot be used here without raising the problem of defining classes for your continuous variable.

Jeffrey A Welge· University of Cincinnati