

Question

Asked 7th May, 2017

Suppose I have several Pearson correlation coefficient values, such as

r = [.50 .67 .55 .60 .80 .70 .51 .63].

Is it meaningful to discuss the average of these correlation coefficients? That is, is it meaningful to report the average Pearson correlation coefficient, r = 0.62? Thanks.

The main reason for not averaging correlation coefficients in the arithmetic sense follows.

Correlation coefficients cannot be averaged arithmetically because they are not additive in the arithmetic sense. A correlation coefficient is a cosine, and cosines are not additive. This can be verified by mean-centering a paired data set and computing the cosine similarity between the vectors representing the two variables.
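The cosine identity described above is easy to check numerically. A minimal sketch, using a small hypothetical paired data set (the values are illustrative only):

```python
import numpy as np

# Hypothetical paired data, for illustration only.
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 2.0, 7.0, 9.0])

# Mean-center each variable.
xc = x - x.mean()
yc = y - y.mean()

# Cosine similarity between the centered vectors...
cosine = np.dot(xc, yc) / (np.linalg.norm(xc) * np.linalg.norm(yc))

# ...equals the Pearson correlation coefficient.
r = np.corrcoef(x, y)[0, 1]

print(cosine, r)  # the two values agree
```

Since each r is the cosine of an angle between centered vectors, averaging the r values is averaging cosines, which does not correspond to any meaningful "average angle" in general.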

If a paired data set violates the bivariate normality assumption (often overlooked, as Seifert correctly asserted), that worsens the picture. However, even if it doesn't violate bivariate normality, the computed average is a mathematically invalid exercise. If a meta-analysis is based on such averages, its results can easily be challenged on these grounds.

Sample-size weighting is a good start, as Seifert asserted, but we can do better. We may compute self-weighted averages from one, several, or all of the constituent terms of a correlation coefficient, to account for types of variability information present in the paired data that would otherwise be ignored by sample-size weighting alone or by applying Fisher transformations. Which self-weighting scheme to use depends on the source of variability information to be considered (https://www.tandfonline.com/doi/abs/10.1080/03610926.2011.654037).


You can calculate an average correlation coefficient but NOT by simply calculating the mean of the coefficients. You first need to transform each correlation coefficient using Fisher's Z, calculate the mean of the z values, then back-transform to the correlation coefficient.
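The Fisher z procedure described above can be sketched in a few lines, using the correlation values from the question:

```python
import numpy as np

# The correlations from the question.
r = np.array([0.50, 0.67, 0.55, 0.60, 0.80, 0.70, 0.51, 0.63])

# Fisher z-transform: z = arctanh(r) = 0.5 * ln((1 + r) / (1 - r)).
z = np.arctanh(r)

# Average in z-space, then back-transform with tanh.
r_avg = np.tanh(z.mean())

print(round(r_avg, 3))  # ~0.63, slightly above the arithmetic mean of 0.62
```

The back-transformed value exceeds the naive arithmetic mean because the z-transform stretches the scale near ±1, counteracting the compression of the correlation scale there.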


Hi Abdur,

The answer to your question is no. You are working in a probabilistic space, where most ordinary arithmetic operations have no meaning. A brief summary of the mathematics appropriate for working with random variables can be found in this paper: https://www.researchgate.net/publication/313893117_Notes_on_the_method_of_Cumulants_for_solution_of_linear_equations_involving_random_variables

In particular, the correlation between two random variables X and Y indicates whether the variables follow the same pattern of change. If X tends to increase along with Y, the correlation will be close to 1.0. If, on the other hand, Y tends to increase as X decreases, their correlation will be negative. If they are independent of each other, the correlation will be 0. If you divide the correlation by 2, does that mean the correlation between X and Y suddenly became smaller? It does not make any sense.
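The three cases described above can be illustrated with simulated data; the noise level and sample size below are arbitrary choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)

# Y moves with X: correlation close to +1.
y_pos = x + 0.1 * rng.normal(size=1000)
# Y moves against X: correlation close to -1.
y_neg = -x + 0.1 * rng.normal(size=1000)
# Y independent of X: correlation close to 0.
y_ind = rng.normal(size=1000)

r_pos = np.corrcoef(x, y_pos)[0, 1]
r_neg = np.corrcoef(x, y_neg)[0, 1]
r_ind = np.corrcoef(x, y_ind)[0, 1]

print(r_pos, r_neg, r_ind)
```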

Hope it helps,

Augusto


Sir, I have a paper in which the author discusses the average correlation coefficient, but I don't know what he means by this average. Here is the link.

For Fig. 4, see the first paragraph on the right side of page 542.


Dear Abdur Rauf

No, because the correlation coefficient **describes** the relationship between two variables. r takes values between +1 and -1, and zero means there is no linear relationship between the variables. In a positive correlation the variables change in the same direction, while in a negative correlation they change in opposite directions.

Numerically, zero is greater than -1, but as a correlation coefficient -1 means a perfect negative correlation and describes a stronger relationship than zero.

If there were one negative correlation coefficient among the numbers in your example, it would partially cancel out in the average.

Good luck


@ Abdur Rauf, with all respect to Michael C Seto's idea, I see it is not suitable for the correlation coefficient.

Regards

You can simply say "the correlation coefficients ranged from 0.50 to 0.80".

If you want to calculate an arithmetic mean, then you must follow the rules given by Michael C, especially if each case uses a different number of observations.

"Good luck"


Correlation coefficients are not additive. In addition, Fisher transformations of correlations are valid only if the paired data are bivariate normally distributed; otherwise, gross errors are introduced if you blindly use them.

Back in 2012, I published a model that solves the problem of computing averages from nonadditive quantities like correlation coefficients, coefficients of determination, etc. See links:

The Self-Weighting Model

(Communications in Statistics - Theory and Methods, 41, 2012, 8; Taylor and Francis, London.)

Part 1 Tutorial

Part 2 Tutorial


The stronger the association of the two variables, the closer the Pearson correlation coefficient, *r*, will be to either +1 or -1, depending on whether the relationship is positive or negative, respectively. A value of exactly +1 or -1 means that all your data points lie on the line of best fit, with no data points showing any variation away from this line. Values of *r* between +1 and -1 (for example, *r* = 0.8 or -0.4) indicate that there is variation around the line of best fit.



Regarding computing Spearman's correlation as a Pearson's correlation on ranks, also known as a score-to-rank transformation:

Did you know that score-to-rank transformations can change the sampling distribution of a statistic like a correlation coefficient, and that Fisher transformations are sensitive to normality violations? Combining both types of transformations is a recipe for a statistical disaster.
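The score-to-rank equivalence mentioned above can be verified directly: Spearman's rho is Pearson's r computed on the ranked scores. A minimal sketch with hypothetical, mostly monotone data:

```python
import numpy as np
from scipy.stats import pearsonr, rankdata, spearmanr

# Hypothetical paired data with a largely monotone, nonlinear relationship.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 1.9, 4.1, 3.8, 16.4, 33.0])

# Spearman's rho...
rho = spearmanr(x, y)[0]

# ...equals Pearson's r applied to the ranks of the scores.
r_on_ranks = pearsonr(rankdata(x), rankdata(y))[0]

print(rho, r_on_ranks)  # identical values
```

The two statistics coincide numerically, yet the rank transform changes the sampling distribution of the coefficient, which is why piling a Fisher transformation on top of it is risky, as noted above.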

That is addressed in Part 3 of another tutorial series at

How to combine the multiple correlation values?

I have five groups of data, say A1, A2, A3, A4, and A5.

The correlation between A1 and A2 is C12

The correlation between A1 and A3 is C13

The correlation between A1 and A4 is C14

The correlation between A1 and A5 is C15

How can I combine all these correlation values (C12, C13, C14, and C15) to convey that A1 is highly correlated with the other groups?

This is something I wrapped my head around recently. I understand this answer may be too late, but I am sure someone out there will use it.

1. Basically, you can average correlations. Of course, this only makes sense when they come from the same population; otherwise you'd be averaging apples and oranges (I believe that is what Khalid Hassan wanted to point out). It would be preferable to get a single estimate of r from one large sample, but sometimes that is not possible and you have to obtain correlation estimates from several samples or from repeated measurements on the same sample.

2. The correlation coefficient is not an unbiased estimator. That means that when you average several correlations, the result will not converge to the true correlation; it will underestimate it. This effect is greatest for mid-range correlations around .5.

3. There are several ways to correct for that. See Alexander (1990) for a brief overview (https://link.springer.com/content/pdf/10.3758/BF03334037.pdf).

4. Fisher's z is the best-known correction. There are many others, some of them more precise than Fisher's z. If you're new to this, Fisher's z is still a good start.

5. In any case, you should not simply average the correlations but weight them by the sample size of each correlation.

6. Sample size is also important given that the bias of the correlation coefficient depends strongly on the underlying sample sizes (Corey, Dunlap & Burke, 1998).

7. With sample sizes larger than 50, the bias is already quite small and only shows in the third digit. That is negligible for many applications, which means you could skip the corrections if that is the case in your field (Corey, Dunlap & Burke, 1998).

8. However, still use weighted averages.

9. Keep in mind that a correction that improves the averaged correlation r may also affect the standard deviation. So if you want to apply a mathematical operation that uses the standard deviation, you will have to ask this forum again.

10. As Edel Garcia pointed out, all of the above is only true under the assumption of bivariate normality. Violating normality may change the picture, and most papers do not consider such violations.
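Points 4 and 5 above can be combined in one short sketch: Fisher-transform each correlation and weight by n - 3, since the variance of Fisher's z is approximately 1/(n - 3). The correlations and sample sizes below are hypothetical, for illustration only:

```python
import numpy as np

# Hypothetical correlations and their sample sizes (assumed values).
r = np.array([0.50, 0.67, 0.55, 0.60, 0.80])
n = np.array([30, 50, 45, 80, 25])

# Fisher z-transform each correlation.
z = np.arctanh(r)

# Var(z) ~ 1 / (n - 3), so the inverse-variance weights are n - 3.
w = n - 3

# Weighted mean in z-space, then back-transform to the r scale.
r_avg = np.tanh(np.average(z, weights=w))

print(round(r_avg, 3))
```

Correlations from larger samples thus pull the pooled estimate more strongly, which is exactly the sample-size weighting recommended in points 5 through 8.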

