25th Apr, 2014

Edinburgh Napier University


Question

Asked 25th Apr, 2014

Ordination is a vital method for analysing community data, but I really don't know how to choose a suitable method or how the methods differ.

The choice of ordination method depends on 1) the type of data you have, 2) the similarity/distance matrix you want or are able to use, and 3) what you want to say. All of these ordination methods work from a similarity or distance matrix computed from your data, and there are different measures (such as Euclidean, Bray-Curtis (equivalent to Sørensen on presence/absence data), Jaccard, etc.) for calculating the distance between samples. These measures will not give the same results. Different ordination methods use different matrices, and this can significantly affect the results. For example, PCA uses only Euclidean distance, while nMDS or PCoA can use any distance measure you want.

So, how to choose a method?

- If you have a dataset that includes many zeros (e.g. most datasets from genotyping using fingerprinting methods contain zeros, for example when a bacterial OTU is present in some samples and absent from others), I would advise you to use a Bray-Curtis similarity matrix and nMDS ordination. Bray-Curtis is chosen because, unlike Euclidean distance, it does not count a taxon that is absent from both samples (a double zero) as a sign of similarity, and nMDS is chosen because, unlike PCA, it lets you use any similarity matrix.

- If you have a dataset that does not include zeros (e.g. environmental variables), you can use Euclidean distance with either PCA or nMDS, and you will see that in this case the two give very similar results.
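To make the point about zeros concrete, here is a minimal sketch in Python (standard library only). The three samples and their counts are invented for illustration, and the two functions follow the usual textbook definitions of Euclidean distance and Bray-Curtis dissimilarity; they are not taken from any particular package.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two abundance vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|a - b| / sum(a + b), non-negative counts."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den

# Hypothetical counts for three OTUs (columns) in three samples.
s1 = [0, 1, 1]   # OTUs 2 and 3 present at low abundance
s2 = [1, 0, 0]   # only OTU 1 present: nothing in common with s1
s3 = [0, 4, 4]   # same OTUs as s1, at higher abundance

for name, (u, v) in {"s1-s2": (s1, s2), "s1-s3": (s1, s3)}.items():
    print(name, round(euclidean(u, v), 3), round(bray_curtis(u, v), 3))
```

With these made-up counts, Euclidean distance places s1 closer to s2 (no OTU in common) than to s3 (same OTUs, different abundances), whereas Bray-Curtis ranks the pairs the other way round.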

Many ordination methods exist, such as the ones you mentioned, but also RDA (redundancy analysis), CAP (canonical analysis of principal coordinates), dbRDA (distance-based redundancy analysis), and others. Some methods are better than others at showing a complex community or the specific effect of a factor on your data. For example, CAP is good for showing the effect of an interaction between factors on your community. So it is sometimes worth trying different methods if you are not happy with the results, but keep in mind that these methods are “only” ordinations, and you still need to test for significant differences between groups (e.g. ANOSIM, ADONIS, PERMANOVA, MRPP…).
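As an illustration of what such a test does, the sketch below implements a one-factor PERMANOVA-style permutation test in plain Python. It is a simplified teaching version, not the implementation used by vegan's adonis or by PRIMER; the function names, the toy data and the number of permutations are all assumptions made for the example.

```python
import random
from itertools import combinations

def pseudo_f(dist, groups):
    """Pseudo-F statistic from a square distance matrix and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def permanova(dist, groups, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    f_obs = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign group labels at random
        # Small relative tolerance so rounding does not hide exact ties.
        if pseudo_f(dist, shuffled) >= f_obs * (1 - 1e-9):
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)

# Hypothetical data: one score per sample, two groups of five samples.
values = [1.0, 1.2, 0.9, 1.1, 1.3, 5.0, 5.2, 4.9, 5.1, 4.8]
groups = ["A"] * 5 + ["B"] * 5
dist = [[abs(x - y) for y in values] for x in values]

f_obs, p = permanova(dist, groups)
print(round(f_obs, 1), p)
```

In practice `dist` would be the same Bray-Curtis (or other) matrix that was used for the ordination, so that the plot and the test describe the same distances.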

Often, different ordination methods come with different features that you may find interesting, such as overlay vectors for extra variables, the % of variation explained by each axis, 3D plots… However, these details are more related to the software than to the ordination methods themselves.

You can find more information about ordination methods, and also about tests for significant differences between groups, in this review:

A. Ramette (2007) Multivariate analyses in microbial ecology, FEMS Microbiology Ecology, 62, 142-160.

Hope that helps

Aimeric

It depends on your input data. You should first know whether your data belong to the fixed mode or to the random mode. Not all methods are suitable for fixed mode data.


Hi,

Have you received an answer?

What do I do to select a method? I first perform DCA on the sample-by-species dataset. If the length of the axis is greater than or equal to 2.5, I prefer to use CCA; otherwise I stick to linear methods such as RDA or PCA (CA, like CCA, assumes unimodal responses). But CCA must have reasonably justifiable environmental variable(s).
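The rule of thumb above can be written down as a small helper. This is only a sketch: the function name and arguments are hypothetical, the DCA axis length has to come from a DCA run elsewhere (for example in R), and falling back to unconstrained CA or PCA when no environmental variables are available is an assumption added for the example.

```python
def suggest_method(dca_axis_length, has_env_variables, threshold=2.5):
    """Suggest an ordination family from a DCA gradient length.

    dca_axis_length: gradient length in SD units from a DCA computed
        elsewhere (this function does not run DCA itself).
    has_env_variables: True if justifiable environmental variables exist.
    threshold: the 2.5 cut-off quoted in the post; treat it as adjustable.
    """
    if dca_axis_length >= threshold:
        # Long gradient: unimodal species responses are more plausible.
        return "CCA" if has_env_variables else "CA"
    # Short gradient: linear methods are usually adequate.
    return "RDA" if has_env_variables else "PCA"


print(suggest_method(3.1, has_env_variables=True))
print(suggest_method(1.8, has_env_variables=False))
```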

Thanks.

Chitra

It depends on the aims and scope of your research. Once you have selected the technique, and before the analysis, your data may need specific adjustments, depending on your objectives and the techniques you want to use. Ecological data are usually highly heterogeneous and include lots of zeros (null values, i.e. absences of species, resulting from the large number of rare species in most ecological community samples), and several techniques perform badly with this kind of data.

Here are some questions that may guide you to select a proper tool of analysis:

1) Are you comparing existing groups? consider DA, MRPP, perMANOVA, ISA

2) Are you looking for groups? consider Cluster Analysis (Flexible Beta is great with any distance measure, you can control the space distorting properties), Ward (Euclidean only), avoid Twinspan (performs poorly with more than one important gradient).

4) Focus on Direct vs. Indirect gradient analysis?

Beals (1984) makes two strong statements about the advantages of Indirect gradient analyses (=sociological ordination) over direct (=environmental ordination) gradient analysis:

i) "Species differences between two samples do reflect their environmental differences, but in a highly integrated fashion, which includes differences in biotic interactions and historical events. The environmental differences are automatically scaled according to overall species response. Therefore the ordination with the clearest species patterns reflects the environmental space the way biotic communities interpret it."

ii) "The disadvantage of environmental ordination is that one must prejudge which are the important environmental factors to the vegetation or the fauna. An environmental ordination may omit important variables; it is often biased toward those factors most easily measured; measured variables may be scaled wrong; and biotic patterns imposed by competition, predation and other interactions are ignored."

5) Do you still want to focus on Direct gradient analysis? NPMR (for a single response variable), otherwise RDA (linear) or CCA (unimodal).

6) Do you want to focus on Indirect gradient analysis? NMS (a powerful method in community ecology, valid for any distance measure and any number of dimensions); CA, RA or WA (for a single dimension, i.e. one gradient, only). Avoid using DCA (a heavily manipulated technique, except for its first dimension, which is equivalent to CA). Avoid using PCA (unless the relationships among variables in the main matrix are linear).

For more details see:

McCune, B. & Grace, J. B. 2002. Analysis of Ecological Communities. MjM Software Design, Gleneden Beach, Oregon, USA.


Hi all,

This topic seems to interest a lot of people. An article has recently been published in Molecular Ecology about the different multivariate methods used in microbial ecology, and it is useful in many other fields of research too. The article describes the different approaches: exploratory methods, interpretive methods and statistical tests. I think it is an almost complete (but not exhaustive ;) overview of the tools we can use, and it is really helpful when you need to know which method to use and why.

So overall it is an excellent article to have, complementing the article by Ramette (2007) in FEMS Microbiology Ecology.

Title and link to the article:

Application of multivariate statistical techniques in microbial ecology

O. Paliy and V. Shankar

Forsberg, Kevin J., Sanket Patel, Molly K. Gibson, Christian L. Lauber, Rob Knight, Noah Fierer, and Gautam Dantas. “Bacterial Phylogeny Structures Soil Resistomes across Habitats.” *Nature* 509, no. 7502 (May 21, 2014): 612–16. doi:10.1038/nature13377.

For your information, I have read some Nature papers that use a Bray-Curtis distance matrix for PCoA analysis.

Actually, PCoA is not limited to Euclidean distance, and neither is NMDS. Either can take any distance measure and will build its axes from the dissimilarity measure you choose. If you use beta_diversity_through_plots.py in QIIME to generate beta diversity distance matrices for PCoA, you may choose different distance measures (-s).

PCoA and PCA are less computationally intensive than NMDS.

PCoA, CA and NMDS also handle the double-zero situation better than PCA.

Non-Euclidean measures should be chosen for data sets that contain zeros.

My suggestion is that PCoA and NMDS can be considered equally informative for ecological data, but the choice of dissimilarity measure and data transformation matters more.

Dear All

I have applied some biofertilizers to field soil to study their impact on soil variables (enzymes, pH, EC) and plant growth variables (plant height, fresh and dry weight, branches, etc.). Which ordination method can be used on this data set: PCA, PCoA, CCA, DCA, RDA, etc.? Also, how should I set up the data matrix? Should all the data be in a single Excel file? I understand that the arrangement of rows and columns also matters. Please suggest any link.

Regards

Rizwan

+919412819870

Email: rizwans.ansari@gmail.com

@Sara Patricia Luna: I think the below mentioned publication will be of help. The R code for the program is available as supplementary material (I believe). You can use the program with DGGE and other fingerprints as well

Kropf, S., Heuer, H., Grüning, M., and Smalla, K. (2004). Significance test for comparing complex microbial community fingerprints using pairwise similarity measures. J. Microbiol. Methods 57, 187–195. doi: 10.1016/j.mimet.2004.01.002

Having unbalanced samples should not be a problem for analysing your T-RFLP data. You just need to follow the "normal" procedure. I would normalise and square-root transform the data and then perform an nMDS using a Bray-Curtis similarity matrix. Then you can run an ANOSIM to test for differences between sites; having unbalanced samples is not an issue with these analyses.
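The preprocessing described above can be sketched in a few lines of Python. The peak areas are invented, and "normalise" is assumed here to mean converting each sample to relative abundances (each row summing to 1); other normalisations exist, so adapt this to your own procedure. The resulting matrix is what would then be passed to an nMDS routine.

```python
import math

def relative_abundance(profile):
    """Scale one sample's peak areas so that they sum to 1."""
    total = sum(profile)
    return [x / total for x in profile]

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative profiles."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den

# Hypothetical T-RFLP peak areas: rows are samples, columns are T-RFs.
peaks = [
    [120, 0, 30, 50],
    [100, 10, 0, 90],
    [0, 80, 60, 20],
]

# Normalise each sample, then square-root transform to down-weight
# the dominant peaks.
transformed = [[math.sqrt(x) for x in relative_abundance(row)] for row in peaks]

# Pairwise Bray-Curtis dissimilarity matrix between samples.
n = len(transformed)
bc = [[bray_curtis(transformed[i], transformed[j]) for j in range(n)]
      for i in range(n)]

for row in bc:
    print([round(d, 3) for d in row])
```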

Hi Ikram, I'm pretty sure you can't assign a % variation explained to each axis in NMDS. What you should consider for this type of analysis is the stress, or how well the distances in your plot represent the (dis)similarities you used to generate the NMDS. Lower stress means that the distances in your plot do a good job of representing the calculated similarity.

Alternatively, you could calculate which species or features correlate with various directions on your ordination. I think vegan's envfit function can accomplish this.
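For readers who want to see what the stress value measures, here is a minimal Python sketch of Kruskal's stress-1 for a given configuration. It only evaluates a configuration; it does not search for the best one as a real NMDS routine does. The function names, the toy dissimilarities and the two sets of coordinates are made up, and ties in the dissimilarities are simply kept in input order, which is a simplifying assumption.

```python
import math
from itertools import combinations

def monotone_fit(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while block means violate the ordering.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted

def stress1(dissim, coords):
    """Kruskal's stress-1 of configuration `coords` for matrix `dissim`."""
    pairs = list(combinations(range(len(coords)), 2))
    # Order sample pairs by their observed dissimilarity.
    pairs.sort(key=lambda p: dissim[p[0]][p[1]])
    d = [math.dist(coords[i], coords[j]) for i, j in pairs]
    d_hat = monotone_fit(d)  # best monotone approximation of the plot distances
    num = sum((a - b) ** 2 for a, b in zip(d, d_hat))
    den = sum(a ** 2 for a in d)
    return math.sqrt(num / den)

# Hypothetical dissimilarities among four samples.
dissim = [
    [0.0, 0.2, 0.6, 0.9],
    [0.2, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.4],
    [0.9, 0.8, 0.4, 0.0],
]
good = [(0, 0), (1, 0), (3, 0), (5, 0)]   # plot distances follow the rank order
poor = [(0, 0), (5, 0), (1, 0), (3, 0)]   # rank order is scrambled

print(stress1(dissim, good), stress1(dissim, poor))
```

Because NMDS only tries to preserve the rank order of the dissimilarities, the `good` configuration has zero stress even though its plot distances are not proportional to the dissimilarities.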

Sounds like you are on the right track. Check out this website for a guide to interpreting stress values (and NMDS in general) https://mb3is.megx.net/gustame/dissimilarity-based-methods/nmds

Hi, these are the main uses of the ordination methods:

Principal component analysis (PCA):

- Euclidean distance.
- Parametric (like ANOVA).
- Based on eigenvectors.
- Raw, quantitative data.
- Preserves the Euclidean distance between sites.

Correspondence analysis (CA):

- χ² distance.
- Frequency or similar data, dimensionally homogeneous and non-negative.
- Preserves the χ² distance between rows or columns.
- Used in ecology to analyze tables of species data.

Principal coordinate analysis (PCoA):

- Many distance measures can be used.
- Ordination of distance matrices (Q mode), instead of site-by-variable tables.
- Flexibility in the choice of association measures.

Non-metric multidimensional scaling (NMDS):

- It is not a method based on eigenvectors.
- It tries to represent the set of objects along a predetermined number of axes while preserving the ordering relationships between them.

Sources:

Legendre, P., & Legendre, L. F. (2012). *Numerical ecology* (Vol. 24). Elsevier.

Borcard, D., Gillet, F., & Legendre, P. (2018). *Numerical ecology with R*. Springer.

PCoA using *Euclidean* distances is basically PCA. The "advantage" of PCoA is that you can use *other* distance/(dis)similarity measures; see also https://mb3is.megx.net/gustame/dissimilarity-based-methods/principal-coordinates-analysis.

Hence, PCoA with Gower distance is possible, or UniFrac distance, or Bray-Curtis dissimilarity, etc.
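The equivalence between PCA and PCoA on Euclidean distances can be checked numerically. The sketch below assumes NumPy is available and uses random numbers as a stand-in for a real site-by-variable table; `pca_scores` and `pcoa` are illustrative helper names, and `pcoa` is the classical metric-scaling recipe (Gower's double centring), not the code of any specific package.

```python
import numpy as np

def pca_scores(X, k=2):
    """PCA site scores via SVD of the column-centred data matrix."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]

def pcoa(D, k=2):
    """Classical PCoA (metric scaling) of a square distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centring matrix
    B = -0.5 * J @ (D ** 2) @ J               # Gower's double centring
    eigval, eigvec = np.linalg.eigh(B)
    order = np.argsort(eigval)[::-1][:k]      # largest eigenvalues first
    # Non-Euclidean measures can give negative eigenvalues; clip them at zero.
    return eigvec[:, order] * np.sqrt(np.clip(eigval[order], 0, None))

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))                   # hypothetical 8 sites x 4 variables
diff = X[:, None, :] - X[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))         # Euclidean distance matrix

# Axes are only defined up to sign, so compare absolute values.
same = np.allclose(np.abs(pca_scores(X)), np.abs(pcoa(D)))
print(same)
```

Replacing `D` with a Bray-Curtis, Gower or UniFrac matrix keeps the `pcoa` step unchanged, which is exactly the flexibility described above; only the comparison with PCA no longer applies.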

Best,

Cedric

This is a really relevant discussion on an important topic for many people working in community ecology. I believe Dr Blaud addressed the main questions. I would just like to add that CA suffers from the arch effect and that DCA is not a sufficient solution.

I would also emphasize that the choice of ordination method should be guided by the ecological question and the available data set.

Hello All,

Here are a few more relevant/important sources for "community ecologists":

1. McCune, B. and J.B. Grace. 2002. Analysis of Ecological Communities. MJM Press (there are several good chapters).

2. Digby, P.G.N. and R.A. Kempton. 1987. Multivariate analysis of ecological communities. Chapman & Hall

3. Legendre, P. and Legendre, L. 2012. Chapter 7 – Ecological resemblance (Chapter 8 – Cluster analysis.). In: Legendre, P. and Legendre, L. 1998, Numerical ecology. Elsevier.

4. McCune, B. and Kent, M. 2012. Chap. 6 – Ordination methods. Pages 171–271.

5. Everitt, B. and T. Hothorn. Chaps 3–4. PCA and NMDS.

6. Borcard, Gillet and Legendre. Unconstrained Ordination (and Chap 6: Canonical Ordination).

More from the person with whom I took a multivariate analysis class (D.W. Roberts has also written R packages such as "labdsv"; I use this package along with "vegan"):

7. Roberts, D.W. 1986. Ordination on the basis of fuzzy set theory. Vegetatio 66:123-131.

8. Roberts, D.W. 2008. Statistical analysis of multidimensional fuzzy set ordination. Ecology 89:1246-1260.

9. Roberts, D.W. 2015. Vegetation classification by two new iterative reallocation optimization algorithms. Plant Ecology 216(5):741–758.

10. Roberts, D.W. labdsv: https://cran.r-project.org/web/packages/labdsv/labdsv.pdf

And, there are certainly hundreds of more resources you could find.

Cheers!

Subodh

And for a general overview see https://mb3is.megx.net/gustame

Please look at GUSTA ME: https://mb3is.megx.net/gustame

While I agree that there are guidelines on the use of these methods, it is impossible to know which one is best. This is because ordination methods based on distance matrices are not model-based approaches. I would highly recommend using latent variable models and the R package gllvm. In general, have a look at the work of David Warton.
