Question
Asked 25th Apr, 2014

How do I choose an ordination method, such as PCA, CA, PCoA, or NMDS?

Ordination is a vital method for analysing community data, but I really don't know how to choose a suitable method or how these methods differ.

Popular answers (1)

25th Apr, 2014
Aimeric Blaud
Edinburgh Napier University
The choice of ordination method depends on 1) the type of data you have, 2) the similarity/distance matrix you want to or can use, and 3) what you want to show. All of these ordination methods are based on a similarity or distance matrix constructed from your data, using different measures (such as Euclidean, Bray-Curtis (= Sørensen), Jaccard, etc.) to calculate the distance between samples. However, different measures will not give the same matrix, and different ordination methods work with different matrices, which can significantly affect the results. For example, PCA uses only Euclidean distance, while nMDS or PCoA can use any similarity/distance measure you want.
So, how to choose a method?
- If you have a dataset that includes null values (e.g. most datasets from genotyping with fingerprinting methods include null values, for example when a bacterial OTU is present in some samples and absent from others), I would advise you to use a Bray-Curtis similarity matrix and nMDS ordination. Bray-Curtis is chosen because, unlike Euclidean distance, it is not affected by the number of null values shared between samples, and nMDS is chosen because, unlike PCA, it accepts any similarity matrix.
- If you have a dataset that does not include null values (e.g. environmental variables), you can use Euclidean distance with either PCA or nMDS; in this case the two will usually give very similar results.
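A toy example can make the effect of the distance measure concrete. The sketch below uses an invented three-site, three-species abundance table (not data from this thread) and SciPy's `pdist`. Sites 1 and 2 share no species, while sites 1 and 3 host the same species at different abundances.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Hypothetical abundance table: rows = sites, columns = species.
abund = np.array([
    [0, 1, 1],   # site 1
    [1, 0, 0],   # site 2: shares no species with site 1
    [0, 4, 4],   # site 3: same species as site 1, higher abundances
], dtype=float)

euclid = squareform(pdist(abund, metric="euclidean"))
bray = squareform(pdist(abund, metric="braycurtis"))

print("Euclidean   1-2: %.3f   1-3: %.3f" % (euclid[0, 1], euclid[0, 2]))
print("Bray-Curtis 1-2: %.3f   1-3: %.3f" % (bray[0, 1], bray[0, 2]))
```

With this table, Euclidean distance places sites 1 and 2 closer together than sites 1 and 3, even though sites 1 and 2 have nothing in common. Bray-Curtis gives the maximum dissimilarity (1.0) to the pair with no shared species and a lower value (0.6) to the pair with the same species.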
Many ordination methods exist, such as the ones you mentioned, but also RDA (redundancy analysis), CAP (canonical analysis of principal coordinates), dbRDA (distance-based redundancy analysis), and others… Some methods will be better than others at showing a complex community or the specific effect of a factor on your data. For example, CAP is good at showing the effect of an interaction between factors on your community. So it is sometimes good to try different methods if you are not happy with the results, but keep in mind that these methods are “only” ordinations, and you still need to test for significant differences between groups (e.g. ANOSIM, ADONIS, PERMANOVA, MRPP…).
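To illustrate what such a test does, here is a minimal sketch of a one-factor PERMANOVA written with NumPy and SciPy. It is not the vegan/ADONIS implementation; the helper names `pseudo_f` and `permanova`, the simulated counts, and the group labels are all assumptions made for this example.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def pseudo_f(D, groups):
    """Pseudo-F for one factor, computed from a square distance matrix."""
    groups = np.asarray(groups)
    n = len(groups)
    labels = np.unique(groups)
    D2 = D ** 2
    ss_total = D2[np.triu_indices(n, k=1)].sum() / n
    ss_within = 0.0
    for g in labels:
        idx = np.where(groups == g)[0]
        sub = D2[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def permanova(D, groups, n_perm=999, seed=0):
    """Permutation p-value: how often shuffled labels give F >= observed F."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    f_obs = pseudo_f(D, groups)
    hits = sum(pseudo_f(D, rng.permutation(groups)) >= f_obs for _ in range(n_perm))
    return f_obs, (hits + 1) / (n_perm + 1)

# Simulated counts for two hypothetical groups of six samples each.
rng = np.random.default_rng(42)
counts = np.vstack([
    rng.poisson(lam=[20, 5, 1, 1], size=(6, 4)),
    rng.poisson(lam=[1, 5, 20, 10], size=(6, 4)),
]).astype(float)
groups = np.array(["A"] * 6 + ["B"] * 6)

D = squareform(pdist(counts, metric="braycurtis"))
f_obs, p_value = permanova(D, groups)
print(f"pseudo-F = {f_obs:.2f}, p = {p_value:.3f}")
```

The same distance matrix `D` can be passed to the ordination, so the plot and the test describe the same dissimilarities.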
Different ordination methods are often presented with different features that you may find interesting, such as overlaid vectors for extra variables, the % of variation explained by each axis, 3D plots… However, all these details depend more on the software than on the ordination methods themselves.
You can find more information about ordination methods and also test for significant differences between groups in this review:
A. Ramette (2007) Multivariate analyses in microbial ecology, FEMS Microbiology Ecology, 62, 142-160.
Hope that helps
Aimeric

Most recent answer

15th Sep, 2022
Ashraf M. T. Elewa
Minia University
It depends on your input data. You should first know whether your data belong to the fixed mode or to the random mode. Not all methods are suitable for fixed mode data.

All Answers (54)

16th Aug, 2014
Chitra Bahadur Baniya
Tribhuvan University
Hi,
Have you received an answer?
Here is how I select a method: I first perform DCA on the sample-by-species dataset. If the axis length is greater than or equal to 2.5, I prefer to use CCA; otherwise I stick to linear methods such as RDA or PCA (CA, like CCA, is a unimodal method). But CCA needs reasonably justifiable environmental variable(s).
Thanks.
Chitra
21st Aug, 2014
Md. Masum Billah
University of Bologna
Thanks a lot for your helpful suggestions, @Aimeric Blaud!
17th Nov, 2014
Adebola Lateef
University of Ilorin
It mainly depends on what you want to say and on your type of data. You can view your data from different meaningful angles, but what you want to explain will determine the analysis you need. Try different types of analysis and see which one best depicts what you intend to explain.
18th Nov, 2014
José Antonio Vázquez-García
University of Guadalajara
It depends on the aims and scope of your research. Once you have selected the technique to use, your data may need specific adjustments before the analysis, depending on your objectives and the techniques you want to use. Ecological data are usually highly heterogeneous, including lots of zeros (null values, i.e. species absences, resulting from the large number of rare species in most ecological community samples), and several techniques perform badly with this kind of data.
Here are some questions that may guide you to select a proper tool of analysis:
1) Are you comparing existing groups? Consider DA, MRPP, perMANOVA, ISA.
2) Are you looking for groups? Consider cluster analysis (flexible beta works well with any distance measure and lets you control the space-distorting properties; Ward's method is Euclidean only). Avoid TWINSPAN (it performs poorly with more than one important gradient).
4) Focus on direct vs. indirect gradient analysis?
Beals (1984) makes two strong statements about the advantages of indirect gradient analysis (= sociological ordination) over direct gradient analysis (= environmental ordination):
i) "Species differences between two samples do reflect their environmental differences, but in a highly integrated fashion, which includes differences in biotic interactions and historical events. The environmental differences are automatically scaled according to overall species response. Therefore the ordination with the clearest species patterns reflects the environmental space the way biotic communities interpret it."
ii) "The disadvantage of environmental ordination is that one must prejudge which are the important environmental factors to the vegetation or the fauna. An environmental ordination may omit important variables; it is often biased toward those factors most easily measured; measured variables may be scaled wrong; and biotic patterns imposed by competition, predation and other interactions are ignored." 
5) Do you still want to focus on direct gradient analysis? Consider NPMR (for a single response variable), otherwise RDA (linear) or CCA (unimodal).
6) Do you want to focus on indirect gradient analysis? Consider NMS (a powerful method in community ecology, valid for any distance measure and any number of dimensions), or CA, RA or WA (for a single dimension = gradient only). Avoid DCA (a heavily manipulated technique, except for its first dimension, which is equivalent to CA). Avoid PCA (unless the relationships in the main matrix are linear).
For more details see:
McCune, B. & Grace, J. B.  2002. Analysis of Ecological Communities. Gleneden Beach, Oregon, USA.
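For question 2 (looking for groups), the contrast between a linkage that accepts any distance measure and Ward's method can be sketched with SciPy. Flexible beta is not available in SciPy as far as I know, so average linkage (UPGMA) is used here as a stand-in; the simulated counts are an assumption made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

# Simulated site-by-species counts for two hypothetical community types.
rng = np.random.default_rng(1)
counts = np.vstack([
    rng.poisson(lam=[30, 10, 1, 1], size=(5, 4)),
    rng.poisson(lam=[1, 1, 10, 30], size=(5, 4)),
]).astype(float)

# Average linkage works on any precomputed (condensed) dissimilarity.
bray = pdist(counts, metric="braycurtis")
labels_bray = fcluster(linkage(bray, method="average"), t=2, criterion="maxclust")

# Ward's method assumes Euclidean geometry, so it is given the raw observations.
labels_ward = fcluster(linkage(counts, method="ward"), t=2, criterion="maxclust")

print("average linkage on Bray-Curtis:", labels_bray)
print("Ward on Euclidean:             ", labels_ward)
```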
4th Feb, 2016
Aimeric Blaud
Edinburgh Napier University
Hi all,
This topic seems to interest a lot of people. An article was recently published in Molecular Ecology about the different multivariate methods used in microbial ecology, and it is useful in many other fields of research as well. The article describes the different approaches: exploratory methods, interpretive methods and statistical tests. I think it is an almost complete (but not exhaustive ;) overview of the tools we can use, and it is really helpful when you need to know which method to use and why.
So overall it is an excellent article to have, complementing the article by Ramette (2007) in FEMS Microbiology Ecology.
Title and link to the article:
Application of multivariate statistical techniques in microbial ecology
O. Paliy and V. Shankar
10th Jul, 2016
Yu Xia
Southern University of Science and Technology
Forsberg, Kevin J., Sanket Patel, Molly K. Gibson, Christian L. Lauber, Rob Knight, Noah Fierer, and Gautam Dantas. “Bacterial Phylogeny Structures Soil Resistomes across Habitats.” Nature 509, no. 7502 (May 21, 2014): 612–16. doi:10.1038/nature13377.
For your information, this is a Nature paper I have read that uses a Bray-Curtis distance matrix for PCoA.
18th Jul, 2016
An Ni Zhang
Massachusetts Institute of Technology
Actually, PCoA is not limited to Euclidean distance, and neither is NMDS. Both can take any distance measure and build the ordination from the dissimilarities you supply. If you use beta_diversity_through_plots.py in QIIME to generate beta diversity distance matrices for PCoA, you can choose different distance measures (-s).
PCoA and PCA are less computationally intensive than NMDS.
PCoA, CA and NMDS also handle the double-zero situation better than PCA.
Non-Euclidean measures should be chosen for data sets that contain zeros.
My suggestion is that PCoA and NMDS can be considered equally informative for ecological data, but the choice of dissimilarity measure and data transformation matters more.
17th Mar, 2017
Rizwan Ali Ansari
Aligarh Muslim University
Dear All
I have applied some biofertilizers to field soil to assess their impact on soil variables (enzymes, pH, EC) and plant growth variables (plant height, fresh and dry weight, branches, etc.). Which ordination method can be used for this data set, for instance PCA, PCoA, CCA, DCA or RDA? Also, how should I frame the data matrix? Should all the data be in a single Excel file? How the rows and columns are arranged also seems important. Please suggest any link.
Regards
Rizwan
+919412819870
17th Mar, 2017
Benoit Vanhee
Lille Catholic University
Hi Rizwan,
With quantitative data (no zero asymmetry), I recommend using Principal Component Analysis.
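As a rough sketch of how such a matrix could be framed for PCA: one row per plot (sample) and one column per measured variable, with each column standardized because the variables are in different units. The variable names and simulated values below are placeholders, not data from this thread, and the PCA is done with a plain NumPy SVD rather than a dedicated package.

```python
import numpy as np

# Hypothetical data matrix: rows = plots (samples), columns = measured variables.
variables = ["enzyme_activity", "pH", "EC", "plant_height", "dry_weight"]
rng = np.random.default_rng(7)
data = rng.normal(
    loc=[50, 6.5, 1.2, 80, 12],
    scale=[10, 0.4, 0.3, 15, 3],
    size=(12, 5),
)

# Standardize each column (z-scores) so variables in different units are comparable.
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

# PCA via singular value decomposition of the standardized matrix.
u, s, vt = np.linalg.svd(z, full_matrices=False)
scores = u * s                       # sample coordinates on the principal axes
loadings = vt.T                      # weight of each variable on each axis
explained = s ** 2 / np.sum(s ** 2)  # proportion of variance per axis

for axis, share in enumerate(explained[:2], start=1):
    print(f"PC{axis}: {share:.1%} of total variance")
```

In practice the matrix would be read from a single spreadsheet or CSV file with the same layout, and treatment labels kept in a separate column for colouring the plot.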
27th May, 2017
Sara Luna
University of the Azores
Hello everyone,
I am using T-RFLP to study AMF communities between sites; however, I have a different number of samples for each site. How do I analyse it?
Thank you
24th Jun, 2017
RAJAL DEBNATH
Central Silk Board
@Sara Patricia Luna:  I think the below mentioned publication will be of help. The R code for the program is available as supplementary material (I believe). You can use the program with DGGE and other fingerprints as well
Kropf, S., Heuer, H., Grüning, M., and Smalla, K. (2004). Significance test for comparing complex microbial community fingerprints using pairwise similarity measures. J. Microbiol. Methods 57, 187–195. doi: 10.1016/j.mimet.2004.01.002
25th Jun, 2017
Sara Luna
University of the Azores
Thank you, RAJAL DEBNATH
25th Jun, 2017
Aimeric Blaud
Edinburgh Napier University
Having unbalanced samples should not be a problem when analysing your T-RFLP data. You just need to follow the "normal" procedure. I would normalise and square-root transform the data and then perform an nMDS using a Bray-Curtis similarity matrix. Then you can run an ANOSIM to test for differences between sites; having unbalanced samples is not an issue with these analyses.
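The procedure above (normalise, square-root transform, Bray-Curtis, ANOSIM) could be sketched as follows. This is a minimal illustration, not the PRIMER or vegan implementation: the `anosim` helper, the simulated peak table and the site labels are assumptions, and the two sites deliberately have different sample sizes (4 and 7).

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import rankdata

def anosim(D, groups, n_perm=999, seed=0):
    """ANOSIM R statistic and permutation p-value from a square distance matrix."""
    groups = np.asarray(groups)
    n = len(groups)
    rows, cols = np.triu_indices(n, k=1)
    ranks = rankdata(D[rows, cols])  # rank all pairwise dissimilarities

    def r_stat(g):
        within = g[rows] == g[cols]
        return (ranks[~within].mean() - ranks[within].mean()) / (n * (n - 1) / 4)

    r_obs = r_stat(groups)
    rng = np.random.default_rng(seed)
    hits = sum(r_stat(rng.permutation(groups)) >= r_obs for _ in range(n_perm))
    return r_obs, (hits + 1) / (n_perm + 1)

# Simulated T-RFLP peak table: rows = samples, columns = fragments.
rng = np.random.default_rng(3)
peaks = np.vstack([
    rng.poisson(lam=[40, 10, 2, 1, 5], size=(4, 5)),   # site X: 4 samples
    rng.poisson(lam=[2, 10, 40, 20, 5], size=(7, 5)),  # site Y: 7 samples
]).astype(float)
sites = np.array(["X"] * 4 + ["Y"] * 7)

relative = peaks / peaks.sum(axis=1, keepdims=True)  # normalise to relative abundance
transformed = np.sqrt(relative)                      # square-root transform
D = squareform(pdist(transformed, metric="braycurtis"))

r_obs, p_value = anosim(D, sites)
print(f"ANOSIM R = {r_obs:.3f}, p = {p_value:.3f}")
```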
28th Jul, 2017
Ikram Dahmani
Mohammed V University of Rabat
Hi Mr Blaud,
Thank you for this great information about how to choose an ordination method.
I have a question about the percentage explained by each axis in an NMDS analysis: how can I calculate this?
Cordially
28th Jul, 2017
Julian Trachsel
United States Department of Agriculture
Hi Ikram, I'm pretty sure you can't assign a % of variation explained to each axis in NMDS. What you should consider for this type of analysis is the stress, i.e. how well the distances in your plot represent the (dis)similarities you used to generate the NMDS. Lower stress means that the distances in your plot do a good job of representing the calculated similarities.
Alternatively, you could calculate which species or features correlate with various directions on your ordination. I think vegan's envfit function can accomplish this.
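To make the idea of stress concrete, the sketch below computes Kruskal's stress-1 for a given set of ordination coordinates. It does not fit an NMDS; it only scores a configuration against the original dissimilarities. The helper names `pava` and `kruskal_stress` and the random test points are assumptions for illustration, and ties are handled in the simplest way.

```python
import numpy as np
from scipy.spatial.distance import pdist

def pava(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to y."""
    blocks = []  # each block holds [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge blocks while the previous block mean exceeds the current one.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    return np.concatenate([np.full(count, total / count) for total, count in blocks])

def kruskal_stress(dissim, coords):
    """Stress-1 of `coords` against condensed dissimilarities `dissim`."""
    d = pdist(coords)                          # distances in the ordination plot
    order = np.argsort(dissim, kind="stable")  # rank order of the original dissimilarities
    d_sorted = d[order]
    d_hat = pava(d_sorted)                     # monotone regression of plot distances
    return np.sqrt(np.sum((d_sorted - d_hat) ** 2) / np.sum(d_sorted ** 2))

# Eight hypothetical samples whose true dissimilarities live in two dimensions.
rng = np.random.default_rng(5)
points = rng.normal(size=(8, 2))
dissim = pdist(points)

stress_2d = kruskal_stress(dissim, points)         # perfect representation
stress_1d = kruskal_stress(dissim, points[:, :1])  # squeezed onto one axis
print(f"stress with 2 axes: {stress_2d:.4f}, with 1 axis: {stress_1d:.4f}")
```

The two-axis configuration reproduces the rank order of the dissimilarities exactly, so its stress is zero, while dropping an axis distorts the rank order and the stress rises.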
28th Jul, 2017
Francisco Calaça
Universidade Estadual de Goiás
Good explanations! 
31st Jul, 2017
Ikram Dahmani
Mohammed V University of Rabat
Hi Julian, thanks for your explanation. Exactly, I had already used the envfit function in the "vegan" package to relate community data to environmental data, and I got a stress value of 0.1564409. What do you think?
31st Jul, 2017
Julian Trachsel
United States Department of Agriculture
Sounds like you are on the right track.  Check out this website for a guide to interpreting stress values (and NMDS in general) https://mb3is.megx.net/gustame/dissimilarity-based-methods/nmds
31st Jul, 2017
Ikram Dahmani
Mohammed V University of Rabat
Hi Julian, thank you very much for your help.
7th Dec, 2017
Lan Liu
Hainan University
A useful document is "Multivariate Analysis of Ecological Communities in R: vegan tutorial". It explains the differences and gives a lot of related information. Hope it helps!
6th Apr, 2018
Vijay Singh Meena
Borlaug Institute for South Asia (BISA)
This thread is very helpful to me as well, thanks.
2nd May, 2018
Elizabeth Larson
University of Massachusetts Dartmouth
Great thread. Thanks for all the generous replies and suggestions!
31st May, 2018
Jorge Antonio Gomez-Diaz
Universidad Veracruzana
Hi, these are the main uses of the ordination methods:
Principal component analysis (PCA):
Euclidean distance
Parametric (like ANOVA)
Based on eigenvectors
Raw and quantitative data.
Preserves the Euclidean distance between sites.
Correspondence analysis (CA):
χ² distance
Frequency or similar data, dimensionally homogeneous and non-negative.
Preserves the χ² distance between rows or columns.
Used in ecology to analyze tables of species data.
Principal coordinate analysis (PCoA):
Accepts many distance measures
Ordination of distance matrices (Q mode), instead of site-by-variable tables.
Flexibility in the choice of association measures.
Non-metric multidimensional scaling (NMDS):
It is not a method based on eigenvectors.
It tries to represent the set of objects along a predetermined number of axes while preserving the ordering relationships between them.
Sources:
Legendre, P., & Legendre, L. F. (2012). Numerical ecology (Vol. 24). Elsevier.
Borcard, D., Gillet, F., & Legendre, P. (2018). Numerical ecology with R. Springer.
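The statement that CA preserves the χ² distance between rows can be checked numerically. The sketch below runs a bare-bones CA with a NumPy SVD on an invented site-by-species table (the counts are placeholders) and compares the distances between the resulting site scores with χ² distances computed directly from the row profiles.

```python
import numpy as np
from scipy.spatial.distance import pdist

# Hypothetical site-by-species count table (rows = sites, columns = species).
table = np.array([
    [10, 2, 0, 1],
    [8, 4, 1, 0],
    [3, 6, 5, 2],
    [0, 3, 9, 6],
    [1, 0, 7, 9],
], dtype=float)

P = table / table.sum()  # relative frequencies
r = P.sum(axis=1)        # row (site) masses
c = P.sum(axis=0)        # column (species) masses

# Standardized residuals, then SVD: the core of correspondence analysis.
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, sv, Vt = np.linalg.svd(S, full_matrices=False)
keep = sv > 1e-12                                           # drop the trivial zero axis
row_scores = (U[:, keep] * sv[keep]) / np.sqrt(r)[:, None]  # site principal coordinates

# Chi-square distances computed directly from the row profiles.
profiles = P / r[:, None]
chi2 = pdist(profiles / np.sqrt(c))

print("max difference:", np.abs(pdist(row_scores) - chi2).max())
print("share of inertia per axis:", np.round(sv[keep] ** 2 / np.sum(sv[keep] ** 2), 3))
```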
1st Jun, 2018
Francisco Calaça
Universidade Estadual de Goiás
Good, Jorge Antonio Gomez-Diaz! Thanks for more explanations!
FJSC.
15th Jun, 2018
Kishor Sharma
Sikkim University
The above discussion greatly helped me to choose a better ordination method for my current research. Thank you.
The book Borcard, D., Gillet, F., & Legendre, P. (2018). Numerical ecology with R. Springer, mentioned by @Jorge Antonio Gomez-Diaz, is very useful.
11th Oct, 2018
Altaf Hussain
University of Alberta
Is PCoA really limited to Euclidean distance? I have seen people using Gower distance and what not!
11th Oct, 2018
Cedric Laczny
University of Luxembourg
PCoA using *Euclidean* distances is basically PCA. The "advantage" of PCoA is that you can use *other* distance/(dis)similarity measures; see https://mb3is.megx.net/gustame/dissimilarity-based-methods/principal-coordinates-analysis.
Hence, PCoA with Gower distance is possible, or UniFrac distance, or Bray-Curtis dissimilarity, etc.
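The equivalence can be checked with a small sketch of classical PCoA (double-centring the squared distances, then an eigendecomposition). The `pcoa` helper and the random data are assumptions for illustration; in practice a package such as vegan or scikit-bio would normally be used.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def pcoa(D, k=2):
    """Classical PCoA of a square distance matrix; returns coordinates and eigenvalues."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centring matrix
    B = -0.5 * J @ (D ** 2) @ J           # Gower's double-centred matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1]        # largest eigenvalue first
    vals, vecs = vals[order], vecs[:, order]
    coords = vecs[:, :k] * np.sqrt(np.clip(vals[:k], 0, None))
    return coords, vals

# Random hypothetical data: 10 samples, 4 quantitative variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 4))

# PCA scores via SVD of the centred data.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pca_scores = U[:, :2] * s[:2]

# PCoA on Euclidean distances between the same samples.
pcoa_scores, eigvals = pcoa(squareform(pdist(X, metric="euclidean")))

# Axes may be flipped in sign, so compare absolute values.
same = np.allclose(np.abs(pca_scores), np.abs(pcoa_scores))
print("PCA and Euclidean PCoA agree:", same)

positive = eigvals[eigvals > 0]
print("share explained by first two axes:", np.round(positive[:2] / positive.sum(), 3))
```

Passing a Bray-Curtis, Gower or UniFrac matrix to the same function gives a PCoA under that measure; non-Euclidean measures can produce negative eigenvalues, which is why the sketch clips them before taking square roots.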
Best,
Cedric
11th Oct, 2018
Altaf Hussain
University of Alberta
Exactly, thanks for the clarification
11th Oct, 2018
Aimeric Blaud
Edinburgh Napier University
PCoA is not limited to Euclidean distance but works with any dissimilarity measure. Sorry for not updating my answer earlier and creating confusion. I have now updated it.
PCoA is now commonly used with Bray-Curtis and UniFrac distance (weighted or unweighted) as Cedric mentioned.
27th May, 2019
Shbbir Raza Khan
Banaras Hindu University
Aimeric Blaud, thanks Sir, your explanations are always very helpful to me.
Thanks once again
8th Jul, 2019
Hudhaifa maan Al-Hamndi
Tikrit University
Very good and interesting discussion
23rd Jul, 2019
Carlos Freitas
Federal University of Amazonas
This is a really relevant discussion on an important topic for a lot of people working in community ecology. I believe Dr Blaud addressed the main questions. I would just like to add that CA suffers from the arch effect and that DCA is not a sufficient solution.
I would also emphasize that the choice of the best ordination method should be guided by the ecological question and the available data set.
30th Jul, 2019
Tluang Hmung Thang
Xishuangbanna Tropical Botanical Garden
Very nice discussion and explanation!
30th Jul, 2019
Bismark Ofosu-Bamfo
University of Energy and Natural Resources
The discussion above has been very helpful.
19th Aug, 2019
Abhijit Mitra
University of Calcutta
Trying to get detailed information
21st Aug, 2019
Subodh Adhikari
University of Idaho
Hello All,
Here are a few more relevant/important sources for "community ecologists":
1. McCune, B. and J.B. Grace. 2002. Analysis of Ecological Communities. MJM Press (there are several good chapters).
2. Digby, P.G.N. and R.A. Kempton. 1987. Multivariate analysis of ecological communities. Chapman & Hall
3. Legendre, P. and Legendre, L. 2012. Chapter 7 – Ecological resemblance (Chapter 8 – Cluster analysis.). In: Legendre, P. and Legendre, L. 1998, Numerical ecology. Elsevier.
4. McCune, B. and Kent, M. 2012. Chap. 6 – Ordination methods. Pages 171–271.
5. Everitt, B. and T. Hothorn. Chaps 3–4. PCA and NMDS.
6. Borcard, Gillet and Legendre. Unconstrained Ordination (and Chap 6: Canonical Ordination).
More from the person who taught my multivariate analysis class (D.W. Roberts has also written R packages such as "labdsv", which I use along with "vegan"):
7. Roberts, D.W. 1986. Ordination on the basis of fuzzy set theory. Vegetatio 66:123-131.
8. Roberts, D.W. 2008. Statistical analysis of multidimensional fuzzy set ordination. Ecology 89:1246-1260.
9. Roberts, D.W. 2015. Vegetation classification by two new iterative reallocation optimization algorithms. Plant Ecology 216(5):741–758.
And, there are certainly hundreds of more resources you could find.
Cheers!
Subodh
3rd Oct, 2019
Chimi Djomo Cédric
Institute of Agricultural Research for Development
Very nice technical question and detailed explanation.
Thanks to all
12th Nov, 2019
Mehrdad Rabiei
Max-Planck-Institut für terrestrische Mikrobiologie
I have often seen PCoA and NMDS used in papers. Just focus on high-ranking papers and pick one that is similar to your experiment. I think that is the simplest way to find your answer.
28th Nov, 2019
Abhijit Mitra
University of Calcutta
Ordination Methods - an overview
Michael W. Palmer - may be helpful
21st Jul, 2020
Bulbul Ahmed
Research and Productivity Council (RPC)
Such a nice read. Thanks all, especially Aimeric Blaud.
22nd Jul, 2020
Ícaro Castro
University of São Paulo
Very nice discussion. Thanks all.
26th Sep, 2020
Mekdimu Mezemir Damtie
Concept Engineers Inc.
Thank you Dr Aimeric Blaud
11th Dec, 2020
Sangwook Scott Lee
The Hong Kong University of Science and Technology
Thank you for your helpful advice, especially Aimeric Blaud .
10th Jan, 2021
David J. Gibson
Southern Illinois University Carbondale
This annotated bibliography may be helpful: Ordination Analysis
  • DOI: 10.1093/OBO/9780199830060-0003
12th Jan, 2021
Paul Somerfield
Plymouth Marine Laboratory
And for a general overview see https://mb3is.megx.net/gustame
21st Jan, 2021
Negin Katal
Max Planck Institute for Biogeochemistry Jena
In my master's programme I had a very nice course on multivariate analysis in ecology, and for this problem we had a very useful sheet, which I am sharing with you here. I hope it helps.
11th May, 2021
Serkan Özdemir
Isparta University of Applied Sciences
You can choose the most appropriate ordination method by taking into account how well the axes separate your samples. Applying only one method would therefore be limiting; it is better to try several methods and choose the best one.
29th Dec, 2021
Carlos Freitas
Federal University of Amazonas
Dear Negin Katal, I really appreciate the table you shared. Another potential approach to include environmental drivers is to use them as factors in a PERMANOVA. Afterwards you can use an NMDS to show the pattern. An advantage is that both can use the same distance matrix.
16th Jan, 2022
Adijailton Jose de Souza
University of São Paulo
3rd Apr, 2022
Riccardo Soldan
University of Oxford
While I agree that there are guidelines on the use of these methods, it is impossible to know which one is best. This is because ordination methods based on distance matrices are not model-based approaches. I would highly suggest using latent variable models and the R package gllvm. In general, have a look at the work of David Warton.
