PRIMER-E - Science topic
Questions related to PRIMER-E
I encountered a problem with PERMANOVA in PRIMER 6: I have a dataset with about 10,000 records and PERMANOVA doesn't run. The dataset has 8 variables and 1 factor with 2 levels. Do you have any idea how to solve the problem? My PC is new and quite powerful, and I also tried on other PCs.
Is it possible to draw convex hulls or ellipses on PCA and dbRDA plots in PRIMER v7?
Dear Community
I have a dataset consisting of a matrix of diet samples as rows and prey species as columns (in grams). I am interested in investigating any differences in diet between years, seasons and areas. The data are very skewed: the majority of samples come from one of the seasons and from a few of the years in the time series.
Does anyone have any hints on how I could look at this using ANOSIM and/or PERMANOVA in the PRIMER software?
Kind regards,
Karl
Some datasets in ecology (e.g. CPUE) inherently contain many zero values, which may need to be adjusted and/or fitted. Apart from the common solutions, I am asking how to work with this kind of data in PRIMER-e (i.e. possible pretreatments, adjustments or other post-treatment functions).
I want to investigate the relationship between differences in coral physiological variables based on Euclidean distances and seawater environmental variables using DISTLM and dbRDA in PRIMER, but I am not sure if this analysis is suitable given the lack of replication I have in my predictor variable (environmental) matrix.
I have attached an Excel file illustrating the structure of my data set (the response and predictor variables). Briefly, I have a multivariate data set of measured physiological variables (e.g. lipid concentration, protein concentration, tissue biomass etc.) for corals collected from five different locations (A-E), where each site is distinct in its seawater physico-chemical parameters. I collected 12 corals per site (total of 60 samples). I have constructed a resemblance matrix of the physiological data in PRIMER based on Euclidean distances, and there is clear grouping of data points in the NMDS, which coincides with the different collection sites for each coral. I want to investigate the proportion of the observed variation in the multivariate data cloud that can be explained by the environmental characteristics of each collection site (e.g. mean annual sea surface temperature, seawater chlorophyll concentration, salinity etc.). However, the dataset of environmental variables does not have replication, i.e. for each site (A-E), I only have one value for mean annual sea surface temperature, one value of salinity etc.
All of the case-study examples I have read about distance-based redundancy analysis in R or PRIMER have two resemblance matrices (predictor and response) both of which have replication. However, in my case, my response variables have replication (i.e. 12 samples per site), whereas my environmental variables do not have replication (i.e. one measurement per variable per site).
Can someone advise me whether or not dbRDA is suitable in this instance? If, as I suspect, it is not suitable, can you recommend a better approach? I am not an expert in multivariate statistics, but I want to make sure that the approach I take is sound.
Any and all advice is welcome. Thanks
We have a dataset consisting of six years of fish detection data from an open ocean acoustic tracking array that was deployed to record the presence of acoustically-tagged fishes. The array consists of 50 permanently moored and widely spaced tracking stations divided equally between a “deep” and “shallow” stratum. Our core question is “Does the ‘community’ of detected fish (16 species) differ across depth strata and seasons?” Secondarily, “What habitat covariates help explain differences in the community?” We are not especially interested in year or station effects.
I’m working in PRIMER v7, which allows PERMANOVA models with both fixed factors and continuous covariates. If I considered a model without covariates, the design might be a repeated measures approach (since stations never move) with: Season = fixed (4 levels), Stratum = fixed (2 levels), Station = random (50 levels) nested within Stratum.
Things get more tricky when we consider adding covariates. Some covariates (e.g., distance from shore, seafloor slope, sediment type) are always linked to station and will not change through time, while others such as remotely-sensed water temperature and chlorophyll vary on much shorter timescales. One thought would be to include smaller time blocks as a random effect, maybe in one week or one month increments (so 321 or 60 blocks, respectively, over 6 years) and use mean temperature and chlorophyll values for each time block.
So my questions are:
1) Is it a PERMANOVA ‘felony’ to have some habitat covariate values that repeat many times while others do not?
2) Should Station even be a random effect when habitat covariates are tied to them?
We also considered using PERMANOVA just for fixed factor Season x Stratum x Interaction tests and the DISTLM routine for the continuous covariates but the same problem of static covariates due to repeated 'sampling' of the same stations would seem to remain.
We welcome any insights or criticisms on this approach
- Eric
Hello all,
This is a real dbRDA plot using real invertebrate abundance data (taxa-station matrix) with environmental data (substrate characteristics-station matrix) as predictor variables. The plot is produced in PRIMER v.7. Invertebrate data is 4th root transformed, Bray-Curtis similarity was used. Environmental data is normalized, Euclidean distance was used.
My question is: why is the vector overlay not centered at 0,0 in the plot? Interpreting this plot, one would conclude that every sampling station within the study area has values below the mean for predictor variables 2 and 13, which is impossible. Why would the center of the vector overlay be displaced -40 units? How can this be? Why is the plot centered on the dbRDA2 axis but not the dbRDA1 axis?
Please let me know if anyone needs more information. Thank you!

I have a dataset with 96 individuals and 1496 transcripts of interest. I have analyzed the experiment by multiway ANOVA in the Permanova platform of Primer-E. I am satisfied that I have figured out how to do this and test my experimental hypotheses. However, it occurs to me that I can also run some kind of cluster analysis to identify and examine clusters of similar individuals. What might be the best Primer-E method to accomplish this?
I have a large dataset of fish abundance as well as some environmental variables covering around 795 sampling sites. I have tried to find the relationship between my environmental variables and the biological data with the RELATE routine in PRIMER-E. Using Spearman rank correlation, the sample statistic (Rho) is 0.11. The significance level of the sample statistic is 0.1% (much less than 5%), so according to the manual this is a significant result. I used 999 permutations to get this result. I am unable to interpret it, because I would usually expect that if the p value is significant, the corresponding degree of association should also be high. So I would expect the sample statistic to be much higher than 0.11 (above 0.7 or so). The situation is similar with the DISTLM procedure: the p values suggest that each of the variables has a significant effect in the model, but the overall R^2 of the fit is only 0.13. How is it possible that, with such a poor R^2, the individual variables are all significant?
I have used the square root transformation and Bray-Curtis similarity on the biological data, and I have normalized the environmental variables and used Euclidean distance. I haven't transformed the environmental variables.
I would really appreciate it if someone can help me to interpret these results.
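One way to see why a weak association can still be highly significant is to simulate it. The sketch below is not the RELATE routine (which correlates two resemblance matrices); it is a simplified analogue using two plain vectors, with made-up sample size, effect size and seeds. It computes a Spearman correlation and a one-sided permutation p value from 999 permutations. Note that with 999 permutations the smallest p value the test can report is 1/1000 = 0.1%.

```python
import random


def ranks(values):
    """Return ranks 1..n for a list of values (assumes no ties)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = float(rank)
    return out


def spearman_permutation_test(x, y, n_perm=999, seed=1):
    """Spearman rho and a one-sided permutation p value."""
    rng = random.Random(seed)
    n = len(x)
    mean_rank = (n + 1) / 2
    cx = [r - mean_rank for r in ranks(x)]
    cy = [r - mean_rank for r in ranks(y)]
    # Without ties both rank vectors have the same sum of squares,
    # so the usual sqrt(sxx * syy) denominator reduces to sxx.
    denom = sum(a * a for a in cx)
    observed = sum(a * b for a, b in zip(cx, cy)) / denom
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(cy)  # break any real association between x and y
        if sum(a * b for a, b in zip(cx, cy)) / denom >= observed:
            exceed += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (exceed + 1) / (n_perm + 1)


# Hypothetical data: many samples, weak but real association.
data_rng = random.Random(0)
n = 2000
x = [data_rng.gauss(0, 1) for _ in range(n)]
y = [0.15 * a + data_rng.gauss(0, 1) for a in x]

rho, p_value = spearman_permutation_test(x, y)
print(f"rho = {rho:.2f}, p = {p_value:.3f}")
```

The p value only says that the association is distinguishable from zero; Rho (or R^2 in DISTLM) says how strong it is. With hundreds of samples, even a small Rho is very unlikely to arise by chance, so both statements can hold at once.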
Hi everyone,
I have an experiment investigating the effect of sediment condition (factor 1 - 2 levels) and the benthic macroinvertebrate community (factor 2 - 2 levels [faunated vs defaunated]) on ecosystem fluxes.
Some animals remained after defaunation, however, so there is a potentially confounding variable in the experiment. Adding a covariate (unremoved biomass of invertebrates) to PERMANOVA could help to account for this and avoid making a Type I error.
There are two opposing thoughts coming out:
1. The covariate should only be added if it has a significant effect
2. The covariate should be added regardless of significance, to account for its effect
Two similar questions online say they should be included regardless of significance:
So, should the addition of a covariate representing a potentially confounding effect be based on its significance? Your thoughts and reasoning would be greatly appreciated.
All the best,
Sorcha
I would like to know whether the BIOENV routine in PRIMER could be used to identify a subset of biological data (gut content items in my case) best explaining environmental data (biomarkers and condition indices in my case).
Thanks
I analysed some of my data from the Caspian Sea basin. These data come from macrobenthic communities on hard substrate. I have attached the results. In the attached file, Time (1, 2, 3 and 4) represents season and Site (1 to 8) represents sampling sites.
Can anyone help me understand the results?
Thank you
I have a data set comparing the accumulated biomass on two types of substrate. The number of samples differs between the substrates, and both are small (n<10).
What would be the most appropriate test to show significant differences between the two substrates?
I've tried PERMANOVA in PRIMER v7 on a Euclidean distance resemblance matrix, but it seems a bit too much for such a small sample size.
Suggestions anyone?
Hello,
I've done a field study investigating the community composition (18S and 16S) along a contamination gradient. I'm creating graphs of my environmental variables and community composition using PCA and DistLM in PERMANOVA in PRIMER 7. The labels for the variables are overlapping making it hard to read them. Is there a way to make them spread out more? If you have a way to solve this it would be greatly appreciated.
Kind regards
Megan
Hi everyone,
I'm looking for a little help in the analysis of my qPCR raw results. I hope you'll be able to help me. Thanks in advance.
Here is my example:
I have 3 genes: 1 target gene, and 2 housekeeping genes.
I have estimated the PCR efficiency for these genes and their associated primers: E(target gene) = 2.1, E(HK gene 1) = 1.9, E(HK gene 2) = 2.2
I have Ct raw results for 2 technical replicates each time
I have a Treatment condition, and a Control condition, for all of the 3 genes (target gene and the 2 housekeeping genes)
Finally I have these data for 3 biological replicates
(see the attached Excel document).
From these data I have calculated the normalized expression of my target gene (using the control condition as a calibrator, and the housekeeping genes for final normalization). All the calculations are in the Excel document.
In the end I have a mean of the normalized expression for my target gene, which is great. However, I don't yet know how to obtain the standard error for this mean. I can calculate the standard error from the 3 normalized expressions from my 3 biological replicates (SE of 1.39, 1.55 and 1.26, which is easy in R or Excel), but from what I have read online (and have not yet understood), there is apparently a way to calculate the SE from the different SEs obtained along the calculation process (SE from the mean of the technical replicate Ct values, then SE from the relative quantity calculation, and finally SE from the normalization process).
Is anyone familiar with this who could provide a little help, or at least look at the attached document to confirm that my calculations are correct? From there you will probably be able to point me towards the correct calculation for my SE value.
Thanks a lot in advance
Regards
Marc
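The step-by-step SE described above is usually obtained by first-order (delta method) error propagation. The sketch below is only an illustration under stated assumptions: the errors at each step are treated as independent, the relative quantity is taken as E ** dCt with dCt = Ct(control) - Ct(treatment), and the two housekeeping genes are combined by a geometric mean. The function names and all dCt and SE values are hypothetical; only the three efficiencies come from the question.

```python
from math import log, sqrt


def relative_quantity(efficiency, delta_ct, se_delta_ct):
    """RQ = E ** dCt, with first-order SE(RQ) = RQ * ln(E) * SE(dCt)."""
    rq = efficiency ** delta_ct
    return rq, rq * log(efficiency) * se_delta_ct


def geometric_mean_of_two(a, se_a, b, se_b):
    """Geometric mean of two quantities; relative errors add in quadrature, halved."""
    gm = sqrt(a * b)
    rel = 0.5 * sqrt((se_a / a) ** 2 + (se_b / b) ** 2)
    return gm, gm * rel


def ratio(num, se_num, den, se_den):
    """Ratio of two quantities; relative errors add in quadrature."""
    r = num / den
    rel = sqrt((se_num / num) ** 2 + (se_den / den) ** 2)
    return r, r * rel


# Hypothetical dCt values and SEs for one biological replicate.
target = relative_quantity(2.1, 2.0, 0.10)
hk1 = relative_quantity(1.9, 0.5, 0.10)
hk2 = relative_quantity(2.2, 0.4, 0.10)

norm_factor = geometric_mean_of_two(*hk1, *hk2)
normalized, se_normalized = ratio(*target, *norm_factor)
print(f"normalized expression = {normalized:.2f} +/- {se_normalized:.2f}")
```

The SE of each dCt would itself come from the technical replicates, e.g. sqrt(SE_control**2 + SE_treatment**2). This propagated SE describes measurement error within one biological replicate; the SE computed across the three biological replicates describes biological variability, so the two are not interchangeable.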
Hello,
I am using ANOSIM in PRIMER-E for my statistics. I need a few clarifications about the pairwise tests in ANOSIM.
If I am correct, the closer R is to 1, the more dissimilar the two groups are, and they differ significantly when the P value is also less than 0.05 (P = 4.8%).
But for some of my other samples R = 1 and P = 16.7% (0.167). How do I interpret this? In general, a difference is considered significant when P < 0.05, but here R = 1 and the P value is 0.167. Can someone clarify how to interpret this?
Thanks
Venkat
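A common reason for R = 1 together with a large P is that a pairwise test with very few replicates has only a handful of distinct relabelings, which puts a floor under the P value. The sketch below counts those relabelings for a simple one-way pairwise comparison; the group sizes are hypothetical, since the question does not state them, and the function names are my own.

```python
from math import comb


def distinct_relabelings(n1, n2):
    """Number of distinct ways to split n1 + n2 samples into two groups."""
    total = comb(n1 + n2, n1)
    # With equal group sizes, swapping the two labels gives the same split.
    return total // 2 if n1 == n2 else total


def smallest_p_percent(n1, n2):
    """Smallest P (in %) the test can return: the observed split is 1 of them."""
    return 100.0 / distinct_relabelings(n1, n2)


for sizes in [(2, 2), (3, 3), (1, 5), (4, 4), (5, 5)]:
    print(sizes, distinct_relabelings(*sizes), f"{smallest_p_percent(*sizes):.1f}%")
```

A P of 16.7% is 1 in 6, which is what a test returns when only 6 relabelings exist and the observed one gives the largest R. In that situation R = 1 still indicates complete separation of the groups, but the test cannot reach 5% no matter how strong the difference is, so R is the more informative number.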
How can I select variables for PCA from a huge set of environmental data?
While I can easily assign factors to my samples to compare within sites or days of sampling, I got curious about "Edit-> Indicators". It seems that you may assign traits to the species in your dataset.
Does anyone here have experience with Indicators and is able to explain how to use them?
Thank you in advance for answers!
I have 4 experimental treatments. From each treatment I have area measurements of bryozoan colonies over time (n>60 per treatment). Each colony has a unique identifier, thus can be traced through time and given a growth rate at any particular interval. If I plot the successive areas on a line plot, the rates can also be interpreted from the slopes of the graph. I am interested to know if there is a significant effect of treatment on growth rate.
Is it appropriate to use ANOSIM (performed in PRIMER) to compare these line plots? Is it actually comparing the rates, or just the difference between the areas? E.g. if a colony was small at the start, it may be growing at the same rate, but always be smaller than a colony that was larger at the start. Thus the areas would be different but the growth rates would not.
If I am interested in growth rate rather than size, is it appropriate to use ANOSIM to compare the growth rates over time, rather than the areas? Presumably this would be looking at the change in rate over time, rather than the change in size.
Thanks for any advice!
Gail
I am looking at presence and absence of algae species above and below a dam. The MDS is perfect with a stress of 0.11. When I did the ANOSIM I expected great results, but the Global R is 0.029 and the significance is 36.4%. Why has this happened, and can I do anything about it?
How do I discuss this result?
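Stress and R answer different questions: stress measures how faithfully the 2-D map preserves the rank order of the dissimilarities, while the ANOSIM R compares between-group and within-group rank dissimilarities. A minimal sketch of the R statistic, R = (mean between-group rank - mean within-group rank) / (M / 2) with M = n(n - 1)/2 pairs, is given below. It is not PRIMER's code, and the four-sample distance matrix is invented for illustration.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """ANOSIM R from a square dissimilarity matrix and one label per sample."""
    pairs = list(combinations(range(len(groups)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)


# Invented example: four samples at positions 0, 1, 10 and 11 on a line.
positions = [0, 1, 10, 11]
dist = [[abs(a - b) for b in positions] for a in positions]

r_separated = anosim_r(dist, ["above", "above", "below", "below"])
r_interleaved = anosim_r(dist, ["above", "below", "above", "below"])
print(r_separated, r_interleaved)
```

The same dissimilarities (and therefore the same ordination and the same stress) give very different R values depending on how the labels fall on the samples. A Global R near 0 with a low-stress MDS simply means the plot is a reliable picture of samples whose above-dam and below-dam groups overlap.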
Hi,
I have a data set of three variables recorded as presence/absence (i.e. 8 possible combinations) across 1200 stations, and I would like to run a DISTLM model in PERMANOVA (PRIMER), but I am not sure whether such a model can be run on presence/absence data. Thanks for your help!
The PRIMER-e help advises doing it when there are a large number of variables, but I don't know exactly what counts as a large number.
I've attached the draftsman plot that I obtained.
I'm using 4 variables: Temperature, Dissolved oxygen, Current speed and Depth ('Profundidad' in the attached figure).
I see no need to transform them, but I just want to be sure whether I'm correct or not.
Thanks and cheers

Or are ECVs just an index of how important each term in the model is at explaining the overall variation? In the PERMANOVA output the reported ECVs are calculated as the square root of (the term's mean square - residual mean square)/n. However, in a lot of papers the ECV is reported as the percentage of variation explained by each term.
Thanks
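The two ways of reporting can be related with a small calculation. The sketch below uses the formula quoted above, (MS_term - MS_residual) / n, for each term's estimate, takes the residual estimate to be the residual mean square, and then expresses each estimate as a percentage of the sum of all estimates. That percentage convention is an assumption about what the papers in question did (some instead use the square-root values), clamping negative estimates to zero is also an assumption, and the mean squares and divisors are invented numbers.

```python
from math import sqrt


def components_of_variation(mean_squares, ms_residual, divisors):
    """Estimates, their square roots, and each as a percentage of the total.

    mean_squares: {term: MS}; divisors: {term: n in (MS - MS_res) / n}.
    """
    estimates = {
        # Assumption: negative estimates are clamped to zero.
        term: max((ms - ms_residual) / divisors[term], 0.0)
        for term, ms in mean_squares.items()
    }
    estimates["Residual"] = ms_residual
    total = sum(estimates.values())
    return {
        term: {"estimate": est, "sqrt": sqrt(est), "percent": 100.0 * est / total}
        for term, est in estimates.items()
    }


# Invented PERMANOVA table values.
table = components_of_variation(
    mean_squares={"Site": 5000.0, "Season": 2200.0},
    ms_residual=1000.0,
    divisors={"Site": 10, "Season": 12},
)
for term, row in table.items():
    print(f"{term:9s} estimate={row['estimate']:7.1f} "
          f"sqrt={row['sqrt']:5.1f} percent={row['percent']:5.1f}")
```

The square-root column is in the units of the resemblance measure, which is why the PERMANOVA output reports it; the percentage column only ranks the terms relative to one another within the same model.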
I used PRIMER-E software to perform ANOSIM and SIMPER analyses, but when writing the discussion section I had problems interpreting the results. Could you please point me to an appropriate reference where the basics of these analyses are described?
Thanks in advance.