Article

The Effect of Species Number and Type of Data on the Resemblance Structure of a Phytosociological Collection

Authors:
  • Western University, London, Canada

Abstract

Methods are described which can be used in the course of a pilot survey aimed at the evaluation of the relative importance of given species and given types of data as potential descriptors of a given plant community. It is suggested that on the basis of these methods the sampling effort may considerably be reduced while the loss of information is minimized. Numerical examples are included utilizing data from a deciduous forest community.

... Rare species frequently have been deleted by setting relative abundance or occurrence frequency criteria (Gauch 1982) or, implicitly, by using small sample sizes (Cao et al. 1998), with the typical argument that rare species contribute little to interpretive value, but add noise to the statistical solution (Gauch 1982). Much of the evidence to support this argument was documented during the 1960s and 1970s (Webb et al. 1967, Austin and Greig-Smith 1968, Day et al. 1971, Orloci and Mukkattu 1973, Callahan et al. 1979). ...
... Callahan et al. (1979) applied polar ordination to a river nematode dataset, and noted that the 6 most abundant species recovered the classification based on all 154 species. Orloci and Mukkattu (1973) attempted to quantify the effects of number of plant species included in PCA ordination. They defined a subset that included n species from a total of N (77) and ranked those species based on their covariance. ...
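The idea of checking how well a ranked subset of n species reproduces the structure obtained from all N can be sketched as follows. This is only an illustration under stated assumptions, not the procedure of Orloci and Mukkattu (1973): species are ranked by their sample variance as a stand-in for the covariance-based ranking mentioned above, resemblance is taken to be Euclidean distance between quadrats, and agreement is measured by the Pearson correlation between the two sets of distances. All function names are hypothetical.

```python
import math
from itertools import combinations


def column_variances(data):
    """Sample variance of each species (column) across quadrats (rows)."""
    n_rows = len(data)
    n_cols = len(data[0])
    out = []
    for j in range(n_cols):
        col = [row[j] for row in data]
        mean = sum(col) / n_rows
        out.append(sum((v - mean) ** 2 for v in col) / (n_rows - 1))
    return out


def pairwise_distances(data, cols):
    """Euclidean distance between every pair of quadrats, using only `cols`."""
    return [
        math.sqrt(sum((a[j] - b[j]) ** 2 for j in cols))
        for a, b in combinations(data, 2)
    ]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("distances are constant; correlation is undefined")
    return sxy / math.sqrt(sxx * syy)


def subset_agreement(data, n):
    """Rank species by variance (an assumed criterion), keep the top n,
    and correlate the reduced distances with the full-set distances."""
    var = column_variances(data)
    ranked = sorted(range(len(var)), key=lambda j: var[j], reverse=True)
    top = ranked[:n]
    full = pairwise_distances(data, range(len(var)))
    reduced = pairwise_distances(data, top)
    return top, pearson(full, reduced)
```

Calling `subset_agreement(data, n)` for increasing n gives a profile of agreement against species number; a correlation that is already close to 1 for small n would suggest that the remaining species add little to the resemblance structure.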
Article
BRIDGES is a recurring feature of J-NABS intended to provide a forum for the interchange of ideas and information between basic and applied researchers in benthic science. Articles in this series will focus on topical research areas and linkages between basic and applied aspects of research, monitoring policy, and education. Readers with ideas for topics should contact Associate Editors, Nick Aumen and Marty Gurtz. Multivariate analyses are used commonly in bioassessment studies examining the degree of human impact on aquatic ecosystems. However, these analyses may have shortcomings with respect to how well they address the presence or absence of rare species. Researchers may delete rare species explicitly, or ignore them implicitly by the use of small sample sizes. The motivation for exclusion of rare species may be related to sampling or analytical resource limitations. The authors provide an overview of the importance of rare species, the sensitivity of the newer multivariate techniques to rare species, and the need for careful evaluation of the potential influences of the inclusion or exclusion of rare species from analyses in light of each study’s objectives and spatial scale. Nick Aumen (nick_aumen@nps.gov) and Marty Gurtz (megurtz@usgs.gov), Co-editors
... Having the optimal character order, profiles of the function are drawn (as in Orlóci and Mukkattu 1973). Probabilities of the function can be obtained by means of randomization (see Section 3.4). ...
... Therefore, at each level $i$, $m-i+2$ values of $\rho(S_i; S_{i-1})$ are compared. A similar method to rank species is mentioned in Orlóci and Mukkattu (1973), but it was considered computationally impractical at that time because of the large number of species involved. ...
... This argument is referred to as the "statistical argument" for excluding rare species. Support for this argument has come from findings that results from multivariate methods could be driven by the inclusion of rare species alone (Webb et al. 1967, Austin and Greig-Smith 1968, Day et al. 1971, Orloci and Mukkattu 1973). To some degree this argument has been examined in the literature with analyses of data standardizations, similarity coefficients, ordination method (Marchant 1990), or their combinations (e.g. ...
... Prior to analysis of community structure, raw taxa data were collected into separate taxa matrices for phytoplankton and periphyton, each divided into diatoms and non-diatom algae (NDA). Several community analysis methodologies (including those that are used in this work) are distorted by rare taxa (Austin and Greig-Smith, 1968; Orloci and Mukkattu, 1973; Goff, 1975), which are typically difficult to quantify accurately. To down-weight the effect of transient population spikes and zero-inflation in community analysis, taxa not comprising more than 0.5% of total abundance in two or more samples were removed, and data were Hellinger transformed (Rao, 1995; Legendre, 2012). ...
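The filtering and transformation steps described in this excerpt can be sketched as follows. The sketch assumes that "0.5% of total abundance" means a taxon's share of each sample's own total, which the excerpt does not state explicitly; the Hellinger transformation is taken in its usual form, the square root of each taxon's relative abundance within a sample. Function and parameter names are hypothetical.

```python
import math


def filter_rare_taxa(samples, min_share=0.005, min_samples=2):
    """Keep taxa whose share of a sample's total abundance exceeds
    `min_share` in at least `min_samples` samples.

    `samples` is a list of rows (one per sample), each a list of
    abundances with one entry per taxon. The per-sample reading of
    the 0.5% rule is an assumption.
    """
    n_taxa = len(samples[0])
    hits = [0] * n_taxa
    for row in samples:
        total = sum(row)
        if total == 0:
            continue  # an empty sample cannot support any taxon
        for j, count in enumerate(row):
            if count / total > min_share:
                hits[j] += 1
    keep = [j for j in range(n_taxa) if hits[j] >= min_samples]
    return keep, [[row[j] for j in keep] for row in samples]


def hellinger(samples):
    """Hellinger transformation: square root of within-sample relative abundance."""
    out = []
    for row in samples:
        total = sum(row)
        out.append([math.sqrt(c / total) if total > 0 else 0.0 for c in row])
    return out
```

Applying `hellinger` to the output of `filter_rare_taxa` yields rows whose squared values sum to 1 for every non-empty sample, which is why the transformation damps the influence of very abundant taxa relative to raw counts.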
Article
Potential effects of pesticides on stream algae occur alongside complex environmental influences; in situ studies examining these effects together are few, and have not typically controlled for collinearity of variables. We monitored the dynamics of periphyton, phytoplankton, and environmental factors including atrazine, and other water chemistry variables at 6 agricultural streams in the Midwest US from spring to summer of 2011 and 2012, and used variation partitioning of community models to determine the community inertia that is explained uniquely and/or jointly by atrazine and other environmental factors or groups of factors. Periphyton and phytoplankton assemblages were significantly structured by year, day of year, and site, and exhibited dynamic synchrony both between site–years and between periphyton and phytoplankton in the same site–year. The majority of inertia in the models (55.4% for periphyton, 68.4% for phytoplankton) was unexplained. The explained inertia in the models was predominantly shared (confounded) between variables and variable groups (13.3, 30.9%); the magnitude of inertia that was explained uniquely by variable groups (15.1, 18.3%) was of the order hydroclimate > chemistry > geography > atrazine for periphyton, and chemistry > hydroclimate > geography > atrazine for phytoplankton. The variables most influential to the assemblage structure included flow and velocity variables, and time since pulses above certain thresholds of nitrate + nitrite, total phosphorus, total suspended solids, and atrazine. Time since a ≥ 30 μg/L atrazine pulse uniquely explained more inertia than time since pulses ≥ 10 μg/L or daily or historic atrazine concentrations; this result is consistent with studies concluding that the effects of atrazine on algae typically only occur at ≥ 30 μg/L and are recovered from.
... It has often been concluded that canopy-species are the most consistent floristic consideration for the ecological classification of tropical forest (Webb et al. 1967, 1970; Austin & Greig-Smith 1968; Greig-Smith 1971; Orloci & Mukkattu 1973; Knight 1975; Newbery et al. 1986). Here again the largest size-class best conforms to a perceived typology. ...
Article
In 1947, W. J. Eggeling published an account of forest succession at Budongo, Uganda. This interpretation was based on a large-scale comparative plot study, performed in the 1930s and 1940s. This account, with its implication that species richness declines in late succession, endures as a controversial corner-stone in theories and disputes about community diversity. Data have now been collected over six decades from five of Eggeling's original plots. This paper evaluates Eggeling's successional interpretation of the Budongo vegetation. The first set of analyses assesses the consistency of the original data with the predictions of compositional progression and convergence implicit in Eggeling's model. The second analyses do the same for the time-series observations. A logical approach shows how temporal information may be derived from both between plot, and within plot, evaluations using size-structured data. A Detrended-Correspondence-Analysis (DCA) of canopy-tree composition, from the original data, ranks the plots in perfect correspondence to Eggeling's successional sequence. A ‘development-scoring’ procedure is developed using passive-ordination against this sequence; this is then applied to composition by plot and stem-size class. Eggeling's original data are consistent with each prediction assessed. The analyses show compositional progression and apparent convergence across the plot series, and also progression and convergence within each plot. A monodominant-Cynometra forest is the natural end-point of this progression. The time-series results, though in apparent agreement for one early successional plot, do not generally accord with Eggeling's ideas. The analyses illustrate a general means for evaluating explicit and implicit compositional trends in communities with structured populations.
Chapter
Historically, ordination techniques have had a number of sources (Whittaker 1967), from early work in direct gradient analysis (articles 2 and 4) and the use of similarity measurements in various schools (articles 5 and 6) to the development of Wisconsin polar ordination and its modifications (article 7) and the introduction of multivariate techniques from other fields (articles 8 and 9) to the testing of these techniques and development of new techniques appropriate to ecological data and research purposes (articles 10 and 11). The following article contains a brief summary of this history (11.2). The combinations of different ordination approaches, algorithms, similarity coefficients, and criteria for axis determination provide an uncountable number of possible techniques. Summarization of this range of techniques and evaluation of their usefulness for data from natural communities are needed. We consider here how ordination techniques are affected by the peculiarities of community data, how they can be evaluated, and what recommendations on their use can be offered.
Article
Binary discriminant analysis (BDA) is a readily understood, easily used technique for identifying binary variables and their common trends which are most important for discriminating between groups. For the plant ecologist, the technique can be used on species lists to reveal similar patterns of preference or avoidance among species responding significantly to a multistate environmental parameter such as soil type, rock type, or aspect. In Q-mode BDA, orthogonal canonical factors are obtained which represent uncorrelated floristic trends best separating the groups. Scores of species on the factors can be plotted in multidimensional hyperspace, showing how each species responds to the floristic trends. In R-mode BDA, groups of species with similar responses to the environmental parameter are identified. These groups may be interpreted as statistical associations or community components comprised of species with similar ecologies. An example using lists of woody species from the Maryland Piedmont and Coastal Plain sorted according to underlying rock type produces floristic trends which are easily interpreted and species groups which are readily understandable.
Article
(1) As part of a study of the manner in which Kloss gibbons use the forests available to them, data were collected in two study areas of Lowland Evergreen Rain forest on Siberut Island, Indonesia, from 2459 trees of at least 15-cm dbh along transects of quadrats the total area of which was 11.3 ha. The identity of each tree, its height and climber load, the abundance of Myrmecodia epiphytes, and the position of its crown within the neighbouring canopy were recorded. (2) These data were analysed using classification and ordination techniques and as a result nine forest types were derived. These could be interpreted ecologically. One was on dry level ground, one on wet level ground, one on peatswamp, two on major ridges, and four on minor ridges. These types were generally recognizable by indicator species, and that gibbons recognized them too was reflected in their behaviour. (3) Seven of the forest types, in twenty-two blocks, were in the 31-ha home range of the main study group of gibbons. The number of times that gibbons were seen feeding in a forest type was very highly correlated with the density of trees that were potential fruit sources for gibbons. (4) Female gibbons sang from those forest types with the trees that were tallest and above the break of slope. The gibbon group slept in those forest types in which the trees had a low frequency of the myrmecophilous epiphyte Myrmecodia tuberosa, presumably to avoid being bitten by ants. (5) Few meaningful correlations were found between forest type and specific activities. However, forest types such as peatswamp were under-used consistently, while others such as those on minor ridges were used disproportionately often for certain activities. This information allowed the conventional two-dimensional view of a home range to be extended to a four-dimensional view by including differential canopy use and changes in behaviour through the day.
Article
Although benthic infaunal communities are commonly measured to assess the effectiveness of environmental management in protecting biological resources, the tools used to interpret the resulting data are often subjective or site specific. We present an objective, quantitative index for application throughout the southern California coastal shelf environment that measures the condition of a benthic assemblage, with defined thresholds for levels of environmental disturbance. The index was calculated using a two-step process in which ordination analysis was employed to quantify a pollution gradient within a 717-sample calibration data set. The pollution tolerance of each species was determined based upon its distribution of abundance along the gradient. The index is calculated as the abundance-weighted average pollution tolerance of species in a sample. Thresholds were established for reference condition as well as for four levels of biological response. Reference condition was established as the index value in samples taken distant from areas of anthropogenic activity and for which no contaminants exceeded the effects range low (ERL) screening levels. The four response levels were established as the index values at which key community attributes were lost. Independent data sets were used to validate the index in three ways. First, index sensitivity to a spatial gradient of exposure to a discharge from a point source was tested. Second, index response to a temporal gradient of exposure to a discharge from a point source was examined, testing index robustness to natural temporal variation. Third, the effect of changes in natural habitat (e.g., substrate, depth, and latitude) on index sensitivity was tested by evaluating the ability of the index to segregate samples taken in areas with high and low chemical exposure, across a gradient of physical habitats.
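The final step described above, an abundance-weighted average of species pollution tolerances, can be sketched as follows. This is a minimal illustration of a weighted average only: the tolerance scores are assumed to come from the ordination step, the optional `weight` transform is a hypothetical parameter (the abstract does not say whether abundances are transformed before weighting), and species without a tolerance score are simply skipped.

```python
def weighted_tolerance_index(abundances, tolerances, weight=lambda a: a):
    """Abundance-weighted mean pollution tolerance for one sample.

    abundances: dict mapping species name -> abundance in the sample
    tolerances: dict mapping species name -> tolerance score
    weight:     transform applied to abundance before weighting
                (identity by default; an assumption, not from the source)
    """
    numerator = 0.0
    denominator = 0.0
    for species, abundance in abundances.items():
        if abundance <= 0 or species not in tolerances:
            continue  # absent or unscored species do not contribute
        w = weight(abundance)
        numerator += w * tolerances[species]
        denominator += w
    if denominator == 0:
        raise ValueError("no scored species present in the sample")
    return numerator / denominator
```

For example, a sample with three individuals of a species scored 10 and one individual of a species scored 50 gives (3 × 10 + 1 × 50) / 4 = 20, which would then be compared against the reference and response thresholds.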
Article
A procedure is described which provides an objective basis for selecting the number of individuals to sample at each location in studies of geographic variation within species. An example is given using cone data for Pinus contorta.
Article
The paper describes vegetation types in natural communities near London, Ontario. The methods include the use of sum-of-squares cluster analysis on data collected by systematic sampling to define vegetation types. Ranking followed by stress analysis indicates that substantial reduction of species number is possible with minimum information loss. Comparison of vegetation types by profile analysis using soil properties shows that vegetation types derived by cluster analysis may reliably indicate specific environmental conditions.
Article
On the use of characteristic specific combination to compare vegetational types. – A comparison between ordination and classification of phyto-sociological types using synthetic tables and characteristic specific combination of Raabe was made. The results of classification and ordination are very similar in both the cases, so that we can propose the characteristic specific combination of Raabe as a good method to limit the number of species when the vegetational types have to be classified.
Article
The method known as Analysis of Concentration (AOC) is proposed as a tool to measure the predictivity of binary data for cover data. The application of AOC to structured tables of oak forests of Central Italy has proved that binary data are more predictive for cover than cover for binary data. The ordinations produced by AOC with binary and cover data are very similar and interpretable with similar results.
Article
A graphical method of analyzing the spatial pattern of macrofungal species in beech forests is described. Species response curves are investigated both from the qualitative (presence/absence) and from the quantitative (log fruit-body number) variation of each species. The approach was applied to 49 macrofungal species, of which 34 were significantly related to certain pattern models. Macrofungi exhibit characteristic response curves along a one-dimensional mull/mor gradient. Also species with a relatively scarce occurrence have the potential to show significant patterns. Graphical tests of the qualitative and quantitative variation were used to provide a better description of the probability of a species to be present on certain sites.
Article
The methodology of comparing the results of multivariate community studies (resemblance matrices, ordinations, hierarchical and nonhierarchical classifications) is reviewed from two viewpoints: basic strategy and measure employed. The basic strategy is determined by 7 choices concerning the type of results, consensus methods or resemblance measures, hypothesis testing or exploratory analysis, lack or presence of reference basis, data set congruence or algorithmic effects, number of factors responsible for differences among results, and the number of properties considered in the comparison. Included is a brief summary of methods applicable to vegetation studies. Examples from a grassland survey demonstrate the utility of comparisons in evaluating the effects of plot size, data type, standardization, taxonomic level and number of species on classifications and ordinations.
Article
A detailed analysis of sample plots quantifying woody vegetation and environment in the Piedmont and Coastal Plain of Maryland, USA, shows (1) that sample sites must be selected according to some formal design, and (2) that binary (presence or absence) measurement of species serves equally well as continuous measurements of importance value or percent cover in identifying significant species-environment relationships. A comparison of data from a set of samples located at the discretion of the investigators with a set chosen according to a predetermined stratified random sampling plan shows that each set produces different results. The samples located at the investigators' discretion show a larger number of significant species-environment relationships, and in addition, the outcomes of many individual species-environment tests differ between the two sets of samples. Binary and continuous measurements were compared within the random stratified samples, and nearly all species showed similar environmental trends whether analyzed by continuous or binary techniques. The differences between outcomes of binary and continuous techniques are of the magnitude which can be expected from random variation. These results suggest that where an area containing great floristic variation is to be sampled to identify species-environment relationships, the best sample plan would involve many species taken according to a carefully devised random sampling plan.
Article
Numerical classification methods can simulate strategies of intuitive classifications. This paper considers two different intuitive syntaxonomic schemes suggested for stagnant eutrophic fresh-water communities with a view to identifying which among the commonest numerical methods of classification fits the two intuitive schemes best. Comparison of classifications using an information function and discriminant analysis revealed that the different numerical methods simulate different intuitive schemes, but the results of the numerical classifications are always judged superior. Two new syntaxonomic schemes optimizing the sharpness between the syntaxa are proposed.
Article
Non-centred principal components analyses followed by varimax rotation and ortho-oblique rotation are applied to 4 test data tables of known structure. These methods are also applied to a set of boreal forest understorey vegetation data. The effectiveness of varimax and ortho-oblique rotation techniques at generating unipolar components is considered and the ortho-oblique method with γ=0.5 appears optimal. Efficiency of cluster seeking, through the emergence of unipolar components, is assessed in terms of test data table recapture. The ortho-oblique method is superior at all levels of data structure considered. Stability of clusters at different levels of component rotation is examined. Beyond a level of fundamental structure, oblique rotation produces fortuitous unipolar components whereas varimax rotation is very conservative. The use of both rotation techniques is advocated as complementary aspects of nodal component analysis. The advantages of nodal component analysis in vegetation study are briefly discussed, especially with respect to large collections of stand data.
Article
Editing of community data matrices is complementary to analyzing data by multivariate techniques of classification and ordination in the overall task of data analysis. A computer program, DATAEDIT, is described that can perform numerous editing functions, including data transformation, deletion of certain species or samples, deletion of rare species, deletion of outliers, separation of disjunct sample groups, reordering of the species or samples of a data matrix, and the formation of composite samples or of sample subsets. DATAEDIT can use the information in a nonhierarchical or hierarchical classification, and includes its own internal routine for reciprocal averaging ordination.
Article
Different objective functions are reviewed with regard to their intrinsic properties and generic relationships as they relate to the measurement of phytosociological resemblance. These functions may take the form of a distance, probability, or information. Functions of these kinds are best conceived by the phytosociologist as abstractions which are meaningful only when placed within the bounds of a given sample space. In such a space the phytosociological objects, such as the individual stands of vegetation, are represented as points whose relative spatial placement is determined by the resemblance function. The spatial configuration of points, i.e., the manner of their placement relative to one another in sample space, is referred to as sample structure in the present paper. The first part of the paper includes a discussion of the sample space and sample structure, and it also deals with the concept of stochastic and deterministic resemblance functions. This is followed by the description of the different variants of distance, a probability-type coefficient, and several information theory functions. While the distance functions here represent metric divergences, which define the relative placement of objects in sample space, a probability-type coefficient, as a probability divergence, expresses the likelihood that given objects will be more dissimilar than others of the same collection. The information theory functions are directly related to probability in terms of two basically different definitions. The first of these is concerned with the information content of a single frequency distribution, and the second regards the information divergence which separates two or more frequency distributions. Species and quadrats as vectors, their condensed forms or expectations derived therefrom, are given as examples for such distributions.
In the two final parts important inequalities are presented, one for information and another for sum of squares, and several criteria are discussed which the phytosociologist may use as a basis to choose a particular function to suit the analysis which he intends to perform. The main text also includes numerical examples concerned with the computations.
Article
A form of data which occurs occasionally in sociology, and very commonly in ecology, is that in which the attributes, though measurable when they occur, do not run through the whole population, so that the data matrix contains many zeros. Such a population consists in reality of a number of more or less discrete sub-populations, each defined by some only of the attributes, and the application of traditional multivariate methods to such data encounters two main difficulties. First, even though the non-zero values of an attribute are normally distributed, the addition of zeros causes the mean/variance relationship of the whole attribute to approximate to that of a qualitative (0,1) distribution; if component analysis is used in an attempt to separate the sub-populations the normal $(x-\bar{x})/\sigma$ transformation then results in excessive weight being given to the absence of a common attribute or the presence of a rare one. Secondly, if factor analysis with communalities is attempted, factors after the first (since all factors unrealistically involve all attributes) cannot be interpreted. Data which are heterogeneous in this sense are better subdivided on a presence-or-absence basis; and a series of papers from this laboratory (refs 1-6) has explored one statistical technique for this purpose. However, if the data are not intrinsically entirely qualitative this involves discarding information, and there appears to exist no method of assessing the relative importance of the qualitative and quantitative elements of a given set of data. This communication outlines such a method.
Article
Each individual of a multivariate sample may be represented by a point in a multidimensional Euclidean space. Cluster analysis attempts to group these points into disjoint sets which it is hoped will correspond to marked features of the sample. Different methods of cluster analysis of the same sample may assume different geometrical distributions of the points or may employ different clustering criteria or may differ in both respects. Three superficially different methods of cluster analysis are examined. It is shown that the clustering criteria of all these methods, and several new ones derived from or suggested by these methods, can be interpreted in terms of the distances between the centroids of the clusters; the geometrical point distribution is found in most instances. The methods are compared, suggestions made for their improvement, and some of their properties are established.
Gleason, H. A. & Cronquist, A. (1963). Manual of Vascular Plants of Northeastern United States and Adjacent Canada. D. Van Nostrand, Toronto.
Gower, J. C. (1966). Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika, 53, 325-38.