Permutation Statistical Methods
Abstract
This research monograph provides a synthesis of a number of statistical tests and measures, which, at first consideration, appear disjoint and unrelated. Numerous comparisons of permutation and classical statistical methods are presented, and the two methods are compared via probability values and, where appropriate, measures of effect size.
Permutation statistical methods, compared to classical statistical methods, do not rely on theoretical distributions, avoid the usual assumptions of normality and homogeneity of variance, and depend only on the data at hand. This text takes a unique approach to explaining statistics by integrating a large variety of statistical methods, and establishing the rigor of a topic that to many may seem to be a nascent field in statistics. This topic is new in that it took modern computing power to make permutation methods available to people working in the mainstream of research.
This research monograph addresses a statistically informed audience, and can also easily serve as a textbook in a graduate course in departments such as statistics, psychology, or biology. In particular, the audience for the book is teachers of statistics, practicing statisticians, applied statisticians, and quantitative graduate students in fields such as psychology, medical research, epidemiology, public health, and biology.
Chapters (10)
Chapter 2 introduces a generalized Minkowski distance function that is the basis for a set of multi-response permutation procedures for univariate and multivariate completely randomized data. Multi-response permutation procedures constitute a class of permutation methods for one or more response measurements that are designed to distinguish possible differences among two or more groups. The multi-response permutation procedures provide a synthesizing foundation for a variety of statistical tests and measures developed in successive chapters.
Chapter 3 utilizes the Multi-Response Permutation Procedures (MRPP) presented in Chap. 2 to develop the relationships between the test statistics of MRPP, δ and \(\mathfrak{R}\), and selected conventional tests and measures designed for the analysis of completely randomized data at the interval level of measurement. The structure of the MRPP test statistic, δ, depends on the choice of v in the generalized Minkowski distance function and the treatment-group weights, \(C_i\), i = 1, …, g. Four tests are examined in this chapter: (1) Student’s two-sample t test with interval-level univariate response measurements, (2) Hotelling’s two-sample \(T^2\) test with multivariate interval-level response measurements, (3) one-way fixed-effects analysis of variance (ANOVA) with interval-level univariate response measurements, and (4) one-way multivariate analysis of variance (MANOVA) with interval-level multivariate response measurements.
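The MRPP statistic described in these chapters can be sketched concretely. The following is a minimal illustration, not the book’s implementation: observations are tuples (vectors), the distance is the generalized Minkowski form with exponents p and v, the treatment-group weights C_i = n_i/N are one common choice, and the function names are my own:

```python
import itertools
import random

def minkowski_dist(x, y, p=2.0, v=1.0):
    # generalized Minkowski distance: (sum_j |x_j - y_j|**p) ** (v/p)
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (v / p)

def mrpp_delta(groups, p=2.0, v=1.0):
    # delta: weighted mean of the within-group average pairwise distances,
    # here with the common weight choice C_i = n_i / N
    total_n = sum(len(g) for g in groups)
    delta = 0.0
    for g in groups:
        pairs = list(itertools.combinations(g, 2))
        xi = sum(minkowski_dist(a, b, p, v) for a, b in pairs) / len(pairs)
        delta += (len(g) / total_n) * xi
    return delta

def mrpp_pvalue(groups, p=2.0, v=1.0, n_perm=5000, seed=0):
    # Monte Carlo permutation test: repeatedly reallocate the pooled
    # observations to groups of the original sizes; a small observed delta
    # (tight groups) is evidence against the null of no group difference
    rng = random.Random(seed)
    pooled = [obs for g in groups for obs in g]
    sizes = [len(g) for g in groups]
    observed = mrpp_delta(groups, p, v)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        regrouped, start = [], 0
        for n in sizes:
            regrouped.append(pooled[start:start + n])
            start += n
        if mrpp_delta(regrouped, p, v) <= observed:
            hits += 1
    return hits / n_perm
```

With p = 2 and v = 1 the distance is ordinary Euclidean distance; univariate data are handled by wrapping each value in a one-element tuple.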
Chapter 4 continues Chap. 3, utilizing the multi-response permutation procedures developed in Chap. 2 for analyzing completely randomized data at the interval level of measurement. In Chap. 4, multi-response permutation procedures are used to analyze regression residuals generated by ordinary least squares (OLS) and least absolute deviation (LAD) regression models. Experimental designs presented and analyzed in Chap. 4 include one-way randomized, one-way randomized with a covariate, one-way randomized-block, two-way randomized-block, two-way factorial, Latin square, split-plot, and two-factor nested designs.
Chapter 5 utilizes the multi-response permutation procedures (MRPP) developed in Chap. 2 for analyzing completely randomized data at the ordinal level of measurement. The structure of the MRPP test statistic, δ, depends on the choice of v in the generalized Minkowski distance function. A variety of tests are described in this chapter, including the Wilcoxon two-sample rank-sum test, the Kruskal–Wallis multiple-sample rank-sum test, the Ansari–Bradley rank-sum test for dispersion, the Taha sum-of-squared-ranks test, the Mood rank-sum test for dispersion, the Brown–Mood median test, the Mielke power-of-rank function tests, the Whitfield two-sample rank-sum test, and the Cureton rank-biserial test.
Chapter 6 utilizes the Multi-Response Permutation Procedures (MRPP) developed in Chap. 2 to establish relationships between the test statistics of MRPP, δ and \(\mathfrak{R}\), and multivariate generalizations of selected conventional tests and measures designed for the analysis of completely randomized data at the ordinal level of measurement. Considered in this chapter are multivariate extensions of the Wilcoxon two-sample rank-sum test, the Kruskal–Wallis multi-sample rank-sum test, the Ansari–Bradley rank-sum test for dispersion, the Taha sum-of-squared-ranks test, the Mood rank-sum test for dispersion, the Brown–Mood median test, the Mielke \(A_{Ns}\), \(B_{Ns}\), and \(C_{Ns}\) power-of-rank function tests, the Whitfield two-sample rank-sum test, and the Cureton rank-biserial test.
Chapter 7 utilizes the Multi-Response Permutation Procedures (MRPP) developed in Chap. 2 for analyzing completely randomized data at the nominal (categorical) level of measurement. The structure of the MRPP test statistic, δ, depends on the choice of v in the generalized Minkowski distance function. A variety of tests are described in this chapter, including Goodman and Kruskal’s \(t_a\) and \(t_b\) asymmetric measures of nominal association, Light and Margolin’s categorical analysis of variance, and tests to analyze multiple binary choices.
Chapter 8 utilizes a generalized Minkowski distance function as the basis for a set of multivariate block permutation procedures for univariate and multivariate randomized-block data. Multivariate block permutation procedures constitute a class of permutation methods for one or more response measurements in each block that are designed to distinguish possible differences among two or more treatments. The multivariate block permutation procedures provide a synthesizing foundation for a variety of statistical tests and measures developed in successive chapters.
Chapter 9 utilizes the Multivariate Randomized Block Permutation (MRBP) procedures developed in Chap. 8 for analyzing randomized-block data at the interval level of measurement. The structure of the MRBP test statistic, δ, depends on the choice of v in the generalized Minkowski distance function. Four tests are examined in this chapter: (1) Student’s matched-pairs t test with interval-level univariate response measurements, (2) Hotelling’s matched-pairs \(T^2\) test with multivariate interval-level response measurements, (3) one-way randomized-block analysis of variance with interval-level univariate response measurements, and (4) one-way randomized-block analysis of variance with interval-level multivariate response measurements.
Chapter 10 utilizes the multivariate randomized-block permutation procedures (MRBP) developed in Chap. 8 for analyzing randomized-block data at the ordinal level of measurement. The structure of the MRBP test statistic, δ, depends on the choice of v in the generalized Minkowski distance function. A variety of tests are described in this chapter, including the Wilcoxon signed-rank test for matched pairs, the sign test, Spearman’s rank-order correlation coefficient and footrule measure, the Kruskal–Wallis analysis of variance for ranks, Kendall’s coefficient of concordance, Cohen’s weighted kappa measure of agreement, Kendall’s \(\tau_a\) and \(\tau_b\) measures of ordinal association, Stuart’s \(\tau_c\) statistic, Goodman and Kruskal’s γ measure of ordinal association, and Somers’ asymmetric measures of ordinal association.
Chapter 11 utilizes the multivariate randomized-block permutation procedures (MRBP) developed in Chap. 8 for analyzing randomized-block data at the nominal level of measurement. The structure of the MRBP test statistic, δ, depends on the choice of v in the generalized Minkowski distance function. A variety of tests are described in this chapter, including Cohen’s unweighted kappa measure of chance-corrected agreement, McNemar’s test for change, Cochran’s Q test for change, Yule’s Q measure, the odds ratio, Somers’ \(d_{xy}\) and \(d_{yx}\) asymmetric measures of association, Pearson’s product-moment correlation coefficient, percentage differences, and chi-squared.
... It is important to note that any results here are applicable only to these two simple designs. Many note that random allocation and random sampling lead to two fundamentally different modes of inference, given different names by different authors: experimental versus sampling inference (Kempthorne, 1979), randomization versus population inference (Ludbrook, 1995), finite sample versus super-population inference (Imbens and Rubin, 2015), and permutation versus population model (Berry et al., 2014). The former, stemming from random allocation and addressing only the sample at hand, originated with Fisher (1935, 1936). ...
... Philip Good has been reported as distinguishing testing via resampling and reallocating by the hypotheses themselves, noting that reallocating tests “hypotheses concerning distributions,” while resampling tests “hypotheses concerning parameters” (Berry et al., 2014, p. 7). ...
Simulation-based inference plays a major role in modern statistics, and often employs either reallocating (as in a randomization test) or resampling (as in bootstrapping). Reallocating mimics random allocation to treatment groups, while resampling mimics random sampling from a larger population; does it matter whether the simulation method matches the data collection method? Moreover, do the results differ for testing versus estimation? Here we answer these questions in a simple setting by exploring the distribution of a sample difference in means under a basic two group design and four different scenarios: true random allocation, true random sampling, reallocating, and resampling. For testing a sharp null hypothesis, reallocating is superior in small samples, but reallocating and resampling are asymptotically equivalent. For estimation, resampling is generally superior, unless the effect is truly additive. Moreover, these results hold regardless of whether the data were collected by random sampling or random allocation.
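The four scenarios compared in this abstract can be mimicked in a few lines. A minimal Monte Carlo sketch, not the authors’ code; the function names are my own:

```python
import random
import statistics

def diff_means(a, b):
    # sample difference in means, the statistic studied in the abstract
    return statistics.mean(a) - statistics.mean(b)

def reallocation_dist(a, b, reps=2000, seed=1):
    # mimic random allocation: shuffle the pooled data into two groups of
    # the original sizes (the randomization-test null distribution)
    rng = random.Random(seed)
    pooled = list(a) + list(b)
    dist = []
    for _ in range(reps):
        rng.shuffle(pooled)
        dist.append(diff_means(pooled[:len(a)], pooled[len(a):]))
    return dist

def resampling_dist(a, b, reps=2000, seed=1):
    # mimic random sampling: draw each group with replacement from itself
    # (the bootstrap sampling distribution of the difference in means)
    rng = random.Random(seed)
    dist = []
    for _ in range(reps):
        ra = [rng.choice(a) for _ in a]
        rb = [rng.choice(b) for _ in b]
        dist.append(diff_means(ra, rb))
    return dist

def permutation_pvalue(a, b, reps=2000, seed=1):
    # two-sided test of the sharp null via reallocation
    obs = abs(diff_means(a, b))
    null = reallocation_dist(a, b, reps, seed)
    return sum(abs(d) >= obs for d in null) / reps
```

Note the contrast: the reallocation distribution is centered at zero (a null distribution for testing), while the resampling distribution is centered near the observed difference (a sampling distribution for estimation).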
... We know from the statistical literature that Spearman’s ρ is a chance-corrected measure of correlation and that −1 ≤ ρ ≤ 1 [2]. But we may wonder: what are the properties of the average pairwise Spearman’s ρ? ...
... Theorem 3 shows that, no matter the total number of features d, the stability estimate Φ of a random FR will be 0 in expectation. As pointed out by [2], “chance-corrected measures yield values that are interpreted as a proportion above that expected by chance alone”. We can therefore interpret the stability estimate Φ as the proportion of agreement above chance between the rankings in R. Some popular measures of stability used in the literature do not have this property. ...
Producing stable feature rankings is critical in many areas, such as in bioinformatics where the robustness of a list of ranked genes is crucial to interpretation by a domain expert. In this paper, we study Spearman’s rho as a measure of stability to training data perturbations - not just as a heuristic, but here proving that it is the natural measure of stability when using mean rank aggregation. We provide insights on the properties of this stability measure, allowing a useful interpretation of stability values - e.g. how close a stability value is to that of a purely random feature ranking process, and concepts such as the expected value of a stability estimator.
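The stability measure discussed here can be sketched directly. A minimal version, assuming full rankings without ties and using the classical rank-difference formula for Spearman’s ρ (function names are mine, not the authors’ code):

```python
import itertools

def spearman_rho(r1, r2):
    # Spearman's rho between two full rankings of d items (ranks 1..d, no ties):
    # rho = 1 - 6 * sum(d_i**2) / (d * (d**2 - 1))
    d = len(r1)
    sq_diff = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1.0 - 6.0 * sq_diff / (d * (d * d - 1))

def stability(rankings):
    # stability of a feature-ranking procedure: the average pairwise
    # Spearman's rho over all pairs of rankings produced on perturbed data
    pairs = list(itertools.combinations(rankings, 2))
    return sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)
```

Read as a chance-corrected quantity: a value of 1 means the rankings agree perfectly across perturbations, while 0 is what a purely random ranking process yields in expectation.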
... Rayleigh's uniformity test was used to test the hypothesis of random dispersal against the alternative of preferred directions for samples with unimodal distributions. Samples that had bimodal or multimodal distributions were tested with Rao's spacing test for one-sidedness against the alternative of random dispersal, implemented as a special case of empirical coverage permutation tests (Mielke, 2001), with the statistical software 'Blossom' (Cade & Richards, 1999). This test is an alternative to Rayleigh's test for detecting departures from uniform distributions when multimodality is suspected. ...
... Tests of differences in directional orientation between samples were performed through a multi-response permutation procedure (MRPP). This method is based on distance functions (Mielke, 2001), has the advantage of not being sensitive to the underlying modality of the data, and compares grouped data in a way that is analogous to a one-way analysis of variance. The null hypothesis tested with MRPP is that circular distributions are identical for the samples compared. ...
Migration patterns across a drift fence with pitfall traps were studied between 1997 and 1999 at a breeding pond with populations of great crested newts, Triturus cristatus, and smooth newts, T. vulgaris, at a study site in south-central Sweden. Metamorphs and older newts emigrated from the pond non-randomly and seemed to avoid exiting where open fields adjoined, but were oriented towards a patch of forest immediately to the east of the pond. Movement patterns changed slightly over the years, but metamorphs were more dispersed and less concentrated than older newts, and did not choose directions identical to those of older newts. Older great crested and smooth newts showed similar directional orientation. Great crested newt metamorphs dispersed towards both edges of the forest patch, and possible explanations for this are discussed. The results suggest that orientation in relation to cues from the surroundings of a breeding pond may be used by newts to make migratory decisions. adults can identify areas favourable for dispersal or for terrestrial activities. The present study focuses on ques- tions related to this problem. In particular, are migration patterns directed towards habitat patches that are preferred by newts in different life stages? Further- more, do newts have stronger directional responses to such habitats as they get older, and are there detectable differences in orientation and dispersion between newts in different stages of life, or even between closely re- lated species? I studied migratory movements across a drift fence with pitfall traps from 1997 to 1999, at a pond in south-central Sweden, as part of a population study of great crested and smooth newts (T. cristatus and T. vulgaris, respectively). The questions addressed here may be interesting from a general biological per- spective, but detailed knowledge about migratory behaviour can also prove critical for conservation ef- forts (Sutherland, 1998; Marsh & Trenham, 2001).
... As can be seen in (1), the Pearson correlation coefficient is a function of two vectors of paired observations. The usual permutation-based p-value reorders the values of x or of y, thus changing the pairs (Legendre & Legendre, 2012; Berry et al., 2016). For every reordering of the values, (2) is calculated. ...
Re-sampling based statistical tests are known to be computationally heavy, but reliable when only small sample sizes are available. Despite their nice theoretical properties, not much effort has been put into making them efficient. A computationally efficient method for calculating permutation-based p-values for the Pearson correlation coefficient and the two independent samples t-test is proposed. The method is general and can be applied to other similar two-sample mean or two mean-vector cases.
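One way to realize the kind of shortcut this abstract alludes to: under permutation of y, the means and sums of squares of x and y are invariant, so only the cross-product term of r changes and everything else can be precomputed once. A sketch of that idea (names are hypothetical, not the paper’s implementation):

```python
import math
import random

def perm_pvalue_pearson(x, y, reps=5000, seed=0):
    # Permute y only; since r = (sum(x*y) - n*mx*my) / sqrt(Sxx * Syy) and
    # mx, my, Sxx, Syy are permutation-invariant, each permutation costs
    # just one pass to recompute sum(x_i * y_perm(i)).
    rng = random.Random(seed)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    denom = math.sqrt(sum((a - mx) ** 2 for a in x)
                      * sum((b - my) ** 2 for b in y))

    def r_from_cross(cross):
        return (cross - n * mx * my) / denom

    obs = abs(r_from_cross(sum(a * b for a, b in zip(x, y))))
    yy = list(y)
    hits = 0
    for _ in range(reps):
        rng.shuffle(yy)
        if abs(r_from_cross(sum(a * b for a, b in zip(x, yy)))) >= obs:
            hits += 1
    return hits / reps
```

The p-value is the fraction of reorderings whose |r| is at least as large as the observed |r|, i.e. a two-sided Monte Carlo permutation test.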
... The Pearson's correlation coefficient was calculated for log-transformed data and for 100 000 random permutations of the data to test the null hypothesis that variables were independent. 8 We made no assumption of causation when associations were identified. ...
Aims:
The European Association of Percutaneous Cardiovascular Interventions (EAPCI) Atlas of Interventional Cardiology has been developed to map interventional practice across European Society of Cardiology (ESC) member countries. Here we present the main findings of a 16-country survey in which we examine the national availability of interventional infrastructure, human resource, and procedure volumes.
Methods and results:
Sixteen ESC member countries participated in the EAPCI Atlas survey. Interventional data were collected by the National Cardiac Society of each participating country. An annual median of 5131 [interquartile range (IQR) 4013-5801] diagnostic heart procedures per million people were reported, ranging from <2500 in Egypt and Romania to >7000 in Turkey and Germany. Procedure rates showed significant correlation (r = 0.67, P = 0.013) with gross national income (GNI) per capita. An annual median of 2478 (IQR 1690-2633) percutaneous coronary interventions (PCIs) per million people were reported, ranging from <1000 in Egypt and Romania to >3000 in Switzerland, Poland, and Germany. Procedure rates showed significant correlation with GNI per capita (r = 0.62, P = 0.014). An annual median of 48.2 (IQR 29.1-105.2) transcatheter aortic valve implantation procedures per million people were performed, varying from <25 per million people in Egypt, Romania, Turkey, and Poland to >100 per million people in Denmark, France, Switzerland, and Germany. Procedure rates showed significant correlation with national GNI per capita (r = 0.92, P < 0.001).
Conclusion:
The first report from the EAPCI Atlas has shown considerable international heterogeneity in interventional cardiology procedure volumes. The heterogeneity showed association with national economic resource, a reflection no doubt of the technological costs of developing an interventional cardiology service.
... Randomization and resampling tests are treated by Edgington and Onghena (2007), Good (2005, 2006), Berry et al. (2016), Pesarin and Salmaso (2010), Efron and Tibshirani (1993), Politis et al. (1999), Mammen (1992), Davison and Hinkley (1997), Barbe and Bertail (1995), Lahiri (2003), and LePage and Billard (Eds.) (1992). ...
Under the assumption of differentiability in the mean, we discuss parametric score tests. In the case of one-sided test problems with one-dimensional parameter spaces, score tests are locally optimal. Based on this motivation, we derive rank tests for two-sample problems, by means of conditioning on the ranks of the observables and calculating the corresponding score function. This leads to classical nonparametric tests like Wilcoxon’s rank sum test or the log-rank test. Finally, we provide an alternative justification of two-sample rank tests, which is based on statistical functionals.
... One of the main advantages of this method is that it can be applied to any statistic and can incorporate distributional and dependence characteristics inherent to the data used, making it a robust test (Westfall and Young 1993). Most importantly, when using permutation tests the null distribution is empirical, that is, it is obtained by calculating all possible, or a very large number of, values of the statistic under rearrangements of the labels on the observed data points (Berry et al. 2016). Therefore, in the case of our analysis, the null distribution of adaptive and constraint rates is different for each analysis, as each one comprised a different number and combination of genes. ...
We present a survey of selection across Drosophila melanogaster embryonic anatomy. Our approach integrates genomic variation, spatial gene expression patterns and development, with the aim of mapping adaptation over the entire embryo’s anatomy. Our adaptation map is based on analyzing spatial gene expression information for 5,969 genes (from text-based annotations of in situ hybridization data directly from the BDGP database, Tomancak et al. 2007) and the polymorphism and divergence in these genes (from the project DGRP, Mackay et al. 2012).
The proportions of non-synonymous substitutions that are adaptive, neutral, or slightly deleterious are estimated for the set of genes expressed in each embryonic anatomical structure using the DFE-alpha method (Eyre-Walker and Keightley 2009). This method is a robust derivative of the McDonald and Kreitman test (McDonald and Kreitman 1991). We also explore whether different anatomical structures differ in the phylogenetic age, codon usage or expression bias of the genes they express and whether genes expressed in many anatomical structures show more adaptive substitutions than other genes.
We found that: (i) most of the digestive system and ectoderm-derived structures are under selective constraint, (ii) the germ line and some specific mesoderm-derived structures show high rates of adaptive substitution and (iii) the genes that are expressed in a small number of anatomical structures show higher expression bias, lower phylogenetic ages and less constraint.
... Given a set of input variables (features), the number of hidden layer neurons is estimated through the following procedure: the process is initiated with a minimum number of hidden neurons, and this number is increased provided there is a statistically significant improvement (significance level of 5%) in the errors obtained. A nonparametric permutation test for paired data is used to compare the errors of two models [43–46]. Finally, with a view to obtaining the definitive MLP neural network, this optimum number of hidden units is used in the training of an MLP neural network using all the data of the training-validation sample. ...
Recent studies in the field of renewable energies, and specifically in wind resource prediction, have shown growing interest in proposals for Measure–Correlate–Predict (MCP) methods which simultaneously use data recorded at various reference weather stations. In this context, the use of a high number of reference stations may result in overspecification with its associated negative effects. These include, amongst others, an increase in the estimation error and/or overfitting which could be detrimental to the generalisation capacity of the model when handling new data (prediction).
Chapter 5 describes connections, equivalencies, and relationships relating to matched-pair tests of null hypotheses. First, Student’s conventional matched-pair t-test is described. Second, a permutation matched-pair test is presented and the connection linking the two matched-pair tests is established. An example analysis illustrates the differences in the two approaches and the connection linking the two tests. Third, measures of effect size for matched-pair tests are presented for both Student’s matched-pair t-test and the permutation matched-pair test, and the connections linking the various measures are set out. Fourth, Wilcoxon’s nonparametric signed-rank test is introduced and illustrated with an example analysis. A permutation alternative to Wilcoxon’s test is described and the connection linking the two tests is established. Finally, the connection linking a conventional matched-pair z-test for proportions and Pearson’s chi-squared goodness-of-fit test is delineated.
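A permutation matched-pair test of the kind described above can be sketched with the usual sign-flipping argument: under the null, each within-pair difference is equally likely to be positive or negative. A minimal exact version using the sum of differences as the test statistic (permutationally equivalent to the matched-pair t statistic); this is an illustrative sketch, not the book’s implementation:

```python
import itertools

def matched_pairs_perm_test(x, y):
    # Exact permutation matched-pairs test: enumerate all 2**n patterns of
    # signs on the within-pair differences d_i and count those whose |sum|
    # is at least as extreme as the observed |sum(d)|.
    d = [a - b for a, b in zip(x, y)]
    obs = abs(sum(d))
    n = len(d)
    hits = 0
    for signs in itertools.product((1, -1), repeat=n):
        if abs(sum(s * v for s, v in zip(signs, d))) >= obs:
            hits += 1
    return hits / 2 ** n
```

Full enumeration is feasible only for small n (2^n patterns); larger samples would use a Monte Carlo sample of sign patterns instead.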
Chapter 4 describes connections, equivalencies, and relationships relating to two-sample tests of null hypotheses. First, Student’s conventional two-sample t-test is described. Second, a permutation two-sample test is presented and the connection linking the two tests is established. An example analysis illustrates the differences in the two approaches and the connection linking the tests. Third, measures of effect size for two-sample tests are presented for both Student’s two-sample t-test and a permutation two-sample test, and the connections linking the various measures are set out. Fourth, the Wilcoxon–Mann–Whitney two-sample rank-sum test is presented with a permutation alternative for rank-score data and the connection linking the two tests is described. Finally, the connection between a conventional two-sample z-test for proportions and Pearson’s chi-squared test of independence is delineated.
Chapter 13 is the first of two chapters describing connections, equivalencies, and relationships relating to fourfold contingency tables. First, Pearson’s mean-square contingency coefficient and tetrachoric correlation coefficient are described. Second, Yule’s Q and Y measures are described, and the connections linking Q and Y are established. Third, the odds ratio is described, and the connections linking the odds ratio with Yule’s Q and Yule’s Y statistics are detailed. Fourth, Goodman and Kruskal’s asymmetric measures of association are presented, and their connections with Pearson’s measures are described. Fifth, Somers’ asymmetric measures are described, the connections linking them with simple percentage differences are established, and the connections with the corresponding unstandardized regression coefficients are detailed. Sixth, Kendall’s measure of ordinal association is described, and its connections with Pearson’s measures are delineated. Finally, the connections linking Pearson’s and Cohen’s measures are established.
Chapter 7 describes connections, equivalencies, and relationships relating to one-way randomized-blocks analysis of variance designs. First, Fisher’s conventional one-way randomized-blocks analysis of variance is described. Second, a permutation test is presented for randomized-blocks data and the connection linking the two approaches is established. An example analysis illustrates the differences in the two approaches and the connection linking the two tests. Third, measures of effect size for multiple related samples are described and the interconnections linking the various measures of effect size are detailed. Fourth, Friedman’s two-way analysis of variance for ranks is described and illustrated with a small rank-score dataset. A permutation multi-sample rank-sum test is introduced and the connection linking Friedman’s test statistic and the permutation randomized-blocks test statistic is described and illustrated.
In this paper we propose a new statistical approach to describing changes in vowel formants under the influence of surrounding noise. The approach consists of presenting changes of the first (F1) and second (F2) formants as vectors in ΔF1–ΔF2 coordinates and then plotting polar histograms reflecting the probabilities of finding the vectors in twelve 30°-sectors of the coordinate space for every vowel. To illustrate this approach we made audio recordings of several words with the basic vowels /a/, /i/ and /u/ in stressed positions, pronounced by 17 adult native Russian speakers (7 men, 10 women) in silence and against a background of 60 dB (A) babble noise. The noise was presented via headphones and auditory feedback was provided to compensate for the dampening effect of the headphones’ cushions. Group polar histograms reflecting changes in F1 and F2 of the vowels /a/, /i/ and /u/ in babble noise had specific shapes with 2–3 dominant petals and were significantly different from each other (p < 0.0001, Watson’s U² test). This indicates that there are relatively stable and distinctive patterns characterizing joint changes in F1 and F2 for each of the studied vowels. The data can be used to account for changes in the formant structure of vowels in noise to improve the performance of automatic speech recognition systems and also for planning and assessment of the speech rehabilitation process.
Keywords: Vowel articulation, Lombard effect, Formants, Babble noise, Polar histograms
The research work is devoted to the study of the characteristics of the stress distribution in seals with a rubber matrix, depending on the degree of deformation. Although the determination of the relative deformation of seals by traditional methods allows us to obtain a fairly accurate result in the cross-sectional area under consideration, depending on the load distribution, the results obtained do not fully reflect reality. In this regard, it is advisable to use modern information technologies that allow obtaining the necessary analytical forecasting data. One of the most promising areas today is artificial intelligence and fuzzy logic technologies. For this purpose, the research paper considered the issue of determining the radial strains caused by the relative deformation of the seal on the basis of fuzzy logic. The article proposes an approach, based on the theory of fuzzy sets, to solving deformation problems in seals with a rubber matrix. The results of the study were used as initial data.
This chapter introduces conventional and permutation methods for matched-pairs tests. The chapter contains example analyses illustrating computation of exact permutation probability values for matched-pairs tests, calculation of measures of effect size for matched-pairs tests, exact and Monte Carlo permutation procedures for matched-pairs tests, and applications of permutation methods to matched-pairs rank-score data. Also included in the chapter are permutation versions of Student’s matched-pairs t test, Wilcoxon’s signed-ranks test, and a permutation-based alternative for the two conventional measures of effect size for matched pairs: Cohen’s and Pearson’s measures.
This chapter introduces conventional and permutation methods for one-sample tests. Included in this chapter are example analyses illustrating computation of exact permutation probability values for one-sample tests, calculation of measures of effect size for one-sample tests, exact and Monte Carlo permutation procedures for one-sample tests, and application of permutation methods to one-sample rank-score data. Also included in the chapter are permutation versions of Student’s one-sample t test, Wilcoxon’s signed-ranks test, and a permutation-based alternative for the two conventional measures of effect size for one-sample tests: Cohen’s and Pearson’s measures.
This chapter introduces conventional and permutation methods for two-sample tests. The chapter contains example analyses illustrating computation of exact permutation probability values for two-sample tests, calculation of measures of effect size for two-sample tests, exact and Monte Carlo permutation procedures for two-sample tests, and the application of permutation methods to two-sample rank-score data. Also included in the chapter are permutation versions of Student’s two-sample t test, the Wilcoxon–Mann–Whitney two-sample rank-sum test, and a permutation-based alternative for the four conventional measures of effect size for two-sample tests: Cohen’s, Pearson’s, Kelley’s, and Hays’ measures.
This chapter describes two models of statistical inference: the population model first put forward by Jerzy Neyman and Egon Pearson in 1928 and the permutation model developed by R. A. Fisher, R. C. Geary, T. Eden, F. Yates, H. Hotelling, M. R. Pabst, and E. J. G. Pitman in the 1920s and 1930s. The remainder of the chapter presents a brief history of the early years and subsequent development of permutation statistical methods from 1920 to the present.
This chapter introduces conventional and permutation methods for multiple matched samples, i.e., randomized-blocks designs. The chapter contains example analyses illustrating the computation of exact permutation probability values for randomized-blocks designs, calculation of measures of effect size for randomized-blocks designs, exact and Monte Carlo permutation procedures for randomized-blocks designs, and applications of permutation methods to randomized-blocks designs with rank-score data. Also included in the chapter are permutation versions of Fisher’s F test for a one-way randomized-blocks design, Friedman’s two-way analysis of variance for ranks, and a permutation-based alternative for the four conventional measures of effect size for randomized-blocks designs: Hays’, Pearson’s, and two of Cohen’s measures.
This research aims to study the results of an online learning platform used in collaboration with virtual technology in a digital ecosystem to develop the information, media and technology skills of undergraduate students. The sample group was 79 undergraduate students randomised with a cluster sampling method. The participants were divided into two groups, with 40 students in the experimental group and 39 students in the control group. The research tools used were (1) the teaching plan for the online learning platform in collaboration with virtual technology using a digital ecosystem, in accordance with the Dick and Carey Model; (2) an academic achievement evaluation form and (3) a student information-skill evaluation form. The data were analysed using the mean, standard deviation, and multivariate analysis of variance (MANOVA). The research found that, at the .05 significance level, the academic achievement and information skills of the students who learned using the online learning platform in collaboration with virtual technology in a digital ecosystem were higher than those of the students who used the normal learning method.
Alien species invasion affects local community biodiversity and stability considerably, and ecosystem services and functions will accordingly be dramatically changed. Many studies have reported a correlation between invasibility and the chemical nature of soil, but the influences of understory plant community structure and soil trace element concentrations on invasibility have not been fully explored. Landscape heterogeneity in the urban and rural ecotone may alter the invasion process, and assessing the invasibility of different types of native forests may lead to a better understanding of the mechanisms by which native species resist invasion. We compared the composition, structure, diversity and stability of the understory community in abandoned fallows, severely invaded by Mikania micrantha and Borreria latifolia, and adjacent natural and planted forests in the urban and rural ecotone of Eastern Guangzhou, China. Additionally, we quantified mineral element concentrations in the topsoil (0–25 cm) most influenced by the root system of understory communities in the forest stand types. Abandoned fallows had the highest concentrations of available iron (Fe) and available boron (B) and the lowest concentration of total mercury (Hg) among the three stand types. In contrast to various species diversity indices, the understory structure of the three stands better explained differences in community invasibility. Average understory cover significantly differed among the three stand types, and those types with the greatest number of stems in height and cover classes 1 and 2 differed the most, indicating that seedling establishment may deter invasion to a certain extent.
Canonical correspondence analysis (CCA) results better reflected the distribution range of each stand type and its relationship with environmental factors; available Fe, available B, exchangeable calcium (Ca), exchangeable magnesium (Mg), cover, available copper (Cu) and total Hg were strongly related to the distribution of native and exotic understory species. Invasion weakened community stability. The stability index changed consistently with the species diversity index, and understory community stability in abandoned fallows was lower than in the other stand types. According to our results, both soil mineral element concentrations and community structure are related to alien species invasion. Against the backdrop of urbanization and industrialization, this information will provide forest management and planning departments with reference points for forest protection and invasive plant management.
To reduce ambiguity across a conversation, interlocutors reach temporary conventions, or referential precedents, on how to refer to an entity. Despite their central role in communication, the cognitive underpinnings of the interpretation of precedents remain unclear, specifically the role and mechanisms by which information related to the speaker is integrated. We contrast predictions of one-stage, original two-stage, and extended two-stage models for the processing of speaker information and provide evidence favoring the latter: we show that both stages are sensitive to speaker-specific information. Using an experimental paradigm based on visual-world eye tracking in the context of a referential communication task, we look at the moment-by-moment interpretation of precedents and focus on the temporal profile of the influence of speaker and linguistic information when facing ambiguity. We find two clearly identifiable moments where speaker-specific information has its effects on reference resolution. We conclude that these two stages reflect two distinct cognitive mechanisms with different timings that rely on different representational formats for encoding and accessing information about the speaker: a cue-driven memory retrieval process that mediates language processing and an inferential mechanism based on perspective-taking abilities.
This chapter addresses the decision to use permutation tests as opposed to parametric analyses in the context of between-group analysis in randomized clinical trials designed to evaluate a medical intervention. It is important to understand at the outset that permutation tests represent a means to an end, rather than an end unto themselves. It is not so much that one seeks to use permutation tests just for the sake of doing so but, rather, that one recognizes the severe deficiencies of parametric analyses and wishes to use some other type of analysis that does not similarly suffer from these drawbacks. When viewed in this context, properly conducted permutation tests are the solution to the problem of how to compare treatments without having to rely on assumptions that cannot possibly be true. We argue that the default position would clearly be the use of exact analyses and that the burden of proof would fall to those who would argue that the approximate analyses are just as good or, as is sometimes argued, even better.
This chapter introduces permutation methods for multiple independent samples, i.e., completely-randomized designs. Included in this chapter are six example analyses illustrating computation of exact permutation probability values for multi-sample tests, calculation of measures of effect size for multi-sample tests, the effect of extreme values on conventional and permutation multi-sample tests, exact and Monte Carlo permutation procedures for multi-sample tests, application of permutation methods to multi-sample rank-score data, and analysis of multi-sample multivariate data. Included in this chapter are permutation versions of Fisher’s F test for one-way, completely-randomized analysis of variance, the Kruskal–Wallis one-way analysis of variance for ranks, the Bartlett–Nanda–Pillai trace test for multivariate analysis of variance, and a permutation-based alternative for the four conventional measures of effect size for multi-sample tests: Cohen’s f, Pearson’s η², Kelley’s ε², and Hays’ ω².
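A Monte Carlo permutation version of the one-way completely-randomized analysis can be sketched compactly. This is an illustrative sketch with invented data, not the chapter’s code; it exploits the fact that, with the total sum of squares fixed under permutation, the between-groups sum of squares is monotone in the conventional F ratio.

```python
import random

random.seed(2023)

# Hypothetical scores for three treatment groups (values invented).
groups = [
    [23.0, 26.0, 21.0, 24.0],
    [28.0, 31.0, 27.0, 30.0],
    [18.0, 20.0, 17.0, 19.0],
]
sizes = [len(g) for g in groups]
pooled = [x for g in groups for x in g]
grand = sum(pooled) / len(pooled)   # invariant under permutation

def between_ss(values):
    # Between-groups sum of squares; with the total sum of squares
    # fixed under permutation, this is monotone in F.
    ss, i = 0.0, 0
    for size in sizes:
        chunk = values[i:i + size]
        i += size
        mean = sum(chunk) / size
        ss += size * (mean - grand) ** 2
    return ss

observed = between_ss(pooled)

# Monte Carlo resampling: the number of distinct arrangements here,
# 12!/(4! 4! 4!) = 34,650, is still enumerable, but random shuffling
# scales to designs where exhaustive enumeration is impractical.
trials = 20000
extreme = 0
for _ in range(trials):
    random.shuffle(pooled)
    if between_ss(pooled) >= observed - 1e-9:   # float-tolerance guard
        extreme += 1

p_approx = extreme / trials
```

With these well-separated invented groups the approximate permutation probability value is very small.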
This chapter provides a brief history and overview of the early beginnings and subsequent development of permutation statistical methods, organized by decades from the 1920s to the present.
This chapter introduces permutation methods for one-sample tests. Included in this chapter are six example analyses illustrating computation of exact permutation probability values for one-sample tests, calculation of measures of effect size for one-sample tests, the effect of extreme values on conventional and permutation one-sample tests, exact and Monte Carlo permutation procedures for one-sample tests, application of permutation methods to one-sample rank-score data, and analysis of one-sample multivariate data. Included in this chapter are permutation versions of Student’s one-sample t test, Wilcoxon’s signed-ranks test, the sign test, and a permutation-based alternative for the two conventional measures of effect size for one-sample tests: Cohen’s d and Pearson’s r².
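The exact one-sample permutation logic can be sketched via sign reversals. This is a minimal illustration with invented deviations, not the chapter’s code: under the null hypothesis of symmetry about the hypothesized value, all 2ⁿ sign patterns of the deviations are equally likely, and since the squared deviations are invariant under sign reversal, Student’s t is monotone in the absolute sum of signed deviations.

```python
from itertools import product

# Hypothetical deviations from the hypothesized value (invented);
# H0: the deviations are symmetrically distributed about zero.
diffs = [1.2, 0.8, 2.1, -0.3, 1.5, 0.9, 1.8, 0.4]
n = len(diffs)

# |sum of signed deviations| is an equivalent statistic to |t| because
# the sum of squared deviations is unchanged by sign reversal.
observed = abs(sum(diffs))

# Exact reference set: all 2**n = 256 sign reversals.
extreme = 0
for signs in product((1, -1), repeat=n):
    if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
        extreme += 1

p_exact = extreme / 2 ** n   # two-sided exact probability value
```

For these invented data only four of the 256 sign patterns are as extreme as the observed one.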
This chapter describes measures of association for two variables at different levels of measurement, e.g., a nominal-level independent variable and an ordinal- or interval-level dependent variable, and an ordinal-level independent variable and an interval-level dependent variable. This chapter begins with discussions of three measures of association for a nominal-level independent variable and an ordinal-level dependent variable: Freeman’s θ, Agresti’s δ, and Piccarreta’s τ. This chapter continues with a discussion of measures of association for a nominal-level independent variable and an interval-level dependent variable: the correlation ratio η², ε², and ω².
This chapter describes permutation statistical methods for measures of association designed for two or more interval-level variables. Included in this chapter are simple and multiple ordinary least squares (OLS) regression, simple and multiple least absolute deviation (LAD) regression, point-biserial correlation, and biserial correlation. Fisher’s Z transform for non-normal distributions is examined and evaluated. This chapter concludes with a discussion of the intraclass correlation coefficient.
This chapter provides an introduction to two models of statistical inference—the population model and the permutation model—and the three main approaches to permutation statistical methods: exact, moment-approximation, and Monte Carlo resampling-approximation. Advantages of permutation statistical methods are set out, and recursion techniques are described and illustrated.
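The contrast between the exact and Monte Carlo resampling approaches can be illustrated on a small two-sample comparison. This sketch uses invented data and is not drawn from the chapter; the exact approach enumerates every split of the pooled observations, while the Monte Carlo approach samples random splits.

```python
import random
from itertools import combinations

random.seed(41)

# Hypothetical two-sample data (values invented for illustration).
a = [3.1, 4.5, 2.8, 5.0]
b = [6.2, 7.1, 5.9, 6.8]
pooled = a + b
n, m = len(a), len(b)
total = sum(pooled)
observed = abs(sum(a) / n - sum(b) / m)
TOL = 1e-9   # guard against floating-point asymmetry at the boundary

def mean_diff(idx):
    s = sum(pooled[i] for i in idx)
    return abs(s / n - (total - s) / m)

# Exact approach: enumerate every C(8, 4) = 70 split of the pooled data.
splits = list(combinations(range(n + m), n))
hits = sum(1 for idx in splits if mean_diff(idx) >= observed - TOL)
p_exact = hits / len(splits)

# Monte Carlo resampling approximation: sample random splits instead of
# enumerating them; accuracy improves with the number of trials.
trials = 10000
hits_mc = sum(
    1 for _ in range(trials)
    if mean_diff(random.sample(range(n + m), n)) >= observed - TOL
)
p_mc = hits_mc / trials
```

For these invented, fully separated samples the exact probability value is 2/70, and the Monte Carlo estimate converges to it as the number of trials grows.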
Estimation of the degree of agreement between different raters is of crucial importance in the medical and social sciences. Many different approaches have been proposed in the literature for this purpose. In this article, we focus on inter-rater agreement measures for ordinal variables. The ordinal nature of the variable makes this estimation task more complicated. Although there are modified versions of inter-rater agreement measures for ordinal tables, there is no clear agreement on the use of a particular approach. We conduct an extensive Monte Carlo simulation study to evaluate and compare the accuracy of mainstream inter-rater agreement measures for ordinal tables and to determine the effect of different table structures on the accuracy of these measures. Our results are useful in that they provide detailed information about which measure to use with different table structures to obtain the most reliable inferences about the degree of agreement between two raters. On the basis of our simulation study, we recommend the use of Gwet’s AC2 and Brennan–Prediger’s κ where there is high agreement among raters. However, it should be noted that these coefficients overstate the extent of agreement among raters when there is no agreement and the data are unbalanced.
hypotheses from observations of the world, which they then deploy to test their reliability. The best way to test reliability is to predict an effect before it occurs. If we can manipulate the independent variables (the efficient causes) that make it occur, then the ability to predict makes it possible to control. Such control helps to isolate the relevant variables. Control also refers to a comparison condition, conducted to see what would have happened had we not deployed the key ingredient of the hypothesis: scientific knowledge only accrues when we compare what happens in one condition against what happens in another. When the results of such comparisons are not definitive, metrics of the degree of efficacy of the manipulation are required. Many of those derive from statistical inference, and many of those poorly serve the purpose of the cumulation of knowledge. Without the ability to replicate an effect, the utility of the principle used to predict or control it is dubious. Traditional models of statistical inference are weak guides to the replicability and utility of results. Several alternatives to null-hypothesis testing are sketched: Bayesian inference, model comparison, and predictive inference (p-rep). Predictive inference shows, for example, that the failure to replicate most results in the Open Science Project was predictable. Replicability is but one aspect of scientific understanding: it establishes the reliability of our data and the predictive ability of our formal models. It is a necessary aspect of scientific progress, even if not by itself sufficient for understanding.
Resampling-based statistical tests are known to be computationally heavy but reliable when only small sample sizes are available. Despite their nice theoretical properties, not much effort has been put into making them efficient. In this paper we treat the cases of the Pearson correlation coefficient and the two-independent-samples t test. We propose a highly computationally efficient method for calculating permutation-based p-values in these two cases. The method is general and can be applied, or adapted, to other similar two-sample mean or two mean vector cases.
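A baseline permutation p-value for the Pearson correlation, against which such efficiency gains are measured, can be sketched as follows. This is not the authors’ proposed method, only a straightforward Monte Carlo version with invented data; the one standard reduction shown is that all centering and scaling terms are permutation invariant, so only a cross-product needs recomputing per shuffle.

```python
import random

random.seed(7)

# Hypothetical paired observations (values invented for illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.4, 7.9, 8.5]
n = len(x)
sx, sy = sum(x), sum(y)

def cross(ys):
    # |r| is monotone in |sum(x_i * y_i) - sx * sy / n| because both
    # standard deviations and the marginal sums are unchanged when y is
    # permuted, so the centered cross-product is an equivalent statistic.
    return abs(sum(a * b for a, b in zip(x, ys)) - sx * sy / n)

observed = cross(y)

trials = 20000
perm = y[:]
extreme = 0
for _ in range(trials):
    random.shuffle(perm)
    if cross(perm) >= observed:
        extreme += 1

# Add-one convention: counts the observed arrangement as one permutation.
p_value = (extreme + 1) / (trials + 1)
```

For these strongly correlated invented data the resulting p-value is very small.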
Introduction
Modern omics experiments pertain not only to the measurement of many variables but also follow complex experimental designs in which many factors are manipulated at the same time. These data can be conveniently analyzed using multivariate tools like ANOVA-simultaneous component analysis (ASCA), which allows interpretation of the variation induced by the different factors in a principal component analysis fashion. However, while in general only a subset of the measured variables may be related to the problem studied, all variables contribute to the final model, and this may hamper interpretation.
Objectives
We introduce here a sparse implementation of ASCA termed group-wise ANOVA-simultaneous component analysis (GASCA) with the aim of obtaining models that are easier to interpret.
Methods
GASCA is based on the concept of group-wise sparsity introduced in group-wise principal component analysis, in which the structure used to impose sparsity is defined in terms of groups of correlated variables found in the correlation matrices calculated from the effect matrices.
Results
The GASCA model, containing only selected subsets of the original variables, is easier to interpret and describes relevant biological processes.
Conclusions
GASCA is applicable to any kind of omics data obtained through designed experiments such as, but not limited to, metabolomic, proteomic and gene expression data.
In this introductory chapter, we first recall some basic concepts from mathematical statistics regarding statistical decision theory and, in particular, statistical test theory. Then, we deal with the concepts of conditional distributions and conditional expectations in a general manner. The treatment is based on a construction by means of Markov kernels. Finally, we give an introduction to the theory of nonparametric tests, and we discuss the specific examples of bootstrap tests (for one-sample problems) and permutation tests (for multi-sample problems) on a conceptual level.
Permutation statistical methods possess a number of advantages compared with conventional statistical methods, making permutation statistical methods the preferred statistical approach for many research situations. Permutation statistical methods are data‐dependent, do not rely on distribution assumptions such as normality, provide either exact or highly‐accurate approximate probability values, do not require knowledge of theoretical standard errors, and are ideal methods for small data sets where theoretical mathematical functions are often poor fits to discrete sampling distributions. On the other hand, permutation statistical methods are computationally intensive. Computational efficiencies for permutation statistical methods are described and permutation statistical methods are illustrated with a variety of common statistical tests and measures.
This article is categorized under:
• Statistical and Graphical Methods of Data Analysis > Bootstrap and Resampling
• Statistical and Graphical Methods of Data Analysis > Multivariate Analysis
• Statistical and Graphical Methods of Data Analysis > Nonparametric Methods
• Statistical and Graphical Methods of Data Analysis > Monte Carlo Methods
Non-normality and heteroscedasticity are common in applications. For the comparison of two samples in the non-parametric Behrens–Fisher problem, different tests have been proposed, but no single test can be recommended for all situations. Here, we propose combining two tests, the Welch t test based on ranks and the Brunner–Munzel test, within a maximum test. Simulation studies indicate that this maximum test, performed as a permutation test, controls the type I error rate and stabilizes the power. That is, it has good power characteristics for a variety of distributions, and also for unbalanced sample sizes. Compared to the single tests, the maximum test shows acceptable type I error control.
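The mechanics of such a permutation-based maximum test can be sketched generically. The sketch below is a simplified stand-in with invented data: instead of the proposed components (the rank-based Welch t test and the Brunner–Munzel test), it combines a mean difference and a median difference, standardizing each by its permutation scale before taking the maximum, so the same calibration idea carries over.

```python
import random
import statistics

random.seed(11)

# Hypothetical two-sample data (values invented for illustration).
a = [4.1, 5.3, 3.8, 6.0, 4.9, 5.5]
b = [6.8, 7.4, 6.1, 8.0, 7.2, 6.6]
pooled = a + b
n = len(a)

def stats_for(vals):
    g1, g2 = vals[:n], vals[n:]
    t1 = abs(sum(g1) / n - sum(g2) / len(g2))          # mean difference
    t2 = abs(statistics.median(g1) - statistics.median(g2))  # median difference
    return t1, t2

# Pass 1: estimate each component's permutation scale so the two
# statistics are comparable before taking the maximum.
trials = 2000
perms = []
for _ in range(trials):
    random.shuffle(pooled)
    perms.append(stats_for(pooled))

s1 = statistics.pstdev([p[0] for p in perms]) or 1.0
s2 = statistics.pstdev([p[1] for p in perms]) or 1.0

def max_stat(pair):
    return max(pair[0] / s1, pair[1] / s2)

# Pass 2: permutation p-value of the maximum statistic, reusing the
# same permutation draws.
observed = max_stat(stats_for(a + b))
extreme = sum(1 for p in perms if max_stat(p) >= observed)
p_value = (extreme + 1) / (trials + 1)
```

The two-pass structure keeps the cost at one set of permutation draws while still calibrating the maximum of the standardized components.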
The terms “randomization test” and “permutation test” are sometimes used interchangeably. However, there are both historical and conceptual reasons for making a clear distinction between the two terms. Using a historical perspective, this chapter emphasizes the contributions made by Edwin Pitman and Bernard Welch to arrive at a coherent theory for randomization and permutation tests. From a conceptual perspective, randomization tests are based on random assignment and permutation tests are based on random sampling. The justification of the randomization test derives from the fact that under the null hypothesis of no treatment effect, the random assignment procedure produces a random shuffle of the responses. The justification of the permutation test derives from the fact that under the null hypothesis of identical distributions, all permutations of the responses are equally likely. It is argued that this terminological distinction is crucial for recognizing the assumptions behind each of the tests and for appreciating the validity of the corresponding inferences.
The problem of testing mutual independence between many random vectors is addressed. The closely related problem of testing serial independence of a multivariate stationary sequence is also considered. The Möbius transformation of characteristic functions is used to characterize independence. A generalization to p vectors of distance covariance and Hilbert-Schmidt independence criterion (HSIC) tests with the translation invariant kernel of a stable probability distribution is proposed. Both test statistics can be expressed in a simple form as a sum over all elements of a componentwise product of p doubly-centered matrices. It is shown that an HSIC statistic with sufficiently small scale parameters is equivalent to a distance covariance statistic. Consistency and weak convergence of both types of statistics are established. Approximation of p-values is made by randomization tests without recomputing interpoint distances for each randomized sample. The dependogram is adapted to the proposed tests for the graphical identification of sources of dependencies. Empirical rejection rates obtained through extensive simulations confirm both the applicability of the testing procedures in small samples and the high level of competitiveness in terms of power. Applications to meteorological and financial data provide some interesting interpretations of dependencies revealed by dependograms.
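The p = 2 case of the statistic described above, a sum over all elements of the componentwise product of doubly-centered distance matrices, can be sketched directly. This is an illustrative version with invented data, not the authors’ implementation; it also mirrors their efficiency point that a randomization test need only permute indices, never recomputing interpoint distances.

```python
import random

random.seed(5)

def dist_matrix(v):
    return [[abs(p - q) for q in v] for p in v]

def double_center(d):
    n = len(d)
    row = [sum(r) / n for r in d]
    col = [sum(d[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[d[i][j] - row[i] - col[j] + grand for j in range(n)]
            for i in range(n)]

# Hypothetical dependent pair of samples (values invented).
x = [0.2, 1.1, 1.9, 3.2, 4.1, 4.8, 6.1, 7.0]
y = [0.5, 1.0, 2.2, 2.9, 4.4, 5.1, 5.8, 7.2]
n = len(x)

A = double_center(dist_matrix(x))
Dy = dist_matrix(y)   # computed once; permutations only re-index it

def stat(idx):
    # Sum over all elements of the componentwise product of the two
    # doubly-centered matrices (proportional to squared distance covariance).
    B = double_center([[Dy[idx[i]][idx[j]] for j in range(n)] for i in range(n)])
    return sum(A[i][j] * B[i][j] for i in range(n) for j in range(n))

observed = stat(list(range(n)))

# Randomization test: shuffle indices only, so interpoint distances are
# never recomputed for a randomized sample.
trials = 5000
idx = list(range(n))
extreme = 0
for _ in range(trials):
    random.shuffle(idx)
    if stat(idx) >= observed:
        extreme += 1
p_value = (extreme + 1) / (trials + 1)
```

For these strongly dependent invented samples the statistic is clearly positive and the randomization p-value small.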
Several simple, classical, little-known algorithms in the statistical literature for generating random permutations by coin-tossing are examined, analyzed and implemented. These algorithms are either asymptotically optimal or close to being so in terms of the expected number of times the random bits are generated. In addition to asymptotic approximations to the expected complexity, we also clarify the corresponding variances, as well as the asymptotic distributions. A brief comparative discussion with numerical computations in a multicore system is also given.
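One standard coin-tossing construction of a uniform random permutation, given here as a general illustration rather than as one of the specific algorithms analyzed in the paper, combines rejection sampling for uniform integers with the Fisher–Yates shuffle.

```python
import random

random.seed(3)

def coin():
    # One fair random bit; stands in for a physical coin toss.
    return random.getrandbits(1)

def uniform_below(n):
    # Uniform integer in [0, n) from fair bits by rejection sampling:
    # read just enough bits for the range and retry on overshoot.
    bits = max(1, (n - 1).bit_length())
    while True:
        v = 0
        for _ in range(bits):
            v = (v << 1) | coin()
        if v < n:
            return v

def random_permutation(n):
    # Fisher-Yates shuffle driven entirely by coin tosses: each of the
    # n! permutations is produced with equal probability.
    p = list(range(n))
    for i in range(n - 1, 0, -1):
        j = uniform_below(i + 1)
        p[i], p[j] = p[j], p[i]
    return p

perm = random_permutation(8)
```

The rejection step is what makes the expected number of consumed bits a random quantity, which is exactly the complexity measure (expectation, variance, asymptotic distribution) the paper analyzes.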
This paper presents an efficient Bayesian framework for solving nonlinear, high-dimensional model calibration problems. It is based on a Variational Bayesian formulation that aims at approximating the exact posterior by solving an optimization problem over an appropriately selected family of distributions. The goal is two-fold: firstly, to find lower-dimensional representations of the unknown parameter vector that capture as much as possible of the associated posterior density; and secondly, to enable the computation of the approximate posterior density with as few forward calls as possible. We discuss how these objectives can be achieved by using a fully Bayesian argumentation and by employing the marginal likelihood, or evidence, as the ultimate model validation metric for any proposed dimensionality reduction. We demonstrate the performance of the proposed methodology in problems of nonlinear elastography, where the identification of the mechanical properties of biological materials can inform non-invasive medical diagnosis. An Importance Sampling scheme is finally employed to validate the results and assess the efficacy of the approximations provided.