Article

Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing

Authors:
Yoav Benjamini and Yosef Hochberg

Abstract

The common approach to the multiplicity problem calls for controlling the familywise error rate (FWER). This approach, though, has faults, and we point out a few. A different approach to problems of multiple significance testing is presented. It calls for controlling the expected proportion of falsely rejected hypotheses — the false discovery rate. This error rate is equivalent to the FWER when all hypotheses are true but is smaller otherwise. Therefore, in problems where the control of the false discovery rate rather than that of the FWER is desired, there is potential for a gain in power. A simple sequential Bonferroni-type procedure is proved to control the false discovery rate for independent test statistics, and a simulation study shows that the gain in power is substantial. The use of the new procedure and the appropriateness of the criterion are illustrated with examples.
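To make the procedure concrete, the following is a minimal NumPy sketch of the step-up rule described in the abstract (reject the hypotheses with the k smallest p-values, where k is the largest i such that p(i) ≤ (i/m)q). The function and variable names are our own illustration, not code from the paper.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a boolean array marking the rejected hypotheses while
    controlling the FDR at level q (proved in the paper for
    independent test statistics).
    """
    pvals = np.asarray(pvals, dtype=float)
    m = pvals.size
    order = np.argsort(pvals)              # indices that sort the p-values
    sorted_p = pvals[order]
    thresholds = q * np.arange(1, m + 1) / m
    passing = np.nonzero(sorted_p <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if passing.size:
        k = passing[-1]                    # largest i with p_(i) <= (i/m) q
        reject[order[:k + 1]] = True       # reject that p-value and all smaller
    return reject

# Example at FDR level q = 0.05
print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.60]))
# -> [ True  True False False False]
```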

... 2DLC-MS and 2DLC-MS/MS analyses were performed using a parallel method as described previously [17]. The group-based pooled sample was analyzed by 2DLC-MS/MS to acquire MS/MS spectra at four collision energies (10, 20, 40, and 60 eV) for metabolite identification. A pooled sample was also analyzed by 2DLC-MS after analyzing every six biological samples for quality control. ...
... (accessed on 24 February 2023). Univariate analysis of metabolite abundance among groups was conducted using one-way ANOVA with Tukey post-test and the Benjamini and Hochberg method [20] for multiple testing correction. ROC analysis was employed to evaluate the diagnostic potential of metabolite abundance variations in COVID-19 patients. ...
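As an illustration of this kind of workflow (one test per metabolite, then Benjamini-Hochberg correction across all of them), here is a hedged Python sketch; the data, dimensions, and group structure are hypothetical, and the Tukey post-test is omitted for brevity:

```python
import numpy as np
from scipy.stats import f_oneway
from statsmodels.stats.multitest import multipletests

# Hypothetical abundance matrix: 200 metabolites, 3 groups, 20 samples each
rng = np.random.default_rng(1)
healthy  = rng.lognormal(0.0, 0.5, size=(200, 20))
moderate = rng.lognormal(0.1, 0.5, size=(200, 20))
severe   = rng.lognormal(0.3, 0.5, size=(200, 20))

# One-way ANOVA per metabolite, then BH correction across all metabolites
pvals = np.array([f_oneway(h, mo, se).pvalue
                  for h, mo, se in zip(healthy, moderate, severe)])
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print(f"{reject.sum()} of {pvals.size} metabolites significant after BH")
```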
Article
Full-text available
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection is intricately related to the reprogramming of host metabolism. However, existing studies have mainly focused on peripheral blood samples and barely identified specific metabolites that are critically involved in the pathology of coronavirus disease 2019 (COVID-19). In the current small-scale study, we performed metabolic profiling in plasma (n = 61) and paired bronchoalveolar lavage fluid (BALF) samples (n = 20) using parallel two-dimensional liquid chromatography–mass spectrometry (2DLC-MS). In addition, we studied how an identified metabolite regulates the immunopathogenesis of COVID-19. The results unveiled distinct metabolome changes between healthy donors, and moderate and severe patients in both plasma and BALF, indicating that locations and disease severity play critical roles in COVID-19 metabolic alteration. Notably, a vital metabolite, indoxyl sulfate, was found to be elevated in both the plasma and BALF of severe COVID-19 patients. Indoxyl sulfate selectively induced TNF-α production, reduced co-stimulatory signals, and enhanced apoptosis in human monocytes. Moreover, its levels negatively correlated with the strength of co-stimulatory signals and antigen presentation capability in monocytes of COVID-19 patients. Collectively, our findings suggest that the levels of indoxyl sulfate could potentially serve as a functional biomarker to monitor COVID-19 disease progression and guide more individualized treatment for COVID-19 patients.
... This updating process allows for a more meaningful interpretation, particularly when prior information is available, making it more advantageous than traditional approaches (Muehlemann et al., 2023). In marketing research, where multiple variables are frequently examined at the same time, the use of the False Discovery Rate (FDR) approach (Benjamini & Hochberg, 1995) is essential. ...
... This stepwise approach arranges p-values from multiple tests in increasing order and establishes a cutoff point, allowing the rejection of null hypotheses while maintaining control over the anticipated proportion of false positives. (Benjamini & Hochberg, 1995). ...
Article
Full-text available
This study examines the effectiveness of various alternative statistical approaches in testing the null hypothesis's significance to improve marketing research's reliability. Conventional methods such as p-value are often misunderstood and have limitations in interpreting the results of the analysis. Therefore, this study compares several alternative approaches, including Bayesian inference, bootstrap resampling, and false discovery rate, to improve the validity and repeatability of marketing research results. By using a simulation-based quantitative approach and a survey of marketing academics and practitioners, the findings of this study show that alternative methods can provide more informative results than conventional techniques. The implications of this study contribute to strengthening marketing research methodologies to be more accurate and reliable.
... To maintain control of a suitable number of false discoveries, Benjamini & Hochberg (1995) introduce the false discovery rate (FDR) as a crucial concept and an important metric in their seminal work to serve this goal. In this work, Benjamini & Hochberg (1995) also provide a simple but powerful step-down p-value procedure to control the desired FDR. More concretely, we wish to test m simultaneous null hypotheses H₀₁, H₀₂, … ...
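The excerpt cuts off mid-setup. For reference, the rule it introduces is standardly stated as follows (our reconstruction in the usual notation, not a quotation from either paper):

```latex
% Benjamini-Hochberg step-up rule at nominal FDR level q:
% order the p-values and find the largest index still under the line i*q/m.
\[
  p_{(1)} \le p_{(2)} \le \cdots \le p_{(m)}, \qquad
  k \;=\; \max\Bigl\{\, i \;:\; p_{(i)} \le \tfrac{i}{m}\, q \,\Bigr\}.
\]
\[
  \text{Reject } H_{0(1)}, \ldots, H_{0(k)}; \text{ if no such } i
  \text{ exists, reject no hypothesis.}
\]
```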
Preprint
Full-text available
The horseshoe prior, a widely used handy alternative to the spike-and-slab prior, has proven to be an exceptional default global-local shrinkage prior in Bayesian inference and machine learning. However, designing tests with frequentist false discovery rate (FDR) control using the horseshoe prior or the general class of global-local shrinkage priors remains an open problem. In this paper, we propose a frequentist-assisted horseshoe procedure that not only resolves this long-standing FDR control issue for the high dimensional normal means testing problem but also exhibits satisfactory finite-sample FDR control under any desired nominal level for both large-scale multiple independent and correlated tests. We carry out the frequentist-assisted horseshoe procedure in an easy and intuitive way by using the minimax estimator of the global parameter of the horseshoe prior while maintaining the remaining full Bayes vanilla horseshoe structure. The results of both intensive simulations under different sparsity levels, and real-world data demonstrate that the frequentist-assisted horseshoe procedure consistently achieves robust finite-sample FDR control. Existing frequentist or Bayesian FDR control procedures can lose finite-sample FDR control in a variety of common sparse cases. Based on the intimate relationship between the minimax estimation and the level of FDR control discovered in this work, we point out potential generalizations to achieve FDR control for both more complicated models and the general global-local shrinkage prior family.
... [12][13][14][15][16][17][18][19]. We analyzed the data for each country separately and determined the statistical significance of all results reported in the next section by applying the false discovery rate correction.⁴⁹
... Effects marked with an asterisk (*) were statistically significant after applying the false discovery rate correction (see the Supplemental Material, p. 3).⁴⁹ A significant effect in the "η² human vs. synthetic" column means that the effects that experimental manipulations produced for human participants significantly differed from those they produced for synthetic participants. ...
Article
Full-text available
Researchers are testing the feasibility of using the artificial intelligence tools known as large language models to create synthetic research participants—artificial entities that respond to surveys as real humans would. Thus far, this research has largely not been designed to examine whether synthetic participants could mimic human answers to policy-relevant surveys or reflect the views of people from non-WEIRD (Western, educated, industrialized, rich, and democratic) nations. Addressing these gaps in one study, we have compared human and synthetic participants’ responses to policy-relevant survey questions in three domains: sustainability, financial literacy, and female participation in the labor force. Participants were drawn from the United States as well as two non-WEIRD nations that have not previously been included in studies of synthetic respondents: the Kingdom of Saudi Arabia and the United Arab Emirates. We found that for all three nations, the synthetic participants created by GPT-4, a form of large language model, on average produced responses reasonably similar to those of their human counterparts. Nevertheless, we observed some differences between the American and non-WEIRD participants: For the latter, the correlations between human and synthetic responses to the full set of survey questions tended to be weaker. In addition, although we found a common tendency in all three countries for synthetic participants to show more positive and less negative bias (that is, to be more progressive and financially literate relative to their human counterparts), this trend was less pronounced for the non-WEIRD participants. We discuss the main policy implications of our findings and offer practical recommendations for improving the use of synthetic participants in research.
... For comparisons of two independent groups, Welch's t-test [22] was chosen due to its high statistical power and ability to minimize the risk of invalid statistical inferences due to unequal sample sizes [23]. The Benjamini-Hochberg method was applied to adjust p-values to effectively control false discovery rates [24]. The contingency between methylation biomarkers and categorical variables was examined using Fisher's exact test due to the small sample size. ...
Article
Full-text available
Background: Ovarian cancer (OC) is the third most common and second most lethal onco-gynecological disease in the world, with high-grade serous ovarian cancer (HGSOC) making up the majority of OC cases worldwide. The current serological biomarkers used for OC diagnosis lack sensitivity and specificity; thus, new biomarkers are greatly needed. Recently, the chromatin remodeling complex gene ARID1A, Notch and Wnt pathway gene expression, as well as HOX-related gene promoter methylation have been linked with promoting OC. Methods: In this pilot study, 10 gene expression biomarkers and 4 promoter methylation biomarkers were examined as potential diagnostic and prognostic indicators of OC in 65 fresh-frozen gynecologic tumor tissues. Results: Out of 10 genes analyzed, the expression of eight biomarkers was significantly reduced in OC cases compared to benign cases, and HOX-related gene promoter methylation significantly increased in OC tumors. Out of 14 biomarkers, CTNNB1 showed the best single biomarker separation of HGSOC from benign cases (AUC = 0.97), while a combination of the seven Notch pathway-related gene expressions (NOTCH1, NOTCH2, NOTCH3, NOTCH4, DLL1, JAG2, and HES1) demonstrated the best separation of HGSOC from the benign cases (AUC = 1). Conclusions: The combination of multiple gene expression or gene promoter methylation biomarkers shows great promise for the development of an effective biomarker-based diagnostic approach for OC.
... A popular error criterion for multiple hypothesis testing is the false discovery rate (FDR), first introduced by Benjamini and Hochberg (1995). The FDR is the expected proportion of discoveries that are false, and is defined as follows: ...
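The definition is truncated in this excerpt; in the usual notation (V = number of false rejections, R = total number of rejections), it reads:

```latex
% FDR as defined by Benjamini & Hochberg, with V/R taken to be 0 when R = 0
\[
  \mathrm{FDR} \;=\; \mathbb{E}\!\left[\frac{V}{\max(R,\,1)}\right]
  \;=\; \mathbb{E}\!\left[\left.\frac{V}{R}\,\right|\, R>0\right]\Pr(R>0).
\]
```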
Preprint
Researchers often lack the resources to test every hypothesis of interest directly or compute test statistics comprehensively, but often possess auxiliary data from which an estimate of the experimental outcome can be computed. We introduce a novel approach for selecting which hypotheses to query a statistic (i.e., run an experiment, perform expensive computation, etc.) in a hypothesis testing setup by leveraging estimates (e.g., from experts, machine learning models, previous experiments, etc.) to compute proxy statistics. Our framework allows a scientist to propose a proxy statistic, and then query the true statistic with some probability based on the value of the proxy. We make no assumptions about how the proxy is derived and it can be arbitrarily dependent with the true statistic. If the true statistic is not queried, the proxy is used in its place. We characterize "active" methods that produce valid p-values and e-values in this setting and utilize this framework in the multiple testing setting to create procedures with false discovery rate (FDR) control. Through simulations and real data analysis of causal effects in scCRISPR screen experiments, we empirically demonstrate that our proxy framework has both high power and low resource usage when our proxies are accurate estimates of the respective true statistics.
... If the assumption of sphericity was violated, the Greenhouse-Geisser correction was applied. Post-hoc tests with Benjamini-Hochberg correction (Benjamini and Hochberg, 1995) for multiple comparisons were applied when significant effects were observed. ...
Article
Full-text available
Humans achieve efficient behaviors by perceiving and responding to errors. Error-related potentials (ErrPs) are electrophysiological responses that occur upon perceiving errors. Leveraging ErrPs to improve the accuracy of brain-computer interfaces (BCIs), utilizing the brain's natural error-detection processes to enhance system performance, has been proposed. However, the influence of external and contextual factors on the detectability of ErrPs remains poorly understood, especially in multitasking scenarios involving both BCI operations and sensorimotor control. Herein, we hypothesized that the difficulty in sensorimotor control would lead to the dispersion of neural resources in multitasking, resulting in a reduction in ErrP features. To examine this, we conducted an experiment in which participants were instructed to keep a ball within a designated area on a board, while simultaneously attempting to control a cursor on a display through motor imagery. The BCI provided error feedback with a random probability of 30%. Three scenarios, without a ball (single-task), with a lightweight ball (easy-task), and with a heavyweight ball (hard-task), were used for the characterization of ErrPs based on the difficulty of sensorimotor control. In addition, to examine the impact of multitasking on ErrP-BCI performance, we analyzed single-trial classification accuracy offline. Contrary to our hypothesis, varying the difficulty of sensorimotor control did not result in significant changes in ErrP features. However, multitasking significantly affected ErrP classification accuracy. Post-hoc analyses revealed that the classifier trained on single-task ErrPs exhibited reduced accuracy under hard-task scenarios. To our knowledge, this study is the first to investigate how ErrPs are modulated in a multitasking environment involving both sensorimotor control and BCI operation in an offline framework. Although the ErrP features remained unchanged, the observed variation in accuracy suggests the need to design classifiers that account for task load even before implementing a real-time ErrP-based BCI.
... EM, MP, and HP at the horizontal axis represent evenly mixed, moderately patched and highly patched arrays, respectively. Different letters represent significant differences at the 0.05 level in pairwise tests with FDR (false discovery rate) controlling procedures (Benjamini & Hochberg, 1995). ...
Article
Full-text available
Pollinating insects often exhibit flower constancy, that is, the tendency to make consecutive visits to the same flower species while disregarding others. This behaviour is commonly attributed to the cost of retrieving visual or motor memories from long‐term storage while switching between flowers with distinct colours and shapes. Accordingly, researchers often predict co‐flowering species to exhibit significantly greater phenotypic diversity than random expectation, thereby minimizing heterospecific pollen transfer. However, field observations have not consistently supported this notion. The observed inconsistencies may arise from variations in travel costs, which depend on the interaction between the foragers' constancy level and the spatial mixing of plant species. If species are evenly mixed, constant pollinators incur higher levels of travel cost due to the frequent skipping of neighbouring flowers. In contrast, if species are patchily distributed, constant pollinators experience lower levels of travel cost, as most neighbours are of the same species. Considering this, ‘realized flower constancy’ may be determined as an optimal strategy for balancing cognitive and travel costs, which dynamically vary across different degrees of spatial species mixing. Here we test this possibility in indoor experiments with bumble bees foraging from two differently coloured artificial flowers (‘species’) arranged at three mixing levels. First, bees dramatically reduced flower constancy as species mixing increased, irrespective of flower spacing. Second, bees were less inclined to switch species after accumulating consecutive visits to one species, suggesting a rapid decay of another species' information in short‐term memory back to long‐term storage. This effect may have additionally contributed to the increased flower constancy observed in species with patchy distributions. Third, bees showed minimal constancy for similarly coloured, evenly mixed flower species, suggesting that these flowers were operated with shared short‐term memory. The constancy level was hardly affected by colour similarity when species were patchily distributed. Results support our initial hypothesis that realized flower constancy reflects an optimal foraging strategy rather than a fixed outcome of cognitive limitation. Notably, bees' constancy increased significantly with greater colour differences only when species were evenly mixed, suggesting a novel perspective: spatial mixing promotes the evolution and maintenance of floral diversity. Read the free Plain Language Summary for this article on the Journal blog.
... *** p < .001). To account for the false discovery rate when conducting multiple hypothesis tests, we applied the Benjamini-Hochberg method and also reported adjusted p values (Benjamini & Hochberg, 1995). ...
Article
Full-text available
Recovery from work is important for promoting employees’ well-being but little is known about which environments are most conducive for recovery. This article examines the relationship between recovery and experiencing nature and, thus, provides a link between recovery research and environmental psychology. In two studies, we drew on the effort-recovery model and proposed that contact with nature is associated with employees’ recovery experiences and affective well-being. In Study 1, we theorized that appraising nature as esthetic is an underlying mechanism in the relationship between being in nature and recovery. Using an experience sampling approach with multisource data from self-reports and smartphone photos (N = 50, measurements = 411), we found that being in nature was indirectly related to recovery experiences (i.e., relaxation, detachment) and affective well-being (i.e., positive activation, serenity, low fatigue) via perceived attractiveness. In Study 2, we theorized that appreciative contact with nature (i.e., nature savoring) is linked to enhanced recovery and well-being. Using a randomized controlled trial (N = 66), we found that a nature-savoring intervention, compared to a waiting-list control group, had beneficial effects on recovery experiences and positive affective states. Overall, our results suggest that contact with nature is a prototypical setting for employees’ recovery, and we discuss theoretical and practical implications of this finding for occupational health psychology.
... Dunn's test of multiple comparisons (Dunn 1964). The false discovery rate (FDR) was controlled using the Benjamini-Hochberg adjustment (Benjamini and Hochberg 1995). It is worth noting that this division is not a regionalization; cities in a given group are often geographically distant from each other. ...
Article
Full-text available
On January 1, 1999, Poland implemented an administrative reform that resulted in 31 cities losing their status as voivodship capitals. This change removed a significant, and for some cities, the most important factor driving their development. Negative consequences quickly emerged, including the out-migration of qualified staff, population decline, reduced investment and economic activity, and a decrease in the income of residents. The aim of the study is to assess the changes in the economic situation of these former voivodship capitals following the reform’s implementation. Using statistical methods, convergence models, a taxonomic measure of development, and the Wrocław taxonomy method, the study examines the socioeconomic changes that occurred. The analysis is based on the average annual growth rates of ten selected indicators, focusing on changes in the socioeconomic situation rather than their absolute level of development. Therefore, cities ranked highest are those that developed the fastest in the analyzed period, not necessarily those with the highest level of economic development. The taxonomic measure of development shows values ranging from 0.31 to 0.65. This indicates that individual cities experienced different rates of development after the reform, but no clear leader emerged with the highest growth rate across most indicators. Similarly, no cities demonstrated consistently low growth rates across most indicators.
... Missing data were imputed using IVEware (Raghunathan et al., 2002) with the baseline levels of the outcomes and demographic factors, with imputed data in 20 data sets analyzed separately; model parameters and standard errors were combined following Rubin (1987). A Benjamini-Hochberg false discovery rate correction (Benjamini & Hochberg, 1995) was applied, and unadjusted and adjusted p values are reported in the results tables. ...
Article
Full-text available
Objective: To evaluate an online intervention to support family members of individuals who sustained a traumatic brain injury (TBI). Research Design: Randomized controlled trial. Parallel assignment to TBI Family Support (TBIFS) intervention or enhanced usual care control (TAU). Three testing timepoints: pretest baseline (T1), posttest within 2 weeks of assignment (T2), and follow-up 1 month after posttest (T3). Setting: Online. Participants: Sixty-eight caregivers recruited nationally: 18 years of age or older, English speaking, providing primary caregiving to an adult family member with TBI and mild to moderate disability. Intervention: Eight interactive modules providing information about cognitive, behavioral, and social consequences of TBI, training in a problem-solving framework, and application exercises (N = 35). TAU was an informational website (N = 33). Measures: Proximal outcomes—program use, usability, and user satisfaction for TBIFS participants. Primary outcomes—TBI content knowledge, strategy application objective response and open-ended response, and strategy-application confidence. Secondary outcomes—appraisals of burden, satisfaction, uncertainty in mastery, guilt, and negative environment. Results: Proximal outcomes—about 80% of TBIFS participants completed the posttest assessment, and 91% reported moderate to high usability and user satisfaction. Primary outcomes—greater posttest gains in TBI content knowledge for TBIFS than TAU (t = 3.53, p = .0005, adjusted p = .0090, d = 0.91). Gains maintained through follow-up (t = 2.89, p = .0038, adjusted p = .0342, d = 0.90). No other effects for the primary or secondary outcomes. Conclusion: TBIFS improved TBI content knowledge relative to TAU. Modifications might be needed to improve application and distal outcomes for caregivers.
... Main effects of the best-fitting models were inspected using omnibus Type III F tests with Satterthwaite's approximation for degrees of freedom. Post-hoc analyses were corrected for multiple comparisons using the false discovery rate (FDR) correction method (Benjamini & Hochberg, 1995). Given that mental health indicators were found to fluctuate depending on pandemic-related social distancing restrictions (Pedersen et al., 2022), we further explored the interrelationships between changes in friendship quality and mental health symptoms across all assessment timepoints. ...
Article
Full-text available
Young people with childhood adversity (CA) were at increased risk to experience mental health problems during the COVID-19 pandemic. Pre-pandemic research identified high-quality friendship support as a protective factor that can buffer against the emergence of mental health problems in young people with CA. This longitudinal study investigated friendship buffering effects on mental health symptoms before and at three timepoints during the pandemic in 102 young people (aged 16–26) with low to moderate CA. Multilevel analyses revealed a continuous increase in depression symptoms following the outbreak. Friendship quality was perceived as elevated during lockdowns and returned to pre-pandemic baseline levels during reopening. A stress-sensitizing effect of CA on social functioning was evident, as social thinning occurred following the outbreak. Bivariate latent change score modeling revealed that before and during the pandemic, young people with greater friendship quality self-reported lower depression symptoms and vice versa. Furthermore, sequential mediation analysis showed that high-quality friendships before the pandemic buffered depression symptoms during the pandemic through reducing perceived stress. These findings highlight the importance of fostering stable and supportive friendships in young people with CA and suggest that through reducing stress perceptions high-quality friendships can mitigate mental health problems during times of multidimensional stress.
... The significance of the difference between the two proportions was compared using the Binomial test. Pairwise differences between embedding models were assessed using the Wilcoxon signed-rank test, with p-values adjusted for multiple comparisons using the Benjamini-Hochberg method.²¹ The adjusted p-values are reported as q-values. ...
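Adjusted p-values like the q-values reported here are commonly computed with statsmodels. Below is a small sketch under made-up data; the per-item scores and the three-comparison setup are hypothetical, not taken from the study:

```python
import numpy as np
from scipy.stats import wilcoxon
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
raw_p = []
for _ in range(3):                           # three pairwise model comparisons
    a = rng.normal(0.80, 0.05, size=30)      # per-item scores, embedding model A
    b = a + rng.normal(0.02, 0.05, size=30)  # per-item scores, embedding model B
    raw_p.append(wilcoxon(a, b).pvalue)      # paired Wilcoxon signed-rank test

# Benjamini-Hochberg adjustment; the adjusted p-values are the q-values
reject, q_values, _, _ = multipletests(raw_p, alpha=0.05, method="fdr_bh")
for p, q, r in zip(raw_p, q_values, reject):
    print(f"p = {p:.4f}   q = {q:.4f}   reject = {r}")
```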
Preprint
Full-text available
Referral workflow inefficiencies, including misaligned referrals and delays, contribute to suboptimal patient outcomes and higher healthcare costs. In this study, we investigated the possibility of predicting procedural needs based on primary care diagnostic entries, thereby improving referral accuracy, streamlining workflows, and providing better care to patients. A de-identified dataset of 2,086 orthopedic referrals from the University of Texas Health at Tyler was analyzed using machine learning models built on Base General Embeddings (BGE) for semantic extraction. To ensure real-world applicability, noise tolerance experiments were conducted, and oversampling techniques were employed to mitigate class imbalance. The selected optimum and parsimonious embedding model demonstrated high predictive accuracy (ROC-AUC: 0.874, Matthews Correlation Coefficient (MCC): 0.540), effectively distinguishing patients requiring surgical intervention. Dimensionality reduction techniques confirmed the model's ability to capture meaningful clinical relationships. A threshold sensitivity analysis identified an optimal decision threshold (0.30) to balance precision and recall, maximizing referral efficiency. In the predictive modeling analysis, the procedure rate increased from 11.27% to an optimal 60.1%, representing a 433% improvement with significant implications for operational efficiency and healthcare revenue. The results of our study demonstrate that referral optimization can enhance primary and surgical care integration. Through this approach, precise and timely predictions of procedural requirements can be made, thereby minimizing delays, improving surgical planning, and reducing administrative burdens. In addition, the findings highlight the potential of clinical decision support as a scalable solution for improving patient outcomes and the efficiency of the healthcare system.
Article
Full-text available
The atypical antipsychotic clozapine targets multiple receptor systems beyond the dopaminergic pathway and influences prepulse inhibition (PPI), a critical translational measure of sensorimotor gating. Since PPI is modulated by atypical antipsychotics such as risperidone and clozapine, we hypothesized that p11—an adaptor protein associated with anxiety- and depressive-like behaviors and G-protein-coupled receptor function—might modulate these effects. In this study, we assessed the role of p11 in clozapine’s PPI-enhancing effect by testing wild-type and global p11 knockout (KO) mice in response to haloperidol, risperidone, and clozapine. We also performed structural and functional brain imaging. Contrary to our expectation that anxiety-like p11-KO mice would exhibit an augmented startle response and heightened sensitivity to clozapine, PPI tests showed that p11-KO mice were unresponsive to the PPI-enhancing effects of risperidone and clozapine. Imaging revealed distinct regional brain volume differences and reduced hippocampal connectivity in p11-KO mice, with significantly blunted clozapine-induced connectivity changes in the CA1 region. Our findings highlight a novel role for p11 in modulating clozapine’s effects on sensorimotor gating and hippocampal connectivity, offering new insight into its functional pathways.
Article
Full-text available
Background The receptor tyrosine kinase TIE2 and its ligands, angiopoietins (ANGPTs), promote angiogenesis. In addition to expression on vascular endothelial cells, TIE2 is expressed on M2-like pro-tumorigenic macrophages. Thus, the TIE2 inhibitor rebastinib was developed as a potential therapy to address multiple cancers. The objective of this study was to determine the effects of rebastinib alone and combined with chemotherapy in a syngeneic murine model of ovarian cancer. Methods Female C57Bl6J mice were intraperitoneally injected with syngeneic ID8 ovarian cancer cells. Once tumors were established, mice were untreated (control) or treated with rebastinib, carboplatin plus paclitaxel (chemotherapy), or rebastinib plus chemotherapy. In one set of experiments, survival was followed for 140 days. In other experiments, ascites was harvested 24 h after the last treatment and analyzed by flow cytometry. In in vitro experiments, RNA sequencing was performed on ID8 cells and murine peritoneal macrophage cells (PMJ2R) after treatment with rebastinib, chemotherapy, or rebastinib plus chemotherapy. Results Tumor-bearing mice treated with rebastinib plus chemotherapy had longer median survival than mice treated with chemotherapy (132.5 vs. 127 days, P < 0.01). Ascites from mice treated with rebastinib had more CD45 + macrophages (P < 0.03) and cytotoxic T cells (P < 0.0001) than ascites from mice treated with chemotherapy. Rebastinib had no significant effect on the numbers of regulatory T cells, Tie2 + macrophages, or Tie2 + M2 macrophages. In ID8 cells, in vitro, rebastinib treatment upregulated 1528 genes and downregulated 3115 genes. In macrophages, in vitro, rebastinib treatment upregulated 2302 genes and downregulated 2970 genes. Rebastinib differentially regulated ANGPT-like proteins in both types of cells, including several ANGPT-like genes involved in tumorigenesis, angiogenesis, and proliferation. ANGPTL1, an anti-angiogenic and anti-apoptotic gene, was increased tenfold in ID8 cells treated with rebastinib (P < 0.001) but was not altered in macrophages. Conclusions Rebastinib plus chemotherapy extends survival in a syngeneic murine model of ovarian cancer. Rebastinib alters proportions of immune cell subsets, increases cytotoxic T cells in ascites, and alters gene expression in tumor cells and macrophages.
Preprint
Access to diverse, high-quality datasets is crucial for machine learning model performance, yet data sharing remains limited by privacy concerns and competitive interests, particularly in regulated domains like healthcare. This dynamic especially disadvantages smaller organizations that lack resources to purchase data or negotiate favorable sharing agreements. We present SecureKL, a privacy-preserving framework that enables organizations to identify beneficial data partnerships without exposing sensitive information. Building on recent advances in dataset combination methods, we develop a secure multiparty computation protocol that maintains strong privacy guarantees while achieving >90% correlation with plaintext evaluations. In experiments with real-world hospital data, SecureKL successfully identifies beneficial data partnerships that improve model performance for intensive care unit mortality prediction while preserving data privacy. Our framework provides a practical solution for organizations seeking to leverage collective data resources while maintaining privacy and competitive advantages. These results demonstrate the potential for privacy-preserving data collaboration to advance machine learning applications in high-stakes domains while promoting more equitable access to data resources.
Article
Heterogeneous treatment effects are driven by treatment effect modifiers (TEMs), pretreatment covariates that modify the effect of a treatment on an outcome. Current approaches for uncovering these variables are limited to low-dimensional data, data with weakly correlated covariates, or data generated according to parametric processes. We resolve these issues by proposing a framework for defining model-agnostic TEM variable importance parameters (TEM-VIPs), deriving one-step, estimating equation, and targeted maximum likelihood estimators of these parameters, and establishing these estimators’ asymptotic properties. This framework is showcased by defining TEM-VIPs for data-generating processes with continuous, binary, and time-to-event outcomes with binary treatments, and deriving accompanying asymptotically linear estimators. Simulation experiments demonstrate that these estimators’ asymptotic guarantees are approximately achieved in realistic sample sizes in randomized and observational studies alike. This methodology is also applied to gene expression data collected in a clinical trial assessing the effect of a novel therapy on disease-free survival in breast cancer patients. Predicted TEMs have previously been linked to treatment resistance.
Article
Introduced species are one of the biggest threats to aquatic ecosystems. Rainbow trout (Oncorhynchus mykiss, Salmonidae) is considered one of the most dangerous introduced predatory fish species, as they often put native species at risk of extinction. This study evaluated the effects of rainbow trout introduction on Andean stream food webs. We tested the hypothesis that the presence of rainbow trout changes Andean stream food webs by changing the diet of other carnivorous species and the energy source supporting native fish species. We sampled streams with and without rainbow trout and with different vegetation cover (i.e., pastures and crops versus forest) and combined data from stomach contents and stable isotopes. We analyzed stomach contents from 231 specimens of two native catfish species and 116 rainbow trout specimens. Our results demonstrate that aquatic insects are essential in Andean stream food webs, where collector gatherers and scrapers were the most consumed and assimilated by catfish and rainbow trout species. Leaf litter was predicted to be the primary energy source in streams with forest cover. In contrast, periphyton contributed the most in streams with pastures and crops. We conclude that the presence of trout coupled with land use/land cover (i.e., vegetation cover) exerts a substantial effect on Andean stream food webs. Protection of riparian forests of the Andean streams of Colombia is needed to guarantee the stability of aquatic food webs and thus help native species coexist with established non‐native species introduced decades ago.
Article
In the digital era, leveraging communication technologies to foster collaborative learning is of utmost importance. This study explores the impact of different communication modalities, such as text, audio and video, on social presence and regulation processes within a computer-supported collaborative learning (CSCL) environment. Using learning analytics, we examine the influences of these modalities on collaboration and derive recommendations for their optimized use in the design of future CSCL environments. Our findings reveal a significant impact of communication modalities on the sense of social presence and regulation of collaborative activities. Audio communication results in enhanced co-presence, psychobehavioral accessibility, and better regulation processes compared to video and text modalities, indicating that audio is the most suitable modality in collaborative virtual environments for decision-making tasks. Conversely, video communication still facilitated strategic planning and enhanced self-regulation. Chat communication showed the lowest sense of social presence, yet improvements over time suggest that participants adapt to this modality, enhancing their collaborative efficiency.
Article
Altered brain connectivity and atypical neural oscillations have been observed in autism, yet their relationship with autistic traits in nonclinical populations remains underexplored. Here, we employ electroencephalography to examine functional connectivity, oscillatory power, and broadband aperiodic activity during a dynamic facial emotion processing task in 101 typically developing children aged 4 to 12 years. We investigate associations between these electrophysiological measures of brain dynamics and autistic traits as assessed by the Social Responsiveness Scale, 2nd Edition (SRS-2). Our results revealed that increased facial emotion processing–related connectivity across theta (4 to 7 Hz) and beta (13 to 30 Hz) frequencies correlated positively with higher SRS-2 scores, predominantly in right-lateralized (theta) and bilateral (beta) cortical networks. Additionally, a steeper 1/f-like aperiodic slope (spectral exponent) across fronto-central electrodes was associated with higher SRS-2 scores. Greater aperiodic-adjusted theta and alpha oscillatory power further correlated with both higher SRS-2 scores and steeper aperiodic slopes. These findings underscore important links between facial emotion processing-related brain dynamics and autistic traits in typically developing children. Future work could extend these findings to assess these electroencephalography-derived markers as potential mechanisms underlying behavioral difficulties in autism.
Article
In single-cell studies, cells can be characterized with multiple sources of heterogeneity (SOH) such as cell type, developmental stage, cell cycle phase, activation state, and so on. In some studies, many nuisance SOH are of no interest, but may confound the identification of the SOH of interest and thus hinder accurate annotation of the corresponding cell subpopulations. In this paper, we develop B-Lightning, a novel and robust method designed to identify marker genes and cell subpopulations corresponding to an SOH (e.g. cell activation status), isolating it from other SOH (e.g. cell type, cell cycle phase). B-Lightning uses an iterative approach to enrich a small set of trustworthy marker genes to more reliable marker genes and boost the signals of the SOH of interest. Multiple numerical and experimental studies showed that B-Lightning outperforms existing methods in terms of sensitivity and robustness in identifying marker genes. Moreover, it increases the power to differentiate cell subpopulations of interest from other heterogeneous cohorts. B-Lightning successfully identified new senescence markers in ciliated cells from human idiopathic pulmonary fibrosis lung tissues, new T-cell memory and effector markers in the context of SARS-COV-2 infections, and their synchronized patterns that were previously neglected, new AD markers that can better differentiate AD severity, and new dendritic cell functioning markers with differential transcriptomics profiles across breast cancer subtypes. This paper highlights B-Lightning’s potential as a powerful tool for single-cell data analysis, particularly in complex data sets where SOH of interest are entangled with numerous nuisance factors.
Article
Full-text available
Aliphatic polyamides, or nylons, are widely used in the textile and automotive industry due to their high durability and tensile strength, but recycling rates are below 5%. Chemical recycling of polyamides is possible but typically yields mixtures of monomers and oligomers, which hinders downstream purification. Here, Pseudomonas putida KT2440 was engineered to metabolize C6-polyamide monomers such as 6-aminohexanoic acid, ε-caprolactam and 1,6-hexamethylenediamine, guided by adaptive laboratory evolution. Heterologous expression of nylonases also enabled P. putida to metabolize linear and cyclic nylon oligomers derived from chemical polyamide hydrolysis. RNA sequencing and reverse engineering revealed the metabolic pathways for these non-natural substrates. To demonstrate microbial upcycling, the phaCAB operon from Cupriavidus necator was heterologously expressed to enable production of polyhydroxybutyrate (PHB) from PA6 hydrolysates. This study presents a microbial host for the biological conversion, in combination with chemical hydrolysis, of polyamide monomers and mixed polyamide hydrolysates to a value-added product.
Article
Full-text available
Background Changes in large-scale brain networks have been reported in migraine patients, but it remains unclear how these manifest in the various phases of the migraine cycle. Case-control fMRI studies spanning the entire migraine cycle are lacking, precluding a complete assessment of brain functional connectivity in migraine. Such studies are essential for understanding the inherent changes in the brain of migraine patients as well as transient changes along the cycle. Here, we leverage the concept of functional connectome (FC) fingerprinting, whereby individual subjects may be identified based on their FC, to investigate changes in FC and its stability across different phases of the migraine cycle. Methods We employ a case-control longitudinal design to study a group of 10 patients with episodic menstrual or menstrual-related migraine without aura, in the 4 phases of their spontaneous migraine cycle (preictal, ictal, postictal, interictal), and a group of 14 healthy controls in corresponding phases of the menstrual cycle, using resting-state functional magnetic resonance imaging (fMRI). We propose a novel multilevel clinical connectome fingerprinting approach to analyse the FC identifiability not only within-subject, but also within-session and within-group. Results This approach allowed us to obtain individual FC fingerprints by reconstructing the data using the first 19 principal components to maximize identifiability at all levels. We found decreased FC identifiability for patients in the preictal phase relative to controls, which increased with the progression of the attack and became comparable to controls in the interictal phase. Using Network-Based Statistic analysis, we found increased FC strength across several brain networks for patients in the ictal and postictal phases relative to controls. Conclusion Our novel multilevel clinical connectome fingerprinting approach captured FC variations along the migraine cycle in a case-control longitudinal study, bringing new insights into the cyclic nature of the disorder.
Article
Full-text available
How the brain encodes, recognizes, and memorizes general visual objects is a fundamental question in neuroscience. Here, we investigated the neural processes underlying visual object perception and memory by recording from 3173 single neurons in the human amygdala and hippocampus across four experiments. We employed both passive-viewing and recognition memory tasks involving a diverse range of naturalistic object stimuli. Our findings reveal a region-based feature code for general objects, where neurons exhibit receptive fields in the high-level visual feature space. This code can be validated by independent new stimuli and replicated across all experiments, including fixation-based analyses with large natural scenes. This region code explains the long-standing visual category selectivity, preferentially enhances memory of encoded stimuli, predicts memory performance, encodes image memorability, and exhibits intricate interplay with memory contexts. Together, region-based feature coding provides an important mechanism for visual object processing in the human brain.
Article
Surface electromyography (sEMG) signals are electrical signals released by muscles during movement, which can directly reflect the muscle conditions during various actions. When a series of continuous static actions are connected along the temporal axis, a sequential action is formed, which is more aligned with people's intuitive understanding of real-life movements. The signals acquired during sequential actions are known as sequential sEMG signals, including an additional dimension of sequence, embodying richer features compared to static sEMG signals. However, existing methods show inadequate utilization of the signals' sequential characteristics. Addressing these gaps, this paper introduces the Spatio-Temporal Feature Extraction Network (STFEN), which includes a Sequential Feature Analysis Module based on static-sequential knowledge transfer, and a Spatial Feature Analysis Module based on dynamic graph networks to analyze the internal relationships between the leads. The effectiveness of STFEN is tested on both modified publicly available datasets and on our acquired Arabic Digit Sequential Electromyography (ADSE) dataset. The results show that STFEN outperforms existing models in recognizing sequential sEMG signals. Experiments have confirmed the reliability and wide applicability of STFEN in analyzing complex muscle activities. Furthermore, this work also suggests STFEN's potential benefits in rehabilitation medicine, particularly for stroke recovery, and shows promising future applications.
Article
Background Immune checkpoint inhibitors (ICIs) are the gold standard therapy in patients with deficient mismatch repair (dMMR)/microsatellite instability-high (MSI-H) metastatic colorectal cancer (mCRC). A significant proportion of patients show resistance, making the identification of determinants of response crucial. Growing evidence supports the role of sex in determining susceptibility to anticancer therapies, but data is lacking for patients with MSI-H CRC. Methods In this real-world cohort comprising 624 patients with MSI-H mCRC receiving ICIs, we investigated the impact of sex on patients’ outcomes, overall and according to RAS-BRAF mutational status or type of treatment (anti-PD-(L)1 with or without anti-CTLA-4 agents). We then investigated these associations also in two independent cohorts of patients with early-stage or advanced MSI-H CRC unexposed to ICIs. Finally, we explored two public microarray and RNA-seq datasets from patients with non-metastatic or metastatic MSI-H CRC to gain translational insights on the association between sex, BRAF status and immune contextures/ICI efficacy. Results Although no differences were observed between females and males either overall or in the BRAF wild-type cohort, male sex was associated with inferior progression-free survival (PFS) and overall survival (OS) in the BRAF mutated cohort (in multivariable models, HR for PFS: 1.79, 95% CI: 1.13 to 2.83, p=0.014, and for OS: 2.33, 95% CI: 1.36 to 3.98, p=0.002). Males receiving anti-PD-(L)1 monotherapy had the worst outcomes, with a 3-year PFS and 3-year OS of 23.9% and 41.8%, respectively, while the addition of anti-CTLA-4 agents rescued such a worse outcome. We also observed that females experienced a higher frequency of any-grade immune-related adverse events. Conversely, sex was not prognostic in the independent cohorts of patients with MSI-H CRCs not treated with ICIs. Exploratory transcriptomic analyses suggest that tumors of males with BRAF mutated MSI-H metastatic CRC are characterized by an enrichment of androgen receptor signature and an immune-depleted microenvironment, with a reduction in memory B cells, activated natural killer cells, and activated myeloid dendritic cells. Conclusions Overall, our findings suggest a complex interplay between sex and BRAF mutational status that may modulate the activity of ICIs in patients with MSI-H mCRC and pave the way to novel tailored strategies.
Article
Full-text available
Waldenstrom’s Macroglobulinemia (WM) is an IgM-secreting bone marrow (BM) lymphoma that is preceded by an asymptomatic state (AWM). To dissect tumor-intrinsic and immune mechanisms of progression, we perform single-cell RNA-sequencing on 294,206 BM tumor and immune cells from 30 patients with AWM/WM, 26 patients with Smoldering Myeloma, and 23 healthy donors. Despite their early stage, patients with AWM present extensive immune dysregulation, including in normal B cells, with disease-specific immune hallmarks. Patient T and NK cells show systemic hypo-responsiveness to interferon, which improves with interferon administration and may represent a therapeutic vulnerability. MYD88-mutant tumors show transcriptional heterogeneity, which can be distilled in a molecular classification, including a DUSP22/CD9-positive subtype, and progression signatures which differentiate IgM MGUS from overt WM and can help advance WM research and clinical practice.
Article
Gene duplication and loss play pivotal roles in the evolutionary dynamics of genomes, contributing to species phenotypic diversity and adaptation. However, detecting copy number variations (CNVs) in homoploid populations and newly-diverged species using short reads from next-generation sequencing (NGS) with traditional methods can often be challenging due to uneven read coverage caused by variations in GC content and the presence of repetitive sequences. To address these challenges, we developed a novel pipeline, ST4gCNV, which leverages ultra-fast de novo assemblies of NGS data to detect gene-specific CNVs between populations. The pipeline effectively reduces the variance of read coverage due to technical factors such as GC bias, providing a reliable CNV detection with a minimum sequencing depth of 10. We successfully apply ST4gCNV to the resequencing analysis of homoploid species Nelumbo nucifera and Nelumbo lutea (lotus). We reveal significant CNV-driven differentiation between these species, particularly in genes related to petal colour diversity such as those involved in the anthocyanin pathway. By highlighting the extensive gene duplication and loss events in Nelumbo, our study demonstrates the utility of ST4gCNV in population genomics and underscores its potential of integrating genomic CNV analysis with traditional SNP-based resequencing analysis.
Article
Background In the United States, Pulmonary and Critical Care Medicine (PCCM) fellowship training traditionally requires performing a minimum number of bronchoscopy and pleural procedures to be deemed competent. However, expert panel recommendations favor assessments based on skill and knowledge. PCCM trainees have a variable exposure to the advanced procedures in the presence of interventional pulmonary (IP) fellowships, so we surveyed the PCCM program directors (PD) across the United States to assess the procedural volume and competency of their fellows. Methods Survey invitations were emailed between April 2022 and May 2022, and responses were collected from PCCM fellowship programs. The PD assessed the competency and volume of procedures performed by PCCM fellows at the end of training. The primary objective was to determine the effect of IP fellowship or IP faculty on fellows’ procedural competency. The secondary objective was to assess the same impact on procedural volume. Results The survey response rate was 41.9% (n=109/260), with an average of 4.23 fellows/program (95% CI: 3.9-4.6). 74.5% (73/98) of programs reported having access to IP faculty, while 26.5% (26/98) had an AABIP-accredited IP fellowship. No significant difference was noted for procedural competency or volume in programs with or without an IP fellowship or IP faculty during training. Most programs reported that PCCM fellows do not perform advanced bronchoscopy procedures. Conclusion An IP fellowship or IP faculty at a PCCM training institution did not appear to influence the PD-assessed volume or competency of common bronchoscopy and pleural procedures performed by fellows.
Article
Due to the wide uses of plastic products, nanoplastics are ubiquitous contaminants in the environment. Hence, extensive studies used various models to evaluate the toxicity of nanoplastics. In the present study, we developed yellow mealworm (Tenebrio molitor) as an alternative model to investigate the acute toxicity of nanoplastics. Our results showed that microinjection with 500 mg/kg nanoplastics significantly increased the death rate of yellow mealworms after 24 or 48 h, with 100 nm particles being more effective compared with 20 nm ones. Meanwhile, a dose‐dependent increase of death rate was observed in yellow mealworms after injection with 2–200 mg/kg 100 nm nanoplastics. Exposure to 2 mg/kg 100 nm but not 20 nm nanoplastics also led to hyperactivity of yellow mealworms. Both types of nanoplastics altered metabolite profiles, in that 20 nm nanoplastics significantly up‐regulated 9 and down‐regulated 12 metabolites, whereas 100 nm nanoplastics significantly up‐regulated 16 and down‐regulated 25 metabolites. Enrichment analysis revealed that 100 nm but not 20 nm nanoplastics significantly affected alpha‐linolenic acid metabolism (ko00592) and purine metabolism (ko00230). For the metabolites belonging to these pathways, 100 nm nanoplastics significantly up‐regulated stearidonic acid but down‐regulated guanine. Combined, these results revealed size‐dependent effects of nanoplastics on acute toxicity, hyperactivity and metabolite profile changes in yellow mealworms. These results also indicated the potential uses of yellow mealworms as a cheap and simple model to evaluate the toxicity of nanoplastics.
Article
Full-text available
For children with cerebral palsy (CP), walking on uneven surfaces (US) is a challenging task essential for their engagement in their daily lives. This study aims to compare spatiotemporal parameters of multiple domains of walking (pace, rhythm, stability, variability) in children with spastic CP between gait on an uneven surface (US) and an even surface (ES) and assess differences against their typically developing (TD) peers. A total of 34 children (17 CP/17 TD) walked at a self-selected speed on a US and an ES. Gait speed, stride length, stride time, walk ratio, cadence, double and single support time, and stride width were calculated. For each parameter, stride-to-stride variability was calculated using the coefficient of variation. A 2-way ANOVA (group, surface) was conducted on each parameter. Stride width and the variability of gait speed, cadence, and walk ratio presented a group × surface interaction (p ≤ 0.042). Post-hoc tests revealed greater stride width and greater variability of gait speed and walk ratio in the CP group compared to the TD group (p ≤ 0.005) only on a US, and on both surfaces for cadence variability (p = 0.017). Gait analysis on a US reveals gait changes in children with CP, highlighting the importance of using more ecological approaches for gait assessment.
Article
The zebrafish fin fold regeneration model is a simple system to study the process of regeneration and associated cellular mechanisms. Berberine, a plant alkaloid known to have wound-healing properties, shows potential to modulate regeneration. The present study aimed to explore the modulating influence of berberine on the signaling pathways involved in transected tail fin fold regeneration in zebrafish larvae. Tail fin fold transection was performed on 3 dpf (days post fertilization) zebrafish larvae treated with berberine (0.01%) and on untreated controls (system water, SW). The larvae were observed under a microscope at 0, 1, 2, 3, 4, and 5 hours post transection (hpt). RNA was extracted from berberine-treated and untreated (control) tail-fin-transected larvae at 4 hpt to perform RNA-seq analysis. PPI (protein-protein interaction) network, ShinyGO functional enrichment, and topology analysis of DEGs (differentially expressed genes) were performed. Berberine-treated larvae showed accelerated regenerative growth in their transected tail fin by 4 hpt. Berberine-induced accelerated regeneration is associated with the involvement of insulin, IGF, stress response, JAK-STAT, cytokine, and cellular reprogramming signaling pathways, as per the RNA-seq analysis, STRING PPI network, and ShinyGO functional enrichment analysis of DEGs. Topological analysis using CytoHubba revealed tnfa, stat3, jak2b, igf1, jak1, hsp90aa1.1, stat1a, stat1b, bag3, hsp70, and fosl1a as the key hub genes in the PPI network. The present study identifies the pathways and the hub proteins involved in the berberine-induced accelerated regeneration process in zebrafish larvae.
Article
Full-text available
Powdery mildew outbreaks, caused by Podosphaera xanthii, reduce watermelon yields as the plants produce fewer and smaller fruits due to premature leaf senescence. The reduced leaf canopy can also decrease fruit quality through sun scalding. Sources of powdery mildew tolerance were previously identified by screening the USDA Citrullus germplasm collection with P. xanthii races 1W and 2W. However, not all gene loci associated with tolerance to race 2W have been identified, and markers tightly linked to such loci have not been developed. We employed a bulked segregant analysis approach using historical data from the USDA Germplasm Resources Information Network for an extreme-phenotype genome-wide association study (XP-GWAS) of tolerance to P. xanthii race 2W in Citrullus accessions (N = 1,147). XP-GWAS identifies variants that segregate between pools of individuals chosen from the extremes of a phenotypic distribution in a diversity panel. Whole-genome resequencing of 45 individuals bulked from the tolerant and susceptible extremes yielded 301,059 high-quality biallelic SNPs. Two adjacent SNPs on chromosome 7 were significantly associated with P. xanthii race 2W tolerance in the bulks, and two additional SNPs had a strong signal in the XP-GWAS analysis. Kompetitive Allele-Specific PCR (KASP) markers were designed for sixteen SNPs across the three genomic regions. The KASP markers were validated by genotyping 186 accessions from the extremes of the disease-response distribution of the Citrullus collection. Analysis of variance determined that thirteen of the markers were significantly associated, with the best marker in each region explaining 21–31% of the variation in powdery mildew tolerance.
Article
Full-text available
Enhancers serve as pivotal regulators of gene expression throughout various biological processes by interacting with transcription factors (TFs). While transcription factor binding sites (TFBSs) are widely acknowledged as key determinants of TF binding and enhancer activity, the significant role of their surrounding context sequences remains to be quantitatively characterized. Here we propose the concept of transcription factor binding unit (TFBU) to modularly model enhancers by quantifying the impact of context sequences surrounding TFBSs using deep learning models. Based on this concept, we develop DeepTFBU, a comprehensive toolkit for enhancer design. We demonstrate that designing TFBS context sequences can significantly modulate enhancer activities and produce cell type-specific responses. DeepTFBU is also highly efficient in the de novo design of enhancers containing multiple TFBSs. Furthermore, DeepTFBU enables flexible decoupling and optimization of generalized enhancers. We prove that TFBU is a crucial concept, and DeepTFBU is highly effective for rational enhancer design.
Article
Full-text available
A modification of the Bonferroni procedure for testing multiple hypotheses is presented. The method, based on the ordered p-values of the individual tests, is less conservative than the classical Bonferroni procedure but is still simple to apply. A simulation study shows that the probability of a type I error of the procedure does not exceed the nominal significance level, α, for a variety of multivariate normal and multivariate gamma test statistics. For independent tests the procedure has type I error probability equal to α. The method appears particularly advantageous over the classical Bonferroni procedure when several highly correlated test statistics are involved.
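As a sketch, the ordered-p-value rule this abstract describes (the Simes test of the overall null hypothesis) can be written in a few lines; simes_test is an illustrative name, not from the paper:

```python
import numpy as np

def simes_test(pvalues, alpha=0.05):
    """Simes test of the overall null: reject if p_(i) <= i*alpha/n for any i."""
    p = np.sort(np.asarray(pvalues, dtype=float))
    n = len(p)
    crit = np.arange(1, n + 1) * alpha / n   # i*alpha/n for the ordered p-values
    return bool(np.any(p <= crit))

# Example: p_(1) = 0.01 <= 1 * 0.05 / 4 = 0.0125, so the overall null is rejected.
print(simes_test([0.01, 0.04, 0.30, 0.50]))   # True
```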
Article
Current methods of statistical inference are correct but incomplete. Small probability (α) of wrong null-hypothesis rejections can be misunderstood. Definitive rejections of null hypotheses, as well as interval assessments of effect sizes, are impossible in single cases with significance probabilities like .05. Sufficiently large sets of independent experiments and attained significance levels (p-values) should be registered. From such data it is possible to calculate least upper bounds for proportions of fallacies in sets of null-hypothesis rejections or effect-size assessments. A provisional rejection of a null hypothesis in a one-tailed test, or a one-sided confidence interval showing a nonzero effect, is here called a discovery. Consider a large number n of independent experiments with r discoveries. For r rejections of null hypotheses, the proportion of fallacies Q has least upper bound Qmax = (n/r − 1)α/(1 − α) < 1. For r confidence intervals, the proportion of fallacies is E = αn/r (assuming that no alternative is in the "wrong" direction).
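A worked instance of the two bounds quoted above, under the abstract's setting of n independent experiments yielding r discoveries; the numbers are illustrative:

```python
alpha = 0.05
n, r = 1000, 100   # n independent experiments, r discoveries

# Least upper bound on the proportion of false null-hypothesis rejections
q_max = (n / r - 1) * alpha / (1 - alpha)   # (10 - 1) * 0.05 / 0.95 ~= 0.474

# Bound on the proportion of fallacious one-sided confidence intervals
e_bound = alpha * n / r                     # 0.5

print(round(q_max, 3), e_bound)
```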
Article
A sharper Bonferroni procedure for multiple tests of significance is derived. This procedure is an improvement of Hochberg's (1988) procedure, which contrasts the individual p-values with corresponding critical points. It is shown that Hochberg's original procedure is conservative and can be made more powerful by enlarging the rejection region so that the type I error is exactly at the nominal level. It is also shown that the modified procedure retains all the desired properties of the original procedure.
Article
Simes (1986) has proposed a modified Bonferroni procedure for the test of an overall hypothesis which is the combination of n individual hypotheses. In contrast to the classical Bonferroni procedure, it is not obvious how statements about individual hypotheses are to be made for this procedure. In the present paper a multiple test procedure allowing statements on individual hypotheses is proposed. It is based on the principle of closed test procedures (Marcus, Peritz & Gabriel, 1976) and controls the multiple level α.
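The closed-testing construction sketched in this abstract is usually implemented through a shortcut (Hommel's procedure); the following is a sketch under that reading, with hommel_reject as an illustrative name:

```python
import numpy as np

def hommel_reject(pvalues, alpha=0.05):
    """Shortcut for the closed Simes-based procedure (Hommel).

    Find j = max{ i : p_(n-i+k) > k*alpha/i for all k = 1..i }.
    If no such i exists, reject everything; otherwise reject all
    hypotheses with p <= alpha/j.
    """
    p = np.asarray(pvalues, dtype=float)
    n = len(p)
    p_sorted = np.sort(p)
    j = None
    for i in range(1, n + 1):
        ks = np.arange(1, i + 1)
        if np.all(p_sorted[n - i + ks - 1] > ks * alpha / i):
            j = i   # condition holds; keep the largest such i
    if j is None:
        return np.ones(n, dtype=bool)
    return p <= alpha / j

# Here j = 3, so the rejection threshold is 0.05/3 ~= 0.0167.
print(hommel_reject([0.008, 0.02, 0.30, 0.90]))   # [ True False False False]
```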
Article
A practicing statistician looks at the multiple comparison controversy and related issues through the eyes of the users. The concept of consistency is introduced and discussed in relation to five of the more common multiple comparison procedures. All of the procedures are found to be inconsistent except the simplest procedure, the unrestricted least significant difference (LSD) procedure (or multiple t test). For this and other reasons the unrestricted LSD procedure is recommended for general use, with the proviso that it should be viewed as a hypothesis generator rather than as a method for simultaneous hypothesis generation and testing. The implications for Scheffé's test for general contrasts are also discussed, and a new recommendation is made.
Article
A simple procedure for multiple tests of significance based on individual p-values is derived. This simple procedure is sharper than Holm's (1979) sequentially rejective procedure. Both procedures contrast the ordered p-values with the same set of critical values. Holm's procedure rejects a hypothesis only if its p-value and each of the smaller p-values are less than their corresponding critical values. The new procedure rejects all hypotheses with p-values smaller than or equal to that of any hypothesis found to be less than its critical value.
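A sketch of the step-up rule this abstract describes (Hochberg's procedure), using the same critical values α/(n − i + 1) as Holm's procedure; hochberg_reject is an illustrative name:

```python
import numpy as np

def hochberg_reject(pvalues, alpha=0.05):
    """Hochberg step-up: find the largest i with p_(i) <= alpha/(n - i + 1),
    then reject every hypothesis whose p-value is at most p_(i)."""
    p = np.asarray(pvalues, dtype=float)
    n = len(p)
    p_sorted = np.sort(p)
    crit = alpha / (n - np.arange(1, n + 1) + 1)   # alpha/n, ..., alpha/1
    below = np.nonzero(p_sorted <= crit)[0]
    if below.size == 0:
        return np.zeros(n, dtype=bool)             # nothing rejected
    return p <= p_sorted[below[-1]]

# Example: only p_(1) = 0.01 falls below its critical value 0.05/4 = 0.0125.
print(hochberg_reject([0.01, 0.04, 0.03, 0.50]))   # [ True False False False]
```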
Article
This paper presents a simple and widely applicable multiple test procedure of the sequentially rejective type, i.e., hypotheses are rejected one at a time until no further rejections can be made. It is shown that the test has a prescribed level of significance protection against errors of the first kind for any combination of true hypotheses. The power properties of the test and a number of possible applications are also discussed.
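A sketch of the sequentially rejective (step-down) rule described here, commonly known as Holm's procedure; holm_reject is an illustrative name:

```python
import numpy as np

def holm_reject(pvalues, alpha=0.05):
    """Holm step-down: walk through the ordered p-values, rejecting while
    p_(i) <= alpha/(n - i + 1); stop at the first one that exceeds it."""
    p = np.asarray(pvalues, dtype=float)
    n = len(p)
    p_sorted = np.sort(p)
    crit = alpha / (n - np.arange(1, n + 1) + 1)   # alpha/n, ..., alpha/1
    exceed = np.nonzero(p_sorted > crit)[0]
    k = exceed[0] if exceed.size > 0 else n        # number of rejections
    if k == 0:
        return np.zeros(n, dtype=bool)
    return p <= p_sorted[k - 1]

# Example: 0.001 and 0.01 pass their critical values; 0.04 > 0.05/2 stops the walk.
print(holm_reject([0.001, 0.01, 0.04, 0.60]))   # [ True  True False False]
```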
Article
Optimality criteria formulated in terms of the power functions of the individual tests are given for problems where several hypotheses are tested simultaneously. Subject to the constraint that the expected number of false rejections is less than a given constant γ when all null hypotheses are true, tests are found which maximize the minimum average power and the minimum power of the individual tests over certain alternatives. In the common situations in the analysis of variance this leads to application of multiple t-tests. In that case the resulting procedure is to use Fisher's "least significant difference," but without a preliminary F-test and with a smaller level of significance. Recommendations for choosing the value of γ are given by relating γ to the probability of no false rejections if all hypotheses are true. Based upon the optimality of the tests, a similar optimality property of joint confidence sets is also derived.
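A small numerical illustration of the constraint described above: if all m null hypotheses are true and each test runs at level α, the expected number of false rejections is mα, so capping it at γ amounts to testing each hypothesis at level γ/m (the numbers are illustrative):

```python
m = 20        # number of hypotheses tested simultaneously
gamma = 0.5   # tolerated expected number of false rejections

# Under all-true nulls, E[false rejections] = m * alpha, so require:
alpha_per_test = gamma / m
print(alpha_per_test)   # 0.025: each t-test is run at the 2.5% level
```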
Article
Thrombolysis with recombinant tissue-type plasminogen activator (rt-PA) and anisoylated plasminogen streptokinase activator (APSAC) in myocardial infarction has been proved to reduce mortality. A new front-loaded infusion regimen of 100 mg of rt-PA with an initial bolus dose of 15 mg followed by an infusion of 50 mg over 30 min and 35 mg over 60 min has been reported to yield higher patency rates than those achieved with standard regimens of thrombolytic treatment. The effects of this front-loaded administration of rt-PA versus those obtained with APSAC on early patency and reocclusion of infarct-related coronary arteries were investigated in a randomized multicenter trial in 421 patients with acute myocardial infarction. Coronary angiography 90 min after the start of treatment revealed a patent infarct-related artery (Thrombolysis in Myocardial Infarction [TIMI] grade 2 or 3) in 84.4% of 199 patients given rt-PA versus 70.3% of 202 patients given APSAC (p = 0.0007). Early reocclusion within 24 to 48 h was documented in 10.3% of 174 patients given rt-PA versus 2.5% of 163 patients given APSAC. Late reocclusion within 21 days was observed in 2.6% of 152 patients given rt-PA versus 6.3% of 159 patients given APSAC. There were 5 in-hospital deaths (2.4%) in the rt-PA group and 17 deaths (8.1%) in the APSAC group (p = 0.0095). The reinfarction rate was 3.8% and 4.8%, respectively. Peak serum creatine kinase and left ventricular ejection fraction at follow-up angiography were essentially identical in both treatment groups. There were more bleeding complications after APSAC (45% vs. 31%, p = 0.0019).(ABSTRACT TRUNCATED AT 250 WORDS)
Article
The problem of multiple comparisons is discussed in the context of medical research. The need for more powerful procedures than classical multiple comparison procedures is indicated. To this end, some new, general, and simple procedures are discussed and demonstrated with two examples from the medical literature: the neuropsychologic effects of unidentified childhood exposure to lead, and the sleep patterns of sober chronic alcoholics.
Article
The randomized clinical trial is the preferred research design for evaluating competing diagnostic and therapeutic alternatives, but confidence in the conclusions from a randomized clinical trial depends on the authors' attention to acknowledged methodologic and statistical standards. This survey assessed the level of attention to the problem of multiple comparisons in the analyses of contemporary randomized clinical trials. Of the 67 trials surveyed, 66 (99 percent) performed multiple comparisons with a mean of 30 therapeutic comparisons per trial. When criteria for statistical impairment were applied, 50 trials (75 percent) had the statistical significance of at least one comparison impaired by the problem of multiple comparisons, and 15 (22 percent) had the statistical significance of all comparisons impaired by the problem of multiple comparisons. Although some statistical techniques are available, there still exists a great need for future work to clarify further the problem of multiple comparisons and determine how the impact of this problem can best be minimized in subsequent research.
Article
This article discusses statistical methods for comparing the means of several groups and focuses on examples from 50 Original Articles published in the Journal in 1978 and 1979. Although medical authors often present comparisons of the means of several groups, the most common method of analysis, multiple t-tests, is usually a poor choice. Which method of analysis is appropriate depends on what questions the investigators wish to ask. If the investigators want to identify which of the groups under study are different from the rest, they will need a different method from the one required if they wish simply to decide whether or not the groups share a common mean. More complicated questions about the group means call for more sophisticated techniques. Of the 50 Journal articles examined, 27 (54 per cent) used inappropriate statistical methods to analyze the differences between group means. Investigators need to become better acquainted with statistical techniques for making multiple comparisons between group means.
  • Hochberg
Statistical problems in the reporting of clinical trials
  • S. J. Pocock
  • M. D. Hughes
  • R. J. Lee