About
20 Publications
1,702 Reads
177 Citations
Publications (20)
In big data analysis for detecting rare and weak signals among $n$ features, some grouping-test methods such as the Higher Criticism test (HC), the Berk-Jones test (B-J), and the $\phi$-divergence test share similar asymptotic optimality as $n \rightarrow \infty$. However, in practical data analysis $n$ is frequently small, or moderately large at most....
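As an illustration of the kind of statistic discussed in this abstract, the following is a minimal sketch of one common textbook form of the Higher Criticism statistic (the Donoho-Jin form). The function name `higher_criticism` and the truncation parameter `alpha0` are assumptions made for this example; the abstract does not specify which variant the authors study, and this sketch says nothing about their finite-$n$ results.

```python
import math

def higher_criticism(pvalues, alpha0=0.5):
    """One common form of the HC statistic (hypothetical helper, not the paper's code).

    HC = max over i <= alpha0 * n of
         sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))),
    where p_(1) <= ... <= p_(n) are the ordered input p-values.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one p-value")
    ordered = sorted(pvalues)
    k = max(1, int(alpha0 * n))  # only the smallest alpha0 fraction is searched
    best = -math.inf
    for i in range(1, k + 1):
        p = ordered[i - 1]
        if not 0.0 < p < 1.0:
            continue  # skip degenerate p-values to avoid division by zero
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best  # stays -inf if every searched p-value was degenerate
```

Large values of the statistic indicate that the smallest $p$-values are smaller than expected under the global null; a $p$-value for the statistic itself would still need a null distribution, which this sketch does not provide.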
For testing a group of hypotheses, a tremendous number of $p$-value combination methods have been developed and widely applied since the 1930s. Some methods (e.g., the minimal $p$-value) are optimal for sparse signals, and some others (e.g., Fisher's combination) are optimal for dense signals. To address a wide spectrum of signal patterns, this paper proposes a u...
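The two classical methods named in this abstract can be sketched as follows, assuming the input $p$-values are independent and uniform under the null. The function names are hypothetical, and this is not the method the paper proposes (that description is truncated in this listing).

```python
import math

def fisher_combination(pvalues):
    """Fisher's combination p-value for independent p-values (illustrative sketch).

    The statistic -2 * sum(log p_i) is chi-square with 2n degrees of freedom
    under the null. For even degrees of freedom the survival function is
    exp(-x/2) * sum_{k=0}^{n-1} (x/2)^k / k!, so no external library is needed.
    """
    n = len(pvalues)
    half = -sum(math.log(p) for p in pvalues)  # half of the Fisher statistic
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return min(1.0, math.exp(-half) * total)

def min_p_combination(pvalues):
    """Minimal p-value test for independent p-values: 1 - (1 - min p)^n."""
    n = len(pvalues)
    return 1.0 - (1.0 - min(pvalues)) ** n
```

With one very small $p$-value among many large ones, `min_p_combination` tends to give the smaller combined $p$-value; with many moderately small $p$-values, `fisher_combination` tends to, which matches the sparse-versus-dense contrast in the abstract.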
Non-proportional hazards data are routinely encountered in randomized clinical trials. In such cases, the classic Cox proportional hazards model can suffer from severe power loss, and the estimated hazard ratio is difficult to interpret because the treatment effect varies over time. We propose CauchyCP, an omnibus test of change-point Cox regressi...
Polygenic risk scores (PRS) have been successfully developed for the prediction of human diseases and complex traits in recent years. For drug response prediction in randomized clinical trials, a common practice is to apply PRS built from a disease genome-wide association study (GWAS) directly to a corresponding pharmacogenomics (PGx) setting. He...
We evaluated the impact of class I and class II human leukocyte antigen (HLA) genotypes, heterozygosity, and diversity on the efficacy of pembrolizumab. Seventeen pembrolizumab clinical trials across eight tumor types and one basket trial in patients with advanced solid tumors were included (n > 3,500 analyzed). Germline DNA was genotyped using a c...
Motivation:
Pharmacogenomics (PGx) research holds promise for detecting associations between genetic variants and drug responses in randomized clinical trials, but it is limited by small populations and thus has low power to detect signals. It is critical to increase the power of PGx genome-wide association studies (GWAS) with small sample sizes...
Combining SNP p-values from GWAS summary data is a promising strategy for detecting novel genetic factors. Existing statistical methods for p-value-based SNP-set testing confront two challenges. First, the statistical power of different methods depends on unknown patterns of genetic effects that can vary drastically across different SNP sets. S...
In pharmacogenetic (PGx) studies, drug response phenotypes are often measured in the form of change in a quantitative trait before and after treatment. There is some debate in recent literature regarding baseline adjustment, or inclusion of pre-treatment or baseline value as a covariate, in PGx genome-wide association studies (GWAS) analysis. Here,...
Analyzing correlated data by goodness-of-fit type tests is a critical statistical problem in many applications. A unified framework is provided through a general family of goodness-of-fit tests (GGOF) to address this problem. The GGOF family covers many classic and newly developed tests, such as the minimal p-value test, Simes test, the GATES, one-...
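One of the classic tests this abstract lists, the Simes test, can be sketched in a few lines. The function name `simes_pvalue` is an assumption for this example, and the formula shown is the standard one, which is exact for independent $p$-values; it does not include the correlation handling that the GGOF framework is described as providing.

```python
def simes_pvalue(pvalues):
    """Simes combined p-value: min over i of n * p_(i) / i (illustrative sketch).

    p_(1) <= ... <= p_(n) are the ordered input p-values. The result is a valid
    p-value under independence; correlated inputs need additional care.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one p-value")
    ordered = sorted(pvalues)
    return min(n * p / i for i, p in enumerate(ordered, start=1))
```

The term for $i = 1$ equals the Bonferroni-adjusted minimal $p$-value, so the Simes result is never larger than that adjustment.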
Combining dependent tests of significance has broad applications but the related p‐value calculation is challenging. For Fisher's combination test, current p‐value calculation methods (e.g., Brown's approximation) tend to inflate the type I error rate when the desired significance level is substantially less than 0.05. The problem could lead to sig...
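For context, the moment-matching idea behind Brown's approximation can be sketched as below. This illustrates the existing method the abstract criticizes, not the authors' proposed calculation. The covariance polynomials are the coefficients commonly quoted from Brown (1975) for one-sided $p$-values from correlated normal statistics and should be verified before any real use; the function name `brown_scaled_chisq` is hypothetical.

```python
def brown_scaled_chisq(corr):
    """Moment-matching constants for Brown's approximation (illustrative sketch).

    Fisher's statistic T = -2 * sum(log p_i) is approximated by c * chi2_f,
    with c and f chosen so that the mean and variance match:
        E[T]   = 2n
        Var[T] = 4n + 2 * sum_{i<j} cov(-2 log p_i, -2 log p_j)
    `corr` is the n x n correlation matrix of the underlying test statistics.
    """
    n = len(corr)
    mean = 2.0 * n
    var = 4.0 * n
    for i in range(n):
        for j in range(i + 1, n):
            rho = corr[i][j]
            # Polynomial covariance approximations (assumed coefficients).
            if rho >= 0.0:
                cov = rho * (3.25 + 0.75 * rho)
            else:
                cov = rho * (3.27 + 0.71 * rho)
            var += 2.0 * cov
    scale = var / (2.0 * mean)      # c
    df = 2.0 * mean * mean / var    # f
    return scale, df
```

With an identity correlation matrix the constants reduce to `scale = 1` and `df = 2n`, which recovers the usual Fisher test for independent inputs.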
In computational and applied statistics, it is of great interest to calculate the distributions of quadratic forms of Gaussian random variables quickly and accurately. This paper presents a novel approximation strategy that contains two developments. First, we propose a fast numerical procedure for computing the moments of the quadratic for...
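To make the notion of "moments of the quadratic form" concrete, here is a minimal sketch using the standard trace formula for the cumulants of $Q = x^\top A x$ with $x \sim N(0, \Sigma)$, namely $\kappa_r = 2^{r-1}(r-1)!\,\mathrm{tr}\big((A\Sigma)^r\big)$. The zero-mean assumption and the helper names are choices made for this example; this naive repeated matrix multiplication is not the fast procedure the paper proposes.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product for small dense matrices (lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def quadratic_form_cumulants(a, sigma, order=4):
    """First `order` cumulants of Q = x' A x for x ~ N(0, sigma).

    Uses kappa_r = 2^(r-1) * (r-1)! * trace((A sigma)^r); a textbook formula,
    shown here only as an illustration.
    """
    m = matmul(a, sigma)
    power = m  # holds (A sigma)^r at the start of iteration r
    cumulants = []
    for r in range(1, order + 1):
        trace = sum(power[i][i] for i in range(len(power)))
        cumulants.append(2 ** (r - 1) * math.factorial(r - 1) * trace)
        power = matmul(power, m)
    return cumulants
```

When both matrices are the 3 x 3 identity, $Q$ is chi-square with 3 degrees of freedom and the function returns its cumulants 3, 6, 24, and 144, which is a convenient sanity check.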
Integrating association evidence across multiple traits can improve the power of gene discovery and reveal pleiotropy. Most multi-trait analysis methods focus on individual common variants in genome-wide association studies. Here, we introduce multi-trait analysis of rare-variant associations (MTAR), a framework for joint analysis of association su...
Fast and accurate calculation of the distributions of quadratic forms of centered Gaussian variables is of interest in computational statistics. This paper presents a novel numerical procedure to efficiently compute the moments of a given quadratic form. Based on that, a gamma distribution with matched skewness-kurtosis ratio is proposed to approx...
Combining dependent tests of significance has broad applications but the $p$-value calculation is challenging. Current moment-matching methods (e.g., Brown's approximation) for Fisher's combination test tend to significantly inflate the type I error rate at significance levels below 0.05. This could lead to significant false discoveries in big data analyse...
It is of substantial interest to discover novel genetic markers that influence drug response in order to develop personalized treatment strategies that maximize therapeutic efficacy and safety. To help enable such discoveries, we focus on testing the association between the cumulative effect of multiple single nucleotide polymorphisms (SNPs) in a p...
Wolbachia is a bacterium that is present in 60% of insects but it is not generally found in Aedes aegypti, the primary vector responsible for the transmission of dengue virus, Zika virus, and other human diseases caused by RNA viruses. Wolbachia has been shown to stop the growth of a variety of RNA viruses in Drosophila and in mosquitoes. Wolbachia...
This paper concerns the problem of applying generalized goodness-of-fit (gGOF) type tests to the analysis of correlated data. The gGOF family broadly covers the maximum-based testing procedures based on ordered input $p$-values, such as the false discovery rate procedure, the Kolmogorov-Smirnov type statistics, the $\phi$-divergence family, etc. Data anal...