Niansheng Tang
Yunnan University · Department of Statistics
PhD
128 publications · 15,165 reads · 1,828 citations
Additional affiliations
July 1997 - present
Publications (128)
We consider a novel class of semiparametric joint models for multivariate longitudinal and survival data with dependent censoring. In these models, unknown-fashion cumulative baseline hazard functions are fitted by a novel class of penalized-splines (P-splines) with linear constraints. The dependence between the failure time of interest and censori...
The non-inferiority (NI) trial is designed to show that an experimental treatment is not worse than an active reference by more than a pre-specified margin. Traditional NI trials do not include a placebo for ethical reasons; however, three-arm NI trials consisting of placebo, reference, and experimental treatment, can test the NI of experimental tr...
In missing data analysis, it is challenging to estimate the propensity score (PS) function. Traditional parametric, nonparametric or semiparametric approaches to estimate the PS function may be subject to model misspecification or lead to inefficient estimation. To address the aforementioned issues, we here assume that the PS function is unknown, a...
A Bayesian approach is proposed to estimate unknown parameters in stochastic dynamic equations (SDEs). The Fokker–Planck equation from statistical physics method is adopted to calculate the quasi-stationary probability density function. A hybrid algorithm combining the Gibbs sampler and the Metropolis–Hastings (MH) algorithm is proposed to obtain B...
This article considers the ultrahigh-dimensional prediction problem in the presence of missing responses at random. A two-step model averaging procedure is proposed to improve prediction accuracy of conditional mean of response variable. The first step is to specify several candidate models, each with low-dimensional predictors. To implement this s...
We study the coherent resonance phenomenon of stock market returns and volatility and discuss the effectiveness of financial market based on the coherent resonance theory of statistical physics. In this paper, we use the Heston model to give the series describing the return and volatility. Considering the limited data of CSI 300 index, the Bayesian...
Corporation financial stability was investigated by the method of Econophysics and Bayesian approach in this paper. A stochastic predator–prey model was built to describe the corporation financial condition. The mean limiting extinction time was proposed to measure the corporation financial stability. The model parameters were estimated by using Ba...
We investigate coherence resonance of corporate finance in a stochastic predator-prey model for creditors and producers. The stochastic predator-prey model with only considering financial risk and the Integral method of an improvement parameter estimation are proposed. Then the coefficient of variation (CV) of the interspike intervals is used to me...
Testing equivalence of incomplete paired data arises frequently in biomedical studies. Most existing work imposes the missing at random assumption, which is not realistic in practice. Two Bayesian approaches for testing the non-inferiority of incomplete paired data under a non-ignorable missing mechanism are presented. In addition, Bayesian credible i...
A new model-free screening approach called the slicing fused mean-variance filter is proposed for ultrahigh dimensional data analysis. The new method has the following merits: (i) its implementation does not require specifying a regression form of predictors and response variables; (ii) it can deal with various types of covariates and response v...
The classical assumption in generalized linear measurement error models (GLMEMs) is that measurement errors (MEs) for covariates are distributed as a fully parametric distribution such as the multivariate normal distribution. This paper uses a centered Dirichlet process mixture model to relax the fully parametric distributional assumption of MEs, a...
Many psychological concepts are unobserved and usually represented as latent factors apprehended through multiple observed indicators. When multiple-subject multivariate time series data are available, dynamic factor analysis models with random effects offer one way of modeling patterns of within- and between-person variations by combining factor a...
In this article we investigate the occurrence of stock market crash in an economy cycle. Bayesian approach, Heston model and statistical-physical method are considered. Specifically, Heston model and an effective potential are employed to address the dynamic changes of stock price. Bayesian approach has been utilized to estimate the Heston model's...
Efficient statistical inference on nonignorable missing data is a challenging problem. This paper proposes a new estimation procedure based on composite quantile regression (CQR) for linear regression models with nonignorable missing data, that is applicable even with high-dimensional covariates. A parametric model is assumed for modelling response...
Bayesian empirical likelihood (BEL) method with missing data depends heavily on the prior specification and missing data mechanism assumptions. It is well known that the resulting Bayesian estimations and tests may be sensitive to these assumptions and observations. To this end, a Bayesian local influence procedure is proposed to assess the effect...
Normality of the measurement error is widely assumed in joint models of longitudinal and survival data, but it may lead to unreasonable or even misleading results when the longitudinal data are skewed. This paper proposes a new joint model for multivariate longitudinal and multivariate survival data by incorporating...
Handling data with the missing not at random (MNAR) mechanism is still a challenging problem in statistics. In this article, we propose a nonparametric imputation method based on the propensity score in a general class of semiparametric models for nonignorable missing data. Compared with the existing imputation methods, the proposed imputation meth...
Quantile regression (QR) is a popular approach to estimate functional relations between variables for all portions of a probability distribution. Parameter estimation in QR with missing data is one of the most challenging issues in statistics. Regression quantiles can be substantially biased when observations are subject to missingness. We study s...
Growing-dimensional data with likelihood unavailable are often encountered in various fields. This paper presents a penalized exponentially tilted likelihood (PETL) for variable selection and parameter estimation for growing dimensional unconditional moment models in the presence of correlation among variables and model misspecification. Under so...
The stability of portfolio investment in stock market crashes with Markowitz portfolio is investigated by the method of theoretical and empirical simulation. From numerical simulation of the mean escape time (MET), we conclude that: (i) The increasing number (Np) of stocks in Markowitz portfolio induces a maximum in the curve of MET versus the init...
Structural equation models (SEMs) are often formulated using a prespecified parametric structural equation. In many applications, however, the formulation of the structural equation is unknown, and its misspecification may lead to unreliable statistical inference. This paper develops a general SEM in which latent variables are linearly regressed on...
This article investigates the trading time risk (TTR) of stock investment in the case of stock price drop of Dow Jones Industrial Average (^DJI) and Hushen300 data (CSI300), respectively. The escape time of stock price from the maximum to minimum in a data window length (DWL) is employed to measure the absolute TTR, the ratio of the escape time to...
We investigate the herd behavior of stock prices in a finance system with the Heston model. Based on parameter estimation of the Heston model obtained by minimizing the mean square deviation between the theoretical and empirical return distributions, we simulate mean residence time of positive return (MRTPR). Plots of MRTPR against the amplitude or...
Background: Incomplete data often arise in various clinical trials such as crossover trials, equivalence trials, and pre- and post-test comparative studies. Various methods have been developed to construct confidence intervals (CIs) of risk difference or risk ratio for incomplete paired binary data. However, little work has been done on incomplete conti...
In a linear regression model with nonignorable missing covariates, non-normal errors or outliers can lead to badly biased and misleading results with standard parameter estimation methods built on either least squares- or likelihood-based methods. A propensity score method with a robust and efficient regression procedure called composite quantile r...
A test for ordered categorical variables is of considerable importance, because they are frequently encountered in biomedical studies. This paper introduces a simple ordering test approach for the two-way contingency tables with incomplete counts by developing six test statistics, i.e., the likelihood ratio test statistic, score test statistic, glo...
Longitudinal data are frequently encountered in medical follow-up studies and economic research. Conditional mean regression and conditional quantile regression are often used to fit longitudinal data. Many methods focused on the cases where the observation times are independent of the response variables or conditionally independent of them given t...
This paper develops a Bayesian approach to obtain the joint estimates of unknown parameters, nonparametric functions and random effects in generalized partially linear mixed models (GPLMMs), and presents three case deletion influence measures to identify influential observations based on the φ-divergence, Cook's posterior mean distance and Cook's p...
Selecting a small number of relevant genes for classification has received a great deal of attention in microarray data analysis. While the development of methods for microarray data with only two classes is relevant, developing more efficient algorithms for classification with any number of classes is important. In this paper, we propose a Bayesia...
The roles of capital flow in an ensemble composed of sub-markets are investigated. A modified Heston model and recycled noises are employed to describe the dynamics of stock price and capital flow in the ensemble, respectively. The mean escape times of two sub-markets with a cubic nonlinearity are calculated by using numerical simulation. The resul...
By relaxing the linearity assumption in partial functional linear regression models, we propose a varying coefficient partially functional linear regression model (VCPFLM), which includes varying coefficient regression models and functional linear regression models as its special cases. We study the problem of functional parameter estimation in a V...
We propose five confidence intervals for sensitivity difference of two continuous-scale diagnostic tests at the fixed level of two specificities based on the generalized pivotal quantity method, the hybrid method and the Bootstrap resampling method.
Under the assumption of missing at random, eight confidence intervals (CIs) for the difference between two correlated proportions in the presence of incomplete paired binary data are constructed on the basis of the likelihood ratio statistic, the score statistic, the Wald-type statistic, the hybrid method incorporated with the Wilson score...
A two-arm non-inferiority trial without a placebo is usually adopted to demonstrate that an experimental treatment is not worse than a reference treatment by a small pre-specified non-inferiority margin due to ethical concerns. Selection of the non-inferiority margin and establishment of assay sensitivity are two major issues in the design, analysi...
Joint models for longitudinal and survival data are often used to investigate the association between longitudinal data and survival data in many studies. A common assumption for joint models is that random effects are distributed as a fully parametric distribution such as multivariate normal distribution. The fully parametric distribution assumpti...
Matched-pair design is often used in clinical trials to increase the efficiency of establishing equivalence between two treatments with binary outcomes. In this article, we consider such a design based on rate ratio in the presence of incomplete data. The rate ratio is one of the most frequently used indices in comparing efficiency of two t...
This paper develops a Bayesian local influence approach to assess the effects of minor perturbations to the prior, sampling distribution and individual observations on the statistical inference in generalized partial linear mixed models (GPLMM) with the distribution of random effects specified by a truncated and centered Dirichlet process (TCDP) pr...
Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as wel...
We develop an empirical likelihood (EL) inference on parameters in generalized estimating equations with nonignorably missing response data. We consider an exponential tilting model for the nonignorably missing mechanism, and propose modified estimating equations by imputing missing data through a kernel regression method. We establish some asympto...
This article proposes a Bayesian approach, which can simultaneously obtain the Bayesian estimates of unknown parameters and random effects, to analyze nonlinear reproductive dispersion mixed models (NRDMMs) for longitudinal data with nonignorable missing covariates and responses. The logistic regression model is employed to model the missing data m...
This paper considers several robust estimators for distribution functions and quantiles of a response variable when some responses may not be observed under the non-ignorable missing data mechanism. Based on a particular semiparametric regression model for non-ignorable missing response, we propose a nonparametric/semiparametric estimation method a...
Stratified matched-pair studies are often designed for adjusting stratification factors in modern medical research. This article investigates a homogeneity test of differences between two correlated proportions in stratified matched-pair studies. We propose three test procedures, including an asymptotic test, bootstrap test, and multiple comparis...
An empirical likelihood (EL) approach to inference on mean functionals with nonignorably missing response data is developed. The nonignorably missing mechanism is specified by an exponential tilting model. Several maximum EL estimators (MELEs) for the response mean functional are proposed under different scenarios. We systematically investigate asy...
This paper develops the empirical likelihood (EL) inference on parameters and baseline function in a semiparametric nonlinear regression model for longitudinal data in the presence of missing response variables. We propose two EL-based ratio statistics for regression coefficients by introducing the working covariance matrix and a residual-adjusted...
In stratified matched-pair studies, risk difference between two proportions is one of the most frequently used indices in comparing efficiency between two treatments or diagnostic tests. This article presents five simultaneous confidence intervals and two bootstrap simultaneous confidence intervals for risk differences in stratified matched-pair de...
In the development of nonlinear reproductive dispersion mixed models, it is commonly assumed that distribution of random effects is normal. The normality assumption is likely violated in many practical applications. In this paper, we assume that distribution of random effects is specified by a Dirichlet process prior for relaxing this limitation. A...
A generalized partial linear mixed model (GPLMM) is a natural extension of generalized linear mixed models (GLMMs) and partial linear models (PLMs). Almost all existing methods for analyzing GPLMMs are developed on the basis of the assumption that random effects are distributed as a fully parametric distribution such as normal distribution. In this...
In stratified otolaryngologic (or ophthalmologic) studies, misleading results may be obtained when the confounding effect and the correlation between responses from two ears are ignored. Score statistic and Wald-type statistic are presented to test equality in a stratified bilateral-sample design, and their corresponding sample size formulae are g...
A modified Bates and Watts geometric framework is proposed for quasi-likelihood nonlinear models in Euclidean inner product space. Based on the modified geometric framework, some asymptotic inference in terms of curvatures for...
This article develops a variety of influence measures for carrying out perturbation (or sensitivity) analysis to joint models of longitudinal and survival data (JMLS) in Bayesian analysis. A perturbation model is introduced to characterize individual and global perturbations to the three components of a Bayesian model, including the data po...
Investigating the prevalence of a disease is an important topic in medical studies. Such investigations are usually based on the classification results of a group of subjects according to whether they have the disease. To classify subjects, screening tests that are inexpensive and nonintrusive to the test subjects are frequently used to produce res...
We examine three Bayesian case influence measures including the φ-divergence, Cook's posterior mode distance and Cook's posterior mean distance for identifying a set of influential observations for a variety of statistical models with missing data including models for longitudinal data and latent variable models in the absence/presence of missing d...
This paper investigates the estimations of regression parameters and response mean in nonlinear regression models in the presence of missing response variables that are missing with missingness probabilities depending on covariates. We propose four empirical likelihood (EL)-based estimators for the regression parameters and the response mean. The r...
In this article, we develop four explicit asymptotic two-sided confidence intervals for the difference between two Poisson rates via a hybrid method. The basic idea of the proposed method is to estimate or recover the variances of the two Poisson rate estimates, which are required for constructing the confidence interval for the rate difference, fr...
Sample size determination is commonly encountered in modern medical studies for two independent binomial experiments. A new approach for calculating sample size is developed by combining Bayesian and frequentist idea when a hypothesis test between two binomial proportions is conducted. Sample size is calculated according to Bayesian posterior decis...
This paper investigates homogeneity test of rate ratios in stratified matched-pair studies on the basis of asymptotic and bootstrap-resampling methods. Based on the efficient score approach, we develop a simple and computationally tractable score test statistic. Several other homogeneity test statistics are also proposed on the basis of the weighte...
In this paper we develop a general framework of Bayesian influence analysis for assessing various perturbation schemes to the data, the prior and the sampling distribution for a class of statistical models. We introduce a perturbation model to characterize these various perturbation schemes. We develop a geometric framework, called the Bayesian per...
The aim of this paper is to develop a Bayesian local influence method (Zhu et al. 2009, submitted) for assessing minor perturbations to the prior, the sampling distribution, and individual observations in survival analysis. We introduce a perturbation model to characterize simultaneous (or individual) perturbations to the data, the prior distributi...
We explore measuring Scleroderma patient disease improvement at the paired body part level and account for their correlation with the long term goal of possibly redefining disease progression using a shorter clinical examination. We propose using a binary outcome to measure disease progression at each paired body part level, construct tests for ass...
This paper investigates the equality test of risk ratios in multiple 2x2 tables with structural zero, which is formed by a confounding factor such as age, gender, severity of disease, region or other variables of interest. Score statistic, Wald statistic and likelihood ratio statistic for testing equality of risk ratios in multiple 2x2 tables with...
Parameters in time series and other dynamic models often show complex range restrictions and their distributions may deviate substantially from multivariate normal or other standard parametric distributions. We use the truncated Dirichlet process (DP) as a non-parametric prior for such dynamic parameters in a novel nonlinear Bayesian dynamic factor...
The present paper proposes a semiparametric reproductive dispersion nonlinear model (SRDNM) which is an extension of the nonlinear reproductive dispersion models and the semiparametric regression models. Maximum penalized likelihood estimates (MPLEs) of unknown parameters and nonparametric functions in SRDNM are presented. Assessment of local influe...
Semiparametric reproductive dispersion mixed-effects model (SPRDMM) is an extension of the reproductive dispersion model and the semiparametric mixed model, and it includes many commonly encountered models as its special cases. A Bayesian procedure is developed for analyzing SPRDMMs on the basis of P-spline estimates of nonparametric components. A...
In this article, we consider confidence interval construction for proportion ratio in paired samples. Previous studies usually reported that score-based confidence intervals consistently outperformed other asymptotic confidence intervals for correlated proportion difference and ratio. However, score-based confidence intervals may not possess closed...
This article proposes a semiparametric nonlinear reproductive dispersion model (SNRDM) which is an extension of nonlinear reproductive dispersion model and semiparametric regression model. Maximum penalized likelihood estimators (MPLEs) of unknown parameters and nonparametric functions in SNRDMs are presented. Some novel diagnostic statistics such...
Diffusion tensor imaging (DTI) provides important information on the structure of white matter fiber bundles as well as detailed tissue properties along these fiber bundles in vivo. This paper presents a functional regression framework, called FRATS, for the analysis of multiple diffusion properties along fiber bundles as functions in an infinite d...
Bilateral dichotomous data are very common in modern medical comparative studies (e.g. comparison of two treatments in ophthalmologic, orthopaedic and otolaryngologic studies) in which information involving paired organs (e.g. eyes, ears and hips) is available from each subject. In this article, we study various confidence interval estimators for p...
We consider a varying-coefficients reproductive dispersion linear model (VCRDLM). By the local likelihood approach, estimates of the parameters of interest are given, and the determination of the local weight and smoothing parameter as well as some statistical inferences are investigated in VCRDLM.
This article investigates the confidence regions for semiparametric nonlinear reproductive dispersion models (SNRDMs), which are an extension of nonlinear regression models. Based on local linear estimate of nonparametric component and generalized profile likelihood estimate of parameter in SNRDMs, a modified geometric framework of Bates and Watts...
In this article, we develop a non-randomized multi-category response model for a single sensitive survey question with multiple outcomes. Unlike existing randomized response models, our proposed model does not require any randomizing device and the respondents are merely asked to answer a non-sensitive question. It thus reduces cost, ensures reprod...
A stratified matched-pair study is often designed for adjusting a confounding effect or effect of different trials/centers/groups in modern medical studies. The relative risk is one of the most frequently used indices in comparing efficiency of two treatments in clinical trials. In this paper, we propose seven confidence interval estimators for th...
K correlated 2 x 2 tables with structural zero are commonly encountered in infectious disease studies. A hypothesis test for risk difference is considered in K independent 2 x 2 tables with structural zero in this paper. Score statistic, likelihood ratio statistic and Wald-type statistic are proposed to test the hypothesis on the basis of stratifie...
Non-linear structural equation models are widely used to analyze the relationships among outcomes and latent variables in modern educational, medical, social and psychological studies. However, the existing theories and methods for analyzing non-linear structural equation models focus on the assumptions of outcomes from an exponential family, and h...
In this article, we consider approximate sample size formulas for testing difference between two proportions for bilateral studies with binary outcomes. Sample size formulas are derived to achieve a prespecified power of a statistical test at a prechosen significance level. Four statistical tests are considered. Simulation studies are conducted to...
Nonlinear structural equation models with nonignorable missing outcomes from reproductive dispersion models are proposed to identify the relationship between manifest variables and latent variables in modern educational, medical, social and psychological studies. The nonignorable missing mechanism is specified by a logistic regression model. An EM...
Semiparametric reproductive dispersion nonlinear model (SRDNM) is an extension of nonlinear reproductive dispersion models and semiparametric nonlinear regression models, and includes semiparametric nonlinear model and semiparametric generalized linear model as its special cases. Based on the local kernel estimate of nonparametric component, profil...
Sample size determination is an essential component in public health survey designs on sensitive topics (e.g. drug abuse, homosexuality, induced abortions and pre or extramarital sex). Recently, non-randomised models have been shown to be an efficient and cost effective design when comparing with randomised response models. However, sample size for...
This article considers the problem of testing the equality of two multinomial proportions from ordered categories. The proportional odds model is adopted to reflect the order among categories and expresses the difference between the two multinomial distributions via a single location-type parameter. As a result, the problem of interest is reduced t...
We consider novel methods for the computation of model selection criteria in missing-data problems based on the output of the EM algorithm. The methodology is very general and can be applied to numerous situations involving incomplete data within an EM framework, from covariates missing at random in arbitrary regression models to nonignorably missi...
A stratified study is often designed for adjusting several independent trials in modern medical research. We consider the problem of non-inferiority tests and sample size determinations for a nonzero risk difference in stratified matched-pair studies, and develop the likelihood ratio and Wald-type weighted statistics for testing a null hypothesis o...
Large, family-based imaging studies can provide a better understanding of the interactions of environmental and genetic influences on brain structure and function. The interpretation of imaging data from large family studies, however, has been hindered by the paucity of well-developed statistical tools that permit the analysis of complex imagin...
In otolaryngologic (or ophthalmologic) studies, each subject usually contributes information for each of two ears (or eyes), and the values from the two ears (or eyes) are generally highly correlated. Statistical procedures that fail to take into account the correlation between responses from two ears could lead to incorrect results. On the other h...
We develop diagnostic measures for assessing the influence of individual observations when using empirical likelihood with general estimating equations, and we use these measures to construct goodness-of-fit statistics for testing possible misspecification in the estimating equations. Our diagnostics include case-deletion measures, local influence...
Methods for the analysis of brain morphology, including voxel-based morphology and surface-based morphometries, have been used to detect associations between brain structure and covariates of interest, such as diagnosis, severity of disease, age, IQ, and genotype. The statistical analysis of morphometric measures usually involves two statistical pr...
In this article, we compare Wald-type, logarithmic transformation, and Fieller-type statistics for the classical 2-sided equivalence testing of the rate ratio under matched-pair designs with a binary end point. These statistics can be implemented through sample-based, constrained least squares estimation and constrained maximum likelihood (CML) est...
The analysis of interaction among latent variables has received much attention. This article introduces a Bayesian approach to analyze a general structural equation model that accommodates the general nonlinear terms of latent variables and covariates. This approach produces a Bayesian estimate that has the same statistical optimal properties as a...
The main purpose of this article is to investigate a nonlinear structural equation model with covariates and mixed continuous and ordered categorical outcomes, in the presence of missing observations and missing covariates that are missing with a nonignorable mechanism. The nonignorable missingness mechanism is specified by a logistic regression m...
In ophthalmologic studies, each subject usually contributes important information for each of two eyes and the values from the two eyes are generally highly correlated. Previous studies showed that test procedures for binary paired data that ignore the presence of intraclass correlation could lead to inflated significance levels. Furthermore, it is...