Lijian Yang

Tsinghua University | TH · Department of Industrial Engineering Center for Statistical Science

Ph.D. (1995), UNC-Chapel Hill; B.S. (1987), Peking University
functional moving average (FMA) and functional panel data, e.g., EEG; functional data with dependent error, e.g., ERP

About

127
Publications
19,424
Reads
2,567
Citations
Citations since 2017
35 Research Items
1203 Citations
[Chart: citations by year, 2017-2023; axis range 0-250]
Introduction
Data: time series, functional, high dimensional, sample survey
Theory: simultaneous confidence region and oracle efficiency
Applications: econometrics, genetics, agronomy, food science, brain science
Honors: ASA Fellow, IMS Fellow, ISI Elected Member, IETI Distinguished Fellow, Tjalling C. Koopmans Econometric Theory Prize
Ph.D. graduates: Michigan State University 7, Soochow University 3, Tsinghua University 3; 3 associate and 3 full professors in the US, 3 associate professors and 3 lecturers in China
Additional affiliations
January 2011 - April 2016
Soochow University (PRC)
Position
  • Director
Description
  • My research group works on the inference of functional data, time series data, spatial data, experimental design, empirical Bayes, high frequency financial data, etc.
January 2011 - June 2016
Soochow University (PRC)
Position
  • Professor
Description
  • Undergraduate course: Mathematical Statistics (major). Graduate courses: Nonparametric Inference I, II, Time Series Analysis, Nonlinear Time Series, Functional Data Analysis. Advising: 11 MS students, 3 PhD students.
August 1997 - January 2014
Michigan State University
Position
  • Professor (Full)
Description
  • My teaching and research at MSU have been rewarding. I have advised 8 doctoral students; 7 became tenure-track assistant professors upon graduation, of whom 3 have been promoted to tenured full professors and 3 to tenured associate professors.
Education
August 1990 - December 1995
August 1990 - August 1993
September 1983 - June 1987
Peking University
Field of study
  • Mathematics

Publications

Publications (127)
Article
Full-text available
We propose a kernel estimator for the distribution function of unobserved errors in autoregressive time series, based on residuals computed by estimating the autoregressive coefficients with the Yule-Walker method. Under mild assumptions, we establish oracle efficiency of the proposed estimator, that is, it is asymptotically as efficient as the kerne...
Article
Functional data analysis (FDA) has become an important area of statistics research in the recent decade, yet a smooth simultaneous confidence corridor (SCC) does not exist in the literature for the mean function of sparse functional data. SCC is a powerful tool for making statistical inference on an entire unknown function, nonetheless classic “Hun...
Article
Most time series that are encountered in practice contain non-zero trend, yet textbook approaches to time series analysis are typically focused on zero-mean stationary auto-regressive moving average (ARMA) processes. Trend is often estimated by ad hoc methods and subtracted from time series, and the residuals are used as the true ARMA noise for data...
Article
Full-text available
Generalized additive models (GAM) are multivariate nonparametric regressions for non-Gaussian responses including binary and count data. We propose a spline-backfitted kernel (SBK) estimator for the component functions. Our results are for weakly dependent data and we prove oracle efficiency. The SBK technique is bo...
Article
Full-text available
Application of nonparametric and semiparametric regression techniques to high-dimensional time series data has been hampered due to the lack of effective tools to address the “curse of dimensionality.” Under rather weak conditions, we propose spline-backfitted kernel estimators of the component functions for the nonlinear additive time series dat...
Article
A key step to the establishment of a tiered healthcare system is equitable access to basic primary healthcare services for all. However, no quantitative research on the national status quo of primary healthcare accessibility in China exists. We filled this gap by estimating spatial accessibility to primary healthcare centers (PHCs) and mapping its...
Article
Full-text available
We investigate statistical inference for the mean function of stationary functional time series data with an infinite moving average structure. We propose a B-spline estimation for the temporally ordered trajectories of the functional moving average, which are used to construct a two-step estimator of the mean function. Under mild conditions, the B...
Article
Full-text available
Claims about distributions of time series are often unproven assertions instead of substantiated conclusions for lack of hypothesis testing tools. In this work, Kolmogorov–Smirnov type simultaneous confidence bands (SCBs) are constructed based on simple random samples (SRSs) drawn from realizations of time series, together with smooth SCBs using ke...
Article
Estimation and testing are studied for functional data with temporally dependent errors, an interesting example of which is the event-related potential (ERP). B-spline estimators are formulated for individual smooth trajectories and their population mean as well. The mean estimator is shown to be oracally efficient in the sense that it is as efficie...
Article
Maximum likelihood estimator (MLE) and Bayesian Information Criterion (BIC) order selection are examined for ARMA time series with slowly varying trend to validate the well-known detrending technique of moving average [Section 1.4, Brockwell, P.J., and Davis, R.A. (1991), Time Series: Theory and Methods, New York: Springer-Verlag]. In step one, a m...
Article
Full-text available
Statistical inference for functional time series is investigated by extending the classic concept of autocovariance function (ACF) to functional ACF (FACF). It is established that for functional moving average (FMA) data, the FMA order can be determined as the highest nonvanishing order of FACF, just as in classic time series analysis. A two-step e...
Article
A Kolmogorov-Smirnov (K-S) simultaneous confidence band (SCB) is constructed for the error distribution of dense functional data based on kernel distribution estimator (KDE). The KDE is computed from residuals of B-spline trajectories over a smaller number of measurements, whereas the B-spline trajectories are computed from the remaining larger set o...
Article
Full-text available
This study aims to explore the possibility of predicting the dispositional level of dialectical thinking using resting-state electroencephalography signals. Thirty-four participants completed a self-reported measure of dialectical thinking, and their resting-state electroencephalography was recorded. After wave filtration and eye movement removal,...
Preprint
Motivated by recent data analyses in biomedical imaging studies, we consider a class of image-on-scalar regression models for imaging responses and scalar predictors. We propose using flexible multivariate splines over triangulations to handle the irregular domain of the objects of interest on the images, as well as other characteristics of images....
Article
Full-text available
Asymptotically correct simultaneous confidence bands (SCBs) are proposed in both multiplicative and additive form to compare variance functions of two samples in the nonparametric regression model based on deterministic designs. The multiplicative SCB is based on two-step estimation of ratio of the variance functions, which is as efficient, up to o...
Article
A smooth simultaneous confidence band (SCB) is constructed for the distribution of unobserved errors in a nonparametric regression model based on a plug-in kernel distribution estimator. The normalized estimation error process is shown to converge to a Gaussian process. Simulation experiments indicate that the proposed SCB not only strikes an intel...
Preprint
This study aims to explore the possibility of predicting the dispositional level of dialectical thinking using resting-state electroencephalography signals. Thirty-four participants completed a self-reported measure of dialectical thinking, and their resting-state electroencephalography was recorded. After wave filtration and eye signal removal, ti...
Article
Full-text available
We consider the estimation of the boundary of a set when it is known to be sufficiently smooth, to satisfy certain shape constraints and to have an additive structure. Our proposed method is based on spline estimation of a conditional quantile regression and is resistant to outliers and/or extreme values in the data. This work is a desirable extens...
Article
This paper concerns the comparison of two-sample nonparametric regression. An asymptotically correct simultaneous confidence band (SCB) is proposed for the difference of two-sample nonparametric regression functions to achieve the goal of comparison. Simulation experiments provide strong evidence that corroborates the asymptotic theory. The propo...
Article
The popularity of a fashion item depends on its color, shape, texture, and price. For different items (with all attributes identical except color) of a specific product, fashion retailers need to learn consumer color preference and decide their order quantities accordingly to match their products with consumer demand. This study aims to predict con...
Article
Production frontier is an important concept in modern economics and has been widely used to measure production efficiency. Existing nonparametric frontier models often only allow one or low-dimensional input variables due to ‘curse-of-dimensionality’. In this paper we propose a flexible additive frontier model which quantifies the effects of multip...
Article
A time-varying autoregressive conditional heteroskedasticity (ARCH) model is proposed to describe the changing volatility of a financial return series over a long time horizon, along with two-step least squares and maximum likelihood estimation procedures. After preliminary estimation of the time varying trend in volatility scale, approximations t...
Article
Full-text available
Spatial lifecourse epidemiology is an interdisciplinary field that utilizes advanced spatial, location-based, and artificial intelligence technologies to investigate the long-term effects of environmental, behavioural, psychosocial, and biological factors on health-related states and events and the underlying mechanisms. With the growing number of...
Article
The inference via simultaneous confidence band is studied for stationary covariance function of dense functional data. A two-stage estimation procedure is proposed based on spline approximation, the first stage involving estimation of all the individual trajectories and the second stage involving estimation of the covariance function through smooth...
Article
Background: There is always a demand for fast and accurate algorithms for EEG signal processing. Owing to the high sample rate, EEG signals usually come with a large number of sample points, making it difficult to predict the working memory ability in cognitive research with EEG. New method: Following well-designed experiments, the functional li...
Article
Existing functional data analysis literature has mostly overlooked data with spikes in mean, such as weekly sporting goods sales by a salesperson which spikes around holidays. For such functional data, two-step estimation procedures are formulated for the population mean function and holiday effect parameters, which correspond to the population sal...
Preprint
Inference via simultaneous confidence band is studied for stationary covariance function of dense functional data. A two-stage estimation procedure is proposed based on spline approximation, the first stage involving estimation of all the individual trajectories and the second stage involving estimation of the covariance function through smoothing...
Article
Full-text available
Estimation and Inference for Generalized Geoadditive Models. In many application areas, data are collected on a count or binary response with spatial covariate information. In this paper, we introduce a new class of generalized geoadditive models (GGAMs) for spatial data distributed over complex domains. Through a link function, the proposed GGAM as...
Article
Asymptotically correct simultaneous confidence bands (SCBs) are proposed for the mean and variance functions of a nonparametric regression model based on deterministic designs. The variance estimation is as efficient, up to order n^{-1/2}, as an infeasible estimator if the mean function were known. Simulation experiments provide strong evidence that c...
Article
Stratified sampling is one of the most important survey sampling approaches and is widely used in practice. In this paper, we consider the estimation of the distribution function of a finite population in stratified sampling by the empirical distribution function (EDF) and kernel distribution estimator (KDE), respectively. Under general conditions,...
Article
A plug-in estimator is proposed for a local measure of variance explained by regression, termed correlation curve in Doksum et al. (J Am Stat Assoc 89:571–582, 1994), consisting of a two-step spline–kernel estimator of the conditional variance function and local quadratic estimator of first derivative of the mean function. The estimator is oracally...
Article
Full-text available
A kernel distribution estimator (KDE) is proposed for multi-step-ahead prediction error distribution of autoregressive time series, based on prediction residuals. Under general assumptions, the KDE is proved to be oracally efficient as the infeasible KDE and the empirical cumulative distribution function (cdf) based on unobserved prediction errors....
Article
Simultaneous confidence bands (SCBs) are proposed for the distribution function of a finite population and of the latent superpopulation via the empirical distribution function (nonsmooth) and kernel distribution estimator (smooth) based on a simple random sample (SRS), either with or without finite population correction. It is shown that both nons...
Article
In spite of widespread use of generalized additive models (GAMs) to remedy the “curse of dimensionality”, there is no well-grounded methodology developed for simultaneous inference and variable selection for GAM in existing literature. However, both are essential in enhancing the capability of statistical models. To this end, we establish simultane...
Article
The semiparametric GARCH (Generalized AutoRegressive Conditional Heteroskedasticity) model of Yang (2006, Journal of Econometrics 130, 365-384) has combined the flexibility of a nonparametric link function with the dependence on infinitely many past observations of the classic GARCH model. We propose a cubic spline procedure to estimate the unknown...
Article
We propose a data-driven method to select significant variables in additive model via spline estimation. The additive structure of the regression model is imposed to overcome the ‘curse of dimensionality’, while the spline estimators provide a good approximation to the additive components of the model. The additive components are ordered according...
Article
Full-text available
We consider nonparametric estimation of the covariance function for dense functional data using computationally efficient tensor product B-splines. We develop both local and global asymptotic distributions for the proposed estimator, and show that our estimator is as efficient as an "oracle" estimator where the true mean function is known. Simultan...
Article
Full-text available
A smooth simultaneous confidence band (SCB) is obtained for heteroscedastic variance function in nonparametric regression by applying spline regression to the conditional mean function followed by Nadaraya–Watson estimation using the squared residuals. The variance estimator is uniformly oracally efficient, that is, it is as efficient as, up to ord...
Article
We consider the problem of estimating a relationship nonparametrically using regression splines when there exist both continuous and categorical predictors. We combine the global properties of regression splines with the local properties of categorical kernel functions to handle the presence of categorical predictors rather than resorting to sample...
Article
Full-text available
Over the last twenty-five years, various √n-consistent estimators have been devised for the coefficient vector in the popular semiparametric single-index model. In this paper, we prove under general assumptions that the kernel estimator of the link function by a univariate regression on the index variable is oracally efficient, namely, the estimator...
Article
Full-text available
We present a method of using local linear smoothing to construct simultaneous confidence bands for the mean function of densely spaced functional data. Our approach works well under mild conditions. In addition, the local linear estimator and its accompanying confidence band enjoy semiparametric efficiency in the sense that they are asymptotically...
Article
Full-text available
We consider a varying coefficient regression model for sparse functional data, with time varying response variable depending linearly on some time-independent covariates with coefficients as functions of time-dependent covariates. Based on spline smoothing, we propose data-driven simultaneous confidence corridors for the coefficient functions with...
Article
Full-text available
In this paper, we consider the uniform strong consistency of the cumulative distribution function estimator in nonparametric regression. We obtain the extended Glivenko–Cantelli theorem for the residual-based empirical distribution function.
Book
In spite of the widespread use of generalized additive models (GAMs), there is no well established methodology for simultaneous inference and variable selection for the components of GAM. There is no doubt that both inference on the marginal component functions and their selection are essential in these additive statistical models. To this end, we...
Book
Full-text available
We consider a varying coefficient regression model for sparse functional data, with time varying response variable depending linearly on some time independent covariates with coefficients as functions of time dependent covariates. Based on spline smoothing, we propose data driven simultaneous confidence corridors for the coefficient functions with...
Article
Full-text available
In spite of the widespread use of generalized additive models (GAMs), there is no well established methodology for simultaneous inference and variable selection for the components of GAM. There is no doubt that both inference on the marginal component functions and their selection are essential in these additive statistical models. To this end, we...
Article
Many statistical models arising in applications contain non- and weakly-identified parameters. Due to identifiability concerns, tests concerning the parameters of interest may not be able to use conventional theories and it may not be clear how to assess statistical significance. This paper extends the literature by developing a testing procedure t...
Article
Full-text available
A plug-in kernel estimator is proposed for Hölder continuous cumulative distribution function (cdf) based on a random sample. Uniform closeness between the proposed estimator and the empirical cdf estimator is established, while the proposed estimator is smooth instead of a step function. A smooth simultaneous confidence band is constructed based o...
Article
Time series often contain unknown trend functions and unobservable error terms. As is known, Yule-Walker estimators are asymptotically efficient for autoregressive time series. The focus of this article is the Yule-Walker estimators for time series with trends. A nonparametric detrending procedure is proposed. It is concluded that the asymptotic pr...
Article
The paper considers the construction of a confidence band for the trend function of a stationary time series. An explicit formula is derived based on polynomial splines and Sunklodas (1984). The performance of the confidence band is illustrated by simulation studies. The proposed method is applied to the analysis of the annual yields of wheat in th...
Article
Full-text available
A polynomial spline estimator is proposed for the mean function of dense functional data together with a simultaneous confidence band which is asymptotically correct. In addition, the spline estimator and its accompanying confidence band enjoy oracle efficiency in the sense that they are asymptotically the same as if all random trajectories are obs...
Article
Full-text available
Functional data analysis has received considerable recent attention and a number of successful applications have been reported. In this paper, asymptotically simultaneous confidence bands are obtained for the mean function of the functional regression model, using piecewise constant spline estimation. Simulation experiments corroborate the asympto...
Article
Full-text available
We consider a class of semiparametric GARCH models with additive autoregressive components linked together by a dynamic coefficient. We propose estimators for the additive components and the dynamic coefficient based on spline smoothing. The estimation procedure involves only a small number of least squares operations, thus it is computationally ef...
Article
The article considers the Yule-Walker estimator of the autoregressive coefficient based on the observed time series that contains an unknown trend function and an autoregressive error term. The trend function is estimated by means of B-splines and then subtracted from the observations. The Yule-Walker estimator is obtained from the residual sequenc...
Article
Background and objectives: Motivated from a study on breast cancer, we consider the problem of evaluating a statistical hypothesis when some model characteristics are potentially non or weakly identifiable from observed data. Such scenarios are common in longitudinal studies for evaluating a covariate effect when dropouts may be informative. The hy...
Article
Because of the growing interest in nutraceuticals and their health benefits, it is important to develop tools for modeling degradation of nutraceuticals in low-moisture- and high-temperature-heated foods. The objective of this study was to estimate the kinetic parameters for the degradation of anthocyanins in grape pomace and to calculate the boots...
Article
Full-text available
Motivation: The genetic basis of complex traits often involves the function of multiple genetic factors, their interactions and the interaction between the genetic and environmental factors. Gene-environment (G×E) interaction is considered pivotal in determining trait variations and susceptibility of many genetic disorders such as neurodegenerativ...
Article
Full-text available
In a random-design nonparametric regression model, procedures for detecting jumps in the regression function via constant and linear spline estimation method are proposed based on the maximal differences of the spline estimators among neighbouring knots, the limiting distributions of which are obtained when the regression function is smooth. Simula...
Book
Generalized additive models (GAM) are multivariate nonparametric regressions for non-Gaussian responses including binary and count data. We propose a spline-backfitted kernel (SBK) estimator for the component functions. Our results are for weakly dependent data and we prove oracle efficiency. The SBK technique is both computationally expedient and t...
Article
A spline-backfitted kernel smoothing method is proposed for partially linear additive model. Under assumptions of stationarity and geometric mixing, the proposed function and parameter estimators are oracally efficient and fast to compute. Such superior properties are achieved by applying to the data spline smoothing and kernel smoothing consecutiv...
Book
Longitudinal data analysis is a central piece of statistics. The data are curves and they are observed at random locations. This makes the construction of a simultaneous confidence corridor (SCC) (confidence band) for the mean function a challenging task on both the theoretical and the practical side. Here we propose a method based on local linear...
Article
Full-text available
Longitudinal data analysis is a central piece of statistics. The data are curves and they are observed at random locations. This makes the construction of a simultaneous confidence corridor (SCC) (confidence band) for the mean function a challenging task on both the theoretical and...
Article
Full-text available
Although many types of confidence bands exist for nonparametric regression with i.i.d. data, theoretical properties of such bands have never been established under dependence. We propose simultaneous confidence bands for nonparametric prediction function of time-series data using spline estimation. Asymptotic properties are established under the a...
Article
Under weak conditions of smoothness and mixing, we propose spline-backfitted spline (SBS) estimators of the component functions for a nonlinear additive autoregression model that is both computationally expedient for analyzing high dimensional large time series data, and theoretically reliable as the estimator is oracally efficient and comes with a...
Article
Full-text available
Additive coefficient model (Xue and Yang, 2006a, 2006b) is a flexible regression and autoregression tool that circumvents the We propose spline-backfitted kernel (SBK) and spline-backfitted local linear (SBLL) estimators for the component functions in the additive coefficient model that are both (i) computationally expedient so they are usable for...
Article
Full-text available
A great deal of effort has been devoted to the inference of additive model in the last decade. Among existing procedures, the kernel type are too costly to implement for high dimensions or large sample sizes, while the spline type provide no asymptotic distribution or uniform convergence. We propose a one step backfitting estimator of the component...
Article
Full-text available
Asymptotically exact and conservative confidence bands are obtained for possibly heteroscedastic variance functions, using piecewise constant and piecewise linear spline estimation, respectively. The variance estimation is as efficient as an infeasible estimator when the conditional mean function is known, and the widths of the confidence bands are...
Article
Full-text available
For the past two decades, the single-index model, a special case of projection pursuit regression, has proven to be an efficient way of coping with the high-dimensional problem in nonparametric regression. In this paper, based on a weakly dependent sample, we investigate a robust single-index model, where the single-index is identified by the best...
Article
Full-text available
Asymptotically exact and conservative confidence bands are obtained for a nonparametric regression function, using piecewise constant and piecewise linear spline estimation, respectively. Compared to the pointwise confidence interval of Huang (2003), the confidence bands are inflated by a factor proportional to {log(n)}^{1/2}, with the same width...
Article
Full-text available
A smooth kernel estimator is proposed for multivariate cumulative distribution functions (cdf), extending the work of Yamato [H. Yamato, Uniform convergence of an estimator of a distribution function, Bull. Math. Statist. 15 (1973), pp. 69–78.] on univariate distribution function estimation. Under assumptions of strict stationarity and geometricall...
Article
Full-text available
Crop yields are highly variable spatially and temporally as a result of complex interactions among topography, weather conditions, and management practices. The objective of this study was to analyze the effects of management practices on the relationship between crop yields and precipitation and crop yields and topography using 10 yr of yield data...
Article
Degradation of nutraceuticals in low- and intermediate-moisture foods heated at high temperature (>100 degrees C) is difficult to model because of the nonisothermal condition. Isothermal experiments above 100 degrees C are difficult to design because they require high pressure and small sample size in sealed containers. Therefore, a nonisothermal m...
Article
Full-text available
Additive model has been widely recognized as an effective tool for dimension reduction. Existing methods for estimation of additive regression function, including backfitting, marginal integration, projection and spline methods, do not provide any level of uniform confidence. In this paper we propose a simple construction of confidence band for t...
Article
Full-text available
For the past two decades, single-index model, a special case of projection pursuit regression, has proven to be an efficient way of coping with the high dimensional problem in nonparametric regression. In this paper, based on weakly dependent sample, we investigate the single-index prediction (SIP) model which is robust against deviation from the s...
Article
Full-text available
A seasonal additive nonlinear vector autoregression (SANVAR) model is proposed for multivariate seasonal time series to explore the possible interaction among the various univariate series. Significant lagged variables are selected and additive autoregression functions estimated based on the selected variables using spline smoothing method. Conserv...
Article
To investigate the acute response of immature articular cartilage, in the distraction and consolidation phases, to 30% tibial lengthening. Sixteen immature New Zealand white rabbits underwent diaphyseal lengthening of the left tibia by callotasis at a distraction rate of 0.4mm twice daily. A sham control group of 12 rabbits underwent fixation and o...
Article
Due to difficulty in computing, confidence intervals (CIs) for kinetic parameters and the predicted dependent variable (Y) in nonlinear models are often not reported. The purpose of this work was to present a straightforward method to calculate asymptotic CIs for kinetic parameters and the associated Y variable for nonisothermal survivor or retenti...
Article
Full-text available
A flexible nonparametric regression model is considered in which the response depends linearly on some covariates, with regression coefficients as additive functions of other covariates. Polynomial spline estimators are proposed for the unknown coefficient functions, with optimal univariate mean square convergence rate under geometric mixing condi...