About the lab
Theory and Methodology for non- and semiparametric models, with applications in finance, economics, agronomy, food science, genetics, geoscience, brain science, etc.
Featured research (7)
We investigate statistical inference for the mean function of stationary functional time series data with an infinite moving average structure. We propose B-spline estimation of the temporally ordered trajectories of the functional moving average, and use the estimated trajectories to construct a two-step estimator of the mean function. Under mild conditions, the B-spline mean estimator enjoys oracle efficiency in the sense that it is asymptotically equivalent to the infeasible estimator, that is, the sample mean of all trajectories observed entirely without errors. This oracle efficiency allows us to construct a simultaneous confidence band (SCB) for the mean function, which is asymptotically correct. Simulation results strongly corroborate the asymptotic theory. Using the SCB to analyze an electroencephalogram time series reveals strong evidence of a trigonometric form of the mean function.
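The oracle-efficiency statement above can be written out as a short sketch. The notation is assumed for illustration and is not taken from the abstract: \eta_t denotes the latent trajectories, \hat\eta_t their B-spline estimates, n the number of trajectories, and [0,1] the domain. The n^{-1/2} rate is the usual meaning of "asymptotically equivalent" in this line of work and is likewise an assumption here.

```latex
\bar m(x) = \frac{1}{n}\sum_{t=1}^{n}\eta_t(x)
\quad\text{(infeasible oracle)},
\qquad
\hat m(x) = \frac{1}{n}\sum_{t=1}^{n}\hat\eta_t(x)
\quad\text{(two-step estimator)},
\qquad
\sup_{x\in[0,1]}\bigl|\hat m(x)-\bar m(x)\bigr| = o_p\bigl(n^{-1/2}\bigr).
```

Under such a bound, any band that is asymptotically correct when centered at the oracle \bar m remains asymptotically correct when centered at \hat m.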
Estimation and testing are studied for functional data with temporally dependent errors, an interesting example of which is the event-related potential (ERP). B-spline estimators are formulated for individual smooth trajectories and for their population mean. The mean estimator is shown to be oracally efficient in the sense that it is as efficient as the infeasible mean estimator that would be available if all trajectories had been fully observed without contamination by errors. The oracle efficiency entails an asymptotically correct simultaneous confidence band (SCB) for the mean function, which is useful for making inference on the global shape of the mean. Extensive simulation experiments with various time series errors and functional principal components confirm the theoretical conclusions. For a moderate-sized ERP data set, multiple comparisons are made by constructing paired SCBs among four different stimuli, over three components N450, N1, and N2 separately or simultaneously, leading to interesting findings.
The maximum likelihood estimator (MLE) and Bayesian Information Criterion (BIC) order selection are examined for ARMA time series with a slowly varying trend, to validate the well-known detrending technique of moving average [Section 1.4, Brockwell, P.J., and Davis, R.A. (1991), Time Series: Theory and Methods, New York: Springer-Verlag]. In step one, a moving average equivalent to local linear regression is fitted to the raw data with a data-driven lag number, and subtracted from the raw data to produce a sequence of residuals. In step two, the residuals are used as substitutes for the latent ARMA series in the MLE and BIC procedures. It is shown that with a second-order smooth trend and a correctly chosen lag number, the two-step MLE is oracally efficient, i.e., it is asymptotically as efficient as the would-be MLE based on the unobserved ARMA series. The two-step BIC also selects the orders consistently. Simulation experiments corroborate the theoretical findings.
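The two-step idea can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not from the abstract: the latent series is AR(1) rather than a general ARMA, the lag number `q` is fixed by hand rather than data-driven, boundary points are dropped, and conditional least squares stands in for the MLE. All function names are hypothetical.

```python
import math
import random

def moving_average_trend(y, q):
    """Symmetric moving average of width 2q+1, computed at interior points only.

    The returned list has length len(y) - 2q; entry k estimates the trend at time k + q.
    """
    n = len(y)
    window = 2 * q + 1
    running = sum(y[:window])
    trend = [running / window]
    for t in range(q + 1, n - q):
        running += y[t + q] - y[t - q - 1]
        trend.append(running / window)
    return trend

def ar1_coefficient(x):
    """Conditional least squares estimate of phi in x_t = phi * x_{t-1} + e_t
    (a stand-in for the MLE, assumed here for simplicity)."""
    mean = sum(x) / len(x)
    c = [v - mean for v in x]
    num = sum(c[t] * c[t - 1] for t in range(1, len(c)))
    den = sum(v * v for v in c[:-1])
    return num / den

random.seed(0)
n, q, phi = 2000, 50, 0.5  # illustrative values, not from the source

# Latent AR(1) series, unobserved in practice.
x = [0.0] * n
for t in range(1, n):
    x[t] = phi * x[t - 1] + random.gauss(0.0, 1.0)

# Observed data: smooth trend plus the latent series.
y = [2.0 * math.sin(2.0 * math.pi * t / n) + x[t] for t in range(n)]

# Step one: subtract the moving average trend estimate.
trend = moving_average_trend(y, q)
residuals = [y[t] - trend[t - q] for t in range(q, n - q)]

# Step two: estimate from the residuals and compare with the oracle
# estimate computed from the latent series on the same time range.
phi_two_step = ar1_coefficient(residuals)
phi_oracle = ar1_coefficient(x[q:n - q])
print(f"two-step: {phi_two_step:.3f}  oracle: {phi_oracle:.3f}")
```

With a window that is long relative to the dependence range but short relative to the trend's curvature, the two estimates should be close, which is the finite-sample counterpart of the oracle-efficiency claim.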
Statistical inference for functional time series is investigated by extending the classic concept of the autocovariance function (ACF) to the functional ACF (FACF). It is established that for functional moving average (FMA) data, the FMA order can be determined as the highest nonvanishing order of the FACF, just as in classic time series analysis. A two-step estimator is proposed for the FACF: the first step is simultaneous B-spline estimation of each time trajectory, and the second step is plug-in estimation of the FACF using the estimated trajectories in place of the latent true curves. Under simple and mild assumptions, the proposed tensor product spline FACF estimator is asymptotically equivalent to the oracle estimator with all trajectories known, leading to an asymptotically correct simultaneous confidence envelope (SCE) for the true FACF. Simulation experiments validate the asymptotic correctness of the SCE and the data-driven FMA order selection. The proposed SCEs are computed for the FACFs of an electroencephalogram (EEG) functional time series, with the interesting discovery of a finite FMA lag and Fourier-form functional principal components.
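The plug-in step and the "highest nonvanishing order" rule can be sketched on a grid. The code below is an illustration under assumptions that are not from the abstract: the FACF at lag h is taken to be the covariance surface between the curve at time t and the curve at time t + h, the simulated FMA(1) curves are used directly as stand-ins for spline-estimated trajectories, and the function names are hypothetical.

```python
import math
import random

def facf(curves, lag):
    """Plug-in lag-`lag` autocovariance surface on the grid.

    curves[t][j] is the value of trajectory t at grid point j.
    Returns an m-by-m nested list (m = number of grid points).
    """
    n, m = len(curves), len(curves[0])
    mean = [sum(c[j] for c in curves) / n for j in range(m)]
    centered = [[c[j] - mean[j] for j in range(m)] for c in curves]
    surface = [[0.0] * m for _ in range(m)]
    for t in range(n - lag):
        a, b = centered[t], centered[t + lag]
        for j in range(m):
            for k in range(m):
                surface[j][k] += a[j] * b[k]
    return [[v / n for v in row] for row in surface]

def sup_norm(surface):
    """Largest absolute value of the surface over the grid."""
    return max(abs(v) for row in surface for v in row)

random.seed(1)
n, m, theta = 2000, 11, 0.8  # illustrative values
grid = [j / (m - 1) for j in range(m)]
basis = [math.sqrt(2.0) * math.sin(2.0 * math.pi * x) for x in grid]

# FMA(1) curves: xi_t(x) = (zeta_t + theta * zeta_{t-1}) * basis(x).
zeta = [random.gauss(0.0, 1.0) for _ in range(n + 1)]
curves = [[(zeta[t + 1] + theta * zeta[t]) * b for b in basis] for t in range(n)]

sup_by_lag = {h: sup_norm(facf(curves, h)) for h in (0, 1, 2)}
for h, s in sup_by_lag.items():
    print(f"lag {h}: sup |FACF| = {s:.3f}")
```

For this FMA(1) design, the lag-0 and lag-1 surfaces are clearly nonzero while the lag-2 surface is close to zero, so the highest nonvanishing lag recovers the order 1. In the paper's setting, the decision is made with a simultaneous confidence envelope rather than by eye.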
A Kolmogorov-Smirnov (K-S) simultaneous confidence band (SCB) is constructed for the error distribution of dense functional data based on a kernel distribution estimator (KDE). The KDE is computed from residuals of B-spline trajectories over a smaller number of measurements, whereas the B-spline trajectories are computed from the remaining larger set of measurements. Under mild and simple assumptions, it is shown that the KDE is a uniformly oracle-efficient estimator of the error distribution, and the SCB has the same asymptotic properties as the classic K-S SCB based on the infeasible empirical cumulative distribution function (EDF) of unobserved errors. Simulation examples corroborate the theoretical findings. The proposed method is illustrated with an EEG (electroencephalogram) data set and a stock data set.
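A minimal sketch of a kernel distribution estimator with a K-S type band follows. It assumes the residuals are already in hand (simulated standard normal draws stand in for them), uses a Gaussian kernel and an illustrative bandwidth, and skips the sample-splitting and B-spline steps described above. The constant 1.358 is the standard asymptotic 95% Kolmogorov critical value; the function names are hypothetical.

```python
import math
import random

def normal_cdf(u):
    """Standard normal distribution function, used as the integrated kernel."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def kernel_distribution_estimator(residuals, h):
    """Return F_hat(z) = average over i of Phi((z - e_i) / h)."""
    n = len(residuals)
    def f_hat(z):
        return sum(normal_cdf((z - e) / h) for e in residuals) / n
    return f_hat

def ks_band(f_hat, n, grid, critical_value=1.358):
    """K-S type band F_hat(z) +/- critical_value / sqrt(n), clipped to [0, 1]."""
    half_width = critical_value / math.sqrt(n)
    band = []
    for z in grid:
        centre = f_hat(z)
        band.append((z, max(0.0, centre - half_width), min(1.0, centre + half_width)))
    return band

random.seed(2)
n = 1000
residuals = [random.gauss(0.0, 1.0) for _ in range(n)]  # stand-in for real residuals
h = n ** (-1.0 / 3.0)  # illustrative bandwidth, an assumption

f_hat = kernel_distribution_estimator(residuals, h)
grid = [-3.0 + 0.5 * j for j in range(13)]
band = ks_band(f_hat, n, grid)
for z, lower, upper in band:
    print(f"z = {z:+.1f}: [{lower:.3f}, {upper:.3f}]")
```

A hypothesized error distribution is retained at the 5% level when its distribution function lies inside the band at every point, which is how the band supports inference on the whole distribution rather than at a single quantile.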
- Department of Industrial Engineering Center for Statistical Science
About Lijian Yang
- data: time series, functional, high dimensional, sample survey; theory: simultaneous confidence region & oracle efficiency; applications: econometrics, genetics, agronomy, food science, brain science; honors: ASA Fellow, IMS Fellow, ISI Elected Member, IETI Distinguished Fellow, Tjalling C. Koopmans Econometric Theory Prize; Ph.D.'s: Michigan State University 7, Soochow University 3, Tsinghua University 3: 3 associate & 3 full professors in US, 3 associate professors & 3 lecturers in China