
Simultaneous confidence bands for comparing variance functions of two samples based on deterministic designs


Abstract and Figures

Asymptotically correct simultaneous confidence bands (SCBs) are proposed in both multiplicative and additive form to compare variance functions of two samples in the nonparametric regression model based on deterministic designs. The multiplicative SCB is based on two-step estimation of ratio of the variance functions, which is as efficient, up to order $n^{-1/2}$, as an infeasible estimator if the two mean functions are known a priori. The additive SCB, which is the log transform of the multiplicative SCB, is location and scale invariant in the sense that the width of SCB is free of the unknown mean and variance functions of both samples. Simulation experiments provide strong evidence that corroborates the asymptotic theory. The proposed SCBs are used to analyze several strata pressure data sets from the Bullianta Coal Mine in Erdos City, Inner Mongolia, China.
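As a rough illustration of the two-step idea described in the abstract, the following Python sketch first smooths each sample to estimate its mean function, then smooths the squared residuals to estimate its variance function, and finally forms the variance ratio and its log transform. It is a sketch under stated assumptions only: the quartic-kernel Nadaraya-Watson smoother, the bandwidths, the simulated mean and standard deviation functions, and all function names are illustrative choices, not the paper's spline and kernel construction, and no simultaneous critical value (hence no actual SCB) is computed.

```python
import numpy as np

def nw_smooth(x_obs, y_obs, x_eval, h):
    """Nadaraya-Watson smoother with a quartic kernel (illustrative choice)."""
    u = (x_eval[:, None] - x_obs[None, :]) / h
    w = np.where(np.abs(u) <= 1, (1 - u**2) ** 2, 0.0)  # kernel weights
    return (w @ y_obs) / w.sum(axis=1)

def variance_function(x, y, x_eval, h_mean, h_var):
    """Two-step estimate: smooth the mean, then smooth squared residuals."""
    m_hat = nw_smooth(x, y, x, h_mean)             # step 1: mean function
    resid_sq = (y - m_hat) ** 2                    # squared residuals
    return nw_smooth(x, resid_sq, x_eval, h_var)   # step 2: variance function

def simulate(n, mean_fn, sd_fn, rng):
    """Deterministic equally spaced design x_i = i/n with N(0,1) errors."""
    x = np.arange(1, n + 1) / n
    y = mean_fn(x) + sd_fn(x) * rng.standard_normal(n)
    return x, y

rng = np.random.default_rng(0)
# Hypothetical mean and standard deviation functions for the two samples.
x1, y1 = simulate(900, lambda x: np.sin(2 * np.pi * x), lambda x: 1.0 + 0.5 * x, rng)
x2, y2 = simulate(900, lambda x: x**2, lambda x: np.full_like(x, 1.0), rng)

grid = np.linspace(0.1, 0.9, 9)                  # interior evaluation points
s1 = variance_function(x1, y1, grid, h_mean=0.1, h_var=0.2)
s2 = variance_function(x2, y2, grid, h_mean=0.1, h_var=0.2)

ratio_hat = s1 / s2                # estimate on the multiplicative scale
log_ratio_hat = np.log(ratio_hat)  # additive scale: log s1 - log s2
print(np.round(ratio_hat, 2))
```

Taking logs turns the ratio into a difference, which is why a band for the ratio on the multiplicative scale maps to a band for the difference of log variance functions on the additive scale.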
Plots of 95% SCB (thick) for $\sigma_{1}^{2}(x)/\sigma_{2}^{2}(x)$ (solid) and the estimator $\hat{\sigma}_{1}^{2}(x)/\hat{\sigma}_{2}^{2}(x)$ (dashed) in Case 1, with (a) $n_{1}=n_{2}=300$, $\varepsilon \sim N(0,1)$; (b) $n_{1}=n_{2}=300$, $\varepsilon \sim U(-\sqrt{3},\sqrt{3})$; (c) $n_{1}=n_{2}=600$, $\varepsilon \sim N(0,1)$; (d) $n_{1}=n_{2}=600$, $\varepsilon \sim U(-\sqrt{3},\sqrt{3})$; (e) $n_{1}=n_{2}=900$, $\varepsilon \sim N(0,1)$; (f) $n_{1}=n_{2}=900$, $\varepsilon \sim U(-\sqrt{3},\sqrt{3})$
Computational Statistics (2021) 36:1197–1218
https://doi.org/10.1007/s00180-020-01043-6
ORIGINAL PAPER
Simultaneous confidence bands for comparing variance
functions of two samples based on deterministic designs
Chen Zhong¹ · Lijian Yang¹
Received: 8 August 2020 / Accepted: 22 October 2020 / Published online: 31 October 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020
Keywords: Brownian motion · B-spline · Kernel · Oracle efficiency · Strata pressure ·
Variance ratio
1 Introduction
The nonparametric simultaneous confidence band (SCB) is a useful tool for statistical
inference about the global properties of an entire unknown curve or function. It was
first constructed in Bickel and Rosenblatt (1973) for a kernel density function. The
nonparametric SCB was soon extended to regression functions; see Johnston (1982),
Härdle (1989), Härdle and Marron (1991), Eubank and Speckman (1993), Xia (1998),
and Claeskens and Van Keilegom (2003) for early works on SCBs. The SCB is not only
a theoretically beautiful construct but also has wide applications in many areas such
as sample surveys and functional data analysis; see Zhao and Wu (2008), Ma et al.
✉ Lijian Yang
yanglijian@tsinghua.edu.cn
¹ Center for Statistical Science and Department of Industrial Engineering, Tsinghua University, Beijing 100084, China
Article
Statistical inference for functional time series is investigated by extending the classic concept of autocovariance function (ACF) to functional ACF (FACF). It is established that for functional moving average (FMA) data, the FMA order can be determined as the highest nonvanishing order of FACF, just as in classic time series analysis. A two-step estimator is proposed for FACF, the first step involving simultaneous B-spline estimation of each time trajectory and the second step plug-in estimation of FACF by using the estimated trajectories in place of the latent true curves. Under simple and mild assumptions, the proposed tensor product spline FACF estimator is asymptotically equivalent to the oracle estimator with all known trajectories, leading to an asymptotically correct simultaneous confidence envelope (SCE) for the true FACF. Simulation experiments validate the asymptotic correctness of the SCE and data-driven FMA order selection. The proposed SCEs are computed for the FACFs of an electroencephalogram (EEG) functional time series, with the interesting discovery of finite FMA lag and Fourier form functional principal components.
Article
This paper concerns the comparison of two-sample nonparametric regression functions. An asymptotically correct simultaneous confidence band (SCB) is proposed for the difference of the two nonparametric regression functions to carry out this comparison. Simulation experiments provide strong evidence that corroborates the asymptotic theory. The proposed SCB is used to analyze different samples of strata pressure data from the Bullianta Coal Mine in Erdos City, Inner Mongolia, China.
Article
Existing functional data analysis literature has mostly overlooked data with spikes in mean, such as weekly sporting goods sales by a salesperson which spikes around holidays. For such functional data, two-step estimation procedures are formulated for the population mean function and holiday effect parameters, which correspond to the population sales curve and the spikes in sales during holiday times. The estimators are based on spline smoothing for individual trajectories using non-holiday observations, and are shown to be oracally efficient in the sense that both the mean function and holiday effects are estimated as efficiently as if all individual trajectories were known a priori. Consequently, an asymptotic simultaneous confidence band is established for the mean function and confidence intervals for holiday effects, respectively. Two sample extensions are also formulated and simulation experiments provide strong evidence that corroborates the asymptotic theory. Application to sporting goods sales data has led to a number of new discoveries.
Article
Asymptotically correct simultaneous confidence bands (SCBs) are proposed for the mean and variance functions of a nonparametric regression model based on deterministic designs. The variance estimation is as efficient, up to order $n^{-1/2}$, as an infeasible estimator if the mean function were known. Simulation experiments provide strong evidence that corroborates the asymptotic theory. The proposed SCBs are used to analyze two sets of strata pressure data from the Bullianta Coal Mine in Erdos City, Inner Mongolia, China.
Article
Stratified sampling is one of the most important survey sampling approaches and is widely used in practice. In this paper, we consider the estimation of the distribution function of a finite population in stratified sampling by the empirical distribution function (EDF) and kernel distribution estimator (KDE), respectively. Under general conditions, the rescaled estimation error processes are shown to converge to a weighted sum of transformed Brownian bridges. Moreover, simultaneous confidence bands (SCBs) are constructed for the population distribution function based on EDF and KDE. Simulation experiments and an illustrative data example show that the coverage frequencies of the proposed SCBs under the optimal and proportional allocations are close to the nominal confidence levels.
Article
A plug-in estimator is proposed for a local measure of variance explained by regression, termed correlation curve in Doksum et al. (J Am Stat Assoc 89:571–582, 1994), consisting of a two-step spline–kernel estimator of the conditional variance function and local quadratic estimator of first derivative of the mean function. The estimator is oracally efficient in the sense that it is as efficient as an infeasible correlation estimator with the variance function known. As a consequence of the oracle efficiency, a smooth simultaneous confidence band (SCB) is constructed around the proposed correlation curve estimator and shown to be asymptotically correct. Simulated examples illustrate the versatility of the proposed oracle SCB which confirms the asymptotic theory. Application to a 1995 British Family Expenditure Survey data has found marginally significant evidence for a local version of Engel’s law, i.e., food budget share and household real income are inversely related (Hamilton in Am Econ Rev 91:619–630, 2001).
Book
This book is based on the author's experience with calculations involving polynomial splines. It presents those parts of the theory which are especially useful in calculations and stresses the representation of splines as linear combinations of B-splines. After two chapters summarizing polynomial approximation, a rigorous discussion of elementary spline theory is given involving linear, cubic and parabolic splines. The computational handling of piecewise polynomial functions (of one variable) of arbitrary order is the subject of chapters VII and VIII, while chapters IX, X, and XI are devoted to B-splines. The distance from splines with fixed and with variable knots is discussed in chapter XII. The remaining five chapters concern specific approximation methods, interpolation, smoothing and least-squares approximation, the solution of an ordinary differential equation by collocation, curve fitting, and surface fitting. The present text version differs from the original in several respects. The book is now typeset (in plain TeX), and the Fortran programs now make use of Fortran 77 features. The figures have been redrawn with the aid of Matlab, various errors have been corrected, and many more formal statements have been provided with proofs. Further, all formal statements and equations have been numbered by the same numbering system, to make it easier to find any particular item. A major change has occurred in Chapters IX-XI where the B-spline theory is now developed directly from the recurrence relations without recourse to divided differences. This has brought in knot insertion as a powerful tool for providing simple proofs concerning the shape-preserving properties of the B-spline series.
Article
Simultaneous confidence bands (SCBs) are proposed for the distribution function of a finite population and of the latent superpopulation via the empirical distribution function (nonsmooth) and kernel distribution estimator (smooth) based on a simple random sample (SRS), either with or without finite population correction. It is shown that both nonsmooth and smooth SCBs achieve asymptotically the nominal confidence level under standard assumptions. In particular, the uncorrected nonsmooth SCB for superpopulation is exactly the same as the Kolmogorov–Smirnov SCB based on an independent and identically distributed sample as long as the SRS size is infinitesimal relative to the finite population size. Extensive simulation studies confirm the asymptotic properties. As an illustration, the proposed SCBs are constructed for the population distribution of the well-known baseball data (Lohr, Sampling: design and analysis, 2nd edn. Brooks/Cole, Boston, 2009).
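For orientation, the classical Kolmogorov-Smirnov band mentioned in this abstract can be sketched in a few lines of Python. This is a minimal illustration for an independent and identically distributed sample only; the function name `ks_band` is hypothetical, the constant 1.358 is the approximate 95% quantile of the limiting Kolmogorov distribution, and no finite population correction or kernel smoothing from the article is implemented.

```python
import numpy as np

def ks_band(sample, level_const=1.358):
    """Empirical distribution function with an asymptotic KS band.

    level_const is the approximate 95% quantile of the Kolmogorov
    distribution; the band is EDF +/- level_const / sqrt(n), clipped to [0, 1].
    """
    xs = np.sort(np.asarray(sample, dtype=float))
    n = xs.size
    edf = np.arange(1, n + 1) / n          # EDF evaluated at the order statistics
    half_width = level_const / np.sqrt(n)  # same half-width at every point
    lower = np.clip(edf - half_width, 0.0, 1.0)
    upper = np.clip(edf + half_width, 0.0, 1.0)
    return xs, edf, lower, upper

rng = np.random.default_rng(1)
xs, edf, lower, upper = ks_band(rng.standard_normal(400))
print(lower[:3], upper[:3])
```

The half-width is constant in the argument, so the band is simultaneous over the whole real line rather than pointwise.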
Article
In spite of the widespread use of generalized additive models (GAMs) to remedy the “curse of dimensionality”, there is no well-grounded methodology developed for simultaneous inference and variable selection for GAM in the existing literature. However, both are essential in enhancing the capability of statistical models. To this end, we establish simultaneous confidence corridors (SCCs) and a type of Bayesian information criterion (BIC) through the spline-backfitted kernel smoothing techniques proposed in recent articles. To characterize the global features of each nonparametric component, SCCs are constructed for testing their overall trends and entire shapes. By extending the BIC in additive models with identity/trivial link, an asymptotically consistent BIC approach for variable selection is built up in GAM to improve the parsimony of the model without loss of prediction accuracy. Simulations and a real example corroborate the above findings.