Efficient Estimation of Population-Level Summaries in General Semiparametric Regression Models

Texas A&M University, College Station, Texas, United States
Journal of the American Statistical Association (Impact Factor: 2.11). 02/2007; 102(March):123-139. DOI: 10.2307/27639826
Source: RePEc

ABSTRACT This paper considers a wide class of semiparametric regression models in which interest focuses on population-level quantities that combine both the parametric and nonparametric parts of the model. Special cases in this approach include generalized partially linear models, generalized partially linear single index models, structural measurement error models and many others. For estimating the parametric part of the model efficiently, profile likelihood kernel estimation methods are well-established in the literature. Here our focus is on estimating general population-level quantities that combine the parametric and nonparametric parts of the model, e.g., population mean, probabilities, etc. We place this problem into a general context, provide a general kernel-based methodology, and derive the asymptotic distributions of estimates of these population-level quantities, showing that in many cases the estimates are semiparametric efficient. For estimating the population mean with no missing data, we show that the sample mean is semiparametric efficient for canonical exponential families, but not in general. We apply the methods to a problem in nutritional epidemiology, where estimating the distribution of usual intake is of primary interest, and semiparametric methods are not available. Extensions to the case of missing response data are also discussed.


Available from: Arnab Maity, May 03, 2015
  • ABSTRACT: We study a class of semiparametric skewed distributions arising when the sample selection process produces non-randomly sampled observations. Based on semiparametric theory and taking into account the symmetric nature of the population distribution, we propose both consistent estimators, i.e. robust to model mis-specification, and efficient estimators, i.e. reaching the minimum possible estimation variance, of the location of the symmetric population. We demonstrate the theoretical properties of our estimators through asymptotic analysis and assess their finite sample performance through simulations. We also implement our methodology on a real data example of ambulatory expenditures to illustrate the applicability of the estimators in practice.
    Stat 10/2012; 1(1):1-11. DOI:10.1002/sta4.2
  • ABSTRACT: We investigate the estimation efficiency of the central mean subspace in the framework of sufficient dimension reduction. We derive the semiparametric efficient score and study its practical applicability. Despite the difficulty caused by the potential high dimension issue in the variance component, we show that locally efficient estimators can be constructed in practice. We conduct simulation studies and a real data analysis to demonstrate the finite sample performance and gain in efficiency of the proposed estimators in comparison with several existing methods.
    Journal of the Royal Statistical Society 11/2013; 76(5). DOI:10.1111/rssb.12044
  • ABSTRACT: A data-driven bandwidth selection method for backfitting estimation of semiparametric additive models, when the parametric part is of main interest, is proposed. The proposed method is a double smoothing estimator of the mean-squared error of the backfitting estimator of the parametric terms. The performance of the proposed method is evaluated and compared with existing bandwidth selectors by means of a simulation study.
    Computational Statistics & Data Analysis 06/2013; 62:136–148. DOI:10.1016/j.csda.2013.01.010 · 1.15 Impact Factor