A Partially Linear Regression Model for Data from an Outcome-Dependent Sampling Design.

University of North Carolina at Chapel Hill, USA.
Applied Statistics (Impact Factor: 1.49). 08/2011; 60(4):559-574. DOI: 10.1111/j.1467-9876.2010.00756.x
Source: PubMed


The outcome dependent sampling scheme has been gaining attention in both the statistical literature and applied fields. Epidemiological and environmental researchers have been using it to select the observations for more powerful and cost-effective studies. Motivated by a study of the effect of in utero exposure to polychlorinated biphenyls on children's IQ at age 7, in which the effect of an important confounding variable is nonlinear, we consider a semi-parametric regression model for data from an outcome-dependent sampling scheme where the relationship between the response and covariates is only partially parameterized. We propose a penalized spline maximum likelihood estimation (PSMLE) for inference on both the parametric and the nonparametric components and develop their asymptotic properties. Through simulation studies and an analysis of the IQ study, we compare the proposed estimator with several competing estimators. Practical considerations of implementing those estimators are discussed.

Download full-text


Available from: Guoyou Qin
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Multi-phased designs and biased sampling designs are two of the well recognized approaches to enhance study efficiency. In this paper, we propose a new and cost-effective sampling design, the two-phase probability dependent sampling design (PDS), for studies with a continuous outcome. This design will enable investigators to make efficient use of resources by targeting more informative subjects for sampling. We develop a new semiparametric empirical likelihood inference method to take advantage of data obtained through a PDS design. Simulation study results indicate that the proposed sampling scheme, coupled with the proposed estimator, is more efficient and more powerful than the existing outcome dependent sampling design and the simple random sampling design with the same sample size. We illustrate the proposed method with a real data set from an environmental epidemiologic study.
    Preview · Article · Apr 2014 · Journal of the Royal Statistical Society Series B (Statistical Methodology)
  • [Show abstract] [Hide abstract]
    ABSTRACT: An outcome-dependent sampling (ODS) design is a retrospective sampling scheme where one observes the primary exposure variables with a probability that depends on the observed value of the outcome variable. When the outcome of interest is failure time, the observed data are often censored. By allowing the selection of the supplemental samples depends on whether the event of interest happens or not and oversampling subjects from the most informative regions, ODS design for the time-to-event data can reduce the cost of the study and improve the efficiency. We review recent progresses and advances in research on ODS designs with failure time data. This includes researches on ODS related designs like case-cohort design, generalized case-cohort design, stratified case-cohort design, general failure-time ODS design, length-biased sampling design and interval sampling design.
    No preview · Article · Jan 2016 · Lifetime Data Analysis