Roderick J Little

University of Michigan | U-M · Department of Biostatistics

Ph.D.


165
Publications
18,788
Reads
22,666
Citations


Publications (165)
Article
Accidents are a leading cause of deaths in U.S. active duty personnel. Understanding accident deaths during wartime could facilitate future operational planning and inform risk prevention efforts. This study expands prior research, identifying health risk factors associated with U.S. Army accident deaths during the Afghanistan and Iraq war. Militar...
Chapter
When sample sizes are small, a useful alternative approach to maximum likelihood (ML) is to add a prior distribution for the parameters and compute the posterior distribution of the parameters of interest. As with ML estimation with a general pattern of missing values, Bayes simulation requires iteration. The iterative simulation methods discussed...
Article
Missing values in predictors are a common problem in survival analysis. In this paper, we review estimation methods for accelerated failure time models with missing predictors, and apply a new method called subsample ignorable likelihood (IL; Little and Zhang, J R Stat Soc 60:591-605, 2011) to this class of models. The approach applies a likelihood...
Chapter
Imputations are means or draws from a predictive distribution of the missing values, and require a method of creating a predictive distribution for the imputation based on the observed data. There are two generic approaches to generating this distribution: Explicit modeling: the predictive distribution is based on a formal statistical model, and he...
Chapter
The estimate is computed as part of the Newton-Raphson algorithm for Maximum Likelihood (ML) estimation, and computed as part of the scoring algorithm. This chapter considers methods for computing standard errors that do not require computation and inversion of an information matrix. Another method for calculating large-sample covariance matrices i...
Chapter
This chapter considers alternative distributions to the t distribution for robust inference, and robust inference for multivariate data sets with missing values. It describes a general mixture model for robust estimation of a univariate sample that includes the t and contaminated normal distributions as special cases. The case of multivariate data...
Article
This article summarizes recommendations on the design and conduct of clinical trials of a National Research Council study on missing data in clinical trials. Key findings of the study are that (a) substantial missing data is a serious problem that undermines the scientific credibility of causal conclusions from clinical trials; (b) the assumption t...
Article
Missing data in clinical trials can have a major effect on the validity of the inferences that can be drawn from the trial. This article reviews methods for preventing missing data and, failing that, dealing with data that are missing.
Article
Covariate measurement error is common in epidemiologic studies. Current methods for correcting measurement error with information from external calibration samples are insufficient to provide valid adjusted inferences. We consider the problem of estimating the regression of an outcome Y on covariates X and Z, where Y and Z are observed, X is unobse...
Article
Gene sequences are routinely used to determine the topologies of unrooted phylogenetic trees, but many of the most important questions in evolution require knowing both the topologies and the roots of trees. However, general algorithms for calculating rooted trees from gene and genomic sequences in the absence of gene paralogs are few. Using the pr...
Article
We consider the linear regression of outcome Y on regressors W and Z with some values of W missing, when our main interest is the effect of Z on Y, controlling for W. Three common approaches to regression with missing covariates are (i) complete-case analysis (CC), which discards the incomplete cases, and (ii) ignorable likelihood methods,...
Article
This pragmatic randomized trial evaluated the effectiveness of a tailored educational intervention on oral health behaviors and new untreated carious lesions in low-income African-American children in Detroit, Michigan. Participating families were recruited in a longitudinal study of the determinants of dental caries in 1021 randomly selected child...
Article
Rejoinder of "Calibrated Bayes, for Statistics in General, and Missing Data in Particular" by R. Little [arXiv:1108.1917]
Article
We consider the estimation of the regression of an outcome Y on a covariate X, where X is unobserved, but a variable W that measures X with error is observed. A calibration sample that measures pairs of values of X and W is also available; we consider calibration samples where Y is measured (internal calibration) and not measured (external calibrat...
Article
It is argued that the Calibrated Bayesian (CB) approach to statistical inference capitalizes on the strength of Bayesian and frequentist approaches to statistical inference. In the CB approach, inferences under a particular model are Bayesian, but frequentist methods are useful for model development and model checking. In this article the CB approa...
Article
Two common approaches to regression with missing covariates are complete-case analysis and ignorable likelihood methods. We review these approaches and propose a hybrid class, called subsample ignorable likelihood methods, which applies an ignorable likelihood method to the subsample of observations that are complete on one set of variables, but po...
Article
In this paper, the authors describe a simple method for making longitudinal comparisons of alternative markers of a subsequent event. The method is based on the aggregate prediction gain from knowing whether or not a marker has occurred at any particular age. An attractive feature of the method is the exact decomposition of the measure into 2 compo...
Article
We consider assessment of nonresponse bias for the mean of a survey variable Y subject to nonresponse. We assume that there is a set of covariates observed for nonrespondents and respondents. To reduce dimensionality and for simplicity we reduce the covariates to a proxy variable X that has the highest correlation with Y, estimated from a regress...
Article
In clinical trials, a biomarker (S) that is measured after randomization and is strongly associated with the true endpoint (T) can often provide information about T and hence the effect of a treatment (Z) on T. A useful biomarker can be measured earlier than T and cost less than T. In this article, we consider the use of S as an auxiliary variabl...
Article
In their valuable commentary, Drs. Ghosh and Castle (1) reinforce the points made in our article (2). Specifically, they emphasize the utility of combining measures of prevalence and predictive ability and show how the idea applies to another important epidemiologic measure, population attributable risk. They also describe applications of these ide...
Article
Two major ideas in the analysis of missing data are (a) the EM algorithm [Dempster, Laird and Rubin, J. Roy. Statist. Soc. Ser. B 39 (1977) 1-38] for maximum likelihood (ML) estimation, and (b) the formulation of models for the joint distribution of the data $Z$ and missing data indicators $M$, and associated "missing at random" (MAR) conditi...
Article
We propose a regression-based hot-deck multiple imputation method for gaps of missing data in longitudinal studies, where subjects experience a recurrent event process and a terminal event. Examples are repeated asthma episodes and death, or menstrual periods and menopause, as in our motivating application. Research interest concerns the onset time...
Article
In longitudinal studies of developmental and disease processes, participants are followed prospectively with intermediate milestones identified as they occur. Frequently, studies enroll participants over a range of ages including ages at which some participants' milestones have already passed. Ages at milestones that occur prior to study entry are...
Article
The Internet provides us with tools (user metrics or paradata) to evaluate how users interact with online interventions. Analysis of these paradata can lead to design improvements. The objective was to explore the qualities of online participant engagement in an online intervention. We analyzed the paradata in a randomized controlled trial of alter...
Article
This work is motivated by a quantitative Magnetic Resonance Imaging study of the differential tumor/healthy tissue change in contrast uptake induced by radiation. The goal is to determine the time in which there is maximal contrast uptake (a surrogate for permeability) in the tumor relative to healthy tissue. A notable feature of the data is its sp...
Article
Disclosure limitation is an important consideration in the release of public use data sets. It is particularly challenging for longitudinal data sets, since information about an individual accumulates over time. We consider problems created by high ages in cohort studies. Because of the risk of disclosure, ages of very old respondents can often not...
Article
Disclosure limitation is an important consideration in the release of public use data sets. It is particularly challenging for longitudinal data sets, since information about an individual accumulates with repeated measures over time. Research on disclosure limitation methods for longitudinal data has been very limited. We consider here problems cr...
Article
The goal of the present study was to quantify the population-based background serum concentrations of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) by using data from the reference population of the 2005 University of Michigan Dioxin Exposure Study (UMDES) and the 2003-2004 National Health and Nutrition Examination Survey (NHANES). Multiple imputation...
Article
Raw data on the relationship between known and measured values of an analyte are collected and analyzed to determine the limit of quantification (LOQ) of an assay. In most LOQ problems, the researcher is given an observed value for the marker of interest if this value is greater than the LOQ, and a missing value (<LOQ) otherwise. From a statistical...
Article
Repeated neuropsychological measurements, such as mini-mental state examination (MMSE) scores, are frequently used in Alzheimer’s disease (AD) research to study change in cognitive function of AD patients. A question of interest among dementia researchers is whether some AD patients exhibit transient “plateaus” of cognitive function in the course o...
Article
Data analysis for randomized trials including multi-treatment arms is often complicated by subjects who do not comply with their treatment assignment. We discuss here methods of estimating treatment efficacy for randomized trials involving multi-treatment arms subject to non-compliance. One treatment effect of interest in the presence of non-compli...
Article
Hot deck imputation is a method for handling missing data in which each missing value is replaced with an observed response from a “similar” unit. Despite being used extensively in practice, the theory is not as well developed as that of other imputation methods. We have found that no consensus exists as to the best way to apply the hot deck and ob...
Article
This work is motivated by a quantitative Magnetic Resonance Imaging study of the relative change in tumor vascular permeability during the course of radiation therapy. The differences in tumor and healthy brain tissue physiology and pathology constitute a notable feature of the image data: spatial heterogeneity with respect to its contrast uptake pr...
Article
Asthma is a serious problem for low-income preteens living in disadvantaged communities. Among the chronic diseases of childhood and adolescence, asthma has the highest prevalence and related health care use. School-based asthma interventions have proven successful for older and younger students, but results have not been demonstrated for those in...
Article
The objective of this study was to evaluate the existence of cognitive plateaus in some individuals during the course of Alzheimer's disease (AD). Data came from the historical patient group collected via the Consortium to Establish a Registry for Alzheimer's Disease (CERAD, Duke University, 1988-1996). Data reduction was performed by using princip...
Article
A common strategy for handling item nonresponse in survey sampling is hot deck imputation, where each missing value is replaced with an observed response from a "similar" unit. We discuss here the use of sampling weights in the hot deck. The naive approach is to ignore sample weights in creation of adjustment cells, which effectively imputes the un...
Article
Little and An (2004, Statistica Sinica 14, 949-968) proposed a penalized spline of propensity prediction (PSPP) method of imputation of missing values that yields robust model-based inference under the missing at random assumption. The propensity score for a missing variable is estimated and a regression model is fitted that includes the spline of...
Article
Parametric model-based regression imputation is commonly applied to missing-data problems, but is sensitive to misspecification of the imputation model. Little and An (2004) proposed a semiparametric approach called penalized spline propensity prediction (PSPP), where the variable with missing values is modeled by a penalized spline (P-Spline) of t...
Article
Selection models and pattern-mixture models are often used to deal with nonignorable dropout in longitudinal studies. These two classes of models are based on different factorizations of the joint distribution of the outcome process and the dropout process. We consider a new class of models, called mixed-effect hybrid models (MEHMs), where the join...
Article
Health behavior intervention studies have focused primarily on comparing new programs and existing programs via randomized controlled trials. However, numbers of possible components (factors) are increasing dramatically as a result of developments in science and technology (e.g., Web-based surveys). These changes dictate the need for alternative me...
Article
Consider a meta-analysis of studies with varying proportions of patient-level missing data, and assume that each primary study has made certain missing data adjustments so that the reported estimates of treatment effect size and variance are valid. These estimates of treatment effects can be combined across studies by standard meta-analytic methods...
Article
Quantitative Magnetic Resonance Imaging (qMRI) provides researchers insight into pathological and physiological alterations of living tissue, with the help of which, researchers hope to predict (local) therapeutic efficacy early and determine optimal treatment schedule. However, the analysis of qMRI has been limited to ad-hoc heuristic methods. Our...
Article
Although the randomized, controlled trial (RCT) is considered the gold standard in research for determining the efficacy of health education interventions, such trials may be vulnerable to "preference effects"; that is, differential outcomes depending on whether an individual is randomized to his or her preferred treatment. In this study, we review...
Article
We consider the analysis of clinical trials that involve randomization to an active treatment (T = 1) or a control treatment (T = 0), when the active treatment is subject to all-or-nothing compliance. We compare three approaches to estimating treatment efficacy in this situation: as-treated analysis, per-protocol analysis, and instrumental variable...
Article
Initial trials of web-based smoking-cessation programs have generally been promising. The active components of these programs, however, are not well understood. This study aimed to (1) identify active psychosocial and communication components of a web-based smoking-cessation intervention and (2) examine the impact of increasing the tailoring depth...
Article
Patient preference may influence intervention effects, but has not been extensively studied. Randomized controlled design (N=1075) assessed outcomes when women (60 years+) were given a choice of two formats of a program to enhance heart disease management. Randomization to "no choice" or "choice" study arms. Further randomization of "no choice" to:...
Article
Web-based programs for health promotion, disease prevention, and disease management often experience high rates of attrition. There are 3 questions which are particularly relevant to this issue. First, does engagement with program content predict long-term outcomes? Second, which users are most likely to drop out or disengage from the program? Thir...
Article
Criteria for staging the menopausal transition are not established. This article evaluates five bleeding criteria for defining early transition and provides empirically based guidance regarding optimal criteria. Prospective menstrual calendar data from four population-based cohorts: TREMIN, Melbourne Women's Midlife Health Project (MWMHP), Seattle...
Article
We consider the analysis of longitudinal data sets that include times of recurrent events, where interest lies in variables that are functions of the number of events and the time intervals between events for each individual, and where some cases have gaps when the information was not recorded. Discarding cases with gaps results in a loss of the re...
Article
This article concerns item nonresponse adjustment for two-stage cluster samples. Specifically, we focus on two types of nonignorable nonresponse: nonresponse depending on covariates and underlying cluster characteristics, and depending on covariates and the missing outcome. In these circumstances, standard weighting and imputation adjustments are l...
Article
In a previous study, we validated a polysomnographic assessment for REM sleep behavior disorder (RBD). The method proved to be reliable but required slow, labor-intensive visual scoring of surface electromyogram (EMG) activity. We therefore developed a computerized metric to assess EMG variance and compared the results to those previously published...
Article
Comment: Struggles with Survey Weighting and Regression Modeling [arXiv:0710.5005]. Published in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/088342307000000186
Article
Top coding of extreme values of variables like income is a common method of statistical disclosure control, but it creates problems for the data analyst. The paper proposes two alternative methods to top coding for statistical disclosure control that are based on multiple imputation. We show in simulation studies that the multiple-imputation method...
Article
We propose new model-based methods for unit non-response in two-stage survey samples. A commonly used design-based adjustment weights respondents by the inverse of the estimated response rate in each cluster (method WT). This approach is consistent if the response probabilities are constant within clusters but is potentially inefficient when the es...
Article
Scintigraphic imaging with (123)I-metaiodobenzylguanidine ((123)I-MIBG) has demonstrated extensive losses of cardiac sympathetic neurons in idiopathic Parkinson's disease (IPD). In contrast, normal cardiac innervation has been observed in (123)I-MIBG studies of multiple-system atrophy (MSA) and progressive supranuclear palsy (PSP). Consequently, it...
Article
The current criterion for onset of late menopausal transition is amenorrhea of 90 d or more. The Stages of Reproductive Aging Workshop proposed alternative criteria based on a shorter period of amenorrhea. Empirical data comparing proposed criteria are not available. This paper evaluates the several bleeding criteria that served as the basis of the...
Article
The Stages of Reproductive Aging Workshop proposed bleeding and hormonal criteria for the menopausal transition, but operational definitions of hormone parameters were not specified. This paper investigates the longitudinal relationship of annual serum FSH levels with four proposed bleeding criteria for the late menopausal transition in two cohort...
Chapter
This article has no abstract.
Article
The lack of an agreed inferential basis for statistics makes life "interesting" for academic statisticians, but at the price of negative implications for the status of statistics in industry, science, and government. The practice of our discipline will mature only when we can come to a basic agreement about how to apply statistics to real problems....
Article
OBJECTIVE AND CONTEXT: Our objective was to examine predictability of reproductive hormone concentrations for bone mineral density (BMD) loss during the menopausal transition. We conducted a longitudinal (five annual examinations), multiple-site (n = 5) cohort study, the Study of Women's Health Across the Nation (SWAN). Participants included, at ba...
Article
Recent studies suggest that the wide variability in type, detail, and reliability of online information motivate expert searchers to develop procedural search knowledge. In contrast to prior research that has focused on finding relevant sources, procedural search knowledge focuses on how to order multiple relevant sources with the goal of retrievi...
Article
It has been speculated that gender differences in cardiovascular disease (CVD) mortality can be attributed to the effects of estrogens on inflammation and hemostatic marker profiles. Therefore, we evaluated endogenous hormone concentrations, menopause transition stages, and adoption of exogenous hormone use in relation to hemostatic and inflammatio...
Article
The goal of this study was to relate annually measured endogenous androgens to hemostatic and inflammation markers in women longitudinally. A total of 3302 participants from the Study of Women's Health Across the Nation, aged 42-52 yr at baseline and self-identified as African-American (28%), Caucasian (47%), Chinese (8%), Hispanic (8%), or Japanes...
Chapter
Missing data are a common problem in the social and behavioral sciences. Here we present an overview of the problem and possible solutions. We begin by distinguishing between the pattern of missing data and the mechanism that creates the missing data. We then consider common, but limited, approaches: complete-cases, available cases, weighting analy...
Article
Rapid eye movement (REM) sleep behavior disorder (RBD) was described more than 2 decades ago, but only 1 report on 5 patients and 5 normal subjects has tested the effectiveness of a method by which relevant polysomnographic findings can be quantified. We sought to validate this method in a larger sample of patients and control subjects. Cross-secti...
Chapter
Introduction; Full synthesis; SMIKe and MIKe; Analysis of synthetic samples; An application; Conclusions
Article
Accurate, early differentiation of dementias will become increasingly important as new therapies are introduced. Differential diagnosis by standard clinical criteria has limited accuracy. PET offers the potential to increase diagnostic accuracy. (18)F-FDG studies detect metabolic abnormalities in demented patients, but with limited specificity. PET...
Article
We compared the relative utility of neuropsychological testing and positron emission tomography (PET) with [18F]fluorodeoxyglucose ([18F]FDG) in differentiating Alzheimer's disease (AD) from dementia with Lewy bodies (DLB). We studied 25 patients with AD, 20 with DLB, and 19 normal elderly controls. There was no difference between patient groups fo...
Article
Demographic analysis of data on births, deaths, and migration, together with coverage measurement surveys that use capture-recapture methods, have established that U.S. Census counts are flawed for certain subpopulations. Previous work using 1990 Census data in African-Americans age 30-49 proposed a hierarchical Bayesian model that assembled Census...
Article
We wanted to identify what factors promote career development in patient-oriented clinical research (POCR). We used a survey questionnaire covering areas relevant to the training of subspecialty fellows and the career development of POCR faculty. Pursuit of an academic career after fellowship correlated with completion of a clinical project, availa...
Article
Over 3,000 subjects were recruited in 3 U.S. regions for a randomized experiment of an online weight management intervention. Participants were sent invitations to web survey reassessments after 3, 6, and 12 months. High and increasing nonresponse to the three follow-up surveys created the potential for nonresponse bias in key program outcomes. A...
Article
Nonresponse weighting is a common method for handling unit nonresponse in surveys and is aimed at reducing nonresponse bias. Because the method can be accompanied by an increase in variance, the efficacy of weighting adjustments is often seen as a bias-variance trade-off. This view is an oversimplification, because weighting can reduce variance as...
Article
Noncompliance is a common problem in experiments involving randomized assignment of treatments, and standard analyses based on intention-to-treat or treatment received have limitations. An attractive alternative is to estimate the Complier-Average Causal Effect (CACE), which is the average treatment effect for the subpopulation of subjects who woul...
Article
We used positron emission tomography (PET) with (+)-[(11)C]dihydrotetrabenazine ([+]-[(11)C]DTBZ) to examine striatal monoaminergic presynaptic terminal density in 20 patients with dementia with Lewy bodies (DLB), 25 with Alzheimer's disease (AD), and 19 normal elderly controls. Six DLB patients developed parkinsonism at least 1 year before dementi...
Article
Serum reproductive hormone concentrations were measured longitudinally in a community-based, multiethnic population of midlife women to assess whether ethnic differences exist in the patterns of change in estradiol (E2) and FSH and, if so, whether these differences are explained by host characteristics. We studied 3257 participants from seven clini...
Article
Finite population sampling is perhaps the only area of statistics in which the primary mode of analysis is based on the randomization distribution, rather than on statistical models for the measured variables. This article reviews the debate between design-based and model-based inference. The basic features of the two approaches are illustrated usi...
Article
Samplers often distrust model-based approaches to survey inference due to concerns about model misspecification when applied to large samples from complex populations. We suggest that the model-based paradigm can work very successfully in survey settings, provided models are chosen that take into account the sample design and avoid str...