Douglas M. Bates
University of Wisconsin–Madison · Department of Statistics
PhD
Publications: 137
Reads: 130,654
Citations: 173,387
Additional affiliations
January 1983 - June 1983
June 1978 - May 1980
June 1980 - present
Publications (137)
Linear mixed-effects models have increasingly replaced mixed-model analyses of variance for statistical inference in factorial psycholinguistic experiments. The advantages of LMMs over ANOVAs, however, come at a cost: Setting up an LMM is not as straightforward as running an ANOVA. One simple option, when numerically possible, is to fit the full va...
Generalized additive mixed models are introduced as an extension of the generalized linear mixed model which makes it possible to deal with temporal autocorrelational structure in experimental data. This autocorrelational structure is likely to be a consequence of learning, fatigue, or the ebb and flow of attention within an experiment (the `human...
The analysis of experimental data with mixed-effects models requires
decisions about the specification of the appropriate random-effects structure.
Recently, Barr et al. (2013) recommended fitting 'maximal' models with all
possible random effect components included. Estimation of maximal models,
however, may not converge. We show that failure to co...
Maximum likelihood or restricted maximum likelihood (REML) estimates of the
parameters in linear mixed-effects models can be determined using the lmer
function in the lme4 package for R. As for most model-fitting functions in R,
the model is described in an lmer call by a formula, in this case including
both fixed- and random-effects terms. The for...
Description: Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the 'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".
The RcppEigen package provides access from R (R Core Team 2012a) to the Eigen (Guennebaud, Jacob, and others 2012) C++ template library for numerical linear algebra. Rcpp (Eddelbuettel and François 2011, 2012) classes and specializations of the C++ templated functions as and wrap from Rcpp provide the "glue" for passing objects from R to C++ and...
Univariate Nonlinear Regression. Univariate Nonlinear Regression: Special Situations. A Unified Asymptotic Theory for Nonlinear Models with Regression Structure. Univariate Nonlinear Regression: Asymptotic Theory. Multivariate Nonlinear Regression. Nonlinear Simultaneous Equations Models. A Unified Asymptotic Theory for Dynamic Nonlinear Models. Re...
The determinant of matrices of the form X′X is used as the D-optimal design criterion for parameter estimation in nonlinear regression and as the estimation criterion for multiresponse parameter estimation. It is helpful to be able to calculate the gradient of this determinant so that sophisticated optimization methods can be used. A method is give...
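The abstract is truncated before the method is stated, so the following is only a generic sketch of how such a gradient can be computed, not the paper's algorithm. It applies Jacobi's formula, d/dt det(A) = det(A) · tr(A⁻¹ dA/dt), to A = X′X. The model matrix X(t), its single parameter t, and the design points are all hypothetical.

```python
import math

DESIGN = (1.0, 2.0, 4.0)  # hypothetical design points

def X(t):
    # Hypothetical 3x2 model matrix: a constant column and exp(-t * x).
    return [[1.0, math.exp(-t * x)] for x in DESIGN]

def dX(t):
    # Elementwise derivative of X(t) with respect to t.
    return [[0.0, -x * math.exp(-t * x)] for x in DESIGN]

def crossprod(A, B):
    # Returns A'B for matrices stored as lists of rows.
    return [[sum(A[k][i] * B[k][j] for k in range(len(A)))
             for j in range(len(B[0]))] for i in range(len(A[0]))]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def det_xtx(t):
    return det2(crossprod(X(t), X(t)))

def ddet_xtx(t):
    # dA/dt = B + B' with B = X' dX; because A^{-1} is symmetric,
    # tr(A^{-1} (B + B')) = 2 tr(A^{-1} B).
    A = crossprod(X(t), X(t))
    B = crossprod(X(t), dX(t))
    Ainv = inv2(A)
    trace = sum(Ainv[i][j] * B[j][i] for i in range(2) for j in range(2))
    return 2.0 * det2(A) * trace

# Compare the analytic derivative with a central finite difference.
t0, h = 0.5, 1e-6
numeric = (det_xtx(t0 + h) - det_xtx(t0 - h)) / (2 * h)
analytic = ddet_xtx(t0)
```

With several parameters the same formula is applied once per parameter, which gives the gradient vector that a gradient-based optimizer needs.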
Many statistics methods require one or more least squares problems to be solved. There are several ways to perform this calculation, using objects from the base R system and using objects in the classes defined in the Matrix package. We compare the speed of some of these methods on a very small example and on an example for which the model matrix is...
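As a rough illustration of what "solving a least squares problem" involves, the sketch below forms the normal equations X′Xb = X′y and solves them with a Cholesky factorization. It is written in plain Python rather than R, uses made-up data, and is not one of the R or Matrix package methods timed in the paper. Orthogonal (QR) methods are generally more stable numerically than the normal equations when X is ill-conditioned.

```python
import math

def cholesky(A):
    # Lower-triangular L with A = L L', for symmetric positive definite A.
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_chol(L, b):
    n = len(b)
    z = [0.0] * n
    for i in range(n):  # forward substitution: L z = b
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution: L' x = z
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def least_squares(X, y):
    # Form X'X and X'y, then solve the normal equations.
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve_chol(cholesky(XtX), Xty)

# Made-up data lying exactly on y = 1 + 2x.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
beta = least_squares(X, y)
```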
There has been a substantial increase in the percentage for publications with co-authors located in departments from different
countries in 12 major journals of psychology. The results are evidence for a remarkable internationalization of psychological
research, starting in the mid 1970s and increasing in rate at the beginning of the 1990s. This gr...
This is an R package (a piece of software) to fit and do inference on mixed-effects models.
The package is Free Software (hence open-source), and the package and much documentation about it are freely available from CRAN at
https://cran.r-project.org/package=lme4
In this paper, we discuss the R language and environment, and in particular its use on Debian Linux, as well as the Debian package management system.
Linear algebra is at the core of many areas of statistical computing and from its inception the S language has supported numerical linear algebra via a matrix data type and several functions and operators, such as %*%, qr, chol, and solve. However, these data types and functions do not provide direct access to all of the facilities for efficient man...
The lme4 package provides R functions to fit and analyze linear mixed models, generalized linear mixed models and nonlinear mixed models. In this vignette we describe the formulation of these models and the computational approach used to evaluate or approximate the log-likelihood of a model/data/parameter value combination.
Clinical mastitis is typically coded as presence/absence during some period of exposure, and records are analyzed with linear or binary data models. Because presence includes cows with multiple episodes, there is loss of information when a count is treated as a binary response. The Poisson model is designed for counting random variables, and althou...
This paper provides an introduction to mixed-effects models for the analysis of repeated measurement data with subjects and items as crossed random effects. A worked-out example of how to use recent software for mixed-effects modeling is provided. Simulation studies illustrate the advantages offered by mixed-effects analyses compared to traditional...
Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regress...
Model Specification * Preliminary Analysis * Starting Values * Parameter Transformations * Other Iterative Techniques * Obtaining Convergence * Assessing the Fit and Modifying the Model * Correlated Residuals * Accumulated Data * Comparing Models * Parameters as Functions of Other Variables * Presenting the Results * Nitrite Utilization: A Case Study * Experimental Design
The Nonlinear Regression Model * Determining the Least Squares Estimates * Nonlinear Regression Inference Using the Linear Approximation * Nonlinear Least Squares via Sums of Squares * Use of the Linear Approximation
Velocity and Acceleration Vectors * Relative Curvatures * RMS Curvatures * Direct Assessment of the Effects of Intrinsic Nonlinearity
Zeleis for various suggestions and contributions. Description This package accompanies J. Fox, An R and S-PLUS Companion to Applied Regression, Sage, 2002. The package contains mostly functions for applied regression, linear models, and generalized linear models, with an emphasis on regression diagnostics, particularly graphical diagnostic methods....
Traditional Rasch estimation of the item and student parameters via marginal maximum likelihood, joint maximum likelihood or conditional maximum likelihood, assume individuals in clustered settings are uncorrelated and items within a test that share a grouping structure are also uncorrelated. These assumptions are often violated, particularly in ed...
Linear algebra is at the core of many areas of statistical computing and from its inception the S language has supported numerical linear algebra via a matrix data type and several functions and operators, such as %*%, qr, chol, and solve. However, these data types and functions do not provide direct access to all of the facilities for efficient ma...
The analysis of longitudinal data may require a mixed-effects model, incorporating parameters for fixed effects associated with the whole population and also parameters describing distributions of random effects associated with individual subjects. These may enter the model nonlinearly, as in compartment models used in pharmacokinetics. Maximum lik...
Linear mixed-effects models are an important class of statistical models that are used directly in many fields of applications and also are used as iterative steps in fitting other types of mixed-effects models, such as generalized linear mixed models. The parameters in these models are typically estimated by maximum likelihood or restricted maximu...
geMatrix-class, isInitialized, lu, matrixGets, ...
Because interleukin (IL)-5 family cytokines are critical regulators of eosinophil development, recruitment, and activation, this study was initiated to identify proteins induced by these cytokines in eosinophils. Using oligonucleotide microarrays, numerous transcripts were identified as responsive to both IL-5 and granulocyte macrophage-colony-stim...
The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achiev...
The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. We detail some of the design decisions, software paradigms and operational strategies that have allowed a small number of researchers to provide a wide variety of innovative, extensible, software solutions in...
The nlme package for fitting and examining linear and nonlinear mixed-effects models in R is a required package and also one of the largest R packages. In the first phase of a project to extend the capabilities of the nlme package to include generalized linear mixed models (glmm's), we reimplemented linear mixed-effects (lme) models using 'S4' classe...
In an earlier paper we provided easily-calculated expressions for the gradient of the profiled log-likelihood and log-restricted-likelihood for single-level mixed-effects models. We also showed how this gradient is related to the update of an ECME (expectation conditional maximization either) algorithm for such single level models. In this paper w...
When creating the R Matrix package, which provides access to the Fortran Lapack and BLAS3 routines, we patterned the functions after the corresponding S-PLUS library but chose a completely different implementation. We based the R implementation on the lapack++ classes that provide C++ wrappers for the Lapack code.
Linear Mixed-Effects * Theory and Computational Methods for LME Models * Structure of Grouped Data * Fitting LME Models * Extending the Basic LME Model * Nonlinear Mixed-Effects * Theory and Computational Methods for NLME Models * Fitting NLME Models
In this chapter we present the theory for the linear mixed-effects model introduced in Chapter 1. A general formulation of LME models is presented and illustrated with examples. Estimation methods for LME models, based on the likelihood or the restricted likelihood of the parameters, are described, together with the computational methods used to im...
As seen in Chapter 1, mixed-effects models provide a flexible and powerful tool for analyzing balanced and unbalanced grouped data. These models have gained popularity over the last decade, in part because of the development of reliable and efficient software for fitting and analyzing them. The linear and nonlinear mixed-effects (nlme) library in S...
Many common statistical models can be expressed as linear models that incorporate both fixed effects, which are parameters associated with an entire population or with certain repeatable levels of experimental factors, and random effects, which are associated with individual experimental units drawn at random from a population. A model with both fi...
As illustrated by the examples in Chapter 1, we will be modeling data from experiments or studies in which the observations are grouped according to one or more nested classifications. Often this classification is by “Subject” or some similar experimental unit. Repeated measures data, longitudinal data, and growth curve data are examples of this ge...
This chapter presents the theory for the nonlinear mixed-effects model introduced in Chapter 6. A general formulation of NLME models is presented and illustrated with examples. Estimation methods for fitting NLME models, based on approximations to the likelihood function, are described and discussed. The computational methods used in the nlme funct...
As shown in the examples in Chapter 6, nonlinear mixed-effects models offer a flexible tool for analyzing grouped data with models that depend nonlinearly upon their parameters. As nonlinear models are usually based on a mechanistic model of the relationship between the response and the covariates, their parameters can have a theoretical interpreta...
The linear mixed-effects model formulation used in Chapters 2 and 4 allows considerable flexibility in the specification of the random-effects structure, but restricts the within-group errors to be independent, identically distributed random variables with mean zero and constant variance. As illustrated in Chapters 2 and 4, this basic linear mixed-e...
This chapter gives an overview of the nonlinear mixed-effects (NLME) model, introducing its main concepts and ideas through the analysis of real-data examples. The emphasis is on presenting the motivation for using NLME models when analyzing grouped data, while introducing some of the key features in the nlme library for fitting and analyzing such...
This chapter describes the capabilities available in the nlme library for fitting and analyzing linear mixed-effects models with uncorrelated, homoscedastic within-group errors. The lme function, for fitting linear mixed-effects models, is described in detail and its various capabilities and associated methods are illustrated through the analyses o...
This chapter describes the nonlinear modeling capabilities available in the nlme library. A brief review of the nonlinear least-squares function nls in S is presented and self-starting models for automatically producing starting values for the coefficients in a nonlinear model are introduced and illustrated. The nlsList function for fitting separat...
In this chapter we have shown examples of constructing, summarizing, and graphically displaying groupedData objects. These objects include the data, stored as a data frame, and a formula that designates different variables as a response, a primary covariate, and as one or more grouping factors. Other variables can be designated as outer or inner f...
This chapter presents the theoretical foundations of the nonlinear mixed-effects model for single- and multilevel grouped data, including the general model formulation and its underlying distributional assumptions. Efficient computational methods for maximum likelihood estimation in the NLME model are described and discussed. Different approximatio...
In this chapter the linear mixed-effects model of Chapters 2 and 4 is extended to include heteroscedastic, correlated within-group errors. We show how the estimation and computational methods of Chapter 2 can be extended to this more general linear mixed-effects model. We introduce several classes of variance functions to characterize heteroscedast...
this document is to describe some of the capabilities in Version 3.0 of the nlme software and to give examples of their usage. A detailed description of the various functions, classes, and methods can be found in the corresponding help files, which are available on-line. The PostScript file HelpFunc.ps, included with the nlme
This document describes the data sets included in the NLMEDATA subdirectory of the nlme distribution. We have adopted naming conventions for the columns in the data frames and for the levels in the factors within the data frames, especially the grouping factor. These naming conventions are not required. We find that they help us remember the roles...
Software design for population pharmacokinetic analysis presents many challenging problems. The data can range from relatively small data sets with simple structure to comparatively large data sets collected in routine clinical settings. These larger collections often have a complicated structure which makes graphics or tabular presentation of the...
A multilevel mixed-effects model has random effects at each of several nested levels of grouping of the observed responses. We may use these, for example, when modelling observations taken over time on students who are grouped into classes that are grouped into schools that are grouped into districts. If each of the distributions of the random effe...
This article is about the organization and visualization of data grouped according to several nested classification factors. We describe data structures and display methods we developed in the S-PLUS language for representing and plotting this type of data. In Section 1 we describe a groupedData class of objects used to represent repeated measures...
Software for exploring and modelling longitudinal data can be made much easier to use by incorporating an object-oriented design. Current versions of S-Plus provide some object-oriented capability but experimental versions of S emphasize an even stronger commitment to object orientation. These new capabilities, combined with the development of Trel...
The estimation of variance-covariance matrices through optimization of an objective function, such as a log-likelihood function, is usually a difficult numerical problem. Since the estimates should be positive semi-definite matrices, we must use constrained optimization, or employ a parametrization that enforces this condition. We describe here fiv...
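The idea of "a parametrization that enforces this condition" can be sketched with one common choice, a log-Cholesky style mapping. An unconstrained vector fills a lower-triangular factor L whose diagonal entries are exponentiated, and the matrix L L′ is then positive definite for any input. The abstract is truncated before its own list of parametrizations, so this is an illustration of the general technique rather than a restatement of the paper. The function name and the row-by-row ordering of the parameters are assumptions.

```python
import math

def log_cholesky_to_cov(theta, n):
    """Map an unconstrained vector of length n(n+1)/2 to an n x n covariance matrix."""
    if len(theta) != n * (n + 1) // 2:
        raise ValueError("theta must have n(n+1)/2 elements")
    L = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1):
            # Diagonal entries are stored on the log scale so that they stay positive.
            L[i][j] = math.exp(theta[k]) if i == j else theta[k]
            k += 1
    # Sigma = L L' is symmetric and positive definite by construction.
    return [[sum(L[i][m] * L[j][m] for m in range(n)) for j in range(n)]
            for i in range(n)]

# theta fills L row by row: L = [[1, 0], [0.5, 2]].
S = log_cholesky_to_cov([0.0, 0.5, math.log(2.0)], 2)
```

An optimizer can then search over theta without constraints, at the cost of a less direct interpretation of the individual parameters.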
serial correlation structure: ar1 variance function: power Ovary.fit1 fixed: (Intercept),sin(2 * pi * Time),cos(2 * pi * Time) random: (Intercept),sin(2 * pi * Time),cos(2 * pi * Time) block: list(1, 2:3) covariance structure: identity,diagonal serial correlation structure: ar2 variance function: power Model Df AIC BIC Loglik Test Lik.Ratio P value...
Contents: 1 Introduction * 2 The lme class and related methods * 2.1 The lme function * 2.2 The print, summary, and anova methods * 2.3 The plot method * 2.4 Other methods...
In this paper we propose using free-knot splines with a common knot vector to approximate multiresponse model functions and their associated error variance-covariance matrix. A “loose coupling” type of algorithm for minimizing the determinant parameter estimation criterion is constructed for multiresponse spline models. The algorithm evaluates the...
INTRODUCTION The estimation of variance-covariance matrices through optimization of an objective function, such as a log-likelihood function, is usually a difficult numerical problem, since one must ensure that the resulting estimate is positive semi-definite. This kind of estimation problem occurs, for example, in the analysis of linear and nonlin...
Schwenke and Milliken (1991) describe a problem of fitting pH versus time since death for beef carcasses. The model is $\mathrm{pH} = \theta_1 + \theta_2 \exp(-t/\theta_3)$. They wish to find a confidence interval on the time when pH = 6.0. Note that the data used to fit the model are pH as a function of time. The inference is on time as a function of pH. 5 Nonline...
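Inference on time as a function of pH starts from inverting the fitted curve. Assuming the model reads pH = θ1 + θ2 exp(−t/θ3), the time at which a target pH is reached is t = −θ3 ln((pH − θ1)/θ2). The sketch below computes only this point estimate. The parameter values are hypothetical, chosen to exercise the functions, and are not estimates from the beef carcass data. A confidence interval would additionally need the uncertainty in the parameter estimates.

```python
import math

def ph_model(t, th1, th2, th3):
    # Assumed model form: pH = th1 + th2 * exp(-t / th3).
    return th1 + th2 * math.exp(-t / th3)

def time_at_ph(ph, th1, th2, th3):
    # Invert the model for t; the target must lie strictly between
    # the asymptote th1 and the initial value th1 + th2.
    ratio = (ph - th1) / th2
    if ratio <= 0 or ratio >= 1:
        raise ValueError("target pH is not attained at a positive time")
    return -th3 * math.log(ratio)

# Hypothetical parameter values, not fitted to any data.
theta = (5.5, 1.5, 3.0)
t6 = time_at_ph(6.0, *theta)
```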
We describe extensions to the nonlinear modeling facilities in release 3 of S and Splus. These extensions provide classes and methods for fitting and analyzing nonlinear mixed effects models with the two-stage estimation method described by Lindstrom and Bates (1990). They are implemented in a combination of S and C code and complement the classes a...
Mixed-effects models provide a powerful and flexible tool for analyzing clustered data, such as repeated measures data and nested designs. We describe a set of S functions, classes, and methods for the analysis of both linear and nonlinear mixed-effects models. These extend the linear and nonlinear modeling facilities available in release 3 of S a...
CORRECTION TO LINDSTROM AND BATES (1988)
CORRECTION TO SEVERINI AND STANISWALIS (1994)
CORRECTION TO KOLASSA AND TANNER (1994)
Nonlinear mixed effects models involve both fixed effects and random effects. Model building for nonlinear mixed effects is the process of determining the characteristics of both the fixed and the random effects so as to give an adequate but parsimonious model. We describe procedures based on information criterion statistics for comparing different...
Introduction. Several different nonlinear mixed effects models and estimation methods for their parameters have been proposed in recent years (Sheiner and Beal, 1980; Mallet, Mentre, Steimer and Lokiek, 1988; Lindstrom and Bates, 1990; Vonesh and Carter, 1992; Davidian and Gallant, 1992; Wakefield, Smith, Racine-Poon and Gelfand, 1994). We consider...
We show how to obtain esthetically pleasing contour plots using New S and GCVPACK. With these codes, thin plate splines can easily be used to interpolate “exact” data, and to produce smoothly varying contour plots, with none of the jagged corners that plague many other interpolation methods. It is noted that GCVPACK can also be used to interpolate...
Profile methods, i.e., profile values, profile traces, profile transformations, and profile diagnostic plots, are introduced in a general setting and an interesting feature of profile transforms, the boxing property, is pointed out. Profile methods are then discussed in the frameworks of likelihood and Bayesian inference. Links with approximate Bay...
We are fortunate to have many different tools -- statistical packages, spreadsheets, or database systems -- for analysis and presentation of data. But often the first step in an analysis is manipulating or massaging the original data into a form that can be read by the package. Experienced Unix users may use sed, awk, or shell scripts for this. R...
Nonlinear least squares problems where some parameters affect only some of the predicted responses while other parameters affect all the predicted responses are said to be loosely coupled. The most common example is a model that is fitted to responses from several individuals with some of the parameters being common to all individuals. Exploiting t...
The aim of model building is to determine the ‘correct’ model, which means that the equation describing the phenomenon under study includes all the important factors, in the correct form, and excludes unimportant factors. Practically, of course, we can only use the data at hand to fit a model which is ‘adequate’. In linear and nonlinear regression,...
We present an approximate inference procedure for the multiresponse regression model. This can be used to determine approximate confidence regions or intervals and to formulate diagnostic tools such as standardized residuals and influence measures. The overall approximation is based on approximations at two stages. At the first stage, the multiresp...