Package Lme4: Linear Mixed-Effects Models Using Eigen and S4

Technical Report in Journal of Statistical Software 67 · January 2014
Abstract
Description Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the 'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".
Package ‘lme4’
August 25, 2015
Version 1.1-10
Title Linear Mixed-Effects Models using 'Eigen' and S4
Maintainer Ben Bolker <bbolker+lme4@gmail.com>
Contact LME4 Authors <lme4-authors@lists.r-forge.r-project.org>
Author Douglas Bates [aut], Martin Maechler [aut],
Ben Bolker [aut, cre], Steven Walker [aut],
Rune Haubo Bojesen Christensen [ctb],
Henrik Singmann [ctb], Bin Dai [ctb],
Gabor Grothendieck [ctb]
Description Fit linear and generalized linear mixed-effects models.
The models and their components are represented using S4 classes and
methods. The core computational algorithms are implemented using the
'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".
Depends R (>= 3.0.2), Matrix (>= 1.1.1), methods, stats
LinkingTo Rcpp (>= 0.10.5), RcppEigen
Imports graphics, grid, splines, utils, parallel, MASS, nlme, lattice,
minqa (>= 1.1.15), nloptr (>= 1.0.4)
Suggests knitr, boot, PKPDmodels, MEMSS, testthat (>= 0.8.1), ggplot2,
mlmRev, optimx (>= 2013.8.6), gamm4, pbkrtest, HSAUR2, numDeriv
VignetteBuilder knitr
LazyData yes
License GPL (>=2)
URL https://github.com/lme4/lme4/ http://lme4.r-forge.r-project.org/
BugReports https://github.com/lme4/lme4/issues
R topics documented:
lme4-package
Arabidopsis
bootMer
cake
cbpp
confint.merMod
convergence
devcomp
drop1.merMod
dummy
Dyestuff
expandDoubleVerts
factorize
findbars
fixef
fortify
getME
GHrule
glmer
glmer.nb
glmerLaplaceHandle
glmFamily
glmFamily-class
golden-class
GQdk
grouseticks
hatvalues.merMod
InstEval
isNested
isREML
lmer
lmerControl
lmList
lmList4-class
lmResp
lmResp-class
merMod-class
merPredD
merPredD-class
mkdevfun
mkMerMod
mkRespMod
mkReTrms
mkSimulateTemplate
mkVarCorr
modular
NelderMead
NelderMead-class
ngrps
nlformula
nlmer
nloptwrap
nobars
Pastes
Penicillin
plot.merMod
plots.thpr
predict.merMod
profile-methods
prt-utilities
pvalues
ranef
refit
refitML
rePos
rePos-class
residuals.merMod
sigma
simulate.merMod
sleepstudy
subbars
troubleshooting
VarCorr
varianceProf
vcconv
VerbAgg
Index
lme4-package Linear, generalized linear, and nonlinear mixed models
Description
lme4 provides functions for fitting and analyzing mixed models: linear (lmer), generalized linear
(glmer) and nonlinear (nlmer).
Differences between nlme and lme4
lme4 covers approximately the same ground as the earlier nlme package. The most important
differences are:
• lme4 uses modern, efficient linear algebra methods as implemented in the Eigen package, and
uses reference classes to avoid undue copying of large objects; it is therefore likely to be faster
and more memory-efficient than nlme.
• lme4 includes generalized linear mixed model (GLMM) capabilities, via the glmer function.
• lme4 does not currently implement nlme’s features for modeling heteroscedasticity and
correlation of residuals.
• lme4 does not currently offer the same flexibility as nlme for composing complex
variance-covariance structures, but it does implement crossed random effects in a way that is
both easier for the user and much faster.
• lme4 offers built-in facilities for likelihood profiling and parametric bootstrapping.
• lme4 is designed to be more modular than nlme, making it easier for downstream package
developers and end-users to re-use its components for extensions of the basic mixed model
framework. It also allows more flexibility for specifying different functions for optimizing
over the random-effects variance-covariance parameters.
• lme4 is not (yet) as well-documented as nlme.
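As a minimal sketch of the crossed random effects mentioned in the list above, the call below fits two fully crossed grouping factors using the Penicillin data shipped with lme4. The object names fm_crossed, n_levels and cell_counts are illustrative only.

```r
library(lme4)

## Penicillin: each of 6 samples is assayed on each of 24 plates,
## so 'plate' and 'sample' are crossed rather than nested.
fm_crossed <- lmer(diameter ~ 1 + (1 | plate) + (1 | sample),
                   data = Penicillin)

## Number of levels of each grouping factor
n_levels <- ngrps(fm_crossed)
print(n_levels)

## Check that the design is fully crossed: every plate/sample cell occurs once
cell_counts <- xtabs(~ plate + sample, data = Penicillin)
print(all(cell_counts == 1))
```

Each crossed factor is simply given its own (1 | factor) term; no special syntax is needed to declare the crossing.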
Differences between current (1.0.+) and previous versions of lme4
• [gn]lmer now produces objects of class merMod rather than class mer as before.
• The new version uses a combination of S3 and reference classes (see ReferenceClasses,
merPredD-class, and lmResp-class) as well as S4 classes; partly for this reason it is more
interoperable with nlme.
• The internal structure of [gn]lmer is now more modular, allowing finer control of the different
steps of argument checking; construction of design matrices and data structures; parameter
estimation; and construction of the final merMod object (see modular).
• Profiling and parametric bootstrapping are new in the current version.
• The new version of lme4 does not provide an mcmcsamp (post-hoc MCMC sampling) method,
because this was deemed to be unreliable. Alternatives for computing p-values include
parametric bootstrapping (bootMer) or methods implemented in the pbkrtest package and
leveraged by the lmerTest package and the Anova function in the car package (see pvalues for
more details).
Caveats and trouble-shooting
Some users who have previously installed versions of the RcppEigen and minqa packages may
encounter segmentation faults (!!); the solution is to make sure to re-install these packages
before installing lme4. (Because the problem is not with the explicit version of the packages,
but with running packages that were built with different versions of Rcpp in conjunction with
each other, simply making sure you have the latest version, or using update.packages, will
not necessarily solve the problem; you must actually re-install the packages. The problem is
most likely with minqa.)
Arabidopsis Arabidopsis clipping/fertilization data
Description
Data on genetic variation in responses to fertilization and simulated herbivory in Arabidopsis
Usage
data("Arabidopsis")
Format
A data frame with 625 observations on the following 8 variables.
reg region: a factor with 3 levels NL (Netherlands), SP (Spain), SW (Sweden)
popu population: a factor with the form n.R representing a population in region R
gen genotype: a factor with 24 (numeric-valued) levels
rack a nuisance factor with 2 levels, one for each of two greenhouse racks
nutrient fertilization treatment/nutrient level (1, minimal nutrients or 8, added nutrients)
amd simulated herbivory or "clipping" (apical meristem damage): unclipped (baseline) or clipped
status a nuisance factor for germination method (Normal, Petri.Plate, or Transplant)
total.fruits total fruit set per plant (integer)
Source
From Josh Banta
References
Joshua A. Banta, Martin H. H. Stevens, and Massimo Pigliucci (2010) A comprehensive test of the
’limiting resources’ framework applied to plant tolerance to apical meristem damage. Oikos 119(2),
359–369; http://dx.doi.org/10.1111/j.1600-0706.2009.17726.x
Examples
data(Arabidopsis)
summary(Arabidopsis[,"total.fruits"])
table(gsub("[0-9].","",levels(Arabidopsis[,"popu"])))
library(lattice)
stripplot(log(total.fruits+1) ~ amd|nutrient, data = Arabidopsis,
          groups = gen,
          strip=strip.custom(strip.names=c(TRUE,TRUE)),
          type=c('p','a'), ## points and panel-average value --
                           ## see ?panel.xyplot
          scales=list(x=list(rot=90)),
          main="Panel: nutrient, Color: genotype")
bootMer Model-based (Semi-)Parametric Bootstrap for Mixed Models
Description
Perform model-based (Semi-)parametric bootstrap for mixed models.
Usage
bootMer(x, FUN, nsim = 1, seed = NULL, use.u = FALSE,
type = c("parametric", "semiparametric"),
verbose = FALSE, .progress = "none", PBargs = list(),
parallel = c("no", "multicore", "snow"),
ncpus = getOption("boot.ncpus", 1L), cl = NULL)
Arguments
x a fitted merMod object: see lmer, glmer, etc.
FUN a function taking a fitted merMod object as input and returning the statistic of
interest, which must be a (possibly named) numeric vector.
nsim number of simulations, positive integer; the bootstrap B (or R).
seed optional argument to set.seed.
use.u logical, indicating whether the spherical random effects should be simulated /
bootstrapped as well. If TRUE, they are not changed, and all inference is conditional
on these values. If FALSE, new normal deviates are drawn (see Details).
type character string specifying the type of bootstrap, "parametric" or "semiparametric";
partial matching is allowed.
verbose logical indicating if progress should print output
.progress character string - type of progress bar to display. Default is "none"; the function
will look for a relevant *ProgressBar function, so "txt" will work in general;
"tk" is available if the tcltk package is loaded; or "win" on Windows systems.
Progress bars are disabled (with a message) for parallel operation.
PBargs a list of additional arguments to the progress bar function (the package authors
like list(style=3)).
parallel The type of parallel operation to be used (if any). If missing, the default is taken
from the option "boot.parallel" (and if that is not set, "no").
ncpus integer: number of processes to be used in parallel operation: typically one
would choose this to be the number of available CPUs.
cl An optional parallel or snow cluster for use if parallel = "snow". If not
supplied, a cluster on the local machine is created for the duration of the boot
call.
Details
The semi-parametric variant is only partially implemented, and we only provide a method for lmer
and glmer results.
The working name for bootMer() was “simulestimate()”, as it is an extension of simulate (see
simulate.merMod), but we want to emphasize its potential for valid inference.
• If use.u is FALSE and type is "parametric", each simulation generates new values of both
the “spherical” random effects u and the i.i.d. errors ε, using rnorm() with parameters
corresponding to the fitted model x.
• If use.u is TRUE and type=="parametric", only the i.i.d. errors (or, for GLMMs, response
values drawn from the appropriate distributions) are resampled, with the values of u staying
fixed at their estimated values.
• If use.u is TRUE and type=="semiparametric", the i.i.d. errors are sampled from the
distribution of (response) residuals. (For GLMMs, the resulting sample will no longer have the
same properties as the original sample, and the method may not make sense; a warning is
generated.) The semiparametric bootstrap is currently an experimental feature, and therefore
may not be stable.
• The case where use.u is FALSE and type=="semiparametric" is not implemented; Morris
(2002) suggests that resampling from the estimated values of u is not good practice.
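The fully parametric case with use.u = FALSE can be sketched by hand with simulate() and refit(). This illustrates the idea only and is not claimed to be the internal implementation of bootMer(); the helper stat_fun and the object names are hypothetical.

```r
library(lme4)

fm <- lmer(Yield ~ 1 + (1 | Batch), Dyestuff, REML = FALSE)

## Statistic of interest: fixed intercept and residual standard deviation
stat_fun <- function(m) c(beta = unname(fixef(m)), sigma = sigma(m))

## Draw new responses from the fitted model; use.u = FALSE simulates new
## random effects as well as new residual errors
ysim <- simulate(fm, nsim = 20, seed = 101, use.u = FALSE)

## Refit the model to each simulated response and collect the statistics
boot_stats <- t(sapply(ysim, function(y) stat_fun(refit(fm, newresp = y))))

dim(boot_stats)        # one row per simulation, one column per statistic
colMeans(boot_stats)   # bootstrap means, to compare with stat_fun(fm)
```

In practice bootMer() is preferable, since it also returns an object of class "boot" that works directly with boot.ci().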
Value
an object of S3 class "boot", compatible with boot package’s boot() result.
References
Davison, A.C. and Hinkley, D.V. (1997) Bootstrap Methods and Their Application. Cambridge
University Press.
Morris, J. S. (2002). The BLUPs Are Not ‘best’ When It Comes to Bootstrapping. Statistics &
Probability Letters 56(4): 425–430. doi:10.1016/S0167-7152(02)00041-X.
See Also
• confint.merMod, for a more specific approach to bootstrap confidence intervals on
parameters.
• refit(), or PBmodcomp() from the pbkrtest package, for parametric bootstrap comparison
of models.
• boot(), and then boot.ci, from the boot package.
• profile-methods, for likelihood-based inference, including confidence intervals.
• pvalues, for more general approaches to inference and p-value computation in mixed models.
Examples
fm01ML <- lmer(Yield ~ 1|Batch, Dyestuff, REML = FALSE)
## see ?"profile-methods"
mySumm <- function(.) {
    s <- sigma(.)
    c(beta = getME(., "beta"), sigma = s, sig01 = unname(s * getME(., "theta")))
}
(t0 <- mySumm(fm01ML)) # just three parameters
## alternatively:
mySumm2 <- function(.) {
    c(beta = fixef(.), sigma = sigma(.), sig01 = sqrt(unlist(VarCorr(.))))
}
set.seed(101)
## 3.8s (on a 5600 MIPS 64bit fast(year 2009) desktop "AMD Phenom(tm) II X4 925"):
system.time( boo01 <- bootMer(fm01ML, mySumm, nsim = 100) )
## to "look" at it
require("boot") ## a recommended package, i.e. *must* be there
boo01
## note large estimated bias for sig01
## (~30% low, decreases _slightly_ for nsim = 1000)
## extract the bootstrapped values as a data frame ...
head(as.data.frame(boo01))
## ------ Bootstrap-based confidence intervals ------------
## warnings about "Some ... intervals may be unstable" go away
## for larger bootstrap samples, e.g. nsim=500
## intercept
(bCI.1 <- boot.ci(boo01, index=1, type=c("norm", "basic", "perc")))# beta
## Residual standard deviation - original scale:
(bCI.2 <- boot.ci(boo01, index=2, type=c("norm", "basic", "perc")))
## Residual SD - transform to log scale:
(bCI.2L <- boot.ci(boo01, index=2, type=c("norm", "basic", "perc"),
h = log, hdot = function(.) 1/., hinv = exp))
## Among-batch variance:
(bCI.3 <- boot.ci(boo01, index=3, type=c("norm", "basic", "perc"))) # sig01
## Extract all CIs (somewhat awkward)
bCI.tab <- function(b, ind = length(b$t0), type = "perc", conf = 0.95) {
    btab0 <- t(sapply(as.list(seq(ind)),
                      function(i)
                          boot.ci(b, index = i, conf = conf, type = type)$percent))
    btab <- btab0[, 4:5]
    rownames(btab) <- names(b$t0)
    a <- (1 - conf)/2
    a <- c(a, 1 - a)
    pct <- stats:::format.perc(a, 3)
    colnames(btab) <- pct
    return(btab)
}
bCI.tab(boo01)
## Graphical examination:
plot(boo01,index=3)
## Check stored values from a longer (1000-replicate) run:
load(system.file("testdata","boo01L.RData",package="lme4"))
plot(boo01L,index=3)
mean(boo01L$t[,"sig01"]==0) ## note point mass at zero!
cake Breakage Angle of Chocolate Cakes
Description
Data on the breakage angle of chocolate cakes made with three different recipes and baked at six
different temperatures. This is a split-plot design with the recipes being whole-units and the
different temperatures being applied to sub-units (within replicates). The experimental notes
suggest that the replicate numbering represents temporal ordering.
Format
A data frame with 270 observations on the following 5 variables.
replicate a factor with levels 1 to 15
recipe a factor with levels A, B and C
temperature an ordered factor with levels 175 < 185 < 195 < 205 < 215 < 225
angle a numeric vector giving the angle at which the cake broke.
temp numeric value of the baking temperature (degrees F).
Details
The replicate factor is nested within the recipe factor, and temperature is nested within replicate.
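A small check of the nesting structure, assuming lme4 is attached: the replicate labels 1 to 15 are reused within every recipe, so replicate alone does not identify a batch, which is why a combined recipe-by-replicate grouping factor is needed. The object names nested_raw, batch and nested_batch are illustrative.

```r
library(lme4)

## Replicate labels are reused across recipes, so 'replicate' alone is not
## nested within 'recipe' as coded
nested_raw <- isNested(cake$replicate, cake$recipe)

## The interaction gives each recipe-by-replicate batch a unique label
batch <- interaction(cake$recipe, cake$replicate, drop = TRUE)
nested_batch <- isNested(batch, cake$recipe)

print(c(raw = nested_raw, batch = nested_batch, n_batches = nlevels(batch)))
```

A grouping term such as (1|recipe:replicate) in a model formula builds the same combined factor implicitly.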
Source
Original data were presented in Cook (1938), and reported in Cochran and Cox (1957, p. 300).
Also cited in Lee, Nelder and Pawitan (2006).
References
Cook, F. E. (1938) Chocolate cake, I. Optimum baking temperature. Master’s Thesis, Iowa State
College.
Cochran, W. G., and Cox, G. M. (1957) Experimental designs, 2nd Ed. New York, John Wiley &
Sons.
Lee, Y., Nelder, J. A., and Pawitan, Y. (2006) Generalized linear models with random effects.
Unified analysis via H-likelihood. Boca Raton, Chapman and Hall/CRC.
Examples
str(cake)
## 'temp' is continuous, 'temperature' an ordered factor with 6 levels
(fm1 <- lmer(angle ~ recipe * temperature + (1|recipe:replicate), cake, REML= FALSE))
(fm2 <- lmer(angle ~ recipe + temperature + (1|recipe:replicate), cake, REML= FALSE))
(fm3 <- lmer(angle ~ recipe + temp + (1|recipe:replicate), cake, REML= FALSE))
## and now "choose" :
anova(fm3, fm2, fm1)
cbpp Contagious bovine pleuropneumonia
Description
Contagious bovine pleuropneumonia (CBPP) is a major disease of cattle in Africa, caused by a
mycoplasma. This dataset describes the serological incidence of CBPP in zebu cattle during a
follow-up survey implemented in 15 commercial herds located in the Boji district of Ethiopia. The
goal of the survey was to study the within-herd spread of CBPP in newly infected herds. Blood
samples were quarterly collected from all animals of these herds to determine their CBPP status.
These data were used to compute the serological incidence of CBPP (new cases occurring during a
given time period). Some data are missing (lost to follow-up).
Format
A data frame with 56 observations on the following 4 variables.
herd A factor identifying the herd (1 to 15).
incidence The number of new serological cases for a given herd and time period.
size A numeric vector describing herd size at the beginning of a given time period.
period A factor with levels 1 to 4.
Details
Serological status was determined using a competitive enzyme-linked immuno-sorbent assay (cELISA).
Source
Lesnoff, M., Laval, G., Bonnet, P., Abdicho, S., Workalemahu, A., Kifle, D., Peyraud, A., Lancelot,
R., Thiaucourt, F. (2004) Within-herd spread of contagious bovine pleuropneumonia in Ethiopian
highlands. Preventive Veterinary Medicine 64, 27–40.
Examples
## response as a matrix
(m1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
family = binomial, data = cbpp))
## response as a vector of probabilities and usage of argument "weights"
m1p <- glmer(incidence / size ~ period + (1 | herd), weights = size,
family = binomial, data = cbpp)
## Confirm that these are equivalent:
stopifnot(all.equal(fixef(m1), fixef(m1p), tolerance = 1e-5),
all.equal(ranef(m1), ranef(m1p), tolerance = 1e-5))
## GLMM with individual-level variability (accounting for overdispersion)
cbpp$obs <- 1:nrow(cbpp)
(m2 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd) + (1|obs),
family = binomial, data = cbpp))
confint.merMod Compute Confidence Intervals for Parameters of a [ng]lmer Fit
Description
Compute confidence intervals on the parameters of a *lmer() model fit (of class "merMod").
Usage
## S3 method for class 'merMod'
confint(object, parm, level = 0.95,
method = c("profile", "Wald", "boot"), zeta,
nsim = 500,
boot.type = c("perc","basic","norm"),
FUN = NULL, quiet = FALSE,
oldNames = TRUE, ...)
Arguments
object a fitted [ng]lmer model
parm parameters for which intervals are sought. Specified by an integer vector of
positions, character vector of parameter names, or (unless doing parametric
bootstrapping with a user-specified bootstrap function) "theta_" or "beta_"
to specify variance-covariance or fixed effects parameters only: see the which
parameter of profile.
level confidence level < 1, typically above 0.90.
method a character string determining the method for computing the confidence intervals.
zeta (for method = "profile" only:) likelihood cutoff (if not specified, as by default,
computed from level).
nsim number of simulations for parametric bootstrap intervals.
FUN bootstrap function; if NULL, an internal function that returns the fixed-effect
parameters as well as the random-effect parameters on the standard deviation/correlation
scale will be used. See bootMer for details.
boot.type bootstrap confidence interval type, as described in boot.ci. (Methods ‘stud’
and ‘bca’ are unavailable because they require additional components to be calculated.)
quiet (logical) suppress messages about computationally intensive profiling?
oldNames (logical) use old-style names for variance-covariance parameters, e.g. ".sig01",
rather than newer (more informative) names such as "sd_(Intercept)|Subject"?
(See signames argument to profile).
... additional parameters to be passed to profile.merMod or bootMer, respectively.
Details
Depending on the method specified, confint() computes confidence intervals by
"profile": computing a likelihood profile and finding the appropriate cutoffs based on the
likelihood ratio test;
"Wald": approximating the confidence intervals (of fixed-effect parameters only; CIs for all
variance-covariance parameters will be returned as NA) based on the estimated local curvature of
the likelihood surface;
"boot": performing parametric bootstrapping with confidence intervals computed from the
bootstrap distribution according to boot.type (see bootMer, boot.ci).
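As a hedged illustration of the "Wald" method, the fixed-effect intervals can be reproduced by hand from the estimates and their standard errors, assuming normal quantiles are used. The object names wald_by_hand and wald_builtin are illustrative.

```r
library(lme4)

fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

est <- fixef(fm)                          # fixed-effect estimates
se  <- sqrt(diag(as.matrix(vcov(fm))))    # their estimated standard errors
z   <- qnorm(1 - (1 - 0.95) / 2)          # normal quantile for a 95% interval

wald_by_hand <- cbind(lower = est - z * se, upper = est + z * se)

## Compare with the built-in method, keeping only the fixed-effect rows
wald_builtin <- confint(fm, method = "Wald")[names(est), ]

print(wald_by_hand)
print(wald_builtin)
```

Because these intervals rely only on local curvature, they are fast but can be poor approximations when the likelihood surface is far from quadratic.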
Value
a numeric table (matrix with column and row names) of confidence intervals; the confidence
intervals are computed on the standard deviation scale.
Note
The default method "profile" amounts to
confint(profile(object, which=parm, signames=oldNames, ...),
        level, zeta)
where the profile method profile.merMod does almost all the computations. Therefore it is
typically advisable to store the profile(.) result, say in pp, and then use confint(pp, level=*),
e.g., for different levels.
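A minimal sketch of that advice, using the small Dyestuff model so that profiling is quick; the names pp, ci90 and ci99 are illustrative.

```r
library(lme4)

fm <- lmer(Yield ~ 1 + (1 | Batch), Dyestuff, REML = FALSE)

## Profiling is the expensive step; do it once and keep the result
pp <- profile(fm)

## Reuse the stored profile for several confidence levels
ci90 <- confint(pp, level = 0.90)
ci99 <- confint(pp, level = 0.99)

print(ci90)
print(ci99)
```

Each confint(pp, ...) call only interpolates the stored profile, so additional levels cost almost nothing.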
Examples
fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
fm1W <- confint(fm1, method="Wald")# very fast, but ....
fm1W
testLevel <- if (nzchar(s <- Sys.getenv("LME4_TEST_LEVEL"))) as.numeric(s) else 1
if(interactive() || testLevel >= 3) {
## ~20 seconds, MacBook Pro laptop
system.time(fm1P <- confint(fm1, method="profile", ## default
oldNames = FALSE))
## ~ 40 seconds
system.time(fm1B <- confint(fm1,method="boot",
.progress="txt", PBargs=list(style=3)))
} else
load(system.file("testdata","confint_ex.rda",package="lme4"))
fm1P
fm1B
convergence Assessing Convergence for Fitted Models
Description
The lme4 package uses general-purpose nonlinear optimizers (e.g. Nelder-Mead or Powell’s BOBYQA
method) to estimate the variance-covariance matrices of the random effects. Assessing reliably
whether such algorithms have converged is difficult. For example, evaluating the
Karush-Kuhn-Tucker conditions (convergence criteria which in the simplest case of non-constrained
optimization reduce to showing that the gradient is zero and the Hessian is positive definite) is
challenging because of the difficulty of evaluating the gradient and Hessian.
We (the lme4 authors and maintainers) are still in the process of finding the best strategies for testing
convergence. Some of the relevant issues are
• The gradient and Hessian are the basic ingredients of KKT-style testing, but when they have
to be estimated by finite differences (as in the case of lme4; direct computation of derivatives
based on analytic expressions may eventually be available for some special classes, but we
have not yet implemented them) they may not be sufficiently accurate for reliable convergence
testing.
• The Hessian computation in particular represents a difficult tradeoff between computational
expense and accuracy. At present the Hessian computations used for convergence checking
(and for estimating standard errors of fixed-effect parameters for GLMMs) follow the ordinal
package in using a naive but computationally cheap centered finite difference computation
(with a fixed step size of 10^-4). A more reliable but more expensive approach is to use
Richardson extrapolation, as implemented in the numDeriv package.
• It is important to scale the estimated gradient at the estimate appropriately; two reasonable
approaches are
1. don’t scale random-effects (Cholesky) gradients, since these are essentially already unitless
(for LMMs they are scaled relative to the residual variance; for GLMMs they are
scaled relative to the sampling variance of the conditional distribution); for GLMMs,
scale fixed-effect gradients by the standard deviations of the corresponding input
variable, or
2. scale gradients by the inverse Cholesky factor of the Hessian, equivalent to scaling by
the estimated Wald standard error of the estimated parameters. The latter approach is
used in the current version of lme4; it has the disadvantage that it requires us to estimate
the Hessian (although the Hessian is required for reliable estimation of the fixed-effect
standard errors for GLMMs in any case).
• Exploratory analyses suggest that (1) the naive estimation of the Hessian may fail for large
data sets (number of observations greater than approximately 10^5); (2) the magnitude of the
scaled gradient increases with sample size, so that warnings will occur even for apparently
well-behaved fits with large data sets.
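The fixed-step centered finite-difference Hessian and the Cholesky-based gradient scaling described above can be sketched in base R. This is a generic illustration on a toy objective, not lme4's internal code; fd_hessian, fd_gradient and toy are hypothetical names.

```r
## Central finite-difference Hessian with a fixed step size h
fd_hessian <- function(f, x, h = 1e-4) {
  n <- length(x)
  H <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      ei <- replace(numeric(n), i, h)   # step along coordinate i
      ej <- replace(numeric(n), j, h)   # step along coordinate j
      H[i, j] <- (f(x + ei + ej) - f(x + ei - ej) -
                  f(x - ei + ej) + f(x - ei - ej)) / (4 * h^2)
    }
  }
  H
}

## Central finite-difference gradient
fd_gradient <- function(f, x, h = 1e-4) {
  sapply(seq_along(x), function(i) {
    ei <- replace(numeric(length(x)), i, h)
    (f(x + ei) - f(x - ei)) / (2 * h)
  })
}

## Toy objective with known Hessian matrix(c(2, 1, 1, 6), 2) and
## minimum at c(24, -26) / 11
toy <- function(p) (p[1] - 1)^2 + 3 * (p[2] + 2)^2 + p[1] * p[2]

x_hat <- c(24, -26) / 11
H <- fd_hessian(toy, x_hat)
g <- fd_gradient(toy, x_hat)

## Scale the gradient by the inverse Cholesky factor of the Hessian
scaled_grad <- solve(chol(H), g)
max(abs(scaled_grad))   # close to zero at the optimum
```

For a quadratic objective like this one the fixed-step scheme is accurate; for real deviance functions the truncation and rounding errors are harder to control, which is the motivation for the Richardson extrapolation alternative.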
If you do see convergence warnings, and want to trouble-shoot/double-check the results, the
following steps are recommended (examples are given below):
• double-check the model specification and the data for mistakes
• center and scale continuous predictor variables (e.g. with scale)
• check for singularity: if any of the diagonal elements of the Cholesky factor are zero or very
small, the convergence testing methods may be inappropriate (see examples)
• double-check the Hessian calculation with the more expensive Richardson extrapolation method
(see examples)
• restart the fit from the apparent optimum, or from a point perturbed slightly away from the
optimum
• try all available optimizers (e.g. several different implementations of BOBYQA and
Nelder-Mead, L-BFGS-B from optim, nlminb, ...). While this will of course be slow for large
fits, we consider it the gold standard; if all optimizers converge to values that are practically
equivalent, then we would consider the convergence warnings to be false positives.
To quote Douglas Adams, we apologize for the inconvenience.
See Also
lmerControl
Examples
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
## 1. center and scale predictors:
ss.CS <- transform(sleepstudy, Days=scale(Days))
fm1.CS <- update(fm1, data=ss.CS)
## 2. check singularity
diag.vals <- getME(fm1,"theta")[getME(fm1,"lower") == 0]
any(diag.vals < 1e-6) # FALSE
## 3. recompute gradient and Hessian with Richardson extrapolation
devfun <- update(fm1, devFunOnly=TRUE)
if (isLMM(fm1)) {
    pars <- getME(fm1,"theta")
} else {
    ## GLMM: requires both random and fixed parameters
    pars <- getME(fm1, c("theta","fixef"))
}
if (require("numDeriv")) {
    cat("hess:\n"); print(hess <- hessian(devfun, unlist(pars)))
    cat("grad:\n"); print(grad <- grad(devfun, unlist(pars)))
    cat("scaled gradient:\n")
    print(scgrad <- solve(chol(hess), grad))
}
## compare with internal calculations:
fm1@optinfo$derivs
## 4. restart the fit from the original value (or
## a slightly perturbed value):
fm1.restart <- update(fm1, start=pars)
## 5. try all available optimizers
source(system.file("utils", "allFit.R", package="lme4"))
fm1.all <- allFit(fm1)
ss <- summary(fm1.all)
ss$ fixef ## extract fixed effects
ss$ llik ## log-likelihoods
ss$ sdcor ## SDs and correlations
ss$ theta ## Cholesky factors
ss$ which.OK ## which fits worked
devcomp Extract the deviance component list
Description
Return the deviance component list
Usage
devcomp(x)
Arguments
x a fitted model of class merMod
Details
A fitted model of class merMod has a devcomp slot as described in the value section.
Value
a list with components
dims a named integer vector of various dimensions
cmp a named numeric vector of components of the deviance
Note
This function is deprecated; use getME(., "devcomp") instead.
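A brief sketch of the recommended replacement, using the sleepstudy model from elsewhere in this manual; the object name dc is illustrative.

```r
library(lme4)

fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

dc <- getME(fm, "devcomp")

names(dc)                      # includes "dims" and "cmp"
dc$dims[c("n", "p", "q")]      # observations, fixed effects, random effects
dc$cmp                         # named numeric components of the deviance
```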
drop1.merMod Drop all possible single fixed-effect terms from a mixed effect model
Description
Drop allowable single terms from the model: see drop1 for details of how the appropriate scope for
dropping terms is determined.
Usage
## S3 method for class 'merMod'
drop1(object, scope, scale = 0,
test = c("none", "Chisq", "user"),
k = 2, trace = FALSE, sumFun, ...)
Arguments
object a fitted merMod object.
scope a formula giving the terms to be considered for adding or dropping.
scale Currently ignored (included for S3 method compatibility)
test should the results include a test statistic relative to the original model? The χ²
test is a likelihood-ratio test, which is approximate due to finite-size effects.
k the penalty constant in AIC
trace print tracing information?
sumFun a summary function to be used when test=="user". It must allow arguments
scale and k, but these may be ignored (e.g. specified in dots). The first two
arguments must be object, the full model fit, and objectDrop, a reduced model.
If objectDrop is missing, sumFun should return a vector with the appropriate
length and names (the actual contents are ignored).
... other arguments (ignored)
Details
drop1 relies on being able to find the appropriate information within the environment of the formula
of the original model. If the formula is created in an environment that does not contain the data, or
other variables passed to the original model (for example, if a separate function is called to define
the formula), then drop1 will fail. A workaround (see example below) is to manually specify an
appropriate environment for the formula.
Value
An object of class anova summarizing the differences in fit between the models.
Examples
fm1 <- lmer(Reaction~Days+(Days|Subject),sleepstudy)
## likelihood ratio tests
drop1(fm1,test="Chisq")
## use Kenward-Roger corrected F test, or parametric bootstrap,
## to test the significance of each dropped predictor
if (require(pbkrtest) && packageVersion("pbkrtest")>="0.3.8") {
KRSumFun <- function(object, objectDrop, ...) {
krnames <- c("ndf","ddf","Fstat","p.value","F.scaling")
r <- if (missing(objectDrop)) {
setNames(rep(NA,length(krnames)),krnames)
} else {
krtest <- KRmodcomp(object,objectDrop)
unlist(krtest$stats[krnames])
}
attr(r,"method") <- c("Kenward-Roger via pbkrtest package")
r
}
drop1(fm1,test="user",sumFun=KRSumFun)
if(lme4:::testLevel() >= 3) { ## takes about 16 sec
nsim <- 100
PBSumFun <- function(object, objectDrop, ...) {
pbnames <- c("stat","p.value")
r <- if (missing(objectDrop)) {
setNames(rep(NA,length(pbnames)),pbnames)
} else {
pbtest <- PBmodcomp(object,objectDrop,nsim=nsim)
unlist(pbtest$test[2,pbnames])
}
attr(r,"method") <- c("Parametric bootstrap via pbkrtest package")
r
}
system.time(drop1(fm1,test="user",sumFun=PBSumFun))
}
}
## workaround for creating a formula in a separate environment
createFormula <- function(resp, fixed, rand) {
f <- reformulate(c(fixed,rand),response=resp)
## use the parent (createModel) environment, not the
## environment of this function (which does not contain 'data')
environment(f) <- parent.frame()
f
}
createModel <- function(data) {
mf.final <- createFormula("Reaction", "Days", "(Days|Subject)")
lmer(mf.final, data=data)
}
drop1(createModel(data=sleepstudy))
dummy Dummy variables (experimental)
Description
Largely a wrapper for model.matrix that accepts a factor, f, and returns a dummy matrix with
nlevels(f)-1 columns (the first column is dropped by default). Useful whenever one wishes to
avoid the behaviour of model.matrix of always returning an nlevels(f)-column matrix, either by
the addition of an intercept column, or by keeping one column for all levels.
Usage
dummy(f, levelsToKeep)
Arguments
f An object coercible to factor.
levelsToKeep An optional character vector giving the subset of levels(f) to be converted to
dummy variables.
Value
A model.matrix with dummy variables as columns.
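The column-dropping behaviour described above can be checked on a toy factor. This is a minimal sketch; the factor values are made up for illustration and are not part of the package.

```r
library(lme4)

# A small three-level factor; the values are invented for this sketch.
f <- factor(c("a", "b", "c", "a", "b"))

# Default: the first level is dropped, leaving nlevels(f) - 1 columns.
d_default <- dummy(f)

# Keep a single level only, in the style of dummy(Sex, "Female").
d_b <- dummy(f, "b")

# For comparison, model.matrix() without an intercept keeps every level.
mm_full <- model.matrix(~ 0 + f)

ncol(d_default)  # nlevels(f) - 1
ncol(d_b)        # one indicator column
ncol(mm_full)    # nlevels(f)
```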
Examples
data(Orthodont,package="nlme")
lmer(distance ~ age + (age|Subject) +
(0+dummy(Sex, "Female")|Subject), data = Orthodont)
Dyestuff Yield of dyestuff by batch
Description
The Dyestuff data frame provides the yield of dyestuff (Naphthalene Black 12B) from 5 different
preparations from each of 6 different batches of an intermediate product (H-acid). The Dyestuff2
data were generated to have the same structure but with a large residual variance relative to the batch
variance.
Format
Data frames, each with 30 observations on the following 2 variables.
Batch a factor indicating the batch of the intermediate product from which the preparation was
created.
Yield the yield of dyestuff from the preparation (grams of standard color).
Details
The Dyestuff data are described in Davies and Goldsmith (1972) as coming from “an investigation
to find out how much the variation from batch to batch in the quality of an intermediate product
(H-acid) contributes to the variation in the yield of the dyestuff (Naphthalene Black 12B) made
from it. In the experiment six samples of the intermediate, representing different batches of works
manufacture, were obtained, and five preparations of the dyestuff were made in the laboratory from
each sample. The equivalent yield of each preparation as grams of standard colour was determined
by dye-trial.”
The Dyestuff2 data are described in Box and Tiao (1973) as illustrating “the case where
between-batches mean square is less than the within-batches mean square. These data had to be constructed
for although examples of this sort undoubtably occur in practice, they seem to be rarely published.”
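The contrast between the two data sets can be seen directly from one-way ANOVA mean squares. The following is a minimal sketch; the helper name ms is hypothetical and not part of lme4.

```r
library(lme4)

# Hypothetical helper: between- and within-batch mean squares
# from a one-way fixed-effects ANOVA table.
ms <- function(d) {
  tab <- anova(lm(Yield ~ Batch, data = d))
  c(between = tab["Batch", "Mean Sq"],
    within  = tab["Residuals", "Mean Sq"])
}

ms1 <- ms(Dyestuff)
ms2 <- ms(Dyestuff2)

# For Dyestuff2 the between-batches mean square should be the smaller one.
ms1
ms2
```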
Source
O.L. Davies and P.L. Goldsmith (eds), Statistical Methods in Research and Production, 4th ed.,
Oliver and Boyd, (1972), section 6.4
G.E.P. Box and G.C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, (1973),
section 5.1.2
Examples
require(lattice)
str(Dyestuff)
dotplot(reorder(Batch, Yield) ~ Yield, Dyestuff,
ylab = "Batch", jitter.y = TRUE, aspect = 0.3,
type = c("p", "a"))
dotplot(reorder(Batch, Yield) ~ Yield, Dyestuff2,
ylab = "Batch", jitter.y = TRUE, aspect = 0.3,
type = c("p", "a"))
(fm1 <- lmer(Yield ~ 1|Batch, Dyestuff))
(fm2 <- lmer(Yield ~ 1|Batch, Dyestuff2))
expandDoubleVerts Expand terms with ’||’ notation into separate ’|’ terms
Description
From the right hand side of a formula for a mixed-effects model, expand terms with the double
vertical bar operator into separate, independent random effect terms.
Usage
expandDoubleVerts(term)
Arguments
term a mixed-model formula
Value
the modified term
Note
Note that || works at the level of formula parsing. This fact can lead to results that may be confusing
when factors occur to the left of the || sign (more info at https://github.com/lme4/lme4/
issues/229).
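Because the expansion is purely syntactic, a factor to the left of || is treated exactly like a numeric covariate. The sketch below uses placeholder names (y, x, f, g); no data are needed, since only the formula is inspected. For a factor f, the resulting term (0 + f | g) is still a single term with one column per level, which appears to be the source of the confusion discussed in the linked issue.

```r
library(lme4)

# 'x' stands for a numeric covariate and 'f' for a factor; both are
# placeholder names, since only the formula is parsed here.
form_numeric <- y ~ x + (x || g)
form_factor  <- y ~ f + (f || g)

rhs_numeric <- expandDoubleVerts(form_numeric[[3]])
rhs_factor  <- expandDoubleVerts(form_factor[[3]])

# Both expansions have the same shape: (1 | g) plus (0 + <term> | g).
rhs_numeric
rhs_factor
```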
See Also
formula, model.frame, model.matrix.
Other utilities: mkRespMod, mkReTrms, nlformula, nobars, subbars
Examples
f<-y~x+(x||g)
# the right-hand side of f is,
f[[3]]
# the expanded right-hand side,
expandDoubleVerts(f[[3]])
factorize Attempt to convert grouping variables to factors
Description
If variables within a data frame are not factors, try to convert them. Not intended for end-user use;
this is a utility function that needs to be exported, for technical reasons.
Usage
factorize(x,frloc,char.only=FALSE)
Arguments
x a formula
frloc a data frame
char.only (logical) convert only character variables to factors?
Value
a copy of the data frame with factors converted
findbars Determine random-effects expressions from a formula
Description
From the right hand side of a formula for a mixed-effects model, determine the pairs of expressions
that are separated by the vertical bar operator. Also expand the slash operator in grouping factor
expressions and expand terms with the double vertical bar operator into separate, independent random
effect terms.
Usage
findbars(term)
Arguments
term a mixed-model formula
Value
pairs of expressions that were separated by vertical bars
Note
This function is called recursively on individual terms in the model, which is why the argument is
called term and not a name like form, indicating a formula.
See Also
formula, model.frame, model.matrix.
Other utilities: mkRespMod, mkReTrms, nlformula, nobars, subbars
Examples
findbars(f1 <- Reaction ~ Days + (Days | Subject))
## => list( Days | Subject )
## These two are equivalent:
findbars(y ~ Days + (1 | Subject) + (0 + Days | Subject))
findbars(y ~ Days + (Days || Subject))
## => list of length 2: list ( 1 | Subject , 0 + Days | Subject)
findbars(~ 1 + (1 | batch / cask))
## => list of length 2: list ( 1 | cask:batch , 1 | batch)
fixef Extract fixed-effects estimates
Description
Extract the fixed-effects estimates
Usage
## S3 method for class 'merMod'
fixef(object, add.dropped=FALSE, ...)
Arguments
object any fitted model object from which fixed effects estimates can be extracted.
add.dropped for models with rank-deficient design matrix, reconstitute the full-length
parameter vector by adding NA values in appropriate locations?
... optional additional arguments. Currently none are used in any methods.
Details
Extract the estimates of the fixed-effects parameters from a fitted model.
Value
a named, numeric vector of fixed-effects estimates.
Examples
fixef(lmer(Reaction ~ Days + (1|Subject) + (0+Days|Subject), sleepstudy))
fm2 <- lmer(Reaction ~ Days + Days2 + (1|Subject),
data=transform(sleepstudy,Days2=Days))
fixef(fm2,add.dropped=TRUE)
## first two parameters are the same ...
stopifnot(all.equal(fixef(fm2,add.dropped=TRUE)[1:2],
fixef(fm2)))
fortify add information to data based on a fitted model
Description
add information to data based on a fitted model
Usage
fortify.merMod(model, data = getData(model),
...)
Arguments
model fitted model
data original data set, if needed
... additional arguments
Details
fortify is a function defined in the ggplot2 package, q.v. for more details. fortify is not defined
here, and fortify.merMod is defined as a function rather than an S3 method, to avoid (1) inducing
a dependency on ggplot2 or (2) masking methods from ggplot2. This is currently an experimental
feature.
getME Extract or Get Generalized Components from a Fitted Mixed Effects
Model
Description
Extract (or “get”) “components” – in a generalized sense – from a fitted mixed-effects model, i.e.,
(in this version of the package) from an object of class "merMod".
Usage
getME(object, name, ...)
## S3 method for class 'merMod'
getME(object,
name = c("X", "Z", "Zt", "Ztlist", "mmList", "y", "mu", "u", "b",
"Gp", "Tp", "L", "Lambda", "Lambdat", "Lind", "Tlist",
"A", "RX", "RZX", "sigma", "flist",
"fixef", "beta", "theta", "ST", "REML", "is_REML",
"n_rtrms", "n_rfacs", "N", "n", "p", "q",
"p_i", "l_i", "q_i", "k", "m_i", "m",
"cnms", "devcomp", "offset", "lower", "devfun", "glmer.nb.theta"),
...)
Arguments
object a fitted mixed-effects model of class "merMod", i.e., typically the result of lmer(),
glmer() or nlmer().
name a character vector specifying the name(s) of the “component”. If length(name) > 1
or if name = "ALL", a named list of components will be returned. Possible
values are:
"X": fixed-effects model matrix
"Z": random-effects model matrix
"Zt": transpose of random-effects model matrix. Note that the structure of
Zt has changed since lme4.0; to get a backward-compatible structure, use
do.call(Matrix::rBind,getME(.,"Ztlist"))
"Ztlist": list of components of the transpose of the random-effects model
matrix, separated by individual variance component
"mmList": list of raw model matrices associated with random effects terms
"y": response vector
"mu": conditional mean of the response
"u": conditional mode of the “spherical” random effects variable
"b": conditional mode of the random effects variable
"Gp": groups pointer vector. A pointer to the beginning of each group of
random effects corresponding to the random-effects terms, beginning with 0
and including a final element giving the total number of random effects
"Tp": theta pointer vector. A pointer to the beginning of the theta sub-vectors
corresponding to the random-effects terms, beginning with 0 and including
a final element giving the number of thetas.
"L": sparse Cholesky factor of the penalized random-effects model.
"Lambda": relative covariance factor Λ of the random effects.
"Lambdat": transpose Λ' of Λ above.
"Lind": index vector for inserting elements of θ into the nonzeros of Λ.
"Tlist": vector of template matrices from which the blocks of Λ are generated.
"A": Scaled sparse model matrix (class "dgCMatrix") for the unit, orthogonal
random effects, U, equal to getME(.,"Lambdat") %*% getME(.,"Zt")
"RX": Cholesky factor for the fixed-effects parameters
"RZX": cross-term in the full Cholesky factor
"sigma": residual standard error; note that sigma(object) is preferred.
"flist": a list of the grouping variables (factors) involved in the random effect
terms
"fixef": fixed-effects parameter estimates
"beta": fixed-effects parameter estimates (identical to the result of fixef, but
without names)
"theta": random-effects parameter estimates: these are parameterized as the
relative Cholesky factors of each random effect term
"ST": A list of S and T factors in the TSST’ Cholesky factorization of the
relative variance matrices of the random effects associated with each
random-effects term. The unit lower triangular matrix, T, and the diagonal matrix,
S, for each term are stored as a single matrix with diagonal elements from
S and off-diagonal elements from T.
"n_rtrms": number of random-effects terms
"n_rfacs": number of distinct random-effects grouping factors
"N": number of rows of X
"n": length of the response vector, y
"p": number of columns of the fixed effects model matrix, X
"q": number of columns of the random effects model matrix, Z
"p_i": numbers of columns of the raw model matrices, mmList
"l_i": numbers of levels of the grouping factors
"q_i": numbers of columns of the term-wise model matrices, Ztlist
"k": number of random effects terms
"m_i": numbers of covariance parameters in each term
"m": total number of covariance parameters
"cnms": the “component names”, a list.
"REML": 0 indicates the model was fitted by maximum likelihood, any other
positive integer indicates fitting by restricted maximum likelihood
"is_REML": same as the result of isREML(.)
"devcomp": a list consisting of a named numeric vector, cmp, and a named
integer vector, dims, describing the fitted model. The elements of cmp are:
ldL2 twice the log determinant of L
ldRX2 twice the log determinant of RX
wrss weighted residual sum of squares
ussq squared length of u
pwrss penalized weighted residual sum of squares, “wrss + ussq”
drsum sum of residual deviance (GLMMs only)
REML REML criterion at optimum (LMMs fit by REML only)
dev deviance criterion at optimum (models fit by ML only)
sigmaML ML estimate of residual standard deviation
sigmaREML REML estimate of residual standard deviation
tolPwrss tolerance for declaring convergence in the penalized iteratively
weighted residual sum-of-squares (GLMMs only)
The elements of dims are:
N number of rows of X
n length of y
p number of columns of X
nmp n-p
nth length of theta
q number of columns of Z
nAGQ see glmer
compDev see glmerControl
useSc TRUE if model has a scale parameter
reTrms number of random effects terms
REML 0 indicates the model was fitted by maximum likelihood, any other
positive integer indicates fitting by restricted maximum likelihood
GLMM TRUE if a GLMM
NLMM TRUE if an NLMM
"offset": model offset
"lower": lower bounds on model parameters (random effects parameters only).
"devfun": deviance function (so far only available for LMMs)
"glmer.nb.theta": negative binomial θ parameter, only for glmer.nb.
"ALL": get all of the above as a list.
... currently unused in lme4, potentially further arguments in methods.
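Several of the components listed above are related to one another, and the relationships can be checked numerically. The following is a minimal sketch on the sleepstudy model; the variable names (ok_lind, ok_b, ok_A) are illustrative only, and the use of the x slot assumes Lambdat is stored as a "dgCMatrix".

```r
library(lme4)

fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

theta   <- getME(fm, "theta")
Lind    <- getME(fm, "Lind")
Lambda  <- getME(fm, "Lambda")
Lambdat <- getME(fm, "Lambdat")
Zt      <- getME(fm, "Zt")
u       <- getME(fm, "u")
b       <- getME(fm, "b")
A       <- getME(fm, "A")

# The stored nonzeros of Lambdat are the elements of theta, placed via Lind.
ok_lind <- isTRUE(all.equal(Lambdat@x, unname(theta[Lind])))

# b is the spherical vector u mapped through the relative covariance factor.
ok_b <- isTRUE(all.equal(as.vector(Lambda %*% u), as.vector(b)))

# Lambdat is q x q and Zt is q x n, so the product is conformable in this order.
ok_A <- isTRUE(all.equal(as.matrix(A), as.matrix(Lambdat %*% Zt),
                         check.attributes = FALSE))

c(ok_lind = ok_lind, ok_b = ok_b, ok_A = ok_A)
```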
Details
The goal is to provide “everything a user may want” from a fitted "merMod" object as far as it is not
available by methods, such as fixef, ranef, vcov, etc.
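As a small illustration of what is available only through getME(), the "devcomp" component can be unpacked and cross-checked against the model matrices. This is a sketch under the assumption that the element names match the lists given under Arguments; the ok_* names are illustrative.

```r
library(lme4)

fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

dc   <- getME(fm, "devcomp")
cmp  <- dc$cmp
dims <- dc$dims

# pwrss is documented as wrss + ussq.
ok_pwrss <- isTRUE(all.equal(unname(cmp["pwrss"]),
                             unname(cmp["wrss"] + cmp["ussq"])))

# The dims entries should agree with the model matrices and theta.
X <- getME(fm, "X")
Z <- getME(fm, "Z")
ok_dims <- dims[["n"]] == nrow(X) &&
  dims[["p"]] == ncol(X) &&
  dims[["q"]] == ncol(Z) &&
  dims[["nth"]] == length(getME(fm, "theta"))

# Requesting several names at once returns a named list.
parts <- getME(fm, c("theta", "sigma"))
names(parts)
```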
Value
Unspecified, as very much depending on the name.
See Also
getCall(). More standard methods for "merMod" objects, such as ranef, fixef, vcov, etc.: see
methods(class="merMod")
Examples
## shows many methods you should consider *before* using getME():
methods(class = "merMod")
(fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy))
Z <- getME(fm1, "Z")
stopifnot(is(Z, "CsparseMatrix"),
c(180,36) == dim(Z),
all.equal(fixef(fm1), b1 <- getME(fm1, "beta"),
check.attributes=FALSE, tolerance = 0))
## A way to get *all* getME()s :
## internal consistency check ensuring that all work:
parts <- getME(fm1, "ALL")
str(parts, max=2)
stopifnot(identical(Z, parts $ Z),
identical(b1, parts $ beta))
GHrule Univariate Gauss-Hermite quadrature rule
Description
Create a univariate Gauss-Hermite quadrature rule.
Usage
GHrule(ord, asMatrix = TRUE)
Arguments
ord scalar integer between 1 and 25 - the order, or number of nodes and weights, in
the rule. When the function being multiplied by the standard normal density is
a polynomial of order 2k-1 the rule of order k integrates the product exactly.
asMatrix logical scalar - should the result be returned as a matrix. If FALSE a data frame
is returned. Defaults to TRUE.
Details
This version of Gauss-Hermite quadrature provides the node positions and weights for a scalar
integral of a function multiplied by the standard normal density.
Originally based on package SparseGrid’s hidden GQN().
Value
a matrix (or data frame, if asMatrix is false) with ord rows and three columns, which are z, the node
positions, w, the weights, and ldnorm, the logarithm of the normal density evaluated at the nodes.
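The exactness property stated for ord can be illustrated with a low-order rule. A rule of order 3 should reproduce standard normal moments of polynomial order up to 5, so the second and fourth moments (1 and 3) come out exactly while the sixth moment (15) does not. This is a minimal sketch, assuming the weights are normalized as in the moment calculation in the Examples.

```r
library(lme4)

r3 <- GHrule(3, asMatrix = FALSE)

# Standard normal moments: E[z^2] = 1, E[z^4] = 3, E[z^6] = 15.
m2 <- with(r3, sum(w * z^2))  # exact: polynomial order 2 <= 5
m4 <- with(r3, sum(w * z^4))  # exact: polynomial order 4 <= 5
m6 <- with(r3, sum(w * z^6))  # not exact: polynomial order 6 > 5

c(m2 = m2, m4 = m4, m6 = m6)
```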
See Also
a different interface is available via GQdk().
Examples
(r5 <- GHrule(5, asMatrix=FALSE))
## second, fourth, sixth, eighth and tenth central moments of the
## standard Gaussian density
with(r5, sapply(seq(2, 10, 2), function(p) sum(w * z^p)))
glmer Fitting Generalized Linear Mixed-Effects Models
Description
Fit a generalized linear mixed-effects model (GLMM). Both fixed effects and random effects are
specified via the model formula.
Usage
glmer(formula, data = NULL, family = gaussian, control = glmerControl(),
start = NULL, verbose = 0L, nAGQ = 1L, subset, weights, na.action,
offset, contrasts = NULL, mustart, etastart,
devFunOnly = FALSE, ...)
Arguments
formula a two-sided linear formula object describing both the fixed-effects and
random-effects parts of the model, with the response on the left of a ~ operator and the
terms, separated by + operators, on the right. Random-effects terms are
distinguished by vertical bars ("|") separating expressions for design matrices from
grouping factors.
data an optional data frame containing the variables named in formula. By default
the variables are taken from the environment from which glmer is called. While
data is optional, the package authors strongly recommend its use, especially
when later applying methods such as update and drop1 to the fitted model
(such methods are not guaranteed to work properly if data is omitted). If data
is omitted, variables will be taken from the environment of formula (if specified
as a formula) or from the parent frame (if specified as a character vector).
family a GLM family, see glm and family.
control a list (of correct class, resulting from lmerControl() or glmerControl() re-
spectively) containing control parameters, including the nonlinear optimizer to
be used and parameters to be passed through to the nonlinear optimizer, see the
*lmerControl documentation for details.
start a named list of starting values for the parameters in the model, or a numeric
vector. A numeric start argument will be used as the starting value of theta.
If start is a list, the theta element (a numeric vector) is used as the starting
value for the first optimization step (default=1 for diagonal elements and 0 for
off-diagonal elements of the lower Cholesky factor); the fitted value of theta
from the first step, plus start[["fixef"]], are used as starting values for the
second optimization step. If start has both fixef and theta elements, the first
optimization step is skipped. For more details or finer control of optimization,
see modular.
verbose integer scalar. If > 0 verbose output is generated during the optimization of the
parameter estimates. If > 1 verbose output is generated during the individual
PIRLS steps.
nAGQ integer scalar - the number of points per axis for evaluating the adaptive Gauss-
Hermite approximation to the log-likelihood. Defaults to 1, corresponding to
the Laplace approximation. Values greater than 1 produce greater accuracy in
the evaluation of the log-likelihood at the expense of speed. A value of zero uses
a faster but less exact form of parameter estimation for GLMMs by optimizing
the random effects and the fixed-effects coefficients in the penalized iteratively
reweighted least squares step. (See Details.)
subset an optional expression indicating the subset of the rows of data that should be
used in the fit. This can be a logical vector, or a numeric vector indicating which
observation numbers are to be included, or a character vector of the row names
to be included. All observations are included by default.
weights an optional vector of ‘prior weights’ to be used in the fitting process. Should be
NULL or a numeric vector.
na.action a function that indicates what should happen when the data contain NAs. The de-
fault action (na.omit, inherited from the ‘factory fresh’ value of getOption("na.action"))
strips any observations with any missing values in any variables.
offset this can be used to specify an a priori known component to be included in the
linear predictor during fitting. This should be NULL or a numeric vector of length
equal to the number of cases. One or more offset terms can be included in the
formula instead or as well, and if more than one is specified their sum is used.
See model.offset.
contrasts an optional list. See the contrasts.arg of model.matrix.default.
mustart optional starting values on the scale of the conditional mean, as in glm; see there
for details.
etastart optional starting values on the scale of the unbounded predictor as in glm; see
there for details.
devFunOnly logical - return only the deviance evaluation function. Note that because the
deviance function operates on variables stored in its environment, it may not
return exactly the same values on subsequent calls (but the results should always
be within machine tolerance).
... other potential arguments. A method argument was used in earlier versions of
the package. Its functionality has been replaced by the nAGQ argument.
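The list form of start can be used to restart a fit from previously estimated values. The following is a minimal sketch using the cbpp model from the Examples; the object names st and gm1_restart are illustrative.

```r
library(lme4)

gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

# Supply both 'theta' and 'fixef'; according to the description of 'start',
# the first optimization step is then skipped.
st <- list(theta = getME(gm1, "theta"), fixef = fixef(gm1))
gm1_restart <- update(gm1, start = st)

# The restarted fit should land on essentially the same estimates.
cbind(original = fixef(gm1), restart = fixef(gm1_restart))
```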
Details
Fit a generalized linear mixed model, which incorporates both fixed-effects parameters and ran-
dom effects in a linear predictor, via maximum likelihood. The linear predictor is related to the
conditional mean of the response through the inverse link function defined in the GLM family.
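The relation between the linear predictor and the conditional mean can be verified on a fitted model. This sketch rebuilds the linear predictor from the fixed-effects and random-effects parts (the model has no offset) and applies the inverse logit link; eta_manual, mu and the ok_* names are illustrative.

```r
library(lme4)

gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

# Linear predictor including the conditional modes of the random effects.
eta <- predict(gm1, type = "link")

# The same quantity assembled by hand: X %*% beta + Z %*% b.
eta_manual <- as.vector(getME(gm1, "X") %*% fixef(gm1) +
                        getME(gm1, "Z") %*% getME(gm1, "b"))

# Conditional mean via the inverse link of the binomial family.
mu <- binomial()$linkinv(eta)

ok_eta  <- isTRUE(all.equal(eta_manual, unname(eta)))
ok_link <- isTRUE(all.equal(unname(mu), unname(fitted(gm1))))
c(ok_eta = ok_eta, ok_link = ok_link)
```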
The expression for the likelihood of a mixed-effects model is an integral over the random effects
space. For a linear mixed-effects model (LMM), as fit by lmer, this integral can be evaluated
exactly. For a GLMM the integral must be approximated. The most reliable approximation for
GLMMs is adaptive Gauss-Hermite quadrature, at present implemented only for models with a
single scalar random effect. The nAGQ argument controls the number of nodes in the quadrature for-
mula. A model with a single, scalar random-effects term could reasonably use up to 25 quadrature
points per scalar integral.
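The effect of nAGQ can be examined by refitting the same single-scalar-random-effect model with different numbers of quadrature points and tabulating the estimates. This is a minimal sketch on the cbpp data; the helper fit_agq is hypothetical.

```r
library(lme4)

# Hypothetical helper: refit the cbpp model for a given nAGQ.
fit_agq <- function(nAGQ) {
  glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
        data = cbpp, family = binomial, nAGQ = nAGQ)
}

agq_values <- c(0L, 1L, 9L, 25L)
est <- sapply(agq_values, function(k) {
  fit <- fit_agq(k)
  c(fixef(fit), theta = unname(getME(fit, "theta")))
})
colnames(est) <- paste0("nAGQ=", agq_values)

# One column per nAGQ value; rows are the fixed effects and theta.
round(est, 4)
```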
Value
An object of class glmerMod, for which many methods are available (e.g. methods(class="glmerMod")).
See Also
lmer (for details on formulas and parameterization); glm for Generalized Linear Models (without
random effects). nlmer for nonlinear mixed-effects models.
glmer.nb to fit negative binomial GLMMs.
Examples
## generalized linear mixed model
library(lattice)
xyplot(incidence/size ~ period|herd, cbpp, type=c('g','p','l'),
layout=c(3,5), index.cond = function(x,y)max(y))
(gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
data = cbpp, family = binomial))
## using nAGQ=0 only gets close to the optimum
(gm1a <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
cbpp, binomial, nAGQ = 0))
## using nAGQ = 9 provides a better evaluation of the deviance
## Currently the internal calculations use the sum of deviance residuals,
## which is not directly comparable with the nAGQ=0 or nAGQ=1 result.
(gm1a <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
cbpp, binomial, nAGQ = 9))
## GLMM with individual-level variability (accounting for overdispersion)
## For this data set the model is the same as one allowing for a period:herd
## interaction, which the plot indicates could be needed.
cbpp$obs <- 1:nrow(cbpp)
(gm2 <- glmer(cbind(incidence, size - incidence) ~ period +
(1 | herd) + (1|obs),
family = binomial, data = cbpp))
anova(gm1,gm2)
## glmer and glm log-likelihoods are consistent
gm1Devfun <- update(gm1,devFunOnly=TRUE)
gm0 <- glm(cbind(incidence, size - incidence) ~ period,
family = binomial, data = cbpp)
## evaluate GLMM deviance at RE variance=theta=0, beta=(GLM coeffs)
gm1Dev0 <- gm1Devfun(c(0,coef(gm0)))
## compare
stopifnot(all.equal(gm1Dev0,c(-2*logLik(gm0))))
## the toenail onycholysis data from Backer et al 1998
## these data are notoriously difficult to fit
## Not run:
if (require("HSAUR2")) {
gm2 <- glmer(outcome~treatment*visit+(1|patientID),
data=toenail,
family=binomial,nAGQ=20)
}
## End(Not run)
glmer.nb Fitting GLMMs for Negative Binomial
Description
Fits a generalized linear mixed-effects model (GLMM) for the negative binomial family, building
on glmer, and initializing via theta.ml from MASS.
Usage
glmer.nb(..., interval = log(th) + c(-3, 3),
tol = 5e-5, verbose = FALSE, nb.control = NULL,
initCtrl = list(limit = 20, eps = 2*tol, trace = verbose))
Arguments
... arguments as for glmer(.) such as formula,data,control, etc, but not
family!
interval interval in which to start the optimization. The default is symmetric on log scale
around the initially estimated theta.
tol tolerance for the optimization via optimize.
verbose logical indicating how much progress information should be printed during
the optimization. Use verbose = 2 (or larger) to enable verbose=TRUE in the
glmer() calls.
nb.control optional list, like glmerControl(), used in refit(*, control = control.nb)
during the optimization.
initCtrl (experimental, do not rely on this:) a list with named components as in the
default, passed to theta.ml (package MASS) for the initial value of the negative
binomial parameter theta.
Value
An object of class glmerMod, for which many methods are available (e.g. methods(class="glmerMod")),
see glmer.
Note
For historical reasons, the shape parameter of the negative binomial and the random effects
parameters in our (G)LMM models are both called theta (θ), but are unrelated here.
The negative binomial θ can be extracted from a fit g <- glmer.nb() by getME(g, "glmer.nb.theta").
Parts of glmer.nb() are still experimental and methods are still missing or suboptimal. In
particular, there is no inference available for the dispersion parameter θ yet.
See Also
glmer; from package MASS, negative.binomial (which we re-export currently) and theta.ml,
the latter for initialization of optimization.
The ‘Details’ of pnbinom for the definition of the negative binomial distribution.
Examples
set.seed(101)
dd <- expand.grid(f1 = factor(1:3),
f2 = LETTERS[1:2], g=1:9, rep=1:15,
KEEP.OUT.ATTRS=FALSE)
summary(mu <- 5*(-4 + with(dd, as.integer(f1) + 4*as.numeric(f2))))
dd$y <- rnbinom(nrow(dd), mu = mu, size = 0.5)
str(dd)
require("MASS")## and use its glm.nb() - as indeed we have zero random effect:
m.glm <- glm.nb(y ~ f1*f2, data=dd, trace=TRUE)
summary(m.glm)
m.nb <- glmer.nb(y ~ f1*f2 + (1|g), data=dd, verbose=TRUE)
m.nb
## The neg.binomial theta parameter:
getME(m.nb, "glmer.nb.theta")
LL <- logLik(m.nb)
## mixed model has 1 additional parameter (RE variance)
stopifnot(attr(LL,"df")==attr(logLik(m.glm),"df")+1)
plot(m.nb, resid(.) ~ g) # works, as long as data 'dd' is found
glmerLaplaceHandle Handle for glmerLaplace
Description
Handle for calling the glmerLaplace C++ function. Not intended for routine use.
Usage
glmerLaplaceHandle(pp, resp, nAGQ, tol, maxit, verbose)
Arguments
pp merPredD object
resp lmResp object
nAGQ see glmer
tol tolerance
maxit maximum number of pwrss iterations
verbose display optimizer progress
Value
Value of the objective function
glmFamily Generator object for the glmFamily class
Description
The generator object for the glmFamily reference class. Such an object is primarily used through
its new method.
Usage
glmFamily(...)
Arguments
... Named argument (see Note below)
Methods
new(family=family) Create a new glmFamily object
Note
Arguments to the new method must be named arguments.
See Also
glmFamily
glmFamily-class Class "glmFamily" - a reference class for family
Description
This class is a wrapper class for family objects specifying a distribution family and link function
for a generalized linear model (glm). The reference class contains an external pointer to a C++
object representing the class. For common families and link functions the functions in the family
are implemented in compiled code so they can be accessed from other compiled code and for a
speed boost.
Extends
All reference classes extend and inherit methods from "envRefClass".
Note
Objects from this reference class correspond to objects in a C++ class. Methods are invoked on
the C++ class using the external pointer in the Ptr field. When saving such an object the external
pointer is converted to a null pointer, which is why there is a redundant field ptr that is an active-
binding function returning the external pointer. If the Ptr field is a null pointer, the external pointer
is regenerated for the stored family field.
See Also
family, glmFamily
Examples
str(glmFamily$new(family=poisson()))
golden-class Class "golden" and Generator for Golden Search Optimizer Class
Description
"golden" is a reference class for a golden search scalar optimizer, for a parameter within an interval.
golden() is the generator for the "golden" class. The optimizer uses reverse communications.
Usage
golden(...)
Arguments
... (partly optional) arguments passed to new() must be named arguments. lower
and upper are the bounds for the scalar parameter; they must be finite.
Extends
All reference classes extend and inherit methods from "envRefClass".
Examples
showClass("golden")
golden(lower= -100, upper= 1e100)
GQdk Sparse Gaussian / Gauss-Hermite Quadrature grid
Description
Generate the sparse multidimensional Gaussian quadrature grids.
Currently unused. See GHrule() for the version currently in use in package lme4.
Usage
GQdk(d = 1L, k = 1L)
GQN
Arguments
d integer scalar - the dimension of the function to be integrated with respect to the
standard d-dimensional Gaussian density.
k integer scalar - the order of the grid. A grid of order k provides an exact result
for a polynomial of total order of 2k - 1 or less.
Value
GQdk() returns a matrix with d + 1 columns. The first column is the weights and the remaining d
columns are the node coordinates.
GQN is a list of lists, containing the non-redundant quadrature nodes and weights for integration
of a scalar function of a d-dimensional argument with respect to the d-dimensional Gaussian
density function.
The outer list is indexed by the dimension, d, in the range of 1 to 20. The inner list is indexed by k,
the order of the quadrature.
Note
GQN contains only the non-redundant nodes. To regenerate the whole array of nodes, all possible
permutations of axes and all possible combinations of ±1 must be applied to the axes. This entire
array of nodes is exactly what GQdk() reproduces.
The number of nodes gets very large very quickly with increasing d and k. See the charts at
http://www.sparse-grids.de.
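A grid returned by GQdk() can be used as a weighted sum over its rows. The sketch below splits the grid into weights and nodes, following the column layout given under Value, and approximates the expectation of a low-order polynomial under a bivariate standard Gaussian. Whether the weights are normalized to sum to one is an assumption here, so no expected value is shown.

```r
library(lme4)

grid  <- GQdk(d = 2L, k = 5L)
w     <- grid[, 1]                  # first column: weights
nodes <- grid[, -1, drop = FALSE]   # remaining d columns: node coordinates

# Weighted sum approximating E[x1^2 + x2^2]; a polynomial of total order 2,
# well within the 2k - 1 = 9 limit stated for k = 5.
approx_moment <- sum(w * rowSums(nodes^2))

dim(grid)
approx_moment
```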
Examples
GQdk(2,5) # 53 x 3
GQN[[3]][[5]] # a 14 x 4 matrix
grouseticks Data on red grouse ticks from Elston et al. 2001
Description
Number of ticks on the heads of red grouse chicks sampled in the field (grouseticks) and an
aggregated version (grouseticks_agg); see original source for more details
Usage
data(grouseticks)
Format
INDEX (factor) chick number (observation level)
TICKS number of ticks sampled
BROOD (factor) brood number
HEIGHT height above sea level (meters)
YEAR year (-1900)
LOCATION (factor) geographic location code
cHEIGHT centered height, derived from HEIGHT
meanTICKS mean number of ticks by brood
varTICKS variance of number of ticks by brood
Details
grouseticks_agg is just a brood-level aggregation of the data
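A brood-level summary of this kind can be rebuilt from the chick-level data. The following is a sketch only: it illustrates one way to compute per-brood means and variances and is not necessarily how grouseticks_agg was produced. The object name agg is illustrative.

```r
library(lme4)
data(grouseticks)

# Per-brood mean and variance of the tick counts.
agg <- aggregate(TICKS ~ BROOD, data = grouseticks,
                 FUN = function(x) c(mean = mean(x), var = var(x)))

# aggregate() returns a matrix column; flatten it into ordinary columns.
agg <- do.call(data.frame, agg)
names(agg) <- c("BROOD", "meanTICKS", "varTICKS")

# Broods with a single chick have an undefined (NA) variance.
head(agg)
```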
Source
Robert Moss, via David Elston
References
Elston, D. A., R. Moss, T. Boulinier, C. Arrowsmith, and X. Lambin. 2001. "Analysis of
Aggregation, a Worked Example: Numbers of Ticks on Red Grouse Chicks." Parasitology 122 (05):
563-569. doi:10.1017/S0031182001007740.
http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=82701.
Examples
data(grouseticks)
## Figure 1a from Elston et al
par(las=1,bty="l")
tvec <- c(0,1,2,5,20,40,80)
pvec <- c(4,1,3)
with(grouseticks_agg,plot(1+meanTICKS~HEIGHT,
pch=pvec[factor(YEAR)],
log="y",axes=FALSE,
xlab="Altitude (m)",
ylab="Brood mean ticks"))
axis(side=1)
axis(side=2,at=tvec+1,label=tvec)
box()
abline(v=405,lty=2)
## Figure 1b
with(grouseticks_agg,plot(varTICKS~meanTICKS,
pch=4,
xlab="Brood mean ticks",
ylab="Within-brood variance"))
curve(1*x,from=0,to=70,add=TRUE)
## Model fitting
form <- TICKS~YEAR+HEIGHT+(1|BROOD)+(1|INDEX)+(1|LOCATION)
(full_mod1 <- glmer(form, family="poisson",data=grouseticks))
hatvalues.merMod Diagonal elements of the hat matrix
Description
Returns the values on the diagonal of the hat matrix, which is the matrix that transforms the response
vector (minus any offset) into the fitted values (minus any offset). Note that this method should
only be used for linear mixed models. It is not clear if the hat matrix concept even makes sense for
generalized linear mixed models.
Usage
## S3 method for class 'merMod'
hatvalues(model, fullHatMatrix = FALSE, ...)
Arguments
model An object of class merMod.
fullHatMatrix Return full hat matrix (not just diagonal values)?
... Not currently used
Value
The diagonal elements of the hat matrix.
Examples
m <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
hatvalues(m)
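The description above can be checked numerically. The following is a minimal sketch, assuming the sleepstudy data shipped with lme4; the object names h, H and y are illustrative only. It requests the full hat matrix and compares its diagonal with the default return value, and its product with the response with the fitted values.

```r
library(lme4)

m <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

h <- hatvalues(m)                        # diagonal only (the default)
H <- hatvalues(m, fullHatMatrix = TRUE)  # full n x n hat matrix
H <- as.matrix(H)                        # coerce in case a Matrix object is returned

y <- sleepstudy$Reaction                 # this model has no offset

## largest discrepancy between diag(H) and the default hat values
max(abs(diag(H) - h))

## largest discrepancy between H %*% y and the fitted values
max(abs(drop(H %*% y) - fitted(m)))
```

Both discrepancies should be negligible for a linear mixed model; as noted in the Description, no such interpretation is claimed for GLMMs.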
InstEval University Lecture/Instructor Evaluations by Students at ETH
Description
University lecture evaluations by students at ETH Zurich, anonymized for privacy protection. This
is an interesting “medium” sized example of a partially nested mixed effect model.
Format
A data frame with 73421 observations on the following 7 variables.
s a factor with levels 1:2972 denoting individual students.
d a factor with 1128 levels from 1:2160, denoting individual professors or lecturers.
studage an ordered factor with levels 2<4<6<8, denoting student’s “age” measured in the
semester number the student has been enrolled.
lectage an ordered factor with 6 levels, 1<2< ... < 6, measuring how many semesters back the
lecture rated had taken place.
service a binary factor with levels 0 and 1; a lecture is a “service”, if held for a different
department than the lecturer’s main one.
dept a factor with 14 levels from 1:15, using a random code for the department of the lecture.
y a numeric vector of ratings of lectures by the students, using the discrete scale 1:5, with meanings
of ‘poor’ to ‘very good’.
Each observation is one student’s rating for a specific lecture (of one lecturer, during one semester
in the past).
Details
The main goal of the survey is to find “the best liked prof”, according to the lectures given. Statis-
tical analysis of such data has been the basis for a (student) jury selecting the final winners.
The present data set has been anonymized and slightly simplified on purpose.
Examples
str(InstEval)
head(InstEval, 16)
xtabs(~ service + dept, InstEval)
isNested Is f1 nested within f2?
Description
Does every level of f1 occur in conjunction with exactly one level of f2? The function is based on
converting a triplet sparse matrix to a compressed column-oriented form in which the nesting can
be quickly evaluated.
Usage
isNested(f1, f2)
Arguments
f1 factor 1
f2 factor 2
Value
TRUE if factor 1 is nested within factor 2
Examples
with(Pastes, isNested(cask, batch)) ## => FALSE
with(Pastes, isNested(sample, batch)) ## => TRUE
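The nesting condition itself can be illustrated without lme4. The sketch below uses a hypothetical helper, is_nested_dense (not part of the package), which checks the definition through a dense cross-tabulation rather than the sparse-matrix conversion described above. The toy factors imitate the structure of the Pastes example: cask labels are reused across batches, while sample labels are unique to one batch.

```r
## Hypothetical helper (not part of lme4): dense cross-tabulation check.
is_nested_dense <- function(f1, f2) {
  f1 <- droplevels(as.factor(f1))
  f2 <- droplevels(as.factor(f2))
  stopifnot(length(f1) == length(f2))
  tab <- table(f1, f2) > 0        # TRUE where a pair of levels co-occurs
  all(rowSums(tab) == 1)          # each f1 level meets exactly one f2 level
}

batch  <- factor(rep(c("A", "B"), each = 4))
cask   <- factor(rep(c("a", "b"), times = 4))     # same labels reused in each batch
sample <- interaction(batch, cask, drop = TRUE)   # one label per batch:cask pair

is_nested_dense(cask, batch)    # FALSE
is_nested_dense(sample, batch)  # TRUE
```

A dense table is adequate for small factors; for factors with many levels the sparse approach used by isNested avoids building the full cross-tabulation.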
isREML Check characteristics of models
Description
Check characteristics of models: whether a model fit corresponds to a linear (LMM), generalized
linear (GLMM), or nonlinear (NLMM) mixed model, and whether a linear mixed model has been
fitted by REML or not (isREML(x) is always FALSE for GLMMs and NLMMs).
Usage
isREML(x, ...)
isLMM(x, ...)
isNLMM(x, ...)
isGLMM(x, ...)
Arguments
x a fitted model.
... additional, optional arguments. (None are used in the merMod methods)
Details
These are generic functions. At present the only methods are for mixed-effects models of class
merMod.
Value
a logical value
See Also
getME
Examples
fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
data = cbpp, family = binomial)
nm1 <- nlmer(circumference ~ SSlogis(age, Asym, xmid, scal) ~ Asym|Tree,
Orange, start = c(Asym = 200, xmid = 725, scal = 350))
isLMM(fm1)
isGLMM(gm1)
## check all :
is.MM <- function(x) c(LMM = isLMM(x), GLMM= isGLMM(x), NLMM= isNLMM(x))
stopifnot(cbind(is.MM(fm1), is.MM(gm1), is.MM(nm1))
== diag(rep(TRUE,3)))
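The examples above do not exercise isREML. A short sketch, assuming the sleepstudy and cbpp data sets from lme4 (the object names are illustrative), contrasts a REML fit, its maximum likelihood refit, and a GLMM:

```r
library(lme4)

fm_reml <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)  # REML = TRUE by default
fm_ml   <- update(fm_reml, REML = FALSE)                         # refit by maximum likelihood
gm      <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                 data = cbpp, family = binomial)

c(reml_fit = isREML(fm_reml), ml_fit = isREML(fm_ml), glmm = isREML(gm))
```

In line with the Description, only the first element should be TRUE.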
lmer Fit Linear Mixed-Effects Models
Description
Fit a linear mixed-effects model (LMM) to data, via REML or maximum likelihood.
Usage
lmer(formula, data = NULL, REML = TRUE, control = lmerControl(),
start = NULL, verbose = 0L, subset, weights, na.action,
offset, contrasts = NULL, devFunOnly = FALSE, ...)
Arguments
formula a two-sided linear formula object describing both the fixed-effects and random-effects
part of the model, with the response on the left of a ~ operator and the
terms, separated by + operators, on the right. Random-effects terms are distinguished
by vertical bars ("|") separating expressions for design matrices from
grouping factors. Two vertical bars ("||") can be used to specify multiple
uncorrelated random effects for the same grouping variable.
data an optional data frame containing the variables named in formula. By default
the variables are taken from the environment from which lmer is called. While
data is optional, the package authors strongly recommend its use, especially
when later applying methods such as update and drop1 to the fitted model
(such methods are not guaranteed to work properly if data is omitted). If data
is omitted, variables will be taken from the environment of formula (if specified
as a formula) or from the parent frame (if specified as a character vector).
REML logical scalar - Should the estimates be chosen to optimize the REML criterion
(as opposed to the log-likelihood)?
control a list (of correct class, resulting from lmerControl() or glmerControl() re-
spectively) containing control parameters, including the nonlinear optimizer to
be used and parameters to be passed through to the nonlinear optimizer, see the
*lmerControl documentation for details.
start a named list of starting values for the parameters in the model. For lmer this
can be a numeric vector or a list with one component named "theta".
verbose integer scalar. If > 0 verbose output is generated during the optimization of the
parameter estimates. If > 1 verbose output is generated during the individual
PIRLS steps.
subset an optional expression indicating the subset of the rows of data that should be
used in the fit. This can be a logical vector, or a numeric vector indicating which
observation numbers are to be included, or a character vector of the row names
to be included. All observations are included by default.
weights an optional vector of ‘prior weights’ to be used in the fitting process. Should
be NULL or a numeric vector. Prior weights are not normalized or standardized
in any way. In particular, the diagonal of the residual covariance matrix is the
squared residual standard deviation parameter sigma times the vector of inverse
weights. Therefore, if the weights have relatively large magnitudes, then in
order to compensate, the sigma parameter will also need to have a relatively
large magnitude.
na.action a function that indicates what should happen when the data contain NAs. The de-
fault action (na.omit, inherited from the ’factory fresh’ value of getOption("na.action"))
strips any observations with any missing values in any variables.
offset this can be used to specify an a priori known component to be included in the
linear predictor during fitting. This should be NULL or a numeric vector of length
equal to the number of cases. One or more offset terms can be included in the
formula instead or as well, and if more than one is specified their sum is used.
See model.offset.
contrasts an optional list. See the contrasts.arg of model.matrix.default.
devFunOnly logical - return only the deviance evaluation function. Note that because the
deviance function operates on variables stored in its environment, it may not
return exactly the same values on subsequent calls (but the results should always
be within machine tolerance).
... other potential arguments. A method argument was used in earlier versions of
the package. Its functionality has been replaced by the REML argument.
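The remark under weights, that prior weights are not normalized and that sigma must compensate for their magnitude, can be seen in a small sketch. It assumes the sleepstudy data and uses an arbitrary constant weight of 4, chosen only for illustration: because the residual variance is sigma^2 divided by the weight, the estimated sigma should roughly double while the fixed effects stay the same.

```r
library(lme4)

fm_w1 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy)

## constant prior weight of 4 for every observation (illustrative choice)
fm_w4 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy,
              weights = rep(4, nrow(sleepstudy)))

sigma(fm_w4) / sigma(fm_w1)   # close to sqrt(4) = 2, up to optimizer tolerance
cbind(unweighted = fixef(fm_w1), weighted = fixef(fm_w4))
```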
Details
If the formula argument is specified as a character vector, the function will attempt to coerce
it to a formula. However, this is not recommended (users who want to construct formulas by
pasting together components are advised to use as.formula or reformulate); model fits will
work but subsequent methods such as drop1 and update may fail.
Unlike some simpler modeling frameworks such as lm and glm which automatically detect
perfectly collinear predictor variables, [gn]lmer cannot handle design matrices of less than
full rank. For example, in cases of models with interactions that have unobserved combina-
tions of levels, it is up to the user to define a new variable (for example creating ab within the
data from the results of interaction(a,b,drop=TRUE)).
The deviance function returned when devFunOnly is TRUE takes a single numeric vector argument,
representing the theta vector. This vector defines the scaled variance-covariance
matrices of the random effects, in the Cholesky parameterization. For models with only sim-
ple (intercept-only) random effects, theta is a vector of the standard deviations of the random
effects. For more complex or multiple random effects, running getME(.,"theta") to retrieve
the theta vector for a fitted model and examining the names of the vector is probably the
easiest way to determine the correspondence between the elements of the theta vector and
elements of the lower triangles of the Cholesky factors of the random effects.
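As a sketch of the devFunOnly workflow described in the last item, assuming the sleepstudy data (object names are illustrative), the returned function can be evaluated at the fitted theta and compared with the REML criterion of the corresponding fit:

```r
library(lme4)

fm1    <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
devfun <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy,
               devFunOnly = TRUE)

theta_hat <- getME(fm1, "theta")
names(theta_hat)        # shows which Cholesky element each entry refers to

devfun(theta_hat)       # criterion evaluated at the fitted theta
REMLcrit(fm1)           # should agree up to numerical tolerance

devfun(c(1, 0, 1))      # criterion at an arbitrary, non-optimal theta
```

Because the model was fitted with the default REML = TRUE, the function returns the REML criterion rather than the ML deviance.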
Value
An object of class merMod, for which many methods are available (e.g. methods(class="merMod"))
See Also
lm for linear models; glmer for generalized linear and nlmer for nonlinear mixed models.
Examples
## linear mixed models - reference values from older code
(fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy))
summary(fm1) # (with its own print method; see class?merMod)
str(terms(fm1))
stopifnot(identical(terms(fm1, fixed.only=FALSE),
terms(model.frame(fm1))))
attr(terms(fm1, FALSE), "dataClasses") # fixed.only=FALSE needed for dataCl.
fm1_ML <- update(fm1,REML=FALSE)
(fm2 <- lmer(Reaction ~ Days + (Days || Subject), sleepstudy))
anova(fm1, fm2)
sm2 <- summary(fm2)
print(fm2, digits=7, ranef.comp="Var") # the print.merMod() method
print(sm2, digits=3, corr=FALSE) # the print.summary.merMod() method
(vv <- vcov.merMod(fm2, corr=TRUE))
as(vv, "corMatrix") # extracts the ("hidden") 'correlation' entry in @factors
## Fit sex-specific variances by constructing numeric dummy variables
## for sex and sex:age; in this case the estimated variance differences
## between groups in both intercept and slope are zero ...
data(Orthodont,package="nlme")
Orthodont$nsex <- as.numeric(Orthodont$Sex=="Male")
Orthodont$nsexage <- with(Orthodont, nsex*age)
lmer(distance ~ age + (age|Subject) + (0+nsex|Subject) +
(0 + nsexage|Subject), data=Orthodont)
lmerControl Control of Mixed Model Fitting
Description
Construct control structures for mixed model fitting. All arguments have defaults, and can be
grouped into
general control parameters, most importantly optimizer, further restart_edge, etc;
model- or data-checking specifications, in short “checking options”, such as check.nobs.vs.rankZ,
or check.rankX (currently not for nlmerControl);
all the parameters to be passed to the optimizer, e.g., maximal number of iterations, passed via
the optCtrl list argument.
Usage
lmerControl(optimizer = "bobyqa",
restart_edge = TRUE,
boundary.tol = 1e-5,
calc.derivs=TRUE,
use.last.params=FALSE,
sparseX = FALSE,
## input checking options
check.nobs.vs.rankZ = "ignore",
check.nobs.vs.nlev = "stop",
check.nlev.gtreq.5 = "ignore",
check.nlev.gtr.1 = "stop",
check.nobs.vs.nRE="stop",
check.rankX = c("message+drop.cols", "silent.drop.cols", "warn+drop.cols",
"stop.deficient", "ignore"),
check.scaleX = c("warning","stop","silent.rescale",
"message+rescale","warn+rescale","ignore"),
check.formula.LHS = "stop",
## convergence checking options
check.conv.grad = .makeCC("warning", tol = 2e-3, relTol = NULL),
check.conv.singular = .makeCC(action = "ignore", tol = 1e-4),
check.conv.hess = .makeCC(action = "warning", tol = 1e-6),
## optimizer args
optCtrl = list())
glmerControl(optimizer = c("bobyqa", "Nelder_Mead"),
restart_edge = FALSE,
boundary.tol = 1e-5,
calc.derivs=TRUE,
use.last.params=FALSE,
sparseX = FALSE,
tolPwrss=1e-7,
compDev=TRUE,
nAGQ0initStep=TRUE,
## input checking options
check.nobs.vs.rankZ = "ignore",
check.nobs.vs.nlev = "stop",
check.nlev.gtreq.5 = "ignore",
check.nlev.gtr.1 = "stop",
check.nobs.vs.nRE="stop",
check.rankX = c("message+drop.cols", "silent.drop.cols", "warn+drop.cols",
"stop.deficient", "ignore"),
check.scaleX = "warning",
check.formula.LHS = "stop",
check.response.not.const = "stop",
## convergence checking options
check.conv.grad = .makeCC("warning", tol = 1e-3, relTol = NULL),
check.conv.singular = .makeCC(action = "ignore", tol = 1e-4),
check.conv.hess = .makeCC(action = "warning", tol = 1e-6),
## optimizer args
optCtrl = list())
nlmerControl(optimizer = "Nelder_Mead", tolPwrss = 1e-10,
optCtrl = list())
.makeCC(action, tol, relTol, ...)
Arguments
optimizer character - name of optimizing function(s). A character vector or list of functions
(length 1 for lmer or glmer, possibly length 2 for glmer). The built-in
optimizers are Nelder_Mead and bobyqa (from the minqa package). Any minimizing
function that allows box constraints can be used provided that it
(1) takes input parameters fn (function to be optimized), par (starting parame-
ter values), lower (lower bounds) and control (control parameters, passed
through from the control argument) and
(2) returns a list with (at least) elements par (best-fit parameters), fval (best-fit
function value), conv (convergence code, equal to zero for successful con-
vergence) and (optionally) message (informational message, or explanation
of convergence failure).
Special provisions are made for bobyqa, Nelder_Mead, and optimizers wrapped
in the optimx package; to use the optimx optimizers (including L-BFGS-B from
base optim and nlminb), pass the method argument to optim in the optCtrl
argument (you may also need to load the optimx package manually using
library(optimx) or require(optimx)).
For glmer, if length(optimizer)==2, the first element will be used for the
preliminary (random effects parameters only) optimization, while the second
will be used for the final (random effects plus fixed effect parameters) phase.
See modular for more information on these two phases.
calc.derivs logical - compute gradient and Hessian of nonlinear optimization solution?
use.last.params
logical - should the last value of the parameters evaluated (TRUE), rather than the
value of the parameters corresponding to the minimum deviance, be returned?
This is a "backward bug-compatibility" option; use TRUE only when trying to
match previous results.
sparseX logical - should a sparse model matrix be used for the fixed-effects terms? Cur-
rently inactive.
restart_edge logical - should the optimizer attempt a restart when it finds a solution at the
boundary (i.e. zero random-effect variances or perfect +/-1 correlations)? (Cur-
rently only implemented for lmerControl.)
boundary.tol numeric - within what distance of a boundary should the boundary be checked
for a better fit? (Set to zero to disable boundary checking.)
tolPwrss numeric scalar - the tolerance for declaring convergence in the penalized itera-
tively weighted residual sum-of-squares step.
compDev logical scalar - should compiled code be used for the deviance evaluation during
the optimization of the parameter estimates?
nAGQ0initStep do one initial run with nAGQ = 0.
check.nlev.gtreq.5
character - rules for checking whether all random effects have >= 5 levels. See
action.
check.nlev.gtr.1
character - rules for checking whether all random effects have > 1 level. See
action.
check.nobs.vs.rankZ
character - rules for checking whether the number of observations is greater than
(or greater than or equal to) the rank of the random effects design matrix (Z),
usually necessary for identifiable variances. As for action, with the addition of
"warningSmall" and "stopSmall", which run the test only if the dimensions
of Z are < 1e6. nobs > rank(Z) will be tested for LMMs and GLMMs with
estimated scale parameters; nobs >= rank(Z) will be tested for GLMMs with
fixed scale parameter. The rank test is done using the method="qr" option of
the rankMatrix function.
check.nobs.vs.nlev
character - rules for checking whether the number of observations is less than
(or less than or equal to) the number of levels of every grouping factor, usually
necessary for identifiable variances. As for action. nobs<nlevels will be
tested for LMMs and GLMMs with estimated scale parameters; nobs<=nlevels
will be tested for GLMMs with fixed scale parameter.
check.nobs.vs.nRE
character - rules for checking whether the number of observations is greater than
(or greater than or equal to) the number of random-effects levels for each term,
usually necessary for identifiable variances. As for check.nobs.vs.nlev.
check.conv.grad
rules for checking the gradient of the deviance function for convergence. A list
as returned by .makeCC, or a character string with only the action.
check.conv.singular
rules for checking for a singular fit, i.e. one where some parameters are on the
boundary of the feasible space (for example, random effects variances equal to 0
or correlations between random effects equal to +/- 1.0); as for check.conv.grad
above.
check.conv.hess
rules for checking the Hessian of the deviance function for convergence; as for
check.conv.grad above.
check.rankX character - specifying if rankMatrix(X) should be compared with ncol(X) and
if columns from the design matrix should possibly be dropped to ensure that
it has full rank. Sometimes needed to make the model identifiable. The op-
tions can be abbreviated; the three "*.drop.cols" options all do drop columns,
"stop.deficient" gives an error when the rank is smaller than the number of
columns where "ignore" does no rank computation, and will typically lead to
less easily understandable errors, later.
check.scaleX character - check for problematic scaling of columns of fixed-effect model ma-
trix, e.g. parameters measured on very different scales.
check.formula.LHS
check whether specified formula has a left-hand side. Primarily for internal use
within simulate.merMod; use at your own risk as it may allow the generation
of unstable merMod objects.
check.response.not.const
character - check that the response is not constant.
optCtrl a list of additional arguments to be passed to the nonlinear optimizer (see
Nelder_Mead, bobyqa). In particular, both Nelder_Mead and bobyqa use maxfun
to specify the maximum number of function evaluations they will try before giv-
ing up - in contrast to optim and optimx-wrapped optimizers, which use maxit.
action character - generic choices for the severity level of any test. "ignore": skip the
test. "warning": warn if test fails. "stop": throw an error if test fails.
tol numeric - tolerance for check
relTol numeric - tolerance for checking relative variation
... other elements to include in check specification
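The arguments above can be combined as in the following sketch. It assumes the sleepstudy and cbpp data sets from lme4; the evaluation budget and the tolerance are arbitrary illustrative values, not recommendations.

```r
library(lme4)

## raise the evaluation budget for bobyqa (value chosen only for illustration)
ctrl <- lmerControl(optimizer = "bobyqa",
                    optCtrl   = list(maxfun = 20000))
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, control = ctrl)

## two-stage optimization for glmer: bobyqa first, then Nelder_Mead
gctrl <- glmerControl(optimizer = c("bobyqa", "Nelder_Mead"))
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial, control = gctrl)

## a convergence-check specification with a looser gradient tolerance
cc <- .makeCC(action = "warning", tol = 1e-2)
str(cc)
ctrl2 <- lmerControl(check.conv.grad = cc)
```

Note that maxfun is the name understood by bobyqa and Nelder_Mead; an optim-based wrapper would expect maxit instead, as stated under optCtrl.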
Details
Note that (only!) the pre-fitting “checking options” (i.e., all those starting with "check." but not
including the convergence checks ("check.conv.*") or rank-checking ("check.rank*") options)
may also be set globally via options. In that case, (g)lmerControl will use them rather than the
default values, but will not override values that are passed as explicit arguments.
For example, options(lmerControl=list(check.nobs.vs.rankZ = "ignore")) will suppress
warnings that the number of observations is less than the rank of the random effects model matrix
Z.
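A sketch of this precedence, under the assumption that the checking options are stored in the checkControl component described under Value: a value set through options() replaces the package default, while an explicit argument still takes priority. The previous option value is restored afterwards.

```r
library(lme4)

lmerControl()$checkControl$check.nlev.gtreq.5   # package default

old <- options(lmerControl = list(check.nlev.gtreq.5 = "warning"))
from_options <- lmerControl()$checkControl$check.nlev.gtreq.5
explicit     <- lmerControl(check.nlev.gtreq.5 = "stop")$checkControl$check.nlev.gtreq.5
options(old)                                     # restore the previous setting

c(from_options = from_options, explicit = explicit)
```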
Value
The *Control functions return a list (inheriting from class "merControl") containing
1. general control parameters, such as optimizer, restart_edge;
2. (currently not for nlmerControl:) "checkControl", a list of data-checking specifications,
e.g., check.nobs.vs.rankZ;
3. parameters to be passed to the optimizer, i.e., the optCtrl list, which may contain maxiter.
.makeCC returns a list containing the check specification (action, tolerance, and optionally relative
tolerance).
See Also
convergence
Examples
str(lmerControl())
str(glmerControl())
## Not run:
## fit with the default optimizer (bobyqa for lmerControl, per Usage) ...
fm0 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy)
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
## or request minqa::bobyqa explicitly ...
fm1_bobyqa <- update(fm1,control=lmerControl(optimizer="bobyqa"))
## or with the nlminb function used in older (<1.0) versions of lme4;
## this will usually replicate older results
require(optimx)
fm1_nlminb <- update(fm1,control=lmerControl(optimizer="optimx",
optCtrl=list(method="nlminb")))
## The other option here is method="L-BFGS-B".
## Or we can wrap base::optim():
optimwrap <- function(fn,par,lower,upper,control=list(),
...) {
if (is.null(control$method)) stop("must specify method in optCtrl")
method <- control$method
control$method <- NULL
## "Brent" requires finite upper values (lower bound will always
## be zero in this case)
if (method=="Brent") upper <- pmin(1e4,upper)
res <- optim(par=par,fn=fn,lower=lower,upper=upper,
control=control,method=method,...)
with(res,list(par=par,
fval=value,
feval=counts[1],
conv=convergence,
message=message))
}
fm0_brent <- update(fm0,control=lmerControl(optimizer="optimwrap",
optCtrl=list(method="Brent")))
## You can also use functions from the nloptr package.
if (require(nloptr)) {
defaultControl <- list(algorithm="NLOPT_LN_BOBYQA",
xtol_abs=1e-6,ftol_abs=1e-6,maxeval=1e5)
nloptwrap <- function(fn,par,lower,upper,control=list(),...) {
for (n in names(defaultControl))
if (is.null(control[[n]])) control[[n]] <- defaultControl[[n]]
res <- nloptr(x0=par,eval_f=fn,lb=lower,ub=upper,opts=control,...)
with(res,list(par=solution,
fval=objective,
feval=iterations,
conv=if (status>0) 0 else status,
message=message))
}
fm1_nloptr <- update(fm1,control=lmerControl(optimizer="nloptwrap"))
fm1_nloptr_NM <- update(fm1,control=lmerControl(optimizer="nloptwrap",
optCtrl=list(algorithm="NLOPT_LN_NELDERMEAD")))
}
## other algorithm options include NLOPT_LN_COBYLA, NLOPT_LN_SBPLX
## End(Not run)
lmList Fit List of lm Objects with a Common Model
Description
Fit a list of lm objects with a common model for different subgroups of the data.
Usage
lmList(formula, data, family, subset, weights, na.action,
offset, pool = TRUE, ...)
Arguments
formula a linear formula object of the form y ~ x1+...+xn | g. In the formula object,
y represents the response, x1,...,xn the covariates, and g the grouping factor
specifying the partitioning of the data according to which different lm fits should
be performed.
family an optional family specification for a generalized linear model.
pool logical scalar, should the variance estimate pool the residual sums of squares?
... additional, optional arguments to be passed to the model function or family eval-
uation.
data an optional data frame containing the variables named in formula. By default
the variables are taken from the environment from which lmList is called. See
Details.
subset an optional expression indicating the subset of the rows of data that should be
used in the fit. This can be a logical vector, or a numeric vector indicating which
observation numbers are to be included, or a character vector of the row names
to be included. All observations are included by default.
weights an optional vector of ‘prior weights’ to be used in the fitting process. Should be
NULL or a numeric vector.
na.action a function that indicates what should happen when the data contain NAs. The de-
fault action (na.omit, inherited from the ‘factory fresh’ value of getOption("na.action"))
strips any observations with any missing values in any variables.
offset this can be used to specify an a priori known component to be included in the
linear predictor during fitting. This should be NULL or a numeric vector of length
equal to the number of cases. One or more offset terms can be included in the
formula instead or as well, and if more than one is specified their sum is used.
See model.offset.
Details
data should be a data frame (not, e.g. a groupedData object from the nlme package); use
as.data.frame first to convert the data.
While data is optional, the package authors strongly recommend its use, especially when
later applying methods such as update and drop1 to the fitted model (such methods are not
guaranteed to work properly if data is omitted). If data is omitted, variables will be taken
from the environment of formula (if specified as a formula) or from the parent frame (if
specified as a character vector).
Value
an object of class lmList4 (see there, notably for the methods defined).
See Also
lmList4
Examples
fm.plm <- lmList(Reaction ~ Days | Subject, sleepstudy)
coef(fm.plm)
fm.2 <- update(fm.plm, pool = FALSE)
## coefficients are the same, "pooled or unpooled":
stopifnot( all.equal(coef(fm.2), coef(fm.plm)) )
(ci <- confint(fm.plm)) # print and rather *see* :
plot(ci) # how widely they vary for the individuals
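To make "a common model for different subgroups" concrete, the sketch below compares one row of the lmList coefficient table with a stand-alone lm fit to the same subgroup. It assumes the sleepstudy data; subject "308" is used only because it is the first level of Subject.

```r
library(lme4)

fm.plm <- lmList(Reaction ~ Days | Subject, sleepstudy)

## separate lm() fit restricted to a single subject
lm_308 <- lm(Reaction ~ Days, data = subset(sleepstudy, Subject == "308"))

coef(fm.plm)["308", ]   # row of the lmList coefficient table
coef(lm_308)            # coefficients of the stand-alone fit
```

The two sets of coefficients should coincide, since pooling affects only the variance estimate and not the per-group regression coefficients.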
lmList4-class Class "lmList4" of ’lm’ Objects on Common Model
Description
Class "lmList4" is an S4 class with basically a list of objects of class lm with a common model
(but different data); see lmList() which returns these.
Package nlme’s lmList() returns objects of S3 class "lmList" and provides methods for them, on
which our methods partly build.
Objects from the Class
Objects can be created by calls of the form new("lmList4", ...) or, more commonly, by a call
to lmList().
Methods
A dozen methods are provided. Currently, S4 methods for show, coercion (as(.,.)) and others
inherited via "list", and S3 methods for coef, confint, fitted, fixef, formula, logLik, pairs,
plot, predict, print, qqnorm, ranef, residuals, sigma, summary, and update.
sigma(object) returns the standard deviation σ̂ (of the errors in the linear models), assuming a
common variance σ² by pooling (even when pool = FALSE was used in the fit).
See Also
lmList
Examples
if(getRversion() >= "3.2.0") {
(mm <- methods(class = "lmList4"))
## The S3 ("not S4") ones :
mm[!attr(mm,"info")[,"isS4"]]
}
## For more examples: example(lmList) i.e., ?lmList
lmResp Generator objects for the response classes
Description
The generator objects for the lmResp, lmerResp, glmResp and nlsResp reference classes. Such
objects are primarily used through their new methods.
Usage
lmResp(...)
Arguments
... List of arguments (see Note).
Methods
new(y=y): Create a new lmResp or lmerResp object.
new(family=family, y=y): Create a new glmResp object.
new(y=y, nlmod=nlmod, nlenv=nlenv, pnames=pnames, gam=gam): Create a new nlsResp object.
Note
Arguments to the new methods must be named arguments.
y the numeric response vector
family a family object
nlmod the nonlinear model function
nlenv an environment holding data objects for evaluation of nlmod
pnames a character vector of parameter names
gam a numeric vector - the initial linear predictor
See Also
lmResp, lmerResp, glmResp, nlsResp
lmResp-class Reference Classes for Response Modules,
"(lm|glm|nls|lmer)Resp"
Description
Reference classes for response modules, including linear models, "lmResp", generalized linear
models, "glmResp", nonlinear models, "nlsResp" and linear mixed-effects models, "lmerResp".
Each reference class is associated with a C++ class of the same name. As is customary, the generator
object for each class has the same name as the class.
Extends
All reference classes extend and inherit methods from "envRefClass". Furthermore, "glmResp",
"nlsResp" and "lmerResp" all extend the "lmResp" class.
Note
Objects from these reference classes correspond to objects in C++ classes. Methods are invoked
on the C++ classes using the external pointer in the ptr field. When saving such an object the
external pointer is converted to a null pointer, which is why there are redundant fields containing
enough information as R objects to be able to regenerate the C++ object. The convention is that a
field whose name begins with an upper-case letter is an R object and the corresponding field whose
name begins with the lower-case letter is a method. Access to the external pointer should be through
the method, not through the field.
See Also
lmer, glmer, nlmer, merMod.
Examples
showClass("lmResp")
str(lmResp$new(y=1:4))
showClass("glmResp")
str(glmResp$new(family=poisson(), y=1:4))
showClass("nlsResp")
showClass("lmerResp")
str(lmerResp$new(y=1:4))
merMod-class Class "merMod" of Fitted Mixed-Effect Models
Description
A mixed-effects model is represented as a merPredD object and a response module of a class that
inherits from class lmResp. A model with a lmerResp response has class lmerMod; a glmResp
response has class glmerMod; and a nlsResp response has class nlmerMod.
Usage
## S3 method for class 'merMod'
anova(object, ..., refit = TRUE, model.names=NULL)
## S3 method for class 'merMod'
coef(object, ...)
## S3 method for class 'merMod'
deviance(object, REML = NULL, ...)
REMLcrit(object)
## S3 method for class 'merMod'
extractAIC(fit, scale = 0, k = 2, ...)
## S3 method for class 'merMod'
family(object, ...)
## S3 method for class 'merMod'
formula(x, fixed.only = FALSE, random.only = FALSE, ...)
## S3 method for class 'merMod'
fitted(object, ...)
## S3 method for class 'merMod'
logLik(object, REML = NULL, ...)
## S3 method for class 'merMod'
nobs(object, ...)
## S3 method for class 'merMod'
ngrps(object, ...)
## S3 method for class 'merMod'
terms(x, fixed.only = TRUE, random.only = FALSE, ...)
## S3 method for class 'merMod'
vcov(object, correlation = TRUE, sigm = sigma(object),
use.hessian = NULL, ...)
## S3 method for class 'merMod'
model.frame(formula, fixed.only = FALSE, ...)
## S3 method for class 'merMod'
model.matrix(object, type = c("fixed", "random", "randomListRaw"), ...)
## S3 method for class 'merMod'
print(x, digits = max(3, getOption("digits") - 3),
correlation = NULL, symbolic.cor = FALSE,
signif.stars = getOption("show.signif.stars"), ranef.comp = "Std.Dev.", ...)
## S3 method for class 'merMod'
summary(object, correlation = , use.hessian = NULL, ...)
## S3 method for class 'summary.merMod'
print(x, digits = max(3, getOption("digits") - 3),
correlation = NULL, symbolic.cor = FALSE,
signif.stars = getOption("show.signif.stars"),
ranef.comp = c("Variance", "Std.Dev."), show.resids = TRUE, ...)
## S3 method for class 'merMod'
update(object, formula., ..., evaluate = TRUE)
## S3 method for class 'merMod'
weights(object, type = c("prior", "working"), ...)
Arguments
object an R object of class merMod, i.e., as resulting from lmer(), or glmer(), etc.
x an R object of class merMod or summary.merMod, respectively, the latter resulting
from summary(<merMod>).
fit an R object of class merMod.
formula in the case of model.frame, a merMod object.
refit logical indicating if objects of class lmerMod should be refitted with ML before
comparing models. The default is TRUE to prevent the common mistake of inap-
propriately comparing REML-fitted models with different fixed effects, whose
likelihoods are not directly comparable.
model.names character vectors of model names to be used in the anova table.
scale Not currently used (see extractAIC).
k see extractAIC.
REML Logical. If TRUE, return the restricted log-likelihood rather than the log-likelihood.
If NULL (the default), set REML to isREML(object) (see isREML).
fixed.only logical indicating if only the fixed effects components (terms or formula ele-
ments) are sought. If false, all components, including random ones, are returned.
random.only complement of fixed.only; indicates whether random components only are
sought. (Trying to specify fixed.only and random.only at the same time will
produce an error.)
correlation (logical) for vcov, indicates whether the correlation matrix as well as the variance-
covariance matrix is desired; for summary.merMod, indicates whether the cor-
relation matrix should be computed and stored along with the covariance; for
print.summary.merMod, indicates whether the correlation matrix of the fixed-
effects parameters should be printed. In the latter case, when NULL (the default),
the correlation matrix is printed when it has been computed by summary(.), and
when p <= 20.
use.hessian (logical) indicates whether to use the finite-difference Hessian of the deviance
function to compute standard errors of the fixed effects, rather than estimating
them based on internal information about the inverse of the model matrix (see
getME(.,"RX")). The default is to use the Hessian whenever the fixed effect
parameters are arguments to the deviance function (i.e. for GLMMs with nAGQ>0),
and to use getME(.,"RX") whenever the fixed effect parameters are profiled out
(i.e. for GLMMs with nAGQ==0 or LMMs).
use.hessian=FALSE is backward-compatible with older versions of lme4, but
may give less accurate SE estimates when the estimates of the fixed-effect (see
getME(.,"beta")) and random-effect (see getME(.,"theta")) parameters are
correlated.
sigm the residual standard error; by default sigma(object).
digits number of significant digits for printing
symbolic.cor should a symbolic encoding of the fixed-effects correlation matrix be printed?
If so, the symnum function is used.
signif.stars (logical) should significance stars be used?
ranef.comp character vector of length one or two, indicating if random-effects parameters
should be reported on the variance and/or standard deviation scale.
show.resids should the quantiles of the scaled residuals be printed?
formula. see update.formula.
evaluate see update.
type For weights, type of weights to be returned; either "prior" for the initially
supplied weights or "working" for the weights at the final iteration of the pe-
nalized iteratively reweighted least squares algorithm. For model.matrix, type
of model matrix to return (one of fixed giving the fixed effects model matrix,
random giving the random effects model matrix, or randomListRaw giving a list
of the raw random effects model matrices associated with each random effects
term).
... potentially further arguments passed from other methods.
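A few of the arguments above can be tried out as follows. This is only a sketch: it assumes the sleepstudy data shipped with lme4 and a default (REML) lmer fit, and the object names (fm, f_fixed, ll_reml, ll_ml, w, V) are chosen here purely for illustration.

```r
library(lme4)

## REML fit (the default for lmer)
fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

## fixed.only = TRUE drops the random-effects terms from the formula
f_fixed <- formula(fm, fixed.only = TRUE)

## REML = NULL (the default) follows isREML(fm); REML = FALSE asks for the
## ordinary log-likelihood evaluated at the fitted parameters
ll_reml <- logLik(fm)
ll_ml   <- logLik(fm, REML = FALSE)

## no prior weights were supplied, so type = "prior" returns a vector of ones
w <- weights(fm, type = "prior")

## variance-covariance matrix of the fixed-effects estimates
V <- vcov(fm)
```

Because the model was fitted with REML, the two log-likelihood values differ; they should not be used interchangeably when comparing models.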
Objects from the Class
Objects of class merMod are created by calls to lmer, glmer or nlmer.
S3 methods
The following S3 methods with arguments given above exist (this list is currently not complete):
anova: returns the sequential decomposition of the contributions of fixed-effects terms or, for
multiple arguments, model comparison statistics. For objects of class lmerMod the default
behavior is to refit the models with ML if fitted with REML = TRUE; this can be controlled via
the refit argument. See also anova.
coef: Computes the sum of the random and fixed effects coefficients for each explanatory variable
for each level of each grouping factor.
extractAIC: Computes the (generalized) Akaike's 'An Information Criterion' (AIC). If
isREML(fit), then fit is refitted using maximum likelihood.
family: family of fitted GLMM. (Warning: this accessor may not work properly with customized
families/link functions.)
fitted: Fitted values, given the conditional modes of the random effects. For more flexible access
to fitted values, use predict.merMod.
logLik: Log-likelihood at the fitted value of the parameters. Note that for GLMMs, the returned
value is only proportional to the log probability density (or distribution) of the response
variable. See logLik.
model.frame: returns the frame slot of merMod.
model.matrix: returns the fixed effects model matrix.
nobs, ngrps: Number of observations and vector of the numbers of levels in each grouping factor.
See ngrps.
summary: Computes and returns a list of summary statistics of the fitted model; the amount of
output can be controlled via the print method, see also summary.
print.summary: Controls the output for the summary method.
vcov: Calculate variance-covariance matrix of the fixed effect terms, see also vcov.
update: See update.
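The behaviour of anova and coef described above can be checked with a short sketch. It assumes the sleepstudy data from lme4; the names fm0, fm1, cmp, cf and manual are illustrative only.

```r
library(lme4)

## two REML fits that differ in their fixed effects
fm0 <- lmer(Reaction ~ 1    + (1 | Subject), sleepstudy)
fm1 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy)

## refit = TRUE is the default, so both models are refitted with ML
## before the comparison table is computed
cmp <- anova(fm0, fm1)

## coef() is the sum of the fixed effect and the conditional mode
## for each level of the grouping factor
cf     <- coef(fm1)$Subject
manual <- fixef(fm1)[["(Intercept)"]] + ranef(fm1)$Subject[["(Intercept)"]]
all.equal(cf[["(Intercept)"]], manual)
```

Passing refit = FALSE to anova would compare the REML criteria directly, which is only sensible when the fixed-effects specifications are identical.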
Deviance and log-likelihood of GLMMs
One must be careful when defining the deviance of a GLM. For example, should the deviance be
defined as minus twice the log-likelihood or does it involve subtracting the deviance for a saturated
model? To distinguish these two possibilities we refer to absolute deviance (minus twice the log-
likelihood) and relative deviance (relative to a saturated model, e.g. Section 2.3.1 in McCullagh and
Nelder 1989). With GLMMs however, there is an additional complication involving the distinc-
tion between the likelihood and the conditional likelihood. The latter is the likelihood obtained by
conditioning on the estimates of the conditional modes of the spherical random effects coefficients,
whereas the likelihood itself (i.e. the unconditional likelihood) involves integrating out these coeffi-
cients. The following table summarizes how to extract the various types of deviance for a glmerMod
object.
            conditional          unconditional
relative    deviance(object)     NA in lme4
absolute    object@resp$aic()    -2*logLik(object)
This table requires two caveats:
If the family involves a scale parameter (e.g. Gamma) then
object@resp$aic() - 2 * getME(object, "devcomp")$dims["useSc"]
is required for the absolute-conditional case.
If adaptive Gauss-Hermite quadrature is used, then logLik(object) is currently only
proportional to the absolute-unconditional log-likelihood.
For more information about this topic see the misc/logLikGLMM directory in the package source.
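As a sketch of how the three available entries of the table might be extracted in practice, the following uses the cbpp data from lme4 with a binomial family (no scale parameter) and the default nAGQ = 1, so neither caveat applies. The object names are illustrative.

```r
library(lme4)

gm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
            data = cbpp, family = binomial)

devs <- c(
  relative_conditional   = deviance(gm),                # relative to a saturated model
  absolute_conditional   = gm@resp$aic(),               # conditional on the random effects
  absolute_unconditional = -2 * as.numeric(logLik(gm))  # random effects integrated out
)
devs
```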
Slots
resp: A reference class object for an lme4 response module (lmResp-class).
Gp: See getME.
call: The matched call.
frame: The model frame containing all of the variables required to parse the model formula.
flist: See getME.
cnms: See getME.
lower: See getME.
theta: Covariance parameter vector.
beta: Fixed effects coefficients.
u: Conditional mode of spherical random effects coefficients.
devcomp: See getME.
pp: A reference class object for an lme4 predictor module (merPredD-class).
optinfo: List containing information about the nonlinear optimization.
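The slots can be read directly with @, although the accessor functions are usually preferable. A minimal sketch, assuming a model fitted to the sleepstudy data (the name fm is illustrative):

```r
library(lme4)

fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

## slot access and the corresponding accessors hold the same numbers;
## the accessors add names
all.equal(fm@theta, unname(getME(fm, "theta")))
all.equal(fm@beta,  unname(fixef(fm)))

## one spherical random effect per column of the random-effects model matrix
length(fm@u)

## the predictor and response modules are reference class objects
class(fm@pp)
class(fm@resp)
```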
See Also
lmer, glmer, nlmer, merPredD, lmerResp, glmResp, nlsResp
Other methods for merMod objects documented elsewhere include: fortify.merMod, drop1.merMod,
isLMM.merMod, isGLMM.merMod, isNLMM.merMod, isREML.merMod, plot.merMod, predict.merMod,
profile.merMod, ranef.merMod, refit.merMod, refitML.merMod, residuals.merMod, sigma.merMod,
simulate.merMod, summary.merMod.
Examples
showClass("merMod")
methods(class = "merMod") ## over 30 (S3) methods available
## -> example(lmer) for an example of vcov.merMod()
merPredD Generator object for the merPredD class
Description
The generator object for the merPredD reference class. Such an object is primarily used through its
new method.
Usage
merPredD(...)
Arguments
... List of arguments (see Note).
Note
merPredD(...) is a short form of new("merPredD", ...) to create a new merPredD object and
the ... must be named arguments (X, Zt, Lambdat, Lind, theta, n):
X: dense model matrix for the fixed-effects parameters, to be stored in the X field.
Zt: transpose of the sparse model matrix for the random effects. It is stored in the Zt field.
Lambdat: transpose of the sparse lower triangular relative variance factor (stored in the Lambdat
field).
Lind: integer vector of the same length as the x slot in the Lambdat field. Its elements should be in
the range 1 to the length of the theta field.
theta: numeric vector of variance component parameters (stored in the theta field).
n: sample size, usually nrow(X).
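One way to obtain a consistent set of these named arguments is to take them from the output of lFormula. The sketch below assumes that route and the sleepstudy data; the names lmod, rt and pp are illustrative, and in ordinary use the predictor module is created for you by lmer.

```r
library(lme4)

## parse formula and data; reTrms holds Zt, Lambdat, Lind and theta
lmod <- lFormula(Reaction ~ Days + (Days | Subject), sleepstudy)
rt   <- lmod$reTrms

## all arguments must be named
pp <- merPredD(X = lmod$X, Zt = rt$Zt, Lambdat = rt$Lambdat,
               Lind = rt$Lind, theta = rt$theta, n = nrow(lmod$X))

is(pp, "merPredD")
pp$theta   # starting values of the variance component parameters
```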
See Also
The class definition, merPredD, also for examples.
merPredD-class Class "merPredD" - a Dense Predictor Reference Class
Description
A reference class (see mother class definition "envRefClass") for a mixed-effects model predictor
module with a dense model matrix for the fixed-effects parameters. The reference class is associated
with a C++ class of the same name. As is customary, the generator object, merPredD, for the class
has the same name as the class.
Note
Objects from this reference class correspond to objects in a C++ class. Methods are invoked on
the C++ class object using the external pointer in the Ptr field. When saving such an object the
external pointer is converted to a null pointer, which is why there are redundant fields containing
enough information as R objects to be able to regenerate the C++ object. The convention is that a
field whose name begins with an upper-case letter is an R object and the corresponding field, whose
name begins with the lower-case letter, is a method. References to the external pointer should be
through the method, not directly through the Ptr field.
See Also
lmer, glmer, nlmer, merPredD, merMod.
Examples
showClass("merPredD")
pp <- slot(lmer(Yield ~ 1|Batch, Dyestuff), "pp")
stopifnot(is(pp, "merPredD"))
str(pp) # an overview of all fields and methods' names.
mkdevfun Create Deviance Evaluation Function from Predictor and Response
Module
Description
From a merMod object create an R function that takes a single argument, which is the new parameter
value, and returns the deviance.
Usage
mkdevfun(rho, nAGQ = 1L, maxit = 100, verbose = 0, control = list())
Arguments
rho an environment containing pp, a prediction module, typically of class merPredD
and resp, a response module, e.g., of class lmerResp.
nAGQ scalar integer - the number of adaptive Gauss-Hermite quadrature points. A
value of 0 indicates that both the fixed-effects parameters and the random effects
are optimized by the iteratively reweighted least squares algorithm.
maxit scalar integer, currently only for GLMMs: the maximal number of Pwrss update
iterations.
verbose scalar logical: print verbose output?
control list of control parameters, a subset of those specified by lmerControl (tolPwrss
and compDev for GLMMs, tolPwrss for NLMMs)
Details
The function returned by mkdevfun evaluates the deviance of the model represented by the predictor
module, pp, and the response module, resp.
For lmer model objects the argument of the resulting function is the variance component parameter,
theta, with lower bound. For glmer or nlmer model objects with nAGQ = 0 the argument is also
theta. However, when nAGQ > 0, the argument is c(theta, beta).
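For an lmer model with a vector-valued random effect, the argument of the deviance function is the full theta vector. The sketch below assumes the sleepstudy data and reads the lower bounds from the function's environment; relying on an element named lower in that environment is an assumption about internals rather than a documented interface.

```r
library(lme4)

## deviance function instead of a fitted model
dd  <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, devFunOnly = TRUE)
rho <- environment(dd)

## lower bounds on theta (assumed to be stored as 'lower' in the environment)
lower <- rho$lower

## evaluate at a starting value, then minimise subject to the bounds
theta0 <- c(1, 0, 1)
d0     <- dd(theta0)
opt    <- minqa::bobyqa(theta0, dd, lower = lower)

opt$par    # estimated theta
opt$fval   # REML criterion at the optimum
```

The diagonal elements of the relative Cholesky factor are bounded below by zero, while the off-diagonal element is unconstrained, which is why a box-constrained optimizer is needed.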
Value
A function of one numeric argument.
See Also
lmer, glmer and nlmer
Examples
(dd <- lmer(Yield ~ 1|Batch, Dyestuff, devFunOnly=TRUE))
dd(0.8)
minqa::bobyqa(1, dd, 0)
mkMerMod Create a ’merMod’ Object
Description
Create an object of (a subclass of) class merMod from the environment of the objective function and
the value returned by the optimizer.
Usage
mkMerMod(rho, opt, reTrms, fr, mc, lme4conv = NULL)
Arguments
rho the environment of the objective function
opt the optimization result returned by the optimizer (a list: see lmerControl for
required elements)
reTrms random effects structure from the calling function (see mkReTrms for required
elements)
fr model frame (see model.frame)
mc matched call from the calling function
lme4conv lme4-specific convergence information (results of checkConv)
Value
an object from a class that inherits from merMod.
mkRespMod Create an lmerResp, glmResp or nlsResp instance
Description
Create an lmerResp, glmResp or nlsResp instance
Usage
mkRespMod(fr, REML = NULL, family = NULL, nlenv = NULL,
nlmod = NULL, ...)
Arguments
fr a model frame
REML logical scalar, value of REML for an lmerResp instance
family the optional glm family (glmResp only)
nlenv the nonlinear model evaluation environment (nlsResp only)
nlmod the nonlinear model function (nlsResp only)
... where to look for response information if fr is missing. Can contain a model
response, y, offset, offset, and weights, weights.
Value
an lmerResp or glmResp or nlsResp instance
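A minimal sketch of the linear mixed model case, assuming the model frame produced by lFormula for the sleepstudy data. With family and nlenv left at NULL, the result is expected to be an lmerResp instance; the names lmod and rsp are illustrative.

```r
library(lme4)

## model frame as produced by the formula-parsing step
lmod <- lFormula(Reaction ~ Days + (Days | Subject), sleepstudy)

## no family and no nonlinear model: a response module for an LMM
rsp <- mkRespMod(lmod$fr)

is(rsp, "lmerResp")
length(rsp$y)   # the response vector is stored in the y field
```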
See Also
Other utilities: findbars, mkReTrms, nlformula, nobars, subbars
mkReTrms Make Random Effect Terms: Create Z, Lambda, Lind, etc.
Description
From the result of findbars applied to a model formula and the evaluation frame fr, create the
model matrix Zt, etc, associated with the random-effects terms.
Usage
mkReTrms(bars, fr, drop.unused.levels=TRUE)
Arguments
bars a list of parsed random-effects terms
fr a model frame in which to evaluate these terms
drop.unused.levels
(logical) drop unused factor levels? (experimental)
Value
a list with components
Zt transpose of the sparse model matrix for the random effects
theta initial values of the covariance parameters
Lind an integer vector of indices determining the mapping of the elements of the
theta vector to the "x" slot of Lambdat
Gp
lower lower bounds on the covariance parameters
Lambdat transpose of the sparse relative covariance factor
flist list of grouping factors used in the random-effects terms
cnms a list of column names of the random effects according to the grouping factors
Ztlist list of components of the transpose of the random-effects model matrix, sepa-
rated by random-effects term
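The relationship between theta, Lind and Lambdat can be inspected directly. The sketch below assumes the sleepstudy data and a single correlated random-effects term; the names form, bars, fr and rt are illustrative.

```r
library(lme4)

form <- Reaction ~ Days + (Days | Subject)
bars <- findbars(form)                              # parsed random-effects terms
fr   <- model.frame(subbars(form), data = sleepstudy)
rt   <- mkReTrms(bars, fr)

## Lind maps the elements of theta onto the nonzero entries of Lambdat
all(rt$Lambdat@x == rt$theta[rt$Lind])

## one row of Zt per random effect, one column per observation
dim(rt$Zt)

## one lower bound per covariance parameter
rbind(theta = rt$theta, lower = rt$lower)
```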
See Also
Other utilities: findbars, mkRespMod, nlformula, nobars, subbars. getME can retrieve these
components from a fitted model, although their values and/or forms may be slightly different in the
final fitted model from their original values as returned from mkReTrms.
Examples
data("Pixel", package="nlme")
mform <- pixel ~ day + I(day^2) + (day | Dog) + (1 | Side/Dog)
(bar.f <- findbars(mform)) # list with 3 terms
mf <- model.frame(subbars(mform),data=Pixel)
rt <- mkReTrms(bar.f,mf)
names(rt)
mkSimulateTemplate Make templates suitable for guiding mixed model simulations
Description
Make data and parameter templates suitable for guiding mixed model simulations, by specifying
a model formula and other information (EXPERIMENTAL). Most useful for simulating balanced
designs and for getting started on unbalanced simulations.
Usage
mkParsTemplate(formula, data)
mkDataTemplate(formula, data, nGrps = 2, nPerGrp = 1, rfunc = NULL, ...)
Arguments
formula A mixed model formula (see lmer).
data A data frame containing the names in formula.
nGrps Number of levels of a grouping factor.
nPerGrp Number of observations per level.
rfunc Function for generating covariate data (e.g. rnorm).
... Additional parameters for rfunc.
See Also
These functions are designed to be used with simulate.merMod.
mkVarCorr Make Variance and Correlation Matrices from theta
Description
Make variance and correlation matrices from theta
Usage
mkVarCorr(sc, cnms, nc, theta, nms)
Arguments
sc scale factor (residual standard deviation).
cnms component names.
nc numeric vector: number of terms in each RE component.
theta theta vector (lower-triangle of Cholesky factors).
nms component names (FIXME: nms/cnms redundant: nms=names(cnms)?)
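The arguments can be filled in from a fitted model, as in the sketch below. It assumes the sleepstudy data and takes cnms and theta from the slots of the fit; that this reproduces the covariance matrix reported by VarCorr is our understanding of how the two functions relate, not something stated in this help page.

```r
library(lme4)

fm   <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
cnms <- fm@cnms

vc <- mkVarCorr(sc    = sigma(fm),               # residual standard deviation
                cnms  = cnms,                    # column names per term
                nc    = vapply(cnms, length, 1L),# number of columns per term
                theta = fm@theta,
                nms   = names(cnms))

vc$Subject   # covariance matrix of the (Intercept) and Days random effects
all.equal(vc$Subject, VarCorr(fm)$Subject, check.attributes = FALSE)
```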
Value
A matrix
See Also
VarCorr
modular Modular Functions for Mixed Model Fits
Description
Modular functions for mixed model fits
Usage
lFormula(formula, data = NULL, REML = TRUE, subset,
weights, na.action, offset, contrasts = NULL,
control = lmerControl(), ...)
mkLmerDevfun(fr, X, reTrms, REML = TRUE, start = NULL,
verbose = 0, control = lmerControl(), ...)
optimizeLmer(devfun,
optimizer = formals(lmerControl)$optimizer,
restart_edge = formals(lmerControl)$restart_edge,
boundary.tol = formals(lmerControl)$boundary.tol,
start = NULL, verbose = 0L,
control = list(), ...)
glFormula(formula, data = NULL, family = gaussian,
subset, weights, na.action, offset, contrasts = NULL,
mustart, etastart, control = glmerControl(), ...)
mkGlmerDevfun(fr, X, reTrms, family, nAGQ = 1L,
verbose = 0L, maxit = 100L, control = glmerControl(), ...)
optimizeGlmer(devfun, optimizer = "bobyqa",
restart_edge = FALSE,
boundary.tol = formals(glmerControl)$boundary.tol,
verbose = 0L, control = list(),
nAGQ = 1L, stage = 1, start = NULL, ...)
updateGlmerDevfun(devfun, reTrms, nAGQ = 1L)
Arguments
formula a two-sided linear formula object describing both the fixed-effects and
random-effects parts of the model, with the response on the left of a ~ operator and the
terms, separated by + operators, on the right. Random-effects terms are
distinguished by vertical bars ("|") separating expressions for design matrices from
grouping factors.
data an optional data frame containing the variables named in formula. By default
the variables are taken from the environment from which lmer is called. While
data is optional, the package authors strongly recommend its use, especially
when later applying methods such as update and drop1 to the fitted model
(such methods are not guaranteed to work properly if data is omitted). If data
is omitted, variables will be taken from the environment of formula (if specified
as a formula) or from the parent frame (if specified as a character vector).
REML (logical) indicating to fit restricted maximum likelihood model.
subset an optional expression indicating the subset of the rows of data that should be
used in the fit. This can be a logical vector, or a numeric vector indicating which
observation numbers are to be included, or a character vector of the row names
to be included. All observations are included by default.
weights an optional vector of ‘prior weights’ to be used in the fitting process. Should be
NULL or a numeric vector.
na.action a function that indicates what should happen when the data contain NAs. The de-
fault action (na.omit, inherited from the ’factory fresh’ value of getOption("na.action"))
strips any observations with any missing values in any variables.
offset this can be used to specify an a priori known component to be included in the
linear predictor during fitting. This should be NULL or a numeric vector of length
equal to the number of cases. One or more offset terms can be included in the
formula instead or as well, and if more than one is specified their sum is used.
See model.offset.
contrasts an optional list. See the contrasts.arg of model.matrix.default.
control a list giving
for [g]lFormula: all options for running the model, see lmerControl;
for mkLmerDevfun, mkGlmerDevfun: options for the inner optimization step;
for optimizeLmer and optimizeGlmer: control parameters for nonlinear optimizer
(typically inherited from the ... argument to lmerControl).
fr A model frame containing the variables needed to create an lmerResp or glmResp
instance.
X fixed-effects design matrix
reTrms information on random effects structure (see mkReTrms).
start starting values (see lmer)
verbose print output?
maxit maximal number of Pwrss update iterations.
devfun a deviance function, as generated by mkLmerDevfun
nAGQ number of Gauss-Hermite quadrature points
stage optimization stage (1: nAGQ=0, optimize over theta only; 2: nAGQ possibly
>0, optimize over theta and beta)
optimizer character - name of optimizing function(s). A character vector or list of
functions: length 1 for lmer or glmer, possibly length 2 for glmer. The built-in
optimizers are "Nelder_Mead" and "bobyqa" (from the minqa package). Any
minimizing function that allows box constraints can be used provided that it
1. takes input parameters fn (function to be optimized), par (starting parameter
values), lower (lower bounds) and control (control parameters, passed
through from the control argument) and
2. returns a list with (at least) elements par (best-fit parameters), fval (best-fit
function value), conv (convergence code) and (optionally) message
(informational message, or explanation of convergence failure).
Special provisions are made for bobyqa, Nelder_Mead, and optimizers wrapped
in the optimx package; to use optimx optimizers (including L-BFGS-B from
base optim and nlminb), pass the method argument to optim in the control
argument.
For glmer, if length(optimizer)==2, the first element will be used for the
preliminary (random effects parameters only) optimization, while the second
will be used for the final (random effects plus fixed effect parameters) phase.
See modular for more information on these two phases.
restart_edge see lmerControl
boundary.tol see lmerControl
family a GLM family; see glm and family.
mustart optional starting values on the scale of the conditional mean; see glm for details.
etastart optional starting values on the scale of the unbounded predictor; see glm for
details.
... other potential arguments; for optimizeLmer and optimizeGlmer, these are
passed to internal function optwrap, which has relevant parameters calc.derivs
and use.last.params (see lmerControl).
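The two requirements on a user-supplied optimizer can be met with a thin wrapper. The sketch below adapts nlminb from the stats package; the wrapper name nlminb_wrap is hypothetical, and the upper argument is included on the assumption that the fitting code may pass upper bounds as well as lower bounds.

```r
library(lme4)

## Hypothetical wrapper: exposes fn, par, lower and control as inputs and
## returns par, fval, conv and message, as required of a custom optimizer
nlminb_wrap <- function(fn, par, lower, upper, control = list(), ...) {
  res <- nlminb(start = par, objective = fn,
                lower = lower, upper = upper, control = control, ...)
  list(par     = res$par,
       fval    = res$objective,
       conv    = res$convergence,
       message = res$message)
}

fm_custom  <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy,
                   control = lmerControl(optimizer = nlminb_wrap))
fm_default <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

## the two fits should agree closely on the fixed effects
all.equal(fixef(fm_custom), fixef(fm_default), tolerance = 1e-4)
```

Passing the function itself, rather than its name as a character string, avoids depending on where the name is looked up.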
Details
These functions make up the internal components of an [gn]lmer fit.
[g]lFormula takes the arguments that would normally be passed to [g]lmer, checking for
errors and processing the formula and data input to create a list of objects required to fit a
mixed model.
mk(Gl|L)merDevfun takes the output of the previous step (minus the formula component)
and creates a deviance function
optimize(Gl|L)mer takes a deviance function and optimizes over theta (or over theta and
beta, if stage is set to 2 for optimizeGlmer)
updateGlmerDevfun takes the first stage of a GLMM optimization (with nAGQ=0, optimizing
over theta only) and produces a second-stage deviance function
mkMerMod takes the environment of a deviance function, the results of an optimization, a list
of random-effect terms, a model frame, and a model call and produces a [g]lmerMod object.
Value
lFormula and glFormula return a list containing components:
fr model frame
X fixed-effect design matrix
reTrms list containing information on random effects structure: result of mkReTrms
REML (lFormula only): logical indicating if restricted maximum likelihood was used (Copy of
argument.)
mkLmerDevfun and mkGlmerDevfun return a function to calculate deviance (or restricted deviance)
as a function of the theta (random-effect) parameters. updateGlmerDevfun returns a function to
calculate the deviance as a function of a concatenation of theta and beta (fixed-effect) parameters.
These deviance functions have an environment containing objects required for their evaluation.
CAUTION: The environment of functions returned by mk(Gl|L)merDevfun contains reference
class objects (see ReferenceClasses, merPredD-class, lmResp-class), which behave in ways
that may surprise many users. For example, if the output of mk(Gl|L)merDevfun is naively copied,
then modifications to the original will also appear in the copy (and vice versa). To avoid this
behavior one must make a deep copy (see ReferenceClasses for details).
optimizeLmer and optimizeGlmer return the results of an optimization.
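The caution about reference class objects can be made concrete with a short sketch. It assumes the sleepstudy data; devfun2 is an illustrative name for a naive copy of the deviance function.

```r
library(lme4)

lmod    <- lFormula(Reaction ~ Days + (Days | Subject), sleepstudy)
devfun  <- do.call(mkLmerDevfun, lmod)
devfun2 <- devfun   # naive copy: a second name for the same closure

## both functions share one environment, and therefore the same
## predictor (pp) and response (resp) reference class objects
same_env <- identical(environment(devfun), environment(devfun2))
same_env

## evaluating devfun changes the internal state of pp, and that change
## is visible through the environment of devfun2 as well
devfun(c(2, 0, 1))
theta_seen <- environment(devfun2)$pp$theta
theta_seen
```

Assigning the function to a new name therefore does not isolate it from later evaluations of the original.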
Examples
### Fitting a linear mixed model in 4 modularized steps
## 1. Parse the data and formula:
lmod <- lFormula(Reaction ~ Days + (Days|Subject), sleepstudy)
names(lmod)
## 2. Create the deviance function to be optimized:
(devfun <- do.call(mkLmerDevfun, lmod))
ls(environment(devfun)) # the environment of 'devfun' contains objects
# required for its evaluation
## 3. Optimize the deviance function:
opt <- optimizeLmer(devfun)
opt[1:3]
## 4. Package up the results:
mkMerMod(environment(devfun), opt, lmod$reTrms, fr = lmod$fr)
### Same model in one line
lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
### Fitting a generalized linear mixed model in six modularized steps
## 1. Parse the data and formula:
glmod <- glFormula(cbind(incidence, size - incidence) ~ period + (1 | herd),
data = cbpp, family = binomial)
names(glmod)
## 2. Create the deviance function for optimizing over theta:
(devfun <- do.call(mkGlmerDevfun, glmod))
ls(environment(devfun)) # the environment of devfun contains lots of info
## 3. Optimize over theta using a rough approximation (i.e. nAGQ = 0):
(opt <- optimizeGlmer(devfun))
## 4. Update the deviance function for optimizing over theta and beta:
(devfun <- updateGlmerDevfun(devfun, glmod$reTrms))
## 5. Optimize over theta and beta:
opt <- optimizeGlmer(devfun, stage=2)
opt[1:3]
## 6. Package up the results:
mkMerMod(environment(devfun), opt, glmod$reTrms, fr = glmod$fr)
### Same model in one line
glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
data = cbpp, family = binomial)
NelderMead Nelder-Mead Optimization of Parameters, Possibly (Box) Constrained
Description
Nelder-Mead optimization of parameters, allowing optimization subject to box constraints (contrary
to the default, method = "Nelder-Mead", in R’s optim()), and using reverse communications.
Usage
Nelder_Mead(fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf, n),
control = list())
Arguments
fn a function of a single numeric vector argument returning a numeric scalar.
par numeric vector of starting values for the parameters.
lower numeric vector of lower bounds (elements may be -Inf).
upper numeric vector of upper bounds (elements may be Inf).
control a named list of control settings. Possible settings are
iprint numeric scalar - frequency of printing evaluation information. Defaults
to 0 indicating no printing.
maxfun numeric scalar - maximum number of function evaluations allowed
(default:10000).
FtolAbs numeric scalar - absolute tolerance on change in function values (default:
1e-5)
FtolRel numeric scalar - relative tolerance on change in function values (default:
1e-15)
XtolRel numeric scalar - relative tolerance on change in parameter values (default:
1e-7)
MinfMax numeric scalar - maximum value of the minimum (default:
.Machine$double.xmin)
xst numeric vector of initial step sizes to establish the simplex - all elements
must be non-zero (default: rep(0.02,length(par)))
xt numeric vector of tolerances on the parameters (default: xst*5e-4)
verbose numeric value: 0=no printing, 1=print every 20 evaluations, 2=print
every 10 evaluations, 3=print every evaluation. Sets ‘iprint’, if specified, but
does not override it.
warnOnly a logical indicating if non-convergence (codes -1,-2,-3) should not
stop(.), but rather only call warning and return a result which might be
inspected. Defaults to FALSE, i.e., stop on non-convergence.
Value
a list with components
fval numeric scalar - the minimum function value achieved
par numeric vector - the value of x providing the minimum
convergence integer valued scalar, if not 0, an error code:
-4 nm_evals: maximum evaluations reached
-3 nm_forced: ?
-2 nm_nofeasible: cannot generate a feasible simplex
-1 nm_x0notfeasible: initial x is not feasible (?)
0 successful convergence
message a string specifying the kind of convergence.
control the list of control settings after substituting for defaults.
feval the number of function evaluations.
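The box constraints are what distinguish this optimizer from optim's Nelder-Mead method. As a sketch, the Rosenbrock function is minimised below with an upper bound of 0.5 on the first parameter, which excludes the unconstrained minimum at (1, 1); analytically the constrained minimum lies on the boundary at (0.5, 0.25). The bounds and the use of warnOnly = TRUE are choices made for this illustration.

```r
library(lme4)

## Rosenbrock "banana" function
fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
p0 <- c(-1.2, 1)   # feasible starting value

res <- Nelder_Mead(fr, p0,
                   lower = c(-2, -2), upper = c(0.5, 2),
                   control = list(warnOnly = TRUE))

res$par          # should respect the bounds
res$fval         # function value at the returned parameters
res$convergence  # 0 indicates successful convergence
```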
See Also
The NelderMead class definition and generator function.
Examples
fr <- function(x) { ## Rosenbrock Banana function
x1 <- x[1]
x2 <- x[2]
100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
p0 <- c(-1.2, 1)
oo <- optim(p0, fr) ## also uses Nelder-Mead by default
o. <- Nelder_Mead(fr, p0)
o.1 <- Nelder_Mead(fr, p0, control=list(verbose=1)) # -> some iteration output
stopifnot(identical(o.[1:4], o.1[1:4]),
all.equal(o.$par, oo$par, tolerance=1e-3)) # diff: 0.0003865
o.2 <- Nelder_Mead(fr, p0, control=list(verbose=3, XtolRel=1e-15, FtolAbs= 1e-14))
all.equal(o.2[-5], o.1[-5], tolerance=1e-15) # TRUE, unexpectedly
NelderMead-class Class "NelderMead" of Nelder-Mead optimizers and its Generator
Description
Class "NelderMead" is a reference class for a Nelder-Mead simplex optimizer allowing box con-
straints on the parameters and using reverse communication.
The NelderMead() function conveniently generates such objects.
Usage
NelderMead(...)
Arguments
... Argument list (see Note below).
Methods
NelderMead$new(lower, upper, xst, x0, xt)
Create a new NelderMead object
Extends
All reference classes extend and inherit methods from "envRefClass".
Note
This is the default optimizer for the second stage of glmer and nlmer fits. We found that it was
more reliable and often faster than more sophisticated optimizers.
Arguments to NelderMead() and the new method must be named arguments:
lower numeric vector of lower bounds - elements may be -Inf.
upper numeric vector of upper bounds - elements may be Inf.
xst numeric vector of initial step sizes to establish the simplex - all elements must be non-zero.
x0 numeric vector of starting values for the parameters.
xt numeric vector of tolerances on the parameters.
References
Based on code in the NLopt collection.
See Also
Nelder_Mead, the typical “constructor”. Further, glmer, nlmer
Examples
showClass("NelderMead")
ngrps Number of Levels of a Factor or a "merMod" Model
Description
Returns the number of levels of a factor or a set of factors, currently e.g., for each of the grouping
factors of lmer(), glmer(), etc.
Usage
ngrps(object, ...)
Arguments
object an R object, see Details.
... currently ignored.
Details
Currently there are methods for objects of class merMod, i.e., the result of lmer() etc, and factor
objects.
Value
The number of levels (of a factor) or vector of number of levels for each “grouping factor” of a
Examples
ngrps(factor(seq(1,10,2)))
ngrps(lmer(Reaction ~ 1|Subject, sleepstudy))
## A named vector if there's more than one grouping factor :
ngrps(lmer(strength ~ (1|batch/cask), Pastes))
## cask:batch batch
## 30 10