Package ‘lme4’

August 25, 2015

Version 1.1-10

Title Linear Mixed-Effects Models using 'Eigen' and S4

Maintainer Ben Bolker <bbolker+lme4@gmail.com>

Contact LME4 Authors <lme4-authors@lists.r-forge.r-project.org>

Author Douglas Bates [aut], Martin Maechler [aut],

Ben Bolker [aut, cre], Steven Walker [aut],

Rune Haubo Bojesen Christensen [ctb],

Henrik Singmann [ctb], Bin Dai [ctb],

Gabor Grothendieck [ctb]

Description Fit linear and generalized linear mixed-effects models.

The models and their components are represented using S4 classes and

methods. The core computational algorithms are implemented using the

'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".

Depends R (>= 3.0.2), Matrix (>= 1.1.1), methods, stats

LinkingTo Rcpp (>= 0.10.5), RcppEigen

Imports graphics, grid, splines, utils, parallel, MASS, nlme, lattice,

minqa (>= 1.1.15), nloptr (>= 1.0.4)

Suggests knitr, boot, PKPDmodels, MEMSS, testthat (>= 0.8.1), ggplot2,

mlmRev, optimx (>= 2013.8.6), gamm4, pbkrtest, HSAUR2, numDeriv

VignetteBuilder knitr

LazyData yes

License GPL (>=2)

URL https://github.com/lme4/lme4/ http://lme4.r-forge.r-project.org/

BugReports https://github.com/lme4/lme4/issues

R topics documented:

lme4-package
Arabidopsis
bootMer
cake
cbpp
confint.merMod
convergence
devcomp
drop1.merMod
dummy
Dyestuff
expandDoubleVerts
factorize
findbars
fixef
fortify
getME
GHrule
glmer
glmer.nb
glmerLaplaceHandle
glmFamily
glmFamily-class
golden-class
GQdk
grouseticks
hatvalues.merMod
InstEval
isNested
isREML
lmer
lmerControl
lmList
lmList4-class
lmResp
lmResp-class
merMod-class
merPredD
merPredD-class
mkdevfun
mkMerMod
mkRespMod
mkReTrms
mkSimulateTemplate
mkVarCorr
modular
NelderMead
NelderMead-class
ngrps
nlformula
nlmer
nloptwrap
nobars
Pastes
Penicillin
plot.merMod
plots.thpr
predict.merMod
profile-methods
prt-utilities
pvalues
ranef
refit
refitML
rePos
rePos-class
residuals.merMod
sigma
simulate.merMod
sleepstudy
subbars
troubleshooting
VarCorr
varianceProf
vcconv
VerbAgg
Index

lme4-package Linear, generalized linear, and nonlinear mixed models

Description

lme4 provides functions for fitting and analyzing mixed models: linear (lmer), generalized linear (glmer) and nonlinear (nlmer).
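As a quick orientation, the sketch below fits one linear and one generalized linear mixed model using the sleepstudy and cbpp data sets documented later in this manual. The object names fm_lin and fm_glm are illustrative only, and nlmer is not shown because it additionally needs a nonlinear model function and starting values.

```r
library(lme4)

## Linear mixed model: random intercept and slope for each subject
fm_lin <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

## Generalized linear mixed model: binomial response,
## random intercept for each herd
fm_glm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                family = binomial, data = cbpp)

fixef(fm_lin)    # fixed-effect estimates
VarCorr(fm_glm)  # estimated random-effect standard deviations
```

Both calls return objects that inherit from class merMod, so the same extractor functions (fixef, ranef, VarCorr, and so on) apply to either fit.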

Differences between nlme and lme4

lme4 covers approximately the same ground as the earlier nlme package. The most important

differences are:

• lme4 uses modern, efficient linear algebra methods as implemented in the Eigen C++ library, and uses reference classes to avoid undue copying of large objects; it is therefore likely to be faster and more memory-efficient than nlme.

• lme4 includes generalized linear mixed model (GLMM) capabilities, via the glmer function.

• lme4 does not currently implement nlme's features for modeling heteroscedasticity and correlation of residuals.

• lme4 does not currently offer the same flexibility as nlme for composing complex variance-covariance structures, but it does implement crossed random effects in a way that is both easier for the user and much faster.

• lme4 offers built-in facilities for likelihood profiling and parametric bootstrapping.

• lme4 is designed to be more modular than nlme, making it easier for downstream package developers and end-users to re-use its components for extensions of the basic mixed model framework. It also allows more flexibility for specifying different functions for optimizing over the random-effects variance-covariance parameters.

• lme4 is not (yet) as well-documented as nlme.

Differences between current (1.0.+) and previous versions of lme4

• [gn]lmer now produces objects of class merMod rather than class mer as before.

• The new version uses a combination of S3 and reference classes (see ReferenceClasses, merPredD-class, and lmResp-class) as well as S4 classes; partly for this reason it is more interoperable with nlme.

• The internal structure of [gn]lmer is now more modular, allowing finer control of the different steps of argument checking; construction of design matrices and data structures; parameter estimation; and construction of the final merMod object (see modular).

• Profiling and parametric bootstrapping are new in the current version.

• The new version of lme4 does not provide an mcmcsamp (post-hoc MCMC sampling) method, because this was deemed to be unreliable. Alternatives for computing p-values include parametric bootstrapping (bootMer) or methods implemented in the pbkrtest package and leveraged by the lmerTest package and the Anova function in the car package (see pvalues for more details).

Caveats and trouble-shooting

• Some users who have previously installed versions of the RcppEigen and minqa packages may

encounter segmentation faults (!!); the solution is to make sure to re-install these packages

before installing lme4. (Because the problem is not with the explicit version of the packages,

but with running packages that were built with different versions of Rcpp in conjunction with

each other, simply making sure you have the latest version, or using update.packages, will

not necessarily solve the problem; you must actually re-install the packages. The problem is

most likely with minqa.)

Arabidopsis Arabidopsis clipping/fertilization data

Description

Data on genetic variation in responses to fertilization and simulated herbivory in Arabidopsis

Usage

data("Arabidopsis")

Format

A data frame with 625 observations on the following 8 variables.

reg region: a factor with 3 levels NL (Netherlands), SP (Spain), SW (Sweden)

popu population: a factor with the form n.R representing a population in region R

gen genotype: a factor with 24 (numeric-valued) levels

rack a nuisance factor with 2 levels, one for each of two greenhouse racks

nutrient fertilization treatment/nutrient level (1, minimal nutrients or 8, added nutrients)

amd simulated herbivory or "clipping" (apical meristem damage): unclipped (baseline) or clipped

status a nuisance factor for germination method (Normal, Petri.Plate, or Transplant)

total.fruits total fruit set per plant (integer)

Source

From Josh Banta

References

Joshua A. Banta, Martin H. H. Stevens, and Massimo Pigliucci (2010) A comprehensive test of the 'limiting resources' framework applied to plant tolerance to apical meristem damage. Oikos 119(2), 359–369; http://dx.doi.org/10.1111/j.1600-0706.2009.17726.x

Examples

data(Arabidopsis)

summary(Arabidopsis[,"total.fruits"])

table(gsub("[0-9].","",levels(Arabidopsis[,"popu"])))

library(lattice)

stripplot(log(total.fruits+1) ~ amd|nutrient, data = Arabidopsis,
          groups = gen,
          strip=strip.custom(strip.names=c(TRUE,TRUE)),
          type=c('p','a'), ## points and panel-average value --
                           ## see ?panel.xyplot
          scales=list(x=list(rot=90)),
          main="Panel: nutrient, Color: genotype")

bootMer Model-based (Semi-)Parametric Bootstrap for Mixed Models

Description

Perform a model-based (semi-)parametric bootstrap for mixed models.

Usage

bootMer(x, FUN, nsim = 1, seed = NULL, use.u = FALSE,

type = c("parametric", "semiparametric"),

verbose = FALSE, .progress = "none", PBargs = list(),

parallel = c("no", "multicore", "snow"),

ncpus = getOption("boot.ncpus", 1L), cl = NULL)

Arguments

x a fitted merMod object: see lmer, glmer, etc.

FUN a function taking a fitted merMod object as input and returning the statistic of interest, which must be a (possibly named) numeric vector.

nsim number of simulations, positive integer; the bootstrap B (or R).

seed optional argument to set.seed.

use.u logical; if TRUE, the spherical random effects are held at their estimated values and all inference is conditional on these values. If FALSE, new normal deviates are drawn for them (see Details).

type character string specifying the type of bootstrap, "parametric" or "semiparametric"; partial matching is allowed.

verbose logical indicating whether progress output should be printed.

.progress character string giving the type of progress bar to display. Default is "none"; the function will look for a relevant *ProgressBar function, so "txt" will work in general; "tk" is available if the tcltk package is loaded; or "win" on Windows systems. Progress bars are disabled (with a message) for parallel operation.

PBargs a list of additional arguments to the progress bar function (the package authors like list(style=3)).

parallel the type of parallel operation to be used (if any). If missing, the default is taken from the option "boot.parallel" (and if that is not set, "no").

ncpus integer: number of processes to be used in parallel operation: typically one would choose this to be the number of available CPUs.

cl an optional parallel or snow cluster for use if parallel = "snow". If not supplied, a cluster on the local machine is created for the duration of the boot call.

Details

The semi-parametric variant is only partially implemented, and we only provide a method for lmer and glmer results.

The working name for bootMer() was “simulestimate()”, as it is an extension of simulate (see simulate.merMod), but we want to emphasize its potential for valid inference.

• If use.u is FALSE and type is "parametric", each simulation generates new values of both the “spherical” random effects u and the i.i.d. errors ε, using rnorm() with parameters corresponding to the fitted model x.

• If use.u is TRUE and type=="parametric", only the i.i.d. errors (or, for GLMMs, response values drawn from the appropriate distributions) are resampled, with the values of u staying fixed at their estimated values.

• If use.u is TRUE and type=="semiparametric", the i.i.d. errors are sampled from the distribution of (response) residuals. (For GLMMs, the resulting sample will no longer have the same properties as the original sample, and the method may not make sense; a warning is generated.) The semiparametric bootstrap is currently an experimental feature, and therefore may not be stable.

• The case where use.u is FALSE and type=="semiparametric" is not implemented; Morris (2002) suggests that resampling from the estimated values of u is not good practice.
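The first two cases above can be compared side by side. The following is a minimal sketch, assuming the Dyestuff data shipped with lme4; the helper name summ and the object names are illustrative, and nsim = 20 is far too small for real inference and is used only to keep the run short.

```r
library(lme4)

fm <- lmer(Yield ~ 1 | Batch, Dyestuff, REML = FALSE)

## Statistic of interest: intercept and residual standard deviation
summ <- function(fit) c(beta = fixef(fit)[[1]], sigma = sigma(fit))

set.seed(101)
## use.u = FALSE: new random effects u and new errors in every replicate
boot_marginal <- bootMer(fm, summ, nsim = 20,
                         use.u = FALSE, type = "parametric")
## use.u = TRUE: u held at its estimated values, only errors resampled
boot_conditional <- bootMer(fm, summ, nsim = 20,
                            use.u = TRUE, type = "parametric")

## Spread of the bootstrap replicates under each scheme
apply(boot_marginal$t, 2, sd)
apply(boot_conditional$t, 2, sd)
```

Each result stores one row per replicate in its t component, so the two schemes can be summarized with the same code.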

Value

an object of S3 class "boot", compatible with boot package’s boot() result.

References

Davison, A.C. and Hinkley, D.V. (1997) Bootstrap Methods and Their Application. Cambridge

University Press.

Morris, J. S. (2002). The BLUPs Are Not ‘best’ When It Comes to Bootstrapping. Statistics &

Probability Letters 56(4): 425–430. doi:10.1016/S0167-7152(02)00041-X.

See Also

• confint.merMod, for a more specific approach to bootstrap confidence intervals on parameters.

• refit(), or PBmodcomp() from the pbkrtest package, for parametric bootstrap comparison of models.

• boot(), and then boot.ci, from the boot package.

• profile-methods, for likelihood-based inference, including confidence intervals.

• pvalues, for more general approaches to inference and p-value computation in mixed models.

Examples

fm01ML <- lmer(Yield ~ 1|Batch, Dyestuff, REML = FALSE)

## see ?"profile-methods"

mySumm <- function(.) {
    s <- sigma(.)
    c(beta = getME(., "beta"), sigma = s, sig01 = unname(s * getME(., "theta")))
}

(t0 <- mySumm(fm01ML)) # just three parameters

## alternatively:

mySumm2 <- function(.) {

c(beta=fixef(.),sigma=sigma(.), sig01=sqrt(unlist(VarCorr(.))))

}

set.seed(101)

## 3.8s (on a 5600 MIPS 64bit fast(year 2009) desktop "AMD Phenom(tm) II X4 925"):

system.time( boo01 <- bootMer(fm01ML, mySumm, nsim = 100) )

## to "look" at it

require("boot") ## a recommended package, i.e. *must* be there

boo01

## note large estimated bias for sig01

## (~30% low, decreases _slightly_ for nsim = 1000)

## extract the bootstrapped values as a data frame ...

head(as.data.frame(boo01))

## ------ Bootstrap-based confidence intervals ------------

## warnings about "Some ... intervals may be unstable" go away

## for larger bootstrap samples, e.g. nsim=500

## intercept

(bCI.1 <- boot.ci(boo01, index=1, type=c("norm", "basic", "perc")))# beta

## Residual standard deviation - original scale:

(bCI.2 <- boot.ci(boo01, index=2, type=c("norm", "basic", "perc")))

## Residual SD - transform to log scale:

(bCI.2L <- boot.ci(boo01, index=2, type=c("norm", "basic", "perc"),

h = log, hdot = function(.) 1/., hinv = exp))

## Among-batch standard deviation:

(bCI.3 <- boot.ci(boo01, index=3, type=c("norm", "basic", "perc"))) # sig01

## Extract all CIs (somewhat awkward)

bCI.tab <- function(b,ind=length(b$t0), type="perc", conf=0.95) {
    btab0 <- t(sapply(as.list(seq(ind)),
                      function(i)
                          boot.ci(b,index=i,conf=conf, type=type)$percent))
    btab <- btab0[,4:5]
    rownames(btab) <- names(b$t0)
    a <- (1 - conf)/2
    a <- c(a, 1 - a)
    pct <- stats:::format.perc(a, 3)
    colnames(btab) <- pct
    return(btab)
}

bCI.tab(boo01)

## Graphical examination:

plot(boo01,index=3)

## Check stored values from a longer (1000-replicate) run:

load(system.file("testdata","boo01L.RData",package="lme4"))

plot(boo01L,index=3)

mean(boo01L$t[,"sig01"]==0) ## note point mass at zero!

cake Breakage Angle of Chocolate Cakes

Description

Data on the breakage angle of chocolate cakes made with three different recipes and baked at six different temperatures. This is a split-plot design with the recipes being whole-units and the different temperatures being applied to sub-units (within replicates). The experimental notes suggest that the replicate numbering represents temporal ordering.

Format

A data frame with 270 observations on the following 5 variables.

replicate a factor with levels 1 to 15

recipe a factor with levels A, B and C

temperature an ordered factor with levels 175 < 185 < 195 < 205 < 215 < 225

angle a numeric vector giving the angle at which the cake broke.

temp numeric value of the baking temperature (degrees F).

Details

The replicate factor is nested within the recipe factor, and temperature is nested within replicate.

Source

Original data were presented in Cook (1938), and reported in Cochran and Cox (1957, p. 300).

Also cited in Lee, Nelder and Pawitan (2006).

References

Cook, F. E. (1938) Chocolate cake, I. Optimum baking temperature. Master’s Thesis, Iowa State

College.

Cochran, W. G., and Cox, G. M. (1957) Experimental designs, 2nd Ed. New York, John Wiley & Sons.

Lee, Y., Nelder, J. A., and Pawitan, Y. (2006) Generalized linear models with random effects. Unified analysis via H-likelihood. Boca Raton, Chapman and Hall/CRC.

Examples

str(cake)

## 'temp' is continuous, 'temperature' an ordered factor with 6 levels

(fm1 <- lmer(angle ~ recipe * temperature + (1|recipe:replicate), cake, REML= FALSE))

(fm2 <- lmer(angle ~ recipe + temperature + (1|recipe:replicate), cake, REML= FALSE))

(fm3 <- lmer(angle ~ recipe + temp + (1|recipe:replicate), cake, REML= FALSE))

## and now "choose" :

anova(fm3, fm2, fm1)

cbpp Contagious bovine pleuropneumonia

Description

Contagious bovine pleuropneumonia (CBPP) is a major disease of cattle in Africa, caused by a

mycoplasma. This dataset describes the serological incidence of CBPP in zebu cattle during a

follow-up survey implemented in 15 commercial herds located in the Boji district of Ethiopia. The

goal of the survey was to study the within-herd spread of CBPP in newly infected herds. Blood

samples were collected quarterly from all animals in these herds to determine their CBPP status.

These data were used to compute the serological incidence of CBPP (new cases occurring during a

given time period). Some data are missing (lost to follow-up).

Format

A data frame with 56 observations on the following 4 variables.

herd A factor identifying the herd (1 to 15).

incidence The number of new serological cases for a given herd and time period.

size A numeric vector describing herd size at the beginning of a given time period.

period A factor with levels 1 to 4.

Details

Serological status was determined using a competitive enzyme-linked immuno-sorbent assay (cELISA).

Source

Lesnoff, M., Laval, G., Bonnet, P., Abdicho, S., Workalemahu, A., Kifle, D., Peyraud, A., Lancelot, R., Thiaucourt, F. (2004) Within-herd spread of contagious bovine pleuropneumonia in Ethiopian highlands. Preventive Veterinary Medicine 64, 27–40.

Examples

## response as a matrix

(m1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),

family = binomial, data = cbpp))

## response as a vector of probabilities and usage of argument "weights"

m1p <- glmer(incidence / size ~ period + (1 | herd), weights = size,

family = binomial, data = cbpp)

## Confirm that these are equivalent:

stopifnot(all.equal(fixef(m1), fixef(m1p), tolerance = 1e-5),

all.equal(ranef(m1), ranef(m1p), tolerance = 1e-5))

## GLMM with individual-level variability (accounting for overdispersion)

cbpp$obs <- 1:nrow(cbpp)

(m2 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd) + (1|obs),

family = binomial, data = cbpp))

confint.merMod Compute Confidence Intervals for Parameters of a [ng]lmer Fit

Description

Compute confidence intervals on the parameters of a *lmer() model fit (of class "merMod").

Usage

## S3 method for class 'merMod'

confint(object, parm, level = 0.95,

method = c("profile", "Wald", "boot"), zeta,

nsim = 500,

boot.type = c("perc","basic","norm"),

FUN = NULL, quiet = FALSE,

oldNames = TRUE, ...)

Arguments

object a fitted [ng]lmer model

parm parameters for which intervals are sought. Specified by an integer vector of positions, character vector of parameter names, or (unless doing parametric bootstrapping with a user-specified bootstrap function) "theta_" or "beta_" to specify variance-covariance or fixed effects parameters only: see the which parameter of profile.

level confidence level < 1, typically above 0.90.

method a character string determining the method for computing the confidence intervals.

zeta (for method = "profile" only:) likelihood cutoff (if not specified, as by default, computed from level).

nsim number of simulations for parametric bootstrap intervals.

FUN bootstrap function; if NULL, an internal function that returns the fixed-effect parameters as well as the random-effect parameters on the standard deviation/correlation scale will be used. See bootMer for details.

boot.type bootstrap confidence interval type, as described in boot.ci. (Methods 'stud' and 'bca' are unavailable because they require additional components to be calculated.)

quiet (logical) suppress messages about computationally intensive profiling?

oldNames (logical) use old-style names for variance-covariance parameters, e.g. ".sig01", rather than newer (more informative) names such as "sd_(Intercept)|Subject"? (See signames argument to profile).

... additional parameters to be passed to profile.merMod or bootMer, respectively.

Details

Depending on the method specified, confint() computes confidence intervals by

"profile": computing a likelihood profile and finding the appropriate cutoffs based on the likelihood ratio test;

"Wald": approximating the confidence intervals (of fixed-effect parameters only; all variance-covariance parameters CIs will be returned as NA) based on the estimated local curvature of the likelihood surface;

"boot": performing parametric bootstrapping with confidence intervals computed from the bootstrap distribution according to boot.type (see bootMer, boot.ci).

Value

a numeric table (matrix with column and row names) of confidence intervals; the confidence intervals are computed on the standard deviation scale.

Note

The default method "profile" amounts to

confint(profile(object, which=parm, signames=oldNames, ...),
        level, zeta)

where the profile method profile.merMod does almost all the computations. Therefore it is typically advisable to store the profile(.) result, say in pp, and then use confint(pp, level=*), e.g., for different levels.
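The advice in this note can be sketched as follows, assuming the small Dyestuff data set so that profiling is quick. The object names pp, ci95 and ci99 are illustrative.

```r
library(lme4)

fm <- lmer(Yield ~ 1 | Batch, Dyestuff, REML = FALSE)

## Profile once (the expensive step) ...
pp <- profile(fm)

## ... then reuse the stored profile for several confidence levels
ci95 <- confint(pp, level = 0.95)
ci99 <- confint(pp, level = 0.99)

ci95
ci99
```

Only the call to profile() refits the model repeatedly; the two confint() calls interpolate on the stored profile and should return almost immediately.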

Examples

fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)

fm1W <- confint(fm1, method="Wald")# very fast, but ....

fm1W

testLevel <- if (nzchar(s <- Sys.getenv("LME4_TEST_LEVEL"))) as.numeric(s) else 1

if(interactive() || testLevel >= 3) {

## ~20 seconds, MacBook Pro laptop

system.time(fm1P <- confint(fm1, method="profile", ## default

oldNames = FALSE))

## ~ 40 seconds

system.time(fm1B <- confint(fm1,method="boot",

.progress="txt", PBargs=list(style=3)))

} else

load(system.file("testdata","confint_ex.rda",package="lme4"))

fm1P

fm1B

convergence Assessing Convergence for Fitted Models

Description

The lme4 package uses general-purpose nonlinear optimizers (e.g. Nelder-Mead or Powell's BOBYQA method) to estimate the variance-covariance matrices of the random effects. Assessing reliably whether such algorithms have converged is difficult. For example, evaluating the Karush-Kuhn-Tucker conditions (convergence criteria which in the simplest case of non-constrained optimization reduce to showing that the gradient is zero and the Hessian is positive definite) is challenging because of the difficulty of evaluating the gradient and Hessian.

We (the lme4 authors and maintainers) are still in the process of finding the best strategies for testing convergence. Some of the relevant issues are

• The gradient and Hessian are the basic ingredients of KKT-style testing, but when they have to be estimated by finite differences (as in the case of lme4; direct computation of derivatives based on analytic expressions may eventually be available for some special classes, but we have not yet implemented them) they may not be sufficiently accurate for reliable convergence testing.

• The Hessian computation in particular represents a difficult tradeoff between computational expense and accuracy. At present the Hessian computations used for convergence checking (and for estimating standard errors of fixed-effect parameters for GLMMs) follow the ordinal package in using a naive but computationally cheap centered finite difference computation (with a fixed step size of 10^-4). A more reliable but more expensive approach is to use Richardson extrapolation, as implemented in the numDeriv package.

• It is important to scale the estimated gradient at the estimate appropriately; two reasonable approaches are

1. don't scale random-effects (Cholesky) gradients, since these are essentially already unitless (for LMMs they are scaled relative to the residual variance; for GLMMs they are scaled relative to the sampling variance of the conditional distribution); for GLMMs, scale fixed-effect gradients by the standard deviations of the corresponding input variable, or

2. scale gradients by the inverse Cholesky factor of the Hessian, equivalent to scaling by the estimated Wald standard error of the estimated parameters. The latter approach is used in the current version of lme4; it has the disadvantage that it requires us to estimate the Hessian (although the Hessian is required for reliable estimation of the fixed-effect standard errors for GLMMs in any case).

• Exploratory analyses suggest that (1) the naive estimation of the Hessian may fail for large data sets (number of observations greater than approximately 10^5); (2) the magnitude of the scaled gradient increases with sample size, so that warnings will occur even for apparently well-behaved fits with large data sets.
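To make the ingredients above concrete, here is a base-R sketch of a centered finite-difference gradient and Hessian with a fixed step of 1e-4, followed by scaling of the gradient by the inverse Cholesky factor of the Hessian. It illustrates the general technique only and is not lme4's internal code: the helper fd_derivs, the toy objective, and the evaluation point are all hypothetical.

```r
## Hypothetical helper: centered finite differences with a fixed step
fd_derivs <- function(fn, x, delta = 1e-4) {
  n <- length(x)
  grad <- numeric(n)
  hess <- matrix(0, n, n)
  f0 <- fn(x)
  for (i in seq_len(n)) {
    ei <- replace(numeric(n), i, delta)   # step along coordinate i
    fp <- fn(x + ei)
    fm <- fn(x - ei)
    grad[i] <- (fp - fm) / (2 * delta)
    hess[i, i] <- (fp - 2 * f0 + fm) / delta^2
    for (j in seq_len(i - 1)) {           # off-diagonal terms, j < i
      ej <- replace(numeric(n), j, delta)
      hess[i, j] <- hess[j, i] <-
        (fn(x + ei + ej) - fn(x + ei - ej) -
         fn(x - ei + ej) + fn(x - ei - ej)) / (4 * delta^2)
    }
  }
  list(gradient = grad, Hessian = hess)
}

## Toy objective standing in for a deviance function;
## its exact Hessian is matrix(c(2, 1, 1, 4), 2)
toy_dev <- function(p) (p[1] - 1)^2 + 2 * (p[2] + 0.5)^2 + p[1] * p[2]

## A point close to, but not exactly at, the minimum (10/7, -6/7)
est <- c(1.43, -0.86)
d <- fd_derivs(toy_dev, est)

## Scale the gradient by the inverse Cholesky factor of the Hessian
scaled_grad <- solve(chol(d$Hessian), d$gradient)
max(abs(scaled_grad))
```

A convergence check would then compare max(abs(scaled_grad)) with a tolerance. Because the toy objective is quadratic, the finite-difference Hessian here is accurate up to rounding error; for a real deviance function, truncation error also contributes, which is where Richardson extrapolation helps.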

If you do see convergence warnings, and want to trouble-shoot/double-check the results, the following steps are recommended (examples are given below):

• double-check the model specification and the data for mistakes

• center and scale continuous predictor variables (e.g. with scale)

• check for singularity: if any of the diagonal elements of the Cholesky factor are zero or very small, the convergence testing methods may be inappropriate (see examples)

• double-check the Hessian calculation with the more expensive Richardson extrapolation method (see examples)

• restart the fit from the apparent optimum, or from a point perturbed slightly away from the optimum

• try all available optimizers (e.g. several different implementations of BOBYQA and Nelder-Mead, L-BFGS-B from optim, nlminb, ...). While this will of course be slow for large fits, we consider it the gold standard; if all optimizers converge to values that are practically equivalent, then we would consider the convergence warnings to be false positives.

To quote Douglas Adams, we apologize for the inconvenience.

See Also

lmerControl

Examples

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

## 1. center and scale predictors:

ss.CS <- transform(sleepstudy, Days=scale(Days))

fm1.CS <- update(fm1, data=ss.CS)

## 2. check singularity

diag.vals <- getME(fm1,"theta")[getME(fm1,"lower") == 0]

any(diag.vals < 1e-6) # FALSE

## 3. recompute gradient and Hessian with Richardson extrapolation

devfun <- update(fm1, devFunOnly=TRUE)

if (isLMM(fm1)) {
    pars <- getME(fm1,"theta")
} else {
    ## GLMM: requires both random and fixed parameters
    pars <- getME(fm1, c("theta","fixef"))
}
if (require("numDeriv")) {
    cat("hess:\n"); print(hess <- hessian(devfun, unlist(pars)))
    cat("grad:\n"); print(grad <- grad(devfun, unlist(pars)))
    cat("scaled gradient:\n")
    print(scgrad <- solve(chol(hess), grad))
}

## compare with internal calculations:

fm1@optinfo$derivs

## 4. restart the fit from the original value (or

## a slightly perturbed value):

fm1.restart <- update(fm1, start=pars)

## 5. try all available optimizers

source(system.file("utils", "allFit.R", package="lme4"))

fm1.all <- allFit(fm1)

ss <- summary(fm1.all)

ss$ fixef ## extract fixed effects

ss$ llik ## log-likelihoods

ss$ sdcor ## SDs and correlations

ss$ theta ## Cholesky factors

ss$ which.OK ## which fits worked

devcomp Extract the deviance component list

Description

Return the deviance component list

Usage

devcomp(x)

Arguments

x a fitted model of class merMod

Details

A fitted model of class merMod has a devcomp slot as described in the value section.

Value

a list with components

dims a named integer vector of various dimensions

cmp a named numeric vector of components of the deviance

Note

This function is deprecated; use getME(., "devcomp") instead.
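A minimal sketch of the recommended replacement, using the sleepstudy fit that appears elsewhere in this manual. The object names are illustrative, and the sketch relies only on the dims and cmp components described in the Value section above.

```r
library(lme4)

fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

## Preferred way to obtain the deviance component list
dc <- getME(fm, "devcomp")

str(dc$dims)  # named integer vector of dimensions
str(dc$cmp)   # named numeric vector of deviance components
```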

drop1.merMod Drop all possible single fixed-effect terms from a mixed effect model

Description

Drop allowable single terms from the model: see drop1 for details of how the appropriate scope for

dropping terms is determined.

Usage

## S3 method for class 'merMod'

drop1(object, scope, scale = 0,

test = c("none", "Chisq", "user"),

k = 2, trace = FALSE, sumFun, ...)

Arguments

object a fitted merMod object.

scope a formula giving the terms to be considered for adding or dropping.

scale Currently ignored (included for S3 method compatibility)

test should the results include a test statistic relative to the original model? The χ² test is a likelihood-ratio test, which is approximate due to finite-size effects.

k the penalty constant in AIC

trace print tracing information?

sumFun a summary function to be used when test=="user". It must allow arguments scale and k, but these may be ignored (e.g. specified in dots). The first two arguments must be object, the full model fit, and objectDrop, a reduced model. If objectDrop is missing, sumFun should return a vector with the appropriate length and names (the actual contents are ignored).

... other arguments (ignored)

Details

drop1 relies on being able to find the appropriate information within the environment of the formula of the original model. If the formula is created in an environment that does not contain the data, or other variables passed to the original model (for example, if a separate function is called to define the formula), then drop1 will fail. A workaround (see example below) is to manually specify an appropriate environment for the formula.

Value

An object of class anova summarizing the differences in fit between the models.

Examples

fm1 <- lmer(Reaction~Days+(Days|Subject),sleepstudy)

## likelihood ratio tests

drop1(fm1,test="Chisq")

## use Kenward-Roger corrected F test, or parametric bootstrap,

## to test the significance of each dropped predictor

if (require(pbkrtest) && packageVersion("pbkrtest")>="0.3.8") {

KRSumFun <- function(object, objectDrop, ...) {

krnames <- c("ndf","ddf","Fstat","p.value","F.scaling")

r <- if (missing(objectDrop)) {

setNames(rep(NA,length(krnames)),krnames)

} else {

krtest <- KRmodcomp(object,objectDrop)

unlist(krtest$stats[krnames])

}

attr(r,"method") <- c("Kenward-Roger via pbkrtest package")

r

}

drop1(fm1,test="user",sumFun=KRSumFun)

if(lme4:::testLevel() >= 3) { ## takes about 16 sec

nsim <- 100

PBSumFun <- function(object, objectDrop, ...) {

pbnames <- c("stat","p.value")

r <- if (missing(objectDrop)) {

setNames(rep(NA,length(pbnames)),pbnames)

} else {

pbtest <- PBmodcomp(object,objectDrop,nsim=nsim)

unlist(pbtest$test[2,pbnames])

}

attr(r,"method") <- c("Parametric bootstrap via pbkrtest package")

r

}

system.time(drop1(fm1,test="user",sumFun=PBSumFun))

}

}

## workaround for creating a formula in a separate environment

createFormula <- function(resp, fixed, rand) {

f <- reformulate(c(fixed,rand),response=resp)

## use the parent (createModel) environment, not the

## environment of this function (which does not contain 'data')

environment(f) <- parent.frame()

f

}

createModel <- function(data) {

mf.final <- createFormula("Reaction", "Days", "(Days|Subject)")

lmer(mf.final, data=data)

}

drop1(createModel(data=sleepstudy))
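The failure mode described in Details comes down to which environment a formula carries. The following is a minimal base-R sketch (no lme4 calls; make_formula, fit_wrapper and local_data are hypothetical names used only for illustration) showing that a formula built in a helper function cannot see variables local to the function that called the helper, unless its environment is reassigned as in the workaround above.

```r
## Hypothetical helper: the formula is created inside this function, so by
## default its environment is this function's call frame.
make_formula <- function(fix_env = FALSE) {
    f <- Reaction ~ Days
    ## Optionally point the formula at the caller's frame instead
    if (fix_env) environment(f) <- parent.frame()
    f
}

## Hypothetical wrapper that holds its data in a local variable
fit_wrapper <- function(fix_env) {
    local_data <- data.frame(Reaction = 1:3, Days = 1:3)
    f <- make_formula(fix_env)
    ## Can 'local_data' be found starting from the formula's environment?
    exists("local_data", envir = environment(f))
}

found <- c(default = fit_wrapper(FALSE), reassigned = fit_wrapper(TRUE))
print(found)
```

With the default environment the lookup fails, because the helper's frame inherits from the place where the helper was defined rather than from its caller.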

dummy Dummy variables (experimental)

Description

Largely a wrapper for model.matrix that accepts a factor, f, and returns a dummy matrix with

nlevels(f)-1 columns (the first column is dropped by default). Useful whenever one wishes to

avoid the behaviour of model.matrix of always returning an nlevels(f)-column matrix, either by

the addition of an intercept column, or by keeping one column for all levels.

Usage

dummy(f, levelsToKeep)

Arguments

f An object coercible to factor.

levelsToKeep An optional character vector giving the subset of levels(f) to be converted to

dummy variables.

Value

A model.matrix with dummy variables as columns.

Examples

data(Orthodont,package="nlme")

lmer(distance ~ age + (age|Subject) +

(0+dummy(Sex, "Female")|Subject), data = Orthodont)
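As a small illustration of the column-dropping behaviour described above, the sketch below calls dummy() directly on a toy factor (the factor f is made up for this example) and inspects the shape of the result, assuming the behaviour stated in the Description.

```r
library(lme4)

f <- factor(c("a", "b", "c", "a"))

## Default: one indicator column per level, except the first level
d_default <- dummy(f)

## Keep only the indicator for level "c"
d_c <- dummy(f, "c")

dim(d_default)   # rows = length(f), columns = nlevels(f) - 1
d_c
```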

Dyestuff Yield of dyestuff by batch

Description

The Dyestuff data frame provides the yield of dyestuff (Naphthalene Black 12B) from 5 different

preparations from each of 6 different batches of an intermediate product (H-acid). The Dyestuff2

data are generated data with the same structure but with a large residual variance relative to the batch

variance.

Format

Data frames, each with 30 observations on the following 2 variables.

Batch a factor indicating the batch of the intermediate product from which the preparation was

created.

Yield the yield of dyestuff from the preparation (grams of standard color).

Details

The Dyestuff data are described in Davies and Goldsmith (1972) as coming from “an investigation

to find out how much the variation from batch to batch in the quality of an intermediate product

(H-acid) contributes to the variation in the yield of the dyestuff (Naphthalene Black 12B) made

from it. In the experiment six samples of the intermediate, representing different batches of works

manufacture, were obtained, and five preparations of the dyestuff were made in the laboratory from

each sample. The equivalent yield of each preparation as grams of standard colour was determined

by dye-trial.”

The Dyestuff2 data are described in Box and Tiao (1973) as illustrating “the case where

between-batches mean square is less than the within-batches mean square. These data had to be constructed

for although examples of this sort undoubtably occur in practice, they seem to be rarely published.”

Source

O.L. Davies and P.L. Goldsmith (eds), Statistical Methods in Research and Production, 4th ed.,

Oliver and Boyd, (1972), section 6.4

G.E.P. Box and G.C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, (1973),

section 5.1.2

Examples

require(lattice)

str(Dyestuff)

dotplot(reorder(Batch, Yield) ~ Yield, Dyestuff,

ylab = "Batch", jitter.y = TRUE, aspect = 0.3,

type = c("p", "a"))

dotplot(reorder(Batch, Yield) ~ Yield, Dyestuff2,

ylab = "Batch", jitter.y = TRUE, aspect = 0.3,

type = c("p", "a"))

(fm1 <- lmer(Yield ~ 1|Batch, Dyestuff))

(fm2 <- lmer(Yield ~ 1|Batch, Dyestuff2))

expandDoubleVerts Expand terms with ’||’ notation into separate ’|’ terms

Description

From the right hand side of a formula for a mixed-effects model, expand terms with the double

vertical bar operator into separate, independent random effect terms.

Usage

expandDoubleVerts(term)

Arguments

term a mixed-model formula

Value

the modified term

Note

Note that || works at the level of formula parsing. This fact can lead to results that may be confusing

when factors occur to the left of the || sign (more info at https://github.com/lme4/lme4/issues/229).

See Also

formula, model.frame, model.matrix.

Other utilities: mkRespMod, mkReTrms, nlformula, nobars, subbars

Examples

f<-y~x+(x||g)

# the right-hand side of f is,

f[[3]]

# the expanded right-hand side,

expandDoubleVerts(f[[3]])
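The Note above can be made concrete with a short sketch. Here f stands for a factor with several levels and g for a grouping factor (both are placeholder names; no data are needed because only the formula is parsed). The expansion produces the terms 1 | g and 0 + f | g, and the second term would still estimate a full covariance matrix among the levels of f, which is the potentially confusing result the Note refers to.

```r
library(lme4)

## 'f' is assumed to be a factor, 'g' a grouping factor (placeholders)
form <- y ~ f + (f || g)

## findbars() applies the same || expansion and returns the separate terms
bars <- findbars(form)
terms_chr <- vapply(bars, deparse, character(1))
print(terms_chr)
```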

factorize Attempt to convert grouping variables to factors

Description

If variables within a data frame are not factors, try to convert them. Not intended for end-user use;

this is a utility function that needs to be exported, for technical reasons.

Usage

factorize(x,frloc,char.only=FALSE)

Arguments

x a formula

frloc a data frame

char.only (logical) convert only character variables to factors?

Value

a copy of the data frame with factors converted

findbars Determine random-effects expressions from a formula

Description

From the right hand side of a formula for a mixed-effects model, determine the pairs of expressions

that are separated by the vertical bar operator. Also expand the slash operator in grouping factor

expressions and expand terms with the double vertical bar operator into separate, independent random

effect terms.

Usage

findbars(term)

Arguments

term a mixed-model formula

Value

pairs of expressions that were separated by vertical bars

Note

This function is called recursively on individual terms in the model, which is why the argument is

called term and not a name like form, indicating a formula.

See Also

formula, model.frame, model.matrix.

Other utilities: mkRespMod, mkReTrms, nlformula, nobars, subbars

Examples

findbars(f1 <- Reaction ~ Days + (Days | Subject))

## => list( Days | Subject )

## These two are equivalent (tests in ../inst/tests/test-doubleVertNotation.R):

findbars(y ~ Days + (1 | Subject) + (0 + Days | Subject))

findbars(y ~ Days + (Days || Subject))

## => list of length 2: list ( 1 | Subject , 0 + Days | Subject)

findbars(~ 1 + (1 | batch / cask))

## => list of length 2: list ( 1 | cask:batch , 1 | batch)

fixef Extract fixed-effects estimates

Description

Extract the fixed-effects estimates

Usage

## S3 method for class 'merMod'

fixef(object, add.dropped=FALSE, ...)

Arguments

object any fitted model object from which fixed effects estimates can be extracted.

add.dropped for models with rank-deficient design matrix, reconstitute the full-length

parameter vector by adding NA values in appropriate locations?

... optional additional arguments. Currently none are used in any methods.

Details

Extract the estimates of the fixed-effects parameters from a fitted model.

Value

a named, numeric vector of fixed-effects estimates.

Examples

fixef(lmer(Reaction ~ Days + (1|Subject) + (0+Days|Subject), sleepstudy))

fm2 <- lmer(Reaction ~ Days + Days2 + (1|Subject),

data=transform(sleepstudy,Days2=Days))

fixef(fm2,add.dropped=TRUE)

## first two parameters are the same ...

stopifnot(all.equal(fixef(fm2,add.dropped=TRUE)[1:2],

fixef(fm2)))

fortify add information to data based on a fitted model

Description

add information to data based on a fitted model

Usage

fortify.merMod(model, data = getData(model),

...)

Arguments

model fitted model

data original data set, if needed

... additional arguments

Details

fortify is a function defined in the ggplot2 package, q.v. for more details. fortify is not defined

here, and fortify.merMod is defined as a function rather than an S3 method, to avoid (1) inducing

a dependency on ggplot2 or (2) masking methods from ggplot2. This is currently an experimental

feature.

getME Extract or Get Generalized Components from a Fitted Mixed Effects

Model

Description

Extract (or “get”) “components” – in a generalized sense – from a fitted mixed-effects model, i.e.,

(in this version of the package) from an object of class "merMod".

Usage

getME(object, name, ...)

## S3 method for class 'merMod'

getME(object,

name = c("X", "Z", "Zt", "Ztlist", "mmList", "y", "mu", "u", "b",

"Gp", "Tp", "L", "Lambda", "Lambdat", "Lind", "Tlist",

"A", "RX", "RZX", "sigma", "flist",

"fixef", "beta", "theta", "ST", "REML", "is_REML",

"n_rtrms", "n_rfacs", "N", "n", "p", "q",

"p_i", "l_i", "q_i", "k", "m_i", "m",

"cnms", "devcomp", "offset", "lower", "devfun", "glmer.nb.theta"),

...)

Arguments

object a fitted mixed-effects model of class "merMod", i.e., typically the result of lmer(),

glmer() or nlmer().

name a character vector specifying the name(s) of the “component”. If length(name) > 1

or if name = "ALL", a named list of components will be returned. Possible

values are:

"X": fixed-effects model matrix

"Z": random-effects model matrix

"Zt": transpose of random-effects model matrix. Note that the structure of Zt has changed since lme4.0; to get a backward-compatible structure, use do.call(Matrix::rBind,getME(.,"Ztlist"))

"Ztlist": list of components of the transpose of the random-effects model matrix, separated by individual variance component

"mmList": list of raw model matrices associated with random effects terms

"y": response vector

"mu": conditional mean of the response

"u": conditional mode of the “spherical” random effects variable

"b": conditional mode of the random effects variable

"Gp": groups pointer vector. A pointer to the beginning of each group of random effects corresponding to the random-effects terms, beginning with 0 and including a final element giving the total number of random effects

"Tp": theta pointer vector. A pointer to the beginning of the theta sub-vectors corresponding to the random-effects terms, beginning with 0 and including a final element giving the number of thetas.

"L": sparse Cholesky factor of the penalized random-effects model.

"Lambda": relative covariance factor Λ of the random effects.

"Lambdat": transpose Λ′ of Λ above.

"Lind": index vector for inserting elements of θ into the nonzeros of Λ.

"Tlist": vector of template matrices from which the blocks of Λ are generated.

"A": Scaled sparse model matrix (class "dgCMatrix") for the unit, orthogonal random effects, U, equal to getME(.,"Lambdat") %*% getME(.,"Zt")

"RX": Cholesky factor for the fixed-effects parameters

"RZX": cross-term in the full Cholesky factor

"sigma": residual standard error; note that sigma(object) is preferred.

"flist": a list of the grouping variables (factors) involved in the random effect terms

"fixef": fixed-effects parameter estimates

"beta": fixed-effects parameter estimates (identical to the result of fixef, but without names)

"theta": random-effects parameter estimates: these are parameterized as the relative Cholesky factors of each random effect term

"ST": A list of S and T factors in the TSST′ Cholesky factorization of the relative variance matrices of the random effects associated with each random-effects term. The unit lower triangular matrix, T, and the diagonal matrix, S, for each term are stored as a single matrix with diagonal elements from S and off-diagonal elements from T.

"n_rtrms": number of random-effects terms

"n_rfacs": number of distinct random-effects grouping factors

"N": number of rows of X

"n": length of the response vector, y

"p": number of columns of the fixed effects model matrix, X

"q": number of columns of the random effects model matrix, Z

"p_i": numbers of columns of the raw model matrices, mmList

"l_i": numbers of levels of the grouping factors

"q_i": numbers of columns of the term-wise model matrices, Ztlist

"k": number of random effects terms

"m_i": numbers of covariance parameters in each term

"m": total number of covariance parameters

"cnms": the “component names”, a list.

"REML": 0 indicates the model was fitted by maximum likelihood, any other positive integer indicates fitting by restricted maximum likelihood

"is_REML": same as the result of isREML(.)

"devcomp": a list consisting of a named numeric vector, cmp, and a named integer vector, dims, describing the fitted model. The elements of cmp are:

ldL2 twice the log determinant of L

ldRX2 twice the log determinant of RX

wrss weighted residual sum of squares

ussq squared length of u

pwrss penalized weighted residual sum of squares, “wrss + ussq”

drsum sum of residual deviance (GLMMs only)

REML REML criterion at optimum (LMMs fit by REML only)

dev deviance criterion at optimum (models fit by ML only)

sigmaML ML estimate of residual standard deviation

sigmaREML REML estimate of residual standard deviation

tolPwrss tolerance for declaring convergence in the penalized iteratively weighted residual sum-of-squares (GLMMs only)

The elements of dims are:

N number of rows of X

n length of y

p number of columns of X

nmp n-p

nth length of theta

q number of columns of Z

nAGQ see glmer

compDev see glmerControl

useSc TRUE if model has a scale parameter

reTrms number of random effects terms

REML 0 indicates the model was fitted by maximum likelihood, any other positive integer indicates fitting by restricted maximum likelihood

GLMM TRUE if a GLMM

NLMM TRUE if an NLMM

"offset": model offset

"lower": lower bounds on model parameters (random effects parameters only).

"devfun": deviance function (so far only available for LMMs)

"glmer.nb.theta": negative binomial θ parameter, only for glmer.nb.

"ALL": get all of the above as a list.

... currently unused in lme4, potentially further arguments in methods.

Details

The goal is to provide “everything a user may want” from a fitted "merMod" object as far as it is not

available by methods, such as fixef, ranef, vcov, etc.

Value

Unspecified, as it depends very much on the name.

See Also

getCall(). More standard methods for "merMod" objects, such as ranef, fixef, vcov, etc.: see

methods(class="merMod")

Examples

## shows many methods you should consider *before* using getME():

methods(class = "merMod")

(fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy))

Z <- getME(fm1, "Z")

stopifnot(is(Z, "CsparseMatrix"),

c(180,36) == dim(Z),

all.equal(fixef(fm1), b1 <- getME(fm1, "beta"),

check.attributes=FALSE, tolerance = 0))

## A way to get *all* getME()s :

## internal consistency check ensuring that all work:

parts <- getME(fm1, "ALL")

str(parts, max=2)

stopifnot(identical(Z, parts $ Z),

identical(b1, parts $ beta))
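Several of the components listed under name are tied together by simple identities. The sketch below checks a few of them on the sleepstudy fit; it is an illustration rather than a definition, and it assumes that "A" is the product of "Lambdat" and "Zt" (in that order, which is the only conformable one) and that the nonzeros of "Lambdat" are filled from theta via "Lind".

```r
library(lme4)

fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
cmp <- getME(fm, c("Gp", "q", "u", "b", "Lambda", "Lambdat",
                   "Lind", "theta", "Zt", "A"))

## 'Gp' begins with 0 and ends with the total number of random effects, q
gp_ok <- cmp$Gp[1] == 0 && cmp$Gp[length(cmp$Gp)] == cmp$q

## b = Lambda %*% u: random effects on the original scale from spherical u
b_ok <- isTRUE(all.equal(as.vector(cmp$Lambda %*% cmp$u),
                         as.vector(cmp$b)))

## The nonzero entries of Lambdat are the elements of theta, indexed by Lind
lind_ok <- isTRUE(all.equal(cmp$Lambdat@x,
                            unname(cmp$theta[cmp$Lind])))

## A = Lambdat %*% Zt (a q x n matrix)
a_ok <- max(abs(as.matrix(cmp$A) -
                as.matrix(cmp$Lambdat %*% cmp$Zt))) < 1e-10

print(c(Gp = gp_ok, b = b_ok, Lind = lind_ok, A = a_ok))
```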

GHrule Univariate Gauss-Hermite quadrature rule

Description

Create a univariate Gauss-Hermite quadrature rule.

Usage

GHrule(ord, asMatrix = TRUE)

Arguments

ord scalar integer between 1 and 25 - the order, or number of nodes and weights, in

the rule. When the function being multiplied by the standard normal density is

a polynomial of order 2k-1 the rule of order k integrates the product exactly.

asMatrix logical scalar - should the result be returned as a matrix. If FALSE a data frame

is returned. Defaults to TRUE.

Details

This version of Gauss-Hermite quadrature provides the node positions and weights for a scalar

integral of a function multiplied by the standard normal density.

Originally based on package SparseGrid’s hidden GQN().

Value

a matrix (or data frame, if asMatrix is FALSE) with ord rows and three columns: z, the node

positions; w, the weights; and ldnorm, the logarithm of the normal density evaluated at the nodes.

See Also

a different interface is available via GQdk().

Examples

(r5 <- GHrule(5, asMatrix=FALSE))

## second, fourth, sixth, eighth and tenth central moments of the

## standard Gaussian density

with(r5, sapply(seq(2, 10, 2), function(p) sum(w * z^p)))
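The exactness statement in the description of ord can be checked numerically. The sketch below uses a 5-point rule, which should integrate polynomials up to degree 2*5 - 1 = 9 exactly against the standard normal density, so the even moments of order 2, 4, 6 and 8 should come out as 1, 3, 15 and 105. The tenth moment is of degree 10 and is therefore only approximated by this rule.

```r
library(lme4)

r5 <- GHrule(5, asMatrix = FALSE)   # data frame with columns z, w, ldnorm

## The weights integrate the constant function 1, so they sum to one
w_sum <- sum(r5$w)

## Even moments of N(0,1) of order 2, 4, 6, 8 (all of degree <= 9)
moments <- sapply(c(2, 4, 6, 8), function(p) sum(r5$w * r5$z^p))

## ldnorm holds the log of the standard normal density at the nodes
ld_ok <- isTRUE(all.equal(r5$ldnorm, dnorm(r5$z, log = TRUE)))

print(w_sum)
print(moments)
```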

glmer Fitting Generalized Linear Mixed-Effects Models

Description

Fit a generalized linear mixed-effects model (GLMM). Both fixed effects and random effects are

specified via the model formula.

Usage

glmer(formula, data = NULL, family = gaussian, control = glmerControl(),

start = NULL, verbose = 0L, nAGQ = 1L, subset, weights, na.action,

offset, contrasts = NULL, mustart, etastart,

devFunOnly = FALSE, ...)

Arguments

formula a two-sided linear formula object describing both the fixed-effects and random-effects

parts of the model, with the response on the left of a ~ operator and the

terms, separated by + operators, on the right. Random-effects terms are distinguished

by vertical bars ("|") separating expressions for design matrices from

grouping factors.

data an optional data frame containing the variables named in formula. By default

the variables are taken from the environment from which glmer is called. While

data is optional, the package authors strongly recommend its use, especially

when later applying methods such as update and drop1 to the fitted model

(such methods are not guaranteed to work properly if data is omitted). If data

is omitted, variables will be taken from the environment of formula (if specified

as a formula) or from the parent frame (if specified as a character vector).

family a GLM family, see glm and family.

control a list (of correct class, resulting from lmerControl() or glmerControl()

respectively) containing control parameters, including the nonlinear optimizer to

be used and parameters to be passed through to the nonlinear optimizer, see the

*lmerControl documentation for details.

start a named list of starting values for the parameters in the model, or a numeric

vector. A numeric start argument will be used as the starting value of theta.

If start is a list, the theta element (a numeric vector) is used as the starting

value for the first optimization step (default=1 for diagonal elements and 0 for

off-diagonal elements of the lower Cholesky factor); the fitted value of theta

from the first step, plus start[["fixef"]], are used as starting values for the

second optimization step. If start has both fixef and theta elements, the first

optimization step is skipped. For more details or finer control of optimization,

see modular.

verbose integer scalar. If > 0 verbose output is generated during the optimization of the

parameter estimates. If > 1 verbose output is generated during the individual

PIRLS steps.

nAGQ integer scalar - the number of points per axis for evaluating the adaptive Gauss-

Hermite approximation to the log-likelihood. Defaults to 1, corresponding to

the Laplace approximation. Values greater than 1 produce greater accuracy in

the evaluation of the log-likelihood at the expense of speed. A value of zero uses

a faster but less exact form of parameter estimation for GLMMs by optimizing

the random effects and the fixed-effects coefficients in the penalized iteratively

reweighted least squares step. (See Details.)

subset an optional expression indicating the subset of the rows of data that should be

used in the fit. This can be a logical vector, or a numeric vector indicating which

observation numbers are to be included, or a character vector of the row names

to be included. All observations are included by default.

weights an optional vector of ‘prior weights’ to be used in the fitting process. Should be

NULL or a numeric vector.

na.action a function that indicates what should happen when the data contain NAs. The

default action (na.omit, inherited from the ‘factory fresh’ value of getOption("na.action"))

strips any observations with any missing values in any variables.

offset this can be used to specify an a priori known component to be included in the

linear predictor during fitting. This should be NULL or a numeric vector of length

equal to the number of cases. One or more offset terms can be included in the

formula instead or as well, and if more than one is specified their sum is used.

See model.offset.

contrasts an optional list. See the contrasts.arg of model.matrix.default.

mustart optional starting values on the scale of the conditional mean, as in glm; see there

for details.

etastart optional starting values on the scale of the unbounded predictor as in glm; see

there for details.

devFunOnly logical - return only the deviance evaluation function. Note that because the

deviance function operates on variables stored in its environment, it may not

return exactly the same values on subsequent calls (but the results should always

be within machine tolerance).

... other potential arguments. A method argument was used in earlier versions of

the package. Its functionality has been replaced by the nAGQ argument.

Details

Fit a generalized linear mixed model, which incorporates both fixed-effects parameters and

random effects in a linear predictor, via maximum likelihood. The linear predictor is related to the

conditional mean of the response through the inverse link function defined in the GLM family.

The expression for the likelihood of a mixed-effects model is an integral over the random effects

space. For a linear mixed-effects model (LMM), as fit by lmer, this integral can be evaluated

exactly. For a GLMM the integral must be approximated. The most reliable approximation for

GLMMs is adaptive Gauss-Hermite quadrature, at present implemented only for models with a

single scalar random effect. The nAGQ argument controls the number of nodes in the quadrature

formula. A model with a single, scalar random-effects term could reasonably use up to 25 quadrature

points per scalar integral.

Value

An object of class glmerMod, for which many methods are available (e.g. methods(class="glmerMod")).

See Also

lmer (for details on formulas and parameterization); glm for Generalized Linear Models (without

random effects). nlmer for nonlinear mixed-effects models.

glmer.nb to fit negative binomial GLMMs.

Examples

## generalized linear mixed model

library(lattice)

xyplot(incidence/size ~ period|herd, cbpp, type=c('g','p','l'),

layout=c(3,5), index.cond = function(x,y)max(y))

(gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),

data = cbpp, family = binomial))

## using nAGQ=0 only gets close to the optimum

(gm1a <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),

cbpp, binomial, nAGQ = 0))

## using nAGQ = 9 provides a better evaluation of the deviance

## Currently the internal calculations use the sum of deviance residuals,

## which is not directly comparable with the nAGQ=0 or nAGQ=1 result.

(gm1a <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),

cbpp, binomial, nAGQ = 9))

## GLMM with individual-level variability (accounting for overdispersion)

## For this data set the model is the same as one allowing for a period:herd

## interaction, which the plot indicates could be needed.

cbpp$obs <- 1:nrow(cbpp)

(gm2 <- glmer(cbind(incidence, size - incidence) ~ period +

(1 | herd) + (1|obs),

family = binomial, data = cbpp))

anova(gm1,gm2)

## glmer and glm log-likelihoods are consistent

gm1Devfun <- update(gm1,devFunOnly=TRUE)

gm0 <- glm(cbind(incidence, size - incidence) ~ period,

family = binomial, data = cbpp)

## evaluate GLMM deviance at RE variance=theta=0, beta=(GLM coeffs)

gm1Dev0 <- gm1Devfun(c(0,coef(gm0)))

## compare

stopifnot(all.equal(gm1Dev0,c(-2*logLik(gm0))))

## the toenail onycholysis data from Backer et al 1998

## these data are notoriously difficult to fit

## Not run:

if (require("HSAUR2")) {

gm2 <- glmer(outcome~treatment*visit+(1|patientID),

data=toenail,

family=binomial,nAGQ=20)

}

## End(Not run)
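The restriction stated in Details (adaptive Gauss-Hermite quadrature only for a single scalar random effect) can be illustrated as follows. This is a sketch under the assumption that glmer signals an error, rather than silently falling back, when nAGQ > 1 is requested for a model with more than one random-effects term; cbpp2 and its obs column are constructed here only for the example.

```r
library(lme4)

## Single scalar random-effects term: nAGQ > 1 is allowed
gm_agq <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                data = cbpp, family = binomial, nAGQ = 9)

## Two random-effects terms: nAGQ > 1 is expected to be rejected
cbpp2 <- transform(cbpp, obs = factor(seq_len(nrow(cbpp))))
res <- tryCatch(
    glmer(cbind(incidence, size - incidence) ~ period +
              (1 | herd) + (1 | obs),
          data = cbpp2, family = binomial, nAGQ = 9),
    error = function(e) e)

## Inspect the outcome; under the stated assumption this is a condition object
print(inherits(res, "error"))
```

For models with several random-effects terms, the default nAGQ = 1 (Laplace approximation) remains available.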

glmer.nb Fitting GLMM’s for Negative Binomial

Description

Fits a generalized linear mixed-effects model (GLMM) for the negative binomial family, building

on glmer, and initializing via theta.ml from MASS.

Usage

glmer.nb(..., interval = log(th) + c(-3, 3),

tol = 5e-5, verbose = FALSE, nb.control = NULL,

initCtrl = list(limit = 20, eps = 2*tol, trace = verbose))

Arguments

... arguments as for glmer(.) such as formula, data, control, etc, but not

family!

interval interval in which to start the optimization. The default is symmetric on log scale

around the initially estimated theta.

tol tolerance for the optimization via optimize.

verbose logical indicating how much progress information should be printed during

the optimization. Use verbose = 2 (or larger) to enable verbose=TRUE in the

glmer() calls.

nb.control optional list, like glmerControl(), used in refit(*, control = control.nb)

during the optimization.

initCtrl (experimental, do not rely on this:) a list with named components as in the

default, passed to theta.ml (package MASS) for the initial value of the negative

binomial parameter theta.

Value

An object of class glmerMod, for which many methods are available (e.g. methods(class="glmerMod")),

see glmer.

Note

For historical reasons, the shape parameter of the negative binomial and the random effects

parameters in our (G)LMM models are both called theta (θ), but are unrelated here.

The negative binomial θ can be extracted from a fit g <- glmer.nb() by getME(g, "glmer.nb.theta").

Parts of glmer.nb() are still experimental and methods are still missing or suboptimal. In

particular, there is no inference available for the dispersion parameter θ, yet.

See Also

glmer; from package MASS, negative.binomial (which we re-export currently) and theta.ml,

the latter for initialization of optimization.

The ‘Details’ of pnbinom for the definition of the negative binomial distribution.

Examples

set.seed(101)

dd <- expand.grid(f1 = factor(1:3),

f2 = LETTERS[1:2], g=1:9, rep=1:15,

KEEP.OUT.ATTRS=FALSE)

summary(mu <- 5*(-4 + with(dd, as.integer(f1) + 4*as.numeric(f2))))

dd$y <- rnbinom(nrow(dd), mu = mu, size = 0.5)

str(dd)

require("MASS")## and use its glm.nb() - as indeed we have zero random effect:

m.glm <- glm.nb(y ~ f1*f2, data=dd, trace=TRUE)

summary(m.glm)

m.nb <- glmer.nb(y ~ f1*f2 + (1|g), data=dd, verbose=TRUE)

m.nb

## The neg.binomial theta parameter:

getME(m.nb, "glmer.nb.theta")

LL <- logLik(m.nb)

## mixed model has 1 additional parameter (RE variance)

stopifnot(attr(LL,"df")==attr(logLik(m.glm),"df")+1)

plot(m.nb, resid(.) ~ g)# works, as long as data 'dd' is found

glmerLaplaceHandle Handle for glmerLaplace

Description

Handle for calling the glmerLaplace C++ function. Not intended for routine use.

Usage

glmerLaplaceHandle(pp, resp, nAGQ, tol, maxit, verbose)

Arguments

pp merPredD object

resp lmResp object

nAGQ see glmer

tol tolerance

maxit maximum number of pwrss iterations

verbose display optimizer progress

Value

Value of the objective function

glmFamily Generator object for the glmFamily class

Description

The generator object for the glmFamily reference class. Such an object is primarily used through

its new method.

Usage

glmFamily(...)

Arguments

... Named argument (see Note below)

Methods

new(family=family) Create a new glmFamily object

Note

Arguments to the new method must be named arguments.

See Also

glmFamily

glmFamily-class Class "glmFamily" - a reference class for family

Description

This class is a wrapper class for family objects specifying a distribution family and link function

for a generalized linear model (glm). The reference class contains an external pointer to a C++

object representing the class. For common families and link functions the functions in the family

are implemented in compiled code so they can be accessed from other compiled code and for a

speed boost.

Extends

All reference classes extend and inherit methods from "envRefClass".

Note

Objects from this reference class correspond to objects in a C++ class. Methods are invoked on

the C++ class using the external pointer in the Ptr field. When saving such an object the external

pointer is converted to a null pointer, which is why there is a redundant field ptr that is an

active-binding function returning the external pointer. If the Ptr field is a null pointer, the external pointer

is regenerated for the stored family field.

See Also

family, glmFamily

Examples

str(glmFamily$new(family=poisson()))

golden-class Class "golden" and Generator for Golden Search Optimizer Class

Description

"golden" is a reference class for a golden search scalar optimizer, for a parameter within an interval.

golden() is the generator for the "golden" class. The optimizer uses reverse communications.

Usage

golden(...)

Arguments

... (partly optional) arguments passed to new() must be named arguments. lower

and upper are the bounds for the scalar parameter; they must be finite.

Extends

All reference classes extend and inherit methods from "envRefClass".

Examples

showClass("golden")

golden(lower= -100, upper= 1e100)

GQdk Sparse Gaussian / Gauss-Hermite Quadrature grid

Description

Generate the sparse multidimensional Gaussian quadrature grids.

Currently unused. See GHrule() for the version currently in use in package lme4.

Usage

GQdk(d = 1L, k = 1L)

GQN

Arguments

d integer scalar - the dimension of the function to be integrated with respect to the

standard d-dimensional Gaussian density.

k integer scalar - the order of the grid. A grid of order k provides an exact result

for a polynomial of total order of 2k - 1 or less.

Value

GQdk() returns a matrix with d + 1 columns. The first column is the weights and the remaining d

columns are the node coordinates.

GQN is a list of lists, containing the non-redundant quadrature nodes and weights for integration

of a scalar function of a d-dimensional argument with respect to the d-dimensional

Gaussian density function.

The outer list is indexed by the dimension, d, in the range of 1 to 20. The inner list is indexed by k,

the order of the quadrature.

Note

GQN contains only the non-redundant nodes. To regenerate the whole array of nodes, all possible

permutations of axes and all possible combinations of ±1 must be applied to the axes. This entire

array of nodes is exactly what GQdk() reproduces.

The number of nodes gets very large very quickly with increasing d and k. See the charts at

http://www.sparse-grids.de.

Examples

GQdk(2,5) # 53 x 3

GQN[[3]][[5]] # a 14 x 4 matrix

grouseticks Data on red grouse ticks from Elston et al. 2001

Description

Number of ticks on the heads of red grouse chicks sampled in the field (grouseticks) and an

aggregated version (grouseticks_agg); see original source for more details.

Usage

data(grouseticks)

Format

INDEX (factor) chick number (observation level)

TICKS number of ticks sampled

BROOD (factor) brood number

HEIGHT height above sea level (meters)

YEAR year (-1900)

LOCATION (factor) geographic location code

cHEIGHT centered height, derived from HEIGHT

meanTICKS mean number of ticks by brood

varTICKS variance of number of ticks by brood

Details

grouseticks_agg is just a brood-level aggregation of the data

Source

Robert Moss, via David Elston

References

Elston, D. A., R. Moss, T. Boulinier, C. Arrowsmith, and X. Lambin. 2001. "Analysis of Aggregation,

a Worked Example: Numbers of Ticks on Red Grouse Chicks." Parasitology 122 (05): 563-569.

doi:10.1017/S0031182001007740.

http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=82701.

Examples

data(grouseticks)

## Figure 1a from Elston et al

par(las=1,bty="l")

tvec <- c(0,1,2,5,20,40,80)

pvec <- c(4,1,3)

with(grouseticks_agg,plot(1+meanTICKS~HEIGHT,

pch=pvec[factor(YEAR)],

log="y",axes=FALSE,

xlab="Altitude (m)",

ylab="Brood mean ticks"))

axis(side=1)

axis(side=2,at=tvec+1,label=tvec)

box()

abline(v=405,lty=2)

## Figure 1b

with(grouseticks_agg,plot(varTICKS~meanTICKS,

pch=4,

xlab="Brood mean ticks",

ylab="Within-brood variance"))

curve(1*x,from=0,to=70,add=TRUE)

## Model fitting

form <- TICKS~YEAR+HEIGHT+(1|BROOD)+(1|INDEX)+(1|LOCATION)

(full_mod1 <- glmer(form, family="poisson",data=grouseticks))

hatvalues.merMod Diagonal elements of the hat matrix

Description

Returns the values on the diagonal of the hat matrix, which is the matrix that transforms the response

vector (minus any offset) into the fitted values (minus any offset). Note that this method should

only be used for linear mixed models. It is not clear if the hat matrix concept even makes sense for

generalized linear mixed models.

Usage

## S3 method for class 'merMod'

hatvalues(model, fullHatMatrix = FALSE, ...)

Arguments

model An object of class merMod.

fullHatMatrix Return full hat matrix (not just diagonal values)?

... Not currently used

Value

The diagonal elements of the hat matrix.

Examples

m <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

hatvalues(m)
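
The defining property stated in the Description can be checked numerically. The following is a minimal sketch, assuming a linear mixed model without an offset and assuming that fullHatMatrix = TRUE returns the full n x n matrix (as the argument description suggests); the names H, h, y and fit_from_H are illustrative only.

```r
library(lme4)

m <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

## full hat matrix (n x n) and, by default, only its diagonal
H <- hatvalues(m, fullHatMatrix = TRUE)
h <- hatvalues(m)

## response vector used in the fit (this model has no offset)
y <- getME(m, "y")

## the hat matrix should map the response onto the fitted values ...
fit_from_H <- as.vector(H %*% y)
all.equal(fit_from_H, unname(fitted(m)))

## ... and the default return value should be its diagonal
all.equal(as.vector(diag(H)), unname(h))
```

Both comparisons are made with all.equal() rather than identical(), since agreement is only expected up to floating-point tolerance.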

InstEval University Lecture/Instructor Evaluations by Students at ETH

Description

University lecture evaluations by students at ETH Zurich, anonymized for privacy protection. This

is an interesting “medium” sized example of a partially nested mixed effect model.

Format

A data frame with 73421 observations on the following 7 variables.

s a factor with levels 1:2972 denoting individual students.

d a factor with 1128 levels from 1:2160, denoting individual professors or lecturers.

studage an ordered factor with levels 2<4<6<8, denoting student’s “age” measured in the

semester number the student has been enrolled.

lectage an ordered factor with 6 levels, 1<2< ... < 6, measuring how many semesters back the

lecture rated had taken place.

service a binary factor with levels 0 and 1; a lecture is a “service”, if held for a different

department than the lecturer’s main one.

dept a factor with 14 levels from 1:15, using a random code for the department of the lecture.

y a numeric vector of ratings of lectures by the students, using the discrete scale 1:5, with meanings

of ‘poor’ to ‘very good’.

Each observation is one student’s rating for a specific lecture (of one lecturer, during one semester

in the past).

Details

The main goal of the survey is to find “the best liked prof”, according to the lectures given.

Statistical analysis of such data has been the basis for a (student) jury selecting the final winners.

The present data set has been anonymized and slightly simplified on purpose.

Examples

str(InstEval)

head(InstEval, 16)

xtabs(~ service + dept, InstEval)

isNested Is f1 nested within f2?

Description

Does every level of f1 occur in conjunction with exactly one level of f2? The function is based on

converting a triplet sparse matrix to a compressed column-oriented form in which the nesting can

be quickly evaluated.

Usage

isNested(f1, f2)

Arguments

f1 factor 1

f2 factor 2

Value

TRUE if factor 1 is nested within factor 2

Examples

with(Pastes, isNested(cask, batch)) ## => FALSE

with(Pastes, isNested(sample, batch)) ## => TRUE
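
The nesting condition in the Description ("every level of f1 occurs with exactly one level of f2") can be illustrated in base R. This is only a sketch of the definition using a dense cross-tabulation, not the sparse-matrix approach that isNested() itself uses; the function name is_nested_sketch and the toy factors are hypothetical.

```r
## Hypothetical helper: TRUE if every observed level of f1 co-occurs
## with exactly one level of f2 (dense cross-tabulation, for illustration)
is_nested_sketch <- function(f1, f2) {
  f1 <- droplevels(as.factor(f1))
  f2 <- droplevels(as.factor(f2))
  stopifnot(length(f1) == length(f2))
  occurs <- table(f1, f2) > 0      # which (f1, f2) combinations are observed
  all(rowSums(occurs) == 1L)       # each f1 level in exactly one f2 level
}

## toy data: samples a, b, c; "b" appears in both batches x and y
sample_id <- factor(c("a", "a", "b", "b", "c"))
batch     <- factor(c("x", "x", "x", "y", "y"))

nested_1 <- is_nested_sketch(sample_id, batch)   # "b" spans two batches
nested_2 <- is_nested_sketch(batch, sample_id)   # batches span several samples

## relabelled so that each sample belongs to a single batch
sample_ok <- factor(c("a", "a", "b", "c", "c"))
nested_3  <- is_nested_sketch(sample_ok, batch)
```

For factors with many levels the dense table becomes wasteful, which is presumably why the package works with a sparse representation instead.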

isREML Check characteristics of models

Description

Check characteristics of models: whether a model fit corresponds to a linear (LMM), generalized

linear (GLMM), or nonlinear (NLMM) mixed model, and whether a linear mixed model has been

fitted by REML or not (isREML(x) is always FALSE for GLMMs and NLMMs).

Usage

isREML(x, ...)

isLMM(x, ...)

isNLMM(x, ...)

isGLMM(x, ...)

Arguments

x a fitted model.

... additional, optional arguments. (None are used in the merMod methods)

Details

These are generic functions. At present the only methods are for mixed-effects models of class

merMod.

Value

a logical value

See Also

getME

Examples

fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)

gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),

data = cbpp, family = binomial)

nm1 <- nlmer(circumference ~ SSlogis(age, Asym, xmid, scal) ~ Asym|Tree,

Orange, start = c(Asym = 200, xmid = 725, scal = 350))

isLMM(fm1)

isGLMM(gm1)

## check all :

is.MM <- function(x) c(LMM = isLMM(x), GLMM= isGLMM(x), NLMM= isNLMM(x))

stopifnot(cbind(is.MM(fm1), is.MM(gm1), is.MM(nm1))

== diag(rep(TRUE,3)))

lmer Fit Linear Mixed-Effects Models

Description

Fit a linear mixed-effects model (LMM) to data, via REML or maximum likelihood.

Usage

lmer(formula, data = NULL, REML = TRUE, control = lmerControl(),

start = NULL, verbose = 0L, subset, weights, na.action,

offset, contrasts = NULL, devFunOnly = FALSE, ...)

Arguments

formula a two-sided linear formula object describing both the fixed-effects and random-effects

part of the model, with the response on the left of a ~ operator and the

terms, separated by + operators, on the right. Random-effects terms are distinguished

by vertical bars ("|") separating expressions for design matrices from

grouping factors. Two vertical bars ("||") can be used to specify multiple uncorrelated

random effects for the same grouping variable.

data an optional data frame containing the variables named in formula. By default

the variables are taken from the environment from which lmer is called. While

data is optional, the package authors strongly recommend its use, especially

when later applying methods such as update and drop1 to the fitted model

(such methods are not guaranteed to work properly if data is omitted). If data

is omitted, variables will be taken from the environment of formula (if specified

as a formula) or from the parent frame (if specified as a character vector).

REML logical scalar - Should the estimates be chosen to optimize the REML criterion

(as opposed to the log-likelihood)?

control a list (of correct class, resulting from lmerControl() or glmerControl()

respectively) containing control parameters, including the nonlinear optimizer to

be used and parameters to be passed through to the nonlinear optimizer, see the

*lmerControl documentation for details.

start a named list of starting values for the parameters in the model. For lmer this

can be a numeric vector or a list with one component named "theta".

verbose integer scalar. If > 0 verbose output is generated during the optimization of the

parameter estimates. If > 1 verbose output is generated during the individual

PIRLS steps.

subset an optional expression indicating the subset of the rows of data that should be

used in the fit. This can be a logical vector, or a numeric vector indicating which

observation numbers are to be included, or a character vector of the row names

to be included. All observations are included by default.

weights an optional vector of ‘prior weights’ to be used in the fitting process. Should

be NULL or a numeric vector. Prior weights are not normalized or standardized

in any way. In particular, the diagonal of the residual covariance matrix is the

squared residual standard deviation parameter sigma times the vector of inverse

weights. Therefore, if the weights have relatively large magnitudes, then in

order to compensate, the sigma parameter will also need to have a relatively

large magnitude.

na.action a function that indicates what should happen when the data contain NAs. The default

action (na.omit, inherited from the ‘factory fresh’ value of getOption("na.action"))

strips any observations with any missing values in any variables.

offset this can be used to specify an a priori known component to be included in the

linear predictor during fitting. This should be NULL or a numeric vector of length

equal to the number of cases. One or more offset terms can be included in the

formula instead or as well, and if more than one is specified their sum is used.

See model.offset.

contrasts an optional list. See the contrasts.arg of model.matrix.default.

devFunOnly logical - return only the deviance evaluation function. Note that because the

deviance function operates on variables stored in its environment, it may not

return exactly the same values on subsequent calls (but the results should always

be within machine tolerance).

... other potential arguments. A method argument was used in earlier versions of

the package. Its functionality has been replaced by the REML argument.

Details

• If the formula argument is specified as a character vector, the function will attempt to coerce

it to a formula. However, this is not recommended (users who want to construct formulas by

pasting together components are advised to use as.formula or reformulate); model fits will

work but subsequent methods such as drop1 and update may fail.

• Unlike some simpler modeling frameworks such as lm and glm which automatically detect

perfectly collinear predictor variables, [gn]lmer cannot handle design matrices of less than

full rank. For example, in cases of models with interactions that have unobserved combinations

of levels, it is up to the user to define a new variable (for example creating ab within the

data from the results of interaction(a,b,drop=TRUE)).

• The deviance function returned when devFunOnly is TRUE takes a single numeric vector argument,

representing the theta vector. This vector defines the scaled variance-covariance

matrices of the random effects, in the Cholesky parameterization. For models with only simple

(intercept-only) random effects, theta is a vector of the standard deviations of the random

effects. For more complex or multiple random effects, running getME(.,"theta") to retrieve

the theta vector for a fitted model and examining the names of the vector is probably the

easiest way to determine the correspondence between the elements of the theta vector and

elements of the lower triangles of the Cholesky factors of the random effects.
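
The relationship between devFunOnly, the theta vector and a fitted model can be sketched as follows. This is an illustrative example on the sleepstudy data, not part of the original examples; the object names devfun, theta_hat, crit_at_opt and crit_perturbed are arbitrary. It assumes a REML fit (the default), so the returned function evaluates the REML criterion.

```r
library(lme4)

## an ordinary fit, and the bare deviance function for the same model
fm     <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
devfun <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy,
               devFunOnly = TRUE)

## theta at the optimum; its names show which Cholesky-factor
## element each component corresponds to
theta_hat <- getME(fm, "theta")
names(theta_hat)

## evaluating the deviance function at theta_hat should reproduce
## the REML criterion of the fitted model
crit_at_opt <- devfun(theta_hat)
all.equal(crit_at_opt, REMLcrit(fm))

## moving away from the optimum should not decrease the criterion
crit_perturbed <- devfun(theta_hat * 1.1)
crit_perturbed >= crit_at_opt
```

Because the deviance function updates objects stored in its environment, it is safest to re-evaluate it at the parameter values of interest last if those internal objects are inspected afterwards.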

Value

An object of class merMod, for which many methods are available (e.g. methods(class="merMod"))

See Also

lm for linear models; glmer for generalized linear and nlmer for nonlinear mixed models.

Examples

## linear mixed models - reference values from older code

(fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy))

summary(fm1) # (with its own print method; see class?merMod)

str(terms(fm1))

stopifnot(identical(terms(fm1, fixed.only=FALSE),

terms(model.frame(fm1))))

attr(terms(fm1, FALSE), "dataClasses") # fixed.only=FALSE needed for dataCl.

fm1_ML <- update(fm1,REML=FALSE)

(fm2 <- lmer(Reaction ~ Days + (Days || Subject), sleepstudy))

anova(fm1, fm2)

sm2 <- summary(fm2)

print(fm2, digits=7, ranef.comp="Var") # the print.merMod() method

print(sm2, digits=3, corr=FALSE) # the print.summary.merMod() method

(vv <- vcov.merMod(fm2, corr=TRUE))

as(vv, "corMatrix") # extracts the ("hidden") 'correlation' entry in @factors

## Fit sex-specific variances by constructing numeric dummy variables

## for sex and sex:age; in this case the estimated variance differences

## between groups in both intercept and slope are zero ...

data(Orthodont,package="nlme")

Orthodont$nsex <- as.numeric(Orthodont$Sex=="Male")

Orthodont$nsexage <- with(Orthodont, nsex*age)

lmer(distance ~ age + (age|Subject) + (0+nsex|Subject) +

(0 + nsexage|Subject), data=Orthodont)
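
The advice in the Details about interactions with unobserved combinations of levels can be illustrated with base R alone. The sketch below uses a made-up data frame d with factors a and b in which the combination a2:b2 never occurs; it only compares the rank of the two fixed-effects design matrices and does not fit a mixed model.

```r
## toy data: the combination (a2, b2) is never observed
d <- data.frame(
  a = factor(c("a1", "a1", "a2", "a2", "a1", "a2")),
  b = factor(c("b1", "b2", "b1", "b1", "b2", "b1"))
)

## the full interaction model has a column for a2:b2 that is all zero,
## so the design matrix is rank-deficient
X_full    <- model.matrix(~ a * b, data = d)
rank_full <- qr(X_full)$rank

## a single factor holding only the observed combinations
d$ab    <- interaction(d$a, d$b, drop = TRUE)
X_ab    <- model.matrix(~ ab, data = d)
rank_ab <- qr(X_ab)$rank

c(ncol_full = ncol(X_full), rank_full = rank_full,
  ncol_ab   = ncol(X_ab),   rank_ab   = rank_ab)
```

Using ab in place of a*b in the model formula gives a full-rank fixed-effects design, at the cost of no longer separating main effects from the interaction.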

lmerControl Control of Mixed Model Fitting

Description

Construct control structures for mixed model ﬁtting. All arguments have defaults, and can be

grouped into

• general control parameters, most importantly optimizer, further restart_edge, etc;

• model- or data-checking specifications, in short “checking options”, such as check.nobs.vs.rankZ,

or check.rankX (currently not for nlmerControl);

• all the parameters to be passed to the optimizer, e.g., maximal number of iterations, passed via

the optCtrl list argument.

Usage

lmerControl(optimizer = "bobyqa",

restart_edge = TRUE,

boundary.tol = 1e-5,

calc.derivs=TRUE,

use.last.params=FALSE,

sparseX = FALSE,

## input checking options

check.nobs.vs.rankZ = "ignore",

check.nobs.vs.nlev = "stop",

check.nlev.gtreq.5 = "ignore",

check.nlev.gtr.1 = "stop",

check.nobs.vs.nRE="stop",

check.rankX = c("message+drop.cols", "silent.drop.cols", "warn+drop.cols",

"stop.deficient", "ignore"),

check.scaleX = c("warning","stop","silent.rescale",

"message+rescale","warn+rescale","ignore"),

check.formula.LHS = "stop",

## convergence checking options

check.conv.grad = .makeCC("warning", tol = 2e-3, relTol = NULL),

check.conv.singular = .makeCC(action = "ignore", tol = 1e-4),

check.conv.hess = .makeCC(action = "warning", tol = 1e-6),

## optimizer args

optCtrl = list())

glmerControl(optimizer = c("bobyqa", "Nelder_Mead"),

restart_edge = FALSE,

boundary.tol = 1e-5,

calc.derivs=TRUE,

use.last.params=FALSE,

sparseX = FALSE,

tolPwrss=1e-7,

compDev=TRUE,

nAGQ0initStep=TRUE,

## input checking options

check.nobs.vs.rankZ = "ignore",

check.nobs.vs.nlev = "stop",

check.nlev.gtreq.5 = "ignore",

check.nlev.gtr.1 = "stop",

check.nobs.vs.nRE="stop",

check.rankX = c("message+drop.cols", "silent.drop.cols", "warn+drop.cols",

"stop.deficient", "ignore"),

check.scaleX = "warning",

check.formula.LHS = "stop",

check.response.not.const = "stop",

## convergence checking options

check.conv.grad = .makeCC("warning", tol = 1e-3, relTol = NULL),

check.conv.singular = .makeCC(action = "ignore", tol = 1e-4),

check.conv.hess = .makeCC(action = "warning", tol = 1e-6),

## optimizer args

optCtrl = list())

nlmerControl(optimizer = "Nelder_Mead", tolPwrss = 1e-10,

optCtrl = list())

.makeCC(action, tol, relTol, ...)

Arguments

optimizer character - name of optimizing function(s). A character vector or list of functions

(length 1 for lmer or glmer, possibly length 2 for glmer). The built-in

optimizers are Nelder_Mead and bobyqa (from the minqa package). Any minimizing

function that allows box constraints can be used provided that it

(1) takes input parameters fn (function to be optimized), par (starting parameter

values), lower (lower bounds) and control (control parameters, passed

through from the control argument) and

(2) returns a list with (at least) elements par (best-fit parameters), fval (best-fit

function value), conv (convergence code, equal to zero for successful convergence)

and (optionally) message (informational message, or explanation

of convergence failure).

Special provisions are made for bobyqa, Nelder_Mead, and optimizers wrapped

in the optimx package; to use the optimx optimizers (including L-BFGS-B from

base optim and nlminb), pass the method argument to optim in the optCtrl argument

(you may also need to load the optimx package manually using library(optimx)

or require(optimx)).

For glmer, if length(optimizer)==2, the first element will be used for the

preliminary (random effects parameters only) optimization, while the second

will be used for the final (random effects plus fixed effect parameters) phase.

See modular for more information on these two phases.

calc.derivs logical - compute gradient and Hessian of nonlinear optimization solution?

use.last.params

logical - should the last value of the parameters evaluated (TRUE), rather than the

value of the parameters corresponding to the minimum deviance, be returned?

This is a "backward bug-compatibility" option; use TRUE only when trying to

match previous results.

sparseX logical - should a sparse model matrix be used for the fixed-effects terms? Currently

inactive.

restart_edge logical - should the optimizer attempt a restart when it finds a solution at the

boundary (i.e. zero random-effect variances or perfect +/-1 correlations)? (Currently

only implemented for lmerControl.)

boundary.tol numeric - within what distance of a boundary should the boundary be checked

for a better fit? (Set to zero to disable boundary checking.)

tolPwrss numeric scalar - the tolerance for declaring convergence in the penalized iteratively

weighted residual sum-of-squares step.

compDev logical scalar - should compiled code be used for the deviance evaluation during

the optimization of the parameter estimates?

nAGQ0initStep do one initial run with nAGQ = 0.

check.nlev.gtreq.5

character - rules for checking whether all random effects have >= 5 levels. See

action.

check.nlev.gtr.1

character - rules for checking whether all random effects have > 1 level. See

action.

check.nobs.vs.rankZ

character - rules for checking whether the number of observations is greater than

(or greater than or equal to) the rank of the random effects design matrix (Z),

usually necessary for identifiable variances. As for action, with the addition of

"warningSmall" and "stopSmall", which run the test only if the dimensions

of Z are < 1e6. nobs > rank(Z) will be tested for LMMs and GLMMs with

estimated scale parameters; nobs >= rank(Z) will be tested for GLMMs with

fixed scale parameter. The rank test is done using the method="qr" option of

the rankMatrix function.

check.nobs.vs.nlev

character - rules for checking whether the number of observations is less than

(or less than or equal to) the number of levels of every grouping factor, usually

necessary for identifiable variances. As for action. nobs<nlevels will be

tested for LMMs and GLMMs with estimated scale parameters; nobs<=nlevels

will be tested for GLMMs with fixed scale parameter.

check.nobs.vs.nRE

character - rules for checking whether the number of observations is greater than

(or greater than or equal to) the number of random-effects levels for each term,

usually necessary for identifiable variances. As for check.nobs.vs.nlev.

check.conv.grad

rules for checking the gradient of the deviance function for convergence. A list

as returned by .makeCC, or a character string with only the action.

check.conv.singular

rules for checking for a singular fit, i.e. one where some parameters are on the

boundary of the feasible space (for example, random effects variances equal to 0

or correlations between random effects equal to +/- 1.0); as for check.conv.grad

above.

check.conv.hess

rules for checking the Hessian of the deviance function for convergence; as for

check.conv.grad above.

check.rankX character - specifying if rankMatrix(X) should be compared with ncol(X) and

if columns from the design matrix should possibly be dropped to ensure that

it has full rank. Sometimes needed to make the model identifiable. The options

can be abbreviated; the three "*.drop.cols" options all do drop columns,

"stop.deficient" gives an error when the rank is smaller than the number of

columns, whereas "ignore" does no rank computation, and will typically lead to

less easily understandable errors, later.

check.scaleX character - check for problematic scaling of columns of fixed-effect model matrix,

e.g. parameters measured on very different scales.

check.formula.LHS

check whether specified formula has a left-hand side. Primarily for internal use

within simulate.merMod; use at your own risk as it may allow the generation

of unstable merMod objects.

check.response.not.const

character - check that the response is not constant.

optCtrl a list of additional arguments to be passed to the nonlinear optimizer (see

Nelder_Mead, bobyqa). In particular, both Nelder_Mead and bobyqa use maxfun

to specify the maximum number of function evaluations they will try before giving

up - in contrast to optim and optimx-wrapped optimizers, which use maxit.

action character - generic choices for the severity level of any test. "ignore": skip the

test. "warning": warn if test fails. "stop": throw an error if test fails.

tol numeric - tolerance for check

relTol numeric - tolerance for checking relative variation

... other elements to include in check specification

Details

Note that (only!) the pre-fitting “checking options” (i.e., all those starting with "check." but not

including the convergence checks ("check.conv.*") or rank-checking ("check.rank*") options)

may also be set globally via options. In that case, (g)lmerControl will use them rather than the

default values, but will not override values that are passed as explicit arguments.

For example, options(lmerControl=list(check.nobs.vs.rankZ = "ignore")) will suppress

warnings that the number of observations is less than the rank of the random effects model matrix

Z.

Value

The *Control functions return a list (inheriting from class "merControl") containing

1. general control parameters, such as optimizer, restart_edge;

2. (currently not for nlmerControl:) "checkControl", a list of data-checking specifications,

e.g., check.nobs.vs.rankZ;

3. parameters to be passed to the optimizer, i.e., the optCtrl list, which may contain maxiter.

.makeCC returns a list containing the check specification (action, tolerance, and optionally relative

tolerance).
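
The three groups of settings described above can be combined in a single control object. The following is a minimal sketch; the particular values (the maxfun limit, the gradient tolerance, and the "warning" action for check.nlev.gtreq.5) are arbitrary choices for illustration, not recommendations.

```r
library(lme4)

ctrl <- lmerControl(
  ## general control: choose one of the built-in optimizers
  optimizer = "Nelder_Mead",
  ## input checking: warn if a grouping factor has fewer than 5 levels
  check.nlev.gtreq.5 = "warning",
  ## convergence checking: same form as the default in Usage, looser tol
  check.conv.grad = .makeCC("warning", tol = 5e-3, relTol = NULL),
  ## optimizer arguments: Nelder_Mead counts function evaluations (maxfun)
  optCtrl = list(maxfun = 2e4)
)

## inspect what was stored
class(ctrl)
ctrl$optimizer
ctrl$optCtrl

## use the control object in a fit
fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, control = ctrl)
```

Passing the settings explicitly, as here, takes precedence over any values set globally through options().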

See Also

convergence

Examples

str(lmerControl())

str(glmerControl())

## Not run:

## fit with the default optimizer (see Usage above) ...

fm0 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy)

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

## or with minqa::bobyqa requested explicitly ...

fm1_bobyqa <- update(fm1,control=lmerControl(optimizer="bobyqa"))

## or with the nlminb function used in older (<1.0) versions of lme4;

## this will usually replicate older results

require(optimx)

fm1_nlminb <- update(fm1,control=lmerControl(optimizer="optimx",

optCtrl=list(method="nlminb")))

## The other option here is method="L-BFGS-B".

## Or we can wrap base::optim():

optimwrap <- function(fn,par,lower,upper,control=list(),

                      ...) {

    if (is.null(control$method)) stop("must specify method in optCtrl")

    method <- control$method

    control$method <- NULL

    ## "Brent" requires finite upper values (lower bound will always

    ## be zero in this case)

    if (method=="Brent") upper <- pmin(1e4,upper)

    res <- optim(par=par,fn=fn,lower=lower,upper=upper,

                 control=control,method=method,...)

    with(res,list(par=par,

                  fval=value,

                  feval=counts[1],

                  conv=convergence,

                  message=message))

}

fm0_brent <- update(fm0,control=lmerControl(optimizer="optimwrap",

                    optCtrl=list(method="Brent")))

## You can also use functions from the nloptr package.

if (require(nloptr)) {

    defaultControl <- list(algorithm="NLOPT_LN_BOBYQA",

                           xtol_abs=1e-6,ftol_abs=1e-6,maxeval=1e5)

    nloptwrap <- function(fn,par,lower,upper,control=list(),...) {

        for (n in names(defaultControl))

            if (is.null(control[[n]])) control[[n]] <- defaultControl[[n]]

        res <- nloptr(x0=par,eval_f=fn,lb=lower,ub=upper,opts=control,...)

        with(res,list(par=solution,

                      fval=objective,

                      feval=iterations,

                      conv=if (status>0) 0 else status,

                      message=message))

    }

    fm1_nloptr <- update(fm1,control=lmerControl(optimizer="nloptwrap"))

    fm1_nloptr_NM <- update(fm1,control=lmerControl(optimizer="nloptwrap",

                            optCtrl=list(algorithm="NLOPT_LN_NELDERMEAD")))

}

## other algorithm options include NLOPT_LN_COBYLA, NLOPT_LN_SBPLX

## End(Not run)

lmList Fit List of lm Objects with a Common Model

Description

Fit a list of lm objects with a common model for different subgroups of the data.

Usage

lmList(formula, data, family, subset, weights, na.action,

offset, pool = TRUE, ...)

Arguments

formula a linear formula object of the form y ~ x1+...+xn | g. In the formula object,

y represents the response, x1,...,xn the covariates, and g the grouping factor

specifying the partitioning of the data according to which different lm fits should

be performed.

family an optional family specification for a generalized linear model.

pool logical scalar, should the variance estimate pool the residual sums of squares?

... additional, optional arguments to be passed to the model function or family evaluation.

data an optional data frame containing the variables named in formula. By default

the variables are taken from the environment from which lmList is called. See

Details.

subset an optional expression indicating the subset of the rows of data that should be

used in the fit. This can be a logical vector, or a numeric vector indicating which

observation numbers are to be included, or a character vector of the row names

to be included. All observations are included by default.

weights an optional vector of ‘prior weights’ to be used in the fitting process. Should be

NULL or a numeric vector.

na.action a function that indicates what should happen when the data contain NAs. The default

action (na.omit, inherited from the ‘factory fresh’ value of getOption("na.action"))

strips any observations with any missing values in any variables.

offset this can be used to specify an a priori known component to be included in the

linear predictor during fitting. This should be NULL or a numeric vector of length

equal to the number of cases. One or more offset terms can be included in the

formula instead or as well, and if more than one is specified their sum is used.

See model.offset.

Details

• data should be a data frame (not, e.g. a groupedData object from the nlme package); use

as.data.frame first to convert the data.

• While data is optional, the package authors strongly recommend its use, especially when

later applying methods such as update and drop1 to the fitted model (such methods are not

guaranteed to work properly if data is omitted). If data is omitted, variables will be taken

from the environment of formula (if specified as a formula) or from the parent frame (if

specified as a character vector).

Value

an object of class lmList4 (see there, notably for the methods defined).

See Also

lmList4

Examples

fm.plm <- lmList(Reaction ~ Days | Subject, sleepstudy)

coef(fm.plm)

fm.2 <- update(fm.plm, pool = FALSE)

## coefficients are the same, "pooled or unpooled":

stopifnot( all.equal(coef(fm.2), coef(fm.plm)) )

(ci <- confint(fm.plm)) # print and rather *see* :

plot(ci) # how widely they vary for the individuals
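
Conceptually, the "common model for different subgroups" amounts to one lm fit per level of the grouping factor. The base-R sketch below illustrates that idea on the Orange data set from the datasets package; it is not how lmList() is implemented, and the names orange, fits and coef_by_tree are illustrative. Orange is stored as a groupedData object, so it is first converted with as.data.frame, in line with the Details above.

```r
## Orange is a groupedData object; convert it to a plain data frame first
orange <- as.data.frame(datasets::Orange)

## one lm fit per tree, all sharing the model circumference ~ age
fits <- lapply(split(orange, orange$Tree),
               function(d) lm(circumference ~ age, data = d))

## collect the per-group coefficients into a matrix (one row per tree)
coef_by_tree <- do.call(rbind, lapply(fits, coef))
coef_by_tree
```

With lme4 loaded, the corresponding call would be along the lines of lmList(circumference ~ age | Tree, orange), which additionally offers the pooled variance estimate and the methods listed for class lmList4.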

lmList4-class Class "lmList4" of ’lm’ Objects on Common Model

Description

Class "lmList4" is an S4 class with basically a list of objects of class lm with a common model

(but different data); see lmList() which returns these.

Package nlme’s lmList() returns objects of S3 class "lmList" and provides methods for them, on

which our methods partly build.

Objects from the Class

Objects can be created by calls of the form new("lmList4", ...) or, more commonly, by a call

to lmList().

Methods

A dozen methods are provided. Currently, S4 methods for show, coercion (as(.,.)) and others

inherited via "list", and S3 methods for coef, confint, fitted, fixef, formula, logLik, pairs,

plot, predict, print, qqnorm, ranef, residuals, sigma, summary, and update.

sigma(object) returns the standard deviation σ̂ (of the errors in the linear models), assuming a

common variance σ² by pooling (even when pool = FALSE was used in the fit).

See Also

lmList

Examples

if(getRversion() >= "3.2.0") {

(mm <- methods(class = "lmList4"))

## The S3 ("not S4") ones :

mm[!attr(mm,"info")[,"isS4"]]

}

## For more examples: example(lmList) i.e., ?lmList

lmResp Generator objects for the response classes

Description

The generator objects for the lmResp, lmerResp, glmResp and nlsResp reference classes. Such

objects are primarily used through their new methods.

Usage

lmResp(...)

Arguments

... List of arguments (see Note).

Methods

new(y=y): Create a new lmResp or lmerResp object.

new(family=family, y=y): Create a new glmResp object.

new(y=y, nlmod=nlmod, nlenv=nlenv, pnames=pnames, gam=gam): Create a new nlsResp object.

Note

Arguments to the new methods must be named arguments.

• y the numeric response vector

• family a family object

• nlmod the nonlinear model function

• nlenv an environment holding data objects for evaluation of nlmod

• pnames a character vector of parameter names

• gam a numeric vector - the initial linear predictor

See Also

lmResp, lmerResp, glmResp, nlsResp

lmResp-class Reference Classes for Response Modules,

"(lm|glm|nls|lmer)Resp"

Description

Reference classes for response modules, including linear models, "lmResp", generalized linear

models, "glmResp", nonlinear models, "nlsResp" and linear mixed-effects models, "lmerResp".

Each reference class is associated with a C++ class of the same name. As is customary, the generator

object for each class has the same name as the class.

Extends

All reference classes extend and inherit methods from "envRefClass". Furthermore, "glmResp",

"nlsResp" and "lmerResp" all extend the "lmResp" class.

Note

Objects from these reference classes correspond to objects in C++ classes. Methods are invoked

on the C++ classes using the external pointer in the ptr field. When saving such an object the

external pointer is converted to a null pointer, which is why there are redundant fields containing

enough information as R objects to be able to regenerate the C++ object. The convention is that a

field whose name begins with an upper-case letter is an R object and the corresponding field whose

name begins with the lower-case letter is a method. Access to the external pointer should be through

the method, not through the field.

See Also

lmer, glmer, nlmer, merMod.

Examples

showClass("lmResp")

str(lmResp$new(y=1:4))

showClass("glmResp")

str(glmResp$new(family=poisson(), y=1:4))

showClass("nlsResp")

showClass("lmerResp")

str(lmerResp$new(y=1:4))

merMod-class Class "merMod" of Fitted Mixed-Effect Models

Description

A mixed-effects model is represented as a merPredD object and a response module of a class that

inherits from class lmResp. A model with a lmerResp response has class lmerMod; a glmResp

response has class glmerMod; and a nlsResp response has class nlmerMod.

Usage

## S3 method for class 'merMod'

anova(object, ..., refit = TRUE, model.names=NULL)

## S3 method for class 'merMod'

coef(object, ...)

## S3 method for class 'merMod'

deviance(object, REML = NULL, ...)

REMLcrit(object)

## S3 method for class 'merMod'

extractAIC(fit, scale = 0, k = 2, ...)

## S3 method for class 'merMod'

family(object, ...)

## S3 method for class 'merMod'

formula(x, fixed.only = FALSE, random.only = FALSE, ...)

## S3 method for class 'merMod'

fitted(object, ...)

## S3 method for class 'merMod'

logLik(object, REML = NULL, ...)

## S3 method for class 'merMod'

nobs(object, ...)

## S3 method for class 'merMod'

ngrps(object, ...)

## S3 method for class 'merMod'

terms(x, fixed.only = TRUE, random.only = FALSE, ...)

## S3 method for class 'merMod'

vcov(object, correlation = TRUE, sigm = sigma(object),

use.hessian = NULL, ...)

## S3 method for class 'merMod'

model.frame(formula, fixed.only = FALSE, ...)

## S3 method for class 'merMod'

model.matrix(object, type = c("fixed", "random", "randomListRaw"), ...)

## S3 method for class 'merMod'

print(x, digits = max(3, getOption("digits") - 3),

correlation = NULL, symbolic.cor = FALSE,

signif.stars = getOption("show.signif.stars"), ranef.comp = "Std.Dev.", ...)

## S3 method for class 'merMod'

summary(object, correlation = , use.hessian = NULL, ...)

## S3 method for class 'summary.merMod'

print(x, digits = max(3, getOption("digits") - 3),

correlation = NULL, symbolic.cor = FALSE,

signif.stars = getOption("show.signif.stars"),

ranef.comp = c("Variance", "Std.Dev."), show.resids = TRUE, ...)

## S3 method for class 'merMod'

update(object, formula., ..., evaluate = TRUE)

## S3 method for class 'merMod'

weights(object, type = c("prior", "working"), ...)

Arguments

object an R object of class merMod, i.e., as resulting from lmer(), or glmer(), etc.

x an R object of class merMod or summary.merMod, respectively, the latter resulting

from summary(<merMod>).

fit an R object of class merMod.

formula in the case of model.frame, a merMod object.

refit logical indicating if objects of class lmerMod should be refitted with ML before

comparing models. The default is TRUE to prevent the common mistake of inappropriately

comparing REML-fitted models with different fixed effects, whose

likelihoods are not directly comparable.

model.names character vectors of model names to be used in the anova table.

scale Not currently used (see extractAIC).

k see extractAIC.

REML Logical. If TRUE, return the restricted log-likelihood rather than the log-likelihood.

If NULL (the default), set REML to isREML(object) (see isREML).

fixed.only logical indicating if only the fixed effects components (terms or formula elements) are sought. If false, all components, including random ones, are returned.

random.only complement of fixed.only; indicates whether random components only are

sought. (Trying to specify fixed.only and random.only at the same time will

produce an error.)

correlation (logical) for vcov, indicates whether the correlation matrix as well as the variance-covariance matrix is desired; for summary.merMod, indicates whether the correlation matrix should be computed and stored along with the covariance; for print.summary.merMod, indicates whether the correlation matrix of the fixed-effects parameters should be printed. In the latter case, when NULL (the default), the correlation matrix is printed when it has been computed by summary(.), and when p <= 20.

use.hessian (logical) indicates whether to use the finite-difference Hessian of the deviance function to compute standard errors of the fixed effects, rather than estimating based on internal information about the inverse of the model matrix (see getME(.,"RX")). The default is to use the Hessian whenever the fixed effect parameters are arguments to the deviance function (i.e. for GLMMs with nAGQ>0), and to use getME(.,"RX") whenever the fixed effect parameters are profiled out (i.e. for GLMMs with nAGQ==0 or LMMs).

use.hessian=FALSE is backward-compatible with older versions of lme4, but may give less accurate SE estimates when the estimates of the fixed-effect (see getME(.,"beta")) and random-effect (see getME(.,"theta")) parameters are correlated.

sigm the residual standard error; by default sigma(object).

digits number of signiﬁcant digits for printing

symbolic.cor should a symbolic encoding of the ﬁxed-effects correlation matrix be printed?

If so, the symnum function is used.

signif.stars (logical) should signiﬁcance stars be used?

ranef.comp character vector of length one or two, indicating if random-effects parameters

should be reported on the variance and/or standard deviation scale.

show.resids should the quantiles of the scaled residuals be printed?

formula. see update.formula.

evaluate see update.

type For weights, type of weights to be returned; either "prior" for the initially supplied weights or "working" for the weights at the final iteration of the penalized iteratively reweighted least squares algorithm. For model.matrix, type of model matrix to return (one of fixed giving the fixed effects model matrix, random giving the random effects model matrix, or randomListRaw giving a list of the raw random effects model matrices associated with each random effects term).

... potentially further arguments passed from other methods.
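As a brief illustration of the fixed.only, random.only and type arguments described above, the following sketch fits a model to the sleepstudy data shipped with lme4 and extracts the corresponding pieces. The object names (fm, X, Z, w) are chosen here for illustration only and are not part of the package.

```r
library(lme4)

## Fit a linear mixed model to the sleepstudy data (shipped with lme4)
fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

## Full formula, fixed-effects part only, random-effects part only
formula(fm)
formula(fm, fixed.only = TRUE)
formula(fm, random.only = TRUE)

## Fixed-effects (dense) and random-effects (sparse) model matrices
X <- model.matrix(fm, type = "fixed")
Z <- model.matrix(fm, type = "random")
dim(X)   # observations x fixed-effects coefficients
dim(Z)   # observations x random-effects coefficients

## Prior weights; no weights were supplied in this fit
w <- weights(fm, type = "prior")
```

Requesting fixed.only = TRUE and random.only = TRUE in the same call is documented above as an error, so the two are shown in separate calls.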

Objects from the Class

Objects of class merMod are created by calls to lmer, glmer or nlmer.

S3 methods

The following S3 methods with arguments given above exist (this list is currently not complete):

anova: returns the sequential decomposition of the contributions of fixed-effects terms or, for multiple arguments, model comparison statistics. For objects of class lmerMod the default behavior is to refit the models with ML if fitted with REML = TRUE; this can be controlled via the refit argument. See also anova.

coef:Computes the sum of the random and ﬁxed effects coefﬁcients for each explanatory variable

for each level of each grouping factor.

extractAIC:Computes the (generalized) Akaike An Information Criterion. If isREML(fit), then

fit is reﬁtted using maximum likelihood.

family:family of ﬁtted GLMM. (Warning: this accessor may not work properly with customized

families/link functions.)

fitted:Fitted values, given the conditional modes of the random effects. For more ﬂexible access

to ﬁtted values, use predict.merMod.

logLik: Log-likelihood at the fitted value of the parameters. Note that for GLMMs, the returned value is only proportional to the log probability density (or distribution) of the response variable. See logLik.

model.frame:returns the frame slot of merMod.

model.matrix:returns the ﬁxed effects model matrix.

nobs,ngrps:Number of observations and vector of the numbers of levels in each grouping factor.

See ngrps.

summary: Computes and returns a list of summary statistics of the fitted model. The amount of output can be controlled via the print method; see also summary.

print.summary:Controls the output for the summary method.

vcov:Calculate variance-covariance matrix of the ﬁxed effect terms, see also vcov.

update:See update.
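The refitting behavior of the anova method can be checked with a small sketch. It assumes the sleepstudy data from lme4; the model names fm0, fm1, ml0 and ml1 are illustrative. With the default refit = TRUE, comparing two REML fits should give the same test statistic as comparing explicit ML refits obtained with refitML.

```r
library(lme4)

## Two REML fits that differ in their fixed effects
fm0 <- lmer(Reaction ~ 1 + (1 | Subject), sleepstudy)
fm1 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy)

## Default refit = TRUE: both models are refitted with ML before comparison
a_auto <- anova(fm0, fm1)

## The same comparison with the ML refits made explicit
ml0 <- refitML(fm0)
ml1 <- refitML(fm1)
a_manual <- anova(ml0, ml1)

## Likelihood-ratio statistics from the two routes
a_auto$Chisq[2]
a_manual$Chisq[2]
```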

Deviance and log-likelihood of GLMMs

One must be careful when defining the deviance of a GLM. For example, should the deviance be defined as minus twice the log-likelihood or does it involve subtracting the deviance for a saturated model? To distinguish these two possibilities we refer to absolute deviance (minus twice the log-likelihood) and relative deviance (relative to a saturated model, e.g. Section 2.3.1 in McCullagh and Nelder 1989). With GLMMs however, there is an additional complication involving the distinction between the likelihood and the conditional likelihood. The latter is the likelihood obtained by conditioning on the estimates of the conditional modes of the spherical random effects coefficients, whereas the likelihood itself (i.e. the unconditional likelihood) involves integrating out these coefficients. The following table summarizes how to extract the various types of deviance for a glmerMod object.

            conditional          unconditional
relative    deviance(object)     NA in lme4
absolute    object@resp$aic()    -2*logLik(object)

This table requires two caveats:

• If the family involves a scale parameter (e.g. Gamma) then object@resp$aic() - 2 * getME(object, "devcomp")$dims["useSc"] is required for the absolute-conditional case.

• If adaptive Gauss-Hermite quadrature is used, then logLik(object) is currently only proportional to the absolute-unconditional log-likelihood.

For more information about this topic see the misc/logLikGLMM directory in the package source.
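The three available cells of the table can be extracted as in the following sketch, which uses the cbpp data from lme4 and a binomial family (no scale parameter, so the first caveat does not apply) with the default Laplace approximation. The variable names are illustrative.

```r
library(lme4)

## Binomial GLMM on the cbpp data (shipped with lme4)
gm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
            data = cbpp, family = binomial)

dev_rel_cond <- deviance(gm)                 # relative, conditional
dev_abs_cond <- gm@resp$aic()                # absolute, conditional
dev_abs_unc  <- -2 * as.numeric(logLik(gm))  # absolute, unconditional

c(relative.conditional   = dev_rel_cond,
  absolute.conditional   = dev_abs_cond,
  absolute.unconditional = dev_abs_unc)
```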

Slots

resp:A reference class object for an lme4 response module (lmResp-class).

Gp:See getME.

call:The matched call.

frame:The model frame containing all of the variables required to parse the model formula.

flist:See getME.

cnms:See getME.

lower:See getME.

theta:Covariance parameter vector.

beta:Fixed effects coefﬁcients.

u:Conditional modes of spherical random effects coefficients.

devcomp:See getME.

pp:A reference class object for an lme4 predictor module (merPredD-class).

optinfo:List containing information about the nonlinear optimization.

See Also

lmer, glmer, nlmer, merPredD, lmerResp, glmResp, nlsResp

Other methods for merMod objects documented elsewhere include: fortify.merMod, drop1.merMod, isLMM.merMod, isGLMM.merMod, isNLMM.merMod, isREML.merMod, plot.merMod, predict.merMod, profile.merMod, ranef.merMod, refit.merMod, refitML.merMod, residuals.merMod, sigma.merMod, simulate.merMod, summary.merMod.

Examples

showClass("merMod")

methods(class="merMod")## over 30 (S3) methods available

## -> example(lmer) for an example of vcov.merMod()

merPredD Generator object for the merPredD class

Description

The generator object for the merPredD reference class. Such an object is primarily used through its

new method.

Usage

merPredD(...)

Arguments

... List of arguments (see Note).

Note

merPredD(...) is a short form of new("merPredD", ...) to create a new merPredD object and

the ... must be named arguments, (X, Zt, Lambdat, Lind, theta,n):

X:dense model matrix for the fixed-effects parameters, to be stored in the X field.

Zt:transpose of the sparse model matrix for the random effects. It is stored in the Zt ﬁeld.

Lambdat:transpose of the sparse lower triangular relative variance factor (stored in the Lambdat

ﬁeld).

Lind:integer vector of the same length as the x slot in the Lambdat field. Its elements should be in the range 1 to the length of the theta field.

theta:numeric vector of variance component parameters (stored in the theta ﬁeld).

n:sample size, usually nrow(X).
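A minimal sketch of such a call, assuming the components are taken from the output of lFormula (documented under modular) rather than built by hand; the names lmod, rt and pp are illustrative.

```r
library(lme4)

## Parse formula and data to obtain X and the random-effects components
lmod <- lFormula(Reaction ~ Days + (Days | Subject), sleepstudy)
rt <- lmod$reTrms

## Create the predictor module; all arguments are named
pp <- merPredD(X = lmod$X, Zt = rt$Zt, Lambdat = rt$Lambdat,
               Lind = rt$Lind, theta = rt$theta, n = nrow(lmod$X))

is(pp, "merPredD")

## Lind has one entry per stored element of Lambdat,
## each indexing into theta
length(pp$Lind) == length(pp$Lambdat@x)
range(pp$Lind)
```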

See Also

The class deﬁnition, merPredD, also for examples.

merPredD-class Class "merPredD" - a Dense Predictor Reference Class

Description

A reference class (see mother class definition "envRefClass") for a mixed-effects model predictor

module with a dense model matrix for the ﬁxed-effects parameters. The reference class is associated

with a C++ class of the same name. As is customary, the generator object, merPredD, for the class

has the same name as the class.

Note

Objects from this reference class correspond to objects in a C++ class. Methods are invoked on

the C++ class object using the external pointer in the Ptr ﬁeld. When saving such an object the

external pointer is converted to a null pointer, which is why there are redundant ﬁelds containing

enough information as R objects to be able to regenerate the C++ object. The convention is that a

field whose name begins with an upper-case letter is an R object and the corresponding field, whose

name begins with the lower-case letter is a method. References to the external pointer should be

through the method, not directly through the Ptr ﬁeld.

See Also

lmer,glmer,nlmer,merPredD,merMod.

Examples

showClass("merPredD")

pp <- slot(lmer(Yield ~ 1|Batch, Dyestuff), "pp")

stopifnot(is(pp, "merPredD"))

str(pp) # an overview of all fields and methods' names.

mkdevfun Create Deviance Evaluation Function from Predictor and Response

Module

Description

From a merMod object create an R function that takes a single argument, which is the new parameter

value, and returns the deviance.

Usage

mkdevfun(rho, nAGQ = 1L, maxit = 100, verbose = 0, control = list())

Arguments

rho an environment containing pp, a prediction module, typically of class merPredD

and resp, a response module, e.g., of class lmerResp.

nAGQ scalar integer - the number of adaptive Gauss-Hermite quadrature points. A

value of 0 indicates that both the ﬁxed-effects parameters and the random effects

are optimized by the iteratively reweighted least squares algorithm.

maxit scalar integer, currently only for GLMMs: the maximal number of Pwrss update

iterations.

verbose scalar logical: print verbose output?

control list of control parameters, a subset of those speciﬁed by lmerControl (tolPwrss

and compDev for GLMMs, tolPwrss for NLMMs)

Details

The function returned by mkdevfun evaluates the deviance of the model represented by the predictor

module, pp, and the response module, resp.

For lmer model objects the argument of the resulting function is the variance component parameter,

theta, with lower bound. For glmer or nlmer model objects with nAGQ = 0 the argument is also

theta. However, when nAGQ > 0, the argument is c(theta, beta).
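For the lmer case, the relationship between the deviance function and a fitted model can be sketched as follows. The deviance function is obtained through devFunOnly = TRUE rather than by calling mkdevfun directly, and the names devf, fm and theta_hat are illustrative. Since the default fit uses REML, the function evaluates the REML criterion, which at the estimated theta should agree with minus twice the (restricted) log-likelihood of the fit.

```r
library(lme4)

form <- Reaction ~ Days + (Days | Subject)

## Deviance function only (no optimization), and the fitted model
devf <- lmer(form, sleepstudy, devFunOnly = TRUE)
fm   <- lmer(form, sleepstudy)

## The argument of devf is the covariance parameter vector theta
theta_hat <- getME(fm, "theta")
length(theta_hat)

## Criterion evaluated at the estimate versus the value stored in the fit
crit <- devf(theta_hat)
all.equal(crit, -2 * as.numeric(logLik(fm)))
```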

Value

A function of one numeric argument.

See Also

lmer,glmer and nlmer

Examples

(dd <- lmer(Yield ~ 1|Batch, Dyestuff, devFunOnly=TRUE))

dd(0.8)

minqa::bobyqa(1, dd, 0)

mkMerMod Create a ’merMod’ Object

Description

Create an object of (a subclass of) class merMod from the environment of the objective function and

the value returned by the optimizer.

Usage

mkMerMod(rho, opt, reTrms, fr, mc, lme4conv = NULL)

Arguments

rho the environment of the objective function

opt the optimization result returned by the optimizer (a list: see lmerControl for

required elements)

reTrms random effects structure from the calling function (see mkReTrms for required

elements)

fr model frame (see model.frame)

mc matched call from the calling function

lme4conv lme4-speciﬁc convergence information (results of checkConv)

Value

an object from a class that inherits from merMod.

mkRespMod Create an lmerResp, glmResp or nlsResp instance

Description

Create an lmerResp, glmResp or nlsResp instance

Usage

mkRespMod(fr, REML = NULL, family = NULL, nlenv = NULL,

nlmod = NULL, ...)

Arguments

fr a model frame

REML logical scalar, value of REML for an lmerResp instance

family the optional glm family (glmResp only)

nlenv the nonlinear model evaluation environment (nlsResp only)

nlmod the nonlinear model function (nlsResp only)

... where to look for response information if fr is missing. Can contain a model

response, y, offset, offset, and weights, weights.

Value

an lmerResp or glmResp or nlsResp instance

See Also

Other utilities: findbars,mkReTrms,nlformula,nobars,subbars

mkReTrms Make Random Effect Terms: Create Z, Lambda, Lind, etc.

Description

From the result of findbars applied to a model formula and the evaluation frame fr, create the

model matrix Zt, etc, associated with the random-effects terms.

Usage

mkReTrms(bars, fr, drop.unused.levels=TRUE)

Arguments

bars a list of parsed random-effects terms

fr a model frame in which to evaluate these terms

drop.unused.levels

(logical) drop unused factor levels? (experimental)

Value

a list with components

Zt transpose of the sparse model matrix for the random effects

theta initial values of the covariance parameters

Lind an integer vector of indices determining the mapping of the elements of the

theta vector to the "x" slot of Lambdat

Gp

lower lower bounds on the covariance parameters

Lambdat transpose of the sparse relative covariance factor

flist list of grouping factors used in the random-effects terms

cnms a list of column names of the random effects according to the grouping factors

Ztlist list of components of the transpose of the random-effects model matrix, sepa-

rated by random-effects term
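How theta, Lind and Lambdat fit together can be seen in a short sketch on the sleepstudy data (the names form, mf and rt are illustrative): the stored elements of Lambdat are the elements of theta selected by Lind.

```r
library(lme4)

form <- Reaction ~ Days + (Days | Subject)
mf <- model.frame(subbars(form), data = sleepstudy)
rt <- mkReTrms(findbars(form), mf)

dim(rt$Zt)   # random-effects coefficients x observations
rt$theta     # initial covariance parameters
rt$lower     # lower bounds for theta

## Lind maps theta onto the "x" slot of Lambdat
all.equal(rt$Lambdat@x, rt$theta[rt$Lind])
```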

See Also

Other utilities: findbars, mkRespMod, nlformula, nobars, subbars. getME can retrieve these

components from a ﬁtted model, although their values and/or forms may be slightly different in the

ﬁnal ﬁtted model from their original values as returned from mkReTrms.

Examples

data("Pixel", package="nlme")

mform <- pixel ~ day + I(day^2) + (day | Dog) + (1 | Side/Dog)

(bar.f <- findbars(mform)) # list with 3 terms

mf <- model.frame(subbars(mform),data=Pixel)

rt <- mkReTrms(bar.f,mf)

names(rt)

mkSimulateTemplate Make templates suitable for guiding mixed model simulations

Description

Make data and parameter templates suitable for guiding mixed model simulations, by specifying

a model formula and other information (EXPERIMENTAL). Most useful for simulating balanced

designs and for getting started on unbalanced simulations.

Usage

mkParsTemplate(formula, data)

mkDataTemplate(formula, data, nGrps = 2, nPerGrp = 1, rfunc = NULL, ...)

Arguments

formula A mixed model formula (see lmer).

data A data frame containing the names in formula.

nGrps Number of levels of a grouping factor.

nPerGrp Number of observations per level.

rfunc Function for generating covariate data (e.g. rnorm).

... Additional parameters for rfunc.

See Also

These functions are designed to be used with simulate.merMod.

mkVarCorr Make Variance and Correlation Matrices from theta

Description

Make variance and correlation matrices from theta

Usage

mkVarCorr(sc, cnms, nc, theta, nms)

Arguments

sc scale factor (residual standard deviation).

cnms component names.

nc numeric vector: number of terms in each RE component.

theta theta vector (lower-triangle of Cholesky factors).

nms component names (FIXME: nms/cnms redundant: nms=names(cnms)?)

Value

A matrix

See Also

VarCorr
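The mapping from theta to a covariance matrix can be reproduced by hand for a single random-effects term. The sketch below does not call mkVarCorr itself; it assumes a fitted sleepstudy model with one 2 x 2 term and rebuilds the covariance matrix from the lower-triangular Cholesky factor stored (column-wise) in theta, scaled by the residual standard deviation. The names Lam, V_manual and V_lme4 are illustrative.

```r
library(lme4)

fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

## theta holds the lower triangle (column-wise) of the relative Cholesky factor
theta <- getME(fm, "theta")
Lam <- matrix(0, 2, 2)
Lam[lower.tri(Lam, diag = TRUE)] <- theta

## Covariance matrix of the random effects: sigma^2 * Lam %*% t(Lam)
V_manual <- sigma(fm)^2 * tcrossprod(Lam)

## Compare with the matrix reported by VarCorr
V_lme4 <- VarCorr(fm)$Subject
all.equal(V_manual, V_lme4, check.attributes = FALSE)
```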

modular Modular Functions for Mixed Model Fits

Description

Modular functions for mixed model ﬁts

Usage

lFormula(formula, data = NULL, REML = TRUE, subset,

weights, na.action, offset, contrasts = NULL,

control = lmerControl(), ...)

mkLmerDevfun(fr, X, reTrms, REML = TRUE, start = NULL,

verbose = 0, control = lmerControl(), ...)

optimizeLmer(devfun,

optimizer = formals(lmerControl)$optimizer,

restart_edge = formals(lmerControl)$restart_edge,

boundary.tol = formals(lmerControl)$boundary.tol,

start = NULL, verbose = 0L,

control = list(), ...)

glFormula(formula, data = NULL, family = gaussian,

subset, weights, na.action, offset, contrasts = NULL,

mustart, etastart, control = glmerControl(), ...)

mkGlmerDevfun(fr, X, reTrms, family, nAGQ = 1L,

verbose = 0L, maxit = 100L, control = glmerControl(), ...)

optimizeGlmer(devfun, optimizer = "bobyqa",

restart_edge = FALSE,

boundary.tol = formals(glmerControl)$boundary.tol,

verbose = 0L, control = list(),

nAGQ = 1L, stage = 1, start = NULL, ...)

updateGlmerDevfun(devfun, reTrms, nAGQ = 1L)

Arguments

formula a two-sided linear formula object describing both the fixed-effects and random-effects parts of the model, with the response on the left of a ~ operator and the terms, separated by + operators, on the right. Random-effects terms are distinguished by vertical bars ("|") separating expressions for design matrices from grouping factors.

data an optional data frame containing the variables named in formula. By default

the variables are taken from the environment from which lmer is called. While

data is optional, the package authors strongly recommend its use, especially

when later applying methods such as update and drop1 to the ﬁtted model

(such methods are not guaranteed to work properly if data is omitted). If data

is omitted, variables will be taken from the environment of formula (if speciﬁed

as a formula) or from the parent frame (if speciﬁed as a character vector).

REML (logical) indicating to ﬁt restricted maximum likelihood model.

subset an optional expression indicating the subset of the rows of data that should be

used in the ﬁt. This can be a logical vector, or a numeric vector indicating which

observation numbers are to be included, or a character vector of the row names

to be included. All observations are included by default.

weights an optional vector of ‘prior weights’ to be used in the ﬁtting process. Should be

NULL or a numeric vector.

na.action a function that indicates what should happen when the data contain NAs. The default action (na.omit, inherited from the 'factory fresh' value of getOption("na.action")) strips any observations with any missing values in any variables.

offset this can be used to specify an a priori known component to be included in the

linear predictor during ﬁtting. This should be NULL or a numeric vector of length

equal to the number of cases. One or more offset terms can be included in the

formula instead or as well, and if more than one is speciﬁed their sum is used.

See model.offset.

contrasts an optional list. See the contrasts.arg of model.matrix.default.

control a list giving

for [g]lFormula: all options for running the model, see lmerControl;

for mkLmerDevfun, mkGlmerDevfun: options for the inner optimization step;

for optimizeLmer and optimizeGlmer: control parameters for the nonlinear optimizer (typically inherited from the ... argument to lmerControl).

fr A model frame containing the variables needed to create an lmerResp or glmResp

instance.

X fixed-effects design matrix

reTrms information on random effects structure (see mkReTrms).

start starting values (see lmer)

verbose print output?

maxit maximal number of Pwrss update iterations.

devfun a deviance function, as generated by mkLmerDevfun

nAGQ number of Gauss-Hermite quadrature points

stage optimization stage (1: nAGQ=0, optimize over theta only; 2: nAGQ possibly

>0, optimize over theta and beta)

optimizer character - name of optimizing function(s). A character vector or list of func-

tions: length 1 for lmer or glmer, possibly length 2 for glmer. The built-in

optimizers are "Nelder_Mead" and "bobyqa" (from the minqa package). Any

minimizing function that allows box constraints can be used provided that it

1. takes input parameters fn (function to be optimized), par (starting parame-

ter values), lower (lower bounds) and control (control parameters, passed

through from the control argument) and

2. returns a list with (at least) elements par (best-ﬁt parameters), fval (best-ﬁt

function value), conv (convergence code) and (optionally) message (infor-

mational message, or explanation of convergence failure).

Special provisions are made for bobyqa, Nelder_Mead, and optimizers wrapped

in the optimx package; to use optimx optimizers (including L-BFGS-B from

base optim and nlminb), pass the method argument to optim in the control

argument.

For glmer, if length(optimizer)==2, the ﬁrst element will be used for the

preliminary (random effects parameters only) optimization, while the second

will be used for the ﬁnal (random effects plus ﬁxed effect parameters) phase.

See modular for more information on these two phases.

restart_edge see lmerControl

boundary.tol see lmerControl

family a GLM family; see glm and family.

mustart optional starting values on the scale of the conditional mean; see glm for details.

etastart optional starting values on the scale of the unbounded predictor; see glm for

details.

... other potential arguments; for optimizeLmer and optimizeGlmer, these are

passed to internal function optwrap, which has relevant parameters calc.derivs

and use.last.params (see lmerControl).
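The interface required of a user-supplied optimizer (see the optimizer argument above) can be sketched with a thin wrapper around base R's optim using method "L-BFGS-B". The wrapper name optim_wrap is hypothetical and not part of lme4; it is only one possible way to satisfy the fn/par/lower/control input convention and the par/fval/conv/message output convention, and it additionally accepts an upper argument in case one is passed.

```r
library(lme4)

## Hypothetical wrapper: adapts optim(method = "L-BFGS-B") to the
## argument and return-value conventions described for 'optimizer'
optim_wrap <- function(fn, par, lower, upper = Inf, control = list(), ...) {
  res <- optim(par = par, fn = fn, lower = lower, upper = upper,
               method = "L-BFGS-B", control = control, ...)
  list(par     = res$par,          # best-fit parameters
       fval    = res$value,        # best-fit function value
       conv    = res$convergence,  # convergence code (0 = success)
       message = res$message)      # informational message, if any
}

## Use the wrapper as the optimizer for an lmer fit
fm_custom <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy,
                  control = lmerControl(optimizer = optim_wrap))
fixef(fm_custom)
```

Whether a gradient-based method such as L-BFGS-B is a good choice depends on the problem; the built-in optimizers are derivative-free.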

Details

These functions make up the internal components of a [gn]lmer fit.

• [g]lFormula takes the arguments that would normally be passed to [g]lmer, checking for errors and processing the formula and data input to create a list of objects required to fit a mixed model.

• mk(Gl|L)merDevfun takes the output of the previous step (minus the formula component) and creates a deviance function.

• optimize(Gl|L)mer takes a deviance function and optimizes over theta (or over theta and beta, if stage is set to 2 for optimizeGlmer).

• updateGlmerDevfun takes the first stage of a GLMM optimization (with nAGQ=0, optimizing over theta only) and produces a second-stage deviance function.

• mkMerMod takes the environment of a deviance function, the results of an optimization, a list of random-effect terms, a model frame, and a model call and produces a [g]lmerMod object.

Value

lFormula and glFormula return a list containing components:

fr model frame

X fixed-effect design matrix

reTrms list containing information on random effects structure: result of mkReTrms

REML (lFormula only): logical indicating if restricted maximum likelihood was used (Copy of

argument.)

mkLmerDevfun and mkGlmerDevfun return a function to calculate deviance (or restricted deviance)

as a function of the theta (random-effect) parameters. updateGlmerDevfun returns a function to

calculate the deviance as a function of a concatenation of theta and beta (ﬁxed-effect) parameters.

These deviance functions have an environment containing objects required for their evaluation.

CAUTION: The environment of functions returned by mk(Gl|L)merDevfun contains reference

class objects (see ReferenceClasses, merPredD-class, lmResp-class), which behave in ways

that may surprise many users. For example, if the output of mk(Gl|L)merDevfun is naively copied,

then modiﬁcations to the original will also appear in the copy (and vice versa). To avoid this

behavior one must make a deep copy (see ReferenceClasses for details).
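The difference between a plain assignment and a deep copy can be sketched as follows. The example assumes that the environment of the deviance function contains the predictor module under the name pp (as described for mkdevfun) and uses the copy() method of the reference class; the names rho, pp_ref and pp_deep are illustrative.

```r
library(lme4)

lmod <- lFormula(Reaction ~ Days + (Days | Subject), sleepstudy)
devfun <- do.call(mkLmerDevfun, lmod)
rho <- environment(devfun)

pp_ref  <- rho$pp          # plain assignment: a second handle on the same object
pp_deep <- rho$pp$copy()   # deep copy: an independent object

## Evaluate the deviance at a theta different from the starting value
devfun(c(2, 0, 2))

## pp_ref and rho$pp are the same object, so they always agree
identical(pp_ref$theta, rho$pp$theta)

## The deep copy, taken before the evaluation, keeps the starting value
pp_deep$theta
```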

optimizeLmer and optimizeGlmer return the results of an optimization.

Examples

### Fitting a linear mixed model in 4 modularized steps

## 1. Parse the data and formula:

lmod <- lFormula(Reaction ~ Days + (Days|Subject), sleepstudy)

names(lmod)

## 2. Create the deviance function to be optimized:

(devfun <- do.call(mkLmerDevfun, lmod))

ls(environment(devfun)) # the environment of 'devfun' contains objects

# required for its evaluation

## 3. Optimize the deviance function:

opt <- optimizeLmer(devfun)

opt[1:3]

## 4. Package up the results:

mkMerMod(environment(devfun), opt, lmod$reTrms, fr = lmod$fr)

### Same model in one line

lmer(Reaction ~ Days + (Days|Subject), sleepstudy)

### Fitting a generalized linear mixed model in six modularized steps

## 1. Parse the data and formula:

glmod <- glFormula(cbind(incidence, size - incidence) ~ period + (1 | herd),

data = cbpp, family = binomial)

names(glmod)

## 2. Create the deviance function for optimizing over theta:

(devfun <- do.call(mkGlmerDevfun, glmod))

ls(environment(devfun)) # the environment of devfun contains lots of info

## 3. Optimize over theta using a rough approximation (i.e. nAGQ = 0):

(opt <- optimizeGlmer(devfun))

## 4. Update the deviance function for optimizing over theta and beta:

(devfun <- updateGlmerDevfun(devfun, glmod$reTrms))

## 5. Optimize over theta and beta:

opt <- optimizeGlmer(devfun, stage=2)

opt[1:3]

## 6. Package up the results:

mkMerMod(environment(devfun), opt, glmod$reTrms, fr = glmod$fr)

### Same model in one line

glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),

data = cbpp, family = binomial)

NelderMead Nelder-Mead Optimization of Parameters, Possibly (Box) Constrained

Description

Nelder-Mead optimization of parameters, allowing optimization subject to box constraints (contrary

to the default, method = "Nelder-Mead", in R’s optim()), and using reverse communications.

Usage

Nelder_Mead(fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf, n),

control = list())

Arguments

fn a function of a single numeric vector argument returning a numeric scalar.

par numeric vector of starting values for the parameters.

lower numeric vector of lower bounds (elements may be -Inf).

upper numeric vector of upper bounds (elements may be Inf).

control a named list of control settings. Possible settings are

iprint numeric scalar - frequency of printing evaluation information. Defaults to 0 indicating no printing.

maxfun numeric scalar - maximum number of function evaluations allowed (default: 10000).

FtolAbs numeric scalar - absolute tolerance on change in function values (default: 1e-5)

FtolRel numeric scalar - relative tolerance on change in function values (default: 1e-15)

XtolRel numeric scalar - relative tolerance on change in parameter values (default: 1e-7)

MinfMax numeric scalar - maximum value of the minimum (default: .Machine$double.xmin)

xst numeric vector of initial step sizes to establish the simplex - all elements must be non-zero (default: rep(0.02,length(par)))

xt numeric vector of tolerances on the parameters (default: xst*5e-4)

verbose numeric value: 0=no printing, 1=print every 20 evaluations, 2=print every 10 evaluations, 3=print every evaluation. Sets ‘iprint’, if specified, but does not override it.

warnOnly a logical indicating if non-convergence (codes -1,-2,-3) should not stop(.), but rather only call warning and return a result which might be inspected. Defaults to FALSE, i.e., stop on non-convergence.

Value

a list with components

fval numeric scalar - the minimum function value achieved

par numeric vector - the value of x providing the minimum

convergence integer valued scalar, if not 0, an error code:

-4 nm_evals: maximum evaluations reached

-3 nm_forced: ?

-2 nm_nofeasible: cannot generate a feasible simplex

-1 nm_x0notfeasible: initial x is not feasible (?)

0 successful convergence

message a string specifying the kind of convergence.

control the list of control settings after substituting for defaults.

feval the number of function evaluations.
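Box constraints, the main difference from the Nelder-Mead method in optim(), can be illustrated with a small sketch. The bounds chosen here are arbitrary: the upper bound of 0.5 on the first parameter excludes the unconstrained minimum of the Rosenbrock function at (1, 1), so the returned parameters should lie on or inside the box. warnOnly = TRUE is set so that a non-convergence code produces a warning instead of an error.

```r
library(lme4)

## Rosenbrock function; unconstrained minimum at c(1, 1)
fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2

## Minimize subject to -2 <= x[1] <= 0.5 and -2 <= x[2] <= 2
res <- Nelder_Mead(fr, par = c(-1.2, 1),
                   lower = c(-2, -2), upper = c(0.5, 2),
                   control = list(warnOnly = TRUE))

res$par          # parameter values found, within the box
res$fval         # function value at res$par
res$convergence  # 0 indicates successful convergence
```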

See Also

The NelderMead class deﬁnition and generator function.

Examples

fr <- function(x) { ## Rosenbrock Banana function

x1 <- x[1]

x2 <- x[2]

100 * (x2 - x1 * x1)^2 + (1 - x1)^2

}

p0 <- c(-1.2, 1)

oo <- optim(p0, fr) ## also uses Nelder-Mead by default

o. <- Nelder_Mead(fr, p0)

o.1 <- Nelder_Mead(fr, p0, control=list(verbose=1))# -> some iteration output

stopifnot(identical(o.[1:4], o.1[1:4]),

all.equal(o.$par, oo$par, tolerance=1e-3))# diff: 0.0003865

o.2 <- Nelder_Mead(fr, p0, control=list(verbose=3, XtolRel=1e-15, FtolAbs= 1e-14))

all.equal(o.2[-5],o.1[-5], tolerance=1e-15)# TRUE, unexpectedly

NelderMead-class Class "NelderMead" of Nelder-Mead optimizers and its Generator

Description

Class "NelderMead" is a reference class for a Nelder-Mead simplex optimizer allowing box con-

straints on the parameters and using reverse communication.

The NelderMead() function conveniently generates such objects.

Usage

NelderMead(...)

Arguments

... Argument list (see Note below).

Methods

NelderMead$new(lower, upper, xst, x0, xt)

Create a new NelderMead object

Extends

All reference classes extend and inherit methods from "envRefClass".

Note

This is the default optimizer for the second stage of glmer and nlmer ﬁts. We found that it was

more reliable and often faster than more sophisticated optimizers.

Arguments to NelderMead() and the new method must be named arguments:

lower numeric vector of lower bounds - elements may be -Inf.

upper numeric vector of upper bounds - elements may be Inf.

xst numeric vector of initial step sizes to establish the simplex - all elements must be non-zero.

x0 numeric vector of starting values for the parameters.

xt numeric vector of tolerances on the parameters.

References

Based on code in the NLopt collection.

See Also

Nelder_Mead, the typical “constructor”. Further, glmer,nlmer

Examples

showClass("NelderMead")

ngrps Number of Levels of a Factor or a "merMod" Model

Description

Returns the number of levels of a factor or a set of factors, currently e.g., for each of the grouping

factors of lmer(),glmer(), etc.

Usage

ngrps(object, ...)

Arguments

object an R object, see Details.

... currently ignored.

Details

Currently there are methods for objects of class merMod, i.e., the result of lmer() etc, and factor

objects.

Value

The number of levels (of a factor) or vector of number of levels for each “grouping factor” of a

Examples

ngrps(factor(seq(1,10,2)))

ngrps(lmer(Reaction ~ 1|Subject, sleepstudy))

## A named vector if there's more than one grouping factor :

ngrps(lmer(strength ~ (1|batch/cask), Pastes))

## cask:batch batch

## 30 10