
CONTRIBUTED ARTICLE

influence.ME: Tools for Detecting Influential Data in Mixed Effects Models

by Rense Nieuwenhuis, Manfred te Grotenhuis, and Ben Pelzer

Abstract influence.ME provides tools for detecting influential data in mixed effects models. The application of these models has become common practice, but the development of diagnostic tools has lagged behind. influence.ME calculates standardized measures of influential data for the point estimates of generalized mixed effects models, such as DFBETAS and Cook's distance, as well as percentile change and a test for changing levels of significance. influence.ME calculates these measures of influence while accounting for the nesting structure of the data. The package and measures of influential data are introduced, a practical example is given, and strategies for dealing with influential data are suggested.

The application of mixed effects regression models has become common practice in the field of social sciences. As used in the social sciences, mixed effects regression models take into account that observations on individual respondents are nested within higher-level groups such as schools, classrooms, states, and countries (Snijders and Bosker, 1999), and are often referred to as multilevel regression models. Despite these models' increasing popularity, diagnostic tools to evaluate fitted models lag behind.

We introduce influence.ME (Nieuwenhuis, Pelzer, and Te Grotenhuis, 2012), an R package that provides tools for detecting influential cases in mixed effects regression models estimated with lme4 (Bates and Maechler, 2010). It is commonly accepted that tests for influential data should be performed on regression models, especially when estimates are based on a relatively small number of cases. However, most existing procedures do not account for the nesting structure of the data. As a result, these procedures fail to detect that higher-level cases may be influential on estimates of variables measured at specifically that level.

In this paper, we outline the basic rationale of detecting influential data, describe standardized measures of influence, provide a practical example of the analysis of students in 23 schools, and discuss strategies for dealing with influential cases. Testing for influential cases in mixed effects regression models is important, because influential data negatively affect the statistical fit and generalizability of the model. In social science applications of mixed models, testing for influential data is especially important, since these models are frequently based on large numbers of observations at the individual level while the number of higher-level groups is relatively small. For instance, Van der Meer, Te Grotenhuis, and Pelzer (2010) were unable to find any country-level comparative studies involving more than 54 countries. With such a relatively low number of countries, a single country can easily be overly influential on the parameter estimates of one or more of the country-level variables.

Detecting Influential Data

All cases used to estimate a regression model exert some level of influence on the regression parameters. However, if a single case has extremely high or low scores on the dependent variable relative to its expected value given the other variables in the model, extreme scores on one or more of the independent variables, or both, this case may overly influence the regression parameters by 'pulling' the estimated regression line towards itself. The simple inclusion or exclusion of such a single case may then lead to substantially different regression estimates. This runs against distributional assumptions associated with regression models, and as a result limits the validity and generalizability of regression models in which influential cases are present.

The analysis of residuals cannot be used for the detection of influential cases (Crawley, 2007). Cases with high residuals (defined as the difference between the observed and the predicted scores on the dependent variable) or with high standardized residuals (defined as the residual divided by the standard deviation of the residuals) are indicated as outliers. However, an influential case is not always an outlier. On the contrary: a strongly influential case dominates the regression model in such a way that the estimated regression line lies close to this case. Although influential cases thus have extreme values on one or more of the variables, they can be onliers rather than outliers. To account for this, the (standardized) deleted residual is defined as the difference between the observed score of a case on the dependent variable and the predicted score from the regression model based on data from which that case was removed.

Just as influential cases are not necessarily outliers, outliers are not necessarily influential cases. This also holds for deleted residuals. The reason for this is that the amount of influence a case exerts on the regression slope is not only determined by how well its (observed) score is fitted by the specified regression model, but also by its score(s) on the independent variable(s). The degree to which the scores of a case on the independent variable(s) are extreme is indicated by the leverage of this case. A higher leverage means more extreme scores on the independent variable(s), and a greater potential of overly influencing the regression outcomes. However, if a case has very extreme scores on the independent variable(s) but is fitted very well by a regression model, and if this case has a low (standardized) deleted residual, this case is not necessarily overly influencing the outcomes of the regression model.

The R Journal Vol. X/Y, Month, Year ISSN 2073-4859

Since neither outliers nor cases with high leverage are necessarily influential, a different procedure is required for detecting influential cases. The basic rationale behind measuring influential cases is based on the principle that when single cases are iteratively omitted from the data, models based on these data should not produce substantially different estimates. If the model parameters change substantially after a single case is excluded, this case may be regarded as too influential. However, how much change in the model parameters is acceptable? To standardize the assessment of how influential a single case is, several measures of influence are commonly used. First, DFBETAS is a standardized measure of the absolute difference between the estimate with a particular case included and the estimate without that particular case (Belsley, Kuh, and Welsch, 1980). Second, Cook's distance provides an overall measurement of the change in all parameter estimates, or a selection thereof (Cook, 1977). In addition, we introduce the measure of percentile change and a test for changing levels of significance of the fixed parameters.

Up to this point, this discussion on influential data was limited to how single cases can overly influence the point estimates (or BETAS) of a regression model. Single cases, however, can also bias the confidence intervals of these estimates. As indicated above, cases with high leverage can be influential because of their extreme values on the independent variables, but are not necessarily so. Cases with high leverage but a low deleted residual compress standard errors, while cases with low leverage and a high deleted residual inflate standard errors. Inferences made to the population from models in which such cases are present may be incorrect.

Detecting Influential Data in Mixed Effects Models

Other options are available in R that help evaluate the fit of regression models, including the detection of influential data. The base R installation provides various plots for regression models, including but not limited to plots showing residuals versus the fitted scores, Cook's distances, and the leverage versus the deleted residuals. The latter plot can be used to detect cases that affect the inferential properties of the model, as discussed above. These plots, however, are not available for mixed effects models. The LMERConvenienceFunctions package provides model criticism plots, including the density of the model residuals and the fitted values versus the standardized residuals (Tremblay, 2012). However, while this package works with lme4, it is only applicable to linear mixed effects models.

The influence.ME package introduced here contributes to these existing options by providing several measures of influential data for generalized mixed effects models. The limitation is that, as far as we are aware, no measure of leverage has been developed for generalized mixed effects models. Consequently, the current installment of influence.ME emphasizes detecting the influence of cases on the point estimates of generalized mixed effects models. It does, however, provide a basic test for detecting whether single cases change the level of significance of an estimate, and therefore the ability to make inferences from the estimated model.

To apply the logic of detecting influential data to generalized mixed effects models, one has to measure the influence of a particular higher-level group on the estimates of a predictor measured at that level. The straightforward way is to delete all observations from the data that are nested within a single higher-level group, then re-estimate the regression model, and finally evaluate the change in the estimated regression parameters. This procedure is then repeated for each higher-level group separately.
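The procedure just described can be sketched in a few lines of base R. For brevity, this sketch uses an ordinary lm() fit and simulated data (the group, x, and y names are invented for illustration); the package applies the same leave-one-group-out logic to mixed effects models fitted with lme4:

```r
# Sketch of the leave-one-group-out procedure, using a plain lm() fit and
# simulated data for illustration only; influence.ME applies the same logic
# to mixed effects models estimated with lmer().
set.seed(1)
d <- data.frame(group = rep(letters[1:5], each = 10), x = rnorm(50))
d$y <- 2 * d$x + rnorm(50)

full <- coef(lm(y ~ x, data = d))  # estimates based on the complete data

# Refit once per group, each time deleting all observations nested in that
# group, and record the change in the parameter estimates.
shift <- t(sapply(unique(d$group), function(g) {
  coef(lm(y ~ x, data = d[d$group != g, ])) - full
}))
round(shift, 3)  # one row of estimate changes per deleted group
```

Each row of shift corresponds to one deleted group; the standardized measures discussed below are scaled versions of exactly these differences.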

The "influence" function in the influence.ME package performs this procedure automatically, and returns an object containing information on the parameter estimates excluding the influence of each higher-level group separately. The returned object of class "estex" (ESTimates EXcluding the influence of a group) can then be passed on to one of the functions calculating standardized measures of influence (such as DFBETAS and Cook's distance, discussed in more detail in the next section). Since the procedure of the "influence" function entails re-estimating mixed effects models several times, it can be computationally intensive. Unlike the standard approach in R, we separated the estimation procedure from the calculation of the measures of influence themselves. This allows the user to process a single model once using the "influence" function, and then to evaluate it using various measures and plots of influence.

In detecting influential data in mixed effects models, the key focus is on changes in the estimates of variables measured at the group level. However, most mixed effects regression models estimate the effects of both lower-level and higher-level variables simultaneously. Langford and Lewis (1998) developed a procedure in which the mixed effects model is modified to neutralize the group's influence on the higher-level estimate, while at the same time allowing the lower-level observations nested within that group to help estimate the effects of the lower-level predictors in the model. For each higher-level unit evaluated based on this method, the intercept-vector of the model is set to 0, and an (additional) dummy variable is added to the model, with score 1 for the respective higher-level case. This way, the case under investigation does not contribute to the variance of the random intercept, nor to the effects of variables measured at the group level. influence.ME provides this functionality, which is accessed by specifying delete=FALSE as an option to the "influence" function. As a result of this specific modification of the model specification, the procedure suggested by Langford and Lewis (1998) does not work when factor variables are used in the regression model.

Finally, influence.ME also allows for detecting the influence of lower-level cases in the mixed effects model. In social science applications of mixed effects models, with a great number of lower-level observations nested in a limited number of groups, this will not always be feasible. Detecting the influence of lower-level observations is supported for applications in various disciplines where mixed effects models are typically applied to only a limited number of observations per group. This procedure is accessed by specifying obs=TRUE as an option to the "influence" function. The "influence" function can either determine the influence of higher-level cases, or of single observations, but not both at the same time.

The Outcome Measures

The "influence" function described above returns an object with information on how much the parameter estimates in a mixed effects model change after the (influence of) higher-level groups or their individual-level observations are iteratively removed from the data. This returned object can then be passed on to functions that calculate standardized measures of influence. influence.ME offers four such measures, which are detailed in this section.

DFBETAS

DFBETAS is a standardized measure that indicates the level of influence observations have on single parameter estimates (Fox, 2002). Regarding mixed models, this relates to the influence a higher-level unit has on the parameter estimate. DFBETAS is calculated as the difference in the magnitude of the parameter estimate between the model including and the model excluding the higher-level case. This absolute difference is divided by the standard error of the parameter estimate excluding the higher-level unit under investigation:

$$\mathrm{DFBETAS}_{ij} = \frac{\hat{\gamma}_i - \hat{\gamma}_{i(-j)}}{\mathrm{se}\left(\hat{\gamma}_{i(-j)}\right)}$$

in which $i$ refers to the parameter estimate and $j$ to the higher-level group, so that $\hat{\gamma}_i$ represents the original estimate of parameter $i$, and $\hat{\gamma}_{i(-j)}$ represents the estimate of parameter $i$ after the higher-level group $j$ has been excluded from the data.

In influence.ME, values for DFBETAS in mixed effects models can be calculated using the function "dfbetas", which takes the object returned from "influence" as input. Further options include parameters to provide a vector of index numbers or names of the selection of parameters for which DFBETAS is to be calculated. The default option of "dfbetas" is to calculate DFBETAS for estimates of all fixed effects in the model.

As a rule of thumb, a cut-off value is given for DFBETAS (Belsley et al., 1980):

$$\frac{2}{\sqrt{n}}$$

in which $n$, the number of observations, refers to the number of groups in the grouping factor under evaluation (and not to the number of observations nested within the group under investigation). Values exceeding this cut-off value are regarded as overly influencing the regression outcomes for that specific estimate.
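As a hedged illustration, the formula and its cut-off translate directly into base R; all estimate and standard-error values below are invented:

```r
# DFBETAS for one parameter and one deleted group, following the formula
# given above; the example values are invented for illustration.
dfbetas_single <- function(est_full, est_del, se_del) {
  (est_full - est_del) / se_del
}

n <- 23               # number of groups in the grouping factor
cutoff <- 2 / sqrt(n) # rule-of-thumb cut-off, about .417 for 23 groups

d <- dfbetas_single(est_full = -2.10, est_del = -0.95, se_del = 1.10)
abs(d) > cutoff       # TRUE: this group would be flagged as influential
```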

Cook’s Distance

Since DFBETAS provides a value for each parameter and for each higher-level unit that is evaluated, this often results in quite a large number of values to evaluate (Fox, 2002). An alternative is provided by Cook's distance, a commonly used measure of influence. Cook's distance provides a summary measure for the influence a higher-level unit exerts on all parameter estimates simultaneously, or on a selection thereof. A formula for Cook's distance is provided by Snijders and Bosker (1999) and Snijders and Berkhof (2008):

$$C_j^F = \frac{1}{r+1}\left(\hat{\gamma} - \hat{\gamma}_{(-j)}\right)' \hat{\Sigma}_F^{-1} \left(\hat{\gamma} - \hat{\gamma}_{(-j)}\right)$$

in which $\hat{\gamma}$ represents the vector of original parameter estimates, $\hat{\gamma}_{(-j)}$ the parameter estimates of the model excluding higher-level unit $j$, and $\hat{\Sigma}_F$ represents the covariance matrix. In influence.ME, the covariance matrix of the model excluding the higher-level unit under investigation $j$ is used. Finally, $r$ is the number of parameters that are evaluated, excluding the intercept vector.
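The quadratic form above can be written out in a few lines of base R; the estimate vectors and covariance matrix below are invented example values, not output of the package:

```r
# Cook's distance for one deleted higher-level unit, following the formula
# above; gamma_full, gamma_del, and vcov_del are invented example values.
cooks_group <- function(gamma_full, gamma_del, vcov_del,
                        r = length(gamma_full)) {
  diff <- gamma_full - gamma_del
  as.numeric(t(diff) %*% solve(vcov_del) %*% diff) / (r + 1)
}

gamma_full <- c(2.39, -2.10)         # estimates from the complete model
gamma_del  <- c(2.20, -0.95)         # estimates excluding one unit
vcov_del   <- matrix(c(0.08, 0.01,   # covariance matrix of the model
                       0.01, 1.75),  # excluding that unit
                     nrow = 2)

cooks_group(gamma_full, gamma_del, vcov_del)  # roughly 0.41
```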

As a rule of thumb, cases are regarded as too influential if the associated value for Cook's distance exceeds the cut-off value of (Van der Meer et al., 2010):

$$\frac{4}{n}$$

in which $n$ refers to the number of groups in the grouping factor under evaluation.

In influence.ME, values for Cook's distance in mixed effects models are calculated using the function "cooks.distance", which takes the object returned from "influence" as input. Further options include parameters to provide a vector of index numbers or names of the parameters for which Cook's distance is to be calculated. In addition, the user can specify sort=TRUE to have the values for Cook's distance returned in descending order.

As a final note, we point out that if Cook's distance is calculated based on a single parameter, the Cook's distance equals the squared value of DFBETAS for that parameter. This is also reflected in their respective cut-off values:

$$\sqrt{\frac{4}{n}} = \frac{2}{\sqrt{n}}$$

Percentile Change

Depending upon the goal for which the mixed model is estimated (prediction vs. hypothesis testing), the use of formal measures of influence such as DFBETAS and Cook's distance may be less desirable. The reason for this is that, based on these measures, it is not immediately clear to what extent parameter estimates change. For substantive interpretation of the model outcomes, the relative degree to which a parameter estimate changes may provide more meaningful information. A simple alternative is therefore offered by the function "pchange", which takes the same input options as the "dfbetas" function. For each higher-level group, the percentage of change is calculated as the absolute difference between the parameter estimate both including and excluding the higher-level unit, divided by the parameter estimate of the complete model and multiplied by 100%. A percentage of change is returned for each parameter separately, for each of the higher-level units under investigation. In the form of a formula:

$$\frac{\left|\hat{\gamma} - \hat{\gamma}_{(-j)}\right|}{\hat{\gamma}} \cdot 100\%$$

No cut-off value is provided, since what percentage of change in a parameter estimate is considered too large will primarily depend on the goal for which the model was estimated and, more specifically, the nature of the hypotheses that are tested.
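For completeness, the percentage of change is easy to compute by hand; the estimates below are invented for illustration:

```r
# Percentage change of a parameter estimate after deleting one higher-level
# unit, following the pchange() formula above; example values are invented.
pchange_single <- function(est_full, est_del) {
  abs(est_full - est_del) / est_full * 100
}

pchange_single(est_full = 2.39, est_del = 2.20)  # about 8 percent change
```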

Test for changes in significance

As discussed above, even when cases are not influential on the point estimates (BETAS) of the regression model, cases can still influence the standard errors of these estimates. Although influence.ME cannot provide the leverage measure to detect this, it provides a test for changes in the statistical significance of the fixed parameters in the mixed effects model.

The "sigtest" function tests whether excluding the influence of a single case changes the statistical significance of any of the variables in the model. This test of significance is based on the test statistic provided by the lme4 package. The nature of this statistic varies between the different distributional families in generalized mixed effects models. For instance, the t-statistic is related to a normal distribution, while the z-statistic is related to binomial distributions.

For each of the cases that are evaluated, the test statistic of each variable is compared to a test value specified by the user. For the purpose of this test, the parameter is regarded as statistically significant if the test statistic of the model exceeds the specified value. The "sigtest" function reports for each variable the estimated test statistic after deletion of each evaluated case, whether or not this updated test statistic results in statistical significance based on the user-specified value, and whether or not this new statistical significance differs from the significance in the original model. In other words, if a parameter was statistically significant in the original model, but is no longer significant after the deletion of a specific case from the model, this is indicated by the output of the "sigtest" function. It is also indicated when an estimate was not significant originally, but reached statistical significance after deletion of a specific case.
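The logic of this test can be sketched in base R. The helper below is not the package implementation, only an illustration of the comparison it performs; the t-value of -1.583 echoes the example model later in this paper, while the altered values are invented:

```r
# Sketch of the sigtest() logic (not the package implementation): compare
# altered test statistics to a user-specified critical value and flag
# changes in significance. For a negative critical value, "exceeding" it
# means being more negative than the value.
sigtest_single <- function(t_original, t_altered, test = -1.96) {
  sig_orig    <- t_original <= test   # significant in the full model?
  sig_altered <- t_altered <= test    # significant after deletion?
  data.frame(Altered.Teststat = t_altered,
             Altered.Sig      = sig_altered,
             Changed.Sig      = sig_altered != sig_orig)
}

# t = -1.583 in the full model (not significant); after deleting one case
# the statistic is -1.62, after deleting another it is -2.72 (invented).
sigtest_single(t_original = -1.583, t_altered = c(-1.62, -2.72))
```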

Plots

All four measures of influence discussed above can also be plotted. The "plot" function takes the output of the "influence" function to create a dotplot of a selected measure of influence (cf. Sarkar, 2008). The user can specify which measure of influence is to be plotted using the which= option, which defaults to dfbetas. Other options are cook to plot the Cook's distances, pchange to plot the percentage change, and sigtest to plot the test statistic of a parameter estimate after deletion of specific cases.

All plots allow the output to be sorted, by specifying sort=TRUE and the variable to sort on using to.sort= (the latter is not required for plotting Cook's distances). In addition, a cut-off value can be specified using cutoff=. Values that exceed this cut-off value are plotted differently, to facilitate the identification of influential cases. By default, the results for all cases and all variables are plotted, but a selection of these can be made by specifying parameters= and/or groups=. Finally, by specifying abs=TRUE the absolute values of the measure of influence are plotted.


Example: students in 23 schools

In our example, we are interested in the relationship between the degree of structure that schools attempt to enforce in their classrooms and students' performance on a math test. Could it be that a highly structured class affects students' performance?

The influence.ME package contains the school23 data.frame, which provides information on the performance of 519 students in 23 schools. Measurements include individual students' scores on a math test, school-level measurements of class structure, and several additional independent variables. Students' class and school are equivalent in these data, since only one class per school is available. These data are a subset of the NELS-88 data (National Education Longitudinal Study of 1988). The data are publicly available from http://www.ats.ucla.edu/stat/examples/imm/, and are reproduced with kind permission of Ita Kreft and Jan de Leeuw (1998).

First, using the lme4 package, we estimate a multivariate mixed effects model with students nested in schools, a random intercept, a measurement of individual students' time spent on math homework, and a measurement of class structure at the school level. For the purpose of our example, we assume here that the math, homework, and structure variables were correctly measured at the interval level.

library(influence.ME)
data(school23)

school23 <- within(school23,
  homework <- unclass(homework))

m23 <- lmer(math ~ homework + structure
  + (1 | school.ID),
  data = school23)

print(m23, cor = FALSE)

This results in the summary of the model based on 23 schools (assigned to object m23), as shown below.

Linear mixed model fit by REML
Formula: math ~ homework + structure + (1 | school.ID)
   Data: school23
  AIC  BIC logLik deviance REMLdev
 3734 3756  -1862     3728    3724

Random effects:
 Groups    Name        Variance Std.Dev.
 school.ID (Intercept) 19.543   4.4208
 Residual              71.311   8.4446
Number of obs: 519, groups: school.ID, 23

Fixed effects:
            Estimate Std. Error t value
(Intercept)  52.2356     5.3940   9.684
homework      2.3938     0.2771   8.640
structure    -2.0950     1.3237  -1.583

Based on these estimates, we may conclude that students who spend more time on their math homework score better on a math test. Regarding class structure, however, we do not find a statistically significant association with math test scores. But can we now validly conclude that class structure does not influence students' math performance, based on the outcomes of this model?

Visual Examination

Since the analysis in the previous section was based on the limited number of 23 schools, it is, of course, possible that observations on single schools have overly influenced these findings. Before using the tools provided in the influence.ME package to formally evaluate this, we perform a visual examination of the relationship between class structure and math test performance, aggregated to the school level.

struct <- unique(subset(school23,
  select = c(school.ID, structure)))

struct$mathAvg <- with(school23,
  tapply(math, school.ID, mean))

dotplot(mathAvg ~ factor(structure),
  struct,
  type = c("p", "a"),
  xlab = "Class structure level",
  ylab = "Average Math Test Score")

Figure 1: Visual examination of class structure and math performance (average math test score by class structure level).


The syntax above creates a bivariate plot of the aggregated math scores and class structure, which is shown in Figure 1. In this plot, it is clear that one single school, represented in the lower-left corner of the graph, seems to be an outlier, and, more importantly, the non-linear curve shown in this graph clearly indicates that this single school with a class structure level of 2 may overly influence a linear regression line estimated from these data.

Calculating measures of influence

In the previous section, based on Figure 1, we suspected that the combination in one specific school of low average math test results and a low level of class structure may have overly influenced the original analysis of our data. However, this is only a bivariate examination of the data, and therefore does not take into account other variables in the model. Hence, our preliminary conclusion that this may be an influential case does not control for possible effects of the homework variable. A better test is provided by standardized measures of influence, as calculated from the regression model rather than from the raw data.

The first step in detecting influential data is to determine the extent to which the parameter estimates in model m23 change when each of the schools is iteratively deleted from the data. This is done with the "influence" function:

estex.m23 <- influence(m23, "school.ID")

The "influence" function takes a mixed effects regression model as input (here: m23), and the grouping factor needs to be specified, which in our case is school.ID. We assign the output of the "influence" function to an object named estex.m23. Below, we use this object as input to the "dfbetas" function, to calculate DFBETAS.

dfbetas(estex.m23, parameters = c(2, 3))

This results in a substantial amount of output, a portion of which is shown below. Only the DFBETAS for the homework and structure variables were returned, since parameters=c(2,3) was specified.

         homework    structure
6053  -0.13353732 -0.168139487
6327  -0.44770666  0.020481057
6467   0.21090081  0.015320965
7194  -0.44641247  0.036756281
7472  -0.55836772  1.254990963
...
72292  0.62278508  0.003905031
72991  0.52021424  0.021630219

The numerical output given above by the "dfbetas" function provides a detailed report of the values of DFBETAS in the model. For each variable, as well as for each nesting group (in this example: each school), a value for DFBETAS is computed and reported. The cut-off value of DFBETAS equals 2/√n (Belsley et al., 1980), which in this case equals 2/√23 = .41. The estimate for class structure in this model seems to be influenced most strongly by observations in school number 7472: the DFBETAS for the structure variable clearly exceeds the cut-off value of .41. Also, the estimate of the homework variable changes substantially with the deletion of several schools, as indicated by the high values of DFBETAS.
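This reading can be confirmed by applying the 2/√23 cut-off directly to the (rounded) DFBETAS shown above; the values below are copied from that output, for the subset of schools it displays:

```r
# DFBETAS values rounded from the output above, one row per school.
dfb <- data.frame(
  homework  = c(-0.134, -0.448, 0.211, -0.446, -0.558, 0.623, 0.520),
  structure = c(-0.168,  0.020, 0.015,  0.037,  1.255, 0.004, 0.022),
  row.names = c("6053", "6327", "6467", "7194", "7472", "72292", "72991")
)
cutoff <- 2 / sqrt(23)  # about .417

rownames(dfb)[abs(dfb$structure) > cutoff]  # only school 7472
rownames(dfb)[abs(dfb$homework)  > cutoff]  # several schools flagged
```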

A plot of the DFBETAS (shown in Figure 2) is created using:

plot(estex.m23,
  which = "dfbetas",
  parameters = c(2, 3),
  xlab = "DFbetaS",
  ylab = "School ID")

Based on Figure 2, it is clear that both the structure and the homework variables are highly susceptible to the influence of single schools. For the structure variable this is not all that surprising, since class structure was measured at the school level and was shown in Figure 1 to be very likely to be influenced by a single case: school number 7472. The observation that high values of DFBETAS were found for the homework variable suggests that substantial differences between these schools exist in terms of how much time students spend on average on their homework. Therefore, we suggest that in mixed effects regression models, the estimates of both individual-level and group-level variables are evaluated for influential data.

The measure of Cook's distance makes it possible to determine the influence a single higher-level group has on the estimates of multiple variables simultaneously. Since the "cooks.distance" function allows the user to specify a selection of variables on which the values for Cook's distance are to be calculated, it can be used to limit the evaluation to the measurements at the group level exclusively. Note that whereas DFBETAS always relates to single variables, Cook's distance is a summary measure of changes on all parameter estimates it is based on. Reports on Cook's distance thus should always specify on which variables these values are based.

Figure 2: DFBETAS of class structure and homework.

To continue our example, we illustrate the "cooks.distance" function on a single variable, since class structure is the only variable measured at the school level. In the example below, we use the same object that was returned from the "influence" function. The specification of this function is similar to "dfbetas", and to create a plot of the Cook's distances we again use the "plot" function with the specification which="cook". We specify two additional arguments to augment the figure. First, we specify sort=TRUE to have the resulting Cook's distances sorted in descending order in the figure. The appropriate cut-off value for Cook's distance with 23 nesting groups equals 4/23 = .17. By specifying the cut-off value with cutoff=.17, Cook's distances exceeding the specified value are easily identified in the resulting figure. Thus, to receive both numeric output and a graphical representation (Figure 3), the following specification of "cooks.distance" and "plot" is given:

cooks.distance(estex.m23,
  parameter = 3, sort = TRUE)

plot(estex.m23, which = "cook",
  cutoff = .17, sort = TRUE,
  xlab = "Cook's Distance",
  ylab = "School ID")

The output below shows one value of Cook's distance for each nesting group, in this case for each school.

              [,1]
24371 6.825871e-06
72292 1.524927e-05
...
54344 2.256612e-01
7829  3.081222e-01
7472  1.575002e+00

Figure 3: Cook's Distance based on class structure.


Only a selection of the output is shown here. A

few schools exceed the cut-off value (in Figure 3

these are indicated with red triangles), but one school

stands out: 7472. Clearly, this school strongly in-

ﬂuences the outcomes regarding the structure vari-

able, as we already suspected based on our bivariate

visual examination in Figure 1.
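The schools exceeding the cut-off need not be read off the figure; they can also be extracted programmatically. A minimal sketch (assuming, as the output above suggests, that "cooks.distance" returns a one-column matrix with school IDs as row names):

```r
# Extract the school IDs whose Cook's distance exceeds the 4/23 cut-off;
# estex.m23 is the influence() object from the example, and parameter=3
# again refers to the structure variable.
cks <- cooks.distance(estex.m23, parameter = 3)
rownames(cks)[cks[, 1] > 4 / 23]
```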

Testing for Changes in Statistical Significance (sigtest)

In the example below, the "sigtest" function is used

to test for changing levels of signiﬁcance after dele-

tion of each of the 23 schools from our example

model. We are speciﬁcally interested in the level

of signiﬁcance of the structure variable, for which

it was already established above that the school with

number 7472 is very inﬂuential. Since we observed

a negative effect in the original model, we specify

test=-1.96 to test for signiﬁcance at a commonly

used value (-1.96) of the test statistic. Note that since

we estimated a normally distributed model, the test

statistic here is the t-value.

sigtest(estex.m23, test=-1.96)$structure[1:10,]

In the example above, we only request the results

for the structure variable and for the ﬁrst 10 schools.

In the results presented below, three columns are

shown. The ﬁrst column (Altered.Teststat) shows

the value of the test statistic (here for the structure

variable) after the deletion of the respective schools

(indicated in the row labels). Especially school num-

ber 7472 stands out. In the original model, the test

statistic for the structure variable was -1.583, which

was not signiﬁcant. When the inﬂuence of school

number 7472 is excluded from the model, the test

statistic is -2.72, which exceeds the value of -1.96 that we selected. That the structure variable would become significant after deletion of school 7472 is indicated in the second column (Altered.Sig). The

Changed.Sig column ﬁnally conﬁrms whether the

level of signiﬁcance of the structure variable (which

was not signiﬁcant in the original model) changed to

significant after deletion of each school.

In the case of our example, the results for Cook’s

Distance and the results of this test for changing lev-

els of signiﬁcance both indicate that school number

7472 overly inﬂuences the regression outcomes re-

garding the school-level structure variable. Refer-

ring to the discussion on inﬂuential data above, how-

ever, we emphasize that this is not necessarily always

the case. Cases can inﬂuence the point estimates

without affecting their level of signiﬁcance, or affect

the level of signiﬁcance without overly affecting the

point estimate itself. Therefore, both tests should be

performed.

Altered.Teststat Altered.Sig Changed.Sig

6053 -1.326409 FALSE FALSE

6327 -1.688663 FALSE FALSE

6467 -1.589960 FALSE FALSE

7194 -1.512686 FALSE FALSE

7472 -2.715805 TRUE TRUE

7474 -1.895138 FALSE FALSE

7801 -1.534023 FALSE FALSE

7829 -1.045866 FALSE FALSE

7930 -1.566117 FALSE FALSE

24371 -1.546838 FALSE FALSE

Earlier, using DFBETAS, we identified several schools that overly influence the estimate of the homework variable. There, we also performed the sigtest test to evaluate whether deletion of any of these schools changes the level of significance of the homework variable. These results are not shown here, but indicated that the deletion of no single school changed the level of significance of the homework variable.
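The call used for that check can be sketched as follows; since the homework estimate was positive in the original model, a positive critical value is supplied (the code is our illustration, mirroring the earlier sigtest call, not output shown in the article):

```r
# Test whether deleting any single school changes the significance of
# the (positive) homework estimate; results per school, as above.
sigtest(estex.m23, test = 1.96)$homework
```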

Measuring the influence of lower-level observations

Finally, it is possible that a single lower-level obser-

vation affects the results of the mixed effects model,

especially for data with a limited number of lower-

level observations per group. In our example, this

would refer to a single student affecting the estimates

of either the individual-level variables, the school-

level variables, or both. Here, we test whether one

or more individual students affect the estimate of the

school-level structure variable.

To perform this test, the "influence" function is

used, and obs=TRUE is speciﬁed to indicate that single

observations (rather than groups) should be evalu-

ated. The user is warned that this procedure often

will be computationally intensive when the number

of lower-level observations is large.

Next, we request Cook’s Distances speciﬁcally for

the structure variable. Since the number of student-

level observations in this model is 519, and the cut-off value for Cook's Distance is defined as 4/n, the cut-off value is 4/519 = .0077. The resulting output is extensive, since a Cook's Distance is calculated for each of the 519 students. Therefore, in the example

below, we directly test which of the resulting Cook’s

Distances exceeds the cut-off value.

estex.obs <- influence(m23, obs=TRUE)

cks.d <- cooks.distance(estex.obs, parameter=3)

which(cks.d > 4/519)

The output is not shown here, but the reader can

verify that students with numbers 88 and 89 exert too

much influence on the estimate of the structure variable. The sigtest function, however, showed that the deletion of no single student from the


data affected the level of signiﬁcance of the struc-

ture variable, nor of any of the other variables in the

model.

Dealing with Inﬂuential Data

Now that overly inﬂuential cases have been identi-

ﬁed in our model, we have to decide how to deal

with them. Generally, there are several strategies,

including getting more data, checking data consis-

tency, adapting model speciﬁcation, deleting the in-

ﬂuential cases from the model, and obtaining addi-

tional measurements on existing cases to account for

the overly inﬂuential cases (Van der Meer et al.,2010;

Harrell, Jr.,2001).

Since overly inﬂuential data are a problem es-

pecially encountered in models based on a limited

number of cases, a straightforward remedy would

be to observe more cases in the population of inter-

est. In our example, if we would be able to sample

more schools, it may very well turn out that we ob-

serve several additional schools with a low score on

the structure variable, so that school number 7472 is

no longer inﬂuential. Secondly, there may have been

measurement, coding, or transcription errors in the

data that have led to extreme scores on one or more

of the variables (i.e. it may be worthwhile, if possible,

to check whether class structure and / or students’

math performance in school 7472 really is that low).

Thirdly, the model speciﬁcation may be improved. If

the data are used to estimate overly complex models, or

if parameterization is incorrect, inﬂuential cases are

more likely to occur. Perhaps the structure variable

should have been treated as categorical.
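Such a re-specification can be sketched with R's generic update() function, re-fitting the example model with structure entered as a factor (a hypothetical alternative specification, not a model estimated in this article):

```r
# Re-fit m23 with class structure treated as categorical rather than
# continuous; update() re-estimates the model with the modified formula.
m23.cat <- update(m23, . ~ . - structure + factor(structure))
```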

These are all general strategies, but cannot always

be applied. Depending on the research setting, it is

not always feasible to obtain more observations, to

return to the raw data to check consistency, or to re-

duce model complexity or change parameterization.

The fourth strategy, deleting inﬂuential cases

from the model, can often be applied. In general,

we suggest deleting influential cases one at a time and then re-evaluating the model. Deleting one

or more inﬂuential cases from a mixed effects model

is done with the "exclude.influence" function. The

input of this function is a mixed effects model object,

and it returns an updated mixed effects model from

which a speciﬁed group was deleted. To illustrate,

we delete school number 7472 (which was identiﬁed

as being overly inﬂuential) and its individual-level

observations, using the example code below:

m22 <- exclude.influence(m23,

"school.ID", "7472")

print(m22, cor=FALSE)

The "exclude.influence" function takes a mixed

effects model as input, and requires the speciﬁcation

of the grouping factor (school.ID) and the group to

be deleted (7472). It returns a re-estimated mixed

effects model, which we assign to the object m22. The

summary of that model is shown below:

Linear mixed model fit by REML

Formula: math ~ homework + structure

+ (1 | school.ID)

Data: ..1

AIC BIC logLik deviance REMLdev

3560 3581 -1775 3554 3550

Random effects:

Groups Name Variance Std.Dev.

school.ID (Intercept) 15.333 3.9157

Residual 70.672 8.4067

Number of obs: 496, groups: school.ID, 22

Fixed effects:

Estimate Std. Error t value

(Intercept) 59.4146 5.9547 9.978

homework 2.5499 0.2796 9.121

structure -3.8949 1.4342 -2.716

Two things stand out when this model summary

is compared to our original analysis. First, the num-

ber of observations is lower (496 versus 519), as well

as the number of groups (22 versus 23). More impor-

tantly, though, the negative effect of the structure

variable now is statistically signiﬁcant, whereas it

was not in the original model. So, now these model

outcomes indicate that higher levels of class structure

indeed are associated with lower math test scores,

even after controlling for the students' homework

efforts.

Further analyses should repeat the analysis for

inﬂuential data, for other schools may turn out to be

overly inﬂuential as well. These repetitive steps are

not presented here, but as it turned out, three other

schools were overly inﬂuential. However, the sub-

stantive conclusions drawn based on model m22 did

not change after their deletion.
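These repetitive steps can be sketched as a loop that deletes the most influential school, re-estimates the model, and stops once no school exceeds the 4/n cut-off (the loop structure is our own sketch; it assumes "influence" accepts the grouping factor as its second argument and that "cooks.distance" returns a one-column matrix with school IDs as row names, as in the output above):

```r
# Iteratively delete the school with the largest Cook's distance on the
# structure variable (parameter 3) until none exceeds the 4/n cut-off.
m.cur <- m23
repeat {
  est <- influence(m.cur, "school.ID")
  cks <- cooks.distance(est, parameter = 3)
  worst <- which.max(cks[, 1])
  if (cks[worst, 1] <= 4 / nrow(cks)) break  # no overly influential school left
  m.cur <- exclude.influence(m.cur, "school.ID",
                             rownames(cks)[worst])
}
```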

Finally, we suggest an approach for dealing with

inﬂuential data, based on Lieberman (2005). He ar-

gues that the presence of outliers may indicate that

one or more important variables were omitted from

the model. Adding additional variables to the model

may then account for the outliers, and improve the

model ﬁt. We discussed above that an inﬂuential case

is not necessarily an outlier in a regression model.

Nevertheless, if additional variables in the model

can account for the fact that an observation has ex-

treme scores on one or more variables, the case may

no longer be an inﬂuential one.

Thus, adding important variables to the model

may solve the problem of inﬂuential data. When the


observations in a regression model are, for instance,

randomly sampled respondents in a large-scale sur-

vey, it often is impossible to return to these respon-

dents for additional measurements. However, in so-

cial science applications of mixed effects models, the

higher-level groups are often readily accessible cases

such as schools and countries. It may very well be

possible to obtain additional measurements on these

schools or countries, and use these to remedy the

presence of inﬂuential data.

Summary

inﬂuence.ME provides tools for detecting inﬂuen-

tial data in mixed effects models. The application of

these models has become common practice, but the development of diagnostic tools lags behind. influence.ME calculates standardized measures of influential data such as DFBETAS and Cook's distance, as well as percentile change and a test for changes in the statistical significance of fixed parameter estimates. The package and measures of influential data

were introduced, a practical example was given, and

strategies for dealing with inﬂuential data were sug-

gested.

Bibliography

D. Bates and M. Maechler. lme4: Linear mixed-effects

models using S4 classes, 2010. URL http://CRAN.

R-project.org/package=lme4. R package version

0.999375-35.

D. Belsley, E. Kuh, and R. Welsch. Regression Di-

agnostics. Identifying Inﬂuential Data and Sources of

Collinearity. Wiley, 1980.

R. Cook. Detection of inﬂuential observations in lin-

ear regression. Technometrics, 19(1):15–18, 1977.

M. J. Crawley. The R Book. Wiley, 2007.

J. Fox. An R and S-Plus Companion to Applied Regres-

sion. Sage, 2002.

F. E. Harrell, Jr. Regression Modeling Strategies. With

Applications to Linear Models, Logistic Regression, and

Survival Analysis. Springer, 2001.

I. Kreft and J. De Leeuw. Introducing Multilevel Mod-

eling. Sage Publications, 1998.

I. Langford and T. Lewis. Outliers in multilevel

data. Journal of the Royal Statistical Society: Series

A (Statistics in Society), 161:121–160, 1998.

E. S. Lieberman. Nested analysis as a mixed-method

strategy for comparative research. American Politi-

cal Science Review, 99:435–452, 2005.

R. Nieuwenhuis, B. Pelzer, and M. te Grotenhuis.

inﬂuence.ME: Tools for detecting inﬂuential data in

mixed effects models, 2012. URL http://CRAN.

R-project.org/package=influence.ME. R pack-

age version 0.9.

D. Sarkar. Lattice. Multivariate Data Visualization with

R. Springer, 2008.

T. Snijders and J. Berkhof. Diagnostic checks for mul-

tilevel models. In J. De Leeuw and E. Meijer, editors, Handbook of Multilevel Analysis, pages 141–175. Springer, 2008.

T. Snijders and R. Bosker. Multilevel analysis, an in-

troduction to basic and advanced multilevel modelling.

Sage, 1999.

A. Tremblay. LMERConvenienceFunctions: A suite of

functions to back-ﬁt ﬁxed effects and forward-ﬁt ran-

dom effects, as well as other miscellaneous functions.,

2012. URL http://CRAN.R-project.org/package=

LMERConvenienceFunctions. R package version

1.6.8.2.

T. Van der Meer, M. Te Grotenhuis, and B. Pelzer. In-

ﬂuential cases in multilevel modeling. A method-

ological comment. American Sociological Review, 75:

173–178, 2010.

Rense Nieuwenhuis

Institute for Innovation and Governance Studies (IGS),

University of Twente

P.O. Box 217, 7500 AE, Enschede

The Netherlands

r.nieuwenhuis@utwente.nl

Manfred te Grotenhuis

Radboud University

Nijmegen

The Netherlands

m.tegrotenhuis@maw.ru.nl

Ben Pelzer

Radboud University

Nijmegen

The Netherlands

b.pelzer@maw.ru.nl
