
Estimating linear trends in single factor between subject designs

Gerben Mulder

August 16, 2019

This is an introduction to linear trend analysis from an estimation perspective. The contents of this introduction are based on Maxwell, Delaney, and Kelley (2017) and Rosenthal, Rosnow, and Rubin (2000). I have taken the (invented) data from Haans (2018). The estimation perspective to statistical analysis is aimed at obtaining point and interval estimates of effect sizes. Here, I will use the frequentist perspective of obtaining a point estimate and a 95% confidence interval of the relevant effect size. For linear trend analysis, the relevant effect size is the slope coefficient of the linear trend, so the purpose of the analysis is to estimate the value of the slope and the 95% confidence interval of the estimate. We will use contrast analysis to obtain the relevant estimates.

The references cited above are clear about how to construct contrast coefficients (lambda coefficients) for linear trends (and non-linear trends, for that matter) that can be used to perform a significance test of the null-hypothesis that the slope equals zero. Maxwell, Delaney, and Kelley (2017) describe how to obtain a confidence interval for the slope and make clear that, to obtain interpretable results from the software we use, we should consider how the linear trend contrast values are scaled. That is, standard software (like SPSS) gives us a point estimate and a confidence interval for the contrast estimate, but depending on how the coefficients are scaled, these estimates are not necessarily interpretable in terms of the slope of the linear trend, as I will make clear momentarily.

So, the goal of the data analysis is to obtain a point and interval estimate of the slope of the linear trend, and the purpose of this contribution is to show how to obtain output that is interpretable as such.

A linear trend

Let us have a look at an example of a linear trend to make clear what exactly we are talking about here. To keep things simple, we suppose the following context. We have an experimental design with a single factor and a single dependent variable. The factor we are considering is quantitative and its values are equally spaced. This may (or may not) differ from the usual experiment, where the independent variable is a qualitative, nominal variable. An example from Haans (2018) is the variable location, which is the row in the lecture room where students attending the lecture are seated. There are four rows and the distance between the rows is equal. Row 1 is the row nearest to the lecturer, and row 4 is the row with the largest distance between the student and the lecturer. We will assign the values 1 through 4 to the different rows.

We hypothesize that the distance between the student and the lecturer, where distance is operationalized as the row in which the student is seated, and the mean exam scores of the students in each row show a negative linear trend. The purpose of the data analysis is to estimate how large the (negative) slope of this linear trend is. Let us first suppose that there is a perfect negative linear trend, in the sense that each unit increase in the location variable is associated with a unit decrease in the mean exam score. Let us suppose that the means are 4, 3, 2 and 1, respectively. The negative linear trend is depicted in the following figure.

Figure 1: Negative linear trend with slope $\beta_1 = -1$

The equation for this perfect linear relation between location and mean exam score is $\bar{Y} = 5 + (-1)X$; that is, the slope of the negative trend equals $-1$. So, if the pattern in our sample means follows this perfect negative trend, we want our slope estimate to equal $-1$.

Now, following Maxwell, Delaney, and Kelley (2017), with equal sample sizes, the estimated slope of the linear trend is equal to

$$\hat{\beta}_1 = \frac{\hat{\psi}_{linear}}{\sum_{j=1}^{k} \lambda_j^2}, \quad (1)$$

where the lambda weights are $\lambda_j = X_j - \bar{X}$. For the intercept of the linear trend equation we have

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}. \quad (2)$$

                 Location
Row 1    Row 2    Row 3    Row 4
  5        7        5        1
  6        9        4        3
  7        8        6        4
  8        5        7        2
  9        6        8        0
$\bar{Y}_1 = 7$  $\bar{Y}_2 = 7$  $\bar{Y}_3 = 6$  $\bar{Y}_4 = 2$

Table 1: The data provided by Haans (2018)

Since the mean of the X values equals 2.5, we have lambda weights $\Lambda = [-1.5, -0.5, 0.5, 1.5]$. The value of the linear contrast equals $-1.5 \cdot 4 + -0.5 \cdot 3 + 0.5 \cdot 2 + 1.5 \cdot 1 = -5$ and the sum of the squared lambda weights equals 5, so the slope estimate equals $\hat{\beta}_1 = \frac{-5}{5} = -1$, as it should.

The importance of scaling becomes clear if we use the standard recommended lambda weights for estimating the negative linear trend. These standard weights are $\Lambda = [-3, -1, 1, 3]$. Using those weights leads to a contrast estimate of $-10$ and, since the sum of the squared weights equals 20, to a slope estimate of $-0.50$, which is half the value we are looking for. For significance tests of the linear trend, this difference in results doesn't matter, but for the interpretation of the slope it clearly does. Since getting the "correct" value for the slope estimate requires an additional calculation (albeit a very elementary one), I recommend sticking to setting the lambda weights to $\lambda_j = X_j - \bar{X}$.

Estimating the slope

Let us apply this to the imaginary data provided by Haans (2018). The data are reproduced in Table 1.

The group means of the four rows are $\bar{Y} = [7, 7, 6, 2]$. The lambda weights are $\Lambda = [-1.5, -0.5, 0.5, 1.5]$. The value of the contrast estimate equals $\hat{\psi}_{linear} = -8$ and the sum of the squared lambda weights equals $\sum_{j=1}^{k} \lambda_j^2 = 5$, so the estimated slope equals $-1.6$. The equation for the linear trend is therefore $\hat{\mu}_j = 9.5 - 1.6 X_j$. Figure 2 displays the obtained means and the estimated means based on the linear trend.
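The same computation for these data, in a small Python sketch (names mine) applying equations (1) and (2):

```python
# Equations (1) and (2) applied to the group means from Table 1.
X = [1, 2, 3, 4]                         # row numbers
means = [7, 7, 6, 2]                     # group means from Table 1

X_bar = sum(X) / len(X)                  # 2.5
lambdas = [x - X_bar for x in X]         # [-1.5, -0.5, 0.5, 1.5]

psi = sum(l * m for l, m in zip(lambdas, means))     # contrast: -8.0
slope = psi / sum(l ** 2 for l in lambdas)           # -8 / 5 = -1.6
intercept = sum(means) / len(means) - slope * X_bar  # 5.5 - (-1.6)*2.5 = 9.5

fitted = [intercept + slope * x for x in X]          # estimated group means
print(psi, slope, intercept, fitted)
```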

Note that the slope estimate can also be obtained from $r_{alerting}$, the product moment correlation between the group means and the contrast weights. Let us define the effect $\hat{\alpha}_j = \bar{Y}_j - \bar{Y}$, and the lambda weights as before; then

$$\hat{\beta}_1 = r_{alerting} \sqrt{\frac{\sum_{j=1}^{k} \hat{\alpha}_j^2}{\sum_{j=1}^{k} \lambda_j^2}}.$$

Figure 2: Obtained group means and estimated group means based on the linear trend

Obtaining the slope estimate with SPSS

If we estimate the linear trend contrast with SPSS, we will get a point estimate of the contrast value and a 95% confidence interval estimate. For instance, if we use the lambda weights $\Lambda = [-1.5, -0.5, 0.5, 1.5]$ and the following syntax, we get the output presented in Figure 3.

UNIANOVA score BY row
/METHOD=SSTYPE(3)
/INTERCEPT=INCLUDE
/CRITERIA=ALPHA(0.05)
/LMATRIX = "Negative linear trend?"
row -1.5 -0.5 0.5 1.5 intercept 0
/DESIGN=row.

Figure 3: SPSS output

Figure 3 makes clear that the 95% CI applies to the linear trend contrast estimate, not to the slope. But it is easy to obtain a confidence interval for the slope estimate by applying (1) to the limits of the CI of the contrast estimate. Since the sum of the squared lambda weights equals 5.0, the confidence interval for the slope estimate is 95% CI $[-11.352/5, -4.648/5] = [-2.27, -0.93]$. Alternatively, divide the lambda weights by the sum of the squared lambda weights and use the results in the specification of the L-matrix in SPSS:

UNIANOVA score BY row
/METHOD=SSTYPE(3)
/INTERCEPT=INCLUDE
/CRITERIA=ALPHA(0.05)
/LMATRIX = "Negative linear trend?"
row -1.5/5 -0.5/5 0.5/5 1.5/5 intercept 0
/DESIGN=row.

Using the syntax above leads to the results presented in Figure 4.
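The first approach, dividing the contrast CI limits by the sum of squared lambda weights, can be sketched in a couple of lines (Python for illustration; the limits are the ones SPSS reports in Figure 3):

```python
# Converting the contrast CI into a slope CI via equation (1).
sum_sq = 5.0                          # sum of squared lambda weights
ci_contrast = (-11.352, -4.648)       # contrast CI limits from the SPSS output

ci_slope = tuple(limit / sum_sq for limit in ci_contrast)
print([round(limit, 2) for limit in ci_slope])   # [-2.27, -0.93]
```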

Obtaining the estimation results in R

The following R code accomplishes the same goals.

Figure 4: SPSS output for adjusted lambda weights

# load the dataset
load('~\\betweenData.RData')

# load the functions of the emmeans package
library(emmeans)

# set options for the emmeans package to get only confidence intervals;
# set infer = c(TRUE, TRUE) for both CI and p-value
emm_options(contrast = list(infer = c(TRUE, FALSE)))

# specify the contrast (note: divide by the sum of squared contrast weights)
myContrast = c(-1.5, -0.5, 0.5, 1.5) / 5

# fit the model (this assumes the data are available in the workspace)
theMod = lm(examscore ~ location)

# get estimated marginal means
theMeans = emmeans(theMod, "location")

contrast(theMeans, list("Slope" = myContrast))

## contrast estimate        SE df  lower.CL   upper.CL
## Slope        -1.6 0.3162278 16 -2.270373 -0.9296271
##
## Confidence level used: 0.95
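For readers without R, these numbers can also be reproduced from the raw data in Table 1. The sketch below (Python, my own names) uses the standard equal-n contrast standard error, $\sqrt{MS_{within} \sum_j c_j^2 / n}$, and a tabled critical t value, so the limits agree with the emmeans output to the printed precision:

```python
# Reproducing the slope estimate, SE, and CI from the raw Table 1 data.
from statistics import mean, variance

rows = {
    1: [5, 6, 7, 8, 9],
    2: [7, 9, 8, 5, 6],
    3: [5, 4, 6, 7, 8],
    4: [1, 3, 4, 2, 0],
}
weights = [l / 5 for l in (-1.5, -0.5, 0.5, 1.5)]    # rescaled lambda weights
n = 5                                                 # per-group sample size

# Pooled within-group variance (equal n, so a simple average): 2.5
ms_within = mean(variance(scores) for scores in rows.values())
estimate = sum(w * mean(scores) for w, scores in zip(weights, rows.values()))

# SE of a contrast with equal group sizes: sqrt(MS_within * sum(c_j^2) / n)
se = (ms_within * sum(w * w for w in weights) / n) ** 0.5   # ~0.3162
t_crit = 2.120   # t(.975, df = 16), from a t table

print(estimate, round(se, 4),
      [round(estimate + s * t_crit * se, 2) for s in (-1, 1)])
```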

Interpreting the result

The estimate of the slope of the linear trend equals $\hat{\beta}_1 = -1.60$, 95% CI $[-2.27, -0.93]$. This means that with each increase in row number (from a given row to a location one row further away from the lecturer) the estimated exam score will on average decrease by 1.6 points, but any value between $-2.27$ and $-0.93$ is considered to be a relatively plausible candidate value for the slope, with 95% confidence. (Of course, we should probably not extrapolate beyond the rows that were actually observed; otherwise students seated behind the lecturer would be expected to have a higher population mean than students seated in front of the lecturer.)

In order to aid interpretation, one may convert these numbers to a standardized version (resulting in the standardized confidence interval of the slope estimate) and use rules-of-thumb for interpretation. The square root of the within-condition variance may be a suitable standardizer. The value of this standardizer is $S_W = 1.58$ (I obtained the value of $MS_{within} = 2.5$ from the SPSS ANOVA table). The standardized estimates are therefore $-1.0$, 95% CI $[-1.43, -0.59]$, suggesting that the negative effect of moving one row further from the lecturer is somewhere between medium and very large, with the point estimate corresponding to a large negative effect.
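The standardization itself is a one-line division; a small sketch of the calculation (Python, using the unrounded limits from the emmeans output):

```python
# Standardizing the slope and its CI limits by sqrt(MS_within).
from math import sqrt

ms_within = 2.5
s_w = sqrt(ms_within)            # 1.5811..., the standardizer

slope, lower, upper = -1.6, -2.270373, -0.929627   # from the emmeans output
standardized = [round(v / s_w, 1) for v in (slope, lower, upper)]
print(standardized)   # [-1.0, -1.4, -0.6]
```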

References

Haans, Antal (2018). Contrast Analysis: A Tutorial. Practical Assessment, Research & Evaluation, 23(9). Available online: http://pareonline.net/getvn.asp?v=23&n=9

Maxwell, S.E., Delaney, H.D., & Kelley, K. (2017). Designing Experiments and Analyzing Data: A Model Comparison Perspective (Third Edition). New York/London: Routledge.

Rosenthal, R., Rosnow, R.L., & Rubin, D.B. (2000). Contrasts and Effect Sizes in Behavioral Research: A Correlational Approach. Cambridge, UK: Cambridge University Press.
