The Effects of Read-Aloud Accommodations for Students With and Without Disabilities: A

Meta-Analysis

Hongli Li

Georgia State University

Correspondence should be addressed to Hongli Li, Georgia State University, Department of

Educational Policy Studies, P.O. Box 3977, Atlanta GA 30303. Email: hli24@gsu.edu.

This paper has been accepted by Educational Measurement: Issues and Practice. For the final

version, please refer to http://onlinelibrary.wiley.com/doi/10.1111/emip.12027/abstract

Please cite as:

Li, H. (2014, online first). The effects of read-aloud accommodations for students with and

without disabilities: A meta-analysis. Educational Measurement: Issues and Practice. DOI:

10.1111/emip.12027.

ABSTRACT

Read-aloud accommodations have been proposed as a way to help remove barriers faced by

students with disabilities in reading comprehension. Many empirical studies have examined the

effects of read-aloud accommodations; however, the results are mixed. With a variance-known

hierarchical linear modeling approach, based on 114 effect sizes from 23 studies, a meta-analysis

was conducted to examine the effects of read-aloud accommodations for students with and

without disabilities. In general, both students with disabilities and students without disabilities

benefited from the read-aloud accommodations, and the accommodation effect size for students

with disabilities was significantly larger than the effect size for students without disabilities.

Further, this meta-analysis reveals important factors that influence the effects of read-aloud

accommodations. For instance, the accommodation effect was significantly stronger when the

subject area was reading than when the subject area was math. The effect of read-aloud

accommodations was also significantly stronger when the test was read by human proctors than

when it was read by video/audio players or computers. Finally, the implications, limitations, and

directions for future research are discussed.

Keywords: Meta-analysis, read-aloud accommodations, disabilities

INTRODUCTION

According to the National Center for Education Statistics (2011), the proportion of students

with disabilities in K-12 public schools increased from 8.3% in 1976–1977 to 13.1% in 2009–

2010. With the enactment of the No Child Left Behind Act of 2001, the Individuals with

Disabilities Education Act Amendments of 1997, and the Individuals with Disabilities Education

Improvement Act of 2004, schools are required to include students with disabilities in state

testing programs. The aim is to ensure that students with disabilities benefit from standards-

based reforms and achieve high educational standards. However, a major concern is that

historically, general large-scale assessments, which are intended for all students except those who

participate in alternate assessments, were developed without consideration of students with

disabilities, and thus may constitute an additional challenge for students with disabilities (Dolan,

Hall, Banerjee, Chun, & Strangman, 2005). To ensure that students with disabilities are

appropriately included in state testing programs, test accommodations have been proposed to

level the playing field by removing construct-irrelevant variance caused by disabilities (Fuchs,

Fuchs, Eaton, Hamlett, & Karns, 2000; Lai & Berkeley, 2012). Among the many existing test

accommodation strategies, the read-aloud accommodation is one of the most commonly used for

students with disabilities (Sireci, Scarpati, & Li, 2005). With this accommodation, the test (or

certain parts of it, such as directions, questions, or prompts) is read to students by a teacher or a

device, in addition to the printed text (Thurlow, Moen, Lekwa, & Scullin, 2010). The read-aloud

accommodation is primarily provided to students with learning disabilities (Crawford & Tindal,

2004), and it is thought that students who struggle to decode written texts will benefit from this

accommodation (Bolt & Roach, 2009).

The differential boost framework (Fuchs & Fuchs, 1999) is often used to evaluate the

effects of read-aloud accommodations. In this framework, both students with disabilities and

students without disabilities are expected to benefit from the accommodation; however, students

with disabilities benefit differentially more than students without disabilities. A more strictly

defined version of this framework is the interaction hypothesis (Sireci et al., 2005; Zuriff, 2000),

according to which students who need the accommodation should benefit from it and students

who do not need the accommodation should not benefit from it. The interaction hypothesis is

more stringent in that students without disabilities should not benefit from the accommodation.

Many empirical studies have examined the effects of read-aloud accommodations for students

with disabilities; however, the results are mixed (Elbaum, 2007). Further, it is not clear which

factors influence the heterogeneous effects of read-aloud accommodations. Therefore, a

quantitative synthesis of previous studies is of particular importance in regard to providing solid

information about read-aloud accommodations to educators and policy-makers.

The purpose of the present study is to conduct a meta-analysis on the effects of read-

aloud accommodations for students with and without disabilities. Specifically, two research

questions are asked: (1) What are the effects of read-aloud accommodations for students with

and without disabilities? (2) Which factors are likely to influence the effects of read-aloud

accommodations?

LITERATURE REVIEW

According to Thurlow, Lazarus, Thompson, and Morse (2005), there are five major

accommodation categories: (a) timing—alternative test schedules, (b) response—alternative

ways to respond to the assessment, (c) setting—changes to test surroundings, (d) equipment and

materials—the use of additional devices or references, and (e) presentation—alternative ways to

present test materials. Read-aloud accommodations, the focus of this study, present test materials

in an alternative way. The effects of such accommodations are complicated by the involvement

of different students, subject areas, accommodation delivery methods, and other factors (Thurlow,

2007).

Read-aloud accommodations are typically used for math tests with the expectation that

the accommodation will not change the construct being tested. Elbaum (2007) summarized four

types of findings regarding the effects of read-aloud accommodations with math tests. The first

group of studies reported a significantly positive result for students with disabilities, with little or

no effect for students without disabilities (e.g., Tindal, Heath, Hollenback, Almond, & Harniss,

1998). The second group found significantly positive effects for all students, though the effects

were stronger for students with disabilities (e.g., Weston, 2003). The third group showed

significantly positive effects for all students, with no significant difference in regard to the

magnitude of the effects for students with disabilities compared with those without disabilities

(e.g., Meloy, Deville, & Frisbie, 2002). The fourth group found no significant results for either

group of students (e.g., Helwig & Tindal, 2003). In summary, the effects of read-aloud

accommodations for math tests vary considerably (Laitusis, Buzick, Stone, Hansen, & Hakkinen,

2012).

Compared with read-aloud accommodations for math tests, read-aloud accommodations

are much more controversial in the context of reading tests. According to the simple view of

reading, reading comprehension involves two components: decoding and linguistic

comprehension (Hoover & Gough, 1990). Decoding refers to rapidly deriving a representation

from printed input, whereas linguistic comprehension refers to taking lexical information and

deriving sentence and discourse interpretations. Read-aloud accommodations make decoding

words easier, which further facilitates reading comprehension. Many researchers take the

position that providing read-aloud accommodations for a reading test changes the construct being

measured and, therefore, should not be allowed (e.g., Bielinski, Thurlow, Ysseldyke, Freidebach,

& Freidebach, 2001; Phillips, 1994). However, this issue remains controversial. For example, as

Crawford and Tindal (2004) have argued, although providing read-aloud accommodations in a

reading test may change the skills being tested from reading comprehension to listening

comprehension, listening and reading comprehension are so highly correlated that such an

accommodated test still provides information about students’ reading skills. Also, according to

Laitusis (2010), when decoding skills are not considered to be a part of reading comprehension,

reading a reading test aloud does not necessarily change the construct being tested. A number of

studies have focused on using read-aloud accommodations for reading tests, and inconsistent

results have been reported. For instance, Meloy et al. (2002) and McKevitt and Elliott (2003)

found similar gains for students with and without disabilities as a result of receiving read-aloud

accommodations on reading tests. Crawford and Tindal (2004) and Laitusis (2010), however,

found a differential boost from the read-aloud accommodation compared to the

nonaccommodation condition for students with disabilities relative to students without

disabilities. In summary, studies on the use of read-aloud accommodations in reading tests

present mixed findings, and whether we should provide read-aloud accommodations in reading

tests continues to be a controversial issue (Thurlow et al., 2010).

A few studies (e.g., Calhoon, Fuchs, & Hamlett, 2000; Miranda, Russell, & Hoffmann,

2004) have explored whether the effects of the read-aloud accommodation differ depending on

how it is delivered. Often, a human proctor, either a teacher or a test administrator, reads the test

to students (e.g., Elbaum, 2007). In some studies, the test is read to students by a video or audio

player (e.g., Helwig & Tindal, 2003), and in others, the read-aloud accommodation is delivered

via computers (e.g., Burch, 2002). In an experimental study, Calhoon et al. (2000) did not find a

significant difference between the effects of the read-aloud accommodation delivered by a

human proctor and the accommodation delivered by a computer. However, 65% of the students

in that study reported that they preferred receiving accommodations via computers due to the

anonymity this method afforded them. Certainly, it would be interesting to determine whether

method of delivery has any bearing on the effects of read-aloud accommodations.

Researchers have also found that grade level is related to the effect of read-aloud

accommodations. For instance, in Laitusis (2010), the differential boost was greater in grade 4

than in grade 8 for both students with and without disabilities. In a meta-analysis of read-aloud

accommodations in math tests for students with disabilities, Elbaum (2007) found that the

accommodation effect was stronger for students with disabilities than for students without

disabilities at the elementary school level, but the converse was true for secondary school

students. Laitusis et al. (2012) also noted that read-aloud accommodation studies involving either

middle school or high school students showed less effect compared to studies involving

elementary school students. Grade level, therefore, is an important factor in considering the

effects of read-aloud accommodations.

In order to provide read-aloud accommodations, extra time is sometimes allowed in the

accommodated condition, not because this is a purposeful aspect of the accommodation design

but because the accommodation necessitates it (Olson & Dirir, 2010). For instance, extra time

may be needed to turn a video player on and to change a tape. Therefore, when the read-aloud

accommodation shows an effect, it is important to determine whether extra time has confounded

the observed effect (Harker & Feldt, 1993).

Researchers have conducted a number of qualitative reviews on the effects of test

accommodations (e.g., Cormier, Altman, Shyyan, & Thurlow, 2010; Laitusis et al., 2012;

Rogers, Christian, & Thurlow, 2012; Sireci et al., 2005; Tindal & Fuchs, 2000; Zenisky & Sireci,

2007). For example, Sireci et al. (2005) reviewed 59 studies on test accommodations for students

with disabilities, 23 of which used read-aloud accommodations. Despite the mixed results, they

concluded that read-aloud accommodations in math tests appeared to lead to a more valid

interpretation of the math achievement of students with disabilities. In 2012, Laitusis et al.

reviewed test accommodations for students with disabilities. They also found that the read-aloud

accommodation for math tests appeared to be warranted and that the accommodation effects

were influenced by many factors. They further suggested that read-aloud accommodations could

be used for English language arts (ELA) tests “when decoding is not a part of the construct being

measured and in middle school if the read-aloud accommodation is offered without significantly

extending the testing time” (p. 28).

In addition, meta-analysis studies have been performed on test accommodations for

students with disabilities. Chiu and Pearson (1999) conducted a meta-analysis of different types

of test accommodations for both English language learners and students with disabilities. Among

the 40 effect sizes for students with disabilities, only five involved presentation formats (i.e.,

read-aloud accommodations). They found that on average students with disabilities had a score

gain of .16 standard deviation units as a result of receiving test accommodations. However, no

conclusion was drawn specifically about the use of read-aloud accommodations for students with

disabilities. Elbaum (2007) performed a meta-analysis to determine the effects of read-aloud

accommodations in math tests for students with disabilities. In total, 17 studies were included,

published between 1998 and 2003. The effect sizes were examined across grade levels. For

elementary school students, the effect sizes ranged from .10 to .82, whereas for secondary school

students, the effect sizes ranged from -.07 to .30. Recently, Vanchu-Orosco (2012) performed a

meta-analysis of different types of test accommodations for students with disabilities. Based on

119 comparisons from 34 studies conducted and/or published from 1999 to 2011, she concluded

that the effect size of test accommodations for students with disabilities was .30 and the effect

size for students without disabilities was .17. Despite the large scale of this meta-analysis,

Vanchu-Orosco did not specifically study read-aloud accommodations.

As suggested by Zenisky and Sireci (2007), there is a need for more well-constructed

meta-analyses of specific accommodations. Therefore, in the present study, we perform a meta-

analysis to determine the effects of read-aloud accommodations for students with and without

disabilities and also to investigate which factors are likely to influence these effects. The meta-

analysis we propose differs from previous meta-analyses in three major respects. First, our meta-

analysis includes a larger number of read-aloud accommodation studies than previous ones. For

example, we include studies on both math and reading tests, from both published and

unpublished sources. Second, our meta-analysis focuses exclusively on read-aloud

accommodations, so that we are able to consider a larger number of variables to explain the

accommodation effects, such as subject area, accommodation delivery method, grade level, extra

time, and research design. Third, unlike Chiu and Pearson (1999), Elbaum (2007), and Vanchu-

Orosco (2012), our meta-analysis uses the variance-known HLM approach, which is explained in

detail as follows.

Traditionally, researchers have used fixed-effect models for meta-analysis, with the

assumption that the effect size in each study is an estimate of a common effect size of the whole

population of the studies (Hunter, Schmidt, & Jackson, 1982). In contrast, random-effects

models assume that the included studies are random samples drawn from a population of studies,

so that the findings can be generalized beyond the particular studies included in the meta-

analysis (DerSimonian & Laird, 1986). Raudenbush and Bryk (1985, 2002) proposed a two-level

variance-known hierarchical linear modeling (HLM) approach to meta-analysis, which is

regarded as a mixed-effects model (Fischer & Mansell, 2009). It goes beyond the random-effects

approach by testing whether there is systematic variance that can be explained by study

characteristics beyond simple random variation (Lipsey & Wilson, 2001). Subjects are regarded

as nested within the primary studies included in the meta-analysis. The level-1 model

investigates how effect sizes vary across studies, whereas the level-2 model explains the

potential sources of this variation by examining multiple predictors of effect sizes

simultaneously. Using a simulation study, Van den Noortgate and Onghena (2003) have shown that the

variance-known HLM approach generally produces less biased estimates compared to the fixed-

effects approaches, unless the number of studies is small. In the present meta-analysis, we are

particularly interested in discovering factors that influence the effects of read-aloud

accommodations. Therefore, due to its flexibility (Hox, 2010), we chose the variance-known

HLM approach for this meta-analysis. The technical details of this approach are explained in the

methods section.

METHODS

Selecting Studies

Studies were selected for inclusion in the meta-analysis based on the following criteria. First,

only studies in which a read-aloud accommodation featured as the single test accommodation

strategy for students with disabilities and/or students without disabilities were eligible for

inclusion. Second, to address the issue that studies reporting significant effects are more likely to

be published (Glass, 1977), we considered both published and unpublished studies. Third, only

studies that have an experimental or quasi-experimental design and that present sufficient

information to calculate effect sizes were included. Finally, due to the small number of read-

aloud accommodation studies in the context of science tests, only studies involving math or

reading tests were considered.

The following procedures were used to search for eligible studies. First, using various

combinations of key words and phrases such as “read-aloud,” “oral,” and “test accommodation,”

we searched several well-known online databases, including ERIC, JSTOR, ProQuest, and

PsycINFO. Second, we searched reviews on test accommodations for students with disabilities

and without disabilities (e.g., Chiu & Pearson, 1999; Elbaum, 2007; Laitusis et al., 2012; Rogers

et al., 2012; Sireci et al., 2005; Vanchu-Orosco, 2012) and major journals for relevant articles.

Finally, we reviewed references cited in the studies that we had already determined to be eligible

and added those that we had not already found through other sources. After an initial search, we

located 94 studies, among which 71 studies were excluded because they did not meet our

inclusion criteria. Although we did not specify a time frame, all the 94 studies we initially

retrieved were published or released after 1990.

Most of the eligible studies involved more than one comparison (or effect size). For

instance, Laitusis (2010) included two groups of students without disabilities and two groups of

students with disabilities, such that this one article generated four comparisons (or effect sizes).

Multiple strategies have been used to address this issue (Marsh, Bornmann, Mutz, Daniel, &

O’Mara, 2009). First, the multiple effect sizes within each study can be averaged, or one effect

size can be selected from each study. As a result, the number of effect sizes is drastically

reduced. Second, this dependence can be modeled by adding a third level to the variance-known

HLM analysis. This approach, however, is constrained by the number of studies included in the

meta-analysis. Third, the dependence can be ignored when it is appropriate to do so. For instance,

as recommended by Borenstein, Hedges, Higgins, and Rothstein (2009, p. 223), when each of the

subgroups in a single study contributes independent information, the “independent subgroups are

no different than independent studies.” Although ignoring the dependence slightly biases

standard errors downward, Marsh et al. (2009) did not find much difference between results of

the method that models dependence as a third level and results of the method that ignores the

dependence. Vanchu-Orosco (2012) also obtained similar results whether she averaged the

multiple effect sizes from a single study or ignored the dependence. Because the samples used to

calculate the four effect sizes were mutually exclusive in Laitusis (2010), we treated each single

comparison from the Laitusis study as the unit of analysis in our meta-analysis. In a similar way,

after a thorough search and screening, we determined that 114 comparisons from 23 studies were

eligible for inclusion in the present meta-analysis. (See Appendix A for a list of the studies

included and their characteristics.) Nevertheless, a single study contributed different numbers of

effect sizes to this meta-analysis, ranging from 1 (e.g., Johnson, 2000) to 16 (e.g., Helwig &

Tindal, 2003). We, therefore, performed a sensitivity analysis to evaluate whether excluding a

particular study with a large number of effect sizes would substantially change the results of the

meta-analysis.

Coding Procedure

Based on the literature, five variables—disability status, subject area, delivery method, grade

level, and extra time—were identified as closely related to the effects of read-aloud

accommodations. These variables were subsequently used as potential predictors to account for

variations in the effect sizes among the studies. The sample size, mean test score, and standard

deviation for both the experimental groups and the control groups were also extracted in order to

calculate effect size statistics. The author coded the studies according to a coding scheme, after

which a trained graduate assistant used the same coding scheme to code all the studies

independently. A measure of inter-rater reliability, percentage agreement, was calculated for

each coded variable. The author and the graduate assistant had perfect agreement regarding the

codes for disability status, subject area, delivery method, and grade level. The percentage

agreement was 80% for extra time. Disagreements were resolved through a process of discussion

until an agreement was reached.
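The percentage-agreement measure used here can be sketched in a few lines of code. The function and the two raters' code lists below are illustrative assumptions, not the study's actual data.

```python
# Illustrative sketch (not the author's code): inter-rater percentage
# agreement between two coders over the same list of coded items.

def percentage_agreement(codes_a, codes_b):
    """Share of items on which the two raters assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Raters must code the same number of items")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

# Hypothetical codes for the 'extra time' variable across 10 comparisons:
author = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "no", "yes"]
assistant = ["yes", "no", "yes", "yes", "no", "yes", "no", "no", "no", "yes"]

print(percentage_agreement(author, assistant))  # 0.8, i.e., 80% agreement
```

A value of 1.0 would indicate perfect agreement, as was obtained for disability status, subject area, delivery method, and grade level.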

First, disability status was coded as “with disabilities” or “without disabilities” based on

information provided in the studies. Specifically, in all the studies included in this meta-analysis,

students with disabilities were mostly identified as having learning disabilities (see Appendix A).

Second, subject areas were coded as either math or reading. Third, the methods whereby the

read-aloud accommodations were delivered were coded according to three categories: when the

test was read by a teacher or a test administrator, it was coded as “read by human proctors”;

when read by a video tape player (Crawford & Tindal, 2004) or a cassette player (Harker &

Feldt, 1993), it was coded as “read by video/audio players”; and when read via a computer, it

was coded as “read by computers.” Fourth, grade level was coded according to three categories:

below 6th grade was coded as “elementary school,” 6th to 8th as “middle school,” and 9th to 12th as

“high school.” Fifth, the coding of extra time was less straightforward than the coding for the

other variables. Studies in which extra time was deliberately combined with the read-aloud

accommodation so that a package of accommodations was provided (e.g., Schulte, Elliott, &

Kratochwill, 2001) were excluded from this meta-analysis. We included only studies in which

the read-aloud was offered as a single accommodation strategy and extra time was inevitably

allowed due to practical reasons relating to delivering the read-aloud accommodation (Olson &

Dirir, 2010). When it was specifically stated that the read-aloud accommodated condition

allowed more time than the standard condition, we coded this variable as “yes”; otherwise, we

coded it as “no.” In a few cases, we contacted the authors in order to collect sufficient

information to code this variable.

In addition, we coded the research design of each study. In a study with an independent

group design, typically students were randomly assigned to either the accommodation or the

control condition. Effect size was calculated as the standardized mean difference between the

two groups using the pooled standard deviation of the two groups (raw-score effect size) (Morris

& DeShon, 2002). However, many of the read-aloud accommodation studies that we included

used a repeated measure design (i.e., each student took the test under both conditions, with read-

aloud accommodations and without). Typically, the design was counter-balanced in order to

minimize the order effects. For a study using the repeated-measure design, the correlation

between the pre-test and post-test scores is needed to calculate the effect size (change-score

effect size) (Morris & DeShon, 2002). However, many of the studies using the repeated measure

design did not report this correlation, so that we were not able to calculate the change-score

effect size. We, therefore, calculated the raw-score effect size for both the independent group

design and the repeated measure design studies. This practice was also adopted in other test

accommodation meta-analysis studies, such as Gregg and Nelson (2012) and Kieffer, Rivera, and

Francis (2012). In order to adjust this artifact due to different research designs, we further coded

research design as a dichotomous variable (repeated measure design or independent group

design) and used this variable as one of the level-2 predictors in the subsequent analysis (Briggs,

Ruiz-Primo, Furtak, Shepard, & Yin, 2012; Hox, 2010). The percentage agreement between the

author and the graduate assistant was 83% for this variable. However, in the studies with an

independent group design, it was not always the case that individual students were randomly

assigned to the standard or accommodated condition. Sometimes, the randomization was at the

classroom (e.g., Elbaum, 2007) or school level (e.g., Laitusis, 2010) due to practical constraints.
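The distinction between the raw-score and change-score metrics (Morris & DeShon, 2002) can be illustrated with a short sketch. The group statistics and the pre/post correlation below are hypothetical; the point is that the change-score metric cannot be computed without the correlation r that many primary studies did not report.

```python
import math

# Illustrative sketch of the two effect-size metrics; all numbers are
# hypothetical, not taken from the included studies.

def raw_score_es(mean_e, mean_c, sd_pooled):
    """Standardized mean difference in the raw-score metric."""
    return (mean_e - mean_c) / sd_pooled

def change_score_es(mean_e, mean_c, sd_pooled, r):
    """Repeated-measures (change-score) effect size; requires the
    correlation r between the two conditions' scores."""
    sd_diff = sd_pooled * math.sqrt(2 * (1 - r))
    return (mean_e - mean_c) / sd_diff

d_raw = raw_score_es(52.0, 48.0, 10.0)             # 0.40
d_change = change_score_es(52.0, 48.0, 10.0, 0.8)  # 0.40 / sqrt(0.4) ~ 0.63
```

The same mean difference yields a larger change-score effect size when r is high, which is why mixing the two metrics, or using the raw-score metric throughout, must be adjusted for by coding research design as a predictor.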

Data Analysis Using Variance-Known HLM

The meta-analysis was performed following the variance-known HLM approach with the

HLM 6.08 software (Raudenbush, Bryk, & Congdon, 2004). As demonstrated in Raudenbush

and Bryk (2002), dj, the effect size estimate for comparison j, is the standardized mean difference

between an experimental group and a control group. It is defined as

dj = (Ej - Cj) / Sj [1]

where

Ej is the mean outcome for the experimental group;

Cj is the mean outcome for the control group; and

Sj is the pooled within-group standard deviation.
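Equation 1 can be computed directly from group summary statistics, as in the following sketch. The pooled-SD formula is the standard two-group pooling; the sample values are illustrative assumptions, not data from the included studies.

```python
import math

# Sketch of Equation 1 with hypothetical group summary statistics.

def pooled_sd(n_e, sd_e, n_c, sd_c):
    """Pooled within-group standard deviation of the two groups."""
    return math.sqrt(((n_e - 1) * sd_e**2 + (n_c - 1) * sd_c**2)
                     / (n_e + n_c - 2))

def effect_size_d(mean_e, mean_c, n_e, sd_e, n_c, sd_c):
    """d_j = (mean_E - mean_C) / S_j, as in Equation 1."""
    return (mean_e - mean_c) / pooled_sd(n_e, sd_e, n_c, sd_c)

# Hypothetical accommodated (E) vs. standard (C) condition:
d = effect_size_d(mean_e=55.0, mean_c=50.0, n_e=30, sd_e=12.0,
                  n_c=30, sd_c=13.0)
print(round(d, 2))  # 0.4
```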

According to Hedges (1981), dj can be viewed as a statistic to estimate the corresponding

population effect size. It is approximately unbiased and normally distributed with variance

Vj = (nEj + nCj) / (nEj nCj) + δj² / [2(nEj + nCj)] [2]

where

δj is the corresponding population effect size;

nEj is the experimental-group sample size; and

nCj is the control-group sample size.

The observed dj is used to substitute for δj in equation 2, and Vj is assumed to be known. When

there are at least 20 (Hedges & Olkin, 1985) or 30 (Raudenbush & Bryk, 2002) cases per study,

it is reasonable to assume that the variance Vj can be estimated with sufficient accuracy (Hox,

2010). Hedges (1981) also presented a correction for bias in the calculation of effect sizes when

the sample size of the experimental group, nEj, or that of the control group, nCj, is very small:

Adjusted dj = dj [1 - 3 / (4(nEj + nCj) - 9)] [3]
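Equations 2 and 3 translate directly into code, as in the sketch below; the effect size and sample sizes used are hypothetical.

```python
# Sketch of Equations 2 and 3 with hypothetical inputs.

def effect_size_variance(d, n_e, n_c):
    """V_j per Equation 2, with the observed d substituted for the
    population effect size delta_j."""
    return (n_e + n_c) / (n_e * n_c) + d**2 / (2 * (n_e + n_c))

def hedges_adjusted_d(d, n_e, n_c):
    """Hedges' (1981) small-sample bias correction, Equation 3."""
    return d * (1 - 3 / (4 * (n_e + n_c) - 9))

v = effect_size_variance(0.40, 30, 30)   # 60/900 + 0.16/120 = 0.068
d_adj = hedges_adjusted_d(0.40, 30, 30)  # 0.40 * (1 - 3/231) ~ 0.395
```

With 30 cases per group, the correction shrinks d only slightly; it matters mainly for the very small samples the correction was designed for.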

With the variance-known HLM approach, the level-1 outcome variable in the meta-

analysis is the effect size reported for each comparison. When the variation in effect sizes is

statistically significant, level-2 analysis is used to determine the extent to which the predictors

contribute to explaining that variation. As described by Raudenbush and Bryk (2002), the level-1

model (often referred to as the unconditional model) is

dj = δj + ej [4]

where

δj is the true overall effect size across comparisons; and

ej is the sampling error associated with dj as an estimate of δj.

Here, we assume that ej ~ N(0, Vj).

In the level-2 model, the true population effect size, δj, depends on comparison

characteristics and a level-2 random error term:

δj = γ0 + γ1W1j + γ2W2j + … + γ6W6j + γ7W7j + γ8W8j + µj [5]

where

W1j …W8j are the comparison characteristics predicting δj (see Table 1 for the list of

variables and the corresponding frequencies);

γ0 is the expected overall effect size when each Wij is zero;

γ1 … γ8 are regression coefficients associated with the comparison characteristics W1 to

W8; and

µj is a level-2 random error.
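As a simplified illustration of the unconditional model in Equation 4, a precision-weighted mean of the effect sizes can be computed directly, weighting each dj by 1/Vj. This sketch ignores the level-2 random error µj that the HLM software estimates, and the effect sizes and variances below are made up for illustration, not the 114 effect sizes of this study.

```python
import numpy as np

# Minimal variance-known sketch: precision-weighted grand-mean effect
# size and its standard error (fixed-effect approximation; no mu_j).

d = np.array([0.10, 0.25, 0.40, -0.05, 0.30])  # observed effect sizes d_j
v = np.array([0.05, 0.04, 0.06, 0.03, 0.05])   # known sampling variances V_j

w = 1.0 / v                                   # precision weights
grand_mean = np.sum(w * d) / np.sum(w)        # weighted-mean effect size
se = np.sqrt(1.0 / np.sum(w))                 # its standard error
z = grand_mean / se                           # test against zero

print(round(grand_mean, 3), round(se, 3))
```

The conditional model in Equation 5 extends this by regressing the dj on the study characteristics Wij with the same precision weighting, plus the random term µj.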

Insert Table 1 Here

RESULTS

Based on the procedure described in the methods section, 114 effect sizes from 23 studies were

included in this meta-analysis. The distribution of the effect sizes is illustrated in Figure 1. More

of the effect sizes were positive than negative, and no outliers were detected. The effect sizes

ranged from -.95 to 1.20, with a mean of .20 and a standard deviation of .36. The effect sizes

were approximately normally distributed with a skewness of .23 and a kurtosis of 1.51.

Insert Figure 1 Here

The predictors were entered into the model by category separately and then in a

combined way. Table 2 summarizes the estimated regression coefficients, the 95% confidence

intervals, and the random components of a series of models. Due to space limitations, we did not

refer to confidence intervals in the subsequent sections. Model 0 shows the results when no

predictors were included. The intercept (i.e., the estimated grand-mean effect size) was .20,

which was statistically different from zero (t (113) = 6.70, p < .001). This result indicates that on

average students who received read-aloud accommodations scored about .20 standard deviation

units higher than their non-accommodated peers. Furthermore, the estimated variance of the

effect size was .06, which was significantly different from zero. This suggests that variability

existed in the true effect sizes across comparisons. Therefore, the results show that analysis

should proceed to a level-2 conditional model in order to determine which characteristics explain

this variability.

Insert Table 2 Here

In Model 1, disability status was statistically significant (γ = .13, t (112) = 2.13, p < .05).

Students without disabilities who received read-aloud accommodations scored about .14 standard

deviation units higher than their non-accommodated peers, whereas students with disabilities

who received read-aloud accommodations scored about .27 (i.e., .14 +.13) standard deviation

units higher than their non-accommodated peers. In Model 2, the accommodation effect size for

math tests was significantly smaller than that for reading tests (γ = -.27, t (112) = -4.18, p < .001).

Specifically, students who received a read-aloud accommodation on reading tests scored about

.41 standard deviation units higher than their non-accommodated peers; however, the increase

was only .14 (i.e., .41 - .27) standard deviation units for math tests. In Model 3, both

variables related to accommodation delivery methods were statistically significant. When a

human proctor read the test, students who received read-aloud accommodations scored about .34

standard deviation units higher than their non-accommodated peers, whereas the increase was .11

(i.e., .34 - .23) standard deviation units when the read-aloud was delivered by a computer and .12


(i.e., .34 - .22) standard deviation units when the read-aloud was delivered by a video/audio

player.
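At level 2, each of these models amounts to a weighted regression of the effect sizes on dummy-coded predictors. The following sketch uses hypothetical effect sizes and variances (not the study's dataset) to show the mechanics of such a fit with a disability-status dummy, mirroring the coding of Model 1:

```python
import numpy as np

def meta_regression(d, v, X, tau2=0.0):
    """Level-2 weighted least squares: d_i = x_i' gamma + u_i + e_i, with
    known sampling variances v_i and between-comparison variance tau2."""
    d, v, X = np.asarray(d, float), np.asarray(v, float), np.asarray(X, float)
    W = np.diag(1.0 / (v + tau2))
    cov = np.linalg.inv(X.T @ W @ X)
    gamma = cov @ X.T @ W @ d          # regression coefficients
    se = np.sqrt(np.diag(cov))         # their standard errors
    return gamma, se

# Hypothetical data; the second column of X is a disability-status dummy
# (1 = students with disabilities).
d = [0.10, 0.18, 0.25, 0.31]
v = [0.02] * 4
X = [[1, 0], [1, 0], [1, 1], [1, 1]]
gamma, se = meta_regression(d, v, X)
# gamma[0] is the effect for students without disabilities;
# gamma[1] is the additional boost for students with disabilities.
```

With this coding, the intercept estimates the accommodation effect for the reference group, and each dummy coefficient estimates a contrast against it, which is how the .14 and .13 figures above should be read.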

In Model 4, for middle school students and for high school students, the effect of read-

aloud accommodations was not significantly different from that for elementary school students.

In Model 5, compared to the effect size when extra time was not evidently provided in the

accommodated condition, the effect size when extra time was provided was significantly larger

by .21 standard deviation units. In Model 6, compared to the effect size in studies with a repeated

measure design, the effect size in studies with an independent group design was significantly

larger by .19 standard deviation units.

In Model 7, we entered all the predictors at one time to investigate their effects

simultaneously. As shown in Table 2, the regression coefficient related to disability status was

significantly positive in both Models 1 and 7. This result indicates that whether or not we

controlled for other predictors, the effect size of receiving read-aloud accommodations was

larger for students with disabilities than for those without disabilities. The effects of subject areas

and accommodation delivery methods were also consistent whether or not other predictors were

included in the model. There were, however, minor variations in regard to the other predictors

across models. The difference between middle schools and elementary schools became

statistically significant in Model 7. Extra time and research design, however, became statistically

non-significant in Model 7. These minor variations indicate a potential interaction among the

predictors, although this interaction is considered slight.

In addition to the regression coefficient, the proportion of variance explained was also

calculated with Model 0 as the baseline model (Raudenbush & Bryk, 2002). As shown in the last

row of Table 2, subject area and accommodation delivery method each explained over 16% of


the variance in effect sizes, followed by research design (5.8%), disability status (5.4%), extra

time (3.0%), and grade level (1.1%). In Model 7, when all the predictors were included, the

proportion of variance explained was 47.3%. Still, the estimated variance of the effect sizes in

this model was .034, which was significantly different from zero (χ2 = 260.77, df = 105, p <

.001). This indicates that unknown sources of variability still exist among the effect sizes

beyond what has been accounted for in this meta-analysis.
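The proportion of variance explained is the proportional reduction in the estimated level-2 variance relative to the unconditional model. Plugging in the rounded estimates reported above (.06 unconditional, .034 in Model 7) gives roughly .43; the 47.3% in the text presumably reflects the unrounded estimates:

```python
def variance_explained(tau2_baseline, tau2_conditional):
    """Proportional reduction in between-comparison variance relative to
    the unconditional model (Raudenbush & Bryk, 2002)."""
    return (tau2_baseline - tau2_conditional) / tau2_baseline

print(round(variance_explained(0.06, 0.034), 3))  # about 0.433 with rounded inputs
```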

Figure 2 represents the estimated effect sizes in Model 7 when we controlled for grade

level, extra time, and research design. For example, when the subject area was reading, for

students without disabilities, the estimated effect sizes were as follows: .48 when the test was

read by a human proctor, .26 (i.e., .48 - .22) when read by a computer, and .28 (i.e., .48 - .20)

when read by a video/audio player. The estimated effect size for math was calculated in a similar

way. As shown in Figure 2, the estimated accommodation effects varied substantially across

combinations of disability status, subject area, and accommodation delivery method. We discuss

Figure 2 in greater detail in the subsequent section.

Insert Figure 2 Here
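Each cell estimate in Figure 2 is obtained by adding the applicable dummy coefficients to the intercept. A minimal sketch, using only the two delivery-method contrasts quoted above for reading tests and students without disabilities (other cells would add further dummies, e.g., for math or disability status):

```python
def predicted_effect(intercept, coefs, active):
    """Model-implied effect size: the intercept plus the coefficients of
    whichever dummy predictors are 'on' for a given cell."""
    return intercept + sum(coefs[name] for name in active)

# Coefficients quoted in the text for this subset of cells.
coefs = {"computer": -0.22, "video_audio": -0.20}
human = predicted_effect(0.48, coefs, [])               # read by human proctor
computer = predicted_effect(0.48, coefs, ["computer"])  # read by computer
video = predicted_effect(0.48, coefs, ["video_audio"])  # read by video/audio player
```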

A few studies contributed a large number of effect sizes to this meta-analysis. The mean

of the eight effect sizes in Calhoon et al. (2000) was .24, and the mean of the 16 effect sizes in

Olson and Dirir (2010) was .18, both of which were close to the overall mean of .20. Also, both

studies involved multiple accommodation delivery methods and/or multiple subject areas.

However, the mean of the 16 effect sizes in Helwig and Tindal (2003) was only .03, and the

mean of the 14 effect sizes in Helwig et al. (2012) was .00. These two studies focused on read-

aloud accommodations for math tests delivered via video/audio players. In addition, we

performed a sensitivity analysis by removing the effect sizes produced in each of the four studies


one at a time. The only changes we observed were as follows: (1) extra time became significant

in Model 7 (γ = .14, t (89) = 2.04, p < .05) when we removed Helwig and Tindal (2003), and (2)

middle school became nonsignificant in Model 7 (γ = -.10, t (89) = -1.38, p > .05) when we

removed Olson and Dirir (2010). Admittedly, one study contributing a large number of effect

sizes may create a dependence issue and bias the standard errors. However, the sensitivity

analysis shows that our conclusions did not change much as a result of including those studies in

this meta-analysis.
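The sensitivity check described above can be expressed as a leave-one-study-out loop: drop each study's effect sizes in turn and refit. A schematic version with hypothetical study labels and values, where `pool` stands in for any refitting routine (here just an unweighted mean for illustration):

```python
def leave_one_study_out(d, study_ids, pool):
    """Recompute a pooled estimate after removing each study in turn;
    large shifts flag studies that drive the overall results."""
    out = {}
    for sid in sorted(set(study_ids)):
        kept = [di for di, s in zip(d, study_ids) if s != sid]
        out[sid] = pool(kept)
    return out

effects = [0.24, 0.18, 0.03, 0.00, 0.35]   # hypothetical effect sizes
studies = ["A", "A", "B", "B", "C"]        # study each effect size comes from
shifts = leave_one_study_out(effects, studies, pool=lambda x: sum(x) / len(x))
```

In the actual analysis, the refit would also re-estimate the moderator coefficients, which is how changes such as extra time becoming significant after removing one study were detected.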

DISCUSSION

What Are the Effects of Read-Aloud Accommodations for Students With and Without

Disabilities?

According to the differential boost framework, as a result of receiving accommodations, students

with disabilities are expected to obtain a larger increase in their scores compared to students

without disabilities. The results of this meta-analysis support the prediction of this framework.

In both Models 1 and 7, disability status was a statistically significant predictor, indicating that

the effect of read-aloud accommodations for students with disabilities was significantly stronger

than the effect for students without disabilities whether or not we controlled for other predictors.

Specifically, in Model 1, the accommodation effect size was .14 for students without disabilities

and .27 for students with disabilities. Our result does not differ substantially from previous meta-

analysis findings. For instance, Vanchu-Orosco (2012) reported an effect size of .30 for students

with disabilities and .17 for students without disabilities for multiple types of test

accommodations. In Elbaum (2007), the mean effect size was .37 for elementary school students

with disabilities and .10 for secondary school students with disabilities. At present, in the test

accommodation literature for students with disabilities, the categories of small, medium, and


large effects are not clearly defined (Vanchu-Orosco, 2012). Under the general scheme

suggested by Cohen (1992), the differential boost of .13 standard deviation units found in the

present meta-analysis is regarded as small in practical terms.

The interaction hypothesis states that students who need the accommodation should

benefit from it and that students who do not need the accommodation should not benefit from it.

Here, we refer to the estimated effect sizes in Model 7, when grade level, extra time, and

research design were controlled for. As shown in Figure 2, when the subject area was reading,

regardless of the accommodation delivery method, students with disabilities and students without

disabilities both benefited from receiving read-aloud accommodations, with effect sizes ranging

from .26 to .61. When the subject area was math, students with disabilities and students without

disabilities both benefited from read-aloud accommodations provided by human proctors, with

effect sizes of .35 and .22, respectively. However, for math tests, when the accommodation was

provided by a computer or a video/audio player, the effect sizes for students with disabilities

were very small and the effect sizes for students without disabilities were zero or almost zero.

Therefore, the read-aloud accommodations did not always meet the criteria of the interaction

hypothesis.

In summary, except when read-aloud accommodations were provided in math tests via a

computer or a video/audio player, both students with disabilities and without disabilities

benefited from the accommodation, though the effect size was generally greater for students with

disabilities. The fact that students without disabilities may also benefit from read-aloud

accommodations, however, raises a fairness and validity issue (Li & Suen, 2012; Phillips, 1994).

If read-aloud accommodations are only provided to students with disabilities, students without

disabilities may be at a disadvantage because they could have benefited from the


accommodations as well. In other words, the accommodation may even offer students with

disabilities an unfair advantage over students without disabilities. Many studies have addressed

the effects of read-aloud accommodations, and more research is needed to fully understand the

fairness and validity of test accommodations.

Which Factors Are Likely to Influence the Effects of Read-Aloud Accommodations?

As shown in the present meta-analysis, the effect size of read-aloud accommodations for reading

tests was significantly larger than that for math tests whether or not we controlled for other

predictors. Because the read-aloud accommodation directly supports students’ decoding skills, it

is reasonable to expect that such an accommodation supports students’ reading comprehension so

that their performance in reading tests improves. However, as previously discussed, there is

concern regarding whether read-aloud accommodations change the construct of reading tests.

Given this controversy, read-aloud accommodations for reading tests are used with caution,

although the states vary considerably in terms of how much caution they exercise. According to

Wiener and Thurlow (2012), only 16% of the Partnership for Assessment of Readiness for

College and Careers (PARCC) states allow the read-aloud accommodation in state reading tests

with no conditions or consequences for scoring, reporting, and accountability; another 56% allow

such use with conditions; and 20% prohibit it. In terms of math tests, the read-aloud

accommodation helps to remove barriers faced by students with disabilities in reading

comprehension, which is regarded as construct-irrelevant variance in a math test (Elbaum, 2007).

Students, thus, have better opportunities to demonstrate their true math ability, and most states

allow read-aloud accommodations for math tests.

Another important finding of this meta-analysis pertains to the effect of the delivery

method in read-aloud accommodations. Whether or not we controlled for other predictors, read-


aloud accommodations provided by human proctors showed a significantly stronger effect than

those delivered by video/audio players or computers. When human proctors read tests, the actual

procedure cannot be completely standardized. For example, some proctors may read tests in a

way that provides students with clues to the answers (Meloy et al., 2002; Olson & Dirir, 2010).

Or, perhaps students are better able to focus on tasks when tests are read by proctors. Another

reason may be that the video/audio cassette is usually played to the whole class at a

predetermined speed, which may interfere with students’ thinking processes (Hollenbeck, Rozek-

Tedesco, Tindal, & Glasgow, 2000). Typically, when the read-aloud accommodation is delivered

via computers, students can listen to the items at their own pace; however, students need

sufficient training in order to use computerized accommodations (Olson & Dirir, 2010). In this

meta-analysis, only 21 of the 114 effect sizes involved read-aloud accommodations delivered via

computers. With the increasing use of computers in educational tests, more empirical studies are

needed if we are to achieve a deeper understanding of read-aloud accommodations delivered via

computers and other emerging technology (Laitusis et al., 2012). In summary, the read-aloud

delivery method is an influential factor in regard to explaining variations among the

accommodation effect sizes, and great caution should be exercised in deciding how read-aloud

accommodations are delivered.

When we controlled for other predictors in Model 7, the effect of read-aloud

accommodations was significantly stronger for elementary school students than for middle

school students. However, the high school predictor did not reach statistical significance, which

may be because the present meta-analysis did not include sufficient studies pertaining to high

school students. Laitusis et al. (2012) proposed some reasons for the grade-level difference. For

example, this difference arises probably because “reading deficits are not as severe at higher


grade levels or that the decoding requirements are less pronounced at higher grade levels” (p. 7).

Or, in later grades, math tests become more domain-specific and reading is less influential when

more advanced math skills are present. In future research, it would be worthwhile to examine

how students’ reading proficiency and grade level are related to the effects of read-aloud

accommodations.

As shown in Model 5, when extra time was allowed in the accommodated condition, the

effect size of read-aloud accommodations was significantly larger. This result agrees with reports

that extra time leads to improved test performance for all students, and especially for those with

disabilities (Sireci et al., 2005; Zuriff, 2000). When other predictors were controlled for in Model

7, however, extra time became statistically nonsignificant. This could be because only 20 of the

114 comparisons were coded as involving extra time. Still, no matter whether the extra time

predictor was included in the model, the regression coefficients and statistical significance of the

other predictors did not change much. Thus, we can infer that even when extra time is

unavoidably involved in read-aloud accommodations, its confounding effect is trivial.

Compared to the effect size for studies with a repeated measure design, the effect size for

studies with an independent group design was larger, as shown in Model 6. Research design,

however, became statistically nonsignificant when other predictors were included in Model 7.

We also evaluated different combinations of the research design and other predictors and did not

find noticeable changes in the results. The confounding effect of research design, therefore, is

regarded as small. Our results agree with those reported by Vanchu-Orosco (2012), who also

found that the effect size for studies using independent group design was larger than that for

studies using repeated measure design for read-aloud accommodations. However, the number of

studies using independent group design was small in both the present meta-analysis and Vanchu-


Orosco (2012), and thus the finding is likely to be inconclusive. Possible reasons why repeated

measure design studies had a smaller effect size include incomplete counterbalancing of order

effects, students dropping out at later conditions, and regression to the mean.

CONCLUSION, LIMITATIONS, AND FUTURE RESEARCH

As Sireci et al. (2005) put it, “[the] challenge is to implement … accommodations appropriately

and identify which accommodations are best for specific students” (p. 486). Through this meta-

analysis we have identified important patterns pertaining to the effects of read-aloud

accommodations and thus offered a basis for a more appropriate use of read-aloud

accommodations. To conclude, both students with disabilities and students without disabilities

benefited from read-aloud accommodations, and the accommodation effect for students with

disabilities was significantly greater than the effect for students without disabilities. Also,

multiple factors (e.g., disability status, subject area, accommodation delivery method) influence

the effects of read-aloud accommodations simultaneously.

Due to the lack of information for coding in this meta-analysis, we included only

theoretically meaningful predictors that could be reliably coded. First, for math tests, despite the

important role of students’ reading proficiency in read-aloud accommodations, we were not able

to include it as a predictor due to the lack of a universal criterion across studies. The

classification of students with versus without disabilities could act as a partial proxy for low

reading proficiency versus high reading proficiency. Still, it is just an approximation. In fact,

given the close relationship between decoding skills and read-aloud accommodations, if students’

decoding skills were available in those studies, this would have been a more meaningful

predictor than students’ reading proficiency. Second, students’ content knowledge in math may

be confounded with the effects of read-aloud accommodations (Elbaum, 2007; Meloy et al.,


2002); however, we were not able to include content knowledge as a predictor due to the lack of

information on this point. Finally, disability category would have been a meaningful predictor as

well. Because read-aloud accommodations help students decode words, it is reasonable to expect

students with learning disabilities in reading (such as deficiencies in decoding) to benefit more

from such accommodations than students with other categories of disabilities (Crawford &

Tindal, 2004). In future research, it would be worthwhile to conduct studies to test how disability

category interacts with the effects of read-aloud accommodations.

In addition to the predictors we controlled for, there are variations in the tests being used

and in the ways that read-aloud accommodations are practiced. For instance, Helwig and Tindal

(2003) reported that the effects of read-aloud accommodations in a math test were influenced by

the readability of the test items. In a preliminary exploration, we attempted to code whether the

test items were multiple-choice questions, constructed-response questions, or both, hoping that this would

at least partly indicate the readability of the test items. However, only a few studies used tests

involving constructed response questions, and we were not able to include item type as a

predictor. The interaction between test characteristics and read-aloud accommodations, therefore,

is an important issue for further study (Cawthon, Ho, Patel, Potvin, & Trundt, 2009; Ketterlin-

Geller, Yovanoff, & Tindal, 2007). Testing setting, for instance, whether the test is

administered to individuals, to small groups, or to an entire class, was another related factor that

we were not able to include. In future work, it would be advisable for researchers to control for

potentially confounding factors in order to facilitate a better understanding of the effects of read-

aloud accommodations.

A final note is to reflect on the methodological limitations involved in the present meta-

analysis (Berk & Freedman, 2003; Briggs, 2005). As Hunter and Schmidt (2004) warned, the


observed differences between effect sizes are produced in part by some unavoidable artifacts in a

meta-analysis, such as statistical assumptions, instruments with different reliabilities, and coder

reliability. For example, one assumption of the variance-known HLM approach to meta-analysis

is that the included studies are regarded as a random sample drawn from the population.

However, because we do not know the actual population, this assumption is not directly testable.

Also, the variance-known HLM approach to meta-analysis relies on the assumptions underlying

a typical HLM analysis (Raudenbush & Bryk, 2002). Without access to the original raw data,

we cannot directly test these assumptions either. We hope the limitations of the present meta-

analysis can be addressed by well-designed experimental studies and a cumulative meta-analysis

in future work.

REFERENCES

* Indicates articles used in the meta-analysis

Berk, R. A., & Freedman, D. A. (2003). Statistical assumptions as empirical commitments. In T.

G. Blomberg and S. Cohen (Eds.), Law, punishment, and social control: Essays in honor

of Sheldon Messinger (2nd ed., pp. 235–254). Berlin, Germany: Aldine de Gruyter.

Bielinski, J., Thurlow, M., Ysseldyke, J., Freidebach, J., & Freidebach, M. (2001). Read-aloud

accommodation: Effects on multiple-choice reading and math items (Technical Report

31). Minneapolis, MN: University of Minnesota, National Center on Educational

Outcomes.

Bolt, S. E., & Roach, A. T. (2009). Inclusive assessment and accountability: A guide to

accommodations for students with diverse needs. New York, NY: Guilford Press.

Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-

analysis. Chichester, UK: John Wiley & Sons.


Briggs, D. C. (2005). Meta-analysis: A case study. Evaluation Review, 29(2), 87–127.

Briggs, D. C., Ruiz-Primo, M. A., Furtak, E., Shepard, L., & Yin, Y. (2012). Meta-analytic

methodology and inferences about the efficacy of formative assessment. Educational

Measurement: Issues and Practice, 31(4), 13–17.

* Burch, M. (2002). Effects of computer-based test accommodations on the math problem-

solving performance of students with and without disabilities (Unpublished dissertation).

Vanderbilt University, Nashville, TN.

* Calhoon, M. B., Fuchs, L. S., & Hamlett, C. L. (2000). Effects of computer-based test

accommodations on mathematics performance assessments for secondary students with

learning disabilities. Learning Disability Quarterly, 23(4), 271–282.

Cawthon, S. W., Ho, E., Patel, P. G., Potvin, D. C., & Trundt, K. M. (2009) Multiple constructs

and effects of accommodations on accommodated test scores for students with disabilities.

Practical Assessment, Research and Evaluation, 14(18), 1–9.

Chiu, C., & Pearson, P. (1999, June). Synthesizing the effects of test accommodations for special

education and limited English proficiency students. Paper presented at the National

Conference on Large Scale Assessment, Snowbird, UT.

Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155–159.

Cormier, D. C., Altman, J. R., Shyyan, V., & Thurlow, M. L. (2010). A summary of the research

on the effects of test accommodations: 2007–2008 (Technical Report 56). Minneapolis,

MN: University of Minnesota, National Center on Educational Outcomes.

* Crawford, L., & Tindal, G. (2004). Effects of a read-aloud modification on a standardized

reading test. Exceptionality: A Special Education Journal, 12(2), 89–106.


DerSimonian, R., & Laird, N. (1986). Meta-analysis in clinical trials. Controlled Clinical Trials,

7(3), 177–188.

Dolan, R. P., Hall, T. E., Banerjee, M., Chun, E., & Strangman, N. (2005). Applying principles

of universal design to test delivery: The effect of computer-based read-aloud on test

performance of high school students with learning disabilities. The Journal of

Technology, Learning, and Assessment, 3(7), 4–32.

* Elbaum, B. (2007). Effects of an oral testing accommodation on the mathematics performance

of secondary students with and without learning disabilities. The Journal of Special

Education, 40(4), 218–229.

Fischer, R., & Mansell, A. (2009). Commitment across cultures: A meta-analytical approach.

Journal of International Business Studies, 40(8), 1339–1358.

* Fletcher, J. M., Francis, D. J., O’Malley, K., Copeland, K., Mehta, P., Caldwell, C. J.,

Kalinowski, S., Young, V., & Vaughn, S. (2009). Effects of a bundled accommodations

package on high-stakes testing for middle school students with reading disabilities.

Exceptional Children, 75(4), 447–463.

Fuchs, L. S., & Fuchs, D. (1999). Helping teachers formulate sound test accommodation

decisions for students with learning disabilities. Learning Disabilities Research &

Practice, 16(3), 174–181.

* Fuchs, L. S., Fuchs, D., Eaton, S. B., Hamlett, C. L., & Karns, K. M. (2000). Supplementing

teacher judgments of mathematics test accommodations with objective data sources.

School Psychology Review, 29(1), 65–85.


* Geraghty, C. A., & Vanderwood, M. L. (in press). Effects of a mathematics read aloud

accommodation for students with high and low reading skills. Journal of Special

Education.

Glass, G. V. (1977). Integrating findings: The meta-analysis of research. Review of Research in

Education, 5(1), 351–379.

Gregg, N., & Nelson, J. M. (2012). Meta-analysis on the effectiveness of extra time as a test

accommodation for transitioning adolescents with learning disabilities: More questions

than answers. Journal of Learning Disabilities, 45(2), 128–138.

* Harker, J. K., & Feldt, L. S. (1993). A comparison of achievement test performance of

nondisabled students under silent reading and reading plus listening modes of

administration. Applied Measurement in Education, 6(4), 307–320.

Hedges, L. V. (1981). Distribution theory for Glass’s estimator of effect size and related

estimators. Journal of Educational and Behavioral Statistics, 6(2), 107–128.

Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. San Diego, CA:

Academic Press.

* Helwig, R., Rozek-Tedesco, M. A., & Tindal, G. (2002). An oral versus a standard

administration of a large-scale mathematics test. The Journal of Special Education, 36(1),

39–47.

* Helwig, R., Rozek-Tedesco, M. A., Tindal, G., Heath, B., & Almond, P. J. (1999). Problem

solving on multiple-choice tests for sixth-grade students. The Journal of Educational

Research, 93(2), 113–125.

* Helwig, R., & Tindal, G. (2003). An experimental analysis of accommodation decisions on

large-scale mathematics tests. Exceptional Children, 69(2), 211–225.


Hollenbeck, K., Rozek-Tedesco, M. A., Tindal, G., & Glasgow, A. (2000). An exploratory study

of student-paced versus teacher-paced accommodations for large-scale math tests.

Journal of Special Education Technology, 15(2), 27–36.

Hoover, W. A., & Gough, P. B. (1990). The simple view of reading. Reading and Writing: An

Interdisciplinary Journal, 2, 127–160.

Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). New York, NY:

Routledge.

Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis (2nd ed.). Newbury Park, CA:

Sage.

Hunter, J. E., Schmidt, F. L., & Jackson, G. B. (1982). Meta-analysis: Cumulating findings

across research. Beverly Hills, CA: Sage.

Individuals with Disabilities Education Act Amendments of 1997, Pub. L. No. 105–17 (1997).

Retrieved from http://www.ed.gov/offices/OSERS/Policy/IDEA/the_law.html

Individuals with Disabilities Education Improvement Act of 2004, Pub. L. No. 108–446. (2004).

Retrieved from http://idea.ed.gov/explore/view/p/%2Croot%2Cstatute%2C

* Johnson, E. S. (2000). The effects of accommodations on performance assessments. Remedial

and Special Education, 21(5), 261–267.

* Ketterlin-Geller, L. R., Yovanoff, P., & Tindal, G. (2007). Developing a new paradigm for

conducting research on accommodations in mathematics testing. Exceptional Children,

73(3), 331–347.

Kieffer, M. J., Rivera, M., & Francis, D. J. (2012). Practical guidelines for the education of

English language learners: Research-based recommendations for the use of


accommodations in large-scale assessments. 2012 update. Portsmouth, NH: RMC

Research Corporation, Center on Instruction.

* Kosciolek, S., & Ysseldyke, J. E. (2000). Effects of a reading accommodation on the validity of

a reading test (Technical Report 28). Washington, DC: Council of Chief State School

Officers.

Lai, S. A., & Berkeley, S. (2012). High-stakes test accommodations: Research and practice.

Learning Disability Quarterly, 35(3), 158–169.

* Laitusis, C. C. (2010). Examining the impact of audio presentation on tests of reading

comprehension. Applied Measurement in Education, 23(2), 153–167.

Laitusis, C., Buzick, H., Stone, E., Hansen, E., & Hakkinen, M. (2012). Literature review of

testing accommodations and accessibility tools for students with disabilities. Retrieved

from http://www.smarterbalanced.org/wordpress/wp-content/uploads/2012/08/Smarter-

Balanced-Students-with-Disabilities-Literature-Review.pdf

Li, H., & Suen, H. K. (2012). Are test accommodations for English language learners fair?

Language Assessment Quarterly, 9(3), 293–309.

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. London: Sage.

Marsh, H., Bornmann, L., Mutz, R., Daniel, H-D., & O’Mara, A. (2009). Gender effects in the

peer reviews of grant proposals: A comprehensive meta-analysis comparing traditional

and multilevel approaches. Review of Educational Research, 79(3), 1290–1326.

McKevitt, B. C., & Elliott, S. N. (2003). Effects and perceived consequences of using read aloud

and teacher recommended testing accommodations on a reading achievement test. School

Psychology Review, 32(4), 583–600.


* Meloy, L., Deville, C., & Frisbie, D. A. (2002). The effect of a read aloud accommodation on

test scores of students with and without a learning disability in reading. Remedial and

Special Education, 23(4) 248–255.

* Miranda, H., Russell, M., & Hoffmann, T. (2004). Examining the feasibility and effect of a

computer-based read-aloud accommodation on mathematics test performance. Retrieved

from

http://www.bc.edu/research/intasc/researchprojects/enhanced_assessment/PDF/EAP_Rea

dAloud.pdf

Morris, S. B., & DeShon, R. P. (2002). Combining effect size estimates in meta-analysis with

repeated measures and independent-groups designs. Psychological Methods, 7(1), 105–

125.

National Center for Education Statistics. (2011). Digest of education statistics: 2011. Retrieved

from http://nces.ed.gov/programs/digest/d11/index.asp

No Child Left Behind Act of 2001, Pub. L. No. 107–110, 115 Stat. 1425 (2002).

Noortgate, W. V. den, & Onghena, P. (2003). Multilevel meta-analysis: A comparison with

traditional meta-analytical procedures. Educational and Psychological Measurement,

63(5), 765–790.

* Olson, J. F., & Dirir, M. D. (2010). Technical report for studies of the validity of test results for

test accommodations. Washington, DC: Council of Chief State School Officers

(CCSSO).

Phillips, S. E. (1994). High stakes testing accommodations: Validity vs. disabled rights. Applied

Measurement in Education, 7(2), 93–120.


Raudenbush, S. W., & Bryk, A. S. (1985). Empirical Bayes meta-analysis. Journal of

Educational Statistics, 10(2), 75–98.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data

analysis methods (2nd ed.). London: Sage.

Raudenbush, S. W., Bryk, A. S., & Congdon, R. (2004). HLM 6 for Windows [Computer

software]. Lincolnwood, IL: Scientific Software International.

Rogers, C. M., Christian, E. M., & Thurlow, M. L. (2012). A summary of the research on the

effects of test accommodations: 2009–2010 (Technical Report 65). Minneapolis, MN:

University of Minnesota, National Center on Educational Outcomes.

* Schnirman, R. K. (2005). The effect of audiocassette presentation on the performance of

students with and without learning disabilities on a group standardized math test

(Unpublished dissertation). Florida Atlantic University, Boca Raton, FL.

Schulte, A. A. G., Elliott, S. N., & Kratochwill, T. R. (2001). Effects of testing accommodations

on standardized mathematics test scores: An experimental analysis of the performances

of students with and without disabilities. School Psychology Review, 30(4), 527–547.

Sireci, S. G., Scarpati, S. E., & Li, S. (2005). Test accommodations for students with disabilities:

An analysis of the interaction hypothesis. Review of Educational Research, 75(4), 457–

490.

Thurlow, M. (2007, April). Research impact on state accommodation policies for students with

disabilities. Paper presented at the annual meeting of the American Educational Research

Associations, Chicago, IL.

36

Thurlow, M. L., Lazarus, S. S., Thompson, S. J., & Morse, A. B. (2005). State policies on

assessment participation and accommodations for students with disabilities. The Journal

of Special Education, 38(4), 232–240.

Thurlow, M. L., Moen, R. E., Lekwa, A. J., & Scullin, S. B. (2010). Examination of a reading

pen as a partial auditory accommodation for reading assessment. Minneapolis, MN:

University of Minnesota, Partnership for Accessible Reading Assessment.

* Tindal, G. (2002). Accommodating mathematics testing using a videotaped, read-aloud

administration (Research Report 143). Washington, DC: Council of Chief State School

Officers (CCSSO).

Tindal, G., & Fuchs, L. (2000). A summary of research on test changes: An empirical basis for

defining accommodations. Lexington, KY: Mid-South Regional Resource Center.

* Tindal, G., Heath, B., Hollenbeck, K., Almond, P., & Harniss, M. (1998). Accommodating

students with disabilities on large-scale tests: An experimental study. Exceptional

Children, 64(4), 439–450.

Vanchu-Orosco, M. (2012). A meta-analysis of testing accommodations for students with

disabilities: Implications for high-stakes testing (Unpublished dissertation). University of

Denver, Denver, CO.

* Weston, T. J. (2003). The validity of oral accommodation in testing: NAEP validity studies.

Washington, DC: National Center for Education Statistics.

Wiener, D., & Thurlow, M. (2012). Creating accessible PARCC reading assessments:

Separating the constructs and providing text-to-speech accommodations for students with

disabilities. Retrieved from

37

http://www.parcconline.org/sites/parcc/files/PARCCAccessibleReadingAssessmentsPape

rFINAL_0_0.pdf

* Wolf, M. K., Kim, J., & Kao, J. (2012). The effects of glossary and read-aloud

accommodations on English language learners’ performance on a mathematics

assessment. Applied Measurement in Education, 25(4), 347–374.

Zenisky, A. L., & Sireci, S. G. (2007). A summary of the research on the effects of test

accommodations: 2005–2006 (Technical Report 47). Minneapolis, MN: University of

Minnesota, National Center on Educational Outcomes.

Zuriff, G. E. (2000). Extra examination time for students with learning disabilities: An

examination of the maximum potential thesis. Applied Measurement in Education, 13(1),

99–117.

Appendix A. Studies Included

| Reference | Subject area | Delivery method | Grade level | Extra time | Study design | # of comparisons, students without disabilities | # of comparisons, students with disabilities | Categories of disabilities^a |
|---|---|---|---|---|---|---|---|---|
| Burch (2002) | Math | Computer | Elementary | Yes | Repeated measure | 1 | 2 | All the students had reading disabilities or reading and math disabilities. |
| Calhoon, Fuchs, & Hamlett (2000)^b | Math | Human; computer | High | No | Repeated measure | 0 | 8 | All the students had learning disabilities. |
| Crawford & Tindal (2004) | Reading | Video/audio player | Elementary | Yes | Repeated measure | 2 | 1 | 62% of the students had learning disabilities. |
| Elbaum (2007) | Math | Human | Middle; high | No | Independent group | 2 | 2 | All the students had learning disabilities. |
| Fletcher, Francis, O’Malley, et al. (2009)^c | Reading | Human | Middle | No | Independent group | 1 | 1 | 73% of the students had reading disabilities; 27% had dyslexia. |
| Fuchs, Fuchs, Eaton, Hamlett, & Karns (2000) | Math | Human | Elementary | No | Repeated measure | 2 | 2 | All the students had learning disabilities. |
| Geraghty & Vanderwood (in press)^d | Math | Human | Elementary | No | Repeated measure | 2 | 0 | N/A |
| Harker & Feldt (1993) | Reading | Video/audio player | High | Yes | Repeated measure | 3 | 0 | N/A |
| Helwig, Rozek-Tedesco, & Tindal (2002) | Math | Video/audio player | Elementary; middle | No | Repeated measure | 7 | 7 | All the students either had learning disabilities in reading or were recommended to receive read-aloud accommodations. |
| Helwig, Rozek-Tedesco, Tindal, Heath, & Almond (1999) | Math | Video/audio player | Middle | No | Repeated measure | 2 | 0 | N/A |
| Helwig & Tindal (2003)^e | Math | Video/audio player | Elementary; middle | No | Repeated measure | 8 | 8 | Over 70% of the students had learning disabilities. |
| Johnson (2000)^f | Math | Human | Elementary | No | Independent group | 1 | 0 | N/A |
| Ketterlin-Geller, Yovanoff, & Tindal (2007)^g | Math | Computer | Elementary | No | Repeated measure | 4 | 0 | N/A |
| Kosciolek & Ysseldyke (2000) | Reading | Video/audio player | Elementary | Yes | Repeated measure | 1 | 1 | All the students received special education services in reading. |
| Laitusis (2010) | Reading | Video/audio player | Elementary; middle | No | Repeated measure | 2 | 2 | All the students had reading-based learning disabilities. |
| Meloy, Deville, & Frisbie (2002)^h | Reading; math | Human | Middle | Yes | Independent group | 3 | 3 | All the students had learning disabilities in reading. |
| Miranda, Russell, & Hoffmann (2004) | Math | Human; computer | High | No | Independent group | 2 | 2 | All the students had learning disabilities. |
| Olson & Dirir (2010) | Math; reading | Human; computer | Elementary; middle | No | Repeated measure | 8 | 8 | All the students were in special education and were eligible to receive read-aloud accommodations. |
| Schnirman (2005) | Math | Video/audio player | Middle | No | Repeated measure | 2 | 2 | All the students had learning disabilities. |
| Tindal (2002) | Math | Video/audio player | Elementary; middle | Yes | Repeated measure | 2 | 2 | 69% of the students had learning disabilities. |
| Tindal, Heath, Hollenbeck, Almond, & Harniss (1998) | Math | Human | Elementary | No | Independent group | 1 | 1 | Most of the students received special education services in reading or math. |
| Weston (2003) | Math | Human | Elementary | No | Repeated measure | 2 | 2 | All the students had learning disabilities. |
| Wolf, Kim, & Kao (2012)^i | Math | Human | Middle | No | Independent group | 2 | 0 | N/A |

Note. ^a The description is only about students with disabilities in each study. ^b The four comparisons using both a computer and a video were not included. ^c The read-aloud accommodations with two-day administration were not included. Passages were not read; only stems, responses, and proper nouns were read to the students. ^d Students above the 25th percentile on the computation task were included. ^e Only medium and high oral English fluency students were included. ^f Only the comparison between group A (year 1997) and group B (year 1997) was included. ^g Only the higher-ability reader group was included. ^h The usage and expression section was coded as a reading test. ^i Only non-ELL students who had received a read-aloud accommodation were included. As confirmed by the author via personal communication, the non-ELL students did not have disabilities.

Table 1. Variables and Frequencies

| Variable | Coding and notation | Frequency |
|---|---|---|
| Disability status | Students without disabilities (reference group) | 60 |
| | Students with disabilities (W1) | 54 |
| Subject area | Reading (reference group) | 26 |
| | Math (W2) | 88 |
| Delivery method | Read by human proctors (reference group) | 41 |
| | Read by computers (W3) | 21 |
| | Read by video/audio players (W4) | 52 |
| Grade level | Elementary school (< 6th grade) (reference group) | 51 |
| | Middle school (6th to 8th) (W5) | 46 |
| | High school (9th to 12th) (W6) | 17 |
| Extra time | Accommodated condition does not allow more time than the non-accommodated condition (reference group) | 94 |
| | Accommodated condition allows more time than the non-accommodated condition (W7) | 20 |
| Research design | Repeated-measure design (reference group) | 93 |
| | Independent-group design (W8) | 21 |

Table 2. Results of the Models

| | Model 0 | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Model 7 |
|---|---|---|---|---|---|---|---|---|
| Fixed effects | | | | | | | | |
| Intercept | .20*** (.14, .26)^a | .14*** (.07, .22) | .41*** (.30, .52) | .34*** (.25, .43) | .24*** (.15, .32) | .17*** (.10, .23) | .17*** (.10, .23) | .48*** (.33, .64) |
| Disability | | .13* (.01, .24) | | | | | | .13* (.03, .23) |
| Math | | | -.27*** (-.40, -.15) | | | | | -.26*** (-.38, -.14) |
| Computer | | | | -.23** (-.38, -.07) | | | | -.22** (-.36, -.07) |
| Video/audio player | | | | -.22** (-.35, -.10) | | | | -.20** (-.33, -.06) |
| Middle school | | | | | -.10 (-.23, .03) | | | -.15* (-.26, -.03) |
| High school | | | | | .03 (-.14, .21) | | | -.02 (-.18, .14) |
| Extra time | | | | | | .21* (.05, .36) | | .13 (-.01, .27) |
| Independent group design | | | | | | | .19* (.04, .34) | .08 (-.09, .24) |
| Random effects | | | | | | | | |
| Standard deviation | .25387 | .24696 | .23165 | .23206 | .25247 | .25008 | .24635 | .18431 |
| Variance component^b | .06445 | .06099 | .05366 | .05385 | .06374 | .06254 | .06069 | .03397 |
| Degrees of freedom | 113 | 112 | 112 | 111 | 111 | 112 | 112 | 105 |
| Chi-square value | 431.17 | 392.30 | 393.62 | 385.11 | 420.00 | 470.22 | 415.16 | 260.77 |
| Proportion of variance explained | N/A | 5.37% | 16.74% | 16.45% | 1.10% | 2.96% | 5.83% | 47.29% |

Note. * p < .05, ** p < .01, *** p < .001. ^a Numbers in the parentheses are the 95% confidence interval. ^b The variance component is significant at the .001 level across all models.
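The "Proportion of variance explained" row follows directly from the variance components: it is the reduction in the level-2 variance component of each conditional model relative to the unconditional model (Model 0). As an illustrative sketch (not the author's code; values taken from Table 2), the arithmetic is:

```python
# Level-2 variance components (tau) from Table 2, Models 0-7
tau = {0: .06445, 1: .06099, 2: .05366, 3: .05385,
       4: .06374, 5: .06254, 6: .06069, 7: .03397}

def variance_explained(tau_model, tau_unconditional=tau[0]):
    """Proportion of between-effect-size variance explained,
    relative to the unconditional model (Model 0)."""
    return (tau_unconditional - tau_model) / tau_unconditional

for m in range(1, 8):
    print(f"Model {m}: {variance_explained(tau[m]):.2%}")
```

Running this reproduces the bottom row of Table 2 (e.g., 5.37% for Model 1 and 47.29% for the full Model 7).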

Figure 1. Histogram of effect sizes.

Figure 2. Estimated effect sizes.

| Condition | With disability | Without disability |
|---|---|---|
| Reading (human proctor) | 0.61 | 0.48 |
| Reading (computer) | 0.39 | 0.26 |
| Reading (video/audio player) | 0.41 | 0.28 |
| Math (human proctor) | 0.35 | 0.22 |
| Math (computer) | 0.13 | 0.00 |
| Math (video/audio player) | 0.15 | 0.02 |

Note. The effect sizes are based on Model 7, where reference groups are elementary school, no extra time, and repeated-measure design.
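Each estimated effect size in Figure 2 is the sum of the relevant Model 7 fixed effects (intercept .48; disability .13; math -.26; computer -.22; video/audio player -.20), holding the grade-level, extra-time, and design covariates at their reference categories. A minimal sketch of that computation (illustrative only; the function name and rounding are the sketch's own):

```python
# Model 7 fixed effects from Table 2. Reference categories: students without
# disabilities, reading, human proctor, elementary school, no extra time,
# repeated-measure design.
INTERCEPT, DISABILITY = 0.48, 0.13
SUBJECT = {"reading": 0.0, "math": -0.26}
DELIVERY = {"human": 0.0, "computer": -0.22, "video/audio": -0.20}

def estimated_effect(subject, delivery, with_disability):
    """Predicted read-aloud accommodation effect size under Model 7."""
    return round(INTERCEPT
                 + (DISABILITY if with_disability else 0.0)
                 + SUBJECT[subject]
                 + DELIVERY[delivery], 2)

# e.g., reading read by a human proctor, students with disabilities -> 0.61
print(estimated_effect("reading", "human", True))
```

Summing the coefficients this way reproduces all twelve values plotted in Figure 2, from 0.61 (reading, human proctor, with disability) down to 0.00 (math, computer, without disability).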