The Effects of Read-Aloud Accommodations for Students With and Without Disabilities: A Meta-Analysis
Hongli Li
Georgia State University
Correspondence should be addressed to Hongli Li, Georgia State University, Department of
Educational Policy Studies, P.O. Box 3977, Atlanta GA 30303. Email: firstname.lastname@example.org.
This paper has been accepted by Educational Measurement: Issues and Practice. For the final
version, please refer to http://onlinelibrary.wiley.com/doi/10.1111/emip.12027/abstract
Please cite as:
Li, H. (2014, online first). The effects of read-aloud accommodations for students with and
without disabilities: A meta-analysis. Educational Measurement: Issues and Practice. DOI:
10.1111/emip.12027
Read-aloud accommodations have been proposed as a way to help remove barriers faced by
students with disabilities in reading comprehension. Many empirical studies have examined the
effects of read-aloud accommodations; however, the results are mixed. With a variance-known
hierarchical linear modeling approach, based on 114 effect sizes from 23 studies, a meta-analysis
was conducted to examine the effects of read-aloud accommodations for students with and
without disabilities. In general, both students with disabilities and students without disabilities
benefited from the read-aloud accommodations, and the accommodation effect size for students
with disabilities was significantly larger than the effect size for students without disabilities.
Further, this meta-analysis reveals important factors that influence the effects of read-aloud
accommodations. For instance, the accommodation effect was significantly stronger when the
subject area was reading than when the subject area was math. The effect of read-aloud
accommodations was also significantly stronger when the test was read by human proctors than
when it was read by video/audio players or computers. Finally, the implications, limitations, and
directions for future research are discussed.
Keywords: Meta-analysis, read-aloud accommodations, disabilities
According to the National Center for Education Statistics (2011), the proportion of students
with disabilities in K-12 public schools increased from 8.3% in 1976–1977 to 13.1% in 2009–
2010. With the enactment of the No Child Left Behind Act of 2001, the Individuals with
Disabilities Education Act Amendments of 1997, and the Individuals with Disabilities Education
Improvement Act of 2004, schools are required to include students with disabilities in state
testing programs. The aim is to ensure that students with disabilities benefit from standards-
based reforms and achieve high educational standards. However, a major concern is that general
large-scale assessments, which are intended for all students except those who participate in
alternate assessments, were historically developed without consideration of students with
disabilities and thus may pose an additional challenge for them (Dolan,
Hall, Banerjee, Chun, & Strangman, 2005). To ensure that students with disabilities are
appropriately included in state testing programs, test accommodations have been proposed to
level the playing field by removing construct-irrelevant variance caused by disabilities (Fuchs,
Fuchs, Eaton, Hamlett, & Karns, 2000; Lai & Berkeley, 2012). Among the many existing test
accommodation strategies, the read-aloud accommodation is one of the most commonly used for
students with disabilities (Sireci, Scarpati, & Li, 2005). With this accommodation, the test (or
certain parts of it, such as directions, questions, or prompts) is read to students by a teacher or a
device, in addition to the printed text (Thurlow, Moen, Lekwa, & Scullin, 2010). The read-aloud
accommodation is primarily provided to students with learning disabilities (Crawford & Tindal,
2004), and it is thought that students who struggle to decode written texts will benefit from this
accommodation (Bolt & Roach, 2009).
The differential boost framework (Fuchs & Fuchs, 1999) is often used to evaluate the
effects of read-aloud accommodations. In this framework, both students with disabilities and
students without disabilities are expected to benefit from the accommodation; however, students
with disabilities benefit differentially more than students without disabilities. A more strictly
defined version of this framework is the interaction hypothesis (Sireci et al., 2005; Zuriff, 2000),
according to which students who need the accommodation should benefit from it and students
who do not need the accommodation should not benefit from it. The interaction hypothesis is
more stringent in that students without disabilities should not benefit from the accommodation.
Many empirical studies have examined the effects of read-aloud accommodations for students
with disabilities; however, the results are mixed (Elbaum, 2007). Further, it is not clear which
factors influence the heterogeneous effects of read-aloud accommodations. Therefore, a
quantitative synthesis of previous studies is particularly important for providing educators and
policy-makers with solid information about read-aloud accommodations.
The purpose of the present study is to conduct a meta-analysis on the effects of read-
aloud accommodations for students with and without disabilities. Specifically, two research
questions are asked: (1) What are the effects of read-aloud accommodations for students with
and without disabilities? (2) Which factors are likely to influence the effects of read-aloud
accommodations?
According to Thurlow, Lazarus, Thompson, and Morse (2005), there are five major
accommodation categories: (a) timing—alternative test schedules, (b) response—alternative
ways to respond to the assessment, (c) setting—changes to test surroundings, (d) equipment and
materials—the use of additional devices or references, and (e) presentation—alternative ways to
present test materials. Read-aloud accommodations, the focus of this study, present test materials
in an alternative way. The effects of such accommodations are complicated by the involvement
of different students, subject areas, accommodation delivery methods, and other factors (Thurlow
et al., 2005).
Read-aloud accommodations are typically used for math tests with the expectation that
the accommodation will not change the construct being tested. Elbaum (2007) summarized four
types of findings regarding the effects of read-aloud accommodations with math tests. The first
group of studies reported a significantly positive result for students with disabilities, with little or
no effect for students without disabilities (e.g., Tindal, Heath, Hollenback, Almond, & Harniss,
1998). The second group found significantly positive effects for all students, though the effects
were stronger for students with disabilities (e.g., Weston, 2003). The third group showed
significantly positive effects for all students, with no significant difference in regard to the
magnitude of the effects for students with disabilities compared with those without disabilities
(e.g., Meloy, Deville, & Frisbie, 2002). The fourth group found no significant results for either
group of students (e.g., Helwig & Tindal, 2003). In summary, the effects of read-aloud
accommodations for math tests vary considerably (Laitusis, Buzick, Stone, Hansen, & Hakkinen,
2012).
Compared with read-aloud accommodations for math tests, read-aloud accommodations
are much more controversial in the context of reading tests. According to the simple view of
reading, reading comprehension involves two components: decoding and linguistic
comprehension (Hoover & Gough, 1990). Decoding refers to rapidly deriving a representation
from printed input, whereas linguistic comprehension refers to taking lexical information and
deriving sentence and discourse interpretations. Read-aloud accommodations make decoding
words easier, which further facilitates reading comprehension. Many researchers take the
position that providing read-aloud accommodations for a reading test changes the construct being
measured and, therefore, should not be allowed (e.g., Bielinski, Thurlow, Ysseldyke, Freidebach,
& Freidebach, 2001; Phillips, 1994). However, this issue remains controversial. For example, as
Crawford and Tindal (2004) have argued, although providing read-aloud accommodations in a
reading test may change the skills being tested from reading comprehension to listening
comprehension, listening and reading comprehension are so highly correlated that such an
accommodated test still provides information about students’ reading skills. Also, according to
Laitusis (2010), when decoding skills are not considered to be a part of reading comprehension,
reading a reading test aloud does not necessarily change the construct being tested. A number of
studies have focused on using read-aloud accommodations for reading tests, and inconsistent
results have been reported. For instance, Meloy et al. (2002) and McKevitt and Elliott (2003)
found similar gains for students with and without disabilities as a result of receiving read-aloud
accommodations on reading tests. Crawford and Tindal (2004) and Laitusis (2010), however,
found a differential boost from the read-aloud accommodation compared to the
nonaccommodation condition for students with disabilities relative to students without
disabilities. In summary, studies on the use of read-aloud accommodations in reading tests
present mixed findings, and whether we should provide read-aloud accommodations in reading
tests continues to be a controversial issue (Thurlow et al., 2010).
A few studies (e.g., Calhoon, Fuchs, & Hamlett, 2000; Miranda, Russell, & Hoffmann,
2004) have explored whether the effects of the read-aloud accommodation differ depending on
how it is delivered. Often, a human proctor, either a teacher or a test administrator, reads the test
to students (e.g., Elbaum, 2007). In some studies, the test is read to students by a video or audio
player (e.g., Helwig & Tindal, 2003), and in others, the read-aloud accommodation is delivered
via computers (e.g., Burch, 2002). In an experimental study, Calhoon et al. (2000) did not find a
significant difference between the effects of the read-aloud accommodation delivered by a
human proctor and the accommodation delivered by a computer. However, 65% of the students
in that study reported that they preferred receiving accommodations via computers due to the
anonymity this method afforded them. Certainly, it would be interesting to determine whether
method of delivery has any bearing on the effects of read-aloud accommodations.
Researchers have also found that grade level is related to the effect of read-aloud
accommodations. For instance, in Laitusis (2010), the differential boost was greater in grade 4
than in grade 8 for both students with and without disabilities. In a meta-analysis of read-aloud
accommodations in math tests for students with disabilities, Elbaum (2007) found that the
accommodation effect was stronger for students with disabilities than for students without
disabilities at the elementary school level, but the converse was true for secondary school
students. Laitusis et al. (2012) also noted that read-aloud accommodation studies involving either
middle school or high school students showed less effect compared to studies involving
elementary school students. Grade level, therefore, is an important factor in considering the
effects of read-aloud accommodations.
In order to provide read-aloud accommodations, extra time is sometimes allowed in the
accommodated condition, not because this is a purposeful aspect of the accommodation design
but because the accommodation necessitates it (Olson & Dirir, 2010). For instance, extra time
may be needed to turn a video player on and to change a tape. Therefore, when the read-aloud
accommodation shows an effect, it is important to determine whether extra time has confounded
the observed effect (Harker & Feldt, 1993).
Researchers have conducted a number of qualitative reviews on the effects of test
accommodations (e.g., Cormier, Altman, Shyyan, & Thurlow, 2010; Laitusis et al., 2012;
Rogers, Christian, & Thurlow, 2012; Sireci et al., 2005; Tindal & Fuchs, 2000; Zenisky & Sireci,
2007). For example, Sireci et al. (2005) reviewed 59 studies on test accommodations for students
with disabilities, 23 of which used read-aloud accommodations. Despite the mixed results, they
concluded that read-aloud accommodations in math tests appeared to lead to a more valid
interpretation of the math achievement of students with disabilities. In 2012, Laitusis et al.
reviewed test accommodations for students with disabilities. They also found that the read-aloud
accommodation for math tests appeared to be warranted and that the accommodation effects
were influenced by many factors. They further suggested that read-aloud accommodations could
be used for English language arts (ELA) tests “when decoding is not a part of the construct being
measured and in middle school if the read-aloud accommodation is offered without significantly
extending the testing time” (p. 28).
In addition, meta-analysis studies have been performed on test accommodations for
students with disabilities. Chiu and Pearson (1999) conducted a meta-analysis of different types
of test accommodations for both English language learners and students with disabilities. Among
the 40 effect sizes for students with disabilities, only five involved presentation formats (i.e.,
read-aloud accommodations). They found that on average students with disabilities had a score
gain of .16 standard deviation units as a result of receiving test accommodations. However, no
conclusion was drawn specifically about the use of read-aloud accommodations for students with
disabilities. Elbaum (2007) performed a meta-analysis to determine the effects of read-aloud
accommodations in math tests for students with disabilities. In total, 17 studies were included,
published between 1998 and 2003. The effect sizes were examined across grade levels. For
elementary school students, the effect sizes ranged from .10 to .82, whereas for secondary school
students, the effect sizes ranged from -.07 to .30. Recently, Vanchu-Orosco (2012) performed a
meta-analysis of different types of test accommodations for students with disabilities. Based on
119 comparisons from 34 studies conducted and/or published from 1999 to 2011, she concluded
that the effect size of test accommodations for students with disabilities was .30 and the effect
size for students without disabilities was .17. Despite the large scale of this meta-analysis,
Vanchu-Orosco did not specifically study read-aloud accommodations.
As suggested by Zenisky and Sireci (2007), there is a need for more well-constructed
meta-analyses of specific accommodations. Therefore, in the present study, we perform a meta-
analysis to determine the effects of read-aloud accommodations for students with and without
disabilities and also to investigate which factors are likely to influence these effects. The meta-
analysis we propose differs from previous meta-analyses in three major respects. First, our meta-
analysis includes a larger number of read-aloud accommodation studies than previous ones. For
example, we include studies on both math and reading tests, from both published and
unpublished sources. Second, our meta-analysis focuses exclusively on read-aloud
accommodations, so that we are able to consider a larger number of variables to explain the
accommodation effects, such as subject area, accommodation delivery method, grade level, extra
time, and research design. Third, unlike Chiu and Pearson (1999), Elbaum (2007), and Vanchu-
Orosco (2012), our meta-analysis uses the variance-known HLM approach, which is explained in
detail as follows.
Traditionally, researchers have used fixed-effect models for meta-analysis, with the
assumption that the effect size in each study is an estimate of a common effect size of the whole
population of the studies (Hunter, Schmidt, & Jackson, 1982). In contrast, random-effects
models assume that the included studies are random samples drawn from a population of studies,
so that the findings can be generalized beyond the particular studies included in the meta-
analysis (DerSimonian & Laird, 1986). Raudenbush and Bryk (1985, 2002) proposed a two-level
variance-known hierarchical linear modeling (HLM) approach to meta-analysis, which is
regarded as a mixed-effects model (Fischer & Mansell, 2009). It goes beyond the random-effects
approach by testing whether there is systematic variance that can be explained by study
characteristics beyond simple random variation (Lipsey & Wilson, 2001). Subjects are regarded
as nested within the primary studies included in the meta-analysis. The level-1 model
investigates how effect sizes vary across studies, whereas the level-2 model explains the
potential sources of this variation by examining multiple predictors of effect sizes
simultaneously. Using a simulation study, Noortgate and Onghena (2003) have shown that the
variance-known HLM approach generally produces less biased estimates compared to the fixed-
effects approaches, unless the number of studies is small. In the present meta-analysis, we are
particularly interested in discovering factors that influence the effects of read-aloud
accommodations. Therefore, due to its flexibility (Hox, 2010), we chose the variance-known
HLM approach for this meta-analysis. The technical details of this approach are explained in the
Data Analysis section below.
Studies were selected for inclusion in the meta-analysis based on the following criteria. First,
only studies in which a read-aloud accommodation featured as the single test accommodation
strategy for students with disabilities and/or students without disabilities were eligible for
inclusion. Second, to address the issue that studies reporting significant effects are more likely to
be published (Glass, 1977), we considered both published and unpublished studies. Third, only
studies that have an experimental or quasi-experimental design and that present sufficient
information to calculate effect sizes were included. Finally, due to the small number of read-
aloud accommodation studies in the context of science tests, only studies involving math or
reading tests were considered.
The following procedures were used to search for eligible studies. First, using various
combinations of key words and phrases such as “read-aloud,” “oral,” and “test accommodation,”
we searched several well-known online databases, including ERIC, JSTOR, ProQuest, and
PsycINFO. Second, we searched reviews on test accommodations for students with disabilities
and without disabilities (e.g., Chiu & Pearson, 1999; Elbaum, 2007; Laitusis et al., 2012; Rogers
et al., 2012; Sireci et al., 2005; Vanchu-Orosco, 2012) and major journals for relevant articles.
Finally, we reviewed references cited in the studies that we had already determined to be eligible
and added those that we had not already found through other sources. After an initial search, we
located 94 studies, among which 71 studies were excluded because they did not meet our
inclusion criteria. Although we did not specify a time frame, all 94 of the studies we initially
retrieved were published or released after 1990.
Most of the eligible studies involved more than one comparison (or effect size). For
instance, Laitusis (2010) included two groups of students without disabilities and two groups of
students with disabilities, such that this one article generated four comparisons (or effect sizes).
Multiple strategies have been used to address this issue (Marsh, Bornmann, Mutz, Daniel, &
O’Mara, 2009). First, the multiple effect sizes within each study can be averaged, or one effect
size can be selected from each study. As a result, the number of effect sizes is drastically
reduced. Second, this dependence can be modeled by adding a third level to the variance-known
HLM analysis. This approach, however, is constrained by the number of studies included in the
meta-analysis. Third, the dependence can be ignored when it is appropriate to do so. For instance,
as recommended by Borenstein, Hedges, Higgins, and Rothstein (2009, p. 223), when each of the
subgroups in a single study contributes independent information, the “independent subgroups are
no different than independent studies.” Although ignoring the dependence slightly biases
standard errors downward, Marsh et al. (2009) did not find much difference between results of
the method that models dependence as a third level and results of the method that ignores the
dependence. Vanchu-Orosco (2012) also obtained similar results whether she averaged the
multiple effect sizes from a single study or ignored the dependence. Because the samples used to
calculate the four effect sizes were mutually exclusive in Laitusis (2010), we treated each single
comparison from the Laitusis study as the unit of analysis in our meta-analysis. In a similar way,
after a thorough search and screening, we determined that 114 comparisons from 23 studies were
eligible for inclusion in the present meta-analysis. (See Appendix A for a list of the studies
included and their characteristics.) Nevertheless, a single study contributed different numbers of
effect sizes to this meta-analysis, ranging from 1 (e.g., Johnson, 2000) to 16 (e.g., Helwig &
Tindal, 2003). We, therefore, performed a sensitivity analysis to evaluate whether excluding a
particular study with a large number of effect sizes would substantially change the results of the
meta-analysis.
Based on the literature, five variables—disability status, subject area, delivery method, grade
level, and extra time—were identified as closely related to the effects of read-aloud
accommodations. These variables were subsequently used as potential predictors to account for
variations in the effect sizes among the studies. The sample size, mean test score, and standard
deviation for both the experimental groups and the control groups were also extracted in order to
calculate effect size statistics. The author coded the studies according to a coding scheme, after
which a trained graduate assistant used the same coding scheme to code all the studies
independently. A measure of inter-rater reliability, percentage agreement, was calculated for
each coded variable. The author and the graduate assistant had perfect agreement regarding the
codes for disability status, subject area, delivery method, and grade level. The percentage
agreement was 80% for extra time. Disagreements were resolved through discussion until
agreement was reached.
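Percentage agreement is simply the share of items that two raters code identically. A minimal sketch in Python, using hypothetical codes rather than the actual study data:

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items on which two raters assigned the same code."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)

# Hypothetical extra-time codes from two raters for ten studies
rater1 = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "no", "yes"]
rater2 = ["yes", "no", "yes", "yes", "no", "yes", "no", "no", "no", "yes"]
print(percent_agreement(rater1, rater2))  # prints 80.0
```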
First, disability status was coded as “with disabilities” or “without disabilities” based on
information provided in the studies. Across the studies included in this meta-analysis, most
students with disabilities were identified as having learning disabilities (see Appendix A).
Second, subject areas were coded as either math or reading. Third, the methods whereby the
read-aloud accommodations were delivered were coded according to three categories: when the
test was read by a teacher or a test administrator, it was coded as “read by human proctors”;
when read by a video tape player (Crawford & Tindal, 2004) or a cassette player (Harker &
Feldt, 1993), it was coded as “read by video/audio players”; and when read via a computer, it
was coded as “read by computers.” Fourth, grade level was coded according to three categories:
below 6th grade was coded as “elementary school,” 6th to 8th as “middle school,” and 9th to 12th as
“high school.” Fifth, the coding of extra time was less straightforward than the coding for the
other variables. Studies in which extra time was deliberately combined with the read-aloud
accommodation so that a package of accommodations was provided (e.g., Schulte, Elliott, &
Kratochwill, 2001) were excluded from this meta-analysis. We included only studies in which
the read-aloud was offered as a single accommodation strategy and extra time was inevitably
allowed due to practical reasons relating to delivering the read-aloud accommodation (Olson &
Dirir, 2010). When it was specifically stated that the read-aloud accommodated condition
allowed more time than the standard condition, we coded this variable as “yes”; otherwise, we
coded it as “no.” In a few cases, we contacted the authors in order to collect sufficient
information to code this variable.
In addition, we coded the research design of each study. In a study with an independent
group design, typically students were randomly assigned to either the accommodation or the
control condition. Effect size was calculated as the standardized mean difference between the
two groups using the pooled standard deviation of the two groups (raw-score effect size) (Morris
& DeShon, 2002). However, many of the read-aloud accommodation studies that we included
used a repeated measure design (i.e., each student took the test under both conditions, with read-
aloud accommodations and without). Typically, the design was counter-balanced in order to
minimize the order effects. For a study using the repeated-measure design, the correlation
between the pre-test and post-test scores is needed to calculate the effect size (change-score
effect size) (Morris & DeShon, 2002). However, many of the studies using the repeated measure
design did not report this correlation, so that we were not able to calculate the change-score
effect size. We, therefore, calculated the raw-score effect size for both the independent group
design and the repeated measure design studies. This practice was also adopted in other test
accommodation meta-analysis studies, such as Gregg and Nelson (2012) and Kieffer, Rivera, and
Francis (2012). In order to adjust this artifact due to different research designs, we further coded
research design as a dichotomous variable (repeated measure design or independent group
design) and used this variable as one of the level-2 predictors in the subsequent analysis (Briggs,
Ruiz-Primo, Furtak, Shepard, & Yin, 2012; Hox, 2010). The percentage agreement between the
author and the graduate assistant was 83% for this variable. However, in the studies with an
independent group design, it was not always the case that individual students were randomly
assigned to the standard or accommodated condition. Sometimes, the randomization was at the
classroom (e.g., Elbaum, 2007) or school level (e.g., Laitusis, 2010) due to practical constraints.
Data Analysis Using Variance-Known HLM
The meta-analysis was performed following the variance-known HLM approach with the
HLM 6.08 software (Raudenbush, Bryk, & Congdon, 2004). As demonstrated in Raudenbush
and Bryk (2002), dj, the effect size estimate for comparison j, is the standardized mean difference
between an experimental group and a control group. It is defined as
dj = (ȲEj - ȲCj) / Sj, (1)
where
ȲEj is the mean outcome for the experimental group;
ȲCj is the mean outcome for the control group; and
Sj is the pooled within-group standard deviation.
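For readers who want to compute equation (1) directly, a minimal Python sketch follows; the summary statistics are hypothetical, not taken from any included study:

```python
import math

def effect_size(mean_e, mean_c, sd_e, sd_c, n_e, n_c):
    """Standardized mean difference dj = (mean_E - mean_C) / Sj,
    where Sj is the pooled within-group standard deviation."""
    pooled_var = ((n_e - 1) * sd_e ** 2 + (n_c - 1) * sd_c ** 2) / (n_e + n_c - 2)
    return (mean_e - mean_c) / math.sqrt(pooled_var)

# Hypothetical accommodated (E) vs. standard (C) condition summaries
d = effect_size(mean_e=52.0, mean_c=48.0, sd_e=10.0, sd_c=10.0, n_e=30, n_c=30)
print(round(d, 2))  # prints 0.4
```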
According to Hedges (1981), dj can be viewed as a statistic to estimate the corresponding
population effect size. It is approximately unbiased and normally distributed with variance
Vj = (nEj + nCj) / (nEjnCj) + δj² / [2(nEj + nCj)], (2)
where
δj is the corresponding population effect size;
nEj is the experimental-group sample size; and
nCj is the control-group sample size.
The observed dj is used to substitute for δj in equation 2, and Vj is assumed to be known. When
there are at least 20 (Hedges & Olkin, 1985) or 30 (Raudenbush & Bryk, 2002) cases per study,
it is reasonable to assume that the variance Vj can be estimated with sufficient accuracy (Hox,
2010). Hedges (1981) also presented a correction for bias in the calculation of effect sizes when
the sample size of the experimental group, nEj, or that of the control group, nCj, is very small:
Adjusted dj = dj [1 - 3 / (4(nEj + nCj) - 9)]. (3)
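Equations (2) and (3) translate directly into code; the sketch below uses illustrative sample sizes rather than data from the included studies:

```python
def effect_size_variance(d, n_e, n_c):
    """Vj = (nE + nC) / (nE * nC) + d^2 / [2(nE + nC)],
    substituting the observed d for the population effect size."""
    return (n_e + n_c) / (n_e * n_c) + d ** 2 / (2 * (n_e + n_c))

def hedges_correction(d, n_e, n_c):
    """Approximate small-sample bias correction (Hedges, 1981)."""
    return d * (1 - 3 / (4 * (n_e + n_c) - 9))

d, n_e, n_c = 0.40, 15, 15
print(round(effect_size_variance(d, n_e, n_c), 4))  # prints 0.136
print(round(hedges_correction(d, n_e, n_c), 4))     # prints 0.3892
```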
With the variance-known HLM approach, the level-1 outcome variable in the meta-
analysis is the effect size reported for each comparison. When the variation in effect sizes is
statistically significant, level-2 analysis is used to determine the extent to which the predictors
contribute to explaining that variation. As described by Raudenbush and Bryk (2002), the level-1
model (often referred to as the unconditional model) is
dj = δj + ej, (4)
where
δj is the true effect size for comparison j; and
ej is the sampling error associated with dj as an estimate of δj.
Here, we assume that ej ~ N(0, Vj).
In the level-2 model, the true population effect size, δj, depends on comparison
characteristics and a level-2 random error term:
δj = γ0 + γ1W1j + γ2W2j + … + γ8W8j + µj, (5)
where
W1j … W8j are the comparison characteristics predicting δj (see Table 1 for the list of
variables and the corresponding frequencies);
γ0 is the expected overall effect size when each Wij is zero;
γ1 … γ8 are the regression coefficients associated with the comparison characteristics W1j
to W8j; and
µj is a level-2 random error.
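As a rough illustration of how the level-2 coefficients are estimated, the sketch below fits the precision-weighted regression of effect sizes on a single dummy predictor by weighted least squares, with each comparison weighted by 1/Vj. It ignores the level-2 random error µj (a fixed-effects simplification of the mixed-effects model used here), and all values are made up:

```python
# Hypothetical effect sizes dj, known sampling variances Vj, and one
# level-2 dummy predictor Wj (e.g., 1 = students with disabilities)
d = [0.10, 0.35, 0.05, 0.30, 0.15, 0.40]
V = [0.02, 0.03, 0.02, 0.04, 0.03, 0.02]
W = [0, 1, 0, 1, 0, 1]

w = [1.0 / v for v in V]  # precision weights 1/Vj

# Weighted least squares for dj = g0 + g1*Wj + error:
# solve the 2x2 normal equations (X'WX) gamma = X'W d by hand
s_w = sum(w)
s_wx = sum(wi * xi for wi, xi in zip(w, W))
s_wxx = sum(wi * xi * xi for wi, xi in zip(w, W))
s_wd = sum(wi * di for wi, di in zip(w, d))
s_wxd = sum(wi * xi * di for wi, xi, di in zip(w, W, d))

det = s_w * s_wxx - s_wx ** 2
g0 = (s_wxx * s_wd - s_wx * s_wxd) / det  # baseline (Wj = 0) effect size
g1 = (s_w * s_wxd - s_wx * s_wd) / det    # boost for the Wj = 1 group
print(round(g0, 3), round(g1, 3))
```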
Insert Table 1 Here
Based on the procedure described in the methods section, 114 effect sizes from 23 studies were
included in this meta-analysis. The distribution of the effect sizes is illustrated in Figure 1. More
of the effect sizes were positive than negative, and no outliers were detected. The effect sizes
ranged from -.95 to 1.20, with a mean of .20 and a standard deviation of .36. The effect sizes
were approximately normally distributed with a skewness of .23 and a kurtosis of 1.51.
Insert Figure 1 Here
The predictors were entered into the model by category separately and then in a
combined way. Table 2 summarizes the estimated regression coefficients, the 95% confidence
intervals, and the random components of a series of models. Due to space limitations, we did not
refer to confidence intervals in the subsequent sections. Model 0 shows the results when no
predictors were included. The intercept (i.e., the estimated grand-mean effect size) was .20,
which was statistically different from zero (t (113) = 6.70, p < .001). This result indicates that on
average students who received read-aloud accommodations scored about .20 standard deviation
units higher than their non-accommodated peers. Furthermore, the estimated variance of the
effect size was .06, which was significantly different from zero. This suggests that variability
existed in the true effect sizes across comparisons. Therefore, the results show that analysis
should proceed to a level-2 conditional model in order to determine which characteristics explain
this variation.
Insert Table 2 Here
In Model 1, disability status was statistically significant (γ = .13, t (112) = 2.13, p < .05).
Students without disabilities who received read-aloud accommodations scored about .14 standard
deviation units higher than their non-accommodated peers, whereas students with disabilities
who received read-aloud accommodations scored about .27 (i.e., .14 +.13) standard deviation
units higher than their non-accommodated peers. In Model 2, the accommodation effect size for
math tests was significantly smaller than that for reading tests (γ = -.27, t (112) = -4.18, p < .001).
Specifically, students who received a read-aloud accommodation on reading tests scored about
.41 standard deviation units higher than their non-accommodated peers; however, the increase
was only .14 (i.e., .41 - .27) standard deviation units for math tests. In Model 3, both variables
related to accommodation delivery methods were statistically significant. When a
human proctor read the test, students who received read-aloud accommodations scored about .34
standard deviation units higher than their non-accommodated peers, whereas the increase was .11
(i.e., .34 - .23) standard deviation units when the read-aloud was delivered by a computer and .12
(i.e., .34 - .22) standard deviation units when the read-aloud was delivered by a video/audio
player.
In Model 4, for middle school students and for high school students, the effect of read-
aloud accommodations was not significantly different from that for elementary school students.
In Model 5, compared to the effect size when extra time was not evidently provided in the
accommodated condition, the effect size when extra time was provided was significantly larger
by .21 standard deviation units. In Model 6, compared to the effect size in studies with a repeated
measure design, the effect size in studies with an independent group design was significantly
larger by .19 standard deviation units.
In Model 7, we entered all the predictors to investigate their effects
simultaneously. As shown in Table 2, the regression coefficient related to disability status was
significantly positive in both Models 1 and 7. This result indicates that whether or not we
controlled for other predictors, the effect size of receiving read-aloud accommodations was
larger for students with disabilities than for those without disabilities. The effects of subject areas
and accommodation delivery methods were also consistent whether or not other predictors were
included in the model. There were, however, minor variations in regard to the other predictors
across models. The difference between middle schools and elementary schools became
statistically significant in Model 7. Extra time and research design, however, became statistically
non-significant in Model 7. These minor variations indicate a potential interaction among the
predictors, although this interaction is considered slight.
In addition to the regression coefficient, the proportion of variance explained was also
calculated with Model 0 as the baseline model (Raudenbush & Bryk, 2002). As shown in the last
row of Table 2, subject area and accommodation delivery method each explained over 16% of
the variance in effect sizes, followed by research design (5.8%), disability status (5.4%), extra
time (3.0%), and grade level (1.1%). In Model 7, when all the predictors were included, the
proportion of variance explained was 47.3%. Still, the estimated variance of the effect sizes in
this model was .034, which was significantly different from zero (χ2 = 260.77, df = 105, p <
.001). This indicates that unknown sources of variability still exist among the effect sizes beyond what has been accounted for in this meta-analysis.
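The proportion of variance explained in the last row of Table 2 follows the standard Raudenbush and Bryk (2002) formula: the reduction in the estimated level-2 variance relative to the unconditional model. A minimal sketch (the published 47.3% was presumably computed from unrounded variance estimates, so the rounded .06 and .034 quoted in the text give only an approximate figure):

```python
# Proportion of level-2 (true effect size) variance explained,
# relative to the unconditional model (Raudenbush & Bryk, 2002):
#   R^2 = (tau2_baseline - tau2_conditional) / tau2_baseline
def variance_explained(tau2_baseline, tau2_conditional):
    return (tau2_baseline - tau2_conditional) / tau2_baseline

# Rounded estimates quoted in the text: .06 (Model 0), .034 (Model 7).
r2_model7 = variance_explained(0.06, 0.034)  # ~ .43 with rounded inputs
```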
Figure 2 represents the estimated effect sizes in Model 7 when we controlled for grade
level, extra time, and research design. For example, when the subject area was reading, for
students without disabilities, the estimated effect sizes were as follows: .48 when the test was
read by a human proctor, .26 (i.e., .48 - .22) when read by a computer, and .28 (i.e., .48 - .20)
when read by a video/audio player. The estimated effect size for math was calculated in a similar
way. As shown in Figure 2, the estimated accommodation effects varied substantially across
combinations of disability status, subject area, and accommodation delivery method. We discuss
Figure 2 in greater detail in the subsequent section.
Insert Figure 2 Here
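The cell estimates in Figure 2 are sums of the relevant Model 7 coefficients, as in the reading-test example just given. A worked sketch of that arithmetic (the variable names are illustrative, not those used in the original analysis):

```python
# Reconstructing Figure 2 cell estimates for students without
# disabilities on reading tests, from the values quoted in the text.
base_reading_human = 0.48   # estimated effect size, human-proctor delivery
coef_computer = -0.22       # shift when the read-aloud is computer-delivered
coef_video = -0.20          # shift when delivered by a video/audio player

est_computer = round(base_reading_human + coef_computer, 2)  # .26
est_video = round(base_reading_human + coef_video, 2)        # .28
```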
A few studies contributed a large number of effect sizes to this meta-analysis. The mean
of the eight effect sizes in Calhoon et al. (2000) was .24, and the mean of the 16 effect sizes in
Olson and Dirir (2010) was .18, both of which were close to the overall mean of .20. Also, both
studies involved multiple accommodation delivery methods and/or multiple subject areas.
However, the mean of the 16 effect sizes in Helwig and Tindal (2003) was only .03, and the
mean of the 14 effect sizes in Helwig et al. (2002) was .00. These two studies focused on read-
aloud accommodations for math tests delivered via video/audio players. In addition, we
performed a sensitivity analysis by removing the effect sizes produced in each of the four studies
one at a time. The only changes we observed were as follows: (1) extra time became significant
in Model 7 (γ = .14, t (89) = 2.04, p < .05) when we removed Helwig and Tindal (2003), and (2)
middle school became nonsignificant in Model 7 (γ = -.10, t (89) = -1.38, p > .05) when we
removed Olson and Dirir (2010). Admittedly, one study contributing a large number of effect
sizes may create a dependence issue and bias the standard errors. However, the sensitivity
analysis shows that our conclusions did not change much as a result of including those studies in the analysis.
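The sensitivity analysis amounts to a leave-one-study-out loop: drop each heavily contributing study in turn and re-estimate. The sketch below uses hypothetical data and re-pools only the mean effect size; the actual analysis refit the full HLM models after each removal.

```python
# Leave-one-study-out sensitivity check: re-pool the mean effect size
# after dropping each study's effect sizes in turn. Study labels and
# values are hypothetical.
def pooled_mean(effects, variances):
    """Inverse-variance weighted mean of the remaining effect sizes."""
    w = [1.0 / v for v in variances]
    return sum(wi * e for wi, e in zip(w, effects)) / sum(w)

# Effect sizes and known sampling variances, grouped by study.
studies = {
    "A": ([0.30, 0.22], [0.02, 0.02]),
    "B": ([0.05, 0.01, 0.03], [0.03, 0.03, 0.03]),
    "C": ([0.40], [0.05]),
}

sensitivity = {}
for left_out in studies:
    effects, variances = [], []
    for name, (es, vs) in studies.items():
        if name != left_out:
            effects.extend(es)
            variances.extend(vs)
    sensitivity[left_out] = pooled_mean(effects, variances)
```

Dropping the study with the lowest effect sizes raises the pooled mean, and dropping the one with the highest lowers it, which is the pattern such a check is meant to expose.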
What Are the Effects of Read-Aloud Accommodations for Students With and Without Disabilities?
According to the differential boost framework, as a result of receiving accommodations, students
with disabilities are expected to obtain a larger increase in their scores compared to students
without disabilities. The results of this meta-analysis are consistent with this framework.
In both Models 1 and 7, disability status was a statistically significant predictor, indicating that
the effect of read-aloud accommodations for students with disabilities was significantly stronger
than the effect for students without disabilities whether or not we controlled for other predictors.
Specifically, in Model 1, the accommodation effect size was .14 for students without disabilities
and .27 for students with disabilities. Our result does not differ substantially from previous meta-
analysis findings. For instance, Vanchu-Orosco (2012) reported an effect size of .30 for students
with disabilities and .17 for students without disabilities for multiple types of test
accommodations. In Elbaum (2007), the mean effect size was .37 for elementary school students
with disabilities and .10 for secondary school students with disabilities. At present, in the test
accommodation literature for students with disabilities, the categories of small, medium, and
large effects are not clearly defined (Vanchu-Orosco, 2012). If we use a general scheme, as
suggested by Cohen (1992), a differential boost of .13 standard deviation units as found in the
present meta-analysis is regarded as small in practical terms.
The interaction hypothesis states that students who need the accommodation should
benefit from it and that students who do not need the accommodation should not benefit from it.
Here, we refer to the estimated effect sizes in Model 7, when grade level, extra time, and
research design were controlled for. As shown in Figure 2, when the subject area was reading,
regardless of the accommodation delivery method, students with disabilities and students without
disabilities both benefited from receiving read-aloud accommodations, with effect sizes ranging
from .26 to .61. When the subject area was math, students with disabilities and students without
disabilities both benefited from read-aloud accommodations provided by human proctors, with
effect sizes of .35 and .22, respectively. However, for math tests, when the accommodation was
provided by a computer or a video/audio player, the effect sizes for students with disabilities
were very small and the effect sizes for students without disabilities were zero or almost zero.
Therefore, the read-aloud accommodations did not always meet the criteria of the interaction hypothesis.
In summary, except when read-aloud accommodations were provided in math tests via a computer or a video/audio player, students both with and without disabilities
benefited from the accommodation, though the effect size was generally greater for students with
disabilities. The fact that students without disabilities may also benefit from read-aloud
accommodations, however, raises a fairness and validity issue (Li & Suen, 2012; Phillips, 1994).
If read-aloud accommodations are only provided to students with disabilities, students without
disabilities may be at a disadvantage because they could have benefited from the
accommodations as well. In other words, the accommodation may even offer students with
disabilities an unfair advantage over students without disabilities. Many studies have addressed
the effects of read-aloud accommodations, and more research is needed to fully understand the
fairness and validity of test accommodations.
Which Factors Are Likely to Influence the Effects of Read-Aloud Accommodations?
As shown in the present meta-analysis, the effect size of read-aloud accommodations for reading
tests was significantly larger than that for math tests whether or not we controlled for other
predictors. Because the read-aloud accommodation directly supports students’ decoding skills, it
is reasonable to expect that such an accommodation supports students’ reading comprehension so
that their performance in reading tests improves. However, as previously discussed, there is
concern regarding whether read-aloud accommodations change the construct of reading tests.
Given this controversy, read-aloud accommodations for reading tests are used with caution,
although the states vary considerably in terms of how much caution they exercise. According to
Wiener and Thurlow (2012), only 16% of the Partnership for Assessment of Readiness for
College and Careers (PARCC) states allow the read-aloud accommodation in state reading tests
with no conditions or consequences for scoring, reporting, and accountability; another 56% allow
such use with conditions; and 20% prohibit it. In terms of math tests, the read-aloud
accommodation helps to remove barriers faced by students with disabilities in reading
comprehension, which is regarded as construct-irrelevant variance in a math test (Elbaum, 2007).
Students, thus, have better opportunities to demonstrate their true math ability, and most states
allow read-aloud accommodations for math tests.
Another important finding of this meta-analysis pertains to the effect of the delivery
method in read-aloud accommodations. Whether or not we controlled for other predictors, read-
aloud accommodations provided by human proctors showed a significantly stronger effect than
those delivered by video/audio players or computers. When human proctors read tests, the actual
procedure cannot be completely standardized. For example, some proctors may read tests in a
way that provides students with clues to the answers (Meloy et al., 2002; Olson & Dirir, 2010).
Or, perhaps students are better able to focus on tasks when tests are read by proctors. Another
reason may be that the video/audio cassette is usually played to the whole class at a
predetermined speed, which may interfere with students’ thinking processes (Hollenbeck, Rozek-
Tedesco, Tindal, & Glasgow, 2000). Typically, when the read-aloud accommodation is delivered
via computers, students can listen to the items at their own pace; however, students need
sufficient training in order to use computerized accommodations (Olson & Dirir, 2010). In this
meta-analysis, only 21 of the 114 effect sizes involved read-aloud accommodations delivered via
computers. With the increasing use of computers in educational tests, more empirical studies are
needed if we are to achieve a deeper understanding of read-aloud accommodations delivered via
computers and other emerging technology (Laitusis et al., 2012). In summary, the read-aloud
delivery method is an influential factor in regard to explaining variations among the
accommodation effect sizes, and great caution should be exercised in deciding how read-aloud
accommodations are delivered.
When we controlled for other predictors in Model 7, the effect of read-aloud
accommodations was significantly stronger for elementary school students than for middle
school students. However, the high school predictor did not reach statistical significance, which
may be because the present meta-analysis did not include sufficient studies pertaining to high
school students. Laitusis et al. (2012) proposed some reasons for the grade-level difference. For
example, this difference arises probably because “reading deficits are not as severe at higher
grade levels or that the decoding requirements are less pronounced at higher grade levels” (p. 7).
Or, in later grades, math tests become more domain-specific and reading is less influential when
more advanced math skills are present. In future research, it would be worthwhile to examine
how students’ reading proficiency and grade level are related to the effects of read-aloud accommodations.
As shown in Model 5, when extra time was allowed in the accommodated condition, the
effect size of read-aloud accommodations was significantly larger. This result agrees with reports
that extra time leads to improved test performance for all students, and especially for those with
disabilities (Sireci et al., 2005; Zuriff, 2000). When other predictors were controlled for in Model
7, however, extra time became statistically nonsignificant. This could be because only 20 of the
114 comparisons were coded as involving extra time. Still, no matter whether the extra time
predictor was included in the model, the regression coefficients and statistical significance of the
other predictors did not change much. Thus, we can infer that even when extra time is
unavoidably involved in read-aloud accommodations, its confounding effect is trivial.
Compared to the effect size for studies with a repeated measure design, the effect size for
studies with an independent group design was larger, as shown in Model 6. Research design,
however, became statistically nonsignificant when other predictors were included in Model 7.
We also evaluated different combinations of the research design and other predictors and did not
find noticeable changes in the results. The confounding effect of research design, therefore, is
regarded as small. Our results agree with those reported by Vanchu-Orosco (2012), who also
found that the effect size for studies using independent group design was larger than that for
studies using repeated measure design for read-aloud accommodations. However, the number of
studies using independent group design was small in both the present meta-analysis and Vanchu-Orosco (2012), and thus the finding is likely to be inconclusive. Possible reasons why repeated
measure design studies had a smaller effect size include incomplete counterbalancing of order
effects, students dropping out at later conditions, and regression to the mean.
CONCLUSION, LIMITATIONS, AND FUTURE RESEARCH
As Sireci et al. (2005) put it, “[the] challenge is to implement … accommodations appropriately
and identify which accommodations are best for specific students” (p. 486). Through this meta-
analysis we have identified important patterns pertaining to the effects of read-aloud
accommodations and thus offered a basis for a more appropriate use of read-aloud
accommodations. To conclude, both students with disabilities and students without disabilities
benefited from read-aloud accommodations, and the accommodation effect for students with
disabilities was significantly greater than the effect for students without disabilities. Also,
multiple factors (e.g., disability status, subject area, accommodation delivery method) influence
the effects of read-aloud accommodations simultaneously.
Due to the lack of information for coding in this meta-analysis, we included only
theoretically meaningful predictors that could be reliably coded. First, for math tests, despite the
important role of students’ reading proficiency in read-aloud accommodations, we were not able
to include it as a predictor due to the lack of a universal criterion across studies. The classification of students with versus without disabilities could serve as a partial proxy for low versus high reading proficiency, although it is only an approximation. In fact,
given the close relationship between decoding skills and read-aloud accommodations, if students’
decoding skills were available in those studies, this would have been a more meaningful
predictor than students’ reading proficiency. Second, students’ content knowledge in math may
be confounded with the effects of read-aloud accommodations (Elbaum, 2007; Meloy et al.,
2002); however, we were not able to include content knowledge as a predictor due to the lack of
information on this point. Finally, disability category would have been a meaningful predictor as
well. Because read-aloud accommodations help students decode words, it is reasonable to expect
students with learning disabilities in reading (such as deficiencies in decoding) to benefit more
from such accommodations than students with other categories of disabilities (Crawford &
Tindal, 2004). In future research, it would be worthwhile to conduct studies to test how disability
category interacts with the effects of read-aloud accommodations.
In addition to the predictors we controlled for, there are variations in the tests being used
and in the ways that read-aloud accommodations are practiced. For instance, Helwig and Tindal
(2003) reported that the effects of read-aloud accommodations in a math test were influenced by
the readability of the test items. In a preliminary exploration, we attempted to code whether the
test items were multiple-choice, constructed-response questions, or both, hoping that this would
at least partly indicate the readability of the test items. However, only a few studies used tests
involving constructed response questions, and we were not able to include item type as a
predictor. The interaction between test characteristics and read-aloud accommodations, therefore,
is an important issue for further study (Cawthon, Ho, Patel, Potvin, & Trundt, 2009; Ketterlin-
Geller, Yovanoff, & Tindal, 2007). Testing setting, for instance, whether the test is
administered to individuals, to small groups, or to an entire class, was another related factor that
we were not able to include. In future work, it would be advisable for researchers to control for
potentially confounding factors in order to facilitate a better understanding of the effects of read-aloud accommodations.
A final note is to reflect on the methodological limitations involved in the present meta-
analysis (Berk & Freedman, 2003; Briggs, 2005). As Hunter and Schmidt (2004) warned, the
observed differences between effect sizes are produced in part by some unavoidable artifacts in a
meta-analysis, such as statistical assumptions, instruments with different reliabilities, and coder
reliability. For example, one assumption of the variance-known HLM approach to meta-analysis
is that the included studies are regarded as a random sample drawn from the population.
However, because we do not know the actual population, this assumption is not directly testable.
Also, the variance-known HLM approach to meta-analysis relies on the assumptions underlying
a typical HLM analysis (Raudenbush & Bryk, 2002). Without access to the original raw data,
we cannot directly test these assumptions either. We hope the limitations of the present meta-
analysis can be addressed by well-designed experimental studies and a cumulative meta-analysis
in future work.
REFERENCES
* Indicates articles used in the meta-analysis
Berk, R. A., & Freedman, D. A. (2003). Statistical assumptions as empirical commitments. In T.
G. Blomberg and S. Cohen (Eds.), Law, punishment, and social control: Essays in honor
of Sheldon Messinger (2nd ed., 235–254). Berlin, Germany: Aldine de Gruyter.
Bielinski, J., Thurlow, M., Ysseldyke, J., Freidebach, J., & Freidebach, M. (2001). Read-aloud
accommodation: Effects on multiple-choice reading and math items (Technical Report
31). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.
Bolt, S. E., & Roach, A. T. (2009). Inclusive assessment and accountability: A guide to
accommodations for students with diverse needs. New York, NY: Guilford Press.
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-
analysis. Chichester, UK: John Wiley & Sons.
Briggs, D. C. (2005). Meta-analysis: A case study. Evaluation Review, 29(2), 87–127.
Briggs, D. C., Ruiz-Primo, M. A., Furtak, E., Shepard, L., & Yin, Y. (2012). Meta-analytic
methodology and inferences about the efficacy of formative assessment. Educational
Measurement: Issues and Practice, 31(4), 13–17.
* Burch, M. (2002). Effects of computer-based test accommodations on the math problem-
solving performance of students with and without disabilities (Unpublished dissertation).
Vanderbilt University, Nashville, TN.
* Calhoon, M. B., Fuchs, L. S., & Hamlett, C. L. (2000). Effects of computer-based test
accommodations on mathematics performance assessments for secondary students with
learning disabilities. Learning Disability Quarterly, 23(4), 271–282.
Cawthon, S. W., Ho, E., Patel, P. G., Potvin, D. C., & Trundt, K. M. (2009) Multiple constructs
and effects of accommodations on accommodated test scores for students with disabilities.
Practical Assessment, Research and Evaluation, 14(18), 1–9.
Chiu, C., & Pearson, P. (1999, June). Synthesizing the effects of test accommodations for special
education and limited English proficiency students. Paper presented at the National
Conference on Large Scale Assessment, Snowbird, UT.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155–159.
Cormier, D. C., Altman, J. R., Shyyan, V., & Thurlow, M. L. (2010). A summary of the research
on the effects of test accommodations: 2007–2008 (Technical Report 56). Minneapolis,
MN: University of Minnesota, National Center on Educational Outcomes.
* Crawford, L., & Tindal, G. (2004). Effects of a read-aloud modification on a standardized
reading test. Exceptionality: A Special Education Journal, 12(2), 89–106.
DerSimonian, R., & Laird, N. (1986). Meta-analysis in clinical trials. Controlled Clinical Trials, 7(3), 177–188.
Dolan, R. P., Hall, T. E., Banerjee, M., Chun, E., & Strangman, N. (2005). Applying principles
of universal design to test delivery: The effect of computer-based read-aloud on test
performance of high school students with learning disabilities. The Journal of
Technology, Learning, and Assessment, 3(7), 4–32.
* Elbaum, B. (2007). Effects of an oral testing accommodation on the mathematics performance
of secondary students with and without learning disabilities. The Journal of Special
Education, 40(4), 218–229.
Fischer, R., & Mansell, A. (2009). Commitment across cultures: A meta-analytical approach.
Journal of International Business Studies, 40(8), 1339–1358.
* Fletcher, J. M., Francis, D. J., O’Malley, K., Copeland, K., Mehta, P., Caldwell, C. J.,
Kalinowski, S., Young, V., & Vaughn, S. (2009). Effects of a bundled accommodations
package on high-stakes testing for middle school students with reading disabilities.
Exceptional Children, 75(4), 447–463.
Fuchs, L. S., & Fuchs, D. (1999). Helping teachers formulate sound test accommodation
decisions for students with learning disabilities. Learning Disabilities Research &
Practice, 16(3), 174–181.
* Fuchs, L. S., Fuchs, D., Eaton, S. B., Hamlett, C. L., & Karns, K. M. (2000). Supplementing
teacher judgments of mathematics test accommodations with objective data sources.
School Psychology Review, 29(1), 65–85.
* Geraghty, C. A., & Vanderwood, M. L. (in press). Effects of a mathematics read aloud
accommodation for students with high and low reading skills. Journal of Special
Glass, G. V. (1977). Integrating findings: The meta-analysis of research. Review of Research in
Education, 5(1), 351–379.
Gregg, N., & Nelson, J. M. (2012). Meta-analysis on the effectiveness of extra time as a test
accommodation for transitioning adolescents with learning disabilities: More questions
than answers. Journal of Learning Disabilities, 45(2), 128–138.
* Harker, J. K., & Feldt, L. S. (1993). A comparison of achievement test performance of
nondisabled students under silent reading and reading plus listening modes of
administration. Applied Measurement in Education, 6(4), 307–320.
Hedges, L. V. (1981). Distribution theory for Glass’s estimator of effect size and related
estimators. Journal of Educational and Behavioral Statistics, 6(2), 107–128.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. San Diego, CA: Academic Press.
* Helwig, R., Rozek-Tedesco, M. A., & Tindal, G. (2002). An oral versus a standard
administration of a large-scale mathematics test. The Journal of Special Education, 36(1),
* Helwig, R., Rozek-Tedesco, M. A., Tindal, G., Heath, B., & Almond, P. J. (1999). Problem
solving on multiple-choice tests for sixth-grade students. The Journal of Educational
Research, 93(2), 113–125.
* Helwig, R., & Tindal, G. (2003). An experimental analysis of accommodation decisions on
large-scale mathematics tests. Exceptional Children, 69(2), 211–225.
Hollenbeck, K., Rozek-Tedesco, M. A., Tindal, G., & Glasgow, A. (2000). An exploratory study
of student-paced versus teacher-paced accommodations for large-scale math tests.
Journal of Special Education Technology, 15(2), 27–36.
Hoover, W. A., & Gough, P. B. (1990). The simple view of reading. Reading and Writing: An
Interdisciplinary Journal, 2, 127–160.
Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). New York, NY: Routledge.
Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis (2nd ed.). Newbury Park, CA: Sage.
Hunter, J. E., Schmidt, F. L., & Jackson, G. B. (1982). Meta-analysis: Cumulating findings
across research. Beverly Hills, CA: Sage.
Individuals with Disabilities Education Act Amendments of 1997, Pub. L. No. 105–17 (1997).
Retrieved from http://www.ed.gov/offices/OSERS/Policy/IDEA/the_law.html
Individuals with Disabilities Education Improvement Act of 2004, Pub. L. No. 108–446. (2004).
Retrieved from http://idea.ed.gov/explore/view/p/%2Croot%2Cstatute%2C
* Johnson, E. S. (2000). The effects of accommodations on performance assessments. Remedial
and Special Education, 21(5), 261–267.
* Ketterlin-Geller, L. R., Yovanoff, P., & Tindal, G. (2007). Developing a new paradigm for
conducting research on accommodations in mathematics testing. Exceptional Children,
Kieffer, M. J., Rivera, M., & Francis, D. J. (2012). Practical guidelines for the education of
English language learners: Research-based recommendations for the use of
accommodations in large-scale assessments. 2012 update. Portsmouth, NH: RMC
Research Corporation, Center on Instruction.
* Kosciolek, S., & Ysseldyke, J. E. (2000). Effects of a reading accommodation on the validity of
a reading test (Technical Report 28). Washington, DC: Council of Chief State School Officers.
Lai, S. A., & Berkeley, S. (2012). High-stakes test accommodations: Research and practice.
Learning Disability Quarterly, 35(3), 158–169.
* Laitusis, C. C. (2010). Examining the impact of audio presentation on tests of reading
comprehension. Applied Measurement in Education, 23(2), 153–167.
Laitusis, C., Buzick, H., Stone, E., Hansen, E., & Hakkinen, M. (2012). Literature review of
testing accommodations and accessibility tools for students with disabilities. Retrieved
Li, H., & Suen, H. K. (2012). Are test accommodations for English language learners fair?
Language Assessment Quarterly, 9(3), 293–309.
Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. London: Sage.
Marsh, H., Bornmann, L., Mutz, R., Daniel, H-D., & O’Mara, A. (2009). Gender effects in the
peer reviews of grant proposals: A comprehensive meta-analysis comparing traditional
and multilevel approaches. Review of Educational Research, 79(3), 1290–1326.
McKevitt, B. C., & Elliott, S. N. (2003). Effects and perceived consequences of using read aloud
and teacher recommended testing accommodations on a reading achievement test. School
Psychology Review, 32(4), 583–600.
* Meloy, L., Deville, C., & Frisbie, D. A. (2002). The effect of a read aloud accommodation on
test scores of students with and without a learning disability in reading. Remedial and
Special Education, 23(4) 248–255.
* Miranda, H., Russell, M., & Hoffmann, T. (2004). Examining the feasibility and effect of a
computer-based read-aloud accommodation on mathematics test performance. Retrieved
Morris, S. B., & DeShon, R. P. (2002). Combining effect size estimates in meta-analysis with
repeated measures and independent-groups designs. Psychological Methods, 7(1), 105–125.
National Center for Educational Statistics (2011). Digest of education statistics: 2011. Retrieved
No Child Left Behind Act of 2001, Pub. L. No. 107–110, 115 Stat. 1425 (2002).
Noortgate, W. V. den, & Onghena, P. (2003). Multilevel meta-analysis: A comparison with
traditional meta-analytical procedures. Educational and Psychological Measurement,
* Olson, J. F., & Dirir, M. D. (2010). Technical report for studies of the validity of test results for
test accommodations. Washington, DC: Council of Chief State School Officers.
Phillips, S. E. (1994). High stakes testing accommodations: Validity vs. disabled rights. Applied
Measurement in Education, 7(2), 93–120.
Raudenbush, S. W., & Bryk, A. S. (1985). Empirical Bayes meta-analysis. Journal of
Educational Statistics, 10(2), 75–98.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data
analysis methods (2nd ed.). London: Sage.
Raudenbush, S. W., Bryk, A. S, & Congdon, R. (2004). HLM 6 for Windows [Computer
software]. Lincolnwood, IL: Scientific Software International.
Rogers, C. M., Christian, E. M., & Thurlow, M. L. (2012). A summary of the research on the
effects of test accommodations: 2009–2010 (Technical Report 65). Minneapolis, MN:
University of Minnesota, National Center on Educational Outcomes.
* Schnirman, R. K. (2005). The effect of audiocassette presentation on the performance of
students with and without learning disabilities on a group standardized math test
(Unpublished dissertation). Florida Atlantic University, Boca Raton, FL.
Schulte, A. A. G., Elliott, S. N., & Kratochwill, T. R. (2001). Effects of testing accommodations
on standardized mathematics test scores: An experimental analysis of the performances
of students with and without disabilities. School Psychology Review, 30(4), 527–547.
Sireci, S. G., Scarpati, S. E., & Li, S. (2005). Test accommodations for students with disabilities:
An analysis of the interaction hypothesis. Review of Educational Research, 75(4), 457–490.
Thurlow, M. (2007, April). Research impact on state accommodation policies for students with
disabilities. Paper presented at the annual meeting of the American Educational Research
Associations, Chicago, IL.
Thurlow, M. L., Lazarus, S. S., Thompson, S. J., & Morse, A. B. (2005). State policies on
assessment participation and accommodations for students with disabilities. The Journal
of Special Education, 38(4), 232–240.
Thurlow, M. L., Moen, R. E., Lekwa, A. J., & Scullin, S. B. (2010). Examination of a reading
pen as a partial auditory accommodation for reading assessment. Minneapolis, MN:
University of Minnesota, Partnership for Accessible Reading Assessment.
* Tindal, G. (2002). Accommodating mathematics testing using a videotaped, read-aloud
administration (Research Report 143). Washington, DC: Council of Chief State School Officers.
Tindal, G., & Fuchs, L. (2000). A summary of research on test changes: An empirical basis for
defining accommodations. Lexington, KY: Mid-South Regional Resource Center.
* Tindal, G., Heath, B., Hollenbeck, K., Almond, P., & Harniss, M. (1998). Accommodating
students with disabilities on large-scale tests: An experimental study. Exceptional
Children, 64(4), 439–450.
Vanchu-Orosco, M. (2012). A meta-analysis of testing accommodations for students with
disabilities: Implications for high-stakes testing (Unpublished dissertation). University of
Denver, Denver, CO.
* Weston, T. J. (2003). The validity of oral accommodation in testing: NAEP validity studies.
Washington, DC: National Center for Education Statistics.
Wiener, D., & Thurlow, M. (2012). Creating accessible PARCC reading assessments:
Separating the constructs and providing text-to-speech accommodations for students with
disabilities. Retrieved from
* Wolf, M. K., Kim, J., & Kao, J. (2012). The effects of glossary and read-aloud
accommodations on English language learners’ performance on a mathematics
assessment. Applied Measurement in Education, 25(4), 347–374.
Zenisky, A. L., & Sireci, S. G. (2007). A summary of the research on the effects of test
accommodations: 2005–2006 (Technical Report 47). Minneapolis, MN: University of
Minnesota, National Center on Educational Outcomes.
Zuriff, G. E. (2000). Extra examination time for students with learning disabilities: An
examination of the maximum potential thesis. Applied Measurement in Education, 13(1),
Appendix A. Studies Included
[Table of the 23 included studies: study citation and a description of the participating students with disabilities. In most studies, all or a majority of the students had learning disabilities in reading and/or math, or received special education services in those areas.]
Note. a The description is only about students with disabilities in each study.
b The four comparisons using both a computer and a video were not included.
c The read-aloud accommodations with two-day administration were not included. Passages were not read. Only stems, responses, and
proper nouns were read to the students.
d Students above the 25th percentile on the computation task were included.
e Only medium and high oral English fluency students were included.
f Only the comparison between group A (year 1997) and group B (year 1997) was included.
g Only the higher-ability reader group was included.
h The usage and expression section was coded as a reading test.
i Only non-ELL students who had received a read-aloud accommodation were included. As confirmed by the author via a personal
communication, the non-ELL students did not have disabilities.
Table 1. Variables and Frequencies

Variable: Coding and Notation
Disability status:
    Students without disabilities (Reference group)
    Students with disabilities (W1)
Subject area:
    Reading (Reference group)
    Math (W2)
Delivery method:
    Read by human proctors (Reference group)
    Read by computers (W3)
    Read by video/audio players (W4)
Grade level:
    Elementary school (< 6th grade) (Reference group)
    Middle school (6th to 8th grade) (W5)
    High school (9th to 12th grade) (W6)
Extra time:
    Accommodated condition does not allow more time than the non-accommodated condition (Reference group)
    Accommodated condition allows more time than the non-accommodated condition (W7)
Study design:
    Repeated-measures design (Reference group)
    Independent-groups design (W8)
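The dummy-coded moderators in Table 1 (W1 through W8) enter the variance-known meta-regression as predictors of the study effect sizes, with each effect size weighted by the inverse of its known sampling variance. As a minimal illustration only (the effect sizes, variances, and the single moderator below are invented for demonstration and are not the study's data), the fixed-effect coefficients can be estimated by generalized least squares:

```python
import numpy as np

# Hypothetical data: observed effect sizes d with known sampling
# variances v, and one dummy-coded moderator in the style of Table 1
# (W1 = 1 for samples of students with disabilities, 0 otherwise).
d = np.array([0.45, 0.30, 0.62, 0.15, 0.50, 0.28])  # effect sizes
v = np.array([0.02, 0.03, 0.04, 0.02, 0.05, 0.03])  # known variances
W1 = np.array([1, 0, 1, 0, 1, 0])                   # disability dummy

X = np.column_stack([np.ones_like(d), W1])  # intercept + moderator
Vinv = np.diag(1.0 / v)                     # precision weights

# GLS estimate of the fixed effects: gamma = (X'V^-1 X)^-1 X'V^-1 d
gamma = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ d)

# Standard errors from the inverse weighted information matrix
se = np.sqrt(np.diag(np.linalg.inv(X.T @ Vinv @ X)))
```

Here `gamma[0]` estimates the mean effect for the reference group and `gamma[1]` the increment for the W1 = 1 studies; the full model in the paper would include the remaining moderators (W2 through W8) as additional columns of `X`.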
Table 2. Results of the Models
[Table body (coefficient estimates, standard errors, and degrees of freedom for each model) not reproduced here.]
Note. * p < .05, ** p < .01, *** p < .001. a Numbers in parentheses are 95% confidence intervals. b The variance component is significant at the .001 level across all models.
Figure 1. Histogram of effect sizes.
Figure 2. Estimated effect sizes.
Note. The effect sizes are based on Model 7, where the reference groups are elementary school, no extra time, and repeated-measures design.