Language Testing 2007; 24(2): 185–208. DOI: 10.1177/0265532207076363. © 2007 SAGE Publications
The challenges of the Ontario
Secondary School Literacy Test for
second language students
Liying Cheng, Don A. Klinger and Ying Zheng, Queen's University

Address for correspondence: Liying Cheng, Faculty of Education, Queen's University, Kingston, ON K7L 3N6, Canada; email: email@example.com
Results from the Ontario Secondary School Literacy Test (OSSLT) indicate
that English as a Second Language (ESL) and English Literacy Development
(ELD) students have comparatively low success and high deferral rates. This
study examined the 2002 and 2003 OSSLT test performances of ESL/ELD
and non-ESL/ELD students in order to identify and understand the factors
that may help explain why ESL/ELD students failed the test at relatively high
rates. The analyses also attempted to determine if there were signiﬁcant and
systematic differences in ESL/ELD students' test performance. The perform-
ance of ESL/ELD students was consistently and similarly lower across item
formats, reading text types, skills and strategies, and the four writing tasks.
Using discriminant analyses, it was found that the narrative text type, the indirect understanding skill, the vocabulary reading strategy, and the news report writing task were significant predictors of ESL/ELD membership. The
results of this study provide direction for further research and instruction
regarding English literacy achievement for these second language students
within the context of having to complete large-scale English literacy tests
designed and constructed for first language English students.
Over the past twenty years, immigration policies have resulted in an
increasing proportion of immigrants entering educational systems
throughout North America with little or no experience or education in
English. These English as a Second Language (ESL) and English
Literacy Development (ELD)1 students are generally provided with
extra support for only a short period to help them quickly achieve fun-
damental English literacy skills (see, for example, Proposition 203,
Arizona, 2000; Proposition 227, California, 1998; Question 2,
Massachusetts, 2002). Further, such support requires additional school
expenditures that may not be readily available when funding for edu-
cation is being restricted. For example, although the number of
ESL/ELD students in the province of Ontario in Canada increased by
23% in one year alone (2001–2002), the number of ESL/ELD teach-
ers and support programs in Ontario schools has declined by 30% over
the past ﬁve years (Blackett, 2002). However, this situation is not
unique to Ontario, as jurisdictions throughout North America report
increased numbers of second language students and a lack of resources
to support these students (Mueller et al., 2004; Wright, 2005).
This shift in the student population is occurring alongside increas-
ing educational expectations and accountability. The accountability
framework has resulted in increased use of standards-based curricula
and assessments to address the fears of declining standards (Mazzeo,
2001). These accountability frameworks have resulted in increasing
performance expectations throughout North American jurisdictions.
Recent examples include, but are not limited to, the grade 3 Alberta
provincial achievement test, the Ontario Secondary Schools Literacy
Test (OSSLT), and, likely the most extreme example, the ‘No Child
Left Behind' Act (2002). In Alberta, Grade 3 students who fail the provincial achievement test will have to write a supplemental examination in Grade 4. In Ontario, successful completion of the OSSLT, or of the Ontario Secondary School Literacy Course (OSSLC), recently implemented for students who have failed the OSSLT after its first administrations, is a graduation requirement. With 'No Child Left Behind' in the United States, schools and districts have until
2014 to ensure that all students, with few, if any, student exemptions, meet educational expectations. Schools must document their adequate yearly progress, and those with less than satisfactory performance may face a variety of sanctions.

1 The Ontario curriculum (Ministry of Education and Training, 1999) defines ESL students as students who are learning English as the language of instruction, can read and write in their first language (L1), and mostly have had continued schooling before arriving in Canada. ELD students are those who may not read and/or write in their first language (L1) and may have missed years of schooling. They could come from countries where Standard English is the official language but where other varieties of English are in common use, and still others live in communities in Ontario where access to English is limited. We have used the term ESL/ELD as it is used in the Ontario curriculum. ESL/ELD students are treated as a single group of test-takers on the OSSLT, although we recognize the highly heterogeneous characteristics of the group (Cumming et al., 1993). In this paper, the term 'second language students' is also used to refer to the ESL/ELD students in the context of this study.
The conﬂuence of both increased numbers of ESL/ELD students
and this expanding assessment (testing) framework has created a
new and largely unanticipated educational problem – alarmingly
high failure rates for these students (Watt and Roessingh, 2001). In
addition, these large-scale tests are designed and constructed for ﬁrst
language English speakers. Research suggests, however, that these
assessments may have lower reliability and validity for second lan-
guage students and should be interpreted differently (Abedi, 2004;
Abedi et al., 2003). Clearly, these tests result in extra challenges to
the academic success of these second language students and the
teachers who are responsible for their success. The high failure rates
of these students also highlight a systemic and urgent educational
issue that will likely become more problematic as the numbers of
second language students continue to increase and the requirement for all students to be successful becomes increasingly important.
The OSSLT was developed by the Education Quality and
Accountability Ofﬁce (EQAO). The purpose of the Ontario
Secondary School Literacy Test (OSSLT) is to ensure that students
have acquired the essential reading and writing skills that apply to all
subject areas in the provincial curriculum up to the end of Grade 9.
All students in public and private secondary schools who are work-
ing toward an Ontario Secondary School Diploma must complete the
OSSLT, or for those who have not been successful on the OSSLT, the
OSSLC. Given the large geographic size of Ontario and the variability in community settings (urban and rural locations), Ontario education is expected to meet the needs of a very diverse student population.
According to the EQAO, the OSSLT provides 'a useful quality assurance measure that shows the extent to which all Ontario students, regardless of geographic location, curriculum streams, and language of instruction (English or French) are meeting a common, basic standard for literacy across the province' (EQAO, 2002: 1). Since its trial run in 2000, there have been five administrations of the test (February 2002, October 2002, October 2003, October 2004, and October 2005).
The OSSLT is a cross-curricular literacy test consisting of a sepa-
rate 2.5-hour reading and a 2.5-hour writing component, both of
which must be successfully completed for credit to be obtained. The
OSSLT is administered in high schools throughout Ontario during
two specified administration days in October of each year.2 Students
registered in Grade 10 are eligible to write the test, although they
may choose to defer writing the assessment until a subsequent admin-
istration. Students who fail one or both components are also expected
to write the incomplete component(s) in a subsequent administration.
There are two versions of the OSSLT, English and French. The majority of ESL/ELD students in Ontario take the English version, and it is to this version that the implications of this study apply.
The reading component consists of a total of 12 short passages
within three EQAO-defined text types: information3 (50%), graphic
(25%), and narrative (25%). The passages include ﬁctional, non-
ﬁctional, and graphical texts (e.g. graphs, schedules) that vary across
administrations. In response to the passages, the students are
expected to demonstrate their understanding and comprehension
using test items having three different formats: multiple-choice (MC)
(40%), constructed response (CR) (35%), and constructed response
with explanation (CRE) (25%). The CR items require a short answer
and the CRE items require a written explanation, i.e. the students are
asked to justify or explain the thinking behind their answers. The test
items are also designed to measure three reading skills (direct understanding, indirect understanding, and connections) and four reading strategies (vocabulary, syntax, organization, and graphical features).
These skills and strategies are deﬁned by EQAO based on the
Ontario learning expectations. The writing component consists of four writing tasks: a summary, a series of paragraphs expressing an opinion, a news report, and an information paragraph. The purpose and audience for each task are provided (e.g. to report on an event for
the readers of a newspaper). Students are also provided with guide-
lines regarding the length and methods to structure each writing sam-
ple. For example, to complete the news report writing task, the
students are provided with a headline and picture (e.g. school
receives computers as a reward) and are expected to make up the
facts and information to address questions of who, what, where, why,
how using 'the limited space provided' (a page with lines) as a guide-
line for the text length.
2 Reviews of the OSSLT have been conducted by the EQAO, resulting in upcoming changes. Among the changes, separate reading and writing scores are no longer to be used, and the test is shortened from two days to one day (with two 75-minute sessions).
3 Facets of the OSSLT item formats, text types, skills and strategies of reading, and the four writing tasks are italicized in the paper as they are defined by EQAO.
The completed tests are sent to EQAO4 and are centrally marked
by trained teachers and markers. The MC and CR items on the
reading component are scored on a 2-point (0, 2) scale and the CRE
items are scored using item-specific scoring rubrics on a 3-point scale (2 points for correct, 1 point for partly correct, or 0 for incorrect). Each of the four writing tasks is scored using a 4-point (1–4)5 scoring scale with specific performance descriptors. A score of 0 is given for responses classified as 'blank/illegible and irrelevant content/off-task.' A score of 4 is given to the highest quality written responses.

4 The Education Quality and Accountability Office (EQAO) develops, distributes, collects, and scores the OSSLT.
5 A Level 1–4 system is used in all Ontario student report cards for school academic achievement.
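To make the arithmetic behind the percentage scores reported later (e.g. in Table 1) concrete, here is a minimal sketch, under our own assumptions rather than EQAO's actual scoring procedures, of how item-level scores on these scales aggregate into component percentages:

```python
# Hypothetical item-level scores for one student. MC and CR items are
# scored 0 or 2; CRE items are rubric-scored 0, 1, or 2.
scores = {
    "MC":  [2, 2, 0, 2, 2],   # each item out of 2 points
    "CR":  [2, 0, 2],         # each item out of 2 points
    "CRE": [1, 2, 0, 2],      # each item out of 2 points, partial credit possible
}

for fmt, items in scores.items():
    max_points = 2 * len(items)
    pct = 100 * sum(items) / max_points
    print(f"{fmt}: {sum(items)}/{max_points} = {pct:.1f}%")
```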
II The interaction of test method effects and other factors
Not surprisingly, ESL/ELD students ﬁnd it very difﬁcult to succeed
in large-scale English literacy examinations. In addition, evidence
suggests that these difﬁculties extend beyond language abilities.
Taylor and Tubianasa (2001) suggest the interpretation of test results
must include the learner as a context and accommodate learners’
potential from a variety of measures. The content, types and context
of reading passages can have a signiﬁcant impact on students’ read-
ing performance (Anderson et al., 1991; Freedle and Kostin, 1993;
Kobayashi, 2002; Lee, 2002; Peretz and Shoham, 1990; Perkins,
1992). Familiarity with the content of a passage can affect perform-
ance (see also Jennings et al., 1999; Pulido, 2004). For instance,
Peretz and Shoham (1990) pointed out that a group of English as a
Foreign Language (EFL) university students found texts related to
their fields of study more comprehensible than texts related to other
topics. Carrell and Wallace (1983) investigated the individual and
interactive effects of both context and familiarity on the reading
comprehension of both native English and non-native English read-
ers (i.e. second language readers) to see if these two components of
background knowledge – context and familiarity – would interact.
The ﬁndings indicated that native (ﬁrst language) English readers
utilize context as part of a processing strategy to make cognitive pre-
dictions of what a text is going to be about as it is being read, while
nonnative (second language) English readers do not process a text in
this way. They concluded that the more second language students are
familiar with the context, content and types of passages, the better
they would perform in measures of reading comprehension. Carrell
(1983) also argued that non-native English readers do not make the
necessary connections between the text and appropriate background
knowledge as do their native English counterparts.
In addition to test content, item format has also been shown to dif-
ferentially affect test performance (Bachman and Palmer, 1982;
Shohamy, 1983; 1984; Hancock, 1994; Bennett et al., 1991). Various
test item formats are used to assess reading comprehension
(Anderson et al., 1991). Bachman (1990) emphasized the impor-
tance of research into the effects of personal attributes and test
method facets on test performance. He claimed that ‘test developers
will have better information about which characteristics interact with
which test method facets, and should utilize this information in
designing tests that are less susceptible to such effects, that provides
the greatest opportunity for test takers to exhibit their 'best' perform-
ance, and which are hence better and fairer measures of the language
abilities of interest’ (1990: 156). As an example, the construct valid-
ity of multiple-choice (MC) items has been criticized as students
may use test-wiseness to rule out or identify correct answers rather
than the process involved in actual reading comprehension
(Anderson et al., 1991; Bachman, 1990). Second language students
have been shown to employ such strategies to succeed in large-scale
testing situations (Cheng and Gao, 2002). It is also not clear that MC
items measure the comprehension of a given passage, or instead, ‘the
reader’s world knowledge and his or her ability to reason and think
about the contents of a passage’ (Royer, 1990: 162).
Although much attention has been given to the MC item format
(e.g. Freedle and Kostin, 1993; Katz et al., 1990; Royer, 1990), com-
paratively less research has focused on other test item formats (e.g.
constructed response) or the relationship between students’ perform-
ance and test item formats. Anderson et al. (1991) investigated the
relationships among student test-taking strategies, content analyses of
test items, and student scores on these items using a triangulation
approach. They found that by combining these data sources, greater
meaning was obtained about the interactions among learner strategies,
test content and test performance. Kobayashi (2002) focused on three
other variables: text organization/text types (from loosely to tightly organized text, based on Meyer's (1985) model of content structure analysis), item format (cloze, open-ended questions, and summary),
and learners’ English proﬁciency. Her ﬁndings demonstrated that
different item formats, even different types of items within the same
format measured different aspects of reading comprehension. Among
them, summary writing best distinguished learners of different lan-
guage proﬁciency (see also Riley and Lee, 1996). Her ﬁndings also
support the concept of a 'linguistic threshold' (2002: 210); that is, learners below a certain level of proficiency will have difficulty moving beyond sentence-level or literal understanding.
The types of writing tasks can create different challenges for
second language students (e.g. Connor-Linton, 1995a; 1995b;
Hamp-Lyons, 1996; Kobayashi and Rinnert, 1996). In addition, the
difference in second language students’ writing performance may be
caused by the familiarity with certain writing tasks derived from
their ﬁrst language (Connor-Linton, 1995b). Silva (1997) summa-
rized rhetorical, linguistic, conventional, and strategic issues that
would seriously disadvantage second language writers. He argued
that, besides the linguistic aspect, in which second language students exhibit simpler forms of writing, these writers usually demonstrate, at the discourse level, distinct features of exposition, argumentation, and narration which, more often than not, do not meet the expecta-
tions of native English markers. Connor-Linton (1995b: 102)
claimed that different raters’ ‘characterization of the writing’ and
their ‘construction of the text’ reﬂected not only the respective
instructional goals and emphases, but also different societies’ theo-
ries of the uses and values of written English. Given the importance
of examinations such as the OSSLT, it is essential to explore and
understand the factors associated with the differences in literacy
achievement of ESL/ELD and non-ESL/ELD students.
Based on the 2002 results, 63% of the publicly schooled ESL/ELD students who wrote the OSSLT in February 2002 failed
at least one component of the test (EQAO, 2002), as compared to the
population failure rate of 25%. Further, 52% of the eligible
ESL/ELD students deferred writing the examination for at least one
year. The relatively poor performance of ESL/ELD students
indicates the challenging nature of this test for these students and
potentially hinders their academic success and graduation. Further,
in an accountability framework, the performance of these students
will have an increasing inﬂuence on the conclusions drawn regard-
ing education quality and opportunity to learn (Gee, 2003).
It is within this context that we examined ESL/ELD students’
performance in two test administrations (February 2002 and October
2003) of the OSSLT. The purpose of the study was to determine
if there were signiﬁcant and systematic differences across item
formats, reading text types, underlying skills and strategies and the
four writing tasks that may help explain why ESL/ELD students
failed the test at such high rates relative to non-ESL/ELD students.
The research was guided by two research questions. First, what are
the challenges experienced by ESL/ELD students on the OSSLT?
Second, are there signiﬁcant and systematic differences in test per-
formance between ESL/ELD and non-ESL/ELD students in relation
to specific constructs of reading and writing tasks? The comparisons between ESL/ELD students' test performance and that of their non-ESL/ELD counterparts were made from the following perspectives, on the basis of the previous research reviewed above:
1) item formats: multiple-choice, constructed response, and
constructed response with explanations;
2) reading constructs: text types, skills, and strategies and their
associated facets; and
3) writing tasks: a summary, a series of paragraphs expressing an
opinion, a news report, and an information paragraph.
Item-level OSSLT achievement data from the 2002 and 2003 administrations, written by Grade 10 secondary school students across Ontario, were used in this study to address these issues. The data from the two years
were used to cross-validate the ﬁndings. Analyses of the reading com-
ponent focused on item formats, text types, skills and strategies as
deﬁned by EQAO. The analyses of the writing component focused on
the proportion of students obtaining each score level (0–4) on the four
writing tasks. For the 2002 data, a total of 2686 ESL/ELD students
actually wrote the OSSLT (2164 ESL/ELD deferred taking the test in
that year). These students were compared to a stratiﬁed random sam-
ple of non-ESL/ELD students (4068 out of 138 392 students). For the
2003 data, only the 3635 first-time eligible ESL/ELD students, from the total sample of 4311 ESL/ELD students, who wrote both components of the test were included in the analyses. This reduced sample was comparable to the 2002 sample, which likewise included only first-time eligible students and excluded repeating students. This 2003 sample
of ESL/ELD students was compared to a random selection of 5003
ﬁrst-time eligible non-ESL/ELD students.
Item level data were used to obtain measures of achievement sub-
divided by item formats, text types, skills, or strategies for reading
and the four writing tasks. Descriptive statistics and correlations
were ﬁrst calculated to determine the overall distributions of item
level responses, the interrelationships amongst responses, and the
relative performance differences for ESL/ELD and non-ESL/ELD
students. Second, separate discriminant analyses using step-wise
procedures were conducted with ESL/ELD membership as a depend-
ent variable and the reading constructs (item formats, text types,
skills or strategies and their associated sub-constructs) and writing
tasks as independent variables. Discriminant analysis provides a
method to determine which variables best distinguish group mem-
bership. Discriminant analysis also has an advantage over logistic
regression as it provides better protection against multicollinearity
that could exist due to the nature of the constructs being measured
(Huberty, 1994; Tabachnick and Fidell, 1989).
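To illustrate the analytic procedure, the sketch below pairs group-mean t-tests with a discriminant function whose standardized coefficients indicate which facets best separate the groups. This is a sketch only: the data are simulated, the facet names are hypothetical stand-ins for the EQAO-defined facets, and scikit-learn's LDA fits all predictors at once rather than performing the step-wise selection used in the study.

```python
import numpy as np
from scipy import stats
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Simulated percentage scores on three reading facets, one row per student.
esl = rng.normal(loc=[51, 52, 57], scale=15, size=(2686, 3))
non_esl = rng.normal(loc=[71, 72, 78], scale=15, size=(4068, 3))
X = np.vstack([esl, non_esl])
y = np.r_[np.zeros(len(esl)), np.ones(len(non_esl))]  # ESL/ELD membership

# t-tests on each facet (the study used alpha = 0.01).
for j, name in enumerate(["information", "graphic", "narrative"]):
    t, p = stats.ttest_ind(esl[:, j], non_esl[:, j])
    print(f"{name}: t = {t:.1f}, p = {p:.3g}")

# Discriminant analysis with group membership as the dependent variable.
lda = LinearDiscriminantAnalysis().fit(X, y)

# Scale raw coefficients by the predictors' standard deviations so they
# are comparable across facets, analogous to standardized canonical
# discriminant function coefficients.
std_coef = lda.coef_[0] * X.std(axis=0)
print("standardized coefficients:", np.round(std_coef, 3))
```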
The reading results are provided in Table 1. The ﬁrst vertical panel
contains the results for 2002, and the second vertical panel contains
the results for 2003. Descriptive statistics provide the overall distri-
butions of item level responses for ESL/ELD and non-ESL/ELD stu-
dents and illustrate the relative performance differences to answer
the ﬁrst research question of the study. In terms of item format, the
analyses of the 2002 OSSLT data demonstrated that both ESL/ELD and non-ESL/ELD students performed relatively better on the multiple-choice (MC) test items, and that the CRE was the most difficult item format for both groups. For example, ESL/ELD students achieved an
average score of 58.8% (47/80) on the MC items, 54.3% (38/70) on
the CR items, and 42.0% (21/50) on the CRE items. In contrast, non-
ESL/ELD students averaged 77.5% (62/80), 75.7% (53/70), and
62.0% (31/50) on each set of items. ESL/ELD students scored
approximately 20% lower on each format, although the difference
was slightly larger for the CR items. Significance testing using t-tests with alpha set to 0.01 indicated that these differences were sig-
niﬁcant. The three item formats were highly correlated (between
0.80 and 0.88). Using the standardized canonical discriminant func-
tion coefﬁcients, it was found that each of the three formats served
to determine ESL/ELD and non-ESL/ELD group membership. The
discrimination coefﬁcients were similar for each of the three formats
with the constructed response items (CR) appearing to be the best single predictor of ESL/ELD membership (0.420).

Table 1 Percentage scores and discriminant function coefficients for reading item formats, text types, skills and strategies for ESL/ELD and non-ESL/ELD students

                                    2002                                                2003
                                    ESL/ELD   Non-ESL/ELD  Difference  Discriminant    ESL/ELD   Non-ESL/ELD  Difference  Discriminant
                                    (n=2686)  (n=4068)                 coefficient*    (n=3635)  (n=5003)                 coefficient*
Reading item formats
  Multiple choice (MC) (80)         58.8      77.5         18.8        .336            60.0      74.1         14.1        .927
  Constructed response (CR) (70)    54.3      75.7         21.4        .420            58.4      72.7         14.3        .062
  CR with explanations (CRE) (50)   42.0      62.0         20.0        .297            51.6      65.2         13.6        .028
  Wilks' Lambda = .771 (2002)
Reading text types
  Information (90)                  51.1      71.1         20.0        .189            53.9      69.0         15.1        .545
  Graphic (50)                      52.0      72.0         20.0        .158            65.2      75.7         10.5        -.489
  Narrative (60)                    56.7      78.3         21.7        .639            56.1      71.3         15.2        .877
  Wilks' Lambda = .766 (2002), .813 (2003)
Reading skills
  Understanding directly stated ideas and information (60)
                                    60.0      76.7         16.7        -.579           63.3      74.7         11.4        -.303
  Understanding indirectly stated ideas and information (90)
                                    51.1      72.2         21.1        1.191           57.3      72.3         15.0        .963
  Making connections (50)           48.0      68.0         20.0        .327            50.4      65.7         15.3        .323
  Wilks' Lambda = .740 (2002), .832 (2003)
Reading strategies
  Vocabulary (30)                   50.0      73.3         23.3        .956            43.2      67.1         23.9        1.131
  Syntax (30)                       43.3      63.3         20.0        —               48.7      64.3         15.6        .202
  Organization (36)                 52.8      69.4         16.7        -.113           54.0      67.9         13.9        -.136
  Graphic features (24)             50.0      66.7         16.7        .176            68.1      76.0         7.9         .359
  Wilks' Lambda = .738 (2002), .748 (2003)

Note: The scores have been converted into percentages; maximum raw scores appear in parentheses. A dash indicates a facet not retained as a significant discriminator in the step-wise analysis.
*Standardized discriminant coefficient, p < 0.001
Similar performance differences between the two groups were
found for the facets of the reading constructs of text types, skills, and
strategies. All of the differences were found to be signiﬁcant using
t-tests. The information text type, the making connections skill,
and the syntax strategy were the most difﬁcult for both the ESL/ELD
and non-ESL/ELD students. The facets of reading text types, skills
and strategies were also moderately high to highly correlated
(between 0.75 and 0.91). Each of the three text types (information,
graphic, and narrative) served to separate ESL/ELD and non-
ESL/ELD group membership, with narrative being the single best
discriminator (0.639). Each facet of the three reading skills (direct
understanding, indirect understanding, and making connections)
was signiﬁcantly related to ESL/ELD and non-ESL/ELD member-
ship, with indirect understanding being the single best discriminator
(1.191). Three of the four facets of strategies (vocabulary, organiza-
tion, and graphic features) were associated with ESL/ELD and non-
ESL/ELD group membership. Of the three, the vast majority of the
discriminatory power was due to vocabulary (0.956).
The 2003 results produced some notable differences and similari-
ties. Overall, ESL/ELD performance was higher during the 2003
administration. Of particular interest, ESL/ELD students had much
higher performance on the 2003 graphic text type and the strategy
using graphic features relative to the ESL/ELD student reading per-
formance on these facets in 2002; the differences being 13 and 18 per-
cent respectively. At the same time, differences between ESL/ELD
and non-ESL/ELD students were smaller; however, the differences
remained signiﬁcant based on the use of t-tests (alpha ⫽0.01) across
item formats, facets of reading text types, skills and strategies.
Summarizing Table 1, the average performance difference between
ESL/ELD and non-ESL/ELD across all facets of reading text types,
skills and strategies was 19.7% in 2002 and 14.3% in 2003.
The relative difficulty of the item formats and the facets of reading text types, skills, and strategies was similar in 2003 to that found in 2002. Those facets of reading text types, skills, and strategies that were the most difficult in the 2002 test administration were also found to be the most difficult in the 2003 administration. For example, the CRE item format, the information text type, and the making connections skill were the most difficult for both the ESL/ELD and non-ESL/ELD students in 2003, as they were in 2002, with the exception of the strategy facet. Vocabulary turned out to be the most difficult for ESL/ELD students in 2003, while syntax remained the most difficult for non-ESL/ELD students in both years.
The correlations between item formats were similar albeit slightly
higher (0.84 to 0.90). In contrast, the correlations amongst the facets
of reading text types, skills, and strategies were slightly lower (0.66
to 0.87). In terms of group separation, the multiple-choice format
served to best separate the two groups during the 2003 test adminis-
tration, with the other two formats providing little further separation.
With respect to the reading constructs, the 2003 data produced different discriminant coefficients, and the larger Wilks' Lambda values indicate that the discrimination of the ESL/ELD and non-ESL/ELD groups was less pronounced in 2003 compared with the 2002 data. In terms of the text types, narrative remained the single
best discriminator. The reading skill of indirect understanding was
found to similarly distinguish between ESL/ELD and non-ESL/ELD
students in both years. Lastly, each of the four facets of reading strate-
gies was found to separate the ESL/ELD and non-ESL/ELD students
and vocabulary was again the single best predictor.
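For reference, the Wilks' Lambda statistic reported with each discriminant function is the standard ratio of within-group to total variation, so values closer to 1 mean weaker group separation; this is why the larger 2003 values indicate less pronounced discrimination:

$$\Lambda \;=\; \frac{\det(\mathbf{W})}{\det(\mathbf{T})} \;=\; \frac{\det(\mathbf{W})}{\det(\mathbf{W} + \mathbf{B})},$$

where $\mathbf{W}$ and $\mathbf{B}$ are the within-group and between-group sums-of-squares-and-cross-products matrices, and $\mathbf{T} = \mathbf{W} + \mathbf{B}$ is their total.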
The writing results are provided in Table 2 and Table 3. The 2002
results are provided in the ﬁrst vertical panels of each table.
Examining the overall 2002 results, the summary was the most difﬁ-
cult for both the ESL/ELD and non-ESL/ELD students (see Table 2).
Overall, ESL/ELD students performed lower on each of the four
writing tasks, summary, paragraphs expressing an opinion, news
report, and information paragraph, with the largest average differ-
ences occurring for the news report (0.84) and the smallest for the
summary paragraph (0.38). Table 3 contains the proportion of
ESL/ELD and non-ESL/ELD students who obtained each score
point on the four writing tasks. In the 2002 administration, larger
proportions of ESL/ELD students obtained lower scores (0, 1, or 2)
than their non-ESL/ELD counterparts, especially scores of 0 or 2.
Based on signiﬁcance tests using t-tests and an alpha set to 0.01,
these differences were found to represent signiﬁcant differences in
test performance. The correlations amongst the writing tasks were
lower than those found for the facets of reading text types, skills and
strategies (0.35 to 0.53) of the 2002 data. Based on the discriminant
analysis, each of the four tasks provided separation between
ESL/ELD and non-ESL/ELD students, with the news report being
the single best discriminator among the four.
Table 2 Average writing task scores and discriminant coefficients for ESL/ELD and non-ESL/ELD students

                2002 holistic score                                       2003 holistic score
                ESL/ELD   Non-ESL/ELD  Difference  Standardized          ESL/ELD   Non-ESL/ELD  Difference  Standardized
                (n=2686)  (n=4068)                 discriminant          (n=3635)  (n=5003)                 discriminant
Summary         1.94      2.32         0.38        -.134                 2.38      2.36         -0.02       -.544
Opinion         2.15      2.87         0.72        .415                  2.68      2.97         0.29        .332
News report     2.14      2.98         0.84        .578                  2.21      2.73         0.52        .736
Information     2.03      2.77         0.74        .311                  2.24      2.53         0.29        .304
Wilks' Lambda = .857 (2002), .936 (2003)
Examining the 2003 writing results, the overall score averages for the ESL/ELD students were higher on each task than in 2002; however,
these students continued to have signiﬁcantly lower average scores
than the non-ESL/ELD students (see Table 2). The relative difﬁculty
of the writing tasks was different in 2003 for the ESL/ELD students
with both the news report and the information paragraph being more
difﬁcult than the summary task. The widest observed gap between
ESL/ELD and non ESL/ELD students occurred for the news report
(0.52) and the smallest for the summary paragraph (⫺0.02). The
score distributions for the 2003 administration were also somewhat
different from those of the 2002 administration (see Table 3). As with the 2002 administration, a greater percentage of ESL/ELD students obtained lower scores than non-ESL/ELD students. However, with the exception of the non-ESL/ELD students completing the news report task, a smaller percentage of both ESL/ELD and non-ESL/ELD students obtained a score of 0 on the 2003 writing tasks, and a larger percentage of the students in both groups obtained a score of 1; again, the news report was an exception to this trend.

Table 3 Distribution of ESL/ELD vs. non-ESL/ELD students' performance on the four writing tasks

Writing task        2002                          2003
                    ESL/ELD     Non-ESL/ELD       ESL/ELD     Non-ESL/ELD
                    (n=2686)    (n=4068)          (n=3635)    (n=5003)
Summary
  0 points*         30.3%       17.9%             8.7%        12.1%
  1 point           2.5         2.1               8.7         9.0
  2 points          21.9        26.8              26.6        22.2
  3 points          34.0        36.7              47.4        44.6
  4 points          11.3        16.4              8.6         12.1
Opinion
  0 points          22.1%       6.7%              4.7%        2.4%
  1 point           0.7         0.2               7.5         5.3
  2 points          28.5        16.8              16.9        12.6
  3 points          38.2        51.7              57.2        51.9
  4 points          10.6        24.7              13.7        27.8
News report
  0 points          24.7%       6.3%              20.5%       11.6%
  1 point           1.9         0.2               0.2         0.0
  2 points          20.3        11.4              25.2        13.2
  3 points          40.9        52.8              46.2        54.4
  4 points          12.1        29.3              7.9         20.7
Information
  0 points          26.7%       11.2%             7.4%        4.7%
  1 point           2.0         0.3               11.0        6.8
  2 points          22.1        13.1              37.8        32.6
  3 points          39.1        51.0              38.3        42.8
  4 points          9.9         24.3              5.5         13.1

*A score of 0 is awarded for blank/illegible or irrelevant content, or if the response is off-task.
The correlations among the four tasks were similar to, albeit slightly lower than, those found in 2002 (0.34 to 0.50). Based on the discriminant analysis, and similar to the 2002 results, each of the four tasks provided separation between ESL/ELD and non-ESL/ELD students, with the news report again being the single best discriminator among the four. The 2003 data did produce different discriminant coefficients, and the larger Wilks' Lambda value indicates that the discrimination of the ESL/ELD and non-ESL/ELD groups was less pronounced in 2003 compared with the 2002 data.
The current study examined ESL/ELD students’performance in both
the reading constructs and writing tasks of the OSSLT in comparison
with non-ESL/ELD students in order to better understand why
ESL/ELD students fail at a higher rate. The research also examined
if signiﬁcant and systematic performance differences exist between
ESL/ELD and non-ESL/ELD students in relation to the item format,
text types, skills and strategies of reading and the four writing tasks.
This study was conducted over two OSSLT administrations in order to cross-validate the results and reflect a more systematic and accurate account of ESL/ELD students' test performance.
Given that students are required to pass both the reading and writ-
ing components prior to graduation, the high ESL/ELD failure and
deferral rates reported by the EQAO (56% in 2002 for example, see
also EQAO, 2004) illustrate the potentially negative ramiﬁcations for
ESL/ELD students writing the OSSLT. Similarly, the ﬁndings of the
current study also clearly show that ESL/ELD students have a lower
overall level of performance compared to their non-ESL/ELD coun-
terparts across all reading constructs, regardless of item formats, text
types, skills, strategies, and writing tasks. Such ﬁndings are of
increasing concern given that it is estimated that ESL/ELD students
form 20–50% of the general student population in urban K-12 sys-
tems across Canada (Roessingh, 1999). Recent studies conducted in
urban schools indicate that the dropout rates of ESL/ELD students
were as high as 74% (Watt and Roessingh, 1994) and 52% (Norrid-
Lacey and Spencer, 1999). Research has also indicated that high-
stakes, large-scale testing is one of the main factors contributing to such high dropout rates (Catterall, 1989; Madaus and Clarke, 2001; Mehan, 1997; Reardon and Galindo, 2002; Shepard, 1991).
While the differences in average performance across the reading
constructs and writing tasks provide evidence of lower ESL/ELD
performance (see Tables 1, 2, and 3), they also indicate that the dif-
ferences may be declining as the performance differences were
smaller in 2003 as compared to 2002. The fact that more ESL/ELD students passed the OSSLT in 2003 (42% in 2003 vs. 37% in 2002) could indicate greater familiarity with the test content among teachers and students, and therefore potentially better preparation for the test.
Taking a closer look at the group performance differences, there
was a consistent pattern over the two test administrations across item
formats, reading text types, skills, and strategies. The patterns of test
performances of ESL/ELD and non-ESL/ELD students were similar
in each of the item formats. For example, the CRE item format, the information text type, and the making connections reading skill consistently proved to be the most difficult for both groups in 2002 and 2003.
The one exception was that the syntax reading strategy was the most
difﬁcult for ESL/ELD students in 2002 and the vocabulary strategy
was so in 2003 (see Table 1).
The discriminant analysis results for item formats were not consistent over the two years. For the 2002 data, the CR
item format best separated the two groups, whereas the MC item for-
mat did so for the 2003 data. Coupled with the similar differences in
the average scores across item formats, these results indicate that
item format does not provide systematic separation between
ESL/ELD and non-ESL/ELD students. Certainly, these ﬁndings
require further examination. In contrast, the consistent discriminant
analysis results for the reading text types, skills and strategies over
the two test administrations indicate that there were small but sys-
tematic performance differences that could be used to help distin-
guish between ESL/ELD and non-ESL/ELD membership. The most
difﬁcult constructs mentioned above did not necessarily distinguish
group membership. Rather, the narrative reading text type, the indi-
rect understanding reading skill, and the vocabulary reading strategy
best distinguished the groups. These results are theoretically sup-
ported. For example, the narrative genre of reading potentially
requires comprehension of more embedded social, cultural, and con-
ventional meanings. The setting of such narratives tends to be quite
elaborately detailed or implied rather than overtly stated (Emmitt
et al., 2003). Given this, it may take longer for ESL/ELD students to
achieve such levels of reading comprehension compared with other
types of reading (Cummins, 1982). Information and graphic text types are more dependent on a formatted text structure, such as a museum or bus schedule, and are therefore less dependent on the interpretation of embedded meanings (see Kobayashi, 2002). In
addition, ESL/ELD students may have less exposure to narrative
types of reading compared with information and graphic types in the
school settings (Carrell and Wallace, 1983).
In terms of reading skills, indirect understanding provided the
best discrimination, arguably the most cognitively demanding of the
three measured reading skills. These results point to a potentially
unique characteristic of reading development for ESL/ELD students
and the development of higher learning skills (see Chan et al., 2002).
However, more evidence needs to be gathered through systematic item analysis or test item review with both ESL/ELD and non-ESL/ELD students to investigate how these students understand the requirements of the items and how these skills interact with their test performance.
The pattern for reading strategies demonstrated that vocabulary
best separated the two groups. Again these results are supported by
many research studies conducted in second/foreign language research
(e.g. Qian and Schedl, 2004; Hulstijn and Laufer, 2001; Read, 2000;
2004). Vocabulary is a key attribute of the developmental learning process and an indicator of language proficiency for second language students, for whom both incidental and intentional vocabulary learning are important. In addition, vocabulary growth for ESL students is achieved most efficiently through increased exposure to an English medium of instruction over time, so that ESL/ELD students can eventually be on a par with their native English-speaking counterparts.
As to the differences in the four writing tasks, the overall perform-
ance varied over the two administrations. For 2002, ESL/ELD stu-
dents performed least well on the summary task, whereas in 2003,
they performed least well on the news report task. The performance
in the summary task for non-ESL/ELD students remained the same
(most difﬁcult among the four tasks) in both years. Such variation
could have many causes, including variability of scoring or
contextual changes in the tasks. However, such causes seemed to
affect ESL/ELD students alone. Further, ESL/ELD student writing
performance was higher in the 2003 administration of the OSSLT.
These increases may be due to more speciﬁc coaching or tutoring of
ESL/ELD students in schools associated with increasing teacher and
student familiarity of the OSSLT. Such changes may also be due to
an evolving scoring process that provides raters with better training
on how to score the writing tasks, resulting in a systematic difference
in the manner in which the writing of ESL/ELD students was scored.
Given the importance of the writing component of the OSSLT, fur-
ther examination of factors such as raters, tasks, or context that may
affect ESL/ELD student writing performance is warranted (e.g.
Connor-Linton, 1995b; Silva, 1997).
Regardless, the news report writing task best separated the two
groups in both administrations. This task requires students to use a
picture prompt and accompanying headline, a common activity for
non-ESL/ELD students. Given the importance of contextual knowl-
edge (Cummins, 1982; 1996), such visual stimuli and context may
be less familiar to some ESL/ELD students who recently came to
Ontario secondary schools, making the task relatively more difﬁcult
for these students. If this is true, it is possible that these students may
still need to adjust to a very different media literacy environment in
order to be successful on such tasks. In this sense, a large-scale test
such as the OSSLT, designed and constructed for the non-ESL/ELD
student population, will inevitably cause certain potential problems
for ESL/ELD students.
The combined results from both the 2002 and 2003 OSSLT data
analyses have provided evidence for understanding the challenges
experienced by ESL/ELD students on the OSSLT and for exploring
any signiﬁcant and systematic differences in test performance
between ESL/ELD and non-ESL/ELD students in relation to speciﬁc
constructs of reading and writing tasks. While the relative perform-
ance differences in both years were similar across reading text types,
skills, and strategies, and the four writing tasks, the best predictors
of ESL/ELD membership were generally those supported by previ-
ous second/foreign language research. However, inconsistent results
were found regarding reading item formats over the two years, which
indicates the complexity of test method facets (Bachman, 1990).
While the tests are equated across years and the test constructs and
the test formats remained the same, there is not sufficient evidence to demonstrate that the two tests (2002 and 2003) were equivalent. Hence
variability in speciﬁc reading constructs or writing tasks may have
been due to these differences. Further examination into the OSSLT
data from future administrations could be used to determine if these
results are found across subsequent administrations. Other analytic
procedures such as Differential Item or Bundle Functioning may also
provide insights regarding systematic performance differences for
speciﬁc items or sets of related items (bundles).
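As a pointer to what such an analysis involves, the sketch below shows the common logistic-regression DIF procedure for one dichotomous item, comparing nested models for uniform and non-uniform DIF. This is a generic illustration with simulated data, not an analysis of the OSSLT data, and the built-in DIF effect is purely hypothetical.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 2000
group = rng.integers(0, 2, n)            # 0 = non-ESL/ELD, 1 = ESL/ELD (hypothetical)
total = rng.normal(60 - 10 * group, 15)  # matching criterion: total reading score

# Simulate one dichotomous item with built-in uniform DIF against group 1.
logit = -6 + 0.1 * total - 0.8 * group
item = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Nested logistic models: total score only; plus group (uniform DIF);
# plus group x total interaction (non-uniform DIF).
X1 = sm.add_constant(np.column_stack([total]))
X2 = sm.add_constant(np.column_stack([total, group]))
X3 = sm.add_constant(np.column_stack([total, group, total * group]))

m1 = sm.Logit(item, X1).fit(disp=0)
m2 = sm.Logit(item, X2).fit(disp=0)
m3 = sm.Logit(item, X3).fit(disp=0)

# Likelihood-ratio tests flag DIF after conditioning on ability.
print("uniform DIF LR chi2:", 2 * (m2.llf - m1.llf))
print("non-uniform DIF LR chi2:", 2 * (m3.llf - m2.llf))
```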
Furthermore, the ﬁndings inform ESL/ELD educators of the item
formats, reading text types, skills and strategies, and writing tasks that
are found to be most problematic for these students. Based on these
results, it appears that ESL/ELD students require a greater under-
standing of the cultural and contextual aspects of reading and writing
as characterized by narrative texts. Such a focus should also be com-
bined with increased emphasis on vocabulary and understanding of
nuance (indirect understanding). Similarly, ESL/ELD students need
continued and focused instruction on writing. Based on the writing
results, the continued focus on cultural and contextual aspects (using
visuals depicting Canadian activities as in the news report tasks) may
provide the most beneﬁt for these second language students.
We hope to use these results, as a ﬁrst step, to begin to understand
not only the areas of challenges in reading and writing that ESL/ELD
students are facing in successfully completing the OSSLT, but also
the potential validity issues of such high-stakes testing programs for
these students. Such an understanding may help create targeted sup-
port to these students so that construct irrelevant factors and negative
ramiﬁcations of such testing can be minimized. These results can
also inform us about the unique aspects of literacy performance of
the increasing ESL/ELD population in Ontario. Further, these ﬁnd-
ings also provide direction for further research and instruction
regarding English literacy development for these second language
students within the context of having to complete large-scale English
literacy tests designed and constructed for ﬁrst language students.
The results suggest the Ontario Secondary School Literacy Test is
not only an English literacy test (an academic content test), but also
a language proﬁciency test for ESL/ELD students. Such a testing
practice deserves serious ethical consideration (see Bailey and
Butler, 2004; Shepard, 1991; Shohamy, 1997).
As evidenced by current educational policies, using large-scale,
high-stakes examinations as high school graduation requirements is an
increasingly common practice in many parts of the world including
European and Asian countries. However, completing a high-stakes lit-
eracy examination in a second language, as in the case of the OSSLT,
provides a unique challenge to students whose English language pro-
ﬁciency is not on par with their native English-speaking counterparts.
This challenge is more prominent in English-speaking countries such
as the USA, Canada, the UK and Australia, where there is a large second lan-
guage student population in schools. In this sense, this study con-
tributes to the limited, yet increasing literature on how well immigrant
children do in schools in the context of system accountability (see also
Christian et al., 2005). The study also helps to highlight the ongoing
issues associated with these accountability frameworks. These juris-
dictions are often faced with the dilemma of ensuring literacy stan-
dards while also meeting accountability frameworks.
In attempts to be fair to all students, most high-stakes testing pro-
grams provide test accommodations to second language students, for
example, extra time or instructions in students’ ﬁrst languages.
However, such accommodations may not be sufﬁcient, resulting in
new policies designed to address the negative consequences of these
testing programs. The Ontario Secondary School Literacy Course
was implemented to provide a graduation mechanism for students
unable to pass the Ontario Secondary School Literacy Test. The
timeline for the ‘No Child Left Behind’ act has been extended to
2014. Further, minimum sizes for sub-groups have been imple-
mented. Hence schools with fewer than 30 students in a sub-group
would not be required to track their progress. In the case of Arizona,
this policy kept 680 schools from being designated as 'Failing'
under the No Child Left Behind Act (Kossan, 2004). While pursuing
methods to ensure graduation or avoid sanctions, are the required
English literacy skills of these immigrant children being obtained or
are policy makers resigned to the belief that the school system cannot adequately support literacy achievement for these students? The
current study has provided evidence, by understanding the system-
atic performance differences of these students, that it may be possi-
ble to implement educational practices that will truly support the
literacy development and achievement of ESL/ELD students.
Acknowledgements

The authors acknowledge the support from the Social Sciences and Humanities Research Council (SSHRC) of Canada, and from the Education Quality and Accountability Office (EQAO) for releasing
the February 2002 and October 2003 OSSLT data for this study. We
would particularly like to thank the three anonymous reviewers
whose detailed and constructive comments helped us to make the
paper stronger and more succinct.
References

Abedi, J. 2004: The No Child Left Behind Act and English language learners:
Assessment and accountability issues. Educational Researcher 33(1): 4–14.
Abedi, J., Leon, S. and Mirocha, J. 2003: Impact of students' language back-
ground on content-based assessment: Analyses of extant data (CSE.
Tech. Rep. No. 603). Los Angeles: University of California, National
Centre for Research on Evaluation, Standards, and Students Testing.
Anderson, N.J., Bachman, L.F., Perkins, K. and Cohen, A. 1991: An
exploratory study into the construct validity of a reading comprehension
test: Triangulation of the data sources. Language Testing 8: 41–66.
Bachman, L.F. 1990: Fundamental considerations in language testing.
Oxford: Oxford University Press.
Bachman, L.F. and Palmer, A.S. 1982: The construct validation of some com-
ponents of communicative proﬁciency. TESOL Quarterly 16: 449–65.
Bailey, A.L. and Butler, F.A. 2004: Ethical considerations in the assessment of
the language and content knowledge of U.S. school-age English learners.
Language Assessment Quarterly 1(2–3): 177–93.
Bennett, R.E., Rock, D. and Wang, M. 1991: Equivalence of free-response
and multiple choice items. Journal of Educational Measurement 28:
Blackett, K. 2002: Ontario schools losing English as a second language pro-
grams – despite increase in immigration. Retrieved on 28 July 2004 from
Carrell, P.L. 1983: Three components of background knowledge in reading
comprehension. Language Learning 33: 183–203.
Carrell, P.L. and Wallace, B. 1983: Background knowledge: Context and
familiarity in reading comprehension. Paper presented at the annual con-
vention of teachers of English to speakers of other languages, Honolulu,
HI. (ERIC document reproduction service No. ED228901).
Catterall, J.S. 1989: Standards and school dropouts: A national study of tests
required for high school graduation. American Journal of Education
Chan, C.C., Tsui, M.S., Chan, M.Y. and Hong, J.H. 2002: Applying the
structure of the observed learning outcomes (SOLO) taxonomy on stu-
dent’s learning outcomes: an empirical study. Assessment and Education
in Higher Education 27: 511–27.
Cheng, L. and Gao, L. 2002: Passage dependence in standardized reading
comprehension: Exploring the College English Test. Asian Journal of
English Language Teaching 12: 161–78.
Christian, D., Saunders, B., Genesee, F., Lindholm-Leary, K. and
Goldenberg, C. 2005: Educating English language learners: A synthe-
sis of research evidence. Symposium presented at the American
Educational Research Association, Montreal, Quebec, Canada.
Connor-Linton, J. 1995a: Looking behind the curtain: What do L2 composi-
tion ratings really mean? TESOL Quarterly 29: 762–65.
—— 1995b: Crosscultural comparison of writing standards: American ESL
and Japanese EFL. World Englishes 14(1): 99–115.
Cumming, A., Hart, D., Corson, D., Labrie, N. and Cummins, J. 1993:
Provisions and demands for ESL, ESD, and ALF education in Ontario
schools. Toronto: Ontario Institute for Studies in Education. Report sub-
mitted to the Ontario Ministry of Education and Training.
Cummins, J. 1982: Tests, achievement, and bilingual students. Arlington, VA:
National Clearinghouse for Bilingual Education.
—— 1996: Negotiating identities: Education for empowerment in a diverse society. Los Angeles, CA: California Association for Bilingual Education.
Education Quality and Accountability Ofﬁce 2002: Ontario Secondary
School Literacy Test, February 2002: Report of Provincial Results.
Retrieved 6 February 2003 from http://www.eqao.com/pdf_e/02/
—— 2004: Ontario Secondary School Literacy Test, October 2003: Report of
Provincial Results. Retrieved 20 January 2005 from http://www.eqao.
Emmitt, M., Pollock, J. and Komesaroff, L. 2003: Language and learning,
third edition. Oxford: Oxford University Press.
Freedle, R. and Kostin, I. 1993: The prediction of TOEFL reading item
difﬁculty: Implications for construct validity. Language Testing 10:
Gee, J.P. 2003: Opportunity to learn: A language-based perspective on assess-
ment. Assessment in Education 10: 27–46.
Hamp-Lyons, L. 1996: The challenges of second language writing assessment.
In White, E., Lutz, W., and Kamusikiri, S., editors, Assessment of writing:
Policies, politics, practice. New York: Modern Language Association,
Hancock, G.R. 1994: Cognitive complexity and the comparability of multiple-
choice and constructed-response test formats. Journal of Experimental
Education 62: 143–57.
Huberty, C.J. 1994: Applied discriminant analysis. New York: Wiley.
Hulstijn, J.H. and Laufer, B. 2001: Some empirical evidence for the involve-
ment load hypothesis in vocabulary acquisition. Language Learning 51:
Jennings, M., Fox, J., Graves, B. and Shohamy, E. 1999: The test-takers’
choice: An investigation of the effect of topic on language test perform-
ance. Language Testing 16(4): 426–56.
Katz, S., Lautenschlager, G., Blackburn, A. and Harris, F. 1990: Answering
reading comprehension items without passages on the SAT.
Psychological Science 1: 122–27.
Kobayashi, H. and Rinnert, C. 1996: Factors affecting composition evalua-
tion in an EFL context: Cultural rhetorical pattern and readers’ back-
ground. Language Learning 46: 397–437.
Kobayashi, M. 2002: Method effects on reading comprehension test perform-
ance: Text organization and response format. Language Testing 19:
Kossan, P. 2004: Ariz easing fed’s rules for school standards. Arizona
Republic, 4 April, p. B1.
Lee, G. 2002: The inﬂuence of several factors on reliability for complex read-
ing comprehension tests. Journal of Educational Measurement 39:
Madaus, G. and Clarke, M. 2001: The impact of high-stakes testing on minor-
ity students. In Kornhaber, M. and Orﬁeld G., editors, Raising standards
or raising barriers: Inequality and high stakes testing in public educa-
tion, New York: Century Foundation, 85–106.
Mazzeo, C. 2001: Frameworks of state: Assessment policy in historical per-
spective. Teachers College Record 103: 367–97.
Mehan, H. 1997: Contextual factors surrounding Hispanic dropouts.
Retrieved on 16 January 2004 from http://www.ncela.gwu.edu/pubs/
Meyer, B.J.F. 1985: Prose analysis: Purpose, procedures, and problems: Parts
I and II. In Britton, B. and Black, J.B., editors, Understanding expository
text. Hillsdale, NJ: Lawrence Erlbaum, 269–304.
Mueller, T.G., Singer, G.H.S. and Grace, E.J. 2004: The individuals with dis-
abilities education act and California’s Proposition 227: Implications for
English language learners with special needs. Bilingual Research
Journal 28(2): 231–51.
No Child Left Behind Act 2002: Pub. L. No. 107–110.
Norrid-Lacey, B. and Spencer, D.A. 1999: Dreams I wanted to be reality:
Experience of Hispanic immigrant students at an urban high school.
Paper presented at the Annual Meeting of the American Educational
Research Association, Montreal, Canada.
Ontario Ministry of Education and Training. 1999: The Ontario Curriculum
Grade 9 to 12 English as a Second Language and English Literacy
Development. Toronto: Queen’s Printer for Ontario.
Peretz, A.S. and Shoham, M. 1990: Testing reading comprehension in LSP:
Does topic familiarity affect assessed difﬁculty and actual performance?
Reading in a Foreign Language 7: 447–55.
Perkins, K. 1992: The effect of passage and topical structure types on ESL
reading comprehension difﬁculty. Language Testing 9: 163–72.
Proposition 203, English for the Children, Arizona voter initiative. 2000.
Proposition 227, English for the Children, California voter initiative. 1998.
Pulido, D. 2004: The relationship between text comprehension and second
language incidental vocabulary acquisition: a matter of topic familiarity?
Language Learning 54(3): 469–523.
Qian, D.D. and Schedl, M. 2004: Evaluation of an in-depth vocabulary knowl-
edge measure for assessing reading comprehension. Language Testing
Question 2, English Language Education in Public Schools, Massachusetts
voter initiative 2002.
Read, J. 2000: Assessing vocabulary. Cambridge: Cambridge University Press.
—— 2004: Research in teaching vocabulary. Annual Review of Applied
Linguistics 24: 146–61.
Reardon, S.F. and Galindo, C. 2002: Do high-stakes tests affect student’s
decisions to drop out of school? Evidence from NELS. Retrieved on
28 December 2003 from http://www.pop.psu.edu/general/pubs/
Riley, S. and Lee, J.F. 1996: A comparison of recall and summary protocols as
measures of second language reading comprehension. Language Testing
Roessingh, H. 1999: Adjunct support for high school ESL learners in main-
stream English classes: Ensuring success. TESL Canada Journal 17(1):
Royer, J. 1990: The sentence veriﬁcation technique: A new direction in the
assessment of reading comprehension. In Legg, S. and Algina, J., editors,
Cognitive assessment of language and math outcomes. Norwood, NJ:
Shepard, L. 1991: When does assessment and diagnosis turn into sorting and
segregation? In Hiebert, E., editor, Literacy for a diverse society:
Perspectives, practices, and policies. New York: Teachers College Press,
Shohamy, E. 1983: The stability of oral proﬁciency assessment on the oral
interview testing procedure. Language Learning 33: 527–40.
—— 1984: Does the testing method make a difference? The case of reading
comprehension. Language Testing 1: 147–70.
—— 1997: Testing methods, testing consequences: Are they ethical? Are they
fair? Language Testing 14(3): 340–49.
Silva, T. 1997: On the ethical treatment of ESL writers. TESOL Quarterly
Tabachnick, B.G. and Fidell, L.S. 1989: Using multivariate statistics. New
York: Harper Collins.
Taylor, A.R. and Tubianasa, T. 2001: Student Assessment in Canada:
Improving the learning environment through effective evaluation. Society
for the Advancement of Excellence in Education: Kelowna, BC.
Watt, D. and Roessingh, H. 1994: ESL drop-out: The myth of educational
equity. The Alberta Journal of Educational Research 40(3): 283–96.
—— 2001: The dynamics of ESL drop-out: Plus ça change. The Canadian
Modern Language Review 58: 203–22.
Wright, W.E. 2005: English language learners left behind in Arizona: The
nulliﬁcation of accommodations in the intersection of federal and state
policies. Bilingual Research Journal 29(1): 1–29.