Using existing data from several locations across the U.S., this study examined the impact of students' language background on the outcome of achievement tests. The results of the analyses indicated that students' assessment results may be confounded by their language background variables. English language learners (ELLs) generally score lower than non-ELL students in reading, science, and math, a strong indication of the impact of English language proficiency on assessment. Moreover, the impact of language proficiency on the assessment of ELL students is greater in content areas with higher language demand. For example, analyses showed that ELL and non-ELL students had the greatest performance differences on the language-related subscales of tests in areas such as reading. The gap between the performance of ELL and non-ELL students was smaller in science and virtually nonexistent on the math computation subscale, where language presumably has the least impact on item comprehension.

The results of our analyses also indicated that test item responses by ELL students, particularly ELL students at the lower end of the English proficiency spectrum, suffered from low reliability. That is, the language background of students may add another dimension to the assessment outcome, one that may be a source of measurement error in the assessment of English language learners.

Further, the correlation between standardized achievement test scores and external criterion measures was significantly larger for non-ELL students than for ELL students. Analyses of the structural relationships between individual items and between items and total test scores showed a major difference between ELL and non-ELL students: structural models for ELL students demonstrated poorer statistical fit, factor loadings were generally lower for ELL students, and the correlations between the latent content-based variables were also weaker. We speculate that language factors may be a source of construct-irrelevant variance in standardized achievement tests (Messick, 1994) and may affect their construct validity.

Due to the rapidly changing demographics of the U.S. population, fairness and validity issues in assessment are becoming top priorities on the national agenda. Between 1990 and 1997, the number of U.S. residents not born in the United States increased by 30%, from 19.8 million to 25.8 million (Hakuta & Beatty, 2000). According to the Survey of the States' Limited English Proficient Students and Available Educational Programs and Services 1999–2000 Summary Report, over 4.4 million limited English proficient students were enrolled in public schools (National Clearinghouse for English Language Acquisition and Language Instruction Educational Programs, 2002). To provide fair assessment and uphold instructional standards for every child in this country, both federal (e.g., No Child Left Behind Act of 2001) and state legislation now require the inclusion of all students, including ELLs, in large-scale assessments (Abedi, Lord, Hofstetter, & Baker, 2000; Mazzeo, Carlson, Voelkl, & Lutkus, 2000). Such inclusion requirements have prompted new interest in modifying assessments to improve English language learners' participation and to enhance the validity and equity of inferences drawn from the assessments themselves.
Standardized, high-stakes achievement tests are frequently used for the assessment and classification of ELL students, as well as for accountability purposes, and they shape instruction and student learning (Linn, 1995). Approximately 52% of school districts and schools use these tests to help identify ELL students, about 40% use them to assign ELL students to specific instructional services within a school, and over 70% use them to reclassify students from ELL status (Zehler, Hopstock, Fleischman, & Greniuk, 1994).

However, because most standardized, content-based tests (such as science and math tests) are administered in English and normed on native English-speaking test populations, they may inadvertently function as English language proficiency tests. English language learners may be unfamiliar with the linguistically complex structure of test questions, may not recognize vocabulary terms, or may mistakenly interpret an item literally (Duran, 1989; Garcia, 1991). They may also perform less well on tests because they read more slowly (Mestre, 1988).