Article

Specifications for an English Language Testing Service


... Finally, this paper discusses conceptual limitations of an argument-based approach and provides future directions for validation research. Chapelle and Voss (2014) clearly articulate the evolution of test validity and validation in language testing research over the past few decades by focusing on the works of prominent scholars (e.g., Bachman, 2005; Bachman & Palmer, 1996; Carroll, 1980; Kane, 1992, 2001, 2002, 2004, 2006, 2013; Kane, Crooks, & Cohen, 1999; Messick, 1989). The evolution consists of the following validation approaches: (1) the one question and three validities (i.e., content, criterion-referenced, and construct validities to answer whether a test measures what it intends to measure), (2) the evidence-gathering, (3) the test usefulness, and (4) the argument-based approach. ...
... This one question approach must be supported by three types of validity evidence (content, criterion-referenced, and construct). The concepts of validities are well articulated in the first and second editions of Educational Measurement (Cronbach, 1971; Cureton, 1951) as well as other related publications (American Psychological Association, American Educational Research Association, and National Council on Measurement in Education, 1966; Carroll, 1980; Cronbach & Meehl, 1955), all of which are cornerstone documents in guiding validity studies. ...
... Content validity refers to how relevant and representative the test items are to the tasks in the target domain of interest. Typically, content validity is systematically evaluated by experts in the domain (Carroll, 1980). Criterion-referenced validity pertains to how test scores are correlated with scores on other (existing) measures hypothesized to measure the performance in the target domain (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 2014; Cureton, 1951; Fulcher, 2015). ...
Article
Full-text available
Purpose and background: The purpose of this paper is to critically review the traditional and contemporary validation frameworks (the content, criterion, and construct validations; the evidence-gathering; the socio-cognitive model; the test usefulness; and an argument-based approach) as well as empirical studies using an argument-based approach to validation in high-stakes contexts, in order to discuss the applicability of an argument-based approach to validation. Chapelle and Voss (2014) reported that despite the usefulness and advantages of an argument-based approach for test validation, five validation studies using this approach were found in a search from two major journals, Language Testing and Language Assessment Quarterly. We reviewed the validation approaches in language testing and extended the search for empirical studies that used an argument-based approach in five language testing journals including ProQuest Dissertation and Theses. By doing so, this paper aims to provide validation researchers with each approach’s conceptual limitations and future directions for validation research. For validity arguments to be defensible, this paper suggests that various kinds of validity evidence be required, involving multiple test stakeholders.
Implications: By comparing variations of an argument-based approach and reviewing eight representative studies out of 33 empirical validation studies using an argument-based approach, this paper presents the following implications for future researchers to consider: (a) defining test constructs and relevant test tasks through domain analysis; (b) inviting multiple test stakeholders to test validation; (c) investigating the intended and actual interpretations, decisions, and consequences; (d) considering social, cultural, and political values to be embedded; and (e) employing multiple methods beyond statistical analyses using test scores.
... [25]. Concerning test design for ESP courses, Carroll (1980) explains that a test has three elements: describing the participants, analyzing their "communicative needs", and then specifying test content [26]. It has also been stated that in any ESP context: "assessment takes on a greater importance, because ...
... Overall, it would be worthwhile and efficient if the testers or assessors in EAP/ESP courses took the above crucial issues into account. For example, in order to come up with valid and reliable tests when attempting to measure the students' proficiency, a) the rationale of tests being made to measure students' proficiency can be set out clearly by analyzing the students' "communicative needs", describing the tasks that tests may represent along with the students' academic setting, and then specifying test content (Carroll, 1980; Ebel, 1983), b) testers in EAP/ESP courses can maintain "face validity" in their tests by focusing more on the tests' relevance to, and representation of, the real-life demands of the students' academic setting (Bachman, 1990), and c) students in EAP/ESP courses can be encouraged by their instructors to take tests more seriously in order to obtain a clear-cut picture of their performance and proficiency. ...
Article
Full-text available
This study investigated whether English as a foreign language (EFL) learners who studied English in an English for Academic Purposes (EAP) course would show significant differences in their overall levels of proficiency in English as determined by an achievement test of the four skills and English grammar. The study also sought to determine whether there would be significant differences at the level of individual skills; thus, the study examined how strong the correlations between the different English language skills were. The participants of this study were sixty-one EFL undergraduate medical students enrolled in an EAP course at King Khalid University in Saudi Arabia, with the experiment conducted over a whole semester. The data collected were analyzed using appropriate statistical tests and procedures. Study findings revealed that EFL students who studied in that EAP course showed significant differences between their scores in the English language skills; they also bore out significant differences in the subjects' overall English proficiency gains. Pedagogical implications regarding EAP programmes in terms of language acquisition, instruction, and assessments have been appended to this paper. Also, some recommendations have been set out for further investigation in the body of EAP research.
... Stern lists a number of interpretations of language proficiency (Stern 1983): single-concept approaches, not held seriously since Oller's retraction or rewording of the unitary competence hypothesis (Oller 1976, 1983); binary concepts like Cummins's BICS/CALP (basic interpersonal communicative skills/cognitive academic language proficiency), decontextualized language trapped by school tests (Cummins 1979, 1980, 198$); Canale and Swain's classic model (Canale and Swain 1980, 1981; Canale 1983); and multiple categories such as those put forward in a Council of Europe context by Van Ek and Trim and by Carroll, to whom one should add Morrow (Van Ek 1975; Van Ek and Trim 1990; Carroll 1978, 1980; Morrow 1977). Canale and Swain's ideas, Cummins's BICS/CALP, and Bialystok's distinction between explicit and implicit learning (Bialystok 1982, 1986) were used as the basis for the Development of Bilingual Proficiency Project (Harley et al. 1987, 1990), which will be briefly discussed below. ...
... Construct validity was interpreted from theory, rather than demonstrated statistically, and reliability was not the top priority. This has also been the approach of the Council of Europe and Carroll's original ELTS specifications and test (Carroll 1978), basically saying, "Let's keep the cart before the horse"; support for this position is offered by the equivocal success of attempts to establish quantitatively the construct validity of models of competence outlined in this section, as well as by evidence of the absurdities that ignoring washback validity has led to in language testing (Savignon 1992). ...
Book
Full-text available
This paper outlines the main problems with existing scales of language proficiency from the points of view of description and of measurement, and then recommends a methodology for calibrating descriptors based on a Rasch model analysis of their use by teachers in assessing their students. This methodology was adopted in a Swiss National Research project to develop the set of descriptors that became the basis of those in the Common European Framework of Reference for Languages (CEFR) and the related European Language Portfolio.
... What Spolsky recognizes in his 1976 paper was that this new construct of language competence was starting to achieve orthodoxy. Its implementation in mainstream language testing can be seen in the developmental work of Morrow (1977), B.J. Carroll (1978) and Munby (1978), work for which the much-cited paper by Canale and Swain (1980) in the first issue of Applied Linguistics provided theoretical justification. For instance, Morrow maintained that 'the use of language in a communicative situation has a number of features which are not measured in conventional language tests' (Morrow, 1977: 23). ...
... B.J. Carroll (1978) foreshadowed the development in the late 1970s and early 1980s of the English Language Testing Service (ELTS) test, an approach to communicative testing by way of ESP. ELTS was based on a construct of language competence as divisible rather than unitary. ...
Article
Full-text available
This article gives a personal view of the history of Anglo-American language testing over the last half-century. It argues that major developments in the field have tended to be embraced too enthusiastically, so that they have led to unbalanced views (or ‘heresies’) concerning the construct definition of language, the scope of test impact and the value of new methods of test delivery and analysis. The article considers in turn the move to integrative and then communicative testing in the 1970s and 1980s; the recent concerns about washback, impact, ethics and politics in language testing; and the effects of innovations such as item response theory and computer-based testing. Despite their unsettling influence, heresies should be welcomed for the challenge they pose to established theory and practice.
... Indeed, in much of the English for Specific Purposes (ESP) literature, there is an assumption that the specifications for an EAP test should flow as naturally from needs analysis as the EAP course itself (McDonough, 1984: 111). Carroll (1980: 13) saw the design of a test to comprise three elements: describing the participants, analysing their 'communicative needs', and then specifying test content. ...
... The ELTS test and the TEEP Carroll (1980) The development of the ELTS test was the first attempt to produce a communicative language test of EAP on a Munby-type model, with six modules: business studies, agricultural science, social survival, civil engineering, laboratory technician and medicine. For each of these roles, profiles of a 'typical' student were to be described. ...
Article
Testing and assessment in English for Academic Purposes (EAP) contexts has traditionally been carried out on the basis of a needs analysis of learners or a content analysis of courses. This is not surprising, given the dominance of needs analysis models in EAP, and a focus in test design that values adequacy of sampling as a major criterion in assessing the validity of an assessment procedure. This article will reassess this approach to the development and validation of EAP tests on the basis of the theoretical model of Messick (1989) and recent research into content specificity, arguing that using content validity as a major criterion in test design and evaluation has been mistaken.
... The version of the New Profile Scale adapted for this study consists of five scalar categories, (1) Communicative Quality, (2) Organization, (3) Argumentation, (4) Linguistic Appropriacy, and (5) Linguistic Accuracy, each at nine performance levels or bands. The use of nine levels for rating writing performance coincides with the use of nine bands established across all components of the ELTS (Carroll, 1981). The ELTS is available worldwide and is meant to provide meaningful scores at all performance levels. ...
... The use of nine levels for rating writing performance in the NPS, and thus in the ECPS, coincides with the use of nine bands established across all components of the ELTS (Carroll, 1981). ...
Article
This study investigated the validity of using a multiple-trait scoring procedure to obtain communicative writing profiles of the writing performance of adult nonnative English speakers in assessment contexts different from that for which the instrument was designed. Transferability could be of great benefit to those without the resources to design and pilot a multiple‐trait scoring instrument of their own. A modification of the New Profile Scale (NPS) was applied in the rating of 170 essays taken from two non‐NPS contexts, including 91 randomly selected essays of the Test of Written English and 79 essays written by a cohort of University of Michigan entering undergraduate nonnative English speaking students responding to the Michigan Writing Assessment. The scoring method taken as a whole appeared to be highly reliable in composite assessment, appropriate for application to essays of different timed lengths and rhetorical modes, and appropriate to writers of different levels of educational preparation. However, whereas the subscales of Communicative Quality and Linguistic Accuracy tended to show individual discriminant validity, little psychometric support for reporting scores on seven or five components of writing was found. Arguments for transferring the NPS for use in new writing assessment contexts would thus be educational rather than statistical.
... According to Carroll (1981), the new test was based upon a set of specifications which attempted directly to apply the Munby model to a definition of test purpose and test content. ...
Thesis
In a study of the effects of text familiarity, task type, and language proficiency on university students’ LSP test and task performance, 541 senior and junior university students majoring in electronics took the Task-Based Reading Test (TBRT). The results indicated that the effect of each of these factors on subjects’ test and task performances was statistically significant. Moreover, the impact of the interactions between any given pair and also among all three of these factors on subjects’ test performance was statistically significant. Subjects’ performance on different tasks at the same level of text familiarity afforded statistically significant results. The semi- and non-proficient subjects did not perform significantly differently in the following contexts: (a) true-false, sentence-completion, and writer’s-view tasks in partially familiar tests; (b) outlining, writer’s-view, true-false, and sentence-completion tasks in totally unfamiliar tests; and (c) sentence-completion, outlining, and writer’s-view tasks in totally familiar tests. The differences found in subjects’ performances on the same tasks at different levels of text familiarity were also significant. However, the difference between semi- and non-proficient subjects’ performance was not statistically significant when they performed (a) the true-false task in partially familiar versus totally familiar contexts, and (b) outlining, sentence-completion, and writer’s-view tasks along the text-familiarity cline. In a comparison of different tasks, subjects’ performance of the sentence-completion task was found to be significantly different from their performance of the other four tasks in question along the text-familiarity cline. Moreover, subjects’ performances of the writer’s-view and the true-false tasks in totally unfamiliar contexts differed significantly.
In addition, regression analyses revealed that the greatest influence on subjects’ overall and differential test and task performance was due to language proficiency.
... According to Carroll (1981), the new test was based upon a set of specifications which attempted directly to apply the Munby model to a definition of test purpose and test content. ...
Thesis
Of late, linguistics has been trying to come up with a universal theory of language. Linguists, sociolinguists, and psycholinguists have focused on the different aspects of language. The sum of all their efforts has, no doubt, contributed to the developing field of Universal Grammar. However, the field calls for a good number of other research projects in the different languages of the world. As such, the present study was carried out with the aim of examining Farsi ostensible invitations in terms of the universals of pragmatics. To this end, 45 field workers observed and reported 566 ostensible and 607 genuine invitations. 34 undergraduates were interviewed and afforded 68 ostensible and 68 genuine invitations. And, 41 pairs of friends were interviewed and afforded 41 ostensible invitations. The data were then put to statistical tests: the comparison of ratios was carried out for the purposes of comparing the ratios of the two types of invitations (for any probable difference) in terms of the seven features that control their use in the English language; the chi-square test was also carried out to determine whether the type of invitation was dependent on such variables as the sex, age, and social class of the inviters or not. The results of the data analysis revealed that Farsi ostensible invitations go by the universal norms that influence language use. It was also concluded that the type of invitation was dependent on the variables mentioned above.
... Language tests help to elicit from language learners the extent to which the taught skills have been mastered. Language testing is, therefore, the systematic process of getting information from learners regarding their levels of acquisition of certain skills (Alderson, 1981; Carroll, 1981). Hence the present study reviewed only the empirical research studies that collected data from the field and investigated various problems encountered by EFL/ESL Arab learners. ...
Article
Full-text available
Although there are distinguished, qualified, specialized, and experienced teachers, along with good overall design and planning, comprehensive and purposeful curricula, and integrated textbooks, achievement remains below the expectations of all who are concerned with English language teaching in the Arab world. The current study reviews the kinds of pitfalls encountered by Arab EFL learners at the tertiary level in their use of the three most problematic syntactic/semantic sub-categories (prepositions, articles, and discourse markers, respectively) in their written and oral discourse. The aim is twofold. By reviewing the empirical studies conducted in the last two decades, the study seeks, first, to probe and report the most common kinds of pitfalls Arab undergraduates face in learning English as a foreign/second language and, second, to reveal the possible sources and causes that lie behind these pitfalls and the insights and pedagogical implications that might be offered. Pitfalls are discussed, conclusions are drawn, and implications and future directions are provided at the end, followed by practical suggestions to minimize the occurrence of pitfalls in learners' formal English.
... Davies (2008) comments: "The new paradigm of communicative language teaching and testing required that the British Council, as a leading exponent of professionalism in ELT, should furnish itself with a new test in order to keep itself publicly in the lead" (2008: 29). Working with the University of Cambridge Local Examinations Syndicate (UCLES), the British Council made plans for a completely new test (Carroll 1978) which would be 'communicative', and would also be tied directly to the British Council's purpose in testing: assessing proficiency in an academic study context. In 1979, the new test, the English Language Testing Service (ELTS), did replace the EPTB; but Davies calls the next few years a 'communicative interlude' (2008: 28). ...
Chapter
This volume focuses on the principles and practices of second language assessment while considering its impact on society. Part I deals with the conceptual foundations of second language assessment, and Part II addresses the theory and practice of assessing different second language skills. Part III examines the challenges and opportunities of second language assessment in a range of contexts. Part IV examines key issues. Chapter 2 by Liz Hamp-Lyons introduces key concepts for considering the purposes for which language assessments have been, and now are, developed and used.
... According to Carroll (1981), the new test was based upon a set of specifications which attempted directly to apply the Munby model to a definition of test purpose and test content. ...
... Reading skills draw from Munby's (1978) taxonomy of micro-skills and functions, the English Language Testing System (ELTS) needs analysis (Carroll 1981) and Emmett's (1985) three-year survey of the needs of ESL students in British universities. These skills include skimming a text for general understanding, scanning a text to locate specific information, identifying the main points in a text, distinguishing main points from supporting detail, deducing lexical meanings from context, making inferences from a text, and understanding reference information. ...
Article
This paper describes an integrated English language programme for non-native English speakers studying in an English-medium academic learning environment. The programme focuses on developing academic communication skills as well as providing an opportunity for personal language and learning skills development. The course aims at integrating language and learning strategy development within the general content area of English as an International Language (Smith 1981, 1983, 1987). The course integrates process and product-oriented approaches to language programme design (White 1988) with content-based approaches to teaching academic writing (Shih 1986; Raimes 1991). The programme provides specific instruction in reading and note-taking skills, writing in academic contexts, seminar presentation, and tutorial participation as well as individual language development. Whilst the emphasis on academic communication and learning skills addresses the group of students as a whole, language development work is individualised and takes place on the basis of individual learning contracts. Students receive training in self-directed learning strategies to enable them to make their own decisions about what their needs are, when, where and how to work, and what materials to use, (Riley 1982). The course content relates directly to the learners' language experiences. Topics include native and nonnative varieties of English, implications for international and intercultural communication, and the influence of English on other languages. Research on the relationship of content knowledge to writing performance has shown that familiarity with a subject dramatically influences the writing performance of ESL writers. This is supported by Adamson, who concludes that "academic skills are best taught in connection with authentic content material" (Adamson 1990:67).
... The full collection of 1980 papers can be found in Alderson and Hughes' 1981 Issues in Language Testing (available at http://www.ling.lancs.ac.uk/groups/ltrg/ltf2010.htm). In his paper, Carroll (1981) describes the changing needs of the English Proficiency Test Battery (EPTB; used for British study entry screening), due to increases in test-taker numbers and the move towards communicative approaches in language learning and teaching. His call for the diversification of the test battery was driven by a needs analysis which led to a set of draft specifications for the 'successor' of the EPTB, i.e. the English Language Testing Service (ELTS). ...
Article
Full-text available
This article presents a number of issues on the topic of Language for Specific Purposes (LSP) testing that were raised during a plenary discussion at the 30th annual Language Testing Forum. The comments particularly focused on (a) past and current conceptualizations and categorizations of LSP tests, (b) tensions between specificity and practicality in LSP test design, and (c) the role of locality in LSP testing. The views exchanged on each of these themes are reported and considered in light of current research and debates. Suggestions are made for future research in the area of LSP testing.
... It is not easy to succinctly describe the features of an LSP test, but it is clear that such tests ventured away from measuring general proficiency and proficiency in the academic context, to testing language skills for specific occupations. While interest in LSP tests dates from the early 1980s, and there were test development initiatives in that direction too (Carroll, 1981), the development of highly occupation-specific tests of language skills started in the late 1980s, possibly with the listening summary translation exam which was developed in 1988-89 (Stansfield, Kenyon and Scott, 1990; 1992). LSP test development spread quickly in the 1990s, extending to other countries and occupational contexts. ...
... Early work on testing languages for specific purposes (LSP) (e.g. Morrow, 1977; Carroll, 1981; Weir, 1983 ...
Article
Full-text available
In the past twenty years, language testing research and practice have witnessed the refinement of a rich variety of approaches and tools for research and development, along with a broadening of philosophical perspectives and the kinds of research questions that are being investigated. While this research has deepened our understanding of the factors and processes that affect performance on language tests, as well as of the consequences and ethics of test use, it has also revealed lacunae in our knowledge, and pointed to new areas for research. This article reviews developments in language testing research and practice over the past twenty years, and suggests some future directions in the areas of professionalizing the field and validation research. It is argued that concerns for ethical conduct must be grounded in valid test use, so that professionalization and validation research are inseparable. Thus, the way forward lies in a strong programme of validation that includes considerations of ethical test use, both as a paradigm for research and as a practical procedure for quality control in the design, development and use of language tests.
... According to Carroll (1981), the new test was based upon a set of specifications which attempted directly to apply the Munby model to a definition of test purpose and test content. ...
Thesis
In a study of the effects of text familiarity, task type, and language proficiency on university students' LSP test and task performances, 541 senior and junior university students majoring in electronics took the TBRT (Task-Based Reading Test). Variance analyses indicated that text familiarity, task type, and language proficiency, as well as the interaction between any given pair of these and also among all of them resulted in significant differences in subjects' overall and differential test and task performances. In addition, regression analyses revealed that the greatest influence on subjects' overall and differential test and task performance was due to language proficiency. The implications of the study are discussed.
... The skills focussed on in the listening and reading sections of the test draw from those focussed on in the OTESL and the research on which it is based: Munby's (1978) taxonomy of micro-skills and functions, the ELTS needs analysis (Carroll 1981), and the three-year survey of ESL students in British universities on which the Associated Examining Board's (U.K.) Test of English for Educational Purposes (Emmett 1985) is based. The skills selected, again, represent an attempt at creating a preference statement of essential skills and competencies identified in the research as being important for successful university study in the medium of English rather than an absolute statement of candidates' listening and reading abilities. ...
Article
This paper describes the development of an English for Academic Purposes placement test which aims to reflect an integrated approach to language use. The test aims to approximate natural language behaviour by providing a series of tasks, each of which contributes to the overall context of language use. The test is also integrated in terms of theme and the rhetorical organisation of language samples provided. This provides a context in which the language samples both model and generate the context for language tasks which follow. Descriptors on which the assessment is based provide a framework for the ongoing observation and evaluation of learners' communicative performance skills.
... Language for Specific Purpose (LSP), and more specifically EAP, benefitted immediately and directly from the communicative movement. Carroll (1980, 1981, 1982) and Carroll and Hall (1985) took the principles of Munby (1978) and Wilkins (1976) to develop a framework for EAP test design, meeting the requirement of Morrow (1979) that test content should be tailored to learning needs, or purpose of communication. However, LSP testing has been plagued by the seeming impossibility of defining what is 'specific' to a particular communicative setting or purpose. ...
Article
This article looks at the phenomenon of ‘communicative’ language testing as it emerged in the late 1970s and early 1980s as a reaction against tests constructed of multiple choice items and the perceived over-emphasis of reliability. Lado in particular became a target for communicative testers. It is argued that many of the concerns of the communicative movement had already been addressed outside the United Kingdom, and that Lado was done an injustice. Nevertheless, the jargon of the communicative testing movement, however imprecise it may have been, has impacted upon the ways in which language testers approach problems today. The legacy of the communicative movement is traced from its first formulation, through present conundrums, to tomorrow's research questions.
Article
Full-text available
The contemporary development of the aviation sector demands not only technical skills from professionals but also a high level of language competence. English has become the international means of communication in this field, particularly within ICAO standards. Consequently, effective use of aviation English becomes a key factor for success in this profession. Our study focuses on preparing aviation specialists in Ukraine by incorporating the experience of leading British educational institutions, including Anglo-Continental, Mayflower College, and Rose of York Language School. The objective is to analyze the methodologies and approaches used by these schools for the effective implementation of their experience in Ukrainian conditions. The main goal is to study the approaches to teaching aviation English in British schools, identify their advantages and disadvantages, and develop recommendations for optimizing teaching methods in Ukrainian educational institutions. The experience of leading British educational institutions in training professionals for the aviation sector can be extremely valuable for Ukraine. The acquired knowledge and pedagogical methods will help improve the learning process in Ukrainian aviation schools, contributing to the preparation of qualified specialists with both technical and language skills at a high level. This can enhance the competitiveness of Ukrainian professionals in the international aviation services market and support the sustainability and safety of aviation communication in the country. Thus, exploring the prospects of extending British experience becomes a key aspect of our research aimed at improving the quality of aviation specialist training in Ukraine.
Book
This collection of articles is based on the presentations given during the ViKiPeda-2007 (Foreign Language Education) Conference in Helsinki on May 21–22, 2007. ViKiPeda-2007 continued a series of conferences focusing on foreign language education, teaching and research. The first ViKiPeda Conference was held at the University of Jyväskylä in 1999, very much thanks to the initiative of Professor Viljo Kohonen, University of Tampere, and Professor Pauli Kaikkonen, University of Jyväskylä. The second ViKiPeda Conference was held at the University of Tampere in 2001, the third at the University of Oulu in 2003, and the fourth at the University of Turku in 2005. In 2009, the ViKiPeda Conference will be organised by the University of Joensuu. ViKiPeda-2007 represents an important step in this series of conferences. Like its predecessors, ViKiPeda-2007 made it possible for some 60 experts on foreign language education, teaching and research to come together, to exchange ideas of mutual interest, and to network with colleagues from different universities. It was a great pleasure to have two international Keynote Speakers at our conference: Dr Daniel S. Janik from Hawai’i, USA, and Dr Elena Borzova from Russia. I would like to thank the Finnish Ministry of Education for the grant that made it possible to organise ViKiPeda-2007, to invite two international guests, and to publish these conference proceedings. I would also like to thank the following sponsors, whose contributions to ViKiPeda-2007 were highly appreciated: CICERO Learning for covering an important part of the international travel costs, and Finn Lectura, Otava, Suomen Tietokirjailijat, Tammi and WSOY for special treats between the lectures. Their book exhibitions were also an important part of the Conference.
I am very grateful to the Members of the Local Organising Committee, whose work was invaluable when solving different kinds of technical, logistical and scientific problems: Dr Pirjo Harjanne, Dr Raili Hildén, Dr Esa Penttinen and Dr Leena Vaurio. Professor Annikki Koskensalo from the University of Turku, the main organiser of the previous ViKiPeda, gave us important advice based on their own experiences two years earlier. Thank you very much indeed, Annikki! My special thanks go to the authors of the articles published in this book. This way knowledge and expertise can be shared and fully disseminated. But we must not forget the vivid discussions during ViKiPeda-2007, and all the informal exchanges of ideas and the enthusiasm that was so concretely tangible thanks to all participants. When finalising these proceedings, I once again had a good chance to lean on Mr Kari Perenius’s expert knowledge of layout. Thank you, Kari! Professor Juhani Hytönen, Director of the Department of Applied Sciences of Education, has kindly given us permission to publish the ViKiPeda-2007 Conference Proceedings in the Research Reports series of the Department of Applied Sciences, University of Helsinki. Helsinki, March 1, 2008 Seppo Tella Chair of the Organising Committee Director, Research Centre for Foreign Language Education (ReFLEct) Vice Dean, Professor, University of Helsinki
Chapter
In 1978 a news item appeared in the Albuquerque Journal that read in part: Bilingual Teaching Efforts under Fire SANTA FE (AP) - None of 136 teachers and aides in bilingual programs in New Mexico’s schools who were tested could pass a Spanish reading and writing exam at the fourth grade level, the director of bilingual education for the state Department of Education said. Henry Pascual concluded that colleges of education are spending a lot of federal money turning out Spanish-English bilingual teachers who don’t know much Spanish. (3 October 1978) The article also went on to report that Henry Pascual had observed that even in “bilingual” classrooms, all instruction was taking place in English. Spanish monolingual children placed in such “bilingual” classrooms had thus gained nothing from the implementation of bilingual education in New Mexico. They were still being totally immersed in the English language and continued to fall behind conceptually just as their older siblings had done in the days before bilingual education.
Article
This article discusses a range of current issues and future research possibilities in Communicative Language Testing (CLT) using, as its departure point, the key questions which emerged during the CLT symposium at the 2010 Language Testing Forum. The article begins with a summary of the 2010 symposium discussion in which three main issues related to CLT are identified: (a) the "mainstreaming" of CLT since 1980, (b) the difficulty for practitioners in utilising and operationalising models of communicative ability, and (c) the challenge of theorising a sufficiently rich communicative construct. These issues are each discussed and elaborated in turn, with the conclusion drawn that, whereas the communicative approach lies dormant in many test constructs, there is scope for a reinvigorated communicative approach that focuses on "adaptability." A number of future research directions with adaptability at the forefront are proposed.
Article
Looking back to the language testing world of the 1980s in the United Kingdom, we need to be aware that how we perceive or remember ourselves to have been then, whether as individual language testing academics or as corporate language testing organisations, will be shaped by multiple influences. Although we may have been present at and shared in the 1980 discussions, our recollections of how things were then and our views on how they have (or have not) changed will vary. What follows in this article offers a predominantly personal perspective. It is the view as I perceive it, in light of my own journey as a UK-based language teacher and tester over the past 30 years, seen from where I stand now as a consultant to a large international examining board in the United Kingdom. It is also therefore an institutional perspective, drawing on a long association with one particular language testing organisation. Just as my perspective is from the position of only one language testing institution, I am also only one individual from within that institution. There will undoubtedly be other stances, voices, and perspectives that are equally valid and relevant from within the same institution.
Article
This epilogue considers the set of articles included in the issue as a response to the question posed by Morrow (1979): Communicative language testing: Evolution or revolution? Whereas the other articles in the issue would suggest that we are in a phase of gradual evolution, with much continuity since 1980, this article argues that, on the contrary, we are facing a number of challenges to communicative language testing and the testing of languages for specific purposes. These challenges come from two principal sources: (a) the advances in technology that are making possible the automatic scoring of speech and writing, and the associated return to psycholinguistic, even structuralist, models of proficiency; and (b) the need to reflect in language test constructs and practice the reality of English as a lingua franca communication. The article considers these issues in the light of the influence of the Common European Framework of Reference in language testing, an institution that it seems is now considered "too big to fail."
Article
The proceedings of the first Language Testing Forum in 1980 were published in ELT Documents 111: Issues in Language Testing (Alderson & Hughes, 1981). Discussants at the 1980 Forum raised a number of questions on Language for Specific Purposes (LSP) testing relating, notably, to test specificity, test content, the relationship between subject matter knowledge and language knowledge, and predicting real-life language performance. The 2010 Language Testing Forum looked back at the last three decades in language testing to reflect on what developments, if any, have occurred. Following the 2010 Forum, this article addresses the questions raised in 1980 with reference to testing for a very specific purpose: the International Civil Aviation Organisation's Language Proficiency Requirements for pilots and air traffic controllers. In analysing the testing context (aeronautical radiotelephony communications), the author argues that, in spite of theoretical and methodological advances in LSP testing, these questions are still as relevant to testing LSP today as they were in the early 1980s.
Article
Applied Linguistics is a series of comprehensive resource books, providing students and researchers with the support they need for advanced study in the core areas of English language and Applied Linguistics. Each book in the series guides readers through three main sections, enabling them to explore and develop major themes within the discipline. • Section A, Introduction, establishes the key terms and concepts and extends readers' techniques of analysis through practical application. • Section B, Extension, brings together influential articles, sets them in context and discusses their contribution to the field. • Section C, Exploration, builds on knowledge gained in the first two sections, setting thoughtful tasks around further illustrative material. This enables readers to engage more actively with the subject matter and encourages them to develop their own research responses. Throughout the book, topics are revisited, extended, interwoven and deconstructed, with the reader's understanding strengthened by tasks and follow-up questions. Language Testing and Assessment: • provides an innovative and thorough review of a wide variety of issues from practical details of test development to matters of controversy and ethical practice • investigates the importance of the philosophy of pragmatism in assessment, and coins the term 'effect-driven testing' • explores test development, data analysis, validity and their relation to test effects • illustrates its thematic breadth in a series of exercises and tasks, such as analysis of test results, study of test revision and change, design of arguments for test validation and exploration of influences on test creation • presents influential and seminal readings in testing and assessment by names such as
Written by experienced teachers and researchers in the field, Language Testing and Assessment is an essential resource for students and researchers of Applied Linguistics.
Chapter
The chapter provides a critical review of institutional language testing over the 50-year period following the publication of Robert Lado's Language Testing (1961). It is argued that, over the period, language testing has professionalized itself, as shown by research, university degree courses, international journals and publications, national and international language-testing associations and codes of ethics. A compromise was found early in the period between competence (structure) and performance (communication) following the brief venture into communicative language testing. The period represents a move away from a primary concern with test reliability to a wider interest in validity, from an emphasis on language testing content (especially language for specific purpose) and method (notably statistical procedures and technological resources) to a mature concern for the use of language tests and the extent of the profession's responsibility for that use.
Article
This article reports an experiment in which a group of pre-intermediate EFL students did a controlled dictation test, and then amended their protocols during an unexpected second exposure to the passage. The two sets of scores were compared with one another and with the total scores achieved in a multi-part achievement test. The results showed a modest improvement in performance but this was counterbalanced by a marked drop in dispersal and a loss of validity.
Article
It has been suggested that reading ability can be divided into various subskills, and this notion is common in ESL teaching and testing. It has, however, also been argued (Alderson and Lukmani, 1989) that teachers are unable to reach agreement about the reading subskills which may be tested by particular reading test items. This study begins by examining the place of subskills in ESL syllabus and test design, with particular attention to the enduring influence of the work of Munby (1978). The issue of teachers' perceptions of subskills and their difficulty, as represented in reading comprehension tests, is discussed. A framework is put forward for negotiating agreement between teachers about subskills tested by reading comprehension test items. Using this framework, very substantial agreement between a group of five experienced teachers of EAP is shown to be achieved in matching subskills to individual test items in the reading section of a test of EAP, as well as in judging the difficulty of these subskills. After brief discussion of the use of Rasch IRT in analysis of reading comprehension test items, the teachers' consensus regarding subskill difficulty level is compared to the Rasch analysis of item difficulty, and the significant correlation found gives some empirical validation to the teachers' perceptions. Implications of the findings for analysis of test content, and for teaching, are considered.
Article
It has become almost axiomatic for 'communicative' testing theory that the tests should contain exercises which are based on 'real-life' communicative situations (Morrow 1979). In essence, this makes the communicative testing enterprise an exercise in content validity. The notion of 'sampling real life' has, however, been extensively criticized (Oller 1979:184; Alderson 1981:57), mainly because testing has been seen as the reliable prediction of success in some behavioural performance in a non-test situation. Communicative testing theory tries to tap the performance directly, and here lies the problem. This article investigates one aspect of this problem with regard to the assessment of the English Language Testing Service, and attempts to uncover what appears to be a problem in the theory on which such tests are constructed or, to put it another way, a problem in construct validity.
Article
This paper starts by discussing research into the effect of background knowledge on English for Academic Purposes (EAP) tests and discusses EAP tests in which the content of at least some of the test components is related to students’ fields of academic study. This section shows how research has demonstrated that students do not necessarily do better if they are given tests in their own academic subject areas and how, because of the difficulties inherent in test-equating, such tests may not be testing the students fairly. The paper suggests, therefore, that for international EAP tests, English for Specific Academic Purposes testing be abandoned. In its second part, the paper discusses what EAP tests might consist of in the future. Instead of EAP proficiency tests, the paper suggests that there should be aptitude tests to find out whether L1 and L2 students would be capable of rapidly acquiring the requisite academic discourse practices once they had embarked on their academic courses. Such tests for L2 students should include a test of specific grammatical skills, so that receiving institutions can be sure that students have the requisite linguistic infrastructure needed to carry out academic work in English.
Article
The main aim of our research was to investigate the language wants of English majors in Hungary. First a questionnaire was administered to 279 students at all the six universities of Hungary where there are students majoring in English language and literature combined with TESOL. The participants were mainly students in the last 2 years of their university studies and their number represented approximately 10% of the target population. The same questionnaire was also completed by 80 students who graduated from one of the universities in Hungary in the past 5 years. The design of the questionnaire was informed by the Common European Framework of Reference prepared by the Education Committee of the European Union (Council of Europe, 2001. Common European Framework of Reference for Teaching, Learning and Assessment. Cambridge University Press, Cambridge). The questionnaire was piloted and validated with think-aloud interviews and test–retest reliability analysis. The results suggest that students use English mainly for academic purposes during their university studies. The most important functions for English majors in their future occupation seem to be expressing their opinion, reading texts on the Internet, conversing with non-native speakers, writing e-mail messages, giving explanations and instructions, and translating oral and written English in a variety of occupations. No major differences between students in different years of study and at different universities in the country were found. The methods applied and the findings concerning the needs of English majors in Hungary might also be relevant for other countries with a similar educational system.
Article
The article first describes the main trends in validity research on language proficiency assessment, which frame the research task of an ongoing project on the assessment of oral language proficiency. The research task focuses on the validity of the proficiency level scale included in the national core curricula, examined from the perspective of how a validity argument is constructed. The claim that the scale and the speaking tasks based on it are valid measures of oral proficiency is tested using Toulmin's model, by setting the empirical data that support this conclusion, and the warrants that back them, against evidence that undermines the claim. The warrants and rebuttals concern the relevance, utility, intended consequences, and sufficiency of the claimed conclusion. The specific research problems of the project's researchers each address one of these features. The validity argument is applied in the context of Hy-Talk, the oral proficiency assessment project. The problem formulation and the choice of methods draw on the more traditional content-based and criterion-based approaches of validity research, but new openings are also made to deepen knowledge of the interpretations and views held by those who perform and those who rate the oral tasks.