About
21
Publications
8,511
Reads
20
Citations
Introduction
My main research interests are in language testing and language teacher education. My recent and current research focus is on item writing, item writer training, and assessment literacy for test developers and item writers.
Current institution
Additional affiliations
October 2016 - March 2020
October 2017 - present
March 2015 - September 2016
The British Council China
Position
- Consultant
Education
October 2016 - September 2020
September 2013 - September 2015
Publications (21)
The quality of test items is crucial for test validity and for the interpretation of test scores with reference to learners' language ability. Providing item writers with adequate item writing documentation and formal training can substantially improve the quality of the items they produce. This entry provides practical item writing suggestions suc...
A primary consideration in test format and task selection is the aim of the test or task, or both. Language testers and SLA researchers must know what aspects of a learner’s language ability, stage of language development, or factors affecting language development they would like to evaluate. This chapter offers SLA researchers and language testers...
A long-standing debate in the testing of listening concerns the authenticity of the listening input. On the one hand, listening texts produced by item-writers often lack spoken language characteristics. On the other hand, real-life recordings are often too context-specific to stand alone, or not suitable for item generation. In this study, we explo...
Linguistic knowledge is not normally a criterion for item-writer selection, and it is rarely a part of item-writer training. However, linguistic knowledge directly impacts on item-writers’ ability to target the intended construct in language test items.
This presentation reports on a doctoral research study which investigated an online induction i...
A central challenge in language testing is the alignment of the targeted construct with techniques used for its measurement, for the construct to be measured in a comprehensive (construct representation), relevant (construct relevance), and reliable manner. In this invited presentation, I offer some guidance on how to select suitable task types and...
The crucial importance of valid and reliable scores in performance assessment has led to a range of measures taken to reduce rater variability. These include training and monitoring of raters, statistical adjustment of scores, and approaches that advocate the necessity of plural opinions on a performance sample and involve communal rating sessions...
Our study seeks to answer the question of what the impact of creativity is on teenage Hungarian L2 learners' performance in a written argumentative and narrative task. Ninety-five participants at an intermediate level of language proficiency wrote a story based on six unrelated pictures and an argumentative essay in English. Participants also compl...
In rating tests of speaking/writing, raters might have to decide whether the response is on-topic or off-topic. Drawing on two studies of rater perceptions, this presentation explores the rater decision-making process behind making on/off-topic decisions: what makes the decisions valid and reliable? The studies’ findings might be helpful for operat...
Item quality makes a significant contribution to test validity, thus rendering the work of item writers critically important for assessment. However, little empirical research has so far been done into item writing, including item-writing training. This thesis therefore aimed to investigate an online induction item-writing training course in order...
The crucial importance of valid and reliable scores in performance assessment has led to a range of measures taken to reduce rater variability. These include training and monitoring of raters, statistical adjustment of scores, and approaches that advocate the necessity of plural opinions on a performance sample and involve communal rating sessions...
The construct of reading assessment and reading subskills. The principles of reading text selection and adaptation. In particular, the concept of readability and suggestions on how to adjust text reading difficulty for tests of reading comprehension.
The principles of producing effective reading test questions (items). The following item types are discussed: multiple choice, true/false/not given, sentence completion/short answer questions, rearrangement, and information transfer. Practical item writing recommendations are made for each item type.
The standard process of reading test production is explained and suggestions on how it can be adapted for practical classroom use are made, among them task peer-review, task revision based on feedback, and task trialling.
The concepts of test validity and test construct, in their application to testing reading skills. The role of test specifications and quality review procedures in reading test production.
Item writers play a key role in the language test cycle, as they essentially need to operationalise the construct into actual tasks. Often, however, these assessment professionals receive rather narrowly focused training in writing items to a particular set of specifications. Usually, this training is limited to ‘item writing guidelines’ or instr...
Improving language assessment literacy (LAL) of various test stakeholders, particularly teachers and raters, is increasingly viewed as necessary in language testing. However, while many research studies converge and agree there is a need for increased LAL (Harding & Kremmel, 2016; Popham, 2006), the practicalities of actually implementing workable...
Language testing textbooks give limited recommendations on how to conduct item writer training. At the same time, it has been repeatedly emphasized that the quality of test items is of crucial importance to test validity and reliability (Bachman, 1990; Messick, 1996). This poster presented the pilot study of research into the effectiveness of item...
An evidence-based approach to understanding Language Assessment Literacy (LAL) was used to investigate the assessment literacy needs of language test writers and build their LAL profile. This presentation reports on the conceptual-empirical approach to building such a profile with 20 newly-trained test writers. The presentation draws on a combinati...
The presentation draws upon an evidence-based approach to understanding the Language Assessment Literacy (LAL) needs of specific language assessment stakeholders in order to build group-specific profiles that generate targeted LAL development programmes. The presentation reports on the conceptual-empirical approach to building a LAL profile for Chi...
Mainland Chinese citizens form the largest group of overseas students in UK universities, and they largely acquire EAP knowledge/skills in their home country. The Chinese education system is traditionally highly test-driven, with teaching EAP seen as little more than preparing students to take the IELTS Academic test, which makes local teachers' Assessm...