Videoconferencing offers new opportunities for language testers to assess speaking ability in low-stakes diagnostic tests. To be trusted as a language testing tool, a test should be examined through appropriate validation processes (Chapelle, Jamieson, & Hegelheimer, 2003). While developing a speaking test, language testers need to gather evidence to build a validity argument with theoretical rationales. These rationales should be based on test purpose and validation considerations that affect decision making on test design and validation (Chapelle, 2001). To obtain theoretical soundness in validation, spec-driven test development (Davidson & Lynch, 2002) was applied to speaking test development. Experimental tests were carried out with 40 test takers using face-to-face and videoconferenced oral interviews. Findings indicated no significant difference in performance between test modes, either overall or across analytic scoring features. Qualitative data also supported the comparability of the videoconferenced and face-to-face interviews in terms of comfort, computer familiarity, environment, non-verbal linguistic cues, interests, speaking opportunity, and topic/situation effects, with little interviewer effect. Data taken from test spec evolution, test scores, post-interviews, and observations were analyzed to build a validity argument using Bachman and Palmer's (1996) usefulness analysis table. The collected evidence suggests that the videoconferenced interview was comparable to the face-to-face interview with respect to reliability, construct validity, authenticity, interactiveness, impact, and practicality.

References

Bachman, L.F., & Palmer, A.S. (1996). Language testing in practice. Oxford: Oxford University Press.
Chapelle, C. (2001). Computer applications in second language acquisition: Foundations for teaching, testing, and research. Cambridge: Cambridge University Press.
Chapelle, C.A., Jamieson, J., & Hegelheimer, V. (2003). Validation of a web-based ESL test. Language Testing, 20, 409–439.
Davidson, F., & Lynch, B. (2002). Testcraft: A teacher's guide to writing and using language test specifications. New Haven, CT and London: Yale University Press.