TipTopTalk!: A game to improve the perception and production of L2 sounds

Andreia Rauber¹, Cristian Tejedor-García², Valentín Cardeñoso-Payo²,
Enrique Cámara-Arenas³, César González-Ferreras²,
David Escudero-Mancebo², Anabela Rato⁴
¹ Department of Computational Linguistics, University of Tübingen, Germany
² Department of Computer Science, University of Valladolid, Spain
³ Department of English Philology, University of Valladolid, Spain
⁴ Center for Humanistic Studies, University of Minho, Portugal
Swain’s (1985) Comprehensible Output Hypothesis holds that input alone may not be
enough for second/foreign language (L2) learners to acquire new language forms. The Hypothesis
claims that producing an L2 facilitates L2 learning because of the mental processes involved in
language production. Thus, learners are more likely to notice discrepancies and gaps between
linguistic aspects of their native language (L1) and those of their L2 when producing language than
when only perceiving it.
Taking Swain’s Hypothesis into account, in this talk we will present a Computer Assisted
Pronunciation Training (CAPT) game designed for non-native speakers of Chinese, English, German,
Portuguese (Brazilian and European) and Spanish. The game makes use of the automatic speech
recognition (ASR) and text-to-speech systems available on Android smartphones and tablets to
(i) present learners with the target sounds by means of synthesized stimuli; (ii) test learners’
discrimination of specific L2 sounds that are likely to cause intelligibility problems through
exercises containing minimal pairs; and (iii) allow learners to record their speech and compare
their production with the L2 target. The game provides users with immediate feedback in both
perception and production exercises. In the latter, when the recognizer is unable to identify an
ideal or close-to-ideal response, the user can retry the answer up to five times. The main
disadvantage of ASR-based pronunciation training is erroneous feedback, i.e., the possibility of
false alarms and false accepts (Neri et al., 2006).
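The retry behaviour of the production exercise can be illustrated with a minimal sketch. It is not the game's actual code: the `recognize` callable stands in for the Android recognizer, and the rule used here ("ideal" when the target is the top hypothesis, "close" when it appears lower in the n-best list) is only an assumed reading of "ideal or close-to-ideal". The sketch also assumes that "retry up to five times" means one first attempt plus at most five retries.

```python
from typing import Callable, List, Optional, Tuple

MAX_RETRIES = 5  # from the abstract: the answer can be retried up to five times


def judge(target: str, hypotheses: List[str]) -> Optional[str]:
    """Classify one recognizer result (assumed rule, for illustration only).

    Returns "ideal" if the target word is the top hypothesis, "close" if it
    appears lower in the n-best list, and None if it was not recognized.
    """
    word = target.strip().lower()
    normalized = [h.strip().lower() for h in hypotheses]
    if normalized and normalized[0] == word:
        return "ideal"
    if word in normalized:
        return "close"
    return None


def production_exercise(
    target: str,
    recognize: Callable[[], List[str]],  # hypothetical stand-in for the ASR call
    max_retries: int = MAX_RETRIES,
) -> Tuple[Optional[str], int]:
    """Ask for the target word until it is accepted or the retries run out.

    Returns the verdict of the last attempt and the number of attempts used.
    """
    total_attempts = 1 + max_retries  # first try plus the allowed retries
    for attempt in range(1, total_attempts + 1):
        verdict = judge(target, recognize())
        if verdict is not None:
            return verdict, attempt
    return None, total_attempts


# Simulated session for the minimal pair ship/sheep: the first attempt is
# misrecognized, the second contains the target lower in the n-best list.
simulated = iter([["sheep"], ["sheep", "ship"]])
result = production_exercise("ship", lambda: next(simulated))
```

In this simulated session `result` is `("close", 2)`. A false accept in the sense of Neri et al. (2006) would correspond to `judge` returning a verdict for a mispronounced word, and a false alarm to it returning `None` for a correct one.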
To encourage users’ engagement and desire to keep playing the game, each correct
answer earns points that count towards a given game status. Moreover, different
language-dependent leaderboards can be displayed at the end of each round. The advantages of
using a gamification design strategy are (i) increased learner engagement, and (ii) the
possibility of individualized and comprehensive feedback while keeping users active and
comfortable progressing at their own pace in an anxiety-free context.
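The points, game status and language-dependent leaderboards described above could be modelled roughly as follows. This is a sketch under stated assumptions: the point value, the status names and thresholds, and the `Scoreboard` class are all hypothetical, since the abstract does not specify them.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

POINTS_PER_CORRECT = 10  # hypothetical value, not given in the abstract
# Hypothetical statuses, ordered by the minimum points they require.
STATUS_THRESHOLDS = [(0, "beginner"), (50, "intermediate"), (100, "expert")]


class Scoreboard:
    """Keeps one set of scores per L2, so each language has its own ranking."""

    def __init__(self) -> None:
        # language -> user -> accumulated points
        self._points: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))

    def record_answer(self, language: str, user: str, correct: bool) -> int:
        """Add points for a correct answer and return the user's new total."""
        if correct:
            self._points[language][user] += POINTS_PER_CORRECT
        return self._points[language][user]

    def status(self, language: str, user: str) -> str:
        """Return the highest status whose threshold the user has reached."""
        points = self._points[language][user]
        reached = [name for minimum, name in STATUS_THRESHOLDS if points >= minimum]
        return reached[-1]

    def leaderboard(self, language: str, top: int = 3) -> List[Tuple[str, int]]:
        """Return the best users for one language, highest score first."""
        ranking = sorted(self._points[language].items(), key=lambda kv: (-kv[1], kv[0]))
        return ranking[:top]
```

Keeping a separate table per language mirrors the "language-dependent" leaderboards of the abstract: a learner of English and a learner of Spanish are never ranked against each other.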
References
Neri, A., Cucchiarini, C., & Strik, H. (2006). Selecting segmental errors in L2 Dutch for optimal pronunciation training. International Review of Applied Linguistics in Language Teaching, 44, 357-404.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input and comprehensible output in its development. In S. Gass & C. Madden (Eds.), Input in Second Language Acquisition (pp. 235-256). New York: Newbury House.