Figure 1 - uploaded by Sylviane Granger
Source publication
Based on a corpus of 223 argumentative essays written by English as a foreign language learners, this study shows that spelling errors, whether detected manually or automatically, are a reliable predictor of the quality of L2 texts and that reliability is further improved by sub-categorising errors. However, the benefit derived from sub-categorisat...
Contexts in source publication
Context 1
... second stage in the analysis involved comparing the results obtained for the total number of spelling errors with those obtained when sub-categorisation is introduced. The frequency of the different error categories in the learner corpus is shown in Figure 1 (N = 1,549). The figure shows that Many represents more than 23% of the errors and SplitW more than 18%. ...
Context 2
... frequency of multi-letter errors has also been underlined by Rimrott and Heift (2008) for learners of German. Figure 1 also brings out a sizeable proportion of doubling errors (Doub12 and Doub21). Although this is a classic spelling difficulty in English which affects both native and L2 writers (cf. ...
Similar publications
Communication disorder is a big challenge for school children. This study aims to identify the students with communication disabilities among primary children (Grades 3-5) at Km/Al-Mina Vidyalaya, Nintavur. Data were collected through interviews and tests from 35 students and 3 teachers. Data were analyzed with MS Word 2007 and MS Excel 2007. This...
Citations
... Vajjala (2018) used two datasets of non-native English essays written in test-taking scenarios and reported a prediction accuracy of 73% in distinguishing between three proficiency levels (low, medium and high). One promising but as yet largely unexplored research direction is that of tailoring the automatic/automated systems to the native language of the writers (Leacock et al., 2015; Bestgen & Granger, 2011). ...
The aim of this article is to survey the field of learner corpus research from its origins to the present day and to provide some future perspectives. Key aspects of the field — learner corpus design and collection, learner corpus methodology, statistical analysis, research focus and links with related fields, in particular SLA, FLT and NLP — are compared in first-generation LCR, which extends from the late 1980s to 2000, and second-generation LCR, which covers the period from the early 2000s until today. The survey shows that the field has undergone major theoretical and methodological changes and considerably extended its range of applications. Future developments that are likely to gain ground are grouped into three categories: increased diversity, increased interdisciplinarity and increased automation.
... Thorough error analyses provide valuable insights that can predict the quality of learner texts (e.g. Bestgen & Granger, 2011) and aid in the development of NLP algorithms designed to detect common learner errors (Higgins et al., 2015: 590). ...
... The built-in spell checker is probably one of the most common text editing tools featured in today's word processors. It has transformed and considerably increased the efficiency of spelling error detection and correction (Bestgen & Granger, 2011), constituting a 'proofing tool' that users frequently rely upon (Pan et al., 2021). Generic spell checkers (those designed for native (L1) writers, such as the spell checkers built into Microsoft Word and Google Docs) have become increasingly sophisticated, and are now widely distributed and used across different school subjects. ...
... When the spell check function is turned on, the software automatically detects and displays various spelling errors and, accordingly, provides immediate feedback in the form of corrections and alternative spelling suggestions. However, generic spell checkers are not 'foolproof', and their use may result in different types of errors, both human initiated and computer initiated (Bestgen & Granger, 2011; Musk, 2016, 2021). This suggests that the use of spell check software can also sometimes constrain students' writing. ...
... In examining the quality of spelling errors, research indicates that correction rates appear to vary between different spell checkers (e.g., commercial and non-commercial). Generic spell checkers are better adapted to detect and correct single-letter mistypings than lexical misspellings, due to greater target deviation in the latter (Bestgen & Granger, 2011). Generic spell checkers are further limited and less adapted to meet the needs of nonnative writers, as shown for instance among German (Rimrott & Heift, 2008) and Arabic-speaking (Saigh & Schmitt, 2012) L2 (second language) students. ...
This study focuses on the distribution of agency in software-based spell checking in L1 (Language and Literature) teaching. Drawing on video-ethnographic data from a Swedish-medium school in Finland, the research shows that built-in spell checkers can both afford and constrain students’ digital writing. Through examining the micro-dynamics between human and material agency in use of spell checking, the analysis illustrates that the software does not always work as expected from the user’s perspective, and hence becomes framed as a ‘trouble source’, assigned ‘linguistic authority’, and held accountable for not meeting human intentionality. We argue that technology’s inherent functions and properties play a central role in the co-constitution of agency in digital writing practices, and call for a greater awareness of generic spell checkers’ opportunities and limitations in teaching and learning.
... The use of some kind of learner corpus is definitely a must in order to identify typical learner errors that can be used and explained in didactic language tools. These corpora have been used in one way or another to develop numerous writing tools, as discussed by Bestgen and Granger (2011), Paquot (2012), Wanner, Verlinde and Alonso-Ramos (2013), Alonso-Ramos and García-Salido (2019), Frankenberg-García, Lew, Roberts, Rees and Sharma (2019), and Granger and Paquot (2022). The best type of corpus for this purpose is undoubtedly a tagged corpus with parallel correction of the errors detected, such as the Spanish one described by Davidson, Yamada, Fernández-Mira, Carando, Sánchez-Gutiérrez and Sagae (2020). ...
... Research in the field of second-language acquisition has found evidence of phoneme-shift based misspellings stemming from L1 influence in L2 text for specific language pairs (Ibrahim, 1978; Cook, 1997; Bestgen and Granger, 2011; Sari, 2014; Ogneva, 2018; Motohashi-Saigo and Ishizawa, 2020). Studies in Natural Language Understanding (NLU) have been limited to spelling correction Nagata et al. (2017); Flor et al. (2019) and native language identification Chen et al. (2017); Nicolai et al. (2013) in English learners. ...
... There has also been a fair amount of interest in the second-language acquisition field on the influence of L1 on L2 spelling. Ibrahim (1978); Cook (1997); Bestgen and Granger (2011); Sari (2014); Ogneva (2018); Motohashi-Saigo and Ishizawa (2020) all find evidence of such influence in specific language pairs. These often stem from the lack of certain sounds in L1 leading to difficulty in distinguishing similar sounds in L2. ...
A large number of people are forced to use the Web in a language they have low literacy in due to technology asymmetries. Written text in the second language (L2) from such users often contains a large number of errors that are influenced by their native language (L1). We propose a method to mine phoneme confusions (sounds in L2 that an L1 speaker is likely to conflate) for pairs of L1 and L2. These confusions are then plugged into a generative model (Bi-Phone) for synthetically producing corrupted L2 text. Through human evaluations, we show that Bi-Phone generates plausible corruptions that differ across L1s and also have widespread coverage on the Web. We also corrupt the popular language understanding benchmark SuperGLUE with our technique (FunGLUE for Phonetically Noised GLUE) and show that SoTA language understanding models perform poorly. We also introduce a new phoneme prediction pre-training task which helps byte models to recover performance close to SuperGLUE. Finally, we also release the FunGLUE benchmark to promote further research in phonetically robust language models. To the best of our knowledge, FunGLUE is the first benchmark to introduce L1-L2 interactions in text.
... Words absent from the corpus are highlighted, followed by either a list of alternatives or a correction executed by the software (Mitton, 2010). While this method's detection rate is higher than 80% (Bestgen & Granger, 2011; Blázquez-Carretero & Fan, 2019), it only pertains to single-word errors, preventing GSCs from identifying context-specific mistakes should the misspelling correspond to an existent word (Blázquez-Carretero & Fan, 2019). As GSCs are L1-oriented, their built-in autocorrect and feedback mechanism is grounded on the notion that spelling errors are performance-based and typically involve single-letter violations. ...
In 2016, Lawley proposed an easy-to-build spellchecker specifically designed to help second language (L2) learners in their writing process by facilitating self-correction. The aim was to overcome the disadvantages to L2 learners posed by generic spellcheckers (GSC), such as that embedded in Microsoft Word. Drawbacks include autocorrection, misdiagnoses, and overlooked errors. With the aim of imparting explicit L2 spelling knowledge, this correcting tool does not merely suggest possible alternatives to the detected error but also provides explanations of any relevant spelling patterns. Following Lawley’s (2016) recommendations, the present study developed a prototype computer-based pedagogic spellchecker (PSC) to aid L2 learners in self-correcting their written production in Spanish. First, a corpus was used to identify frequent spelling errors of Spanish as a foreign language (SFL) learners. Handcrafted feedback was then designed to tackle the commonest misspellings. To subsequently evaluate this PSC’s efficacy in error detection and correction, another learner Spanish corpus was used. Sixty compositions were analysed to determine the PSC’s capacity for error recognition and feedback provision in comparison with that of a GSC. Results indicate that the PSC detected over 90% of the misspellings, significantly outperforming the GSC in error detection. Both provided adequate feedback on two out of three detected errors, but the pedagogic nature of the former has the added advantage of facilitating self-learning (Blázquez-Carretero & Woore, 2021). These findings suggest that it is feasible to develop spellcheckers that provide synchronous feedback, allowing SFL learners to confidently self-correct their writing while saving time and effort on the teacher’s part.
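The dictionary-lookup detection described in the excerpt above (words absent from the checker's word list are flagged) and its blind spot for misspellings that happen to be existing words can be illustrated with a minimal Python sketch. The word list and the function name are hypothetical; they do not come from the cited studies or from any particular spell checker.

```python
import re

# Hypothetical toy word list; a real checker would load a full lexicon.
LEXICON = {"the", "form", "from", "was", "sent", "letter", "a", "friend"}


def flag_non_words(text, lexicon=LEXICON):
    """Return the tokens of `text` that are absent from the lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [tok for tok in tokens if tok not in lexicon]


# "leter" is not in the lexicon, so it is flagged as a non-word error.
print(flag_non_words("The leter was sent from a friend"))

# "form" is an existing word, so the real-word error (for "from") is missed.
print(flag_non_words("The letter was sent form a friend"))
```

Because each token is checked in isolation, the second sentence passes without any flag, which is the single-word limitation the excerpt attributes to generic spell checkers.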
... As regards spelling accuracy, we found that better performance was related to a greater number of words, argument overlap and lexical diversity, as well as to fewer spelling errors in the composition, number of pauses and noun incidence. In the literature, some studies have shown that spelling errors are a good predictor of the quality of L2 written compositions (Bestgen et al., 2011; Harrison et al., 2016). Finally, picture naming accuracy was found to correlate positively with number of words, sentence length, connector incidence, noun overlap, argument overlap, stem overlap and lexical diversity and negatively with spelling errors and noun incidence. ...
Different studies have demonstrated that people with dyslexia have difficulties in acquiring fluent reading and writing. These problems are also evident when they learn a second language. The aim of our study was to investigate if there is a linguistic transfer effect for writing in children with dyslexia when they face tasks in English (L2), as well as the possible influence of other linguistic skills (spelling, vocabulary and reading) in English (L2) and in Spanish (L1). Participants completed a series of tasks both in Spanish and English: a picture naming task, a word reading task, a word spelling task, and a written composition of which we analysed its quality through different variables provided by the Coh-metrix software. Our results revealed that children with dyslexia show similar or parallel performance in written composition in both languages, which could imply a language transfer effect from L1 and L2. Besides, basic language skills are related to the characteristics of written composition to a greater extent in English than in Spanish, suggesting the impact of these on the quality of written composition.
... One such study is Bestgen and Granger (2011), which identifies types of spelling error. Bestgen and Granger assessed 223 essays written by non-native (L2) learners. ...
The aims of this study are to identify the types of spelling errors made by ninth-grade students of SMPN 1 Sumbawa by examining their writing and to determine the causes of those errors. The study draws on the framework of Bestgen and Granger (2011), whose article “Categorizing Spelling Error to Assess L2 Writing” discusses categories of spelling error. The research takes a descriptive qualitative approach: the researchers collected data from students' writing tasks and found spelling errors in three word classes, namely adjectives, nouns and verbs. They identified six types of spelling error: addition, omission, substitution, transposition, word segmentation and multiple-letter errors. They also identified three main causes: the influence of Indonesian spelling, lack of vocabulary, and spelling difficulties.
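To make the six error types named in the abstract above concrete, the following Python sketch assigns one label to a pair consisting of a misspelling and its intended form. The decision rules (trimming the shared prefix and suffix, then inspecting what remains) and the function names are assumptions made for illustration; they are not the coding scheme used in the cited studies.

```python
def _without_breaks(word):
    """Drop spaces and hyphens so that only the letters are compared."""
    return word.replace(" ", "").replace("-", "")


def classify_spelling_error(error, target):
    """Assign one coarse label to an (error, target) pair (assumed rules)."""
    if error == target:
        return "correct"
    # Word segmentation: the forms differ only in spacing or hyphenation.
    if _without_breaks(error) == _without_breaks(target):
        return "word segmentation"
    # Trim the shared prefix and suffix to isolate the differing middle part.
    shortest = min(len(error), len(target))
    start = 0
    while start < shortest and error[start] == target[start]:
        start += 1
    end = 0
    while (end < shortest - start
           and error[len(error) - 1 - end] == target[len(target) - 1 - end]):
        end += 1
    err_mid = error[start:len(error) - end]
    tgt_mid = target[start:len(target) - end]
    if len(err_mid) == 1 and tgt_mid == "":
        return "addition"
    if err_mid == "" and len(tgt_mid) == 1:
        return "omission"
    if len(err_mid) == 1 and len(tgt_mid) == 1:
        return "substitution"
    if len(err_mid) == 2 and err_mid == tgt_mid[::-1]:
        return "transposition"
    return "multiple letters"


for pair in [("untill", "until"), ("adress", "address"),
             ("definately", "definitely"), ("recieve", "receive"),
             ("alot", "a lot"), ("necesery", "necessary")]:
    print(pair, "->", classify_spelling_error(*pair))
```

A scheme like this only covers errors with a known target form; deciding what the learner intended to write is a separate annotation step that the sketch leaves out.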
... Actual error words: The spelling-error categories handled in this study are based on data [14] analyzed in the field of linguistics. Herein, among the error categories, quotation code errors and spacing errors are ignored because the frequencies of these errors are low in the case of citation codes and because their criteria for use may vary. ...
In this study, we aim to automatically construct a test dataset for testing the performance of spelling error correction systems. The Google Web 1T corpus, which includes data on 10 quadrillion phrases, is used for this purpose. The error words in the test dataset are therefore ones produced by real web users. There are seven types of error words. To obtain an error word, we search for the set of words that co-occur with the surrounding context (3-gram range) of the position where the error word is to be generated. In this calculation, we exclude error words with wide edit distances, which make recovering the original word exceedingly difficult. To select the final error word from the word set, we calculate the context probability using 3-grams and choose a word with a high value. In the experiment, performance was measured for two systems in service (Grammarly, MS Word) and the recently announced spelling error correction system Neuspell. The highest F1 score, which reflects overall performance, was 56%, indicating the need for further research on spelling errors.