Article

Getting the corpus habit: EAP students’ long-term use of personal corpora


Abstract

This paper reports on the long-term use of personal do-it-yourself corpora by students of EAP. Forty international graduate students attended a course in which they built and examined their own corpora of research articles in their field. One year after the course, they completed an email questionnaire, which asked about their corpus use in the 12 months since the end of the course. Results show that 70% of the respondents had used their corpus: 38% were regular users (once per week or more), 33% irregular users (once per month or seldom) and 30% non-users. Most users consulted the corpus for checking grammar and lexis while composing and revising and 93% of them considered that corpus use had improved their academic writing. Reasons for non-use included the small size of the corpus and its lack of reliability and convenience. Case studies of a user and a non-user are presented and highlight two other factors likely to affect take-up: the individual’s writing process and the focus of their current writing concerns. The paper discusses the reasons behind long-term use of personal corpora and some of the challenges to be overcome in extending the approach more widely.

... Furthermore, after being introduced to the basic functions of corpora, learners can use corpora as a reference/editing tool without receiving any comments or error correction from their teachers (Charles, 2012, 2014; Kennedy & Miceli, 2010; Yoon & Hirvela, 2004). Kennedy and Miceli (2010), for example, designed a longitudinal training program and found that their students were able to use various search functions when facing collocation difficulties, and subsequently built their own corpus for later consultation. ...
... Studies have also revealed that students hold positive attitudes and perceptions towards corpus use (Charles, 2012, 2014; Crosthwaite, 2017; Yoon & Hirvela, 2004). Charles (2012) investigated 40 English for Academic Purposes (EAP) students' evaluations of using their self-created corpora to solve collocation errors and found that 70% of the students wanted to use corpora in their EAP studies over the long term. ...
... However, most research on DDL for improving students' use of collocations and word choice in their writing was conducted at the university level or above (Charles, 2012, 2014; Crosthwaite, 2017; Kennedy & Miceli, 2010; Wu, 2016; Yoon & Hirvela, 2004). Boulton and Cobb (2017) conclude that "unfortunately, there is little [DDL] research with high school learners" (p. ...
Article
Corpus tools are known to be effective in helping L2 learners improve their writing, especially regarding their use of words. Most corpus-based L2 writing research has focused on university students while little attention has been paid to secondary school L2 students. This study investigated whether senior secondary school students in China, upon receiving corpus-based training under the framework of data-driven learning (DDL), could improve their vocabulary use, especially the use of collocations, in their writing for the International English Language Testing System (IELTS) test. Twenty-two students aged 16–18 in a senior secondary school in Nanchang, China who were planning to take the IELTS exam participated in the study. Corpus of Contemporary American English (COCA) and Word and Phrase were the main corpora that the participants used to learn various search functions. Pre-writing and post-writing tests were administered to measure the effect of corpus training. In addition, a questionnaire and interviews were used to collect students’ perspectives and attitudes. The results indicate that students made improvement in word selection after three corpus training sessions, and their attitudes towards corpus use were positive even though they were restricted from using computers to access corpora inside their school.
... doi:10.1017/S0958344021000057 (Chambers, 2007; Charles, 2014; Boulton & Cobb, 2017). With the advancement of DDL research and practice, many researchers have recognized the need to investigate how corpus tools are actually used by learners in the long term to enhance our understanding of the efficacy of corpus use in language learning and teaching (Charles, 2014; Chambers, 2007; Crosthwaite, 2019; Hafner & Candlin, 2007; Horst et al., 2005; Kennedy & Miceli, 2017; Johns, 1997; Pérez-Paredes et al., 2011). ...
... Explicitly, learners in the Dechert study demonstrated a predilection for noun + of + noun and noun + preposition + noun, adjective + preposition + noun patterns that have been acknowledged as essential for achieving grammatical complexity and textual density in academic writing (Biber et al., 2011; Halliday 1993). However, further studies have shown that these same patterns are consistently under-utilized in EAP student writing (Liu, 2011; Parkinson & Musgrave, 2014). ...
Article
Corpus consultation with concordancers has been recognized as a promising way for learners to study and explore language features such as collocations at their own pace and in their own time. This study examined 1.5 million search queries sent to a collocation consultation tool called FlaxCLS (Flexible Language Acquisition Collocation Learning System; http://flax.nzdl.org) over a period of two years to identify learners’ collocation look-up patterns. This paper examines and characterizes learners’ look-up patterns as they entered search queries, clicked on the query formation aids provided by the system, and navigated through the different levels of collocation information returned by the system to support collocation learning. We looked at how learners formulated query terms, and we analyzed the characteristics of query words learners entered, the characteristics of collocations they preferred, and the sample sentences they checked. Our collocation look-up pattern analyses, similar to traditional user query analyses of the web, provide interesting and revealing insights that are hard to obtain from small-scale user studies. The findings provide valuable information and pedagogical implications for data-driven learning (DDL) researchers and language teachers in designing tailored collocation consultation systems and activities. This paper also presents multidimensional analyses of learner query data, which, to the best of our knowledge, have not been explored in DDL research.
... Similarly, there exists a growing line of classroom-based EAP research which builds on Data-Driven Learning (DDL; Johns, 1986) approaches to explore the pedagogical value of integrating corpus and genre analysis based pedagogical activities for the development of genre knowledge (e.g., Cai, 2016; Charles, 2007, 2011, 2014; Chen & Flowerdew, 2018; Dong & Lu, 2020; Eriksson, 2012; Lee & Swales, 2006). Taken together, studies such as these have important implications for the analysis of writing practices and writing instruction, as they highlight the highly domain-specific nature of form-function mappings and constructions, as well as the potential value in integrating corpus and genre analysis based instructional approaches into instructional settings. ...
... Linguistic features and patterns are taught in terms of their roles in realizing writers' rhetorical goals. Students have reported positive impressions of this approach in numerous studies (Charles 2011, 2012, 2014). ...
... Charles has continued to expand her pedagogical model, demonstrating the value of students constructing and analyzing their own personal corpora (Charles, 2012) and their continued use of corpus research techniques even after the study window ended (Charles, 2014). Others have taken up similar approaches, though the results have not been as consistently positive. ...
Thesis
Research on domain-specific genre practices and related genre-based pedagogies has increasingly integrated corpus-based approaches to linguistic form with rhetorical approaches to writers’ functional aims. This study draws together these closely related lines of research to bridge genre-based writing research with genre-based writing pedagogy. Informed by the English for Specific/Academic Purposes tradition of Genre Theory and a Sociocultural Theory orientation to human cognition and development, the study consists of two interdependent phases: a corpus-based genre analysis phase and a corpus and genre analysis-based pedagogical intervention phase. In the first phase, a set of formal linguistic features (reporting verbs, shell nouns, phrase-frames, and select measures of syntactically complex structures) are analyzed across the rhetorical stages of 400 published research article introductions from two social science disciplines and two engineering disciplines. The texts are manually annotated for rhetorical moves, the analysis of linguistic features is conducted with a series of manual processes, custom scripts, and automated corpus tools, and the distribution of formal features across rhetorical stages is analyzed and presented through a combination of quantitative and qualitative approaches. These analyses are directed towards the second phase, in which a Concept-Based pedagogical intervention is carried out in two sections of a doctoral academic writing course for second language writers that integrates genre and corpus analysis as two integrated class activities. The impacts of this pedagogical approach on the developing genre knowledge of graduate student participants are analyzed through a Vygotskian genetic approach to development, and learner perceptions are also reported.
... Based upon the recommendations of prior studies regarding user made corpora for specific academic purposes (e.g. Charles, 2014; Hafner & Candlin, 2007), this project aimed to construct a corpus of legal English to facilitate the learning of non-native speakers. To address the two dimensions of corpus quality suggested by Lewis (1997), the documents used to develop the corpus need to contain authentic legal English jargon, and be in sufficient quantity to produce ...
... Whereas many other corpus analysis programs contain only a concordancing tool, AntConc also contains additional useful tools for examination, such as collocation and chunking. Prior studies involving academic use of corpora have also used AntConc (Charles, 2014;Csomay & Petrović, 2012). ...
... Similar to prior studies about corpora designed for specific academic purposes (e.g. Charles, 2014; Hafner & Candlin, 2007), more advanced corpus construction techniques, such as tagging, were not used for this project. Because this corpus is freely available for educational purposes, any interested user may utilize parsing tools to ...
Article
While corpus linguistics has been applied towards many specific academic purposes, reports are few regarding its use to facilitate learning of legal English by non-native English speakers. Specialized corpora are required because legal English often differs significantly from ordinary usage, with words such as bar, motion, and hearing having completely different meanings and use. This paper documents the process of creating and validating a sixteen million-word corpus of (American) legal English, and provides examples of analyses available for language learners. Written decisions and oral argument transcripts from the U.S. Supreme Court and other appellate courts were ultimately chosen to comprise the corpus due to their authentic and comprehensive use of legal jargon. Overall, this corpus demonstrates that appellate court decisions, available online, can comprise a corpus tailored for legal English learning.
... Most respondents used their corpora for checking grammar and vocabulary when writing. Charles (2014) also hypothesised that students' degree level and disciplinary areas are factors influencing the self-directed use of DIY corpora. Her findings revealed, however, no connection between students' disciplinary areas and self-directed corpus use, but suggested that there is a relation between degree level and the use of DIY corpora for self-directed learning as a higher proportion of doctoral students were users of DIY corpora than Master's students. ...
... Moreover, engaging in compilation of a DIY corpus gives students important insights into what constitutes a language corpus, the importance of its structure, balance and limits of its representativeness. Many authors (Zanettin, Bernardi, & Stewart, 2003;Yoon & Hirvela, 2004;Gavioli, 2005;Charles, 2014) emphasize the need for students to go through a form of 'apprenticeship' in order to become successful corpus users. Compilation of their own DIY corpus, all the considerations and decisions required to accomplish such a task, as well as familiarization with the corpus analysis tools, help students to gain insights into linguistic varieties and to process various aspects of data mining that can empower their subsequent research and interpretation of corpus data. ...
... Acquiring basic corpus compilation principles and skills will further increase students' learning independence, giving them the opportunity to build new DIY corpora for various specialism domains when such a need arises. By extending their specialist language range, learners acquire a competence, which is nowadays often required from translators, as well as from teachers and other professionals dealing with languages (Gavioli, 2005;Charles, 2014). ...
Chapter
Can EFL students be profitably introduced to compilation of DIY corpora for various ESP domains even at the undergraduate level? How can they benefit from self-directed exploitation of language corpora at such an early stage? What language skills can a corpus-based ESP course enhance? This chapter discusses the advantages and limitations of a structured approach to pedagogical corpus consultation and corpora as self-directed learning tools as applied in an innovative corpus-based ESP course. A multifaceted enquiry of students' assessment and perception of the course - initial feedback, questionnaire, and focus group – was conducted. Results indicate that although students perceived corpus use as a complex activity, their attitude to the corpus approach was positive and they recognised the benefits of corpora as self-directed tools. Suggestions for further improvements of such practices are also discussed.
... The main purpose of this study is to examine the University researchers' evaluation of corpus-consultation techniques and the uptake of such practice immediately after the training sessions, and both two months and two years after the completion of the training initiative. This research, together with Charles' (2014), is one of the few studies that makes use of mixed methods to explore a delayed evaluation of corpus resources. We set out to contribute to the debate about how to implement Data-Driven Learning (DDL) pedagogy in the context of Higher Education (HE) and, specifically, how University researchers may benefit from DDL in training initiatives across universities in Spain and beyond. ...
... Different research efforts have pioneered the use of corpus linguistics methods in the description of general English, scientific language and academic writing (Altenberg 1987; Bathia et al. 2011; Boulton et al. 2012). The Create a Research Space (CARS) model (Swales 1990) has been applied, for example, to the research in different moves and steps in PhD literature reviews (Flowerdew and Forest 2010), the use of research articles corpus consultation (Charles 2014) or the analysis of the use of corpus-based research (Flowerdew 2016). Phraseology has been widely investigated in English for Academic Purposes (EAP) (Staples et al. 2013) and corpus studies have confirmed the strong link between academic (sub)registers and lexical choice (Staples et al. 2016). ...
... This is evidenced by the fact that four of the five participants interviewed had made no use of online corpus consultation during the two-year period after completing the course. Consequently, our results do not seem to confirm Yoon's (2008) or Charles' (2014) findings that contact with corpus consultation affects the writing habits of researchers, although this finding may be taken cautiously, as only 20 percent of the participants agreed to be interviewed. Similarly, our results confirm that the stated level of English communicative competence played no significant role in the researchers' assessing of the usefulness of the different resources available (RQ1). ...
Article
This paper examines the introduction and use of corpus consultation in the course of a training initiative sponsored by the Professional Training Unit of a medium-sized University in Spain. 'Introducing Research Articles (RA) Writing' was a 12-hour module that offered researchers the opportunity to gain insight into the nature of the research articles (RA) across different disciplines. The researchers (n=25) and the instructors met three times in two-hour sessions during a two-month period. All participants completed two post-task questionnaires and a delayed questionnaire. An interview was completed two years after the end of the course. After task 2, 64 percent of the participants found corpus tools to be of great help when writing their research articles. No significant differences between B1 and B2-C1 groups were found in their assessment of the writing tools provided. Increased familiarity with the corpus tools did not result in a better appraisal of these resources and all participants seemed to favour the use of the curated list of vocabulary provided. The delayed questionnaire and subsequent delayed interviews (n=5) revealed that the use of corpora had had limited or no impact on the writing practices of these researchers. We argue that the use of corpora in professional writing contexts requires careful planning as well as continued institutional support.
... Charles, 2015; Flowerdew, 2012), the learners' perceptions of DDL (e.g. Charles, 2014; Geluso & Yamaguchi, 2014), and factors influencing its implementation (e.g. Cotos, 2014). ...
... Cotos (2014), a frequently co-cited publication in Cluster #0 (with a co-citation frequency of 7), focused on the role of corpora in students' language learning by comparing their interactions with a local learner corpus and a native-speaker corpus. Charles (2014), also co-cited seven times in Cluster #0, conducted qualitative research on the use of self-built corpora from a longitudinal perspective. Pérez-Paredes, Sánchez-Tornel and Calero (2012), a frequently co-cited publication in Cluster #4 (co-cited five times), examined learners' search strategies in DDL activities. ...
Article
This study employs a bibliometric approach to analyse common research themes, high-impact publications and research venues, identify the most recent transformative research, and map the developmental stages of data-driven learning (DDL) since its genesis. A dataset of 126 articles and 3,297 cited references (1994-2021) retrieved from the Web of Science was analysed using CiteSpace 6.1.R2. The analysis uncovered the principal research themes and high-impact publications, and the most recent transformative research in the DDL field. The following evolutionary stages of DDL were determined based on Shneider's (2009) scientific model and the timeline generated by CiteSpace, namely, the conceptualising stage (1980s-1998), the maturing stage (1998-2011), and the expansion stage (2011-now), with Stage 4 just emerging. Finally, the analysis discerned potential future research directions, including the implementation of DDL in larger-scale classroom practice and the role of variables in DDL.
... Lu, Casal, and Liu (2020), for example, illustrated how the use of syntactically complex sentences in a corpus of social science RA introductions, defined as those ranked among the top quartile in five measures of syntactic complexity, varied significantly across different rhetorical move-steps. On the pedagogical side, a growing number of studies have also demonstrated the pedagogical usefulness of research insights into the connections between rhetorical functions and their linguistic realizations in promoting novice academic writers' knowledge of such connections (e.g., Casal, 2020; Charles, 2014, 2018; Chen & Flowerdew, 2018; Cotos, Huffman, & Link, 2020; Dong & Lu, 2020; Mizumoto, Hamatani, & Imao, 2017). ...
... Finally, the results on the associations between p-frames and rhetorical move-steps provide information on discipline-general as well as discipline-specific form-function mappings in social science RA introductions. These findings can inform the design of classroom genre analysis activities (e.g., Casal, 2020; Charles, 2014, 2018; Chen & Flowerdew, 2018; Dong & Lu, 2020) to help cultivate novice writers' awareness of the rhetorical and phraseological strategies commonly employed in RA introductions by expert writers across social science disciplines or in a specific social science discipline. ...
Article
This study investigated variation in the rhetorical and phraseological features of research article introductions among five social science disciplines. Our dataset consisted of the introduction sections of 500 published research articles from Anthropology, Applied Linguistics, Political Science, Psychology, and Sociology. All texts in the dataset were manually annotated for rhetorical moves and steps by a team of seven researchers using an extensively adapted version of Swales’ (2004) revised Create a Research Space (CARS) model. Our rhetorical and phraseological analysis of the corpus revealed substantial disciplinary variation in both the distribution of rhetorical move-steps and the associations between phrase-frames and rhetorical move-steps among the five social science disciplines. Our findings contribute to a better understanding of disciplinary variation in the rhetorical and linguistic features of research article writing and have useful implications for academic writing research and pedagogy.
... Participants also foresee a link between corpus work and student autonomy, a point that is recurrent in the literature (e.g. ASTON, 2011; CHARLES, 2014; GAVIOLI, 2009). As Example 17 indicates, the reference to student autonomy development is not restricted to those from a language-oriented educational background; it is also made by participants with degrees in other fields. ...
... While it might be more challenging to address the question about students' interest, there is plenty of evidence in the literature that students are able to profit from data-driven learning (e.g. BOULTON, 2012; CHARLES, 2014; TODD, 2001). ...
Article
Previous studies on the application of corpus linguistics (CL) to education have primarily examined language-related contexts where students are pursuing a formal degree (e.g. undergraduate and Master’s programs). Little do we know about the informal learning of CL especially by (but not limited to) academics/professionals who are not educated and/or do not work in language-oriented fields. The present study addresses these research gaps by examining the perspective of participants in a non-credit-bearing continuous professional development (CPD) project aimed at academics/professionals in a range of disciplines, who did not need to have any prior knowledge of CL. More specifically, we administered a questionnaire to 28 participants of a UK-based CPD project on CL with a view to researching four main aspects: (i) these participants’ CL background; (ii) their motivations to participate in this type of project; (iii) the advantages and barriers of employing CL in their teaching practice; and (iv) their appraisal of corpus analysis integration in their research practice. The results point to the role of CPD projects in democratizing access to CL education both to language-oriented and non-language oriented academics/professionals and in potentially raising their interest in CL learning. Lack of knowledge is perceived to be the main barrier in embedding corpus approaches to teaching and research, thus reinforcing the relevance of developing formal and informal CL learning opportunities for academics/professionals in different fields. Keywords: corpus linguistics; continuous professional development; educational corpus integration; evaluation of corpus use in professional practices; corpus application to teaching and research; language teacher education; translator education; interdisciplinarity.
... In DDL activities, students interact with language data from a corpus and induce rules or patterns in the language (Smart, 2014). DDL has received increasing attention over the last 30 years and has shown great promise in helping students recognize and apply lexico-grammatical patterns (Huang, 2014), revealing how words function in real contexts, strengthening students' noticing skills, boosting engagement and motivation (Boulton, 2009), and providing students with resources that they can continue to use independently (Charles, 2014). Despite these promising benefits, DDL typically requires a significant amount of time and is often better suited to "refining usage in context" than learning large numbers of words (Boulton, 2012). ...
... Researchers have implemented DDL in a large variety of ways, demonstrating both great successes and some difficulties. DDL can vary in form from structured paper printout activities (e.g., Huang, 2014) to self-directed language investigation using corpus tools (e.g., Charles, 2014). It has been used with high language proficiency (e.g., Lee & Swales, 2006) and low language proficiency levels (e.g., Vyatkina, 2016) and in a variety of contexts from general language study (e.g., Sun & Wang, 2003) to special-purposes courses (e.g., Yunus & Awab, 2014). ...
Article
This paper presents a three-part methodology for identifying special-purposes words to teach in data-driven learning (DDL) vocabulary activities. Previous methods have focused on either identifying important words for an English for Specific Purposes (ESP) context or identifying learner vocabulary gaps—and little or no research has addressed how to determine whether specific words are well suited to teaching through DDL rather than through a more traditional, deductive approach. The system in this study used a corpus-based approach to identify words that are (1) important to the ESP context, (2) difficult for students, and (3) well suited to teaching through DDL. This study applied the system to the context of civil engineering and found that it was overall effective in identifying 18 words that are prevalent in civil engineering writing, that were problematic for the students whose writing was examined, and that showed indications of being well suited to DDL. This paper discusses a drawback of the system—the time required to apply it—and discusses two valuable strengths: revealing how the words functioned in civil engineering discourse and identifying words not overtly connected to engineering (e.g., existing or using) that could be easily overlooked by an instructor.
... 99). Some studies integrated corpus compilation as part of the corpus-based writing instruction (Charles, 2012, 2014; Lee & Swales, 2006; Smith, 2011). For example, Smith (2011) reported that his EFL students responded positively to the experience of building and searching their own specialized corpora as part of their final course project. ...
... Charles (2012, 2014) incorporated corpus compilation and search activities throughout a discipline-specific EAP course and reported that her students found these activities highly helpful in writing discipline-specific texts. Meanwhile, some studies reported negative learner reactions to this approach, due often to inadequate training or guidance (e.g., Chambers, 2007) or to individual differences in such factors as "academic experience, search purposes, and writing tasks" (Chang, 2014, p. 243). ...
Article
This study explored the potential of integrating corpus-based and genre-based approaches to teaching rhetorical structures in a discipline-specific English as a Foreign Language (EFL) academic writing course at a university in China. The instructor and students first collaboratively compiled a specialized corpus consisting of the introduction sections of engineering research articles (RAs), which was subsequently annotated for rhetorical moves. A series of guided corpus-based genre analysis activities were then used to help the students understand the rhetorical structures of engineering RA introductions and the linguistic features associated with different rhetorical moves. The effectiveness of this pedagogical approach was evaluated with data triangulation across pre- and post-instruction questionnaires, interviews, students' reflective journals, and student-produced writing samples. Results showed that the integrated approach helped enhance the students' genre knowledge and improve their genre-based writing skills. This study thus contributes evidence of the feasibility and effectiveness of corpus-based genre pedagogy in discipline-specific EFL contexts.
... Even in the task-based approach, which has developed within the communicative approach (Lacorte 2019), the need to prepare activities that help learners focus on form according to their needs is also recognized (see Estaire 2011; Long 2016). Beyond that, corpora serve as reference resources that foster students' autonomous learning, once they know how to use them, without the teacher's mediation (Charles 2014). ...
Article
SUMMARY For more than three decades, the use of corpora in second language (L2) classrooms has been encouraged. Despite the support many researchers give to this idea, it seems that this teaching practice is still far from widespread in the teaching of Spanish as a foreign language (FL), second language (L2), or heritage language (HL). This article aims to connect classroom needs with the corpus tools currently available. To this end, trends in teaching Spanish as an FL/L2/HL are reviewed along with empirical studies and didactic proposals using corpora of Spanish as a first language (L1). In addition, the authors' perspectives on their pedagogical use of corpora with English-speaking university students are taken into account. This article aims to recall the importance of research that draws on the perspectives and needs of teachers. From this standpoint, it is easier to discover how corpus tools can contribute to teaching. ABSTRACT For more than three decades, researchers have encouraged using corpora in the second language (L2) classroom. Still, despite the support many researchers express for this idea, it seems the use of corpora in Spanish as an L2, foreign language (FL), and heritage language (HL) classrooms remains rare. This article aims to elucidate how successful integration of corpus tools depends on a clear link between tools and classroom needs. In order to do so, common pedagogical trends in the L2 Spanish classroom will be reviewed along with empirical studies and didactic proposals that demonstrate how first language (L1) Spanish corpora can be used, while keeping in mind the application of corpora both from the perspective of students and instructors. The authors’ experiences, as instructors and researchers, will also guide part of this conversation.
At the same time, this article aims to highlight the importance of conducting research informed by the perspectives and needs of instructors. From this angle, it is easier to discover how corpus tools can contribute to the classroom by pursuing a broader vision of multiple possibilities for corpus use.
... DDL has been found to improve students' abilities to draw inferences, create a heightened awareness of language patterns, and facilitate the expansion of vocabulary knowledge (O'Sullivan 2007; Gilquin and Granger 2010). While some advocates of DDL encourage students to look at first-hand corpus data using corpus management software (e.g., Lee and Swales 2006; Charles 2012, 2014), others have stated that the use of professional corpus management software in the classroom can be overwhelming for teachers and students alike (Warren 2016; Hafner and Candlin 2007). As a result, a number of user-friendly DDL tools have been successfully created and used to aid L2 writing instruction, such as Compleat Lexical Tutor (Horst, Cobb and Nicolae 2005), SKELL (Hirata and Hirata 2018) and Collocaid (Frankenberg-Garcia et al. 2019). ...
Article
Lexical bundles are highly frequent and functionally significant in written academic discourse. Many studies have explored lexical bundles through a disciplinary lens, but their findings are not typically incorporated into published L2 teaching-learning materials. As a result, there are a number of challenges facing teachers who want to include a focus on disciplinary lexical bundles in their academic writing instruction at tertiary level. This paper describes a method for deriving a functional list of disciplinary lexical bundles from a corpus of academic writing, and then discusses the results quantitatively and qualitatively. The findings suggest that a small number of bundles (n=47) occur across genres, levels and institutions in the academic writing of the Arts and Humanities disciplines. Finally, a series of Computer-Assisted Language Learning (CALL) resources are presented, including a tool for finding the bundles in academic texts. The functional list and resources will be of interest to those involved in the teaching-learning of academic writing in the Arts and Humanities disciplines.
... Charles, 2018; Pérez-Paredes & Mark, 2021) and building corpora (e.g. Charles, 2012, 2014; Charles & Hadley, 2022). Corpora are often built for classroom-specific contexts of language teaching or learning, especially for ESP/EAP instruction at tertiary level where corpora can be tailored to help university lecturers identify student-specific language use, such as analysing ESP/EAP writing tasks (Ackerley, 2021; Chang, 2014; Charles, 2012). ...
Article
Given the importance of corpus linguistics in language learning, there have been calls for the integration of corpus training into teacher education programmes. However, the question of what knowledge and skills the training should target remains unclear. Hence, we advance our understanding of measures and outcomes of teacher corpus training by proposing and testing a five-component theoretical framework for measuring teachers’ perceived corpus literacy (CL) and its subskills: understanding, search, analysis, and the advantages and limitations of corpora. Also, we hypothesised that teacher CL is linked to their intention to use corpora in classroom teaching. Specifically, 183 teachers and student teachers received corpus training to develop their CL and then completed a survey to measure their CL and intention to use corpora in teaching in Likert-scale items together with open-ended questions. Confirmatory factor analysis indicated that a hierarchical factor structure for CL using the aforementioned five subfactors best fitted the data. Moreover, structural equation modelling indicated that CL is positively linked to the participants’ intention to integrate corpora into classroom teaching. While all five subskills are important for teachers, greater effort should be made to develop their corpus search and analysis skills, which can be viewed as the “bread and butter” of corpus training.
... The most consistent longitudinal research has been carried out by M. Charles. In one of her articles (Charles, 2014), she showed that out of 40 subjects 70% continued to use their created corpus a year later, of which 38% were active users who accessed the corpus every week and 32% were inactive users who accessed the corpus once a month on average. In her plenary talk at TaLC 2020 (Charles, 2020), M. Charles summarized the accumulated data for a longer research period from 2009 to 2017: she collected feedback from 221 participants one year after completing the course. ...
Article
Corpus linguistics is one of the most dynamic and rapidly developing areas of modern linguistics. It affects all areas of linguistics, including methodology of teaching foreign languages, translation and other linguistic disciplines. Corpus linguistics has had a direct impact on teaching foreign languages. However, in general, it remains a marginal method in teaching. Analysis of publications on the subject allows us to conclude that very few studies are long-term and aimed at working with schoolchildren. This article proposes a model for the development of sustainable interest among high school students in online corpora as sources of linguistic information, including the initiation stage in the form of project work in mini-groups to study well-known sayings with the consequent stage aiming at completing tasks supplementing the main textbook on a regular basis. The organization of project work addressing the corps of 11th grade students of the Natural Science Lyceum at Peter the Great St. Petersburg Polytechnic University is described. The paper outlines further research.
... If DDL is to become a viable technique, it is also crucial that it should move out of its favourite spheres of action: from English classes to classes in other languages and from universities to secondary and primary education (see Vyatkina 2020), but also from the classroom to everyday life (see Chen and Flowerdew 2018; Meunier 2020). Studies have shown that learners usually stop using corpora after the in-class DDL training/activities (see, e.g., Crosthwaite and Cheung 2019), although, as shown by Charles (2014), when learners build their personal corpus in their own field, they tend to continue using it after the course has finished. Capitalising on the young generation's technological habits (cf. the parallel drawn between corpus consultation and internet searches in Cobb and Boulton 2015), one needs to induce a transfer of corpus skills from a language learning situation to an everyday language use situation, thus turning 'data-driven learning' into 'data-driven use' and making language learners evolve into autonomous language users. ...
Chapter
This chapter considers the use of data-driven learning (DDL) in the language classroom. It highlights its pedagogical functions and describes the main issues to bear in mind when choosing a corpus and a query tool to do DDL. It also provides an overview of the way DDL may be operationalised and shows how the DDL activities should be adapted to the learning context, learners’ level and topic being investigated. The chapter then reviews studies that have evaluated participants’ attitudes towards DDL, learners’ practices and the efficiency of DDL, revealing the promises that DDL holds for language teaching. It also highlights some of the limitations of DDL which account for its relative lack of implementation in the classroom, and gives some indications as to how DDL should evolve to become a viable pedagogical approach.
... As emerging themes, English for Specific Purposes (ESP) and English for Academic Purposes (EAP) have witnessed rapid progress due to the increasing role of English as an international language (Kırkgöz & Dikilitaş, 2018). Nonetheless, with a few notable exceptions (Charles, 2014;Lee & Swales, 2006), there seems to be a dearth of research and firsthand resources in the utilization of corpora in ESP and EAP. Spotting this visible gap, this volume edited by Charles and Frankenberg-Garcia is a valuable resource, which offers a state of the art and an ample ground approach to targeting corpus methodology in ESP and EAP. ...
Article
Routledge. ISBN: 978-0-367-43234-8. Price USD 128.00 (hardcover), USD 39.16 (eBook). 194 pages. As emerging themes, English for Specific Purposes (ESP) and English for Academic Purposes (EAP) have witnessed rapid progress due to the increasing role of English as an international language (Kırkgöz & Dikilitaş, 2018). Nonetheless, with a few notable exceptions (Charles, 2014; Lee & Swales, 2006), there seems to be a dearth of research and firsthand resources in the utilization of corpora in ESP and EAP. Spotting this visible gap, this volume edited by Charles and Frankenberg-Garcia is a valuable resource, which offers a state of the art and an ample ground approach to targeting corpus methodology in ESP and EAP. The volume, published by Routledge, is an impressive and rich work of scholarship for serious readers and seasoned researchers in which contributing authors, in seven units and 3 parts, skillfully and succinctly provide an exploration of the salient role of corpora in ESP and EAP. The structural sequence of the chapters, as well as their selection, makes this book a stimulating read and warrants its place in the resource collection of this area of research. The volume provides a comprehensive account of the multifarious applications and implications of the corpus approach in ESP and EAP. The editors have been able to convene renowned and preeminent scholars to produce a resource book. At its heart, the contributors' innovations register the corpus approach as an emerging theme in ESP and EAP research. The introductory chapter is an overview of the content of the book as well as a succinct overview of the different types of corpora, including target language, interlanguage, specialized, general, and ready-made, as well as their constructive roles for ESP and EAP and the effectiveness of data-driven learning (using learners' conscious raising and noticing) in the writing classroom. 
The first part of the book is dedicated to the way corpora are (or can be) used for conducting research in ESP and EAP. A major role of corpora is their application in lexicography and phraseology, as they can provide the reader with authentic texts and illustrative materials (Gizatova, 2018). In addition to lexicography, a dedicated focus on corpora in academic writing deserves attention (Römer, 2010). Thus, in reading the two chapters of part 1, the reader is fascinated to witness how corpora are used in academic writing. The role of corpora in teaching English has been stressed by researchers (Boulton & Cobb, 2017; Zufferey, 2020). The role of corpora in ESP and EAP is elaborated in part 2, where the focus is on the exploitation of corpora and their potential in ESP and EAP. This is shown in chapter 3 (by Liou and Liu), where the corpus is seen as a method of corrective feedback for English learners, and in chapter 4 (by Ackerley), where a genre-specific corpus is utilized for ESP writing research.
... Most participants noted that corpus consultation has a positive effect on their overall writing quality (Bridle, 2019; Chambers & O'Sullivan, 2004; Chang, 2014; Charles, 2014; Crosthwaite, 2017; Dolgova & Mueller, 2019; Luo & Liao, 2015; O'Sullivan & Chambers, 2006; Poole, 2016; Yoon, 2008). ...
Article
Corpus linguistics has become increasingly important to both language researchers and teachers over the past three decades. As a popular practice of corpus linguistics, Data-Driven Learning (DDL) sees a rapidly growing body of research as well as instruction in the field. There is, however, a lack of comprehensive literature reviews that summarize the effectiveness, learners’ perception, as well as factors affecting the success of DDL to guide its practices. In response, this study analyzes previous DDL research to show the feasibility of the activities in EFL classrooms. For the purpose, we collected and analyzed relevant research articles from 19 journals in the discipline of applied linguistics. Our analysis revealed that while DDL has been proved generally effective in improving learners’ target language proficiency with respect to a variety of linguistic aspects, a set of its drawbacks have been elicited from the learners. The results indicate the instructors’ need to take into account the learner as well as technique background before the introduction of DDL into their classrooms.
... Concerns at this level tend to centre on the sophisticated linguistic skills needed to write in academic English (e.g. Charles, 2014; Crosthwaite, 2017, 2020; Lee & Swales, 2006; Thurston & Candlin, 1998). Thus, few studies have targeted school settings or lower proficiency learners. Second, corpus-based learning activities intended for university-level students typically focus on the use of concordance lines, which may be considered too difficult for low-level school learners (Caliskan & Gönen, 2018; Poole, 2020). ...
Article
Despite the growing popularity of corpus linguistics among researchers in recent decades, a corpus-based approach remains largely unknown to most teachers in primary and secondary schools. Drawing on Shulman’s (1986, 1987) concept of pedagogical content knowledge, this study differentiated between two key terms—corpus literacy and corpus-based language pedagogy—and investigated how a group of TESOL teacher trainees developed their corpus literacy and corpus-based language pedagogy in a two-step training scheme. The first step, focusing on using corpus data as a learning tool, was conducted in a physical classroom; the second step, focusing on using corpus data as a teaching tool, took place in a virtual online classroom. A mixed methods design, including surveys, interviews and analyses of lesson plans, was used to collect and analyse the data. The findings revealed that most participants gained a good level of corpus literacy as measured by the self-designed survey. They also obtained a good level of initial competence in using corpus-based language pedagogy, as revealed by the rating of their lesson design and the content analyses of the lessons and interview data. The results support a differentiation between corpus literacy and corpus-based language pedagogy, attesting to the effectiveness of the two-step corpus-based teacher training. The study provides several insights regarding how to scaffold teachers in corpus-based training and teach students with corpus resources to address their vocabulary needs and difficulties. Finally, a few issues are raised regarding what teachers may consider when implementing effective corpus-based teaching in school settings.
... The utility of the corpus-based PoS-gram technique combined with genre study awaits more systematic explorations, especially by EAP teachers who promote data-driven learning (Charles, 2014; Otto, 2021). Since the corpus-based PoS-gram analysis could be a promising line of inquiry into language patterning in the EAP/ESP world, it needs to be extended to other part-genres of the RA to arrive at a full description of the most typical expressions for this genre, as well as to other specialized genres for academic and professional communication. ...
Article
This study innovatively applies the Part-of-Speech-gram (PoS-gram) procedure to the examination of language patterning and variability in a largely conventionalized part-genre (i.e., research introductions). Based on 400 article introductions from computer engineering (CE) and cognitive linguistics (CL), the study has identified key PoS-grams and their associated lexico-grammatical frames, using the written academic component of British National Corpus as the reference corpus. The analysis reveals key PoS-grams shared in CE and CL introductions, e.g., those associated with the step “purposive announcement”, as well as the discipline-specific ones such as the PoS-gram for structure-outlining only found in CE introductions. Compared to various forms of multi-word sequences like n-grams, the PoS-gram has the unique strength of grouping phraseologies with similar or identical structure and discursive functions and yet either recurrent or varying lexical choices under the co-selected grammatical categories. The advantage enriches analyses and helps yield pedagogically useful findings, in that patterning and variability is revealed not only in the overall function, structure and composition of PoS-grams but in such aspects of their recurrent or diversified tokens. This study illustrates the innovative application of corpus-based PoS-gram procedure to academic genres, which may inspire a promising new line of inquiry and the current genre pedagogy.
... This type of learning can help learners to gain and retain lexico-grammatical patterns, which are helpful to express themselves in writing. Furthermore, many studies have evidenced that the DDL approach is useful in helping EFL learners detect and correct lexico-grammatical errors, increases learners' language awareness and autonomy, and also provides insights into native speakers' language data for second language teachers and learners (Charles 2014; Huang 2011; Lee and Swales 2006). ...
Article
Since the early twenty-first century, data-driven learning (DDL) approach that is a pedagogical application of corpus linguistics in classroom, has introduced a paradigm shift in EFL instruction. Research output, however, concerning this inductive, discovery-oriented learning is equivocal. This study, thus, explored the application of both native-speaker and local learner corpora, attesting the effect of direct vs. indirect DDL activities on 39 EFL learners’ development in CAF measures of writing. To this end, two experimental groups were taught through corpus consultation, but the control group received the conventional method of using a textbook, teacher explanations, and classroom exercises. Results obtained from three (two experimental and one control) groups of participants’ writing performances pre and post to seven sessions of paragraph writing confirmed the significant role of indirect DDL in writing more accurate and fluent paragraphs; however, no statistical evidence was found as regards syntactic complexity. Moreover, no significant effect of the direct DDL method in improving learners’ writing was observed, which is, thus, interpreted as suggestive that applying indirect DDL could be more effective than the direct DDL approach. It is concluded that classroom-based computers are not necessarily essential tools to implement the DDL pedagogy.
... participants agreed a corpus was a useful resource for writing discipline-specific texts. In order to make sure that learners continue to use the corpus as a useful resource for their academic writing, instructors' just-in-time support and refresher sessions on using corpus and technical vocabulary should be provided (Charles, 2014). ...
... The explicitly stated aim of much of such corpus-based EAP research is to inform EAP writing instruction, either through the provision of insights into what should be included in a course syllabus, or by informing materials design. Meanwhile, pedagogical research that directly integrates corpus resources in academic writing instruction is emerging but relatively limited (Chang, 2014; Charles, 2014, 2018; Dong & Lu, 2020; Gilmore, 2009). Some studies looked into the use of corpora as a resource to facilitate learners' self-correction of lexico-grammatical errors in their writing (Gilmore, 2009). Others revealed the pedagogical value of specialized corpora and/or student-compiled discipline-specific corpora, along with hands-on, contextualized searches and examinations of the usage patterns of relevant linguistic features, in promoting learner engagement and attention to details in academic writing (Chang, 2014; Lee & Swales, 2006; Charles, 2014, 2018; Dong & Lu, 2020). ...
Article
This paper outlines the research agenda of a framework that integrates corpus- and genre-based approaches to academic writing research and pedagogy. This framework posits two primary goals of academic writing pedagogy, i.e., to help novice writers develop knowledge of the rhetorical functions characteristic of academic discourse and become proficient in making appropriate linguistic choices to materialize such functions. To these ends, research in this framework involves 1) compilation of corpora of academic writing annotated for rhetorical functions, 2) analysis of the organization and distribution of such functions, 3) analysis of the linguistic features associated with different functions, 4) development of computational tools to automate functional annotation, 5) use of the annotated corpora in academic writing pedagogy, and 6) exploration of the role of form-function mappings in academic writing assessment. The implications of this framework for promoting consistent attention to form-function mappings in academic writing research, pedagogy and assessment are discussed.
... The few available studies on this problem in the world give somewhat mixed results. On the one hand, more than half of Charles's postgraduate students from different disciplines used their do-it-yourself corpora on a regular basis a year after completing the experimental writing course [22]. Lenko-Szymanska showed that one specially designed course for students who are future teachers of foreign languages is not enough to build sustainable technical, corpus-linguistic, and pedagogical skills [23]. ...
Chapter
Corpus linguistics (CL) is one of the most dynamic and rapidly developing areas of modern linguistics. It affects all areas of linguistics, including methodology of teaching foreign languages, translation and other linguistic disciplines. The reviews of publications on this subject include only the works published in English and do not reflect the contribution of Russian researchers. This article fills this gap by presenting an overview of the disparate publications of Russian authors by examining the dynamics in the growth of the publication number, the geographic distribution, publication outlets, citations, focusing on the studies on the application of CL methods in education in Russia since 2011 till the first half of 2019. Methods of finding relevant publications and processing the resulting body of the data in order to obtain answers to the research questions are described. The discussion indicates that most of the work contains guidelines on the CL use in teaching various aspects of the Russian and foreign languages with the involvement of large general-purpose corpora that are freely available on-line. Only a small number of studies present the results of pedagogical experiments. It is noted that the selected indicators make it possible to assess the degree of CL approach influence on education. Drawing on the analysis, we propose the ways of expanding the implementation of CL methods in language teaching by increasing their use in different disciplines of the pre-service teacher training program and in-service training of contemporary foreign language teachers.
... This approach has been used in studies involving learners' self-built corpora (e.g. Charles, 2014). c) Other statistical information used for the derivation of collocation (e.g. ...
Article
Less is more? The impact of written corrective feedback on corpus-assisted L2 error resolution. Abstract The past decade has seen a sharp increase in research into L2 learners' direct use of language corpora (typically known as 'data-driven learning', DDL) for error resolution in L2 writing. However, a crucial yet underexplored variable in this process is whether and how the form of written corrective feedback (WCF) provided on L2 writing facilitates effective corpus consultation for L2 error resolution. Focusing on L2 writers at the postgraduate level and using a short private online course for DDL training, we determine the impact of four WCF conditions (varying in their degree of directness) on students' use of corpora for lexical and grammatical error resolution and the appropriacy of error revisions made with/without corpora for these error types. The results suggest that 'less (WCF) is more' if learners are to make successful error revisions via corpus consultation, with more direct WCF conditions often resulting in students revising errors without consulting a corpus. However, less direct WCF conditions sometimes resulted in inappropriate revisions, as learners required additional information as to the nature and location of the error. Differences were also found in the effectiveness of corpus consultation for grammatical and lexical error types, with WCF a confounding factor. These results suggest that if corpora are to be used for L2 error resolution, teachers need to carefully consider whether their WCF allows for meaningful engagement with corpora to occur, and whether corpus consultation is suitable or desirable for resolving all error types.
... In the case of a DDL activity stemming from a single word combination, such as the one we presented, the discovery process that is initiated can take a number of different turns, all of which will be logically linked. This is how the student will find him or herself in the position of a detective, a research-scientist or a traveler (Bernardini 2000; Cobb 1999; Johns 1997), while the teacher will become a demonstrator, a collaborator or a guide (Boulton 2011; Charles 2014; Frankenberg-Garcia 2012). And this way, the discovery process put into place by the DDL activity will allow the learner to gain insight into language usage and the form varieties in an autonomous way: as Cobb points out in his contribution on constructivism, "knowledge encoded from data by learners themselves will be more flexible, transferable, and useful than knowledge encoded and transmitted to them by an instructor" (Cobb 1999, 15). ...
Article
This paper aims to shed light on how research findings stemming from Learner Corpus Research (LCR) can inform the development of Data-driven learning (DDL) pedagogical activities. By doing this, it seeks to show how the gap between corpora built to be used by linguists and those tailored for learners can be filled. It starts by defining what a corpus is and how second language learning studies can benefit from the research findings based on corpora, but also from the direct use of corpora in the classroom. Then, it provides an overview of the available native and learner corpora of Italian, and how corpora in general can be adapted for DDL purposes. Finally, it describes an example of how an LCR finding can be used to develop DDL activities. It concludes with some desiderata for the future.
... According to the author, such training "involves practising corpus research and referencing skills as well as learning to make data-based generalizations". Previous studies describe training sessions that vary from "minimal training" -referred as indirect use of corpora to familiarise users with the concordancer and raise their DDL awareness [27] -to longer time [3], [28], [29] until requiring some lessons [30], [31]. At any rate, it is not clear if training is needed due to the (low) accessibility/usability of software tools [32] or because corpus use requires learners to manage some knowledge [26]. ...
Article
Corpus-based writing assistants are aimed to show how words are used in real context: they provide word use examples from which users can (1) draw inspiration for their writing and (2) understand how words are used together to improve their own writing. Although the idea of integrating a corpus-based writing assistant into word processors is not new, their integration is designed to be not as straightforward as writing in a word processor. In this paper, we present WriteBetter, a corpus-based writing assistant designed to be integrated into Microsoft Word, Google Docs, and Overleaf. This integration makes its use straightforward and easy as users can see corpus-based examples (1) in real-time while writing in the word processor or (2) just selecting a piece of text in their document. This facilitates user-corpus interaction as the required user’s interaction is minimal. After contextualising the state of the art regarding the benefits of corpus-consultation, we discuss the design features of WriteBetter that make it novel in relation to other tools. Next, we present a user evaluation of the first version of WriteBetter, which was carried out on 11 undergraduate students of a Chilean university, who were asked to trial the software while writing in English. Based on this evaluation, we designed a new version of WriteBetter, which was further evaluated online on 36 users. WriteBetter is now available for everyone as SaaS.
... The activities can take the form of an independent problem-solving venture, in which the teacher is an integral part of the discovery process rather than the dispenser of linguistic knowledge. In this teaching paradigm shift, we see the student becoming a detective (Johns 1997), a researcher-scientist (Cobb 1999) or even a traveler (Bernardini 2000), while the teacher is seen as a demonstrator (Frankenberg-Garcia 2014), a collaborator (Boulton 2011) or a guide (Charles 2014). The discovery process leading to the observation of new patterns in language use is what gives DDL the potential to "reach the parts other teaching can't reach" (Boulton 2008). ...
Conference Paper
This paper presents the first findings of a study on the role that semantic transparency in verb + noun collocations may play in evaluating the effectiveness of data-driven learning (DDL). It is based on a controlled between-groups experiment conducted in 8 classes of Chinese learners of Italian at the University for Foreigners of Perugia, Italy. Phraseological competence data was collected at 4 points in time over a timespan of 13 weeks. The phraseological competence test was evenly divided into multiple-choice and gap-fill items, aimed to elicit definitional and transferable knowledge respectively. Overall, control groups appear to fare consistently better than experimental groups, though the latter seem to have better retention rates. Semantic transparency was found to have a significant role in the development of phraseological competence, for both conditions under consideration. In particular, the effect of semantic transparency was found to be more prominent in relation to the transferable knowledge of collocations.
... Boulton (2008) and Jackson (1997), for example, report that their students gained problem-solving and ICT (Information and communications technology) competencies. Charles (2014) found that some students were motivated enough to consult and even add to their corpus after the end of the course. Lee and Swales (2006) report that some of their students even purchased their own copies of Wordsmith Tools (Scott, 2018), indicating a commitment to continuing with corpus construction and analysis in the future. ...
Article
It has been shown that language learners can benefit from a discovery-based learning process whereby they construct as well as consult their own specialist corpora and vocabulary portfolios, for the purposes of translator training (Castagnoli 2006), for general English (Smith 2011) and for academic English learning (Charles 2012; Smith 2015). In the present study, a cohort of 94 international students on an EAP module, majoring in Accounting and Finance, was divided into hands-on (treatment) and hands-off (control) groups. Both groups were subjected to a pre-test consisting of specialist terms that would be encountered on their course (not only in the EAP class, but also on the Accounting and Finance modules). The hands-on group spent about 20 min per weekly class constructing domain-specific DIY corpora and generating subject vocabulary portfolios. The results of a post-test indicated that the hands-on group had achieved a slightly greater improvement in domain vocabulary knowledge than the hands-off group (which used corpora and vocabulary lists provided by the teacher). A participant questionnaire showed that the students found the approaches useful for vocabulary learning.
... Working with an experimental DDL group and a control group without corpus assistance, Cortes (2014) described how she introduced students to the different moves of the research article. Various studies by Charles (2007, 2011, 2012, 2014, 2015) have investigated how her PhD level students have engaged with corpus tools and self-compiled specialised corpora. Finally, Chen and Flowerdew (2018) have described the design, implementation and evaluation of a series of corpus-based research writing workshops for PhD students run across a group of universities. ...
Article
Corpora are widely used in the creation of language learning and teaching materials, such as dictionaries, grammar books, textbooks, and vocabulary lists. Little work, however, has focused on how the DDL approach might be introduced successfully into a teacher training program. In this paper, we describe the background, implementation, and results of a DDL-focused teacher training workshop that is designed to introduce a corpus-assisted academic writing pedagogy to in-service English language educators in Hong Kong. To evaluate the success of the workshop and gain further insights on factors that might lead to instructors accepting or rejecting the approach, we administered a questionnaire to participants after the workshop and carried out a statistical analysis of the responses. Results revealed that participants generally had a positive experience of the training. Based on correlation tests, the results also showed that factors such as prior knowledge of corpora, prior experience in using corpora, motivation for professional development, and teaching experience, correlated significantly with teachers’ perceptions of the difficulties in using corpus tools and an inclination to integrate data-driven learning in their future teaching. The findings may be related to broader research on teacher attitudes to the adoption of technology in the classroom.
... Most research on DDL has examined software-based rather than paper-based activities (Aston, 1997; Charles, 2012, 2014; Conroy, 2010; Kennedy & Miceli, 2001; Lee & Liou, 2003; Ma, 1994; Pérez-Paredes, Sanchez-Tornel, & Calero, 2012; Seidlhofer, 2000; Sun & Wang, 2003). Studies suggest that DDL approaches seem to be "most effective when using a concordancer hands-on rather than through printed materials" (Boulton & Cobb, 2017: 385). ...
Article
Data-driven learning (DDL) is a learner-focused approach which promotes language learners’ discovery of linguistic patterns of use and meaning by examining extensive samples of attested uses of language. Despite the emergence of mobile-assisted language learning (MALL) and its affordances, i.e. individualization and personalization, the potential of DDL in this context has not been widely explored. This study involved the creation of a mobile language learning app based on freely available natural language processing (NLP) tools, followed by a test of the app to gather the attitudes and perceptions of several groups of language learners across Europe. The results suggest a generally positive evaluation of DDL’s instant and personalized feedback and direct access to a variety of tools. Besides, suggestions for improvement were made concerning the design of the tasks, such as the addition of further built-in tools and adaptations to hardware constraints. Analyses also showed a need for specialized learner training, so as to grasp the potential of the feedback provided. This study may be construed as a first step towards creating more fleshed-out tools and further investigating the potential of combining DDL and MALL.
... Learners' interest in specialized corpora is largely dependent on whether the corpus is perceived by learners as being relevant to their needs. Using a corpus as a reference source for academic English writing may be ineffective and demotivating if it does not contain examples of language use in students' specific technological/scientific areas (Chang, 2014; Charles, 2014). An example of good practice which takes this factor into account is Chang (2014), where Korean IT and engineering students were encouraged to compile their own corpus, named Michelangelo, through student selection of papers and articles from journals in their fields. ...
Article
This study highlights the problem of the lack of German specialized corpora for German for specific purposes (GSP) courses for engineering students and describes a project aiming at the development of such a corpus, the Kod.ING corpus. The authors show the relevance of the Kod.ING corpus in meeting the needs of Master’s degree engineering students at St Petersburg Polytechnic University who are studying lower-level German. At the preliminary stage of the pedagogical experiment, nine compound nouns and eight lexical bundles were selected from the Kod.ING corpus. These were taught to students through hands-on and hands-off data-driven learning (DDL) activities. The immediate and delayed post-tests proved the effectiveness of short DDL interventions in terms of acquisition of target vocabulary. The follow-up survey revealed students’ particular interest in hands-on activities with the Russian National Corpus (RNC). In conclusion, further research and pedagogical applications are suggested.
Article
Data-driven learning (DDL) has been demonstrated to be an operative strategy for assisting learners to handle a range of writing-related problems. Several studies have been conducted to compare the pedagogical effectiveness of DDL in English as a foreign language (EFL) writing. However, only a few studies have identified key factors that may affect learning outcomes when designing DDL activities. To bridge this gap, the present study looked at the medium-term effects of DDL activities in EFL writing. A pre-post quasi-experimental research design and semi-structured interviews were used to collect data from 64 Arab EFL undergraduate students. The DDL activities were carried out with the aid of BNCweb, and the findings were assessed by contrasting the efficiency of BNCweb with that of Sketch Engine, which is employed as a reference tool by EFL learners. The quantitative results showed that the experimental group's use of BNCweb led their writing to be more fluid and consistent in the posttest as compared to the control group, which employed the Sketch Engine tool. However, no significant difference was detected between the groups in writing intricacy. The qualitative results indicated that students had positive attitudes toward using BNCweb, despite the challenges of implementing corpora in the writing process. It was recommended that integrating corpora with other types of reference sources would be a viable solution to overcome any potential obstacles for EFL learners.
Article
Corpus Linguistics has revolutionised the world of language study and is an essential component of work in Applied Linguistics. This book, now in its second edition, provides a thorough introduction to all the key research issues in Corpus Linguistics, from the point of view of Applied Linguistics. The field has progressed a great deal since the first edition, so this edition has been completely rewritten to reflect these advances, whilst still maintaining the emphasis on hands-on corpus research of the first edition. It includes chapters on qualitative and quantitative research, applications in language teaching, discourse studies, and beyond. It also includes an extensive discussion of the place of Corpus Linguistics in linguistic theory, and provides numerous detailed examples of corpus studies throughout. Providing an accessible but thorough grounding to the fascinating, fast-moving field of Corpus Linguistics, this book is essential reading for the student and the researcher alike.
Chapter
English phrasal verbs, referred to sometimes as verb + particle combinations (Fraser, 1974; Quirk et al., 1985), are “notoriously difficult to learn” (Celce-Murcia & Larsen-Freeman, 1999, p. 401). A long body of research has systematically documented the learning challenges English as a foreign language (EFL) students face when learning English phrasal verbs due to the verbs’ flexibility in form, multiple meaning senses, appropriate use in context, and absence in some languages, such as Chinese (Folse, 2004; Sider, 1990). However, compared with these documented learning challenges, there has been a limited appearance of pedagogical discussions surrounding instruction for phrasal verbs (White, 2012). As an attempt to bridge the gap between research and teaching practice, this chapter introduces a cognitive approach for the teaching of phrasal verbs informed by Sachiko Yasuda’s (2010) study, which used conceptual metaphors to extend EFL learners’ knowledge of idiomatic phrasal verbs. Based on Yasuda’s research, we created a lesson plan for teaching 10 phrasal verbs that include the particle out. In line with Yasuda’s study, this lesson plan focuses on adult EFL learners at the tertiary level, specifically the Common European Framework of Reference for Languages (Council of Europe, 2020) B1 intermediate-level students. This lesson plan is an example of how orientational metaphors can be used to facilitate students’ learning of phrasal verbs. This lesson plan not only provides a step-by-step, detailed description following each instructional phase but also delineates the activities students may engage in to enhance their learning. This lesson plan may also serve as a blueprint for you to consider how to construct other meaningful phrasal verbs lessons for your students. We also provide resources to help you develop and extend your knowledge of phrasal verbs (see Appendix A and Appendix B).
Article
Writing for international scholarly publication is hard, and arguably harder for researchers with English as an additional language. English teachers could help them, but most teachers have little or no experience of research writing or the specialized languages researchers use. This study trialled and evaluated workshops promoting the use of corpora and corpus-based tools among Brazilian researchers and English teachers learning together to develop autonomy in writing and teaching writing for scholarly publication.
Article
Over the last three decades, extensive research has been devoted to EAP students’ use of corpora for academic writing. However, corpus use has usually been ascertained immediately post-course; data on long-term use is sparse and little attention has been paid to those who give up using corpora. This study investigates the extent of corpus non-use and students’ reasons for discontinuing the practice in the long term. It draws on data from two questionnaires: (1) immediate post-course (ImmPQ); (2) delayed post-course (DelPQ) completed a year later. Participants were 182 graduates who took a six-week course during which they built and consulted do-it-yourself corpora in their own field. Results from ImmPQ showed that most students (63%) used their corpus regularly (≥ 1/week), but one year later DelPQ revealed that regular use had decreased to 36%. Although 87% of respondents to ImmPQ stated their intention to use their corpus in the future, DelPQ reported a total of 37% of non-users. There were 86 mentions of reasons for non-use; the most prevalent were: not doing any academic writing (29%), the use of other tools (20%), time issues and corpus issues (10% each). It is argued that students’ scarcity of time is a possible underlying cause of much non-use and the study suggests some ways in which long-term corpus take-up could be increased.
Chapter
Alongside extensive contributions to our understanding of word frequency across and within disciplines, genres, registers, and professional domains, corpus linguistics has contributed rich knowledge of collocations, i.e., how words pattern together. Word meaning is often defined by collocations. To illustrate, in the Corpus of Contemporary American English, “student” collocates with “college, teacher, learning, university” – words that define it in the Cambridge Dictionary Online: “a person who is learning at a college or university”. However, what if you are training to be a teacher? Education reading material in the same corpus shows the domain-specific collocations of “student” reflect important disciplinary concepts, e.g. “achievement, outcomes, progress, retention, persistence”. Teaching corpus-derived collocations in ESP, therefore, can support both fluency and conceptual learning, i.e. disciplinary literacy. This chapter details how corpus research into collocations has informed English language teaching and learning, with a focus on English for specific purposes. It reviews the research and pedagogical shifts from traditional word lists to resources including collocations, and how teachers and researchers identify useful collocations. The subject specificity of collocations is discussed, as well as differences between educational contexts, e.g. tertiary/secondary/primary. Research-based collocation lists available for EAP/ESP are reviewed, and the chapter closes with areas of future research.
Article
Corpus use by EAP students has reportedly increased over the last decade, with considerable optimism about the future of this approach (Chen & Flowerdew, 2018a). However, much research employs data from short classroom courses; little is known about how student corpus use has varied over a span of multiple years. This paper uses long-term trend data from a corpus-based course for graduates which ran 50 times (2009–2017) at a UK university. The course taught students to build do-it-yourself corpora based on their research topic and promoted autonomous consultation of this resource. Questionnaires on corpus use were administered at three stages: pre-course (544 students), immediate post-course (343) and delayed post-course, after one year (221). The data show that pre-course corpus use was constant (mean 24%), while immediate post-course use (mean 87%) and delayed post-course use rose only slightly (mean 62%) from 2009 to 2017. The lack of appreciable growth in corpus use over nine years does not support the expectation of increased take-up in future. However, the means for regular autonomous use (≥1/week) at 61% (immediate post-course) and 37% (delayed post-course), show the success of the do-it-yourself corpus approach in fostering the autonomous use of corpora by graduates.
Chapter
Previous research studies indicate that developing writing skills is a challenging process particularly for EFL learners. Academic writing puts an additional burden on the learners’ shoulders as it requires some further advanced skills, such as genre awareness, lexical flexibility, and complex syntactic knowledge, to name but a few. Corpus Linguistic Approaches to language analysis (i.e., Data-Driven Learning) has the potential to guide L2 writers in their attempt to follow the academic genre and learn the required writing skills inductively. Corpora can be exploited in three stages: observation of concordance evidence, classification of salient features and generalization of rules. Learners as the discoverers of language in this approach can benefit from the versatile features of corpora and learn from the patterns they observe through the concordance lines. In the light of the given approach and its potential to create more autonomous EFL learners, this chapter attempts to (a) explain what data-driven learning is and how it may shape the learning experience in an EFL context, (b) elaborate on how corpora can guide EFL learners in academic writing and (c) provide some hands-on uses of corpora in teaching/learning (academic) writing.
Conference Paper
Within a university setting, acquiring the skills of a proficient writer of academic texts presents a challenge for both L1 and L2 speakers of English. This paper will look at how investigations in cognition and psycholinguistics in second language acquisition (SLA) can underpin a practical, Data-Driven Learning (DDL) task-based teaching approach. A number of researchers (cf. Ellis, 2011; Boers and Lindstromberg, 2012) have looked at perception, familiarization and pattern recognition during language acquisition. This paper will demonstrate how corpus-assisted classroom tasks presented by Gavioli (2005), Kirkgöz (2006) and Charles (2012) can be seen as practical adaptations. The paper will conclude with an evaluation of how far a Data-Driven Learning (DDL) approach can turn research on cognitive processes in learners into a viable proposition for practical classroom applications. This paper gives a detailed literature review of the previous work which underpins the project.
Article
In 2003 it was suggested that insufficient attention had been paid to methodology and pedagogy in EAP and that there had been an over-emphasis on the ‘what’ at the possible expense of the ‘how’. While conferences and other professional events have gone some way towards addressing this issue, in terms of EAP journal outputs, as a survey of the Journal of English for Academic Purposes (JEAP) shows, not very much seems to have changed. The pages of JEAP still suggest a dearth of research interest in issues related to EAP methodology. This paper argues that much greater prominence should be given to methodological matters in EAP, not least because teaching accounts for the bulk of what most EAP practitioners do. The paper further suggests that the transfer of pedagogical practices from mainstream ELT to EAP should perhaps be considered a little more critically than has hitherto been the case.
Article
The present study aimed to explore how nominalization is manifested in a sample of Physics and Applied Linguistics research articles (RAs), representing hard and soft sciences respectively. To this end, 60 RAs from discipline-related professional journals were randomly selected and analyzed in light of Halliday and Matthiessen's (1999) taxonomy of nominalization. Comparing the normalized frequencies indicated that articles in Applied Linguistics differ significantly from their counterparts in Physics as they include more nominalized expressions. Moreover, the analysis brought out the findings that deployment of nominalization Type Two is significantly different from the other three types of nominalization in each discipline. Subsequently, the obtained expressions were put into their context of use in order to extract the most prevalent patterns of nominalization in the RAs. The investigation into the embedded patterns introduced 15 common patterns for Physics and Applied Linguistics RAs. Chi-square analyses suggested statistically significant differences in using only four patterns. Finally, implications accrue to the findings in reference to academic writing teachers and course designers.
Article
Although there is a large and increasing body of research on the use of corpus data by language teachers and learners, the language teachers in question are also the researchers reporting on their use of corpora with their own learners. There is thus a gap, which this plenary speech addresses, between this research and the practice of language teachers who are not corpus linguists. After a brief account of the successful integration of corpus data in the production of language learning resources such as dictionaries, grammars and course books, the potential benefits of having direct access to corpora by teachers and learners are discussed. Following this, a number of publications on the use of corpora in the language classroom, both in higher education and at secondary level, are examined, focussing on how easy or difficult it would be for teachers to replicate the use of corpus data in their own classrooms. This leads to a final section where possible solutions to the research–practice gap are considered, involving language teacher education, publishers of course books, and research involving the use of corpora in the classroom by language teachers who are not also corpus linguists.
Article
Our research explores the search behaviour of EFL learners (n=24) by tracking their interaction with corpus-based materials during focus-on-form activities (Observe, Search the corpus, Rewriting). One set of learners made no use of web services other than the BNC during the central Search the corpus activity while the other set resorted to other web services and/or consultation guidelines. The performance of the second group was higher, the learners’ formulation of corpus queries on the BNC was unsophisticated and the students tended to use the BNC search interface to a great extent in the same way as they used Google or similar services. Our findings suggest that careful consideration should be given to the cognitive aspects concerning the initiation of corpus searches, the role of computer search interfaces, as well as the implementation of corpus-based language learning. Our study offers a taxonomy of learner searches that may be of interest in future research.
Article
Much of the research into language learners' use of corpus resources has been conducted by means of indirect observation methodologies, like questionnaires or self-reports. While this type of study provides an excellent opportunity to reflect on the benefits and limitations of using corpora to teach and learn language, the use of indirect observation methodologies may confine the scope of research to learners' opinions about the benefits of using corpora for language learning and their self-perceived difficulties in consulting them. This article proposes and discusses the use of logs to research learners' actual use of corpus-based resources, analyzing the number of events or actions performed by each individual, the total number of different web services used, the number of activities completed, the number of searches performed on the British National Corpus (BNC) and, last, the number of words or wildcards per BNC search. Our research used these parameters to investigate whether learner interaction with corpus-based resources differed under different corpus consultation conditions: guided versus non-guided consultation. Our findings show that the individuals in the two research conditions behaved differently in two of the parameters analyzed: the number of different web services used during the completion of the tasks and the number of BNC searches. Our results corroborate empirically the suggestions found in the literature that skills and guidance are necessary when teachers take a corpus to the classroom. Similarly, we offer evidence that user tracking is essential to claim research and results validity.
Article
This paper illustrates how a freely available online corpus has been exploited in a module on teaching business letters covering the following four speech acts (functions) commonly found in business letters: invitations, requests, complaints and refusals. It is proposed that different strategies are required for teaching potentially non-face-threatening (invitations, requests) and face-threatening (complaints, refusals) speech acts. The hands-on pedagogic activities follow the ‘guided inductive approach’ advocated by Johansson (2009) and draw on practices and strategies covered in the literature on using corpora in language learning and teaching, viz. the need for ‘pedagogic mediation’, and the ‘noticing’ hypothesis from second language acquisition studies.
Article
Learning outcomes from corpus consultation: a survey of 27 empirical studies of DDL
Article
Large corpora such as the British National Corpus and the COBUILD Corpus and Collocations Sampler are now accessible, free of charge, online and can be usefully incorporated into a process writing approach to help develop students' writing skills. This article aims to familiarize readers with these resources and to show how they can be usefully exploited in the redrafting stages of writing to both minimize the teachers' workload and encourage greater cognitive processing of errors. An exploratory investigation comparing the use of these two online corpora in Japanese university writing classes is then described. This suggests that the participants in the study were able to significantly improve the naturalness of their writing after only a 90-minute training session and that the majority of students found these online resources beneficial, although there was a marked preference for the COBUILD Corpus and Collocations Sampler.
Article
The potential for corpora in language learning has attracted a significant amount of attention in recent years, including in the form of data-driven learning (DDL). Careful not to appear to over-promote the field, enthusiasts have urged caution in its application, in particular with regard to lower-level learners, and have argued that extensive learner-training in corpus techniques is an essential condition for DDL to be successful. Such limits seem eminently reasonable, but there is a notable dearth of empirical studies to support them. This paper describes a simple experiment to see how lower-level learners cope with corpus data with no prior training. The language focus here is on linking adverbials in English, which are renowned to be difficult to teach using traditional methods. The subjects are 132 first-year students at an engineering college in France of roughly intermediate and lower levels of English. They were divided into random groups to compare their ability to deal with the target items using traditional sources (extracts from a bilingual dictionary or a grammar/usage manual) or corpus data (short contexts or truncated concordances). Performance was tested prior to the experiment, subsequently to check ability to use the different information sources as a reference, and later to test recall. No evidence was found that traditional sources promote better recall, and corpus data seemed to be more effective for reference purposes. While the results of any single experiment must be treated with caution, these findings suggest the need for more empirical studies to complement the theoretical arguments and qualitative data which currently dominate the discussions of DDL.
Article
This paper reports on the feasibility and value of an approach to teaching EAP writing in which students construct and examine their own individual, discipline-specific corpora. The approach was trialed in multidisciplinary classes of advanced-level students (mostly graduates). The course consisted of six weekly 2-h sessions. Data were collected from initial and final questionnaires, which provided background information and asked students to evaluate the corpus work. Data from 50 participants are presented and show generally positive results. Over 90% of students found it easy to build their own corpora and most succeeded in constructing a corpus of 10–15 research articles. Most students were enthusiastic about working with their own corpora: about 90% agreed that their corpus helped them improve their writing and intended to use it in the future. This suggests that even corpora of this size and type can provide a useful resource for writing discipline-specific texts. The paper discusses the data on participants’ attitudes and experiences and considers the issues and problems that arise in connection with do-it-yourself corpus-building. It argues that this approach need not be restricted to small groups of well-resourced students, but can be implemented in mainstream EAP classes.
Article
Direct corpus use by learners or learner concordancing has been hailed as one of the promising areas that can revolutionize L2 writing and language pedagogy as a whole (Conrad, 2000; Hyland, 2003). It has been discussed to promote data-driven learning (Johns, 1988), to provide authentic contexts in which linguistic items are used, and to serve as a reference tool that students can use for language problems. However, these benefits have been more talked about than tested with empirical studies, and only recently researchers have started to conduct studies in this area. Focusing on L2 writing, the present study explored how and to what extent this potential of concordancing has been realized by reviewing the relevant studies. The inclusion criteria for the current review were studies that provide information on the effects of corpus concordancing by learners of L2 writing and on learners’ evaluation of it. Twelve studies included in the review show that if proper training and assistance are provided, learner concordancing can be a viable research and reference tool for enhancing the linguistic aspects of L2 writing and for increasing learner autonomy. Future studies are also suggested based on the gap identified in the reviewed studies.
Corpus consultation is gaining in prominence as a language learning tool. This approach to language analysis has made its way into the language classroom where its presence ranges from the presentation of printed concordance data with accompanying tasks to the direct use of concordancing software by learners themselves to carry out analyses of self-selected language features. Activities of the latter kind place concordancers fairly and squarely alongside dictionaries and grammar books as significant tools in the language learner's kit. Recent studies have indicated that research is needed to provide support for the integration of corpus consultation into the language learning environment. Here, the response of second year undergraduate EAL students was examined to a course assignment that required them to investigate language features characteristic of a range of genres using a popular concordancing software program, Wordsmith Tools. Results showed that students generally had a positive response to corpus consultation and were able to identify benefits clearly, particularly in the areas of vocabulary acquisition and increased awareness of syntactic patterns. Most of the students indicated they are likely to use concordancers in the future and this interest is strongest amongst those students who have clear goals for their language learning. Course assignments produced by the students demonstrated an increased awareness of lexico-grammatical usage, particularly with regard to vocabulary use, phrases and colligational patterns. A number of obstacles to greater uptake of concordancing are identified and suggestions are made to overcome those obstacles.
Chapter
This paper sets out to evaluate the effect on learners’ knowledge and use of language of one prominent technique in corpus pedagogy, the data-driven use of corpus concordances with learners as researchers, or Data-Driven Learning (DDL) (Johns 1988, 1991). More specifically, the paper attempts measurement of the effect of DDL on the achievement of the goal of appropriate production by learners of logical connectors, an important subskill in the context of the wider objective of the acquisition of basic academic writing skills in English. The evaluation uses learner corpora from experimental and control groups, supported by other methods. The conclusion is that DDL, applied in the context of the communicative teaching of writing skills, is moderately effective, and that there is potential both for the further development of learner corpora in an evaluative role, and for use of a wider range of instrumentation.
Article
Corpora have been used for pedagogical purposes for more than two decades but empirical studies are relatively rare, particularly in the context of grammar teaching. The present study focuses on students' attitudes towards grammar and how these attitudes are affected by the introduction of concordancing. The principal aims of the project were to increase the students' motivation by showing them that English grammar is more than a set of rules in a book and to enable them to assume more responsibility for their own learning. The idea was to introduce the use of language corpora into the curriculum for first-semester English at Växjö University in Sweden, as a complement to grammar textbooks and ordinary exercise materials. Between classes, the students worked with problem-solving assignments that involved formulating their own grammar rules based on the examples they found in the corpus. In the classroom, a system of peer teaching was applied, where the students took turns at explaining grammatical rules to each other. Besides presenting a new way of working with grammar, we also provided the students with a tool for checking questions of usage when writing English texts in the future, since the corpus we use is free of charge and available to all. The work with corpora and peer teaching was evaluated by means of questionnaires and interviews. This article describes and evaluates this initiative and presents insights gained in the process. One important conclusion is that using corpora with students requires a large amount of introduction and support. It takes time and practice to get students to become independent corpus users, knowing how to formulate relevant corpus queries and interpret the results. Working with corpora is a method that some students appreciate while others, especially weak students, find it difficult or boring. Several of the students did not find corpora very useful for learning about grammatical rules, but realized the potential of using corpora when writing texts in English.
Article
Considerable research has now been undertaken into the development of different approaches to exploiting language corpora for pedagogic purposes in the context of ESP. The question of how language corpora might be utilized by students beyond the immediate language-teaching context is, however, one as yet seldom addressed in the literature. This study attempts to explore the relationship between student use of online corpus tools and academic and professional discourse practices in the context of a professional legal training course at The City University of Hong Kong. Students enrolled in this course were given instruction in how to consult an online concordancer as language support when completing their legal writing assignments. Drawing on narratives of student experience, and other informant data including detailed logs of searches and the outcomes of assessments of English language proficiency, the paper discusses the ways in which students make strategic use of the corpus tools provided to develop competence in writing for legal purposes. The paper concludes by appraising the potential of corpus-based methods as an affordance for studying the practice of Law, in particular as a means of enhancing the acquisition of professional expertise by novice lawyers.
Article
This paper shows how top-down and bottom-up approaches can be reconciled in EAP writing materials through a pedagogic approach which combines discourse analysis with corpus investigation. The materials have been trialled with approximately 40 international graduates and are designed both to introduce concordancing and to raise awareness of certain rhetorical functions. Here I present and discuss the material on Defending your Research against Criticism. Initial discourse-based tasks help students to recognise a two-part rhetorical pattern, in which the writer first concedes the possibility of criticism and then moves to neutralise its potentially negative effect. Subsequently, students perform controlled, context sensitive corpus searches, which provide broader exposure to the pattern and focus on specific lexico-grammatical issues. These corpus-based tasks require work on a small number of expanded concordance lines in detail, a procedure which leads to enhanced understanding of the context in which the rhetorical function appears. The two types of learning activity involve different, but complementary types of work: in the discourse tasks, the focus is primarily on function, whereas in the corpus tasks, it is on form. I argue that it is the combination of the two approaches that provides the enriched input necessary for students to make the connection between general rhetorical purposes and specific lexico-grammatical choices.
Article
This paper investigates three growing areas in language teaching, namely, induction, the use of concordances, and self-correction. For a class of students at a Thai university, lexical items causing errors in writing were identified, the students made small concordances of the lexical items from the Internet, and they then induced patterns from the concordance to apply in self-correction of their errors. Generally, students were able to induce valid patterns from their self-selected concordances and make valid self-corrections of their errors, and there was a strong correlation between these two abilities. Their ability to induce and self-correct, however, was perhaps affected by the part of speech of the lexical items focused on, and their ability to apply the induced patterns in self-correction was influenced by other aspects of the lexical items.
Article
This paper presents a discussion of an experimental, innovative course in corpus-informed EAP for doctoral students. Participants were given access to specialized corpora of academic writing and speaking, instructed in the tools of the trade (web- and PC-based concordancers) and gradually inducted into the skills needed to best exploit the data and the tools for directed learning as well as self-learning. After the induction period, participants began to compile two additional written corpora: one of their own writing (term papers, dissertation drafts, unedited journal drafts) and one of ‘expert’ writing, culled from electronic versions of published papers in their own field or subfield. Students were thus able to make comparisons between their own writing and those of more established writers in their field. At the end of the course, participants presented reports of their discoveries with some discussion of how they felt their rhetorical consciousness was raised and reflected on what further use they might be making of corpus linguistics techniques in their future careers. This paper gives an overview of how this course was structured, presents the kinds of discoursal and other linguistic phenomena examined and the sometimes surprising observations made, and reports on the pluses and minuses of this corpus-informed course as a whole, seen from the point of view of both learners and instructors.
Article
Sentence-level writing errors seem immune to many of the feedback forms devised over the years, apart from the slow accumulation of examples from the environment itself, which second language (L2) learners gradually notice and use to varying degrees. A computer corpus and concordance could provide these examples in less time and more noticeable form, but until now the use of this technology has assumed roughly the degree of language awareness most learners are aiming at. We report on attempts to make concordance information accessible to lower-intermediate L2 writers. These attempts capitalize on some newly available opportunities as concordancing goes online. Our report: (1) makes a case in principle for concordance information as feedback to sentence-level written errors, (2) describes a URL-link technology that allows teachers to create and embed concordances in learners' texts, (3) describes a trial of this approach with intermediate academic learners, and (4) presents preliminary results.
Article
In recent years, there has been growing interest in the use of corpora in L2 writing instruction. Many studies have argued for corpus use from a teacher’s perspective, that is, in terms of how teachers can develop instructional materials and activities involving a corpus-based orientation. In contrast, relatively little attention has been paid to investigations of learners’ actual use of corpora and their attitudes toward such use in the L2 writing classroom. This paper describes a study of corpus use in two ESL academic writing courses. Specifically, the study examined students’ corpus use behavior and their perceptions of the strengths and weaknesses of corpora as a second language writing tool. The study’s qualitative and quantitative data indicate that, overall, the students perceived the corpus approach as beneficial to the development of L2 writing skill and increased confidence toward L2 writing.
Article
This paper reports on a qualitative study that investigated the changes in students’ writing process associated with corpus use over an extended period of time. The primary purpose of this study was to examine how corpus technology affects students’ development of competence as second language (L2) writers. The research was mainly based on case studies with six L2 writers in an English for Academic Purposes writing course. The findings revealed that corpus use not only had an immediate effect by helping the students solve immediate writing/language problems, but also promoted their perceptions of lexico-grammar and language awareness. Once the corpus approach was introduced to the writing process, the students assumed more responsibility for their writing and became more independent writers, and their confidence in writing increased. This study identified a wide variety of individual experiences and learning contexts that were involved in deciding the levels of the students’ willingness and success in using corpora. This paper also discusses the distinctive contributions of general corpora to English for Academic Purposes and the importance of lexical and grammatical aspects in L2 writing pedagogy.
Article
Despite considerable research interest, data-driven learning (DDL) has not become part of mainstream teaching practice. It may be that technical aspects are too daunting for teachers and students, but there seems no reason why DDL in its early stages should not eliminate the computer from the equation by using prepared materials on paper – considerably easier for the novice learner to deal with. This paper reports on a simple experiment to see how lower-level learners cope with such paper-based corpus materials and a DDL approach compared to more traditional teaching materials and practices. Pre- and post-tests show both are effective compared to control items, with the DDL items showing the biggest improvement, and questionnaire responses are more favourable to the DDL activities.
Retrieved from <http://www.antlab.sci.waseda.ac.jp/antconc_index.html>. [computer software].
Bernardini, S. (2000). Systematising serendipity: Proposals for concordancing large corpora with language learners. In L. Burnard & T. McEnery (Eds.), Rethinking language pedagogy from a corpus perspective (pp. 225–234). Frankfurt, Germany: Peter Lang.
Bianchi, F., & Pazzaglia, R. (2007). Student writing of research articles in a foreign language: Metacognition and corpora. In R. Facchinetti (Ed.), Corpus linguistics 25 years on (pp. 261–287). Amsterdam: Rodopi.
Boulton, A. (2010b). Learning outcomes from corpus consultation. In F. Serrano Valverde, M. Moreno Jaén, & M. Calzada Pérez (Eds.), Exploring new paths in language pedagogy: Lexis and corpus-based language teaching (pp. 129–144). London: Equinox.
Charles, M. (in press). Genre, corpus and discourse: Enriching EAP pedagogy. In P. Thompson & G. Diani (Eds.), English for Academic Purposes: Approaches and Implications. Newcastle upon Tyne: Cambridge Scholars Publishing.
Cresswell, A. (2007). Getting to 'know' connectors? Evaluating data-driven learning in a writing skills course. In E. Hidalgo, L. Quereda, & J. Santana (Eds.), Corpora in the foreign language classroom (pp. 267–287). Amsterdam: Rodopi.
Estling Vannestål, M., & Lindquist, H. (2007). Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course. ReCALL, 19(3), 329–350.
Yoon, C. (2012). Using web-concordancing and other internet-based reference resources as writing assistance: A mixed methods study of Korean ESL graduate students' academic writing. In Paper presented at the 10th international teaching and language corpora conference, Warsaw, Poland.