Exploring the use of multidimensional analysis of learner language
to promote register awareness
Pilar Aguado-Jiménez*, Pascual Pérez-Paredes, Purificación Sánchez
Dpto. de Filología Inglesa, University of Murcia, 30071 Murcia, Spain
Received 15 December 2010; revised 6 December 2011; accepted 11 January 2012
Abstract
This paper discusses the use of multidimensional analysis (MA) of learner language to promote the awareness of linguistic
concepts such as register and variation. Our research explores the introduction of learner register awareness by using MA of learner
language in the field of university Foreign Language Teaching (FLT). In this context, a group of learners (N = 47) completed an
awareness-raising activity based on an MA of their own spoken language and the language of native speakers fulfilling the same
tasks. This research illustrates practical ways in which MA of learner and native languages can be used in the context of university
language learning. Our research shows that it is feasible to (a) carry out MA of learner language and (b) relate the analysis to the
notions of register and language variation in university EFL. The participants confirmed that after the activity they were better
prepared to understand the role of register and the connections between individual linguistic features and registers.
© 2012 Elsevier Ltd. All rights reserved.
Keywords: Learner corpora; Multidimensional analysis; EFL; Data-driven analysis; Corpus linguistics; Language awareness; Register
1. Introduction
Using corpus data and electronic corpora in the language classroom is becoming increasingly common,
especially in the context of higher education (Boulton, 2009; Pérez-Paredes, 2010a). The use of corpus linguistics,
which first focused its attention on L1, is now gaining momentum in the field of learner language as studies involving
the use of learner corpora comprise more and more diverse areas of both language practice and language education
(Gass and Selinker, 2001). Thus, during the last decade, a significant array of initiatives has focused on the design of
learner corpora and their pedagogical implications in the domain of language learning, not only in materials design,
but also in pedagogical and methodological innovations (Tono, 1999, 2000, 2002, 2003).
The combination of existing corpus linguistics analysis methods, learner language corpora and learner-oriented
pedagogic applications of corpora presents important opportunities for innovation in the field of foreign language
education. More specifically, in the context of tertiary education, where English for Academic Purposes (EAP) learners
face the increasing need to master not only the basics of language communication but also the subtleties of academic
discourse, it seems necessary that language professionals examine ways in which corpus linguistics and learner
language corpora can be put to good, sound pedagogic use.
* Corresponding author. Tel.: +34 68 88 76 87; fax: +34 68 88 31 85.
E-mail addresses: paguado@um.es (P. Aguado-Jiménez), pascualf@um.es (P. Pérez-Paredes), purisan@um.es (P. Sánchez).
0346-251X/$ - see front matter © 2012 Elsevier Ltd. All rights reserved.
doi:10.1016/j.system.2012.01.008
www.elsevier.com/locate/system
Aguado-Jiménez, P., Pérez-Paredes, P. & Sánchez, P. (2012). Exploring the Use of Multidimensional Analysis of Learner
Language to Promote Register Awareness. System, 40 (1), pp. 90-103.
While the analysis of isolated linguistic features in learner language, e.g. frequencies of comparative adjectives in
a text, may allow for an extremely fine-grained view of learners’ use of well-delimited morphological areas of language,
this perspective alone fails to capture a substantial component of language as used for real-life communication
in specialised contexts, including English for Special Purposes (ESP) education: register and register
awareness in language education. This is the area where multidimensional analysis (MA) can play an important role.
The aim of this paper is to explore the introduction of learner register awareness by discussing both the benefits and
the challenges of using MA of learner language in English Language Teaching (ELT), particularly in the field of
university language teaching. To this end, 47 learners completed an awareness-raising activity which used MA of their
own spoken English language and the language of native speakers fulfilling the same speaking tasks. The learners then
answered a questionnaire where they gave us their own perceptions on the activity.
We intend to gain insight into the practicability of using MA of spoken learner language and report on the different
steps and stages that other professionals should follow in order to make use of MA in their own teaching contexts. In
section 2, we review how corpus linguistics and English language teaching have contributed to the study of register
variation and the analysis of learners’ data for pedagogic purposes. Sections 2.3 and 3, respectively, deal with the
research question and the research methodology. The results of our research are presented in section 4, while in section
5 they are discussed in light of the feasibility of using MA of learner corpora in university language teaching.
2. Literature review
2.1. Corpora and language awareness
The attention to learner register in the language classroom has been scant, although register itself has been on the
researchers’ agenda for decades. In this sense, there seems to be a general consensus on the need to teach register to
language learners. Chapman (1982) stated that studying register in the Foreign Language (FL) classroom required that
learners should have an advanced level of competence and appropriate materials, which disqualifies the analysis of
register at the initial stages of foreign language learning. Jiang (2006) concludes that appropriateness must be taught
alongside grammar and lexis, while Yeung (2009) has found in corpora a valuable source of information to introduce
learners to advanced uses of particular lexical items, which involves students considering the notion of register.
Ruehlemann (2007) has turned to the spoken medium and has discussed the importance of analysing conversation as
a key register for language learners, stressing the need for approaches which are register-sensitive in the description of
language in the context of language learning. To this end, the use of corpora can play a central role in making the
analysis and study of registers more accessible now than in the past (Anderson and Corbett, 2009).
Some research efforts have examined the role of language corpus data in promoting learners’ language awareness.
These proposals rest upon the idea that learners’ work on language data will foster a more profound grasping of native
speakers’ use of the language. As opposed to indirect uses of corpora in language teaching, these efforts are more
direct, hands-on, innovative proposals involving the use of language corpora (Römer, 2008).
Alderson (1996) spearheaded work in this area by analysing the possibilities of using corpora for language learning
assessment, highlighting the potential that corpora have in contributing to the construction and evaluation of language
tests in a variety of ways. Hung (2002) has encouraged teachers of English to explore ways in which corpora can be
used in understanding the state of the language, in designing language learning tasks, and in raising the learners’
awareness of their learning process. Authors such as Vinther (2004) have discussed the benefits of using a particular
parser (VISL) as a pedagogical tool for advanced students while others (Pérez-Paredes and Cantos-Gómez, 2004) have
debated the use of self-check questionnaires to increase learner language awareness and attention to form.
Learner corpus research (Granger, 1994, 1998, 2003) has brought together two important areas of academic
exploration in applied linguistics: SLA and second/foreign language learning and teaching. The results obtained at the
crossroads of these fields can be applied to the improvement of second and foreign language learning and teaching,
mainly in the areas of materials design and classroom methodology (Granger, 1994, 1998; Granger et al., 2002;
Meunier, 2002). Many of the research efforts in the tradition of Granger (1994) make use of contrastive inter-
language approaches where L2 production is set against L1 language produced by native speakers. This has been the
dominant research perspective in learner language research in the last few years, one which has relied almost
exclusively on intergroup contrast of the patterns of use of linguistic items. Díaz-Negrillo and Fernández-Domínguez
(2006, p. 84) state that work on learner corpora is “usually intended to disclose areas where learners tend to show
underuse or overuse of linguistic features as opposed to native language [...] or to gain insights into the errors that
learners of a given language tend to make”. Thus, learner language is usually characterised against the frequency of
use of isolated items in L1 attested data.
Learner corpora offer “a new type of data which can inform thinking” in SLA research (Granger, 2002, p. 5).
However, this thinking can be applied to other areas of language education. Results from MA of learner corpora can be
used as a tool to explore linguistic behaviour by foreign language learners (Johansson, 2007), who could thus become
researchers of their own learning. Bearing in mind that a set of dimensions can be fully described, learners can evaluate
their foreign language production and compare it against what is considered the “standard” target language (Aston,
2002; Pérez-Paredes and Cantos-Gómez, 2004; Pérez-Paredes, 2010b). Deviations from this standard may be due to unexpected
use of the language or to inappropriate use of the conventions for a register. Such deviations are a valuable source for both
convergent and divergent tasks (Bernardini, 2000) as well as semi-guided and guided awareness-raising activities.
Multidimensional analysis is a data analysis process that groups data into different categories, or dimensions, whose
components are related. This technique has been underused as a tool for language research and pedagogy. One of the
few studies where MA was used to explore learner language is Connor-Linton and Shohamy (2001). These authors
used data contributed by 10 adult female L1 Hebrew EFL learners. The authors studied the stylistic variation of non-
native speakers’ spoken discourse across different elicitation tasks and contexts (face-to-face vs. tape-mediated) in this
group of learners of varied proficiency levels (Shohamy et al., 1993). These individuals completed three different tasks
in parallel forms in order to minimize memorization effects. In the first task, they told their interviewer about
themselves; in the second, using the role-play technique, they were asked to complain about noise; in the third, they
had to ask a teacher for an extension on a term paper or a second chance on a final exam. These tasks combined with
five elicitation contexts (face-to-face conversation with a tester, with a peer, telephone interaction, videotaped prompt
and audiotaped prompt). The authors found that the t-tests of the dimension scores confirmed that each pair elicits
“stylistically and functionally equivalent performance samples” (Connor-Linton and Shohamy, 2001, p. 133). Simi-
larly, their MA provided evidence that the stylistic profiles of complaints and requests elicited similar language in
terms of communicative functions, which, according to the authors, shows some of the potential uses of MA in
designing oral proficiency interviews (OPI) which can discriminate a more varied set of speech events.
2.2. Applications of multidimensional analysis of language and language corpora
Multidimensional analysis involves the use of statistical techniques to study register variation in language corpora.
After identifying a set of linguistic features organized by their grammatical class based on “a survey of previous
research on spoken/written differences” (Biber, 1988, p. 72), e.g. tense and aspect markers, pronouns and pro-verbs,
a factor analysis was carried out to explore their distribution. In MA, each set of co-occurring features in every
factor is considered a dimension at whose extremes we can find register types that show a remarkably different mean
score for this particular dimension. In Biber (1988) these dimensions of use were labelled as:
(1) involved vs. informational production. Dimension 1 includes discourse with interactional, affective, involved
purposes as well as discourse with highly informational purposes, ranging from personal telephone conversa-
tions, a register loaded with interactional features, to natural science academic prose, a register displaying
“highly informational density” (Biber, 1988, p. 107).
(2) narrative vs. non-narrative discourse. Dimension 2 distinguishes discourse with primary narrative purposes from
discourse with non-narrative purposes.
(3) situation-dependent vs. elaborated reference. Dimension 3 has to do with discourse that identifies referents
through relativisation, and discourse that relies on reference to an external situation for identification purposes.
(4) overt expression of argumentation. Dimension 4 refers to those features associated with the speaker’s expression
of point of view or with argumentative styles intended to persuade the addressee.
(5) abstract vs. non-abstract style. Dimension 5 ranges from texts with a highly abstract focus to those with non-
abstract focuses.
(6) on-line informational elaboration marking stance. Dimension 6 distinguishes between informational discourse
produced under highly constrained conditions in which the information is presented in a relatively loose,
fragmented manner, and other types of discourse which may be highly integrated or discourse that is not
informational;
(7) academic hedging. Dimension 7 is related to hedging in different linguistic contexts, although factorial results were not
strong enough for a firm characterization, and subsequently, this dimension was not discussed in detail by Biber (1988).
By focussing on this range of dimensions, researchers “consider the likely reasons for the complementary
distributions between positive and negative feature sets [...] and co-occurrence patterns” (Conrad and Biber, 2001, p. 24).
The interpretation of these data leads us to the assumption that features appearing very frequently in a particular
register are usually rare in another. For instance, in the case of Dimension 3 (situation-dependent vs. elaborated
reference) this can be seen when comparing, for example, official documents with broadcasts (Conrad and Biber, 2001).
This is a finding of the utmost interest as it relates to basic research skills which are not exclusive to experienced
linguists, but can also be put into practice by learners (Sinclair, 2003).
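A dimension score of the kind discussed above can be sketched in a few lines. The sketch assumes the procedure usually attributed to Biber (1988): each feature's normalized frequency is standardized against a reference corpus, and the standardized values of the negative-loading features are subtracted from the sum of the positive-loading ones. The feature names, the reference rates and the function names are hypothetical and serve only as an illustration; they are not taken from the data of this study.

```python
from statistics import mean, stdev


def standardize(value, reference_values):
    """Return the z-score of value relative to a reference sample."""
    return (value - mean(reference_values)) / stdev(reference_values)


def dimension_score(text_rates, reference_rates, positive, negative):
    """Sum the z-scores of positive-loading features and subtract
    the z-scores of negative-loading features."""
    z = {
        feature: standardize(text_rates[feature], reference_rates[feature])
        for feature in positive + negative
    }
    return sum(z[f] for f in positive) - sum(z[f] for f in negative)


# Hypothetical rates per 1,000 words in four reference texts (illustrative only).
reference_rates = {
    "private_verbs": [10.0, 20.0, 30.0, 40.0],
    "contractions": [5.0, 15.0, 25.0, 35.0],
    "nouns": [150.0, 200.0, 250.0, 300.0],
}

# Hypothetical rates for one interview transcript.
text_rates = {"private_verbs": 40.0, "contractions": 35.0, "nouns": 150.0}

score = dimension_score(
    text_rates,
    reference_rates,
    positive=["private_verbs", "contractions"],
    negative=["nouns"],
)
print(round(score, 2))
```

A text that is rich in private verbs and contractions and poor in nouns receives a positive score, i.e. it falls towards the involved end of a Dimension 1 type scale; the reverse pattern yields a negative score.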
MA is different from other corpus methods in important practical ways. The focus of MA shifts the researchers’
attention from a particular language feature to the notion of language variation as a continuum, which shows major
trends of functionality that go beyond the individual analysis of particular linguistic features, e.g. hedges or
nominalization. MA explores linguistic variation by using a corpus-based methodology that benefits from both quantitative
and qualitative research methods (Conrad and Biber, 2001). In this fashion, MA has played an important role in the
description of the English language (Biber, 1988, 2003, 2006; Biber et al., 1999, 2002, 2004; Conrad and Biber, 2001).
However, the application of MA may go beyond the purely linguistic boundaries of description and can be applied to
the teaching and learning of foreign languages.
Longitudinal and synchronic studies applying MA can measure and analyse (i) students’ communicative
competence in one academic discipline and establish (ii) language use profiles across different disciplines. Biber
(2006, p. 203) states that specific dimensions of language use “are associated with important differences across
academic disciplines”. It is no coincidence that Biber (2006, p. 1) comments on the more innovative position of ESL/
EFL programs, particularly ESP/EAP programs, in the sense that “ESP programs emphasize the different linguistic
patterns used for different registers [...], while EAP programs emphasize the different vocabulary and linguistic
patterns associated with specific academic disciplines.” He also insists on the need to elaborate a full linguistic
description of these registers. Comparison with a native language corpus coming from EFL students from similar
backgrounds and conditions would be very attractive, so deviations from what should be understood as “standard” in
terms of frequency use could be established, and contrastive interlanguage analysis could be developed on a different
basis from that of classical interlanguage learner corpora (Granger, 1998, 2002).
Previous research confirms that corpora are an excellent basis for materials design (Aston, 1995, 2001; Aston et al.,
2004; Gavioli, 2005; Gavioli and Aston, 2001; Granger, 1998; Granger et al., 2002; Jacobi, 2001; Partington, 1998;
Sinclair, 2004). Several pedagogic applications have been designed to help learners improve their linguistic skills by
means of learner corpora materials. This is the case of Milton’s WordPilot, based on a Hong Kong learners’ corpus
(Pravec, 2002), Allan’s (2002) TeleNex network, where students’ problem files are provided describing areas of
learner difficulty extracted from a learner corpus, and where teaching implications to help teachers deal with these
problems in the classroom are proposed, and Flowerdew’s (2001) study on collocational, pragmatic and discourse
features using the Hong Kong University of Science and Technology Corpus, together with proposals to use research
findings in pedagogical materials design. As Kennedy (1998, p. 281) states, corpus evidence suggests “which
language items and processes are most likely to be encountered by language users, and which therefore may deserve
more investment of time in instruction”. However, none of these efforts include MA or other quantitative method-
ologies to characterise learners’ register(s).
2.3. Research question
MA is a reasonable starting point for an analysis of learner language which integrates the notion of register and
variation in the context of tertiary education. Appropriate register variation is a prerequisite for academic success
(Biber, 2006) and MA provides researchers with the methodology to analyse the scope of variation across registers in
a given communication domain, such as the university. Our research question is whether the use of MA of learner
language is perceived by advanced university EFL learners as an effective tool to increase their awareness of linguistic
concepts such as register and variation.
3. Method
3.1. Framework
Taking into account the research on MA discussed previously, we aimed to set up a framework that (1) examines the
oral production of first-year EFL Spanish university students and (2) uses the results from this analysis to explore the
potential applications of MA of learner language in the context of higher education.
To do this we devised a 3-step approach which involved (1) the compilation of a learner corpus, (2) the MA of this
corpus and (3) the use of the MA results to explore the feasibility of using MA to promote learner language awareness.
A second corpus of British speakers was compiled, recorded, transcribed, marked up and analysed in order to exploit
the comparability of our initial learner data. The methodological framework for step 1 is De Cock (1998); for step 2 it
is Biber (1988, 2003, 2006), and for step 3 it is Lee and Swales (2006).
3.2. Setting
Our learner data (C1) was collected during 2005 from 59 students with an average age of 19.6. The average number
of years that these learners had spent studying English before starting university was 8.8. Almost half of the informants
had travelled to English-speaking countries, 45.8% staying abroad for an average of 1.9 months. All of them were
enrolled in the English Studies degree offered at the Universidad de Murcia, Spain.
The corpus of English speaker language (C2) was compiled at the Manchester Metropolitan University, UK. The number
of informants in C2 was 28, all of them native speakers of English (average age = 22.25). This corpus was collected in 2006.
3.3. Data
In the learner corpus, 59 learners were interviewed by native English speakers. The interview followed the OPI format
of the Louvain International Database of Spoken English Interlanguage (LINDSEI) corpus (Gilquin et al., 2010), and was
structured in three parts. First, speakers were given three topics to choose from: an experience that taught them an important
lesson, a country which impressed them or a film or play which the speaker particularly liked. This was the personal
narrative component of the interview. A small part of the interview was then devoted to interpersonal communication.
Finally, students were given four pictures which told a story and were asked to describe them and offer an account of what
was going on there.¹ This was the picture description component of the interview. The total number of words (considering
only the informants’ production) is 45,558, and the mean word count is 772.16 per contributor. In the British speakers’
corpus, the total number of words considered is 21,509, and the mean word count is 796.62 per contributor.
After transcription by qualified native speakers of English, the two corpora were tagged for part of speech (POS) at
Northern Arizona University under the supervision of Prof. Douglas Biber. Following the indications provided
in Biber (1988) and Conrad and Biber (2001), frequency mean scores and dimension scores were calculated for the
linguistic features therein. The analytical procedure included the identification of the most prominent co-occurring
linguistic features to compare and contrast with Biber’s work, as well as the comparison between learner language
and native speaker language, as linguistic co-occurrence has been considered of paramount importance in the study of
register variation (Ervin-Tripp, 1972; Hymes, 1974; Brown and Fraser, 1979; Halliday, 1988). In order to ease the
interpretation of MA, we used CRAT (Corpus Research Analysis Tool), a tool developed by our research team that
facilitates the handling of corpora which have been encoded following MA. This tool enables researchers to navigate
POS tags as well as dimensions. It is freeware and can be obtained for research and academic purposes.²
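The frequency scores mentioned in this section depend on normalizing raw counts so that interviews of different lengths can be compared. The following is a minimal sketch of that step, assuming a rate per 1,000 words and a transcript represented as (word, tag) pairs. The tag labels (`VPRIV`, `HEDGE`, etc.) and the function names are hypothetical; they do not reproduce the tagset actually used for the two corpora.

```python
from collections import Counter


def rate_per_thousand(feature_count, word_count):
    """Normalize a raw count to a rate per 1,000 words."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 1000.0 * feature_count / word_count


def feature_rates(tagged_tokens, feature_tags):
    """Compute the normalized rate of each feature in one transcript.

    tagged_tokens: list of (word, tag) pairs.
    feature_tags: mapping from a feature name to the tags that realise it.
    """
    total_words = len(tagged_tokens)
    tag_counts = Counter(tag for _, tag in tagged_tokens)
    return {
        name: rate_per_thousand(sum(tag_counts[t] for t in tags), total_words)
        for name, tags in feature_tags.items()
    }


# Hypothetical tagged fragment of a learner interview (illustrative only).
tokens = [
    ("I", "PRON1"), ("think", "VPRIV"), ("it", "PRON3"), ("is", "VPRES"),
    ("sort_of", "HEDGE"), ("nice", "ADJ"), ("you", "PRON2"), ("know", "VPRIV"),
]
features = {"private_verbs": ["VPRIV"], "hedges": ["HEDGE"]}

rates = feature_rates(tokens, features)
print(rates)
```

Averaging such rates over all contributors to a corpus would give one frequency mean per feature, which is the kind of figure compared between learners and native speakers later in the paper.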
3.4. Awareness-raising experience
Following Lee and Swales’ (2006, p. 72) approach to the use of corpora to promote technology-enhanced
rhetorical consciousness-raising, 47 of the learners who contributed to the learner corpus took part in an activity
which involved the analysis of their own language as a means to increase language awareness on register and
language variation. A questionnaire was completed after the experience where these learners gave the researchers
feedback on the task. Before examining their own language, the students read a text which dealt with the role of
register.
¹ Further details can be found at http://cecl.fltr.ucl.ac.be/Cecl-Projects/Lindsei/lindsei.htm#data.
² Further information on http://perezparedes.blogspot.com/search/label/CRAT.
We followed a 4-stage procedure in our exploration of learner language. Learners were required to read a text and
think about the role of the frequency of linguistic features and the configuration of different registers (Appendix 1).
Learners then contrasted their own frequency mean for one single linguistic feature, adverbial hedges in our case,
against the mean for this linguistic feature of the British speakers. After that, we plotted the Dimension 1 (involved vs.
informational production) score of the learner interview against the score for Dimension 1 on different registers in Biber
(1988) and the dimension score of the British speakers’ interviews (Appendix 2). By doing this, we tried to profile
individual learners’ language from an MA perspective which favours register over more restricted morpho-syntactic
accounts of language. After the session, the learners completed a 1-5 Likert scale questionnaire which recorded
their opinions (Appendix 3). We decided to use Dimension 1 as it represents a basic measurement “of variation among
spoken and written texts in English” (Biber, 1988, p. 104). Given the pedagogic orientation of our experience, we
believed that by focussing on this dimension, learners would develop a clearer grasp of the notion of register across
written and spoken registers.
4. Results
We carried out a MA of the language of 87 individuals, namely 59 learners of English and 28 British native English
speakers. Section 4.1 offers some insight into the kind of MA that was performed. Because of the scope of this paper,
the scores for all five dimensions cannot be discussed in full. We will, however, discuss here the results for Dimension
1, involved vs. informational production, in order to offer the reader a brief account of the possibilities of MA of learner
language. In section 4.2, we will present the results of the questionnaire that the learners completed after the language
awareness session.
4.1. Multidimensional analysis of learner language
In MA, every text-type or register is given a mean score for every dimension, thus characterising register variation
and contrasting textual types against dimensions. For our corpus of language learners, we obtained the mean scores for
the five most significant dimensions discussed by Biber (1988) and Conrad and Biber (2001). Table 1 shows these
means and the scores for different registers in Biber’s (1988) study:
The finding that the mean score of our learner OPI corpus for Dimension 1, involved vs. informational production, is
placed in between the registers of face-to-face conversations, on the one hand, and spontaneous speeches and
interviews, on the other, seems to highlight the register appropriateness of the overall oral production of this group of
learners. In Biber’s (1988) Dimension 1, the linguistic features with the highest positive weightings are private verbs,
which express intellectual states (e.g., believe) or non-observable intellectual acts (e.g., discover), that-deletion,
contractions, present tense verbs, and second-person pronouns. The linguistic features with the highest negative
weightings are nouns and prepositions. Appendix 4 offers the mean scores for the linguistic features which integrate
this dimension.
If we observe the means of the particular linguistic features which are relevant for the interpretation of this
dimension of use (Dimension 1), we may find interesting cases like adverbs, with a normalized mean of 29.9 in our
learner corpus against 45.5 in the native speaker data; or present tense verbs, whose mean, 105.2, is higher than the
85.8 in the native speaker data. Appendix 4 offers these mean scores for all the features comprising this dimension.
Table 1
Mean scores for Dimension 1.
Corpus / register                                     Dimension 1 mean
EAP learner corpus                                    24.94
Interviews (Biber, 1988)                              17.10
Spontaneous speeches (Biber, 1988)                    18.20
Face-to-face conversations (Biber, 1988)              35.30
Personal letters (Biber, 1988)                        19.50
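The position of the learner corpus relative to Biber's (1988) registers can be made explicit by ordering the Table 1 means from the most involved to the least involved. The short sketch below uses only the figures reported in Table 1; the variable and function names are our own and purely illustrative.

```python
# Dimension 1 means as reported in Table 1.
REGISTER_MEANS = {
    "face-to-face conversations (Biber, 1988)": 35.30,
    "personal letters (Biber, 1988)": 19.50,
    "spontaneous speeches (Biber, 1988)": 18.20,
    "interviews (Biber, 1988)": 17.10,
}
LEARNER_MEAN = 24.94


def ranked_scale(scores):
    """Return (name, score) pairs ordered from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


scale = ranked_scale({**REGISTER_MEANS, "EAP learner corpus": LEARNER_MEAN})
for name, value in scale:
    print(f"{value:6.2f}  {name}")
```

Ordered in this way, the learner corpus sits below face-to-face conversations and above the other three registers, which matches the reading of Table 1 given in the text.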
In the context of this research, we were able to go a step further: having compiled a comparable corpus of
British speakers, we gave the learners that were interviewed the chance to check for themselves the scores of peer British
speakers that completed the same interview they did (Table 2):
Learner language resembles native data during the second part of the interview, i.e. during the interaction task, but
seems to move away from this trend during the third part of the interview. All in all, using a fully comparable corpus
offers ample opportunities to discuss the interface between linguistic features and register. As we have seen, hedging is
an important feature of academic English, and appears in Dimension 1 with a load of 0.58. Non-natives are said to
underuse hedges in their production, while native speakers tend to use hedging more frequently. Is this true? The use of
hedges in our learner corpus, for example, compared with the native speakers’ corpus differs sharply. Our data show
that while only 26.80% of learners use hedges, 75% of native speakers use them. The normalized mean scores (per
1000 words) for hedges in our corpora and in the case of spoken language in the context of humanities (Biber et al.,
2004) are shown in Table 3:
Biber et al.’s (2004) results coincide with our own as regards native speakers’ production in the oral medium,
confirming the use of hedges in the university environment. More precise information regarding specific hedges like
“sort of” and “kind of” also confirms an underuse on the part of our non-native informants, which turns this feature
into an area of reinforcement in learners’ spoken discourse. However, the most interesting contribution of MA resides
in the fact that linguistic features can be fully understood in the context of a specific register as one of the many
variables that, when combined, shape the linguistic nature of a particular register.
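Two different figures are involved in the comparison of hedges above: the share of speakers who use the feature at all, and the mean normalized rate across speakers. The sketch below shows how both could be obtained from per-speaker rates. The rates are invented for illustration and do not reproduce the values of our corpora; the function names are likewise hypothetical.

```python
from statistics import mean


def share_of_users(rates):
    """Percentage of speakers whose rate for the feature is above zero."""
    users = sum(1 for rate in rates if rate > 0)
    return 100.0 * users / len(rates)


# Hypothetical hedge rates per 1,000 words, one value per speaker.
learner_rates = [0.0, 0.0, 0.0, 1.2]
native_rates = [0.8, 0.0, 2.5, 1.1]

for label, rates in (("learners", learner_rates), ("native speakers", native_rates)):
    print(f"{label}: {share_of_users(rates):.1f}% use hedges, "
          f"mean rate {mean(rates):.2f} per 1,000 words")
```

Reporting both figures helps to separate a feature that few speakers use from one that many speakers use sparingly.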
4.2. Students’ reactions to the awareness-raising experience
A total of 47 learners were instructed about the role of register, the importance of variation and the role of linguistic features
in the different dimensions of use by comparing their own language performance with that of the British speakers
(4.4). After the session, they completed a 1–5 Likert scale questionnaire that recorded their opinions (Appendix 3).
Fig. 1 shows the percentage of learners choosing each answer for each of the five items in the questionnaire:
Answers to Question 1, Now I understand better the notion of register, show that over 75% of learners perceived
that they had increased their appreciation of the concept of linguistic register (SD = 0.80, Mode = 4) after the
awareness-raising activity. This finding is best interpreted when examined alongside the answers to Question 2, Before
this activity, I’d never reflected on the notion of register (SD = 1.14, Mode = 5), which reveal that, while some of
the learners had had previous opportunities for reflecting on register, a clear majority of
learners, over 63%, agreed or strongly agreed with the statement. Answers to Question 3, I found the concept of
register difficult to understand, show that the activity was pedagogically successful (SD = 0.79, Mode = 2) and
actually served its original purpose, as only 4% of the learners agreed or strongly agreed. Answers to Question 4, Now
I’m better prepared to understand the relationship between individual grammatical features and discourse (SD = 0.77,
Mode = 5), show that most learners, over 80%, perceived that the activity had been instrumental in providing them
with tools to establish a relationship between isolated grammatical features and discourse. Finally, answers to
Question 5, This activity will have an important impact on my future EAP learning, show that the awareness-raising
activity was perceived as having an important impact on future learning in the context of EAP (SD = 0.79, Mode = 5),
as over 80% of the learners agreed or strongly agreed with the statement.
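The three figures reported for each item (mode, standard deviation and percentage of agreement) can be derived from raw responses along the following lines. This is a hedged sketch rather than the procedure actually used: the function name `summarize_item` and the response list are hypothetical, agreement is taken to mean answers of 4 or 5, and the sample standard deviation is assumed because the paper does not state which formula was applied.

```python
from statistics import mode, stdev


def summarize_item(responses):
    """Summarize one Likert item (answers coded 1-5); hypothetical helper."""
    if len(responses) < 2:
        raise ValueError("at least two responses are needed for a standard deviation")
    # Answers of 4 (agree) or 5 (strongly agree) are counted as agreement.
    agree = sum(1 for r in responses if r >= 4)
    return {
        "mode": mode(responses),
        "sd": stdev(responses),  # sample SD: an assumption, see the lead-in
        "pct_agree": 100 * agree / len(responses),
    }


# Invented responses for illustration; these are not data from the study.
example = [5, 5, 4, 3, 2]
print(summarize_item(example))
```

Reporting the mode together with the percentage of agreement, as the paper does, shows both the most frequent answer and how much of the group shares a positive view.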
Table 2
British vs. Spanish speakers’ mean scores.
              EAP learner corpus mean   British speakers corpus mean
Dimension 1   24.94                     21.18
Table 3
Mean scores for “hedges”.
                      Hedges
EAP learners          1.23
Native speakers       2.97
Biber et al. (2004)   2.90
Please cite this article in press as: Aguado-Jiménez, P., et al., Exploring the use of multidimensional analysis of learner language to promote
register awareness, System (2012), doi:10.1016/j.system.2012.01.008
5. Discussion
Our research shows that, despite being time-consuming, it is feasible to carry out MA of learner language
(Connor-Linton and Shohamy, 2001) and relate its results to the notions of register and language variation in the context of
university foreign language learning. The learners that took part in our awareness-raising activity confirmed that after
the activity they were better prepared to understand the role of register and the connections between individual
linguistic features (Díaz-Negrillo and Fernández-Domínguez, 2006; Yeung, 2009) and registers. In this sense,
contextualising linguistic features like hedges, for example, may account for the differences found in native and
non-native discourse. In the fashion of Hyland (1996), MA of both learner and native speakers’ language can be used
by teachers to introduce the concept of “hedging” in the learners’ syllabus, to teach its function in oral discourse
(Jiang, 2006), and to implement it with activities intended to make learners’ oral production sound more natural and
appropriate (Ruehlemann, 2007). The learners who worked on this feature in the context of Biber’s (1988) dimension
of use 1 were better prepared to appreciate the wider, register-specific context of communication and its corresponding
functional interpretation.
The questionnaire also confirmed that the awareness-raising activity offered our learners the chance to consider
register for the first time. They also perceived that this activity would have an important impact on their future
language learning and that the notion of register was not overly difficult to grasp. This finding confirms that the
information we gave them was successfully presented and that the learners could use it to relate to their own activities
and experiences as learners, thus corroborating previous research which showed that learners can integrate corpus-
based language data into the flow of their learning experience (Cheng et al., 2003).
An analysis of the kind presented here can provide practitioners and learners with a better insight into how
a particular register can be profiled against existing descriptions of general English (Biber, 1988) or the English used
in particular registers and domains (Biber et al., 2002; Biber, 2003). Given the importance of academic language and
writing skills in FLT and ESP programs, teachers and researchers would normally be curious about how
oral or how written the oral production of their learners is. MA can play a substantial role here. In MA, every text-type
or register is given a mean score for every factor/dimension and, in this way, we can characterise register variation and
contrast textual types against dimensions. These mean scores are better appreciated when analysed in the context of
the continuum to which every dimension belongs (see section 4.1). Furthermore, our results are of particular interest as
they are based on classroom activities involving a greater number of learners (N = 59 for the learner corpus and N = 47
for the questionnaire) than other studies of similar corpus-based classroom experiences: Lee and Swales (2006) used
four PhD students as informants, and Yoon and Hirvela (2004) used twenty-two.
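To make the notion of a dimension score more concrete, the sketch below follows the general logic described in Biber (1988): each feature's normalized frequency is standardized against a reference corpus, and the standardized scores of negatively loaded features are subtracted from the sum of those of positively loaded features. It is an illustration under stated assumptions, not the software used in this study: the function name `dimension_score`, the choice of three features and all reference means and standard deviations are hypothetical placeholders, not values from Biber (1988) or from our corpora.

```python
def dimension_score(freqs, ref_mean, ref_sd, positive, negative):
    """Sum of z-scores of positive features minus sum of z-scores of negative ones."""

    def z(feature):
        # Standardize a per-1000-word frequency against the reference corpus.
        return (freqs[feature] - ref_mean[feature]) / ref_sd[feature]

    return sum(z(f) for f in positive) - sum(z(f) for f in negative)


# All numbers below are hypothetical placeholders chosen for easy arithmetic.
text_freqs = {"contractions": 2.0, "general_hedges": 3.0, "nouns": 200.0}
reference_mean = {"contractions": 1.0, "general_hedges": 1.0, "nouns": 180.0}
reference_sd = {"contractions": 0.5, "general_hedges": 1.0, "nouns": 10.0}

score = dimension_score(
    text_freqs,
    reference_mean,
    reference_sd,
    positive=["contractions", "general_hedges"],  # positive loadings on Dimension 1
    negative=["nouns"],                           # negative loading on Dimension 1
)
print(score)
```

Averaging such scores over all the texts of a register would give the register's mean score on the dimension, which is the value placed on the continuum.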
By studying contextualized, MA-driven uses of learner and native languages, researchers can inform material
developers on the pedagogical salience of the features under consideration. A case in point is, for example, the use of
contractions, which, among other features, appear to be used by our informants in ways that differ from native
speakers in oral texts. Students can be given opportunities to increase language awareness on a reflection interface that
brings together morphological as well as pragmatic uses of the English language. A good starting point is the use of
questionnaires (Pérez-Paredes and Cantos-Gómez, 2004) that invite learners’ analysis of how declarative and
procedural knowledge (Johnson, 1996) meet in real discourse, and which, according to our experience, have a very
positive impact on learners’ awareness of their learning process. This is corroborated by the opinions of the learners
that took part in our research.
Fig. 1. Results of the questionnaire in percentages. The X-axis displays the questions from the Likert scale questionnaire
together with the range of answers (from 1, I strongly disagree, to 5, I strongly agree). The Y-axis shows answers in percentages.
MA results can also help teachers and researchers in the analysis of learners’ interlanguage, making students get
more involved in the use of linguistic data at different levels. In our learner language corpus, deviations from the mean
dimension scores suggest differences in register use. Information from this analysis could provide teachers with
interesting feedback for classroom organization and reinforcement, so new materials coming from the corpus can be
used as a starting point for students’ linguistic and communicative study. Experiences of this type have been put into
practice by Lynch (2007), where students have improved their accuracy in the English language by transcribing their
own oral productions for a later reprocessing of the language used. Guillot (2002) has outlined the procedures that
could be used in FL pedagogic contexts, including work with word frequencies and concordance lines. What interests
us is her emphasis on the use of corpus-based data to explore language in discourse as opposed to language at sentence
or word level. In this sense, our research presents an original methodology to explore learner/native language in
context.
Despite the range of classroom applications which the use of MA of learner language offers, we should be cautious
in our appraisal of its implementation in FLT. The compilation, transcription and mark-up of learner language is time-
consuming, especially if it involves spoken language. However, it is the MA itself which poses more serious challenges.
Part of speech (POS) tagging of a corpus for MA can only be done if the guidelines of Biber (1988) are adapted
to the output of taggers like CLAWS or FreeLing. The MA proper can also be performed using Biber’s (1988)
indications, but this requires a high degree of familiarity with corpus linguistics tools. In our case, we decided to
go a step further and develop our own software to explore the results of the MA provided by Douglas Biber himself.
This software can be used free of charge for research purposes.
Cobb (2003) has stressed the fact that learner corpus research is a new paradigm in need of a great deal of
innovative pioneering work. This is even more so when we are faced with the application of MA to language
learning and teaching. As future work, we expect to offer further discussion on the possibilities of multidimensional
analysis in learner corpora applications to FLT with studies which involve a larger population of learners
and a wider choice of awareness-raising activities. In a similar way, we expect to offer new tools that facilitate the
use of MA by researchers who are not experts in corpus linguistics, thus amplifying the use of this approach in
FLT.
6. Conclusion
Based on the main features of MA previously described, learner corpora studies using MA have the potential to
generate learner data which may (i) characterise the students’ register at one or different points in time, (ii) contribute
to the design of materials for language pedagogy improvement, and (iii) use learner-driven language data to promote
language awareness in the language classroom (Aston, 2002). MA studies, then, have much in common with
contrastive analysis, as in both methodologies original data from different groups of speakers or of a different nature
are compared.
Pedagogic implications in the form of new materials and ways to teach language can stem from this discussion. Not
only could remedial work on linguistic and communicative areas where “differences” have been detected be of help,
but so could the view that the learner has to adopt an even more active role in his or her learning. This new role has
to do essentially with an increased sense of directionality and responsibility of the students’ own learning process.
In this sense, one of the benefits of MA is that, despite its potential for characterising registers, learner language
can be analysed from a more atomized and refined perspective, which offers language educators and learners the chance
to understand aspects of the foreign language that remain hidden in other types of teacher-learner interactions, such as oral
production feedback sessions, general tutorials and, among others, writing marking (Cheng, 2005; Kawai, 2006;
Lynch, 2007). This granularity often escapes the attention of language professionals who tend to focus on
norm-referenced or criterion-referenced evaluations which concentrate on more general categories of language
assessment. To alleviate this situation, experiences such as Lynch’s (2007) or the one described here could inspire
researchers using MA to give more accurate feedback to the students.
Acknowledgements
This research has been funded through a grant from the Ayudas para la realización de proyectos de investigación
(Fundación Séneca, Región de Murcia, 00580/PI/04, 2005-2008) and the Programa Nacional de Movilidad de
Recursos Humanos de Investigación, en el marco del Plan Nacional de Investigación Científica, Desarrollo e
Innovación Tecnológica 2008–2011, Programa Salvador de Madariaga para investigadores seniors (Ministerio de
Educación, Secretaría General de Universidades, PR2009-0439, 2010).
Appendix 1
Read the following extract from Ghadessy (1994: 288):
The concept of register comes under the larger concept of language variation in applied linguistics.
According to some applied linguists there are two main types of variation in language, i.e. variation based
on the user of language, and variation based on the use of language (Gregory, 1967). Dialects, idiolects,
sociolects, and genderlects are examples of the first type, while the language of science and technology, legal
English, the language of buying and selling, and the language of classroom interaction belong to the second
type. The term ’register’ has been used to refer to variation according to the use of language, i.e. functional
varieties.
Variation is a feature of language. Different uses of the language will determine different texts. Texts with the same
function and which address the same audience or readership will most likely belong to the same register and share
similar linguistic features. Features appearing frequently in a particular register, e.g. personal letters, are usually rare
in another, for example, official documents. Each group of co-occurring features is considered a dimension at whose
extremes we can find register types which show a remarkably different mean score for this particular dimension. The
idea is that a register has distinctive features and that these features are found more, or less, frequently in this
particular register.
Now read the following extract from Biber et al. (2002):
The dimensions identified in multidimensional analysis have both linguistic and functional interpretations. The
linguistic content is a group of features (e.g., nouns, attributive adjectives, prepositional phrases) that co-occur with
a markedly high frequency in texts. On the assumption that co-occurrence reflects shared functions, analysts
interpret the co-occurrence patterns to assess the situational, social, and cognitive functions most widely shared by the
linguistic features. For example, the frequent co-occurrence of first-person pronouns, second-person pronouns, hedges
[sort of, kind of, etc.], and emphatics in conversational texts is interpreted as reflecting directly interactive situations
and a primary focus on personal stance and involvement.
Features with higher loadings are thus better representatives of the dimension underlying a factor. Most of the
dimensions consist of two groupings of features, one with positive and the other with negative loadings. The positive
and negative sets represent features that occur in a complementary pattern. That is, when the features in one group
occur together frequently in a text, the features in the other group are markedly less frequent in that text, and vice
versa.
Appendix 2
Please read the transcription of your interview. Note the uses of adverbial hedges in your discourse (in bold). Your
uses of adverbial hedges like sort of, kind of, maybe or almost can be quantified through a mean score. This mean score
reflects how many times per 1000 words you use an adverbial hedge.
In your case, you use an adverbial hedge .... per 1000 words.
British speakers who did the same interview as you did used 3.4 adverbial hedges per 1000 words. See some
examples:
accepted sort of norm in a small French .. [sort of].. extremely Catholic .. er village and er people coming in from
mixing things up a bit erm .. so yeah it gets you [kind of] like thinking .. erm .. favourite actor ’s probably erm
even though he didn’t look like the evil man he portrayed.. it was [almost] evil the way he did it
and the school was sort of open .. in the morning from [maybe] eight o’clock .. till about twelve or one there ’s they
Despite these examples, the native speakers preferred kind of and sort of over the two other adverbial hedges:
Almost     1.42%
Maybe     20.00%
Sort of   34.28%
Kind of   44.28%
The use of adverbial hedges is an important linguistic feature of spoken registers like telephone conversations and
face-to-face conversations. They play no role in registers like official documents. It is important that you consider how
this linguistic feature, as well as others, may have an impact on the way you approach different language registers,
whether spoken or written, as all of them have their own singularity.
As for your overall interview, look at where your interview falls along this continuum. Note where the other types
of texts and registers appear on this dimension and where your peers’ mean and that of the British speakers lie.
What does this suggest to you? Read the text again (Appendix 1) and please discuss with your instructor.
Appendix 3
Likert scale questionnaire
Each item is rated on a scale from 1 (Strongly disagree) to 5 (Strongly agree): 1 2 3 4 5
1 Now I understand better the notion of register
2 Before this activity, I’d never reflected on the notion of register
3 I found the concept of register difficult to understand
4 Now I’m better prepared to understand the relationship between individual grammatical features and discourse
5 This activity will have an important impact on my future EAP learning
Appendix 4
Dimension 1: Involved versus informational production
Feature                      Factor loading on Dimension 1   Learner corpus mean   Native speaker corpus mean
                             (Biber, 1988)
Private verbs                 0.96                            23.346                18.224
THAT deletion                 0.91                             5.484                 5.397
Contractions                  0.90                             1.18                  2.10
Present tense verbs           0.86                           105.291                85.821
2nd person pronouns           0.86                            29.26                 21.22
DO as pro-verb                0.82                             3.200                 1.790
Demonstrative pronouns        0.76                             4.924                 5.583
General emphatics             0.74                             5.221                12.759
1st person pronouns           0.74                            55.966                47.297
Pronoun IT                    0.71                            19.765                26.128
BE as main verb               0.71                             0.937                 5.597
Causative subordination       0.66                             6.622                 4.124
Discourse particles           0.66                             7.678                 7.676
General hedges                0.58                             2.126                 3.421
Amplifiers                    0.56                            12.599                 9.517
Sentence relatives            0.55                             0.98                  1.73
WH questions                  0.52                             3.74                  1.32
Possibility modals            0.50                             4.147                 4.055
WH clauses                    0.47                             1.01                  1.00
Final prepositions            0.43                             5.019                 4.469
Adverbs                       0.42                            29.934                45.452
Conditional subordination     0.32                             1.11                  1.76
Nouns                        -0.80                           198.987               213.600
Word length                  -0.58                             3.726                 3.890
Prepositions                 -0.54                            67.822                70.959
Type/token ratio             -0.54                            38.15                 46.92
Attributive adjs.            -0.47                            14.297                15.183
Place adverbials             -0.42                             7.78                 10.92
Agentless passives           -0.39                             1.491                 4.048
References
Alderson, Ch., 1996. Do corpora have a role in language assessment? In: Thomas, J., Short, M. (Eds.), Using Corpora for Language Research: Studies in the Honour of Geoffrey Leech. Longman, London, pp. 248–259.
Allan, Q.G., 2002. The TELEC Secondary learner corpus. A resource for teacher development. In: Granger, S., Hung, J., Petch-Tyson, S. (Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. John Benjamins, Amsterdam, pp. 195–211.
Anderson, W., Corbett, J., 2009. Exploring English with Online Corpora. Macmillan, Basingstoke.
Aston, G., 1995. Corpora in language pedagogy: matching theory and practice. In: Cook, G., Seidlhofer, B. (Eds.), Principle and Practice in Applied Linguistics: Studies in Honour of H. G. Widdowson. Oxford University Press, Oxford, pp. 257–270.
Aston, G., 2002. The learner as corpus designer. In: Kettemann, B. (Ed.), Teaching and Learning by Doing Corpus Analysis. Proceedings of the Fourth International Conference on Teaching and Language Corpora. Rodopi, Amsterdam, pp. 9–26.
Aston, G. (Ed.), 2001. Learning with Corpora. CLUEB, Bologna (Biblioteca della Scuola superiore di lingue moderne per interpreti e traduttori, Forlì, 29), Houston, TX, Athelstan.
Aston, G., Bernardini, S., Stewart, D. (Eds.), 2004. Corpora and Language Learners. Studies in Corpus Linguistics, vol. 17. John Benjamins, Amsterdam.
Bernardini, S., 2000. Competence, Capacity, Corpora. A Study in Corpus-Aided Language Learning. CLUEB, Bologna.
Biber, D., 1988. Variation across Speech and Writing. Cambridge University Press, Cambridge.
Biber, D., 2003. Variation among university spoken and written registers: a new multi-dimensional approach. In: Leistyna, P., Meyer, Ch.F. (Eds.), Corpus Analysis. Language Structure and Language Use. Rodopi, Amsterdam, New York.
Biber, D., 2006. University Language. A Corpus-based Study of Spoken and Written Registers. John Benjamins, Amsterdam, Philadelphia.
Biber, D., Conrad, S.M., Reppen, R., Byrd, P., Helt, M., Clark, V., Csomay, E., Cortes, V., Urzua, A., 2004. Representing Language Use in the University: Analysis of the TOEFL 2000 Spoken and Written Academic Language Corpus. In: TOEFL Monograph Series, MS-25 (January).
Biber, D., Conrad, S., Reppen, R., Byrd, P., Helt, M., 2002. Speaking and writing in the university: a multi-dimensional comparison. TESOL Quarterly 36, 9–48.
Biber, D., Johansson, S., Leech, G., Conrad, S., Finegan, E., 1999. Longman Grammar of Spoken and Written English. Pearson Education, Essex, England.
Boulton, A., 2009. Testing the limits of data-driven learning: language proficiency and training. ReCALL 21, 37–54.
Brown, P., Fraser, C., 1979. Speech as a marker of situation. In: Scherer, K.R., Giles, H. (Eds.), Social Markers in Speech. Cambridge University Press, Cambridge, pp. 33–62.
Chapman, R., 1982. Developing awareness of register in English. System 10 (2), 113–118.
Cheng, W., 2005. Peer assessment of language proficiency. Language Testing 22 (1), 93–121.
Cheng, W., Warren, M., Xun-feng, X., 2003. The language learner as language researcher: putting corpus linguistics on the timetable. System 31 (2), 173–186.
Cobb, T., 2003. Analyzing late interlanguage with learner corpora: Quebec replications of three European studies. The Canadian Modern Language Review/La Revue canadienne des langues vivantes 59 (3), 393–423.
Connor-Linton, J., Shohamy, E., 2001. Register validation, oral proficiency, sampling, and the promise of multi-dimensional analysis. In: Conrad, S., Biber, D. (Eds.), Variation in English: Multi-Dimensional Studies. Pearson Education Limited, Essex, pp. 124–137.
Conrad, S., Biber, D., 2001. Variation in English. Multi-Dimensional Studies. Pearson Education, Harlow.
CRAT: Corpus Research Analysis Tool. http://perezparedes.blogspot.com/search/label/CRAT.
De Cock, S., 1998. Corpora of learner speech and writing and ELT. In: Usoniene, Aurelia (Ed.), Proceedings of the International Conference on Germanic and Baltic Linguistic Studies and Translation. Homo Liber, Vilnius, pp. 56–66.
Díaz-Negrillo, A., Fernández-Domínguez, J., 2006. Error tagging systems for learner corpora. Revista Española de Lingüística Aplicada 19, 83–102.
Ervin-Tripp, S., 1972. On sociolinguistic rules: alternation and co-occurrence. In: Gumperz, J.J., Hymes, D. (Eds.), Directions in Sociolinguistics. Holt, New York, pp. 213–250.
Flowerdew, L., 2001. The exploitation of small learner corpora in EAP materials design. In: Ghadessy, M., Henry, A., Roseberry, R.L. (Eds.), Small Corpus Studies and ELT. Benjamins, Amsterdam, pp. 363–379.
Gass, S., Selinker, L., 2001. Second Language Acquisition: An Introductory Course, Second Edition. Lawrence Erlbaum Associates.
Gavioli, L., Aston, G., 2001. Enriching reality: language corpora in language pedagogy. ELT Journal 55 (3), 238–246.
Gavioli, L., 2005. Exploring Corpora for ESP Learning. In: Studies in Corpus Linguistics, vol. 21. John Benjamins, Amsterdam.
Ghadessy, M., 1994. Key concepts in ELT: register. ELT Journal 48 (3), 288–289.
Gilquin, G., De Cock, S., Granger, S., 2010. The Louvain International Database of Spoken English Interlanguage. In: Handbook and CD-ROM. Presses Universitaires de Louvain, Louvain-la-Neuve.
Granger, S., 1994. The learner corpus: a revolution in applied linguistics. English Today 39, 25–29.
Granger, S. (Ed.), 1998. Learner English on Computer. Longman, London.
Granger, S., 2002. A bird’s eye view of learner corpus research. In: Granger, S., Hung, J., Petch-Tyson, S. (Eds.), Computer Learner Corpora, Language Acquisition and Foreign Language Teaching. John Benjamins Publishing Company, Amsterdam, Philadelphia.
Granger, S., 2003. The international corpus of learner English. A new resource for foreign language learning and teaching and second language acquisition research. TESOL Quarterly 37 (3), 538–546.
Granger, S., Hung, J., Petch-Tyson, S. (Eds.), 2002. Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. John Benjamins, Amsterdam.
Gregory, M.J., 1967. Aspects of varieties differentiation. Journal of Linguistics 3, 177–198.
Guillot, M.N., 2002. Corpus-based work and discourse analysis in FL pedagogy: a reassessment. System 30 (1), 15–32.
Halliday, M.A.K., 1988. On the language of physical science. In: Ghadessy, M. (Ed.), Registers of Written English. Pinter Publishers, London.
Hung, T.T.N., 2002. The use of language corpora in the teaching of English. Hong Kong Journal of Applied Linguistics 7 (1), 34–48.
Hyland, K., 1996. Nurturing hedges in the ESP context. System 24 (4), 477–490.
Hymes, D., 1974. Foundations in Sociolinguistics: An Ethnographic Approach. University of Pennsylvania Press, Philadelphia.
Jacobi, C.C.B. de, 2001. Lingüística de Corpus e ensino de espanhol a brasileiros: Descrição de padrões e preparação de atividades didáticas (decir/hablar; mismo; mientras/en cuanto/aunque). Dissertação de mestrado, São Paulo, Pontifícia Universidade Católica de São Paulo.
Jiang, X., 2006. Suggestions: what should ESL students know? System 34 (1), 36–54.
Johansson, S., 2007. Using corpora: from learning to research. In: Hidalgo, E., Quereda, L., Santana, J. (Eds.), Corpora in the Foreign Language Classroom. Selected Papers from the Sixth International Conference on Teaching and Language Corpora. Rodopi, Amsterdam, New York, pp. 17–30.
Johnson, K., 1996. Language Teaching & Skill Learning. Blackwell, Oxford.
Kawai, G., 2006. Collaborative peer-based language learning in unsupervised asynchronous online environments. In: Proceedings of the Fourth International Conference on Creating, Connecting and Collaborating through Computing (Berkeley, CA).
Kennedy, G., 1998. An Introduction to Corpus Linguistics. Longman, London.
Lee, D., Swales, J., 2006. A corpus-based EAP COURSE for NNS doctoral students: moving from available specialized corpora to self-compiled
corpora. International Journal of Corpus Linguistics 11 (2), 256e257.
Lynch, T., 2007. Learning from the transcripts from an oral communication task. ELT Journal 61 (4), 311e320.
Meunier, F., 2002. The pedagogical value of native and learner corpora in EFL grammar teaching. In: Granger, S., Hung, J., Petch-Tyson, S.
(Eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. John Benjamins, Amsterdam,
pp. 119e142.
Partington, A., 1998. Patterns and Meanings: Using Corpora for English Language Research and Teaching. John Benjamins, Amsterdam,
Philadelphia.
Pe
´rez-Paredes, P., Cantos-Go
´mez, P., 2004. Some lessons students learn: self-discovery and corpora. In: Aston, G., Bernardini, S., Steward, D.
(Eds.), Corpora and Language Learners. John Benjamins Publishing Company, Amsterdam, Philadelphia.
Pe
´rez-Paredes, P., 2010a. Corpus linguistics and language education in perspective: appropriation and the possibilities scenario. In: Harris, T.,
Moreno Jae
´n, M. (Eds.), Corpus Linguistics in Language Teaching. Peter Lang, Bern, pp. 53e73.
Pe
´rez-Paredes, P., 2010b. The death of the adverb revisited: attested uses of adverbs in native and non-native comparable corpora of spoken
English. In: Moreno Jae
´n, M., Calzada, M., Serrano, F. (Eds.), Exploring New Paths in Language Pedagogy Lexis and Corpus-based Language
Teaching. Equinox, London, pp. 157e172.
Pravec, N.A., 2002. Survey of learner corpora. ICAME Journal 26, 81e114.
Ro
¨mer, U., 2008. Corpora and language teaching. In: Lu
¨deling, A., Kyto
¨, M. (Eds.), Corpus Linguistics. An International Handbook, vol. 1.
Mouton de Gruyter, Berlin, pp. 112e130.
Ruehlemann, C., 2007. Conversation in Context: a Corpus-Driven Approach. Continuum, London.
Shohamy, E., Donitsa-Schmidt, S., Waizer, R., 1993. The Effect of the Elicitation Mode on the Language Samples Obtained in Oral Tests. Paper
presented at the 15th Language Testing Research Colloquium, Cambridge, England.
Sinclair, J., 2003. Reading Concordances: An Introduction. Longman, Harlow.
Sinclair, J.McH. (Ed.), 2004. How to Use Corpora in Language Teaching. Studies in Corpus Linguistics, vol. 12. Benjamins, Amsterdam.
Tono, Y., 1999. A Corpus-Based Analysis of Interlanguage Development: Analysing POS Tag Sequences of EFL Learner Corpus. Lodz
University, Poland. Paper presented at PALC’99.
Tono, Y., 2000. A corpus-based analysis of interlanguage development: analysing part-of-speech sequences of EFL learner corpora. Papers from
the International Conference at the University of Lo
´dz. In: Lewandowska-Tomaszczyk, B., Melia, P.J. (Eds.), PALC’99: Practical Applications
in Language Corpora. Peter Lang, Frankfurth am Main, pp. 323e340.
Tono, Y., 2002. The role of learner corpora in SLA research and foreign language teaching. The multiple comparison approach, Unpublished PhD
thesis, Lancaster University.
Tono, Y., 2003. Learner corpora: design, development and applications. In: Archer, D., Rayson, P., Wilson, A., McEnery, T. (Eds.), Proceedings of
the Corpus Linguistics 2003 Conference (CL 2003). Lancaster University, University Centre for Computer Corpus Research on Language,
pp. 800–809. Technical Papers 16.
Vinther, J., 2004. Can parsers be a legitimate pedagogical tool? Computer Assisted Language Learning 17 (3–4), 267–288.
Yoon, H., Hirvela, A., 2004. ESL student attitudes toward corpus use in L2 writing. Journal of Second Language Writing
13 (4), 257–283.
Yeung, L., 2009. Use and misuse of ‘besides’: a corpus study comparing native speakers’ and learners’ English. System 37 (2), 330–342.
Please cite this article in press as: Aguado-Jiménez, P., et al., Exploring the use of multidimensional analysis of learner language to promote
register awareness, System (2012), doi:10.1016/j.system.2012.01.008
Aguado-Jiménez, P., Pérez-Paredes, P. & Sánchez, P. (2012). Exploring the Use of Multidimensional Analysis of Learner
Language to Promote Register Awareness. System, 40 (1), pp. 90-103.