Artificial Intelligence and Second
Language Learning: An Efficient Approach
to Error Remediation
Marina Dodigovic
College of Arts and Sciences, American University of Sharjah,
United Arab Emirates
While theoretical approaches to error correction vary in the second language acqui-
sition (SLA) literature, most sources agree that such correction is useful and leads to
learning. While some point out the relevance of the communicative context in which
the correction takes place, others stress the value of consciousness-raising. Trying to
reconcile the two approaches, this paper describes an application of artificial intel-
ligence in the second language error remediation process. The software presented is
called the Intelligent Tutor. It diagnoses some typical errors in the writing of univer-
sity students who are learning English as a second language. A quasi-experimental
study, consisting of a grammaticality judgement pre-test, a treatment in the form of
the Intelligent Tutor and a short-answer post-test, was carried out with 266 university
students in three countries. The findings show that artificial intelligence is an effi-
cient instrument of error remediation, reducing the error rate by an average of 83%.
This paper discusses the theoretical underpinnings of the software used, briefly de-
scribes the software itself and then presents the study design, its findings and their
implications in the wider context of second language learning.
doi: 10.2167/la416.0
Keywords: artificial intelligence, second language learning, error correction,
remediation, Intelligent Tutor
Introduction
To correct or not to correct the errors made by second language learners
has been a controversial issue in second language acquisition (SLA). While the
nativists (Krashen, 1987) believe that language can be acquired without paying
attention to it and that therefore the negative evidence (Gass & Varonis, 1994;
James, 1998) arising from error correction is not helpful, cognitive theorists
see paying attention to (Schmidt, 2001) and noticing (Doughty, 2001; Long &
Robinson, 1998; Swain, 1998) the features of language as the key to second
language learning. Since error correction directs the learner’s attention to aspects
of language and forces her to notice its systemic features, in light of this theory,
correction appears to be helpful. So far, there has been some evidence (Tomasello
& Herron, cited in R. Ellis, 1997) that systematically addressing typical errors is
conducive to second language learning.
This paper claims that there are definite benefits to error correction, benefits
that might eventually lead to remediation (James, 1998) or the disappearance of
errors. It also claims that artificial intelligence can be a very useful instrument
of second language error correction and remediation. The study described here
revolves around the software package developed by the author to correct typical
errors made by learners of English as a second language. More detail regard-
ing the software itself and its development can be found in the CALL Journal
(Dodigovic, 2002), a special issue of Language Awareness (Dodigovic, 2003) and
a recent book published by Multilingual Matters (Dodigovic, 2005). All of these
publications give extensive cross-disciplinary theoretical background to the soft-
ware, while anticipating a large-scale study of its effectiveness. This paper finally
presents the long-awaited effectiveness data and should therefore be regarded as a
natural sequel to the three cited sources.
Artificial Intelligence in Second Language Learning
Artificial intelligence (AI) is a term referring to machines which emulate the
behaviour of intelligent beings (Borchardt & Page, 1994). AI is an interdisci-
plinary area of knowledge and research, whose aim is to understand how the
human mind works and how to apply the same principles in technology de-
sign. In language learning and teaching tasks, AI can be used to emulate the
behaviour of a teacher or a learner (Matthews, 1993). In order to emulate the
behaviour of a language teacher, a machine needs to have the ‘knowledge’ of
teaching methodology. In order to emulate a learner, a machine has to have the
knowledge of learning styles and strategies (Bull, 1997). Above all, however,
both of these emulations require knowledge of the language itself.
The discipline responsible for representation of the linguistic knowledge
within AI is sometimes referred to as computational linguistics (O’Grady et al.,
1997). The aspect of computational linguistics that has been most frequently
utilised to assist in the second language learning process is computational syn-
tax (O’Grady et al., 1997). This subdiscipline of computational linguistics is
responsible for natural language processing or the automated breaking down
of sentence structures into parts of speech. Computer programs designed to
perform such analysis are called parsers (Swartz & Yazdani, 1992).
Those systems used in SLA that contain parsers are referred to as Intelligent
Tutoring Systems, or specimens of Intelligent Computer Assisted Language
Learning (ICALL) (e.g. Hamburger et al., 1999; Holland et al., 1993). A number
of such systems have been designed so far with the purpose of correcting learner
errors (e.g. Heift, 2003; Tschichold, 1999). The parsers in such systems as a rule
have to reflect two different models of language, the teacher’s and the student’s.
While the teacher’s language is expected to be correct or conformant with the
standard, the student’s language is expected to exhibit non-standard features,
those features that are most commonly referred to as errors (James, 1998). These
features constitute what has been referred to as interlanguage (Cook, 1993;
Selinker, 1972), more recently alternatively named learner language (Ellis &
Barkhuizen, 2005). Combining the two models of language is not an easy task
for a parser. In order to cater to the standard, correct language, a parser needs
to be specific and intolerant of error. On the other hand, if it is to process the
student’s erroneous input, it has to be robust and tolerant of error. These two
requirements are seldom placed on parsers simultaneously. However, in
pedagogical applications, the parsers have to have both qualities: to recognize
correct language as correct and to be able to make sense of incorrect input
(Ehsani & Knodt, 1998).
The device evaluated in this study is the Intelligent Tutor, a computer program
designed to diagnose and correct some typical errors produced by adult learners
of English as a second language. The parser contained in this program to some
extent meets the contradictory requirements of pedagogical parsers. It recognises
a fair amount of correct language as such, while it anticipates and makes sense of
some typical learner errors. In order to do so, it employs a grammar based on
so-called mal-rules or bug rules (Heift, 2003). In a nutshell, this means that
the program models the student's interlanguage and has a built-in system of
erroneous grammar at its command. Much more detail about the make-up of
this grammar is found in Dodigovic (2002, 2003, 2005).
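To make the mal-rule principle concrete, here is a minimal illustrative sketch in Python using the NLTK toolkit. It is not the grammar implemented in the Intelligent Tutor; the rule names and the single ergative-construction mal-rule are invented for demonstration. The point it illustrates is the one made above: the grammar licenses both the correct structure and a known erroneous structure, and the presence of a mal-rule node in the parse tree is what triggers the diagnosis.

import nltk

# A toy grammar: ordinary rules plus one mal-rule (MAL_VP) licensing the
# ergative construction error 'can be failed'. Purely illustrative.
MAL_GRAMMAR = nltk.CFG.fromstring("""
S -> NP VP | NP MAL_VP
NP -> Det N N
VP -> Md V
MAL_VP -> Md 'be' Ven
Det -> 'the'
N -> 'immune' | 'system'
Md -> 'can'
V -> 'fail'
Ven -> 'failed'
""")

def diagnose(sentence):
    """Parse a sentence and report whether a mal-rule was needed."""
    parser = nltk.ChartParser(MAL_GRAMMAR)
    for tree in parser.parse(sentence.lower().split()):
        if any(str(t.label()).startswith('MAL') for t in tree.subtrees()):
            return "Error diagnosed: ergative construction ('be failed')."
        return "Correct sentence. Parse:\n" + tree.pformat()
    return "No parse found."

print(diagnose("the immune system can fail"))       # rewarded with a parse tree
print(diagnose("the immune system can be failed"))  # the mal-rule fires

A pedagogical parser built this way stays strict about the standard language, since only the listed rules parse, while remaining robust to the anticipated error: exactly the dual requirement discussed above.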
It has often been remarked that one should not accept anything less than
perfection in parsers. In other words, there is an expectation that parsers should
be able to recognize all of the correct language as such (Salaberry, 1996). In
defence of parsers, one should say that even human native speakers perform
this task with more or less success. Thus, why would one expect a computer
to succeed where even humans are known to fail? Since the pedagogical needs
are sometimes centred on isolated aspects of language at one time, rather than
whole language all of the time, it seems reasonable to accept the performance
of a parser that can process some aspects of language very well (Ehsani &
Knodt, 1998). This is especially the case in situations where human tutors are not
available to address all of the students’ needs. Paired with realistic simulations
of everyday situations, parsers can contribute to the authenticity of learning
experience (Chapelle, 2001).
This study assumes that students need to test their hypotheses about
the second language (L2) (Swain, 1998) on ever-fresh examples of language
production. Thus, the advantages of parsers are real and can be harnessed to
produce considerable learning effects. This study demonstrates the effectiveness
of the Intelligent Tutor in terms of learning outcomes (Chapelle, 2001).
Errors and Correction in Second Language Learning
Since the main purpose of the Intelligent Tutor is to diagnose and correct L2
errors, we shall briefly review the stances on error correction in SLA theory.
Error correction is supported in the SLA literature (e.g. Doughty, 2001; Gregg,
2001; Long & Robinson, 1998) in two different ways. The first one insists on its
relevance to the communicative context (e.g. Doughty & Williams, 1998; Long
& Robinson, 1998), while the second one stresses the value of consciousness
raising (R. Ellis, 1997; James, 1998). The two approaches can be called focus-
on-form (e.g. Doughty, 2001; Doughty & Long, 2003) and language awareness
(James, 1998) or consciousness-raising approaches (R. Ellis, 1997), respectively.
The former seeks not to distract the learner from the message and meaning,
and therefore offers corrective recasts or asks clarification questions. The latter
allows for the metalinguistic aspect to come to the foreground.
Most SLA theorists nowadays agree that noticing is a crucial event in language
error correction and learning (Schmidt, 2001). To James (1998), noticing supports
consciousness-raising, which is equated with the explanation of the unknown
leading to what Krashen (1987), N. Ellis (2001) and R. Ellis (1997) call explicit
or conscious learning or the kind of learning that is responsible for accuracy.
James (1998) also supports awareness-raising through the explication of what
is implicitly known (N. Ellis, 2001; R. Ellis, 1997) thus converting it to explicit
knowledge (N. Ellis, 2001; R. Ellis, 1997). Other approaches (i.e. the audio-
lingual teaching method and communicative language learning) seem to be
geared towards unconscious (implicit) learning, and hence fluency.
However, it is doubtful that language learning is a totally implicit process
(N. Ellis, 2001). For instance, noticing an error invites a cognitive comparison
(Doughty, 2001; R. Ellis in James, 1998) between the student’s interlanguage and
the target language. Whereas to Doughty (2001) this is a cognitive intrusion
designed to facilitate what seems to be implicit learning, James (1998) identifies
this comparison of the correct and incorrect linguistic structure as a form of
error analysis, hence explicit learning. Overall, though, most theorists nowadays
agree that correction is helpful.
Suggested ways of correcting errors vary across the board. Some interaction-
ists, for example, believe in recasts (Doughty, 2001; Long & Robinson, 1998),
claiming that this is the least disruptive cognitive intrusion likely to have the
desired repair effect. Lyster and Ranta (in Mitchell & Myles, 1998), on the other
hand, argue that, despite the valuable negative evidence they offer, recasts do
not compel the learners to self-correct. Apart from recasts, the correction types
considered in research and theory include stating the relevant linguistic rule,
indicating the error type without a recast, mere underlining and a mere error
count per line; comparisons of their effectiveness currently seem inconclusive
(James, 1998). Learner preferences for correction types also seem to vary, while
it is not certain that the preferred method of correction is the most useful one
(Oxford, 1995).
An important pedagogical question is also when to correct an error. Given the
nature of the question, it seems very important to decide whether an erroneous
utterance contains a genuine error. James (1998), for example, distinguishes
between a slip, an occasional mistake and a systemic error. A slip is expected to
result in self-correction, a mistake calls for feedback, in this case a clue to the
required structure, and an error calls for a full correction of the erroneous
structure (James, 1998). The Intelligent Tutor was designed to deal with genuine
errors. However,
it allows for slips by giving the students ample opportunities to self-correct as
well as for mistakes by generating hints.
Freiermuth (1997) suggests that the issue of when to correct is a complex one,
requiring the consideration of at least five different variables: (1) the exposure to
a language structure; (2) the discrepancy between the erroneous and the correct
form; (3) the impact of the error on communication; (4) the frequency of the
error; and (5) the students’ need to master this structure.
Another important question often raised in conjunction with learner errors is
whether they are developmental (intralingual) or related to the first language
(L1) transfer (interlingual) (James, 1998). While in the past, the opinion seemed
to prevail that most errors are L1-transfer-based or interlingual, which gave rise
to contrastive analysis or comparison between the learner’s first and second
language for the purpose of anticipating problem areas for the learner (Cook,
1993; James, 1998), nowadays second language acquisitionists tend to focus on
intralingual errors, i.e. those that are common to all language learners regardless
of their linguistic background (Ellis & Barkhuizen, 2005).
Given the above criteria, the Intelligent Tutor assumes that the learners have
been taught, but may not have had adequate exposure to the target structure.
It further assumes that they have a genuine and serious misconception about
some aspect of the target language. The errors that it is designed to diagnose and
correct have been identified as frequent in a corpus of learner writing. Compre-
hensive needs analysis has finally determined that the target learner population
needs these errors to be remedied in order to accomplish their language tasks.
While the Intelligent Tutor was designed with interlingual constructs in mind,
the learner corpus it was modelled on was taken from a broad sample of learn-
ers, thus making the tutor robust enough to support the intralingual approach
as well.
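The frequency criterion mentioned above can be illustrated with a short, hypothetical sketch: given a small error-annotated sample of learner sentences, the error types are simply ranked by how often they occur. The tag names and sentences below are invented for illustration and do not reproduce the corpus on which the Intelligent Tutor was actually modelled.

from collections import Counter

# Invented error-annotated learner sentences (sentence, list of error tags).
annotated_sample = [
    ("The immune system can be failed.", ["ERGATIVE"]),
    ("Malaria can find all over the world.", ["PSEUDO_PASSIVE"]),
    ("It will caused death of both mother and baby.", ["FINITE_NONFINITE"]),
    ("There is a new problem occur.", ["EXISTENTIAL"]),
    ("Secondly, communities affected.", ["MISSING_COPULA"]),
    ("The drug can be failed in some patients.", ["ERGATIVE"]),
]

def rank_error_types(sample):
    """Count error tags across the sample and rank them by frequency -
    the kind of evidence used to decide which errors a tutor should target."""
    counts = Counter(tag for _, tags in sample for tag in tags)
    return counts.most_common()

for error_type, frequency in rank_error_types(annotated_sample):
    print(error_type, frequency)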
Individual Learner Differences
Another advantage of the Intelligent Tutor is that it can to a modest extent
accommodate individual learners. These seem to differ in a number of ways.
While intelligence, language aptitude and affective factors seem to be very im-
portant in individual learning success, ICALL has not yet attempted to support
individual differences at this level. The Intelligent Tutor makes an attempt, albeit
humble, at catering to varying learning styles.
Learning styles seem important in deciding how to execute the error repair or
correction (e.g. Gregg, 2001; James, 1998; Oxford, 1995; Sawyer & Ranta, 2001).
In fact, it is being argued that meeting the students’ needs in terms of cognitive
styles and sensory preferences is a characteristic of high quality instructional
materials (Fleming, 2001). There are ICALL programs doing just that (e.g. Bull,
1997; Holland et al., 1993).
The Intelligent Tutor uses Willing’s (1988, 1989) approach to learner types,
based on what is known about the left brain and the right brain. A basic divi-
sion of learners here is essentially the one between ‘analytical’ or left-brained
and ‘concrete’ or right-brained, as established by Witkin et al. (1977). The way
the analytical learner processes information is linear, sequential, rational, ob-
jective, abstract, verbal, mathematical, with focus on detail, engaging in re-
flective and cautious thinking, responding to selective, low intensity stimuli
(Willing, 1989). The concrete learner on the other hand processes information in
a holistic, pattern-seeking, spatial, intuitive, subjective, concrete, emotional and
visual way, focusing on overall impression, while being impulsive and trusting
hunches, requiring rich, varied input (Willing, 1989).
Willing’s approach has been selected because it was tested on over 500 sub-
jects coming from the same kind of sample as the target user of the Tutor, us-
ing open-ended interview, survey and factor analysis as procedures. Moreover,
this study specifically examines language learning styles rather than learning
styles in general. Based on the patterns emerging from the factor analysis of
his data, Willing (1988) subdivides all learners into four types: (1) concrete, or
those who like games, pictures, video, talking in pairs, practising outside class;
(2) analytical, or those who like studying grammar, using books, reading,
studying alone, being given tasks to work on by the teacher; (3) communicative,
or those who prefer listening to native speakers, talking to friends, using the
target language in everyday situations or basically learning by listening and
finally (4) authority-oriented, or those who prefer teacher explanation, using
textbooks and learning words by seeing them.
The above learner types are seen as stages along the field dependence/field
independence continuum (Willing, 1988). Ehrman (1998: 63) defines field inde-
pendence as the ‘ability to distinguish and isolate sensory experiences from the
surrounding sensory input’. The same author also suggests that research often
associates field independence with certain personality traits such as being task-oriented rather
than people-oriented, individualistic rather than compliant and interacting with
others in a cool rather than a warm way (Ehrman, 1998). Field dependence is
by contrast defined as the lack of field independence (Ehrman, 1998). While
the analytical learner represents the extreme of the field independence end, the
concrete is its opposite – the extreme field dependence case. The communicative
learner on the other hand is predominantly field independent, with a tendency
to use communication as a strategy towards analytical practices (Willing, 1988).
The authority-oriented learner is consequently a concrete learner with a need
for structure provided by an authority, e.g. the teacher (Willing, 1988).
Communicative learners (Willing, 1989) are more likely to employ computer-
mediated communication (Warschauer, 1999) or perhaps human–computer
communication disguised as human–human interaction, following the sugges-
tion by Chapelle (1997). For this reason, the Intelligent Tutor utilises an em-
ulation of the communicative language learning approach which specifically
focuses on enhancing students’ autonomy and control over the language learn-
ing process. The tutor however also approaches the analytical learners the way
they wish to be approached, and that is by giving them problems to solve,
helping them understand the nature of their own mistakes and giving them
opportunities to learn grammar (Willing, 1989). The student, however, always
has the choice of the preferred course of action, although it is assumed that the
choice they make will be based on their learning style.
After diagnosing an error, the tutor offers the learner three options: try
again, get a hint, or see the solution. While the 'try again' and 'get a hint'
options are designed for analytical and communicative learners respectively, the
‘solution’ is designed for concrete and authority-oriented learners. The concrete
learner might take the correction as a recast, while the authority-oriented learner
might embrace it as the solution coming from an authority. To the communicative
learner getting a hint may represent a part of her communicative strategy. An
analytical learner can enter the correct version and obtain a parse tree which
provides what this learner type needs – analysis. The parse tree is also likely to
reinforce the correct language while raising the awareness of the structure. In
other words, the analytical learner, who wants to know not only what is correct
but also why it is correct, will have this need fully met.
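The mapping between feedback options and Willing's learner types described above can be summarised schematically as follows. This is an illustrative data structure only, not code from the Intelligent Tutor, and the learner retains a free choice among all three options regardless of type.

# Which feedback option each learner type is expected to favour (illustrative).
FEEDBACK_OPTIONS = {
    "try again":        {"intended_for": ["analytical"],
                         "rationale": "self-repair, rewarded with a parse tree"},
    "get a hint":       {"intended_for": ["communicative"],
                         "rationale": "a clue that keeps the exchange going"},
    "see the solution": {"intended_for": ["concrete", "authority-oriented"],
                         "rationale": "a recast or an answer from an authority"},
}

for option, details in FEEDBACK_OPTIONS.items():
    print(option, "->", ", ".join(details["intended_for"]))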
Thus, the Intelligent Tutor is designed to encourage consciousness-raising on
the one hand, by giving a solution for what the student does not know, and
awareness-raising on the other, by giving hints regarding something that the
student is expected to know (James, 1998). A further step in awareness-raising
is taken by displaying the parse tree upon successful completion of the task, thus
giving the student a chance to learn explicitly something that she may already
know implicitly (N. Ellis, 2002).
The Design of the Intelligent Tutor
As mentioned, the Intelligent Tutor bases its ability to diagnose some learner
errors and correct them on the frequency, gravity and communicational signifi-
cance of the errors found in a learner corpus, taking into account the target learn-
ers’ exposure to the structures in question as well as the learners’ specific needs.
It can be said that this program involves a hybrid development approach,
which calls for an explanation of the two most frequent approaches to pedagogical
parser development. While a number of parser-based tutors can accommodate
a range of predictable errors, based on the developers’ hunches (Holland et al.,
1993; Matthews, 1993) rather than on systematic research, another approach
uses error-tagged learner corpora as a standard system element (Granger, 2003;
L’Haire & Faltin, 2003). The Intelligent Tutor discussed here has been developed
using a learner corpus, but the corpus itself is not a part of the system. In the
following, I briefly review the errors the system recognises and then explain
how the system works, providing an example of computer–learner interaction.
The system is capable of addressing seven major structural errors and a few
lesser morphological errors. While four of these major errors were originally
assumed to be specific to Chinese–English interlanguage (Yip, 1995), and the
remaining three to Indonesian–English interlanguage (Yong, 2001), both the
learner corpus and further investigation have shown that all of the seven major
errors are made by learners across the board, regardless of their first language.
The seven major errors are shown in Table 1. More information on these error
types can be found in Dodigovic (2002, 2003, 2005).
Table 1 The seven major error types recognized by the Intelligent Tutor

Error type                                             Example
Pseudo-passive                                         Malaria can find all over the world.
Ergative construction                                  The immune system can be failed.
Tough movement                                         More difficult to be realized...
Existential construction                               There is a new problem occur.
Malformed expressions of feelings/reactions/states     The disease had *dominant over human.
Missing copula                                         Secondly, communities *affected.
Finite/nonfinite verb confusion                        It will caused death of both mother and baby.

The program itself is set up as a tutorial dialogue between the student and
the computer and is similar in framework to the program called Daedalus
Integrated Writing Environment (DIWE). The main point of difference between
these two programs is that DIWE does not use artificial intelligence, but enables
several learners to interact with one another and provide peer feedback. The
Intelligent Tutor on the other hand uses a parser to give feedback. In addition to
diagnosing and correcting learner errors, in which respect it differs from DIWE,
the system is, like DIWE, designed to guide the students through the essay
planning process, but unlike DIWE this is not its main purpose. The student
is supposed to be familiar with several texts on malaria which constitute the
readings for an essay assignment. The computer hence asks questions regarding
the topic and the readings. To each question, the student is supposed to reply by
writing a sentence. Each question is likely to induce a particular error type for the
purpose of correction, since some of the literature on learner errors recommends
this procedure as an efficient way to teach language (James, 1998).
If the student’s answer is grammatically correct, she is rewarded by praise
and receives the full parse tree of her sentence. If her answer is grammatically
incorrect, the system lets her know that and offers three options: to try again, to
get a hint or to get a solution. Let us clarify this with an example.
One of the questions, designed to induce the ergative construction error, is
'Can the human immune system efficiently stop malaria?' The user is then
encouraged to use 'fail' in her answer, as this ergative verb may induce the
ergative construction error. The user is then likely, and indeed known, to have
answered 'The immune system can be failed'. The first parse of that response will
establish that the sentence is erroneous and offer the three options. If the student
chooses the hint option, the system will point out that the ‘be failed’ part is incor-
rect and needs to be replaced with an infinitive. This option is shown in Figure 1.
The student now has a better chance to correct her sentence. If she does, she
is rewarded with a parse, as shown in Figure 2.
If the solution is selected as an option, the system responds with a grammat-
ically correct sentence, as shown in Figure 3.
Figure 1 The system provides a hint
Figure 2 The correct sentence rewarded with a parse
Figure 3 The solution option
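The exchange illustrated in Figures 1-3 can be summarised as a simple control flow. The sketch below is a schematic reconstruction in Python, not the Intelligent Tutor's source code; the toy check() function only knows the one ergative-construction example and stands in for the actual parser.

QUESTION = "Can the human immune system efficiently stop malaria? (Use 'fail'.)"
HINT = "The part 'be failed' is incorrect and needs to be replaced with an infinitive."
SOLUTION = "The immune system can fail."

def check(answer):
    """Stand-in for the parser: flags the known ergative error, accepts the target form."""
    text = answer.lower().strip(" .")
    if "be failed" in text:
        return False, "ergative construction detected"
    if text == "the immune system can fail":
        return True, "(S (NP the immune system) (VP can fail))"
    return False, "sentence not recognised"

def tutoring_exchange():
    print(QUESTION)
    while True:
        ok, result = check(input("> "))
        if ok:
            print("Well done!")
            print(result)                      # the parse tree as a reward
            return
        choice = input("Incorrect. [t]ry again, get a [h]int or see the [s]olution? ")
        if choice.startswith("h"):
            print(HINT)
        elif choice.startswith("s"):
            print(SOLUTION)
            return
        # any other input: loop back and let the student try again

if __name__ == "__main__":
    tutoring_exchange()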
Study Design
The research question in this study has been 'Does exposure to the
Intelligent Tutor (i.e. systematic error correction) have an effect on learning?'
The hypothesis is that systematic error correction through the Intelligent Tutor
(treatment) has a significant effect on the learning outcomes. For this reason,
the study design has been restricted to one group and one treatment procedure,
preceded by a pre-test and followed by a post-test, both pertaining to seven
typical structural errors and a few morphological ones.
This study uses some experimental principles without the fully fledged
experimental model; such a design is called a quasi-experiment (McDonough
& McDonough, 1997: 162). Quasi-experiment in this context simply means that
there is no comparison of tasks or a comparison of treatment with the absence of
treatment (Larsen-Freeman & Long, 1991; McDonough & McDonough, 1997). In
other words, while the experimental group is working with the Intelligent Tu-
tor, there is no control group working with a different resource, or receiving no
treatment whatsoever. This is the case firstly because the objective of the study
is primarily to determine whether there is any improvement at all following the
use of the Intelligent Tutor and if so how significant this improvement is. Sec-
ondly, in our wish to avoid comparing radically different categories with one
another, such as the textbook and the computer, an incompatibility criticised
in CALL literature (Goodfellow, 1999; MacWhinney, 1995), this study has not
sought to prove that the Intelligent Tutor is better than the textbook or a teacher
or indeed any other incompatible resource.
The pre-test is a grammaticality judgement test (R. Ellis, 1997), in which the
test takers literally have to judge the grammaticality of utterances. Their perfor-
mance on this test is then taken to reflect their competence or command of these
structures (R. Ellis, 1997; Yip, 1995), in this case the presence or absence of the
typical errors. This kind of test is often associated with the universal grammar
theory (R. Ellis, 1997) and would appear to be somewhat theoretically biased.
However, it is helpful if one assumes that certain aspects of grammatical knowl-
edge cannot be understood by mere analysis of production data (Yip, 1995: 8).
Thus, frequent avoidance of difficult structures in non-native speaker produc-
tion (Yip, 1995: 5) can give a misleading account of the learner errors. This partly
justifies the choice of this instrument here. An additional reason for selecting
the grammaticality judgement test is that it will most likely reveal real errors
of competence, rather than slips of the pen or mere mistakes of performance
(Cook, 1993; R. Ellis, 1997; James, 1998).
There are altogether 12 questions in the pre-test, which is designed as a multi-
ple choice test. Each question was very similar to those asked by the Intelligent
Tutor. Thus the students had to find one or more correct paraphrases of the
initial statement in each question. The erroneous distracters in this test contain,
but are not restricted to, the seven most common errors. The results of the test
demonstrate that four errors were consistently made by the students tested:
tough movement, part of speech confusion, finite–non-finite verb confusion and
ergative construction. Some evidence of the pre-existence of these errors was
also found in the writing of these students collected for the purpose of backing
up this study with another set of data.
The post-test is a short answer test, which allows the students to produce their
own sentences. The format is deliberately different from that of the pre-test
in order to prevent mere learning from the pre-test from influencing
the outcome of the post-test (R. Ellis, 1997). Larsen-Freeman and Long (1991: 32)
successfully address concerns one might have regarding the task incongruence
by reporting on several studies that found no significant differences between
the rate and type of errors elicited through a variety of tasks. Improvement
can be measured in terms of the amount of error found in the student-written
sentences. While the anticipated errors are the focus of this instrument, all
errors made by the students are recorded, even if they are not the anticipated
ones. Piloting the study has shown that very few unanticipated errors actually
occur.
The treatment procedure is the quasi-experimental task (McDonough &
McDonough, 1997), namely the work with the Intelligent Tutor. The subjects
are taken from three different target populations (Taiwan, Australia and the UAE)
to ensure variety. There is no randomisation or pair matching, since only
one treatment is indicated. However, matching the sample to the software
designer's target population on a number of potentially confounding variables
provides some controls and counterbalances (McDonough & McDonough, 1997:
160). Thus the subjects are university students or applicants, aged 19-21, with
a TOEFL score of approximately 500.
In block periods, the students had an opportunity to work with the Intelligent
Tutor, after having completed the pre-test and prior to taking the post-test.
At the beginning of the period, the students were given several texts on the
topic of malaria. The texts were adjusted to match the students’ level of English
proficiency (TOEFL 500–550). The students worked in self-study mode, but were
not prevented from consulting with their classmates or the teacher.
Overall, 266 students participated in this study as subjects. Of these, 107 were
located in the United Arab Emirates (UAE) (60 in the Emirate of Sharjah, 47
in the Emirate of Dubai), 83 in Sydney, Australia, and 77 in Taichung, Taiwan.
The participating institutions were the American University of Sharjah (AUS),
Zayed University (ZU), Insearch Institute at the University of Technology Syd-
ney and, last but not least, Ling Tung College, Taichung. The study took several
months in 2004 and 2005 to complete. The results are discussed in the following
section.
Results
Compared to the pre-test, the post-test has shown an average reduction in
the error rate of 83% across the three student samples (Taiwan, Australia and
the Emirates). The best result was achieved by the Taiwanese students (94%
error reduction rate), followed by the AUS students (85% error reduction rate),
ZU students (79% error reduction rate) and then the overseas English language
learners in Australia (73% error reduction rate). Figure 4 demonstrates this
reduction rate visually.
Figure 4 Difference in error rate before and after using the Intelligent Tutor

A related samples t-test was conducted and the reduction in error rate was
found to be statistically significant (p < 0.01). We can therefore conclude that
the Intelligent Tutor is the most likely cause of a considerable improvement in
learning outcomes of a fairly large number of students. This study also suggests
that, more generally speaking, artificial intelligence can assist learners of English
as a second language with the remediation of their L2 errors. This in itself is
an important step in gaining a better understanding of the processes of second
language acquisition and teaching.
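For readers interested in the form of the calculation, the sketch below shows how an error reduction rate and a related-samples (paired) t-test can be computed with SciPy. The per-student figures are invented placeholders; the study's actual data (266 students across three sites) are not reproduced here.

from scipy import stats

# Invented pre- and post-test error counts for a handful of students,
# for illustration of the calculation only.
pre_errors  = [5, 4, 6, 3, 5, 4, 6, 5]
post_errors = [1, 0, 1, 1, 0, 1, 2, 0]

reduction = 100 * (1 - sum(post_errors) / sum(pre_errors))
print("Error reduction rate: %.0f%%" % reduction)

# Related-samples (paired) t-test on the same students' pre- and post-test counts.
t_stat, p_value = stats.ttest_rel(pre_errors, post_errors)
print("t = %.2f, p = %.4f" % (t_stat, p_value))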
Discussion
To play the devil’s advocate, it could be said that the results are too good to
be true. We will therefore examine the possible pitfalls in this study and de-
cide together to what extent they may be responsible for the result. An obvious
question to ask would be whether the errors were simply avoided in the post-
test, which was open-ended, by choosing alternative grammatical structures.
This has however not been the case. The students attempted the same struc-
tures they were struggling with in the previous two sessions, the pre-test and
the Intelligent Tutor respectively, this time with a much higher rate of success.
Another possibility is that the students might have memorised entire sentences
while working with the pre-test or the Intelligent Tutor. Their post-test responses
were however mostly creative and different in many little ways from the ex-
ample sentences presented by the pre-test and the Intelligent Tutor. Finally, the
likelihood that the students copied from each other was small, as the answers
differed from student to student. Thus, the results seem to suggest a genuine
improvement in grammar.
Several other factors might have contributed to the extraordinary improve-
ment in grammar recorded in this study. The first one is called the 'Hawthorne
effect’ (McDonough & McDonough, 1997: 166). This refers to a substantial im-
provement in student performance under study conditions due to their noticing
the uniqueness of the situation. While the quasi-experimental procedure was
introduced in a fairly low-key manner, it is still possible that the students might
have sensed something out of the ordinary about it. Since the entire procedure
was conducted by this author, who is also the author of the software used in
the treatment, it is of course possible that the author’s enthusiasm contributed
to the success of the study subjects.
Finally, the pre-test itself might have contributed to learning, thus enhancing
the effect of the treatment. For instance, R. Ellis (1997: 161) reports that some tasks
can significantly raise the learners’ consciousness concerning some linguistic
property of the target language. Judgement of well-formed vs. deviant linguistic
data, or in other words our grammaticality judgement test, is precisely such a
task that could help a learner arrive at an explicit understanding of the linguistic
item in question. However, no studies come to mind where the extraordinary
success of treatment was totally unrelated to the treatment itself.
It would be, of course, ideal to extend this study over an even larger sample
of students, which is very likely to happen in the near future. Another desider-
atum is to examine the correlation between the students' choices when working
with the program and their learning styles. In other words, it would be really
interesting to see whether for instance the analytical learners really like to rectify
their incorrect answers themselves rather than getting a hint and so on.
What some readers may wish to see accomplished is perhaps a study with a
group of students using the Intelligent Tutor over a longer period of time. This is
a worthy project, which regrettably must be put off, subject to the availability of
funding for further program development. In its current version, the Intelligent
Tutor is a prototype supporting a very small curriculum, which would not com-
fortably stretch over a longer period of time. In this predicament, however, the
Intelligent Tutor is not an isolated case. The literature on Intelligent CALL points
out the inconsistency of funding for ICALL projects and the resulting small
size of the available curricula as the main impediments to the general ac-
ceptance of this technology in second language pedagogy (Holland, 1995).
The fact, though, that an available prototype is being tested on a progressively
increasing learner sample, with consistently encouraging outcomes, may help
to turn the tide by proving that artificial intelligence, with its capacity for error
remediation, can be truly useful to second language learners.
Correspondence
Any correspondence should be directed to Marina Dodigovic, Assistant Pro-
fessor of English and TESOL, American University of Sharjah, College of Arts
and Sciences, P.O. Box 26666, Sharjah, United Arab Emirates (mdodigovic@
ausharjah.edu).
References
Borchardt, F. and Page, E. (1994) Let computers use the past to predict the future. Paper
presented at the Language Aptitude Invitational Symposium, CALL Arlington, 27
September.
Bull, S. (1997) Promoting effective learning strategy use in CALL. Computer Assisted
Language Learning 10 (1), 3–39.
Chapelle, C. (1997) CALL in the year 2000: Still in search of research paradigms? Language
Learning & Technology 1 (1), 19–43. http://llt.msu.edu/vol1num1/chapelle/default.html.
Accessed 18.10.2003.
Chapelle, C. (2001) Computer Applications in Second Language Acquisition. Cambridge:
Cambridge University Press.
Cook, V. (1993) Linguistics and Second Language Acquisition. London: Macmillan.
Dodigovic, M. (2002) Developing writing skills with a Cyber-Coach. CALL Journal 15 (1),
9–25.
Dodigovic, M. (2003) Natural Language Processing (NLP) as an instrument of raising the
language awareness of learners of English as a second language. Language Awareness
12, 187–203.
Dodigovic, M. (2005) Artificial Intelligence in Second Language Learning: Raising Error Aware-
ness. Clevedon: Multilingual Matters.
Doughty, C. (2001) Cognitive underpinnings of focus on form. In P. Robinson (ed.) Cogni-
tion and Second Language Instruction (pp. 206–257). Cambridge: Cambridge University
Press.
Doughty, C. and Long, M. (2003) Optimal psycholinguistic environments for dis-
tance foreign language learning. Language Learning & Technology 7 (3), 50–80.
http://llt.msu.edu/vol7num3/doughty/. Accessed 18.10.2003.
Doughty, C. and Williams, J. (1998) Pedagogical choices in focus on form. In C. Doughty
and J. Williams (eds) Focus on Form in Classroom Second Language Acquisition (pp. 197–
262). New York: Cambridge University Press.
Ehrman, M. (1998) Motivation and strategies questionnaire. In J. Reid (ed.) Understanding
Learning Styles in the Second Language Classroom (pp. 169–174). Upper Saddle River, NJ:
Prentice Hall.
Ehsani, F. and Knodt, E. (1998) Speech technology in computer-aided language
learning: Strengths and limitations of a new CALL paradigm. Language Learning &
Technology 2 (1), 45–60. http://llt.msu.edu/vol2num1/article3/. Accessed 18.10.2003.
Ellis, N. (2001) Memory for language. In P. Robinson (ed.) Cognition and Second Language
Instruction (pp. 33–68). Cambridge: Cambridge University Press.
Ellis, N. (2002) Unconscious and conscious sources of language acquisition. Paper
presented at the Sixth International Conference for Language Awareness, Umeå, Sweden.
Ellis, R. (1997) SLA Research and Language Teaching. Oxford: Oxford University Press.
Ellis, R. and Barkhuizen, G. (2005) Analysing Learner Language. Oxford: Oxford University
Press.
Fleming, N. (2001) Teaching and Learning Styles: VARK Strategies. Honolulu: Honolulu
Community College.
Freiermuth, M. (1997) L2 error correction: Criteria and techniques. The Lan-
guage Teacher Online 21 (9). http://langue.hyper.chubu.ac.jp/jalt/pub/tlt/97/sep/
freiermuth.html. Accessed 03.10.2003.
Gass, S. and Varonis, E. (1994) Input, interaction and second language production. Studies
in Second Language Acquisition 16, 283–302.
Goodfellow, R. (1999) Evaluating performance, approach, outcome. In K. Cameron (ed.)
CALL: Media, Design and Applications (pp. 109–140). Lisse: Swets & Zeitlinger.
Granger, S. (2003) Error-tagged learner corpora and CALL: A promising synergy. CALICO
Journal 20 (3), 465–480.
Gregg, K. (2001) Learnability and second language acquisition theory. In P. Robinson
(ed.) Cognition and Second Language Instruction (pp. 152–182). Cambridge: Cambridge
University Press.
Hamburger, H., Schoells, M. and Reeder, F. (1999) More intelligent CALL. In K. Cameron
(ed.) CALL: Media, Design and Applications (pp. 183–202). Lisse: Swets & Zeitlinger.
Heift, T. (2003) Multiple learner errors and meaningful feedback: A challenge for ICALL
systems. CALICO Journal 20 (3), 533–548.
Holland, V. (1995) Introduction: The case for intelligent CALL. In V. Holland, J. Ka-
plan and M. Sams (eds) Intelligent Language Tutors (pp. vii–xvi). Mahwah, NJ:
Erlbaum.
Holland, M., Maisano, R., Alderks, C. and Martin, J. (1993) Parsers in tutors: What are
they good for? CALICO Journal 11 (1), 28–46.
James, C. (1998) Errors in Language Learning and Use. Exploring Error Analysis. London:
Longman.
Krashen, S. (1987) Principles and Practice in Second Language Acquisition. London: Prentice-
Hall.
Larsen-Freeman, D. and Long, M. (1991) An Introduction to Second Language Acquisition
Research. New York: Longman.
L’Haire, S. and Faltin, A. (2003) Error diagnosis in the FreeText project. CALICO Journal
20 (3), 481–495.
Long, M. and Robinson, P. (1998) Focus on form: Theory, research and practice. In
C. Doughty and J. Williams (eds) Focus on Form in Classroom Second Language Ac-
quisition (pp. 15–41). New York: Cambridge University Press.
MacWhinney, B. (1995) Evaluating foreign language tutoring systems. In V. Holland,
J. Kaplan and M. Sams (eds) Intelligent Language Tutors (pp. 317–326). Mahwah, NJ:
Erlbaum.
Matthews, C. (1993) Grammar frameworks in intelligent CALL. CALICO Journal 11 (1),
5–27.
McDonough, J. and McDonough, S. (1997) Research Methods for English Language Teachers.
London: Arnold.
Mitchell, R. and Myles, F. (1998) Second Language Learning Theories. London: Arnold.
O’Grady, W., Dobrovolsky, M. and Aronoff, M. (1997) Contemporary Linguistics. Boston:
Bedford/St. Martin’s.
Oxford, R. (1995) Linking theories of learning with intelligent computer-assisted lan-
guage learning (ICALL). In V. Holland, J. Kaplan and M. Sams (eds) Intelligent Language
Tutors (pp. 359–370). Mahwah, NJ: Erlbaum.
Salaberry, M. (1996) A theoretical foundation for the development of pedagogical tasks
in computer mediated communication. CALICO Journal 14 (1), 5–36.
Sawyer, M. and Ranta, L. (2001) Aptitude, individual differences, and instructional de-
sign. In P. Robinson (ed.) Cognition and Second Language Instruction (pp. 319–353).
Cambridge: Cambridge University Press.
Schmidt, R. (2001) Attention. In P. Robinson (ed.) Cognition and Second Language Instruc-
tion. Cambridge: Cambridge University Press.
Selinker, L. (1972) Interlanguage. International Review of Applied Linguistics 10 (3), 209–231.
Swain, M. (1998) Focus on form through conscious reflection. In C. Doughty and
J. Williams (eds) Focus on Form in Classroom Second Language Acquisition (pp. 64–82).
New York: Cambridge University Press.
Swartz, M. and Yazdani, M. (eds) (1992) Intelligent Tutoring Systems for Foreign Language
Learning. New York: Springer.
Tschichold, C. (1999) Intelligent grammar checking for CALL. ReCALL special publication,
Language Processing in CALL, 5–11.
Warschauer, M. (1999) Electronic Literacies. Mahwah, NJ: Erlbaum.
Willing, K. (1988) Learning Styles in Adult Migrant Education. Adelaide: NCRC.
Willing, K. (1989) Teaching How to Learn. Sydney: NCELTR.
Witkin, H., Moore, C., Goodenough, D. and Cox, P. (1977) Field dependent and field
independent cognitive styles and their educational implications. Review of Educational
Research 47, 1–64.
Yip, V. (1995) Interlanguage and Learnability. Philadelphia: John Benjamins.
Yong, J. (2001) Malay/Indonesian speakers. In M. Swan and B. Smith (eds) Learner English
(2nd edn, pp. 279–295). Cambridge: Cambridge University Press.