
Reexamining the Role of Vision in Second Language Motivation: A Preregistered Conceptual Replication of You, Dörnyei, and Csizér (2016)


Abstract

Vivid mental imagery, particularly of the self in future states, has been linked to a range of desirable motivational outcomes for language learning. In this study, we report a pre-registered conceptual replication and extension of You, Dörnyei, and Csizér (2016), who found a central motivational role for vision. We review essential considerations in structural equation modeling and discuss how these were addressed in the initial study. Applying these considerations, we then describe a conceptual replication with a South Korean sample (N = 1,297) of secondary school language learners of English. Our analysis of the scales used in the initial study, plus second language achievement, found support for an alternative model to that found by the initial study in that intended effort showed a better fit as a predictor of motivation rather than an outcome variable. Our findings suggest the need for greater precision and rigor in structural equation modeling research on second language learning motivation, and for greater numbers of language researchers to take up replication and other open science initiatives.
Language Learning ISSN 0023-8333
Reexamining the Role of Vision in Second
Language Motivation: A Preregistered
Conceptual Replication of You, Dörnyei,
and Csizér (2016)
Phil Hiver (Florida State University) and Ali H. Al-Hoorie (Royal Commission for Jubail and Yanbu)
Researchers have linked vivid mental imagery, particularly of the self in future states, to
many desirable motivational outcomes for language learning. We report a preregistered
conceptual replication and extension of You, Dörnyei, and Csizér (2016), who found
a central motivational role for vision. We review essential considerations in structural
equation modeling and discuss how the initial study addressed these, then describe
a conceptual replication with a South Korean sample of secondary school learners of
English (N=1,297). Our analysis of the scales from the initial study in addition to second
language achievement found support for an alternative model where the Intended Effort
scale showed a better fit as a predictor of motivation than as an outcome variable. Our
findings suggest the need for greater precision and rigor in structural equation modeling
research on second language learning motivation and for more language researchers to
take up replication and other open science initiatives.
We would like to thank the five reviewers for their thoughtful and constructive comments. We also
thank the editorial team of Pavel Trofimovich, Kara Morgan-Short, and Emma Marsden for their
assistance throughout the review process. Special thanks must also go to Chenjing (Julia) You for
her time reading and commenting on a previous version of this manuscript.
This article has been awarded Open Data, Open
Materials, and Preregistered Research Design
badges. All data and materials, along with preregistration
for research design and analyses, are publicly
accessible through the Open Science Framework. The study materials are
also publicly available via the IRIS database. More information about
the Open Practices badges is available from the Center for Open Science.
Correspondence concerning this article should be addressed to Phil Hiver, Florida State University,
School of Teacher Education, College of Education, 1114 W. Call St., G128 Stone Building,
Tallahassee, FL 32306, United States.
Language Learning 70:1, March 2020, pp. 48–102 48
© 2019 Language Learning Research Club, University of Michigan
DOI: 10.1111/lang.12371
Hiver and Al-Hoorie Reexamining Vision in Second Language Motivation
Keywords: motivation; vision; replication; preregistration; structural equation modeling;
gender; second language
Replication of research findings has recently been recognized as an important
component of cumulatively refining empirical evidence in the social sciences,
including the field of language learning (Marsden, Morgan-Short, Thompson,
& Abugaber, 2018; Morgan-Short et al., 2018; Porte, 2012). Mackey and Gass
(2005) defined replication as “[c]onducting a research study again, in a way
that is either identical to the original procedure or with small changes . . .
to test the original findings” (p. 364), with the successful replication of a
finding lending further legitimacy to the initial study (Hiver & Al-Hoorie,
2016). A replication may be direct, when researchers hold constant design,
methods, and analysis, or partial, when researchers intentionally introduce one
significant change to a key variable to test replicability under the new condition
or procedure (Marsden, Morgan-Short, Thompson, & Abugaber, 2018). In
contrast, conceptual replications, like the current study, introduce more than
one significant change to an initial study and can provide information about the
potential value of different approaches to investigating a problem or different
conceptualization of a construct.
A subdomain of language learning research where replication research
is needed is language learning motivation. In recent literature in the second
language (L2) motivation field, vision and mental imagery¹ have received considerable
attention, with greater numbers of researchers showing interest in
the empirical assessment (see Al-Hoorie, 2018) and practical application (e.g.,
Dörnyei & Kubanyiova, 2014; Hadfield & Dörnyei, 2013) of vision in facilitating
language learning. For example, You, Dörnyei, and Csizér (2016) conducted
a study that they described as “the first to offer a broad overview of the extent
to which the capacity of vision contributes to the overall motivational setup of
a whole language learning community” (p. 94). Their study with a large-scale
Chinese sample reported findings related to a number of motivation constructs,
including vision, sensory style, positive and negative change in L2 self image,
and gender. In evaluating their results, You et al. concluded that their
study offered unambiguous support for the role of vision in language learning
motivation (cf. pp. 113 and 120).
However, as our review of the literature shows, there is also evidence suggesting
that the effect of vision on effort and performance is open to question.
Thus, in order to refine our understanding of the extent to which vision can
be said to contribute to motivation for L2 learning, the first purpose of the
present study was to attempt to conceptually replicate the structural equation
model (SEM) used by You et al. (2016). Replications of studies using SEM,
of any kind, have tended to be rare. Perhaps because SEM typically requires
large samples, most of the SEM literature has consisted of “one-shot studies”
for which the researchers have not attempted replications (Kline, 2016, p. 121).
This situation has often been further complicated in instances where SEM has
been used to engage in post hoc, data-driven model modification, rendering
the results exploratory and potentially less likely to replicate. Because SEM is
prone to these and other misapplications, a second equally important purpose
of the present article was to provide an up-to-date methodological review of
considerations that SEM users must take into account when they analyze data
and report results.
Following best practices in replication research (Brandt et al., 2014), we first
closely examined the purpose, design, and methodology of You et al.’s (2016)
study to consider the rationale for our replication and then preregistered our
design and analysis plan. We identified a clear need for some kind of replication,
given the impact of the study on the field—with over 65 Google Scholar
citations in less than 3 years, a number that is higher than the average annual
citations of 17.65 that initial studies receive before being replicated (Marsden,
Morgan-Short, Thompson, & Abugaber, 2018, p. 347)—and given the broader
societal importance of better understanding motivation in language learning.
Furthermore, and in line with the main focus of the rationale that we have
described here, we identified a number of statistical concerns related to model
fit, assumption checking, measurement model validation, model justification,
measurement invariance, and hypothesis testing. To address these concerns,
we deviated from the methodology of You et al. (2016), thus making ours
a conceptual replication rather than a direct or partial replication. We also
included some exploratory elements that deviated from our own preregistration.
We emphasize that we are genuinely committed to constructing a better
understanding of L2 motivational phenomena. Our point of departure was thus
that “raising legitimate concerns about previous methodology, analysis, or conclusions,
is regarded as a laudable line of inquiry and not to be misconstrued
as a covert assault on the original author’s integrity” (Porte, 2012, p. 3). The
fact that our results did not eventually reproduce the pattern of findings in You
et al.’s study must, in the same light, not be seen as implying any disrespect
to its authors or undermining their sizeable contribution to the field to date.
For example, we note that some of the SEM advances that we have reviewed
have only become widely available in recent years—with some emerging only
after the publication of You et al.’s study. Furthermore, You et al. used Amos
(Arbuckle, 2013), but we used Mplus (Muthén & Muthén, 1998–2012) and
R (R Core Team, 2014), both of which provided more flexibility and additional
functionality. On these grounds, we must acknowledge that it should not
be too surprising that these methodological improvements, in combination with
some differences between our design and participants and those of the initial
study (described in detail below), have led to different results and conclusions.
It should also be noted that You et al. have made all their materials publicly
accessible in the IRIS Repository (Marsden, Mackey, & Plonsky, 2016), a decision
that made the present conceptual replication feasible, and it is one that
is also important for supporting open science practices more generally.
Background Literature
Vision and Motivation
Within a motivational science perspective (e.g., Higgins, 2014), human thought
and action are oriented toward the future, with the understanding that individuals
possess the capacity for change. The notion that vision is an essential element
for deliberately influencing the future has become popular in various social
disciplines (e.g., van der Helm, 2009). In this broad context, van der Helm
(2009) defined vision as “the more or less explicit claim or expression of a
future that is idealised in order to mobilise present potential to move into the
direction of this future” (p. 100). In a similar spirit, Dörnyei, Henry, and Muir
(2016) stated that vision in language learning motivation is “conceptualized as
a vivid mental image of the experience of successfully accomplishing a future
goal” (p. 22). Because vision “captures a core feature of modern theories of
L2 motivation” (Dörnyei & Kubanyiova, 2014, p. 9), it has recently come to
occupy a central place in language motivation research. Hadfield and Dörnyei
(2013) have additionally explained that “[w]hen we use the word ‘vision,’ we
use it literally . . . [as] more than mere long-term goals or future plans in that
they involve tangible images and senses” (p. 2, original emphasis).
The View From Mainstream Motivational Psychology
The role that vision of the self in future states plays has a long tradition in motivational
psychology. An early attempt to elaborate on this role was through the
discrepancy-reducing function of future self-guides (Higgins, 1987, 1998). According
to self-discrepancy theory, self-guides are “self-state representations
[that] are self-directive standards or acquired guides for being” (Higgins, 1987,
p. 321). Self-discrepancy theory highlights two such self-guides: the ideal self,
representing one’s own hopes and wishes, and the ought self, representing
duties and obligations expected by significant others. A similar notion is found
in possible selves theory (Markus & Nurius, 1986). Possible selves “derive
from representations of the self in the past and they include representations of
the self in the future . . . represent[ing] specific, individually significant hopes,
fears, and fantasies” (Markus & Nurius, 1986, p. 954). These visions of the
self in a future state are assumed to serve as reference points for the actual self
by harnessing a person’s hopes, aspirations, expectations, obligations, or fears
within a given domain. And, because they approximate an affective and plausibly
real experience of the individual in that desired or undesired future state
(Markus & Ruvolo, 1989), they may serve to energize action. According to this
view, vision derives its sustaining function from its ability to direct a person’s
behavior to approach or avoid a certain target. These promotion and prevention
functions are based on the understanding that individuals are motivated to initiate
and persist in a course of action that will reduce perceived discrepancies
between their actual self and their personally relevant future self-guides and
bring one in line with the other (Higgins, 1987, 1996).
This theorizing notwithstanding, findings from a number of lines of research
in mainstream psychology have unveiled a less impressive role for
vision in conceptualizing and modeling motivation. One example is work in
which researchers explored the motivating function of positive thoughts and
mental images about a desired future (e.g., Oettingen, 1996, 2012; Oettingen &
Mayer, 2002). These findings highlighted unexpected twists in the explanatory
power of thinking about and imagining the future, suggesting that imagining
is a far from sufficient precursor to goal-directed action. In summarizing the
results of this line of research, Oettingen (2012) argued that “counter to what
the popular self-help literature proposes, positive thinking can be detrimental
to effort and success if it comes in the form of fantasies (free thoughts and
images about the desired future)” (p. 1). This line of research has demonstrated
that, both in immediate and delayed (some up to 2 years later) measurements
of effort and performance across multiple domains of human functioning, such
mental images have emerged as negative predictors that can actually hinder
increased effort and successful performance.
Sports psychology, the field from which vision interventions originated decades
ago and which inspired research into vision in L2 motivation
(Adolphs et al., 2018; Dörnyei et al., 2016), similarly raises questions about the
centrality of vision² in motivation and performance. In a recent meta-analysis
of the effects of mental imagery and vision interventions on biopsychological
outcomes related to both performance restoration and performance optimization,
Zach, Dobersek, Filho, Inglis, and Tenenbaum (2018) reported that the
effects of imagery interventions were all statistically nonsignificant. These researchers
therefore argued that “much caution” (p. 85) is needed before making
claims about the effectiveness of mental imagery.
Educational psychology has similarly received with skepticism the application
of mental imagery as an instructional tool. For example, Dunlosky, Rawson,
Marsh, Nathan, and Willingham (2013) classified it as a technique of low
utility. In their synthesis of the evidence available for 10 common techniques
under different learning conditions and in relation to different student characteristics,
different levels of material difficulty, and different outcome measures,
Dunlosky et al. described the impact of visualizing as “rather limited and not
robust” (p. 24) and concluded that the evidence generated from it remains a
“patchwork of inconsistent effects” (p. 25). This is perhaps also linked to Morin
and Latham’s (2000) results, which showed a moderating effect of visualization
ability on the relationship between mental imagery and outcomes. Those more
skillful in visualization were the ones who benefited the most from it. This adds
a further layer of caveats to any effective application of visualization techniques
to classroom instruction.
Finally, research by Paluck (2010) offered a cautionary tale about the importance
of scrutinizing hypothesized effects experimentally (see also Hiver &
Al-Hoorie, 2020). Paluck’s large (N=842) stratified experimental ethnographic
study aimed to promote intergroup tolerance among rival ethnic groups in war-
torn Democratic Republic of Congo. Paluck applied an imagine-self technique
wherein participants were encouraged to take the perspective of someone from
another ethnic group and imagine themselves in that person’s shoes in the hope of
enhancing empathy. Contrary to expectation, those applying the imagine-self
technique were actually less tolerant of other groups in their questionnaire responses,
in their spontaneous comments, and in the objective behavior of donating
to other groups. In one added surprise to these results, those who discussed
with others what they had imagined, a strategy intended to enhance the uptake
of the treatment, showed the lowest tolerance. In trying to explain these results,
Paluck suggested that the treatment might have primed intergroup grievances
and made the participants more aware of them, which led to an effect contrary
to what had been anticipated. In combination, these results from mainstream
motivational psychology suggest that the role of vision is at best debatable and
potentially problematic.
The View From L2 Motivation
The L2 motivational self system (Dörnyei, 2005, 2009) was proposed as a theoretical
paradigm to account for motivation in language learning. Theoretically,
the L2 motivational self system draws from Markus and Nurius’s (1986) possible
selves theory, Higgins’s (1987) self-discrepancy theory, and Gardner’s (1985,
2010) socioeducational model to reframe past research strands in language
learning motivation within a self framework. Key to the L2 motivational self
system is the premise that learners’ vision of themselves in the future (specifically
the ideal L2 self and the ought-to L2 self) plays a central role in promoting
positive learning behaviors in the present (Dörnyei, 2014). The L2 motivational
self system additionally includes the L2 learning experience (sometimes called
attitudes to language learning), “which concerns situated, ‘executive’ motives
related to the immediate learning environment and experience” (Dörnyei, 2009,
p. 29).
The L2 motivational self system framework has led to a great deal of research
over the past decade (for reviews, see Al-Hoorie, 2018; Boo, Dörnyei,
& Ryan, 2015). Most of this research, however, has focused on examining the
correlations between these self-guides and self-reported intended effort (e.g.,
Csizér & Kormos, 2009; Csizér & Lukács, 2010; Islam, Lamb, & Chambers,
2013; Kormos & Csizér, 2008; Lamb, 2012; Magid, 2009; Papi & Abdollahzadeh,
2012; Ryan, 2009). Some other research has considered the correlations
between self-guides and perceptual styles (e.g., Al-Shehri, 2009; Kim,
2009; Kim & Kim, 2011, 2014; Yang & Kim, 2011). The results have generally
shown substantial correlations between self-guides and these self-report measures,
leading some researchers to describe vision as one of the most important
variables in successful learning (e.g., Dörnyei & Kubanyiova, 2014, p. 2) and as
one of the most reliable predictors of long-term intended effort (e.g., Dörnyei
et al., 2016, p. 23).
Taking this line of research further, Dörnyei and Chan (2013) examined the
relationship between different modalities (visual and auditory) and different
self-guides (ideal and ought) in two foreign languages (English and Mandarin)
in a sample of 172 Cantonese speakers in China. The researchers tested the
mental imagery function of future self-guides and reported that these were
associated with salient imagery and visualization components. In their sample,
a visual perceptual style was significantly correlated with learners’ future
self-guides and, when combined with other sensory variables (the auditory
modality and imagery capacity), the correlation became stronger. In light of
these findings, they proposed that this vision is multisensory in nature, involving
all modalities, not just visualization. A further important characteristic of
the imagery skills involved was their language-independent nature, pointing to
the conclusion that mental imagery is a more generic capacity rather than being
specifically L2 related. They emphasized the importance of imagery capacity
in developing future self-guides and concluded that motivation may depend on
learners’ abilities to create such mental imagery.
Although the reviewed studies mostly concerned themselves with self-
report outcome measures, other research (though still observational) has investigated
the correlation between self-guides and language achievement and
success. This line of research has shown that, in contrast to self-report measures,
correlations with more objectively measured language learning behavior
and achievement were less substantial. Such findings have been repeatedly
obtained in research conducted in South Korea (Kim & Kim, 2011), Indonesia
(Lamb, 2012), Canada (MacIntyre & Serroul, 2015), Iran (Papi & Abdollahzadeh,
2012), and Saudi Arabia (Moskovsky, Assulaimani, Racheva, &
Harkins, 2016). In a meta-analysis of these effects, Al-Hoorie (2018) found that
the correlation between the Ideal L2 Self scale and language achievement was
weak, r = .20, 95% CI [.08, .32]. This weak correlation exhibited substantial
heterogeneity and became even weaker after correction for potential publication
bias, r = .10, 95% CI [–.01, .22]. These findings echoed Moskovsky et al.’s
(2016) conclusion that the results “at best indicate a tenuous link between the
self guides and achievement” (p. 650).
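To put the size of these correlations in perspective, a correlation coefficient can be squared to give the proportion of variance that two measures share. The short Python sketch below applies this standard conversion to the two point estimates quoted above. It is an illustration only and not part of Al-Hoorie's (2018) analysis; the function name `shared_variance` is a hypothetical label.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance two variables share, given their correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie in [-1, 1]")
    return r ** 2


# Point estimates quoted in the text for the Ideal L2 Self scale and achievement.
estimates = [("uncorrected", 0.20), ("corrected for publication bias", 0.10)]

for label, r in estimates:
    print(f"{label}: r = {r:.2f}, shared variance = {shared_variance(r):.1%}")
```

On these figures, the Ideal L2 Self scale shares roughly 4% of its variance with achievement before the correction and roughly 1% after it, which is consistent with the description of the link as tenuous.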
A minority of investigators have moved beyond correlations between self-
reported measures and observational designs to the search for novel avenues
of how to motivate language learners through generating a language learning
vision and enhancing imagery (e.g., Adolphs et al., 2018; Sato & Lara, 2019)
and for superordinate and durable motivational forces that underpin long-term
persistence for L2 learning (Dörnyei et al., 2016; Ibrahim & Al-Hoorie, 2019).
In some of this research, researchers have specifically examined whether and
how classroom teachers can use imagined future states and mental projections
of the self in those states to mobilize and sustain current language learning
behavior toward that visionary target (e.g., Chan, 2014; Mackay, 2014; Magid,
2014; Magid & Chan, 2012; Sampson, 2012). This line of intervention research,
which is still in its infancy, has been characterized as having a number
of methodological limitations, including a lack of adequate blinding and a
tendency to emphasize qualitative components over objective criteria. When
researchers used objective criteria, the results were mixed, with Mackay (2014)
concluding that “it is not entirely evident whether any improvement in motivational
factors was due specifically to the development of an Ideal L2 self
or simply to the novelty of the approach” (p. 398). Furthermore, even if some
qualitative analyses have suggested that vision-based activities might increase
learners’ motivation, the subsequent link to learning effort, actual engagement
and participation, and language development and achievement remains unclear
(see Sato & Lara, 2019).
These methodological choices might help shed light on the discrepancy
among different lines of research within L2 motivation and between the apparently
positive results obtained in the L2 motivation field and the less positive
results obtained from mainstream psychology. Most of the L2 motivation literature
to date has been observational, relying primarily on operationalizing L2
motivation through the three questionnaire-based constructs of the L2 motivational
self system: ideal L2 self, ought-to L2 self, and L2 learning experience.
Without the use of experimental designs, causality among variables is hard
to establish. Nevertheless, these methodological limitations have not stopped
scholars in the field from taking conclusions to the next level. Moving beyond
theoretical implications, as Henry and Cliffordson (2017) have recently
cautioned, “through practitioner-oriented volumes . . . [vision] has also found
its way into language classrooms” (p. 732). This has led Henry and Cliffordson
to compare the situation to a Kuhnian normal science where “central concepts
are adopted uncritically and anomalies ignored” (p. 732) in reference
to the argument that an understanding of vision aids teachers in increasing
their students’ motivation (Kim & Kim, 2014; Muir & Dörnyei, 2013). As an
illustration, in describing to language teachers the pedagogical implications of
vision, Dörnyei and colleagues stated:
While the day-to-day reality of one’s L2 learning experience is
determined by a myriad of situation-specific forces pulling and pushing
learners in different directions, the vision people have of the L2
speaker/user [whom] they would like to become seems, in the long run, to
be one of the most reliable predictors of long-term commitment and
effort. (Dörnyei et al., 2016, p. 42)
You et al.’s (2016) Study
One flagship study that characterizes this line of research was conducted by
You et al. (2016). Building on existing work on the L2 motivational self system
model, You et al.’s (2016) observational study aimed to investigate the contribution
of vision-related variables in the setup of a whole language community—
surveying a large sample (over 10,000) of foreign language learners in China.
Their sample included two learner age groups (secondary and tertiary, mean
ages of 16.5 and 19.6 years, respectively), with female learners
slightly overrepresented overall at a ratio of 54:46. The researchers used 10 questionnaire scales
with a six-point Likert response format. These scales included the three L2
motivational self system components, namely, the Ideal L2 Self, the Ought-to
L2 Self, and Attitudes to L2 Learning (adapted from Taguchi, Magid, & Papi,
2009). The authors also drew from three vision-related scales: Visual Style, Auditory
Style, and Vividness of Imagery (adapted from Dörnyei & Chan, 2013;
Kim, 2009; Kim & Kim, 2011). They further developed three other scales for
the purpose of their study: Ease of Using Imagery, Positive Changes in the
Future L2 Self-Image, and Negative Changes in the Future L2 Self-Image.
Finally, following the tradition in this line of research, they used an Intended
Effort scale as the outcome variable in their study. (The full list of items is
available in Appendix S1 of the initial study.)
In their initial analyses, You et al. (2016) used a chi-square test to show
that the majority of their participants reported that they had engaged in mental
imagery in their L2 learning, with the same pattern obtained for both male and
female learners (with ratios ranging from 2.4:1 up to 5.1:1). A multivariate
ANOVA further showed that those who had reported engaging in mental imagery
additionally reported higher motivation on the Ideal L2 Self, the Ought-to
L2 Self, the Attitudes to L2 Learning, and the Intended Effort scales. This pattern
was consistent across their subgroups, namely, secondary school learners,
English majors, and non-English majors.
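As a rough illustration of how a ratio such as 2.4:1 translates into a chi-square statistic, the Python sketch below runs a goodness-of-fit test against an even split. The counts are hypothetical and chosen only to reproduce a 2.4:1 ratio; they are not data from You et al. (2016), and the exact form of the test that the initial study ran is an assumption here.

```python
def chi_square_gof(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# HYPOTHETICAL counts (not from the initial study): 240 learners report
# engaging in mental imagery and 100 do not, a 2.4:1 ratio.
observed = [240, 100]
total = sum(observed)
expected = [total / 2, total / 2]  # null hypothesis: an even split

statistic = chi_square_gof(observed, expected)
CRITICAL_05_DF1 = 3.841  # chi-square critical value for alpha = .05, df = 1

print(f"chi-square(1) = {statistic:.2f}")
print(f"exceeds the .05 critical value: {statistic > CRITICAL_05_DF1}")
```

With samples in the thousands, even modest departures from an even split would exceed the critical value, so a significant chi-square result says little on its own about the practical size of the difference.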
You et al. (2016) subsequently presented an SEM analysis that, they argued,
demonstrated that the three vision-related variables (visual style, auditory style,
and vividness of imagery) are empirical antecedents of the L2 motivational
self system components (ideal L2 self, ought-to L2 self, and attitudes to L2
learning), and the latter motivational measures in turn predict learners’ self-
reported intended effort to learn the L2. The authors further maintained that
their results showed that this model operates equivalently for male and female
learners. Summarizing their results, You et al. (2016) argued that “the findings
confirmed the significance of vision in general” (p. 120).
As we explained above, the primary outcome variable used in the initial
study was the Intended Effort scale. Admittedly, this Intended Effort scale—
which has been used extensively for a decade—contains items that are generic
in nature, inquiring, for example, about spending “a lot of effort” and “a
lot of time” and studying “very hard.” Generic intentions are less likely to
translate into behavior compared with more specific intentions and goals (Al-
Hoorie, 2016a, 2016b, 2018; Fishbein & Ajzen, 2010; Locke & Latham, 1990;
Teimouri, 2017). Alternatively, it has been noted that, “If we want to draw
more meaningful inferences about the impact of various motives, it is more
appropriate to use some sort of a behavioural measure as the criterion/dependent
variable” (Dörnyei & Ushioda, 2011, p. 200, original emphasis). Consequently,
if You et al. had used an alternative outcome measure such as more specific
intentions, objective language performance, or actual success and achievement,
they might have obtained a different pattern of findings.
The dominant conceptualization in the L2 motivational self system literature
posits intended effort as an outcome variable. However, it may be plausible
to hypothesize that (generic) intended effort is actually an antecedent of motivation.
For instance, Oga-Baldwin and Nakata (2017) showed that task engagement
is a predictor of subsequent motivation. Extending this line of reasoning,
it seems plausible to argue that initial, generic intended effort could be a variable
facilitating task engagement and subsequent motivation, both eventuating
in language development. From this perspective, a generic intention of the sort
“I am prepared to expend a lot of effort in learning English” could be viewed as
a potential initial trigger for engaging in learning. The outcome variable could
then be language success, performance, or a more specific intention such as
“I try to read two short stories every week.” Few researchers have seriously
entertained this alternative hypothesis to date (see also Al-Hoorie, 2018).
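The two competing orderings can be written down as sets of directed paths. The Python sketch below is purely illustrative. The variable names are hypothetical labels, the every-to-every paths between blocks are a simplifying assumption, and the alternative ordering leaves out the vision variables for brevity. It is not the model specification fitted in either study.

```python
VISION = ["visual_style", "auditory_style", "vividness_of_imagery"]
L2MSS = ["ideal_l2_self", "ought_to_l2_self", "attitudes_to_l2_learning"]


def paths(sources, targets):
    """Every source predicts every target (a simplifying assumption)."""
    return [(s, t) for s in sources for t in targets]


# Ordering described for the initial study: vision -> L2MSS -> intended effort.
initial_model = paths(VISION, L2MSS) + paths(L2MSS, ["intended_effort"])

# Simplified alternative ordering: intended effort -> L2MSS -> achievement.
alternative_model = paths(["intended_effort"], L2MSS) + paths(L2MSS, ["achievement"])


def antecedents(model):
    """Variables that send paths but receive none."""
    sources = {s for s, _ in model}
    targets = {t for _, t in model}
    return sorted(sources - targets)


def outcomes(model):
    """Variables that receive paths but send none."""
    sources = {s for s, _ in model}
    targets = {t for _, t in model}
    return sorted(targets - sources)


for name, model in [("initial", initial_model), ("alternative", alternative_model)]:
    print(name, "antecedents:", antecedents(model), "outcomes:", outcomes(model))
```

Writing the orderings this way makes the substantive difference explicit: the same Intended Effort scale sits at the end of the chain in one specification and at the start in the other.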
Overall, the contrasting results within the L2 motivation field and between
L2 motivation and mainstream psychology in relation to the role of vision and
mental imagery have suggested that vision remains an open empirical question
warranting further investigation. We therefore decided that a preregistered
conceptual replication of the SEM in You et al. (2016) would constitute
a systematic first step. We thus carefully examined You et al.’s report and
methodological choices as we prepared our preregistration protocols. Our preregistration
protocols involved deviation from certain methodological choices
in You et al. and included the addition of an academic achievement variable in
order to shed more light on the role of vision.
Structural Equation Modeling Considerations in the Context of
You et al. (2016)
In this section, we review a number of methodological aspects in You et al.
(2016; henceforth, the initial study) in order to contextualize the design of our
conceptual replication as well as to provide a state-of-the-art account of the use
of SEM in language learning research. The initial study did not fully report
several technical points—an observation made elsewhere about a great deal of
L2 research (e.g., Al-Hoorie & Vitta, in press; Larson-Hall & Plonsky, 2015;
Marsden, Morgan-Short, Thompson, & Abugaber, 2018; Marsden, Thompson,
& Plonsky, 2018)—making it impossible for us to offer a complete evaluation
of the validity of the results, which thus emphasized the need for this repli-
cation. Following Marsden, Morgan-Short, Thompson, and Abugaber’s (2018)
recommendation to spell out deviations from a replicated study as fully as
possible, we have surveyed these methodological issues from the initial study
in order to point out potential departures in our methodology from that of the
initial study.
General Design Issues
Stratified Sampling
To begin with, in the initial study, the authors described their method as stratified
sampling, proposing that “the large stratified sample lends credibility to the
results” (p. 119). They also explained:
In selecting our participants, a stratified sampling method was followed,
and while our limited recourses did not allow for fully random or
systematic sampling within each stratum of the sampling frame, it is
believed that the robust coverage ensures that no major motivational
trends have gone unnoticed. (You et al., 2016, p. 102)
The ultimate goal of a stratified sampling procedure is to obtain an accurate
estimation of population parameters. This requires researchers to weight the
sample size proportionally based on the size of each stratum. For example,
if Region A is considerably larger than Region B, then the sampling should
take that fact into account. Not doing so may lead to biased standard errors,
especially when the means are different (Levy & Lemeshow, 2008; Lumley,
2010). Although in the initial study, You et al. (2016) did not report consulting
census data to formally determine the size of each stratum in their Chinese
population, we consulted official census data to guide our sampling decisions.
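The proportional weighting described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `proportional_allocation`, the region labels, and the stratum sizes are hypothetical and are not census figures from either study.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size.

    Largest-remainder rounding keeps the allocations summing to total_sample.
    """
    population = sum(stratum_sizes.values())
    exact = {k: total_sample * v / population for k, v in stratum_sizes.items()}
    alloc = {k: int(x) for k, x in exact.items()}  # floor of each exact share
    shortfall = total_sample - sum(alloc.values())
    # Give the leftover units to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:shortfall]:
        alloc[k] += 1
    return alloc


# Hypothetical stratum sizes (illustrative only).
strata = {"Region A": 600_000, "Region B": 400_000}
allocation = proportional_allocation(strata, 1000)
print(allocation)
```

With these made-up sizes, the larger region receives the larger share of the sample, which is the property that proportional stratification is meant to guarantee.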
Model Fit
As measures of the robustness of findings that were central to the initial study,
the model fit indices showed some evidence of misfit. For example, the initial
study reported some comparative fit index (CFI), parsimonious CFI, normed fit
index, and nonnormed fit index—Tucker-Lewis index (TLI)—values that were
less than .90 (see Figures 1 and 2 in the initial study). The authors also reported
a χ²/df ratio of over 25 (see Figure 1 in the initial study), much higher than
the recommended 3.0 or 5.0 threshold (Wheaton, Muthén, Alwin, & Summers,
1977). Although use of the χ²/df ratio as an index of fit has recently been criticized
(Goodboy & Kline, 2017; Kline, 2016), this extremely high value coupled
with the other goodness-of-fit indices below .90 raised concerns that the model
might have been misspecified and, thus, might be challenging to reproduce.
This was especially the case given that the χ²/df ratio is affected by sample
size only when the model is misspecified (Marsh, Balla, & McDonald, 1988)
because it penalizes for excessive model complexity (West, Taylor, & Wu,
2012). Following recent practice (e.g., Kline, 2016), in our study, we therefore
have reported χ², df, and p, along with two incremental fit indices, CFI and TLI, and the
root mean square error of approximation, an absolute fit index. Furthermore,
because these model fit indices are global, their low values in the initial study
might have led the reader to wonder, additionally, about more local fit indices
(see below) that were not reported.
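For readers who want to see how the global indices named above relate to the model and baseline chi-square values, the following is a minimal sketch using the standard textbook formulas. The function name `fit_indices` and all input values are hypothetical, and software packages may differ in detail (e.g., some use N rather than N − 1 in the RMSEA).

```python
import math


def fit_indices(chi2_model, df_model, chi2_base, df_base, n):
    """Return chi2/df, CFI, TLI, and RMSEA from model and baseline chi-squares."""
    ratio = chi2_model / df_model
    # Noncentrality estimates, truncated at zero.
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    denom = max(d_model, d_base)
    cfi = 1.0 if denom == 0 else 1.0 - d_model / denom
    base_ratio = chi2_base / df_base
    tli = (base_ratio - ratio) / (base_ratio - 1.0)
    # N - 1 is used here; this is an assumption about the software convention.
    rmsea = math.sqrt(d_model / (df_model * (n - 1)))
    return {"chi2/df": ratio, "CFI": cfi, "TLI": tli, "RMSEA": rmsea}


# Hypothetical values, not taken from either study.
result = fit_indices(chi2_model=250.0, df_model=100,
                     chi2_base=2100.0, df_base=120, n=501)
print({k: round(v, 3) for k, v in result.items()})
```

Because every index here is a function of the same chi-square values, they all summarize global fit and say nothing about where in the model any misfit is located.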
Measurement Model
Convergent and Discriminant Validity
The initial study drew from several scales, some of which had not been used
previously. The initial study did not report details about the psychometric prop-
erties of these scales apart from basic Cronbach’s alpha as an estimate of scale
reliability. A prerequisite step when researchers use SEM is investigating the
measurement model in order to satisfy certain psychometric conditions before
they conduct a structural model (e.g., Brown, 2015). The authors of the initial
study acknowledged that they had not empirically assessed the measurement
model, stating that “measurement models were drawn up based on the theoret-
ical considerations outlined in the review of the literature” (pp. 105–106).
The measurement model involves conducting a confirmatory factor analysis
with the aim of establishing the extent of construct validity of the latent vari-
ables in the model. This involves investigating both convergent and discriminant
validity (Fornell & Larcker, 1981; Hair, Black, Babin, & Anderson, 2010)—
both crucial for replication attempts. Convergent validity concerns whether the
indicators satisfactorily represent their latent constructs. One indicator of con-
vergent validity is factor loadings. A rule of thumb for satisfactory convergent
validity suggests that the standardized factor loading of each indicator variable
should be .70 or higher (or at least .50 or higher; Hair et al., 2010). The initial
study reported factor loadings as low as .31 and two at .40, with several others
missing (see Figure S3 in the appendix of the initial study), making it difficult
to assess the convergent validity of the latent variables in the model.
Another indicator of convergent validity is construct reliability (also called
composite reliability). In the context of latent variables, construct reliability is
best calculated using methods other than Cronbach’s alpha because alpha requires
the items to be τ-equivalent (i.e., with equal factor loadings; Raykov, 2004),
an assumption that is rarely satisfied in SEM data. Construct reliability is
computed using Jöreskog’s rho formula,³ displayed as Equation 1 (Fornell &
Larcker, 1981), where the reliability of the latent variable η is a function of
(varies with) the squared sum of the standardized factor loadings (λ) and the
sum of the error variances (ε):

$$\rho_{\eta} = \frac{\left(\sum_{i=1}^{p} \lambda_{y_i}\right)^{2}}{\left(\sum_{i=1}^{p} \lambda_{y_i}\right)^{2} + \sum_{i=1}^{p} \operatorname{Var}(\varepsilon_i)} \tag{1}$$
SEM software packages do not automatically compute construct reliability,
so researchers must calculate it separately from the confirmatory factor analysis.
The rule of thumb for construct reliability is that it should ideally be .70 or
higher (Hair et al., 2010). The initial study reported only the Cronbach’s alpha of
its scales, which by itself might have reflected incomplete information about the
reliability of the latent variables. (For more about the controversy surrounding
bias in Cronbach’s alpha in the context of SEM, see Peterson & Kim, 2013,
and Raykov, 1998; for details on the controversy surrounding Cronbach’s alpha
more generally, see McNeish, 2018, Raykov & Marcoulides, 2019, and Sijtsma,
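Because construct reliability typically has to be computed by hand from the confirmatory factor analysis output, a short sketch may help. It assumes a standardized solution with no cross-loadings or correlated errors, so that each error variance can be taken as 1 − λ²; the function name `composite_reliability` and the loadings are hypothetical.

```python
def composite_reliability(loadings, error_variances=None):
    """Construct (composite) reliability from standardized factor loadings."""
    if error_variances is None:
        # Assumption: standardized solution, so Var(error) = 1 - loading^2.
        error_variances = [1.0 - l ** 2 for l in loadings]
    squared_sum = sum(loadings) ** 2  # (sum of loadings), then squared
    return squared_sum / (squared_sum + sum(error_variances))


# Hypothetical standardized loadings for a three-item scale.
cr = composite_reliability([0.80, 0.70, 0.60])
print(round(cr, 3))
```

For these invented loadings the result is about .745, which would meet the .70 rule of thumb.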
A further indicator of convergent validity is the average variance extracted
(AVE). The AVE aims to establish whether the variance captured by the latent
variable is larger than the variance due to measurement error. The AVE can be
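computed using Equation 2 (Fornell & Larcker, 1981), where the AVE ($\rho_{vc}$) of
the latent variable η is a function of the sum of the squared standardized factor
loadings (λ) and the sum of the error variances (ε):

$$\rho_{vc(\eta)} = \frac{\sum_{i=1}^{p} \lambda_{y_i}^{2}}{\sum_{i=1}^{p} \lambda_{y_i}^{2} + \sum_{i=1}^{p} \operatorname{Var}(\varepsilon_i)} \tag{2}$$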
Again, SEM software packages do not compute AVE. Researchers must cal-
culate it for each latent variable from the confirmatory factor analysis output.
As a rule of thumb, the AVE should be .50 or higher (Fornell & Larcker, 1981).
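The AVE calculation can be sketched in the same hedged way. The function name `average_variance_extracted` and the loadings are hypothetical, and the default error variances again assume a standardized solution in which each error variance equals 1 − λ².

```python
def average_variance_extracted(loadings, error_variances=None):
    """AVE: sum of squared loadings over that sum plus the error variances."""
    if error_variances is None:
        # Assumption: standardized solution, so Var(error) = 1 - loading^2.
        error_variances = [1.0 - l ** 2 for l in loadings]
    sum_of_squares = sum(l ** 2 for l in loadings)
    return sum_of_squares / (sum_of_squares + sum(error_variances))


# Hypothetical standardized loadings for a three-item scale.
ave = average_variance_extracted([0.80, 0.70, 0.60])
print(round(ave, 3))
```

With these invented loadings the AVE is about .497, just under the .50 rule of thumb, even though every loading is .60 or higher. This illustrates that acceptable individual loadings do not guarantee an acceptable AVE.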
Discriminant validity refers to whether the constructs are sufficiently dis-
tinct from each other. The recommended measure is that the AVE values should
be greater than their respective interconstruct correlations squared (Hair et al.,
2010). The rationale behind this rule of thumb is that the construct should
explain more of the variance of its items than it shares with other constructs.
Because the initial study did not report AVE values for its latent variables,
discriminant validity could not be assessed. This information is helpful partic-
ularly when there is substantial overlap in the items used for different scales.

Table 1 Overlap in item wording of different scales used in the initial study

| Item | Scale |
| --- | --- |
| I can imagine myself in the future giving an English speech successfully to the public in the | Ideal L2 Self |
| If I wish, I can imagine how I could successfully use English in the future so vividly that the images and/or sounds hold my attention as a good movie or story does. | Vividness of Imagery |
| It is easy for me to imagine how I could successfully use English in the future. | Ease of Using Imagery |
| In the past I couldn’t imagine of myself using English in the future, but now I do imagine it. | Positive Changes of the Future L2 Self-Image |

In the case of the initial study, as Table 1 illustrates, items belonging to dif-
ferent scales had considerable wording overlap. Also in the SEM model of
the initial study, Vividness of Imagery predicted the Ideal L2 Self at .81 (see
Figure S3 in the initial study). This high coefficient raised further questions
about whether these were indeed two distinct latent variables. It was therefore
plausible that the items in Table 1 might turn out to be manifestations of the
same latent variable rather than to represent different latent variables. To ascer-
tain this, we investigated the convergent and discriminant validity of our model
using the methods described in this section before moving to the structural model.
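The discriminant validity rule of thumb discussed in this section (each construct's AVE should exceed its squared correlation with every other construct) can be checked mechanically. The sketch below is illustrative only: the function name `discriminant_failures`, the construct labels, and all AVE and correlation values are invented.

```python
def discriminant_failures(ave, correlations):
    """List construct pairs whose squared correlation is not below both AVEs."""
    failures = []
    for (a, b), r in correlations.items():
        shared_variance = r ** 2
        if shared_variance >= ave[a] or shared_variance >= ave[b]:
            failures.append((a, b, shared_variance))
    return failures


# Hypothetical AVE values and interconstruct correlations.
ave = {"A": 0.55, "B": 0.60, "C": 0.52}
correlations = {("A", "B"): 0.80, ("A", "C"): 0.40, ("B", "C"): 0.35}

failures = discriminant_failures(ave, correlations)
for a, b, shared in failures:
    print(f"{a} and {b} share {shared:.2f} of their variance")
```

In this invented case only the pair A and B fails, because a correlation of .80 implies 64% shared variance, more than either construct explains in its own items. Such a pair would be a candidate for being combined into one latent variable.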
Checking Assumptions
As with many statistical procedures, a number of assumptions need to be
satisfied before SEM results can be considered valid. Which particular as-
sumptions are required to be met depends on the details of the model. For
example, use of maximum likelihood estimation assumes that the data are
continuous and multivariate normal. In the initial study, the authors did not
make explicit which estimation method they had used, whether their ordinal
data were multivariate normal, or whether they had inspected outliers and
had dealt with them. When these assumptions are violated, alternative estima-
tion methods become more appropriate, including robust maximum likelihood
and diagonally weighted least squares (Li, 2016). In our study, we used the
diagonally weighted least squares estimation method because our data were
Another assumption is technically known as uncorrelated errors. Many
statistical procedures assume that errors are uncorrelated, and, if they are corre-
lated, special adjustments to the model need to be made. Errors can be correlated
when participants are sampled from discrete units such as classes, schools, and
regions. In such cases, some students will be more similar to each other (due
to their shared environment) than they would be if they are randomly sampled
from the population. The degree of dependence can be estimated empirically
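using Equation 3 to calculate the intraclass correlation (ICC; Hox, 2010), where
the ICC ($\rho$) is a function of the variance of the highest-level errors ($\sigma^{2}_{u0}$) and the
lowest-level errors ($\sigma^{2}_{e}$):

$$\rho = \frac{\sigma^{2}_{u0}}{\sigma^{2}_{u0} + \sigma^{2}_{e}} \tag{3}$$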
The ICC can be obtained from software packages with multilevel functionality.
As an illustration of the seriousness of correlated errors, even a small ICC of
.01 can inflate Type I error rate from .05 to .17 in a sample of 100; an ICC
of .05 would elevate it to .43 (Barcikowski, 1981; Kreft & de Leeuw, 1998).
The degree of bias resulting from correlated errors can be estimated using the
design effect formula (Muthén & Satorra, 1995) of Equation 4, where c is the
average cluster size:

$$\mathit{Deff} = 1 + (c - 1)\rho \tag{4}$$
Depending on the study design, the cluster might be the class, the school,
the neighborhood, or any other shared environment. Software packages do not
usually calculate design effect, so researchers need to calculate it themselves.
As a rule of thumb, a design effect of less than 2.0 is considered tolerable,
though some conditions require a more conservative threshold (Lai & Kwok,
2015). The authors of the initial study did not report the ICC or design effect
for their data, though the extremely large-scale nature of that study made it
likely that correlated errors existed. In our study, we adjusted standard errors
to correct for clustering within classrooms in our sample.
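Since the design effect usually has to be calculated by hand, the following sketch shows the arithmetic for the ICC and the design effect. The variance components would normally come from a multilevel model; here they, the cluster size, and the function names are hypothetical.

```python
def intraclass_correlation(var_between, var_within):
    """ICC: share of total variance that lies between clusters."""
    return var_between / (var_between + var_within)


def design_effect(avg_cluster_size, icc):
    """Design effect: 1 + (average cluster size - 1) * ICC."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


# Hypothetical variance components and an average class size of 30.
icc = intraclass_correlation(var_between=0.05, var_within=0.95)
deff = design_effect(avg_cluster_size=30, icc=icc)
print(round(icc, 3), round(deff, 2))
```

With these invented numbers, an ICC of only .05 combined with classes of 30 gives a design effect of about 2.45, above the 2.0 rule of thumb. A seemingly small ICC can therefore still call for a correction when clusters are large.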
Local Fit
As we mentioned above, the model fit indices reported in the initial study were
measures of global fit, rather than local fit. As the name suggests, local fit can
point to specific problematic areas in the model. Although commonly used,
global measures have been criticized for not providing an adequate indication
of the size or exact location of misspecification or the lack of fit (Saris, Satorra,
& van der Veld, 2009), and this concern constituted part of the rationale for
our conceptual replication. A common approach to the evaluation of local fit
is the inspection of residuals. Residuals represent the discrepancy between the
hypothesized and observed covariance matrices. Residuals can be obtained for
every unique value in the model, thus allowing inspection of local misfit. As
Goodboy and Kline (2017) assert, “The real details about model fit can be found
by inspecting the residuals” (p. 74), whereas “testing for global fitness is often
of only minor use” (Pearl, 2009, p. 145). The larger the standardized residual
(absolute values above 2.0 are usually flagged), the worse the fit. For some estimation methods (e.g., diagonally
weighted least squares) SEM packages provide the normalized residuals instead,
which represent ratios of covariance residuals over the standard error of the
sample covariance and which are a more conservative measure of local fit
(Kline, 2016). In view of these concerns, we inspected local fit using normalized
residuals in our study.
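To make the idea of a normalized residual concrete, the sketch below divides each covariance residual by an approximate standard error of the sample covariance. The standard error used here is the normal-theory approximation, which is an assumption; SEM packages may compute it differently for other estimators. The function name and the matrices are hypothetical.

```python
import math


def normalized_residuals(sample_cov, implied_cov, n):
    """Covariance residuals divided by the approximate SE of each sample covariance."""
    p = len(sample_cov)
    out = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            residual = sample_cov[i][j] - implied_cov[i][j]
            # Normal-theory approximation to the SE of a sample covariance.
            se = math.sqrt(
                (sample_cov[i][i] * sample_cov[j][j] + sample_cov[i][j] ** 2) / n
            )
            out[i][j] = residual / se
    return out


# Hypothetical observed and model-implied covariance matrices, N = 400.
observed = [[1.0, 0.5], [0.5, 1.0]]
implied = [[1.0, 0.3], [0.3, 1.0]]
resid = normalized_residuals(observed, implied, n=400)
flagged = [(i, j) for i in range(2) for j in range(i) if abs(resid[i][j]) > 2.0]
print(round(resid[1][0], 2), flagged)
```

In this invented example the model underestimates the covariance between the two variables by .20, giving a normalized residual of about 3.58. That value identifies the specific pair of variables where the model fits poorly, which a global index cannot do.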
Structural Model
Model Justification
The primary purpose of using SEM is to test and establish causal claims (Pearl,
2012). As with similar statistical analyses, SEM results are only as good as the
assumptions researchers hold about causality among the variables. In order for
SEM to be valid, these causal assumptions should be derived from experimental,
logical, and temporal considerations. Bollen and Pearl explain:
[D]evelopers and users of SEMs are under the mistaken impression that
SEMs can convert associations and partial associations among observed
and/or latent variables into causal relations. The mistaken suggestion is
that researchers developing or using SEMs believe that if a model is
estimated and it shows a significant coefficient, then that is sufficient to
conclude that a significant causal influence exists between the two
variables. Alternatively, a nonsignificant coefficient is sufficient to
establish the lack of a causal relation. Only the association of observed
variables is required to accomplish this miracle. (Bollen & Pearl, 2013,
p. 308)
In the initial study, the authors did not fully lay out the theoretical rationale
behind the model, which involved 10 structural paths, further justifying the
need for our conceptual replication, which allowed us to explore this and other
plausible models. For example, it was not made explicit why it was theorized
that the relationship between the auditory style and vividness of imagery should
be causal, why both the ideal L2 self and the ought-to L2 self had a causal effect
on attitudes to L2 learning but not the other way around, or why both auditory
and visual styles had a direct effect on attitudes to L2 learning but vividness of
imagery did not.
In the initial study, the authors did, however, discuss the rationale of some
paths, though they acknowledged the controversy surrounding the direction
of these paths: “Because the L2 Motivational Self System was originally pro-
posed as a framework with no directional links among the three components,
past empirical studies employing structural equation modeling (SEM) have not
been uniform in specifying these interrelationships” (p. 97). In fact, past SEM
research has at times even been contradictory. For example, in a chapter by
Taguchi et al. (2009, p. 86) the authors hypothesized a SEM path with the ideal
L2 self → attitudes to learning English; however, in a following chapter in the
same anthology, Kormos and Csizér (2009, p. 100) hypothesized the opposite
SEM path of the L2 learning experience → the ideal L2 self. This is puzzling
given that attitudes to learning English and the L2 learning experience, despite
what these expressions might imply, have actually been used synonymously
in the language motivation literature. In fact, in the initial study, You et al.
acknowledged that the two scale names were “largely terminological variation
because the specific questionnaire items that were used to tap into this compo-
nent were broadly similar across the studies” (pp. 96–97). Both studies reported
support for their respective model, though the rationale for the path direction
each adopted was not clear. One could argue that the two variables reinforce
each other, and therefore there is reciprocal causality between them (or, in tech-
nical terms, nonrecursive SEM paths). Although this third hypothesis is also
plausible, to our knowledge it has not been tested empirically or even theorized
A further justification for our conceptual replication was that the initial study
did not set out to test equivalent or competing models to examine how well they
account for the data. This problem has been highlighted by SEM methodolo-
gists, who have considered it a form of confirmation bias. Researchers select
their preferred model to confirm it while they overlook alternative models
that could potentially account for the data equally well. Although researchers
have typically overlooked them, “equivalent models exist for most published
applications, often in large numbers” (MacCallum, Wegener, Uchino, & Fab-
rigar, 1993, p. 196). Therefore, replication research should explore alternative
models to minimize confirmation bias, which is seen as a serious threat to
published SEM research (Kline, 2016, p. 296; see also Robles, 1996, and Shah
& Goldstein, 2006, for similar arguments).
Figure 1 An illustration of the mindlessness of structural equation modeling. Two
different models resulting in identical structural coefficients and identical model fit.
Error terms have been removed for simplicity. CFI = comparative fit index; TLI =
Tucker-Lewis index; RMSEA = root mean square error of approximation; PCLOSE =
p of Close Fit.
Figure 1 illustrates more concretely the danger of overlooking alternative
conceptualizations of one’s hypothesized model. The figure presents two mod-
els, each with its structural coefficients and model fit (adapted from Albalawi,
2018). Panel A represents the hypothesis that experiences of disappointment in
daily life have a negative causal effect on learners’ ideal L2 self (note the head
of the arrow pointing from disappointment to ideal L2 self). Panel B, however,
represents the exact opposite hypothesis: Having a high (perhaps unrealistic)
ideal L2 self could lead learners to disappointment. Despite the contradic-
tory nature of the two models, they have identical structural coefficients and
identical model fit. Thus, regardless of whether the researcher advocates the
model in Panel A or Panel B of Figure 1, the SEM results would support the
researcher’s hypothesis equally. This example demonstrates that each and every
path in the SEM model needs to have a convincing (ideally experimental) ratio-
nale derived from prior research (Joe, Hiver, & Al-Hoorie, 2017; Yun, Hiver,
& Al-Hoorie, 2018). Otherwise, SEM software would mindlessly crunch the
numbers and return support for the model.
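The equivalence problem can be demonstrated in its simplest form with two observed variables and simulated data. This is a deliberately reduced sketch, not the latent variable model in Figure 1: the variable names, the simulated effect, and the function name are all assumptions made for illustration.

```python
import random
import statistics


def standardized_slope(x, y):
    """Standardized OLS slope for y regressed on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    raw_slope = cov / sx ** 2
    return raw_slope * sx / sy


# Simulated data: the "true" direction is known only because we generated it.
random.seed(0)
disappointment = [random.gauss(0, 1) for _ in range(500)]
ideal_self = [-0.5 * d + random.gauss(0, 1) for d in disappointment]

beta_a = standardized_slope(disappointment, ideal_self)  # disappointment -> ideal self
beta_b = standardized_slope(ideal_self, disappointment)  # ideal self -> disappointment
print(round(beta_a, 3), round(beta_b, 3))
```

Both directions return the same standardized coefficient, because in this two-variable case each equals the correlation. The data alone therefore cannot distinguish the two causal hypotheses; only design and theory can.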
Finally, the initial study acknowledged that it had been a routine practice
in SEM studies to drop nonsignificant paths: “[W]hen certain links between
the ought-to self and other components did not reach significance, they were
deleted from the SEMs” (p. 98). However, SEM methodologists have cautioned
that “deleting nonsignificant paths from a structural equation model is a terrible
way to trim the model” (Goodboy & Kline, 2017, p. 72, original emphasis).
In addition to capitalizing on chance, this procedure wastes an important fea-
ture of SEM over conventional significance testing: estimating the magnitude
of relationships. Deleting a priori hypothesized paths due to nonsignificance
defeats this purpose. The initial study was not explicit as to whether the researchers
had dropped any nonsignificant paths or whether they had made
any post hoc modifications (and if so, what these modifications were). When
researchers perform modifications, SEM results become exploratory. Neither
was the initial study clear as to its purpose. At first, the authors stated that
the aim of the study was to “explore the nature of this motivational role [of
vision]” (p. 107, emphasis added), but later stated in the conclusion that “the
findings confirmed the significance of vision” (p. 120, emphasis added). In our
conceptual replication, we adhered to our preregistration protocols and have
explicitly reported any deviation from these and from the initial study.
Measurement Invariance
Measurement invariance is concerned with whether two (or more) participant
groups interpret the items in a conceptually similar manner. For example,
when comparing males and females, an observed difference might be due to
one group indeed having a higher latent score, but it might also be due to
the two groups simply understanding the items differently. That the groups
understand the items in a similar way is considered “a logical prerequisite”
(Vandenberg & Lance, 2000, p. 9) to any meaningful interpretation of this
difference. Differential interpretation of items could occur in cross-cultural
and cross-age comparisons. Different levels of measurement invariance may
be established, most commonly configural (same number of factors across
groups), metric (or weak; equal factor loadings as a prerequisite for structural
coefficient comparisons), and scalar (or strong; equal item intercepts as a pre-
requisite for latent mean comparisons). When measurement invariance does not
hold, the measure is considered problematic and in need of further refinement
(e.g., see Davidov, Meuleman, Cieciuch, Schmidt, & Billiet, 2014). The same
is true in cross-time comparisons because the understanding of some abstract
notions may change over time. The initial study reported group differences
based on gender and use of visualization. However, because it did not report
measurement invariance results, it is not clear whether these differences were
genuine or an artifact of lack of measurement invariance. In our study, there-
fore, we ensured the satisfaction of measurement invariance before conducting
analyses of group differences. (For readers unfamiliar with the concept of mea-
surement invariance, detailed introductions such as Brown, 2015, chapter 7,
and Steinmetz, Schmidt, Tina-Booh, Wieczorek, & Schwartz, 2009, may be
Hypothesis Testing
To contextualize the design of our conceptual replication, a final analytical
concern relates to comparing coefficients for different groups without exam-
ining whether the differences between these coefficients are significant. Visual
comparisons lend inconclusive support if not backed by inferential testing. For
example, although the initial study stated, “we do find some discrepancies in the
scores of the two genders” (p. 111) and “the coefficient increased, and it peaked
at .31 in the most committed subsample” (p. 117), readers would also want
to know whether these differences were significant or not. In fact, even if one
coefficient is significant and the other is not, this may not necessarily imply that
the difference between them is itself significant (see Gelman & Stern, 2006).
Some procedures have been developed for this purpose in the context of SEM,
such as the Wald test (see Brown, 2015; Kline, 2016). Unfortunately, general-
izing without statistical hypothesis testing is a prevalent statistical problem in
the L2 field (Al-Hoorie, 2018; Al-Hoorie & Vitta, in press). In our study, we
used the Wald test to test hypotheses about group differences.
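The point that a significant and a nonsignificant coefficient need not differ significantly from each other can be illustrated with a simple z test on the difference. This sketch assumes two independent groups, so that the estimates are uncorrelated; the squared z is then a one-degree-of-freedom Wald statistic. It is not the multiple-group routine of any SEM package, and the coefficients and standard errors are invented.

```python
from statistics import NormalDist


def z_test(estimate, se):
    """Two-sided z test of a single coefficient against zero."""
    z = estimate / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))


def difference_test(b1, se1, b2, se2):
    """Two-sided z test of b1 - b2, assuming independent groups."""
    se_diff = (se1 ** 2 + se2 ** 2) ** 0.5
    return z_test(b1 - b2, se_diff)


# Hypothetical path coefficients and standard errors for two groups.
z1, p1 = z_test(0.25, 0.10)  # significant on its own
z2, p2 = z_test(0.10, 0.10)  # not significant on its own
z_diff, p_diff = difference_test(0.25, 0.10, 0.10, 0.10)
print(round(p1, 3), round(p2, 3), round(p_diff, 3))
```

In this invented example the first coefficient is significant (p ≈ .012) and the second is not (p ≈ .317), yet the difference between them is far from significant (p ≈ .289).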
The Present Study
Having considered both open empirical questions and various methodological
issues relating to the initial study, we felt that a conceptual replication was
warranted to shed more light on the role of vision in language learning moti-
vation. In addition to these substantive and methodological rationales, another
criterion warranting replication was the weight a study has had on a field (e.g.,
Lindsay, 2015; Marsden, Morgan-Short, Thompson, & Abugaber, 2018). The
initial study has received numerous citations, as we described above, which
suggested that it has continued to have a major impact on the landscape of our
We have described our replication as conceptual (or constructive) due to
the deviation of our SEM model from that of the initial study. Conceptual
replications “introduce more than one significant change to the initial study
and can extend agendas in multifaceted ways but are in a weaker position
for ascribing different findings to the adaptations made to the initial study”
(Marsden, Morgan-Short, Thompson, & Abugaber, 2018, p. 366). Because our
replication was conceptual and because conceptual replications are in a weaker
position for explaining results that are not in line with those of an initial study,
we have made no claim that our results are more valid than those of the initial study.
Along the same lines, we reiterate that our intention was not to direct
criticism toward particular lines of research, methodological traditions, or in-
dividual scholars but rather to serve as a constructive step in pushing the L2
motivation field forward through ever more refined and precise empirical in-
sights (see Porte, 2012). The misconception that replicators may be harassing,
or even bullying, the authors of an initial study (see Bohannon, 2014) may be
attributed to the fact that the language field has not yet fostered a replication
culture. As an illustration, Marsden, Morgan-Short, Thompson, and Abugaber
(2018) estimated that there have been as few as one published replication study
for every 400 journal articles in the L2 field—an estimate they still described
as “generous” (p. 344). Within the language motivation field more specifically,
the situation is acute. Inspection of Marsden, Morgan-Short, Thompson, and
Abugaber’s (2018) list of replication studies across 26 L2 journals revealed
only one self-labeled replication study on motivation (Mantle-Bromley, 1995).
Although there were likely more replication studies that were not self-labeled as
such in their titles or abstracts, lack of explicit labeling can be counterproduc-
tive. For example, authors and reviewers may not feel as compelled to scrutinize
interstudy variation and to attempt to explain or evaluate it, and consequently
“heterogeneity from one study to the next can pass largely unchecked” (Mars-
den, Morgan-Short, Thompson, & Abugaber, 2018, p. 365).
As a methodological safeguard, we preregistered our study prior to data
collection (a time-stamped copy can be found at Prereg-
istration involves specifying in advance the research questions, the detailed
study design, as well as the analysis plan and statistical model. This aims to de-
marcate exploratory versus confirmatory research and to minimize researcher
degrees of freedom, which can bias results in favor of preferred or anticipated
As we stated in our preregistration protocols, the primary purpose of this
study was to replicate You et al.’s (2016) SEM model (see their Figure S3). We
adhered to the basic design and procedures described in You et al. (2016). At
the same time, our preregistration protocols explicitly stated that we planned
to deviate from You et al.’s design in several aspects:
1. Due to the considerable overlap in the wording of items belonging to differ-
ent scales, we explicitly predicted in our preregistration protocols that some
scales might not show sufficient discriminant validity and would therefore
have to be combined or completely excluded. (As we have shown below,
this prediction was confirmed.) Since we were unable to anticipate which
scales would survive the measurement model—because this was an empir-
ical question—we expected our final model to be different from that of You
et al., rendering our findings exploratory (again we had anticipated this in
our preregistration protocols).
2. We also stated in our preregistration protocols that we would compare the
fit of competing models. Once more, as we were unable to anticipate which
scales would survive the measurement model, we acknowledged that this
aspect would also be exploratory.
3. As the researchers had done in the initial study, we planned to conduct
multiple-group analyses to compare gender, vision capacity (vision-yes vs.
vision-no), and vision change (change-positive vs. change-negative). How-
ever, because the scales measuring the latter two conditions (vision capacity
and vision change) did not survive the measurement model, this part of the
analysis could not be conducted.
4. We planned to conduct our conceptual replication in a neighboring country,
South Korea, in which we had convenient access to a comparable sample
of learners. We reasoned, too, that the findings of the initial study would
be of greater utility to the field if they were shown to generalize first to a
neighboring country and then to other, more dissimilar, parts of the world
in future studies.
5. In an attempt to broaden the scope of our investigation, we sought to include
a second outcome measure in addition to intended effort. We obtained both
midterm and final course grades in English from our participants. Given the
importance of this outcome variable, we considered this point an extension
to the design of the initial study (rather than a deviation from it).
It is in light of these departures from the design of the initial study that we
have described our study as a conceptual rather than a direct or partial replication
(see Marsden, Morgan-Short, Thompson, & Abugaber, 2018). Despite these
design changes—including the additional outcome variable—our ultimate aim
in this study remained unchanged: to evaluate the claim advanced in the initial
study that vision is one of the single most important variables in language
motivation (see also Dörnyei & Kubanyiova, 2014, p. 2). Preregistering our
design and analyses was intended to give an additional assurance of the validity
of our conceptual replication.
Language Learning 70:1, March 2020, pp. 48–102 70
Our sample included 1,297 secondary school L2 learners (female = 789) recruited
from middle schools (12–14 years old). After consulting national census
data from the Korean Statistical Information Service, the government office
within the Ministry of Strategy and Finance that collects population data yearly
(Korean Statistical Information Service, 2016), we endeavored to stratify our
sample proportionately from the two main geographic/administrative regions
(northern and southern) to represent their socioeconomic characteristics. We
targeted public middle school students (1.1 million total), gender distribution
(52% female, 48% male), and number of public middle schools in the respective
administrative regions (2,567 total; weighted 42.8% and 57.2%). We sampled
from the more densely populated and urban northern provinces including the
capital (n = 556), and from all the remaining southern provinces (n = 741) to
correspond with the sociogeographic census weightings. The response rate for
female respondents was slightly higher in our sample, pushing the final gender
ratio of our sample closer to 59:41.
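As a rough check on the stratification described above, the following sketch compares a proportional allocation under the reported census weights with the achieved regional subsamples. The function name and the rounding rule are illustrative assumptions and are not part of the study's procedure; only the sample sizes and weights come from the text.

```python
def proportional_allocation(total, weights):
    """Split a total sample size across strata in proportion to their weights."""
    return {stratum: round(total * w) for stratum, w in weights.items()}


# Figures reported in the text: N = 1,297; census weights of 42.8% and 57.2%.
census_weights = {"northern": 0.428, "southern": 0.572}
achieved = {"northern": 556, "southern": 741}

# Target subsample sizes under strict proportional allocation (assumed rule).
target = proportional_allocation(1297, census_weights)

# Achieved share of each region, in percent, for comparison with the weights.
achieved_share = {
    region: round(100 * n / sum(achieved.values()), 1)
    for region, n in achieved.items()
}

print(target)
print(achieved_share)
```

Under these assumptions, the achieved regional shares differ from the census weights by about a tenth of a percentage point.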
When compared to the initial study, our sample was younger because the
age range of the initial study was 16–20 years. The gender ratio of the initial
study also featured more females both at the university level (62:38) and at the
secondary school level (53:47). In contrast, several key structural contingencies
made the classroom learning of English of equal instrumental utility in both
contexts (i.e., South Korea and China) at both the secondary and tertiary levels.
Across both settings, there was relative parity in the social value ascribed to
achievement in L2 English, an occurrence influenced by the outsized emphasis it
is given on standardized assessments in compulsory education and its perceived
importance in serving as a broad metric of academic success. In practice, then,
L2 English learning in both settings (South Korea and China) serves as a
social stratification metric that is key to success at different stages of life and
in different areas of society, and poor scores in English language learning in
secondary or tertiary education can relegate individuals to lower-tier learning
institutions and to types of employment perceived as less desirable (Muslimin,
The initial study used a total of 10 scales relating to several aspects of motivation
and vision (see Table 2). We adopted these scales using a 7-point Likert
response format⁴ (see Appendix S1 in the Supporting Information online for
the complete list of items). Table 2 presents the Cronbach's alpha reliability
estimates from our Korean sample and shows how they compare to those of
the initial study's Chinese sample. The reliabilities were broadly similar, with
the largest difference in Negative Changes of the Future L2 Self-Image, which
contained only two items. This was likely because Cronbach's alpha, in addition
to the criticism that it has received in general terms as we reviewed above, is
particularly inappropriate with two-item scales (Eisinga, Grotenhuis, & Pelzer,
2013). We have reported Cronbach's alpha for this two-item scale here only for
the sake of comparison with the initial study. Following the recommendation
of Eisinga et al. (2013), we computed the Spearman-Brown coefficient, which
turned out to be much lower (ρ = .45), indicating a critical problem with reliability
for this construct. The measurement model (see below) shed more light
on the psychometric properties of these scales.

Table 2 Cronbach's alpha reliability estimates for the 10 scales used in the present study
and in the initial study

Scale                                           k   Initial study   Present study
Ideal L2 Self                                   5   .88             .91
Ought-to L2 Self                                6   .74             .82
Attitudes to L2 Learning                        5   .88             .92
Intended Effort                                 5   .81             .87
Vividness of Imagery                            5   .91             .90
Visual Style                                    5   .66             .73
Auditory Style                                  5   .69             .76
Ease of Using Imagery                           5   .85             .80
Positive Changes of the Future L2 Self-Image    3   .76             .82
Negative Changes of the Future L2 Self-Image    2   .80             .62
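For a two-item scale, the Spearman-Brown coefficient is a simple function of the inter-item correlation r, namely ρ = 2r / (1 + r). The sketch below is only an illustration of that formula: it inverts the formula to show roughly what inter-item correlation the reported ρ = .45 implies. The function names are hypothetical, and the study's raw item data are not used.

```python
def spearman_brown(r):
    """Reliability of a two-item scale given the inter-item correlation r."""
    return 2 * r / (1 + r)


def implied_item_correlation(rho):
    """Invert the two-item Spearman-Brown formula to recover r from rho."""
    return rho / (2 - rho)


# The text reports rho = .45 for the two-item Negative Changes scale.
r_items = implied_item_correlation(0.45)

print(round(r_items, 2))                  # implied inter-item correlation
print(round(spearman_brown(r_items), 2))  # round trip back to the reported rho
```

Read this way, a coefficient of .45 corresponds to two items that correlate only weakly with each other, which is in line with the reliability problem noted above.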
We also sought to add a measure of L2 achievement to our analysis by
collecting the midterm and final scores (i.e., the two major assessments in each
grade, roughly 6 months apart) for each student. The original achievement
scores had a maximum of 100, which had to be rescaled to a maximum of 10
for model identification purposes. In this context, secondary schools cover L2
learning content from a national curriculum in a standard manner and sequence
to ensure fair opportunities for success on mandated assessments. Supplemen-
tary L2 material is often not permitted in regular secondary classrooms by
the South Korean Ministry of Education simply because it introduces unequal
opportunities and unpredictability in the classroom and does not correspond
with the tightly regulated school-leaving assessments. These end-of-semester
assessments are a more objective gauge of general language proficiency than
other, more formative, assessments that occur in this instructional context be-
cause these summative assessments require learners to demonstrate that they
know particular L2 content and are able to use it. Importantly, although not
all of the schools administer the exact same test, these assessments tend to be
relatively homogeneous across schools and regions. The content and format
(i.e., multiple-choice, cloze, and transformation question types) are roughly
identical, drawing heavily on reading and listening passages that require a high
degree of comprehension and responses that demonstrate lexical, grammatical,
and discourse competence. In these final exams, females (M = 78.23, SD =
20.30) scored significantly higher than males (M = 75.23, SD = 23.60),
t(965) = 2.35, p = .019, d = 0.14, 95% CI [0.03, 0.25].
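The effect size for this gender comparison can be approximated from the reported means and standard deviations. The sketch below assumes an equal-weight pooled standard deviation, because the group sizes entering this particular test are not reported here; the function name is hypothetical, and the authors may have pooled differently.

```python
import math


def cohens_d(mean_1, sd_1, mean_2, sd_2):
    """Cohen's d using an equal-weight pooled SD (an assumption made here)."""
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd


# Final exam descriptives reported in the text (females first, then males).
d = cohens_d(78.23, 20.30, 75.23, 23.60)
print(round(d, 2))
```

Under this assumption the result agrees with the reported d to two decimals, a small effect by conventional benchmarks.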
We had the questionnaire items used in the initial study translated into the
students’ native language (Korean) by a nonaffiliated researcher familiar with
the principles of questionnaire construction and English and Korean, and then
we back-translated the items to avoid large deviations in meaning. All materials
were administered in Korean. After receiving ethics approval, we approached
school administration and teaching faculty in schools nationwide. We ultimately
obtained written institutional consent from 12 schools (44 classes in total) to
collect data in the final weeks of the 2016 school year. Students from the schools
that agreed to participate completed the survey outside of their regular class
hours. Both their English L2 teacher and a research assistant were present to
inform them about the purpose of the survey, obtain consent, and administer
the materials. Students were reminded that participation was entirely
voluntary and were assured of the confidentiality of their responses. Through-
out, all participants were treated in accordance with American Psychological
Association ethical guidelines.
Data Analysis
In the analyses, we followed our preregistration protocols. We started with the
measurement model to investigate the psychometric properties of the scales. To
determine the number of underlying factors, we submitted the data to Mokken
scaling analysis and then to confirmatory factor analysis. To further ascertain
the number of factors, we conducted additional analyses using exploratory
factor analysis, scree plot, optimal coordinates, and parallel analysis. We handled
missing data using the default function in Mplus 7 (Muthén & Muthén,
1998–2012), which estimates the model under missing data theory using all
available data. We corrected standard errors and chi-square tests to account for
nonindependence of observations because participants came from 44 classes.
Due to violation of multivariate normality, we applied a robust weighted least
squares estimator using a diagonal weight matrix. We investigated both global
fit (e.g., CFI, TLI, and root mean square error of approximation) and local
fit (normalized residuals). We computed normalized residuals by dividing the
residual of the unrestricted model by the standard error of the corresponding H1
(sample covariance) value (Kline, 2016). We have also reported factor loadings,
construct reliability, and AVE values.
For the structural model, we tested two competing models to minimize
confirmation bias. We compared a model in which intended effort was an
antecedent with a model in which intended effort was an additional outcome
variable together with achievement, and we then controlled for baseline achieve-
ment using midterm scores in the model that had exhibited better fit. Finally,
we established measurement invariance before comparing the two genders.
The Measurement Model
Multivariate Normality
To test the multivariate normality of the variables, we conducted Mardia's
test using the MVN package (Korkmaz, Goksuluk, & Zararsiz, 2014) in R
(Version 3.1.2; R Core Team, 2014). The results showed that our data were not
multivariate normal, both in skewness (218.22), χ² = 47172.92, p < .001, and
kurtosis (3006.03), z = 216.24, p < .001. Figure 2 illustrates
the lack of multivariate normality in our data. Under multivariate normality,
the dots would be expected to fall along the straight line in the plot.
Number of Factors
Following our preregistration protocols, we first submitted the 10 scales to a
Mokken scaling analysis using MSP5 (Molenaar & Sijtsma, 2000). This pro-
cedure is a nonparametric item response theory model aimed at determining
the number of unidimensional factors underlying the data (Meijer & Baneke,
2004; van der Eijk & Rose, 2015). The analysis returned only four factors
(out of the 10 scales). Table 3 presents the scales and their associated items
that were scalable (the remaining items were not scalable). The table shows
that items from the Ideal L2 Self, Vividness of Imagery, Ease of Using Im-
agery, and Negative Changes of the Future L2 Self-Image scales all loaded on
the same factor (Factor 2). This pattern was in line with our analysis above
(cf. Table 1), suggesting that, based on their parallel wording, these items were
more likely to be manifestations of one latent variable than different latent
variables. Similarly, Attitudes to L2 Learning and Intended Effort loaded onto
one factor.
Figure 2 Quantile-quantile plot showing violation of multivariate normality.
Although in our preregistration protocol we explicitly predicted that some
scales would not exhibit sufficient discriminant validity, the reduction from
10 to four factors was surprising (more than a 50% reduction). Because this
was a substantial reduction, we decided to conduct further tests that we had
not preregistered in order to ascertain the validity of the Mokken results. We
examined our data using scree plot, eigenvalue over 1.0 criterion, parallel
analysis, optimal coordinates, and acceleration factor using the SPSS R-Menu
2.0 (Courtney, 2013; see Figure 3) as well as exploratory factor analysis. Most
of these pointed to four factors only, thus raising our confidence in this number
of factors. The acceleration factor suggested one factor only, but this method
has been criticized for underestimating the number of factors (Ruscio & Roche,
2012). The eigenvalue over 1.0 criterion, which is considered very unreliable
(van der Eijk & Rose, 2015), returned seven factors, which was also fewer than
the 10 factors hypothesized in the initial study. Appendix S2 in the Supporting
Information online presents the results from exploratory factor analysis, which
showed—perhaps more clearly—that Attitudes to L2 Learning and Intended
Effort loaded on one factor, and therefore it would not have been appropriate to
treat them as two separate scale variables in our study. In our case, we selected
the Intended Effort over the Attitudes to L2 Learning scale, as the former was
the (only) outcome variable in the initial study.
Table 3 Factors, homogeneities (H), and reliabilities (rho) resulting from Mokken
scaling analysis

Factor                                   Item        M      Item H
Factor 1 (Factor H = .67, rho = .92)     ATLL1       4.11   .71
                                         ATLL2       4.28   .72
                                         ATLL3       4.52   .68
                                         ATLL4       3.69   .66
                                         ATLL5       4.14   .65
                                         Intended1   4.94   .61
                                         Intended2   4.32   .62
Factor 2 (Factor H = .66, rho = .96)     Ideal1      4.46   .64
                                         Ideal2      4.02   .65
                                         Ideal3      4.19   .65
                                         Ideal4      4.42   .69
                                         Ideal5      4.31   .68
                                         Vivid1      4.27   .69
                                         Vivid2      4.85   .67
                                         Vivid4      4.35   .68
                                         Vivid5      4.55   .69
                                         Ease1       4.10   .63
                                         Ease3       4.48   .68
                                         Ease4       3.89   .62
                                         Pos1        4.50   .62
Factor 3 (Factor H = .69, rho = .81)     Ought1      3.64   .69
                                         Ought2      3.35   .69
Factor 4 (Factor H = .69, rho = .81)     Visual1     5.03   .69
                                         Visual2     4.77   .69

Note. Homogeneity was set at a minimum of .60. ATLL = Attitudes to L2 Learning;
Ease = Ease of Using Imagery; Ideal = Ideal L2 Self; Intended = Intended
Effort; Ought = Ought-to L2 Self; Pos = Positive Change in the Future L2 Self-Image;
Visual = Visual Style; Vivid = Vividness of Imagery.
Confirmatory Factor Analysis
The results indicated that several scales analyzed in the initial study did not
seem to possess adequate psychometric properties to permit further analyses or
conclusions derived from them for the current study. The only scales that clearly
emerged from the analyses to this point were the Ideal L2 Self, the Ought-to L2
Self, Visual Style, and Intended Effort. This resulted in our design taking on a
more exploratory nature, as we had anticipated in our preregistration protocols.
Figure 3 Number of factors based on using scree plot, eigenvalues over 1.0 criterion,
parallel analysis, optimal coordinates, and acceleration factor.
Because our goal was to approximate the model in the initial study as closely
as possible, we conducted a confirmatory factor analysis on the four scales
emerging from our analysis. In all subsequent analyses, we further adjusted
standard errors to correct for clustering within the 44 classes in our sample
(ICC = .117, Deff = 4.33). We also had to exclude one item from the Visual
Style scale (Visual2; see Appendix S1) to improve convergent validity. This
scale also had the lowest reliability in the initial study (see Table 2). It was
likely that we had to exclude this item because it was the only item specific
to using English fluently, whereas all other items in that scale were concerned
with using English successfully or skillfully more generally. In any case, we
had no reason to believe that dropping one item had a substantial impact in a
conceptual replication.
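The design effect reported above is consistent with the common approximation Deff = 1 + (m − 1) × ICC, where m is the average cluster size. The sketch below applies that approximation to the figures given in the text (1,297 students in 44 classes, ICC = .117). That this is the formula behind the reported value is an assumption, and the function name is hypothetical.

```python
def design_effect(n_total, n_clusters, icc):
    """Design effect approximation based on the average cluster size."""
    average_cluster_size = n_total / n_clusters
    return 1 + (average_cluster_size - 1) * icc


# Figures reported in the text.
deff = design_effect(n_total=1297, n_clusters=44, icc=0.117)

# Effective sample size implied by the design effect.
effective_n = 1297 / deff

print(round(deff, 2))
print(round(effective_n))
```

A design effect of this size means the clustered sample carries roughly as much information as a few hundred independent observations, which is why the standard errors were adjusted for clustering.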
Table 4 Reliability, validity, and interconstruct correlations for the scales of the
measurement model

Scale      CR     AVE    Ought   Ideal   Intended   Visual
Ought      .851   .494   .703
Ideal      .923   .706   .319    .840
Intended   .889   .615   .395    .767    .784
Visual     .781   .475   .329    .559    .664       .689

Note. Values in the diagonal are the square roots of their respective average variance
extracted (AVE). CR = construct reliability; Ought = Ought-to L2 Self; Ideal = Ideal
L2 Self; Intended = Intended Effort; Visual = Visual Style.
Table 4 shows that all variables exhibited acceptable construct reliability
(over the .70 threshold), though the AVEs of the Ought-to L2 Self and Visual
Style scales were just under the .50 threshold. Discriminant validity was satis-
fied because the square roots of AVEs (shown in the diagonal of Table 4) were
higher than their respective inter-construct correlations. Overall, this model
showed a reasonable global fit, χ²(164) = 1,132.545, p < .001, CFI = .954,
TLI = .947, though the root mean square error of approximation was borderline,
.067, 95% CI [.064, .071], p < .001. Inspection of the normalized
residuals showed that most residuals were within ±2.0, with the highest being
2.1 (between Ought3 and Intended5). All standardized factor loadings were
statistically significant, and most were above .70, with the lowest being .46 for
one Ought-to L2 Self scale item (see Table 5).
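To make the reliability and validity criteria above concrete, the sketch below computes construct reliability, AVE, and the square-root-of-AVE discriminant validity check from standardized loadings. It uses the rounded Ideal L2 Self loadings from Table 5 and the Ideal L2 Self correlations from Table 4, and it assumes the standard composite reliability and AVE formulas for standardized loadings with uncorrelated errors. The function names are hypothetical.

```python
import math


def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def construct_reliability(loadings):
    """Composite reliability from standardized loadings (uncorrelated errors assumed)."""
    total = sum(loadings)
    error_variance = sum(1 - l ** 2 for l in loadings)
    return total ** 2 / (total ** 2 + error_variance)


# Standardized loadings of the five Ideal L2 Self items (Table 5, rounded).
ideal_loadings = [0.83, 0.84, 0.81, 0.87, 0.85]

ave = average_variance_extracted(ideal_loadings)
cr = construct_reliability(ideal_loadings)
sqrt_ave = math.sqrt(ave)

# Correlations of Ideal L2 Self with Ought, Intended, and Visual (Table 4).
ideal_correlations = [0.319, 0.767, 0.559]
discriminant_ok = all(sqrt_ave > r for r in ideal_correlations)

print(round(cr, 3), round(ave, 3), round(sqrt_ave, 3), discriminant_ok)
```

With these rounded loadings, the computed values match the Ideal L2 Self row of Table 4 to three decimals. Other scales may differ slightly in the third decimal because the published loadings are rounded.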
The Structural Model
Two Potential Models
As we discussed above, most of the recent SEM literature in the L2 motiva-
tion field has relied on testing a single model hypothesized by researchers,
though this risks confirmation bias. In the vast majority of SEM studies, re-
searchers have hypothesized the ideal L2 self, the ought-to L2 self, and the
L2 learning experience to have a causal effect on intended effort (see Al-
Hoorie, 2018). As we discussed above, intended effort may also be conceptu-
alized as a potential antecedent that could facilitate engagement, motivation,
and language development. In this study, therefore, we tested two competing
models:
1. Following the model in the initial study, we hypothesized that Intended
Effort is the outcome of the Ideal L2 Self and the Ought-to L2 Self.
2. Extending the model in Oga-Baldwin and Nakata (2017), we hypothesized
that Intended Effort is an antecedent of the Ideal L2 Self and the Ought-to
L2 Self.
In both models, we treated Visual Style as a predictor of the Ideal L2 Self and the
Ought-to L2 Self following Dörnyei and Chan's (2013) argument that learners
with a visual sensory style preference are more likely to develop stronger
self-guides (see also Dörnyei, 2014). We also added L2 achievement as an outcome
variable in both models.

Table 5 Standardized and unstandardized factor loadings, standard errors, and z ratios
of scales in the measurement model

Scale              Item        β     B      SE      z
Ideal L2 Self      Ideal1      .83   –      0.011   73.27
                   Ideal2      .84   1.01   0.011   75.45
                   Ideal3      .81   0.98   0.013   62.72
                   Ideal4      .87   1.05   0.009   93.33
                   Ideal5      .85   1.02   0.011   77.57
Ought-to L2 Self   Ought1      .77   –      0.015   51.70
                   Ought2      .76   0.98   0.017   43.15
                   Ought3      .75   0.96   0.017   44.75
                   Ought4      .46   0.59   0.027   17.32
                   Ought5      .78   1.01   0.013   59.28
                   Ought6      .65   0.84   0.021   31.21
Visual Style       Visual1     .72   –      0.017   42.25
                   Visual3     .61   0.85   0.027   22.46
                   Visual4     .80   1.11   0.014   57.67
                   Visual5     .62   0.86   0.021   29.04
Intended Effort    Intended1   .83   –      0.009   90.55
                   Intended2   .82   0.99   0.011   75.18
                   Intended3   .74   0.90   0.013   55.66
                   Intended4   .78   0.94   0.013   61.41
                   Intended5   .75   0.90   0.014   54.42

Note. All coefficients significant at the p < .001 level.
Figure 4 (see also Table 6) shows the results of these analyses. In both
models, the modification indices suggested one covariance term between two
Visual Style items, and all normalized residuals were within ±2.0. The model
hypothesizing Intended Effort as a predictor (Figure 4, Panel B) showed a better
overall fit and smaller Akaike information criterion and Bayesian information
criterion values, suggesting that it was more likely to be replicable. Controlling
for baseline achievement in the better fitting model (Table 6) had a minor effect
on most coefficients, with the important exception being a notable drop in the
path from the Ideal L2 Self to achievement from .44 to .09, while the initial
coefficient of the Ought-to L2 Self (–.10) became 0.

Figure 4 Two competing models. All coefficients significant at the p < .01 level unless
otherwise indicated. Akaike information criterion (AIC) and Bayesian information
criterion (BIC) were obtained through robust maximum likelihood estimation. CFI =
comparative fit index; TLI = Tucker-Lewis index; RMSEA = root mean square error
of approximation.

Table 6 Standardized and unstandardized structural coefficients, standard errors, and z ratios with and without controlling for baseline

                                            Without controlling for baseline      With controlling for baseline
Path                                        β     B      SE     z       p         β     B      SE     z       p
Ideal L2 Self → Final achievement           .44   1.24   0.13   8.88    <.001     .09   0.23   0.05   5.15    <.001
Ought-to L2 Self → Final achievement        .10   0.29   0.09   3.40    .001      .00   0.00   0.05   0.05    .957
Intended Effort → Ideal L2 Self             .62   0.62   0.05   12.31   <.001     .58   0.59   0.05   12.47   <.001
Intended Effort → Ought-to L2 Self          .24   0.23   0.06   3.55    <.001     .27   0.24   0.06   3.93    <.001
Visual Style → Ideal L2 Self                .19   0.28   0.08   3.49    <.001     .19   0.31   0.09   3.45    .001
Visual Style → Ought-to L2 Self             .20   0.28   0.09   3.06    .002      .23   0.33   0.09   3.64    <.001
Baseline Achievement → Ideal L2 Self                                              .10   0.04   0.01   5.14    <.001
Baseline Achievement → Ought-to L2 Self                                           .14   0.05   0.01   5.22    <.001
Baseline Achievement → Intended Effort                                            .32   0.13   0.01   11.18   <.001
Baseline Achievement → Visual Style                                               .31   0.08   0.01   8.30    <.001
Baseline Achievement → Final achievement                                          .81   0.82   0.03   28.43   <.001
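The comparison of information criteria described above follows a simple rule: with the same data, the model with the smaller AIC or BIC is preferred. The sketch below illustrates the two formulas, AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L. The log-likelihoods and parameter counts are hypothetical placeholders chosen for illustration and are not values from this study; only the sample size comes from the text.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


n_obs = 1297  # sample size reported in the text

# Hypothetical values for illustration only; not taken from the study.
models = {
    "model_a": {"log_likelihood": -41250.0, "n_params": 62},
    "model_b": {"log_likelihood": -41180.0, "n_params": 62},
}

scores = {
    name: {
        "AIC": aic(m["log_likelihood"], m["n_params"]),
        "BIC": bic(m["log_likelihood"], m["n_params"], n_obs),
    }
    for name, m in models.items()
}

# The model with the smallest BIC is preferred.
preferred = min(scores, key=lambda name: scores[name]["BIC"])
print(scores, preferred)
```

Because both hypothetical models have the same number of parameters, the penalty terms cancel and the comparison reduces to the difference in log-likelihood.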
Measurement Invariance
We investigated measurement invariance before conducting gender compar-
isons. We constrained both factor loadings and item thresholds to be equal
across the two genders. This model, however, failed to satisfy invariance,
χ²(111) = 294.277, p < .001. Subsequently, we freed the thresholds except
for one in each item in addition to another in the marker variable. This procedure
resulted in invariance being satisfied, χ²(88) = 104.234, p = .114. These
results indicated that weak invariance (i.e., factor loadings) was satisfied, but
strong invariance (i.e., item thresholds) was only partially satisfied.
Gender Differences
Because measurement invariance was partially satisfied across the two genders,
it became justifiable to conduct a multigroup SEM. We conducted this procedure
by constraining each path to be equal across the two genders and then examining
whether the model fit deteriorated significantly as a result of this equality
constraint. Deterioration of model fit would have indicated that the paths were
not in fact equal.
Table 7 presents the results of these additional models. The results showed
that one path was significantly different across the two genders and remained
significant after Bonferroni correction. This finding suggested a stronger relationship
between Intended Effort and the Ought-to L2 Self scales for males
than for females. No other path was significantly different between the two
genders.
Table 7 Standardized coefficients, standard errors, and Wald tests of parameter con-
Summary of the Main Findings
This article has presented a preregistered conceptual replication of a recent
study by You et al. (2016). Our results do not point to any gender difference
in language motivation or achievement that is clearly attributable to the role of
vision. On the other hand, our results do point to the superiority of a model
positing that intended effort is an antecedent, rather than an outcome, of motivation.
Overall, our final SEM model bears little resemblance to that of the
initial study primarily due to the number of scales that we eventually used.
The initial study used 10 scales, but our analyses revealed only four scales
showing adequate psychometric properties. This points to the urgent need for
further psychometric research to refine scales that are frequently cited in the
L2 motivation literature. Several novel lines of research have begun to do this
(e.g., Papi, Bondarenko, Mansouri, Feng, & Jiang, 2019; Teimouri, 2017).
At the same time, we must acknowledge the possibility that the results
of the initial study would still have differed from ours even if You et al. had
followed the same procedures as we used. This could be due to cultural differences
between the Chinese and South Korean populations, though, as we explained
above, we did not consider this a very likely scenario due to the usefulness of
and emphasis on English learning in the two neighboring countries. A more
likely explanation for a potentially genuine, rather than a methodologically
artifactual, difference between the two studies is the type of participants. There
was a difference in the gender ratio of the two samples, but it was rather minor.
A clearer difference, however, was in the age of the participants (12–14 years
old in our sample and 16–20 years in the initial study). This age variable
could have contributed to a genuine interstudy variation. However, the role
of age in language learning motivation is still not well understood, and this
is primarily due to the sampling methods that are (understandably) limited
by access and availability (see Boo et al., 2015). A more systematic research
program targeting learners with different ages, preferably employing multisite
registered replication reports (see below), should elucidate the contribution of
this variable to language learning motivation.
Vision and Gender
In our study, we were able to use only one of the three vision-related scales
from the initial study, namely, Visual Style. Therefore, we were unable to
examine the results obtained in the initial study in relation to either Vividness
of Imagery or Auditory Style. We leave it to future research, first, to develop
scales with adequate psychometric properties for these two constructs and,
second, to attempt to replicate You et al.’s findings from them. As for Visual
Style, our results show that this variable predicted the Ideal L2 Self rather
modestly (β = .19). This magnitude was equivalent for the two genders. We
did not, therefore, find support in our data for the idea that females are better
learners due to their superior vision-related capabilities. At the same time,
our findings support those found by Kim and Kim (2011), who showed that a
dominant visual preference is a weak variable in L2 learning. Other researchers
(e.g., Lamb, 2012; Moskovsky et al., 2016; Papi & Abdollahzadeh, 2012) have
advanced similar arguments about vision and self-guides in recent years.
Although our findings do not support a gender difference in relation to
visual style, our exploratory results do point toward one potential difference.
The relationship between Intended Effort and the Ought-to L2 Self scales was
markedly stronger for males. Assuming causality, this pattern implies that, for
males, the impact of the desire “to meet expectations and to avoid possible
negative outcomes” (Dörnyei, 2009, p. 29) that are “imposed” (p. 32) by
peers, parents, and authoritative figures becomes relevant only when sufficient
intended effort to learn the language exists in the first place. For females, how-
ever, this social element of the ought self-guide seems more persistent and not
necessarily connected to an initial intended effort. This interpretation seems
consistent with behavior genetics findings. According to Bouchard (2004; see
also Al-Hoorie, 2015), social attitudes are a domain where gender differences
clearly emerge—with heritabilities of .65 for males but only .45 for females,
suggesting that environmental influences may be more salient for females. This
implies that females tend to be more open to social persuasion than males.
This interpretation leads us to advance the ought gender-difference hypothesis:
The development of an ought-to L2 self is more likely to be facilitated
by the presence of a strong intended effort for males, but such a moderating
effect is less likely to exist for females. This hypothesis might help explain the
conflicting results (see Al-Hoorie, 2018) obtained for this construct to date.
We reiterate that this interpretation assumes that this exploratory finding will
replicate in future research and that this relationship is causal, which is yet to
be demonstrated experimentally in the language motivation field.
Intended Effort and Attitudes to L2 Learning
Another curious finding from our study was the lack of discriminant validity
between the Intended Effort and the Attitudes to L2 Learning (also called the
L2 Learning Experience) scales. Interestingly, Attitudes to L2 Learning has
frequently been described as the strongest predictor of Intended Effort (e.g.,
Dörnyei, 2019; Lamb, 2012). Our results suggest that this predictive validity
may be inflated because these two scales do not represent two clearly unidi-
mensional constructs. For instance, the items “I always look forward to English
classes” and “I would like to spend lots of time studying English” appeared in
the Attitudes to L2 Learning and the Intended Effort scales, respectively, in the
initial study. However, our results suggest that looking forward to something
and the desire to spend lots of time doing it do not psychometrically represent
two latent constructs. Indeed, it is rather mundane to find out that those who
self-report looking forward to something also self-report a desire to spend a lot
of time doing it (see also Gardner, 2010, p. 73, for a similar critique).
As an illustration of the prevalence of discriminant validity issues, Al-
Hoorie (2018), in a meta-analysis, compared the correlation between the In-
tended Effort and the L2 Learning Experience scales in studies that used a
factor-analytic procedure (to psychometrically ascertain the unidimensional-
ity of the two scales before examining their correlation) versus studies that
did not. The results showed a significant drop from .68 for studies without a
factor-analytic procedure to .41 for studies that implemented one. Apparently,
researchers who conducted an appropriate factor-analytic procedure were able
to detect and exclude problematic items (e.g., items loading on both constructs)
before computing correlation coefficients. An example cited in Al-Hoorie’s
(2018) meta-analysis demonstrated the severity of this problem: One study
used the item “Learning English is one of the most important aspects in my
life” in the L2 Learning Experience scale, and the item “It is extremely im-
portant for me to learn English” in the Intended Effort scale. Because these
two items are almost identical, it would be very hard to imagine that they un-
derlie two distinct constructs. Overall, these combined results suggest that the
L2 learning experience’s predictive validity of intended effort seems, in many
reports, exaggerated by some wording overlap in the items of these two scales.
Furthermore, in our conceptual replication, we added L2 achievement as
an outcome measure. As we explained above, we do not consider this decision
a deviation from the initial study but an extension of it. After all, with a
reduction from 10 to four scales, we were far beyond the point of reproducing
the initial study’s original model at any level of accuracy. Still, this additional
variable allowed us to engage in model comparison procedures that led to some
interesting insights. That is, we speculated that the Intended Effort variable,
commonly used as an outcome measure, might be argued to signal initial interest
in the language and in engaging in learning behaviors. Based on this, general
items like “I am prepared to expend a lot of effort in learning English” can be
said to mark the initial interest required to engage in the learning process and
its demands. In contrast, more specific (particularly increasingly challenging)
items are then more suitable for tapping into the outcome of this motivation
(Al-Hoorie, 2018).
As an illustration of this notion, consider an example from sports, one
field from which the L2 vision tradition has borrowed insights and metaphors.
Initially, it makes sense to ask prospective trainees whether they are willing to
expend a lot of effort and to spend lots of time practicing a certain sport. Without
such initial willingness, serious commitment and dedication to that sport and
the effort required to attain competence are unlikely. After motivation takes
shape, however, it makes less sense to keep asking the same generic questions
about willingness to expend effort. Instead, more specific and increasingly
challenging questions are more appropriate as a criterion gauging trainees’
level of motivation, such as commitment to practice despite cold weather, on
holidays, for longer hours, and the like. Such specificity is most likely a better
test of trainees’ motivation and more clearly distinguishes it from the motivation
of other trainees.
Going back to the language learning domain, it seems that general intended
effort should similarly be considered as signaling initial interest in learning
the language and putting in the effort to engage with its demands. Once
engagement actually does occur, there will be a dynamic interaction between
motivation (e.g., the ideal L2 self, ought-to L2 self, integrative motivation,
intrinsic motivation) and task demands, leading to continuous recalibration
of that motivational construct, such as forming realistic aspirations and ability
expectations (Bandura, 1986, 1997). An appropriate test of the value of the
motivational construct in question, subsequently, should be its ability to pre-
dict more specific and challenging motivational intentions rather than generic
ones again. Examples include willingness to engage in optional work (e.g.,
additional homework tasks) versus only required work, and in agentic learning
(e.g., self-directed and autonomous activities) versus only prescribed activities.
More highly motivated learners are expected to exhibit more willingness to
engage in these challenging tasks (see Al-Hoorie, 2018, for a discussion) and,
as a result, attain higher competence. In our case, not only did our concep-
tualization of (generic) intended effort as an antecedent fit the data, but—to
our surprise—it also fit better than the more conventional model where it has
been conceptualized as an outcome variable. We repeat our caveat that these
findings are based on exploratory analyses, and so future replication research
should attempt to test them further, ideally longitudinally. We also encourage
researchers to investigate the possibility of nonrecursive paths, where causality
is reciprocal rather than unidirectional; unidirectionality has always been assumed in our field.
Potential Confounds in Nonexperimental Research
When it comes to self-guides predicting achievement, our results also provide
some interesting insights. Panel B of Figure 4 shows that scores on the Ideal
L2 Self and the Ought-to L2 Self scales initially predicted achievement at
β = .44 and β = –.10, respectively. However, after we controlled for baseline
achievement (see Table 6), these coefficients dropped to β = .09 and β = .00,
respectively. Both adjusted coefficients fell within the 95% confidence intervals
of a recent meta-analysis (Al-Hoorie, 2018). The drop after controlling for
baseline achievement has been observed in previous research as well (e.g., Joe
et al., 2017; Yun et al., 2018), and suggests that many correlational results in the
literature could be inflated due to lack of baseline adjustment. This is especially
applicable in observational research (as opposed to experimental research) in
which researchers do not manipulate conditions, assign participants to groups
randomly, or use matched randomization. In observational research, various
confounds can creep in and seriously inflate (and occasionally suppress) the
relationship that one intends to observe (e.g., Beleche, Fairris, & Marks, 2012,
p. 709). Using the same logic, it is not implausible that the correlation between
a hypothesized motivational construct and achievement might be confounded
by similar variables. Despite the risk of statistical overcontrol (see Bandura,
1997, p. 69), ability level and previous achievement are strong candidates for
potential confounding variables, and so it is hard to rule out the possibility
that success breeds success and that the hypothesized motivational construct
is simply a byproduct of this process (e.g., Hiver et al., 2019). Correlation is
not causation. In short, although our results support a modest role for the Ideal
L2 Self scale (β = .09, or less than 1% of the variance), they show that the
Ought-to L2 Self scale results played virtually no role in predicting L2 achievement
(β = .00).
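The logic of baseline adjustment can be sketched with simulated data. The example below is a hypothetical illustration, not a reanalysis of the present dataset: the path values (0.5, 0.7, 0.05) and the residual noise are assumptions chosen so that prior achievement drives both the motivational measure and later achievement. It compares the unadjusted standardized coefficient with the coefficient obtained after controlling for baseline, using the standard two-predictor formula. The helper variance_share simply squares a standardized coefficient, which is only a rough guide to explained variance when predictors are correlated.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardized_betas(n=5000, seed=0):
    """Return (unadjusted, adjusted) standardized betas for a motivation measure.

    Assumed data-generating model (illustrative values only):
      motivation  = 0.5 * baseline + unique part
      achievement = 0.7 * baseline + 0.05 * motivation + noise
    """
    rng = random.Random(seed)
    baseline, motivation, achievement = [], [], []
    for _ in range(n):
        b = rng.gauss(0, 1)
        m = 0.5 * b + math.sqrt(1 - 0.5**2) * rng.gauss(0, 1)
        y = 0.7 * b + 0.05 * m + rng.gauss(0, 0.6)
        baseline.append(b)
        motivation.append(m)
        achievement.append(y)

    r_ym = pearson(achievement, motivation)
    r_yb = pearson(achievement, baseline)
    r_mb = pearson(motivation, baseline)

    # With one predictor, the standardized beta equals the correlation
    unadjusted = r_ym
    # Standardized beta for motivation with baseline as a second predictor
    adjusted = (r_ym - r_yb * r_mb) / (1 - r_mb**2)
    return unadjusted, adjusted


def variance_share(beta):
    """Squared standardized coefficient, a rough share of variance."""
    return beta**2


unadjusted, adjusted = standardized_betas()
print(f"unadjusted beta: {unadjusted:.2f}")
print(f"adjusted beta:   {adjusted:.2f}")
print(f"variance share for beta = .09: {variance_share(0.09):.4f}")
```

In this setup, most of the unadjusted coefficient reflects the shared dependence on baseline achievement, so the adjusted coefficient should shrink toward the small direct path that was built into the simulation.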
Replication, Future Research, and Open Science Practices
To our knowledge, this article is the first preregistered replication study, and one
of the few self-labeled replication attempts, in the language motivation field. We
hope that greater numbers of language motivation researchers will take up this
initiative by engaging in direct, partial, and conceptual replications of critical
results in the literature (see Marsden, Morgan-Short, Thompson, & Abugaber,
2018, for a discussion of the kinds of findings that warrant replication) and by
encouraging their graduate students to do so. It is vital for a field to keep track of
the replicability of its findings. If results do not replicate, the field can engage in
self-correction efforts to reduce the chances of a perceived replication crisis—
as has been reported in a number of disciplines including biology, genetics,
medicine, and psychology (Schooler, 2014).
We emphasize, yet again, that we do not claim that You et al.’s results were
unfounded, were merely an artifact of their methodology, or were flawed. It
is by no means unusual for replication research to obtain divergent results,
particularly when the replication is conceptual, and so this should not be
viewed as dismissing the value of the initial study. Instead, the natural im-
plication in such cases is the need for further research and more robust designs
to better understand the phenomena in question, especially because SEM in
particular relies on a set of stringent but often unmet assumptions (Goodboy
& Kline, 2017, p. 69). Many language motivation researchers may be
reluctant to do this considering the Kuhnian normal science status that the
L2 motivational self-system has reached, which can give the perception that
it is above critique (see Henry & Cliffordson, 2017). To ensure meaningful
interpretation of existing results, a systematic replication program would al-
low our field to take stock of the best empirical evidence for a cumulative
Of course, all this points to the need to replicate our own model as well.
Nevertheless, our recommendation for future research would not in fact be to
suggest that scholars conduct yet more SEM studies, at least not initially and
not in isolation. A more fruitful direction would be to first establish causality
among variables experimentally (for similar calls, see Al-Hoorie, 2018; Lamb,
2017; Yun et al., 2018). We also recommend that motivation researchers delib-
erately advance hypotheses for future testing as we have done in this article. For
one reason or another, and despite a recent surge in publications (see Boo et al.,
2015), the language motivation field has not developed a culture of hypothesis
testing. In fact, unlike various sister L2 subdisciplines where hypothesis testing
is routine, one would be hard pressed to think of a hypothesis explicitly for-
mulated in such a way that it could be tested independently in L2 motivation
Whether choosing to engage in SEM, experiments, or other types of re-
search, we strongly encourage future researchers to preregister their designs
and analysis plans to maximize transparency and to minimize the possibil-
ity of (unintentional) questionable research practices, such as reporting only
significant results or reframing the research questions and hypotheses to fit
the results already obtained, which is often referred to as “hypothesizing after
the results are known” (see Kerr, 1998). One valuable initiative in this regard
is the Registered Reports route at Language Learning (Marsden, Morgan-Short,
Trofimovich, & Ellis, 2018), in which authors are invited to submit
the full method and analysis protocol of their proposed study—along with its
conceptual justification—before the actual data collection. Submissions that
pass an initial peer review stage are conferred an in-principle acceptance sta-
tus, guaranteeing subsequent acceptance by the journal regardless of the re-
sults, provided that researchers adhere to the original method and analysis
protocols. This approach should encourage more researchers to publish their
reports irrespective of the findings that they obtain and, it is hoped, should
minimize publication bias. A particularly strong approach, achievable within
the Registered Report publication route, is a multisite replication in which re-
searchers at different sites attempt to replicate one finding independently while
adhering to preregistered protocols (for more details, see Morgan-Short et al.,
2018).
Finally, we would also encourage researchers to make their datasets publicly
available whenever possible. The uptake of these open science initiatives has
slowly started to grow, evidenced by the increasing number of leading journals
in the field that have begun to champion them, and we look forward to these
initiatives having a net positive effect on research practices and empirical
evidence in the language motivation field.
All in all, the involvement of vision, imagery, and sensory modalities has
been argued to represent a “key aspect” (Dörnyei, 2014, p. 10) of future
self-guides, particularly the ideal L2 self. However, the present study did not offer
unambiguous support for a vision scale that is distinct from the ideal L2 self,
itself shown to be a weak predictor of academic achievement in the L2. On
balance, then, if there is one thing to which the evidence available so far may
point, it is that the notion of centrality of vision, involving tangible images and
senses, to language learning motivation deserves further substantiating.
Final revised version accepted 3 June 2019
Notes
1 In current language learning motivation literature, no functional distinction is made
between mental imagery and vision. These terms are used interchangeably and
often alternatively for stylistic reasons. Throughout this study, we adopted the term
vision in line with the initial study.
2 The field of sports psychology has adopted the term motor imagery to refer to any
mental practice that involves imagination or vision.
3 We have presented several mathematical formulas in the text. We have chosen to
present these because standard SEM software does not calculate these indices and,
therefore, researchers must rely on these formulas to calculate these indices
themselves. Lowry and Gaskin (2014) have provided an accessible introduction to
these concepts, and Gaskin’s (2016) Stats Tools Package calculates most of the
formulas presented in this article within an Excel spreadsheet.
4 After we had concluded our data collection, we realized that the initial study had
used a 6-point Likert response format. This minor difference likely had minimal
impact on our results, especially because research by Felix (2011) showed hardly
any difference whether a 3-, 5-, 7-, or even 9-point scale is used, suggesting that
“scale width does not influence important indicators such as means, standard
deviations and skewness” (p. 143).
5 Readers interested in this and similar topics may consult the recently launched
journal Advances in Methods and Practices in Psychological Science.
References
Adolphs, S., Clark, L., Dörnyei, Z., Glover, T., Henry, A., Muir, C., . . . Valstar, M.
(2018). Digital innovations in L2 motivation: Harnessing the power of the Ideal L2
Self. System, 78, 173–185.
Albalawi, F. H. E. (2018). L2 demotivation among Saudi learners of English: The role
of language learning mindsets (Unpublished doctoral dissertation). Nottingham,
UK: University of Nottingham.
Al-Hoorie, A. H. (2015). Human agency: Does the beach ball have free will? In Z.
Dörnyei, P. MacIntyre, & A. Henry (Eds.), Motivational dynamics in language
learning (pp. 55–72). Bristol, UK: Multilingual Matters.
Al-Hoorie, A. H. (2016a). Unconscious motivation. Part I: Implicit attitudes toward L2
speakers. Studies in Second Language Learning and Teaching, 6, 423–454.
Al-Hoorie, A. H. (2016b). Unconscious motivation. Part II: Implicit attitudes and L2
achievement. Studies in Second Language Learning and Teaching, 6, 619–649.
Al-Hoorie, A. H. (2018). The L2 motivational self system: A meta-analysis. Studies in
Second Language Learning and Teaching, 8, 721–754.
Al-Hoorie, A. H., & Vitta, J. P. (in press). The seven sins of L2 research: A review of
30 journals’ statistical quality and their CiteScore, SJR, SNIP, JCR impact factors.
Language Teaching Research. Advance online publication
Al-Shehri, A. S. (2009). Motivation and vision: The relation between the ideal L2 self,
imagination and visual style. In Z. Dörnyei & E. Ushioda (Eds.), Motivation,
language identity and the L2 self (pp. 164–171). Bristol, UK: Multilingual Matters.
Arbuckle, J. L. (2013). IBM SPSS Amos 22 user’s guide. Meadville, PA: Amos
Development Corporation.
Bandura, A. (1986). Social foundations of thought and action: A social cognitive
theory. Englewood Cliffs, NJ: Prentice-Hall.
Bandura, A. (1997). Self-efficacy: The exercise of control. New York, NY: Freeman.
Barcikowski, R. S. (1981). Statistical power with group mean as the unit of analysis.
Journal of Educational Statistics, 6, 267–285.
Beleche, T., Fairris, D., & Marks, M. (2012). Do course evaluations truly reflect
student learning? Evidence from an objectively graded post-test. Economics of
Education Review, 31, 709–719.
Bohannon, J. (2014). Replication effort provokes praise—and ‘bullying’ charges.
Science, 344, 788–789.
Bollen, K. A., & Pearl, J. (2013). Eight myths about causality and structural equation
models. In S. Morgan (Ed.), Handbook of causal analysis for social research
(pp. 301–328). New York, NY: Springer.
Boo, Z., Dörnyei, Z., & Ryan, S. (2015). L2 motivation research 2005–2014:
Understanding a publication surge and a changing landscape. System, 55, 145–157.
Bouchard, T. J., Jr. (2004). Genetic influence on human psychological traits: A survey.
Current Directions in Psychological Science, 13, 148–151.
Brandt, M., IJzerman, H., Dijksterhuis, A., Farach, F., Geller, J., Giner-Sorolla, R., . . .
van’t Veer, A. (2014). The replication recipe: What makes for a convincing
replication? Journal of Experimental Social Psychology, 50, 217–224.
Brown, T. A. (2015). Confirmatory factor analysis for applied research (2nd ed.). New
York, NY: Guilford Press.
Chan, L. (2014). Effects of an imagery training strategy on Chinese university
students’ possible second language selves and learning experiences. In K. Csizér &
M. Magid (Eds.), The impact of self-concept on language learning (pp. 357–376).
Bristol, UK: Multilingual Matters.
Csizér, K., & Kormos, J. (2009). Learning experiences, selves and motivated learning
behavior: A comparative analysis of structural models for Hungarian secondary and
university learners of English. In Z. Dörnyei & E. Ushioda (Eds.), Motivation,
language identity and the L2 self (pp. 98–119). Bristol, UK: Multilingual Matters.
Csizér, K., & Lukács, G. (2010). The comparative analysis of motivation, attitudes and
selves: The case of English and German in Hungary. System, 38, 1–13.
Courtney, M. G. R. (2013). Determining the number of factors to retain in EFA: Using
the SPSS R-menu v2.0 to make more judicious estimations. Practical Assessment,
Research & Evaluation, 18, 1–15. Retrieved from
Davidov, E., Meuleman, B., Cieciuch, J., Schmidt, P., & Billiet, J. (2014).
Measurement equivalence in cross-national research. Annual Review of Sociology,
40, 55–75.
Dörnyei, Z. (2005). The psychology of the language learner: Individual differences in
second language acquisition. London, UK: Erlbaum.
Dörnyei, Z. (2009). The L2 motivational self system. In Z. Dörnyei & E. Ushioda
(Eds.), Motivation, language identity and the L2 self (pp. 9–42). Bristol, UK:
Multilingual Matters.
Dörnyei, Z. (2014). Future self-guides and vision. In K. Csizér & M. Magid (Eds.),
The impact of self-concept on language learning (pp. 7–18). Bristol, UK:
Multilingual Matters.
Dörnyei, Z. (2019). Towards a better understanding of the L2 learning experience, the
Cinderella of the L2 motivational self system. Studies in Second Language
Learning and Teaching, 9, 21–32.
Dörnyei, Z., & Chan, L. (2013). Motivation and vision: An analysis of future L2 self
images, sensory styles, and imagery capacity across two target languages. Language
Learning, 63, 437–462.
Dörnyei, Z., Henry, A., & Muir, C. (2016). Motivational currents in language
learning: Frameworks for focused interventions. New York, NY: Routledge.
Dörnyei, Z., & Kubanyiova, M. (2014). Motivating learners, motivating teachers:
Building vision in the language classroom. Cambridge, UK: Cambridge University
Press.
Dörnyei, Z., & Ushioda, E. (2011). Teaching and researching motivation (2nd ed.).
Harlow, UK: Pearson.
Dunlosky, J., Rawson, K. A., Marsh, E. J., Nathan, M. J., & Willingham, D. T. (2013).
Improving students’ learning with effective learning techniques: Promising
directions from cognitive and educational psychology. Psychological Science in the
Public Interest, 14, 4–58.
Eisinga, R., Grotenhuis, M., & Pelzer, B. (2013). The reliability of a two-item scale:
Pearson, Cronbach, or Spearman-Brown? International Journal of Public Health,
58, 637–642.
Felix, R. (2011). The impact of scale width on responses for multi-item, self-report
measures. Journal of Targeting, Measurement and Analysis for Marketing, 19,
Fishbein, M., & Ajzen, I. (2010). Predicting and changing behavior: The reasoned
action approach. New York, NY: Psychology Press.
Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with
unobservable variables and measurement error. Journal of Marketing Research, 18,
Gardner, R. C. (1985). Social psychology and second language learning: The role of
attitudes and motivation. London, UK: Edward Arnold.
Gardner, R. C. (2010). Motivation and second language acquisition: The
socio-educational model. New York, NY: Peter Lang.
Gaskin, J. (2016). Stats Tools Package [Computer software]. Retrieved from
Gelman, A., & Stern, H. S. (2006). The difference between “significant” and “not
significant” is not itself statistically significant. The American Statistician, 60,
Goodboy, A. K., & Kline, R. B. (2017). Statistical and practical concerns with
published communication research featuring structural equation modeling.
Communication Research Reports, 34, 68–77.
Hadfield, J., & Dörnyei, Z. (2013). Motivating learning. New York, NY: Routledge.
Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data
analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.
Henry, A., & Cliffordson, C. (2017). The impact of out-of-school factors on motivation
to learn English: Self-discrepancies, beliefs, and experiences of self-authenticity.
Applied Linguistics, 38, 713–736.
Higgins, E. T. (1987). Self-discrepancy: A theory relating self and affect.
Psychological Review, 94, 319–340.
Higgins, E. T. (1996). The “self digest”: Self-knowledge serving self-regulatory
functions. Journal of Personality and Social Psychology, 71, 1062–1083.
Higgins, E. T. (1998). Promotion and prevention: Regulatory focus as a motivational
principle. Advances in Experimental Social Psychology, 30, 1–46.
Higgins, E. T. (2014). Beyond pleasure and pain: How motivation works. Oxford, UK:
Oxford University Press.
Hiver, P., & Al-Hoorie, A. H. (2016). Putting complexity theory into practice: A
dynamic ensemble for second language research. The Modern Language Journal,
100, 741–756.
Hiver, P., & Al-Hoorie, A. H. (2020). Research methods for complexity theory in
applied linguistics. Bristol, UK: Multilingual Matters.
Hiver, P., Obando, G., Sang, Y., Tahmouresi, S., Zhou, A., & Zhou, Y. (2019).
Reframing the L2 learning experience as narrative reconstructions of classroom
learning. Studies in Second Language Learning and Teaching, 9, 85–118.
Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). New
York, NY: Routledge.
Ibrahim, Z., & Al-Hoorie, A. H. (2019). Shared, sustained flow: Triggering motivation
with collaborative projects. ELT Journal, 73, 51–60.
Islam, M., Lamb, M., & Chambers, G. (2013). The L2 motivational self system and
national interest: A Pakistani perspective. System, 41, 231–244.
Joe, H.-K., Hiver, P., & Al-Hoorie, A. H. (2017). Classroom social climate,
self-determined motivation, willingness to communicate, and achievement: A study
of structural relationships in instructed second language settings. Learning and
Individual Differences, 53, 133–144.
Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality
and Social Psychology Review, 2, 196–217.
Kim, T.-Y. (2009). Korean elementary school students’ perceptual learning style, ideal
L2 self, and motivated behavior. Korean Journal of English Language and
Linguistics, 9, 461–486.
Kim, T.-Y., & Kim, Y.-K. (2014). A structural model for perceptual learning styles, the
ideal L2 self, motivated behavior, and English proficiency. System, 46, 14–27.
Kim, Y.-K., & Kim, T.-Y. (2011). The effect of Korean secondary school students’
perceptual learning styles and ideal L2 self on motivated L2 behavior and English
proficiency. Korean Journal of English Language and Linguistics, 11, 21–42.
Kline, R. B. (2016). Principles and practice of structural equation modeling (4th ed.).
New York, NY: Guilford Press.
Korea Statistical Information Service. (2016). Statistical database: Education, culture,
and science. Retrieved from
Korkmaz, S., Goksuluk, D., & Zararsiz, G. (2014). MVN: An R package for assessing
multivariate normality. The R Journal, 6, 151–162.
Kormos, J., & Csizér, K. (2008). Age-related differences in the motivation of learning
English as a foreign language: Attitudes, selves, and motivated learning behavior.
Language Learning, 58, 327–355.
Kormos, J., & Csizér, K. (2009). Learning experiences, selves and motivated learning
behaviour: A comparative analysis of structural models for Hungarian secondary
and university learners of English. In Z. Dörnyei & E. Ushioda (Eds.), Motivation,
language identity and the L2 self (pp. 98–119). Bristol, UK: Multilingual Matters.
Kreft, I. G. G., & de Leeuw, J. (1998). Introducing multilevel modeling. Thousand
Oaks, CA: Sage.
Lai, M. H. C., & Kwok, O. (2015). Examining the rule of thumb of not using multilevel
modeling: The “design effect smaller than two” rule. The Journal of Experimental
Education, 83, 423–438.
Lamb, M. (2012). A self system perspective on young adolescents’ motivation to learn
English in urban and rural settings. Language Learning, 62, 997–1023.
Lamb, M. (2017). The motivational dimension of language teaching. Language
Teaching, 50, 301–346.
Larson-Hall, J., & Plonsky, L. (2015). Reporting and interpreting quantitative research
findings: What gets reported and recommendations for the field. Language
Learning, 65, 127–159.
Levy, P. S., & Lemeshow, S. (2008). Sampling of populations: Methods and
applications (4th ed.). Hoboken, NJ: Wiley.
Li, C.-H. (2016). Confirmatory factor analysis with ordinal data: Comparing robust
maximum likelihood and diagonally weighted least squares. Behavior Research
Methods, 48, 936–949.
Lindsay, D. S. (2015). Replication in psychological science. Psychological Science,
26, 1827–1832.
Locke, E. A., & Latham, G. P. (1990). A theory of goal setting & task performance.
Englewood Cliffs, NJ: Prentice-Hall.
Lowry, P. B., & Gaskin, J. (2014). Partial least squares (PLS) structural equation
modeling (SEM) for building and testing behavioral causal theory: When to choose
it and how to use it. IEEE Transactions on Professional Communication, 57,
Lumley, T. (2010). Complex surveys: A guide to analysis using R. Oxford, UK: Wiley.
MacCallum, R. C., Wegener, D. T., Uchino, B. N., & Fabrigar, L. R. (1993). The
problem of equivalent models in applications of covariance structure analysis.
Psychological Bulletin, 114, 185–199.
MacIntyre, P. D., & Serroul, A. (2015). Motivation on a per-second timescale:
Examining approach-avoidance motivation during L2 task performance. In Z.
Dörnyei, P. MacIntyre, & A. Henry (Eds.), Motivational dynamics in language
learning (pp. 109–138). Bristol, UK: Multilingual Matters.
Mackay, J. (2014). Applications and implications of the L2 motivational self system in
a Catalan EFL context. In K. Csizér & M. Magid (Eds.), The impact of self-concept
on language learning (pp. 377–400). Bristol, UK: Multilingual Matters.
Mackey, A., & Gass, S. M. (2005). Second language research: Methodology and
design. Mahwah, NJ: Erlbaum.
Magid, M. (2009). The L2 motivational self system from a Chinese perspective:
A mixed methods study. Journal of Applied Linguistics, 6, 69–90.
Magid, M. (2014). A motivational programme for learners of English: An application
of the L2 motivational self system. In K. Csizér & M. Magid (Eds.), The impact of
self-concept on language learning (pp. 333–356). Bristol, UK: Multilingual Matters.
Magid, M., & Chan, L. (2012). Motivating English learners by helping them visualize
their ideal L2 self: Lessons from two motivational programmes. Innovation in
Language Learning and Teaching, 6, 113–125.
Mantle-Bromley, C. (1995). Positive attitudes and realistic beliefs: Links to
proficiency. The Modern Language Journal, 79, 372–386.
Markus, H., & Nurius, P. (1986). Possible selves. American Psychologist, 41, 954–969.
Markus, H., & Ruvolo, A. (1989). Possible selves: Personalized representations of
goals. In L. A. Pervin (Ed.), Goal concepts in personality and social psychology
(pp. 211–241). Hillsdale, NJ: Erlbaum.
Marsden, E., Mackey, A., & Plonsky, L. (2016). The IRIS Repository: Advancing
research practice and methodology. In A. Mackey & E. Marsden (Eds.), Advancing
methodology and practice: The IRIS Repository of Instruments for Research into
Second Languages (pp. 1–21). New York, NY: Routledge.
Marsden, E., Morgan-Short, K., Thompson, S., & Abugaber, D. (2018). Replication in
second language research: Narrative and systematic reviews and recommendations
for the field. Language Learning, 68, 321–391.
Marsden, E., Morgan-Short, K., Trofimovich, P., & Ellis, N. C. (2018). Introducing
registered reports at Language Learning: Promoting transparency, replication, and a
synthetic ethic in the language sciences. Language Learning, 68, 309–320.
Marsden, E., Thompson, S., & Plonsky, L. (2018). A methodological synthesis of
self-paced reading in second language research. Applied Psycholinguistics, 39,
Marsh, H. W., Balla, J. R., & McDonald, R. P. (1988). Goodness-of-fit indexes in
confirmatory factor analysis: The effect of sample size. Psychological Bulletin, 103,
McNeish, D. (2018). Thanks coefficient alpha, we’ll take it from here. Psychological
Methods, 23, 412–433.
Meijer, R. R., & Baneke, J. J. (2004). Analyzing psychopathology items: A case for
nonparametric item response theory modeling. Psychological Methods, 9, 354–368.
Molenaar, I. W., & Sijtsma, K. (2000). MSP5 for Windows: A program for Mokken
scale analysis for polytomous items (Version 5.0) [Computer software]. Groningen,
Netherlands: ProGAMMA.
Morgan-Short, K., Marsden, E., Heil, J., Issa, B. I. II, Leow, R. P., Mikhaylova, A., . . .
Szudarski, P. (2018). Multisite replication in second language acquisition research:
Attention to form during listening and reading comprehension. Language Learning,
68, 392–437.
Morin, L., & Latham, G. P. (2000). The effect of mental practice and goal setting as a
transfer of training intervention on supervisors’ self-efficacy and communication
skills: An exploratory study. Applied Psychology, 49, 566–578.
Moskovsky, C., Assulaimani, T., Racheva, S., & Harkins, J. (2016). The L2
motivational self system and L2 achievement: A study of Saudi EFL learners. The
Modern Language Journal, 100, 641–654.
Muir, C., & Dörnyei, Z. (2013). Directed motivational currents: Using vision to create
effective motivational pathways. Studies in Second Language Learning and
Teaching, 3, 357–375.
Muslimin, A. S. M. (2017, November 30). Why Asian countries are investing so
heavily in the English language. Forbes. Retrieved from
Muthén, B. O., & Satorra, A. (1995). Complex sample data in structural equation
modeling. Sociological Methodology, 25, 267–316.
Muthén, L. K., & Muthén, B. O. (1998–2012). Mplus user’s guide (7th ed.). Los
Angeles, CA: Muthén & Muthén.
Oga-Baldwin, W. L. Q., & Nakata, Y. (2017). Engagement, gender, and motivation: A
predictive model for Japanese young language learners. System, 65, 151–163.
Oettingen, G. (1996). Positive fantasy and motivation. In P. M. Gollwitzer & J. A.
Bargh (Eds.), The psychology of action: Linking cognition and motivation to
behavior (pp. 236–259). New York, NY: Guilford Press.
Oettingen, G. (2012). Future thought and behaviour change. European Review of
Social Psychology, 23, 1–63.
Oettingen, G., & Mayer, D. (2002). The motivating function of thinking about the
future: Expectations versus fantasies. Journal of Personality and Social Psychology,
83, 1198–1212.
Paluck, E. L. (2010). Is it better not to talk? Group polarization, extended contact, and
perspective taking in eastern Democratic Republic of Congo. Personality and Social
Psychology Bulletin, 36, 1170–1185.
Papi, M., & Abdollahzadeh, E. (2012). Teacher motivational practice, student
motivation, and possible L2 selves: An examination in the Iranian EFL context.
Language Learning,62, 571–594.
Papi, M., Bondarenko, A., Mansouri, S., Feng, L., & Jiang, C. (2019). Rethinking L2
motivation research: The 2 ×2 model of L2 self-guides. Studies in Second
Language Acquisition,41, 337–361.
Pearl, J. (2009). Causality: Models, reasoning, and inference (2nd ed.). Cambridge,
UK: Cambridge University Press.
Pearl, J. (2012). The causal foundations of structural equation modeling. In R. H.
Hoyle (Ed.), Handbook of structural equation modeling (pp. 68–91). New York,
NY: Guilford Press.
Peterson, R. A., & Kim, Y. (2013). On the relationship between coefficient alpha and
composite reliability. Journal of Applied Psychology, 98, 194–198.
Porte, G. K. (2012). Introduction. In G. K. Porte (Ed.), Replication research in applied
linguistics (pp. 1–17). Cambridge, UK: Cambridge University Press.
R Core Team. (2014). R: A language and environment for statistical computing
(Version 3.1.2) [Computer software]. Vienna, Austria: R Foundation for Statistical
Computing. Retrieved from
Raykov, T. (1998). Coefficient alpha and composite reliability with interrelated
nonhomogeneous items. Applied Psychological Measurement, 22, 375–385.
Raykov, T. (2004). Point and interval estimation of reliability for multiple-component
measuring instruments via linear constraint covariance structure modeling.
Structural Equation Modeling: A Multidisciplinary Journal, 11, 342–356.
Raykov, T., & Marcoulides, G. A. (2019). Thanks coefficient alpha, we still need you!
Educational and Psychological Measurement, 79, 200–210.
Robles, J. (1996). Confirmation bias in structural equation modeling. Structural
Equation Modeling: A Multidisciplinary Journal, 3, 73–83.
Ruscio, J., & Roche, B. (2012). Determining the number of factors to retain in an
exploratory factor analysis using comparison data of known factorial structure.
Psychological Assessment, 24, 282–292.
Ryan, S. (2009). Self and identity in L2 motivation in Japan: The ideal L2 self and
Japanese learners of English. In Z. Dörnyei & E. Ushioda (Eds.), Motivation,
language identity and the L2 self (pp. 120–143). Bristol, UK: Multilingual Matters.
Sampson, R. (2012). The language-learning self, self-enhancement activities, and self
perceptual change. Language Teaching Research, 16, 317–335.
Saris, W. E., Satorra, A., & van der Veld, W. M. (2009). Testing structural equation
models or detection of misspecifications? Structural Equation Modeling: A
Multidisciplinary Journal, 16, 561–582.
Sato, M., & Lara, P. (2019). Interaction vision intervention to increase second
language motivation: A classroom study. In M. Sato & S. Loewen (Eds.),
Evidence-based second language pedagogy: A collection of instructed second
language acquisition studies (pp. 287–313). New York, NY: Routledge.
Schooler, J. W. (2014). Metascience could rescue the ‘replication crisis.’ Nature, 515,
Shah, R., & Goldstein, S. M. (2006). Use of structural equation modeling in operations
management research: Looking back and forward. Journal of Operations
Management, 24, 148–169.
Sijtsma, K. (2009). On the use, the misuse, and the very limited usefulness of
Cronbach’s alpha. Psychometrika, 74, 107–120.
Steinmetz, H., Schmidt, P., Tina-Booh, A., Wieczorek, S., & Schwartz, S. H. (2009).
Testing measurement invariance using multigroup CFA: Differences between
educational groups in human values measurement. Quality & Quantity, 43,
Taguchi, T., Magid, M., & Papi, M. (2009). The L2 motivational self system among
Japanese, Chinese and Iranian learners of English: A comparative study.
In Z. Dörnyei & E. Ushioda (Eds.), Motivation, language identity and the L2 self
(pp. 66–97). Bristol, UK: Multilingual Matters.
Teimouri, Y. (2017). L2 selves, emotions, and motivated behaviors. Studies in Second
Language Acquisition, 39, 681–709.
van der Eijk, C., & Rose, J. (2015). Risky business: Factor analysis of survey
data—Assessing the probability of incorrect dimensionalisation. PLoS One, 10,
van der Helm, R. (2009). The vision phenomenon: Towards a theoretical underpinning
of visions of the future and the process of envisioning. Futures, 41, 96–104.
Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement
invariance literature: Suggestions, practices, and recommendations for
organizational research. Organizational Research Methods, 3, 4–70.
West, S. G., Taylor, A. B., & Wu, W. (2012). Model fit and model selection in
structural equation modeling. In R. H. Hoyle (Ed.), Handbook of structural
equation modeling (pp. 209–231). New York, NY: Guilford Press.
Wheaton, B., Muthén, B., Alwin, D. F., & Summers, G. F. (1977). Assessing reliability
and stability in panel models. Sociological Methodology, 8, 84–136.
Yang, J. S., & Kim, T.-Y. (2011). The L2 motivational self system and perceptual
learning styles of Chinese, Japanese, Korean, and Swedish students. English
Teaching, 66, 141–162.
You, C., Dörnyei, Z., & Csizér, K. (2016). Motivation, vision, and gender: A survey of
learners of English in China. Language Learning, 66, 94–123.
Yun, S., Hiver, P., & Al-Hoorie, A. H. (2018). Academic buoyancy: Exploring
learners’ everyday resilience in the language classroom. Studies in Second
Language Acquisition, 40, 805–830.
Zach, S., Dobersek, U., Filho, E., Inglis, V., & Tenenbaum, G. (2018). A meta-analysis
of mental imagery effects on post-injury functional mobility, perceived pain, and
self-efficacy. Psychology of Sport and Exercise, 34, 79–87.
Supporting Information
Additional Supporting Information may be found in the online version of this
article at the publisher’s website:
Appendix S1. Questionnaire Items.
Appendix S2. Exploratory Factor Analysis.
Appendix: Accessible Summary (also publicly available at
Is Learners’ Mental Imagery About Their Future Linked to Their
Motivation for Language Learning?
What This Research Was About and Why It Is Important
Many teachers are interested in maximizing learners’ motivation to learn
languages. Recently, researchers have suggested that learners’ vision of themselves
in the future (their mental imagery) can energize action and help learners
develop positive learning behaviors in the present. These ideas have become
popular with teachers, but the evidence to back up these claims has been mixed.
In this study, the researchers looked at this question by replicating (reproducing)
and adapting a previous large-scale study authored by You and colleagues.
Their study reported vision and mental imagery to be reliable ways to increase
Chinese learners’ motivation and their effort for learning English as a second
language. This follow-up study focused on Korean learners of English and
used a visual statistical procedure (called a structural equation model) to test
potential links between learners’ vision, their motivation and effort to learn
English, and their actual achievement. The researchers found evidence that
learners’ intentions to apply effort strengthened their visions of themselves in
the future, not that mental imagery itself directly increased their learning effort.
What the Researchers Did
• The researchers recruited 1,297 English language learners from middle
schools in selected areas of South Korea. They were selected to closely
reflect the most recent sociogeographic census in that country.
• The learners responded to 10 questionnaire scales, each scale with multiple
items, as used in (or adapted from) the initial study in China, targeting aspects
of their personal motivation and vision (e.g., “I study English because close
friends of mine think it is important”).
• Measures of language learning achievement (midterm and final exams) were
collected.
• The researchers checked whether the questionnaire scales reliably measured
the constructs being investigated.
• Then, the researchers tested links between learners’ vision, motivation and
learning effort, and actual achievement.
• Two models were tested to compare which one better represented the
relationship between vision, motivation, and learning effort: one model where
“vision and motivation” led to “learning effort,” and the other where
“learning effort” led to “vision and motivation.”
What the Researchers Found
• The researchers found that only four of the 10 scales used in the initial
study were useful for drawing conclusions about the learners’ vision and
motivation.
• Overall, there was weak support for the idea that learners’ vision can
positively influence their desire to expend effort in language learning and impact
their actual achievement in the language classroom.
• The desire to meet expectations and outcomes imposed by peers, parents,
and authoritative figures tended to be more prevalent in female than in
male learners. For male learners, these tended to have an impact only once
these learners expressed a certain level of intention to spend time and effort
learning the language.
• Stronger support was found for one of the two models, namely, the idea that
learners’ intention to spend time and effort on learning the language can lead
to vision and motivation.
• The strongest and most reliable predictor of achievement was previous
achievement in language learning.
Things to Consider
• The amount of learners’ general intention to spend time and effort on learning
a language may predict the motivation required to engage in the learning
process and its demands.
• The notion that vision and mental imagery, involving tangible images and
senses, are central to language learning motivation is far from settled and
requires further evidence.
How to cite this summary: Hiver, P., & Al-Hoorie, A. H. (2019). Is mental
imagery linked to motivation for language learning? OASIS Summary of Hiver
& Al-Hoorie in Language Learning.
This summary has a CC BY-NC-SA license.
Language Learning 70:1, March 2020, pp. 48–102 102
Network analysis is a method used to explore the structural relationships between people or organizations, and more recently between psychological constructs. Network analysis is a novel technique that can be used to model psychological constructs that influence language learning as complex systems, with longitudinal data, or cross-sectional data. The majority of complex dynamic systems theory (CDST) research in the field of second language acquisition (SLA) to date has been time-intensive, with a focus on analyzing intraindividual variation with dense longitudinal data collection. The question of how to model systems from a structural perspective using relation-intensive methods is an underexplored dimension of CDST research in applied linguistics. To expand our research agenda, we highlight the potential that psychological networks have for studying individual differences in language learning. We provide two empirical examples of network models using cross-sectional datasets that are publicly available online. We believe that this methodology can complement time-intensive approaches and that it has the potential to contribute to the development of new dimensions of CDST research in applied linguistics.
The types of assessment tasks affect the learners’ psychological well-being and the process of learning. For years, educationalists have sought accurate and convenient approaches to assess learners efficiently. Despite the significant role of performance-based assessment (PBA) in affecting second/foreign language (L2) learning processes, few empirical studies have tried to explore how PBA affects reading comprehension achievement (RCA), academic motivation (AM), foreign language anxiety (FLA), and students’ self-efficacy (SS-E). To fill this lacuna of research, the current study intended to gauge the impact of PBA on the improvement of RCA, AM, FLA, and SS-E in an English as a foreign language (EFL) context. In so doing, a sample of 88 intermediate EFL learners were randomly divided into an experimental group (EG) and a control group (CG). During this research (16 sessions), the learners in the CG (N = 43) received the traditional assessment. In contrast, the learners in the EG (N = 45) were exposed to some modification based on the underpinning theories of PBA. Data inspection applying the one-way multivariate analysis of variance (i.e., the one-way MANOVA) indicated that the learners in the EG outperformed their counterparts in the CG. The results highlighted the significant contributions of PBA in fostering RCA, AM, FLA, and SS-E beliefs. The implications of this study may redound to the benefit of language learners, teachers, curriculum designers, and policy makers in providing opportunities for further practice of PBA.
While motivation plays an important role in language learning, few attempts have been made to explore its significance in second language (L2) pragmatics learning. The current study investigated whether and how language learning motivation affects L2 pragmatics production. A total of 60 adult Chinese learners of English participated in this study. Data were elicited from a motivation questionnaire and a discourse completion task (DCT). The results revealed that L2 learners with high motivation performed better in making complaints in the target language than learners with low motivation. Moreover, learners’ levels of pragmatic production correlated positively with their overall L2 motivation, as well as with four motivational subscales, namely, attitudes towards learning English, ideal L2 self, intended learning efforts, and attitudes towards the L2 community. Regression analysis showed that learners’ attitude towards learning English best predicted their production of the speech act of complaints. The findings of this study support the role motivational dispositions play in learners’ L2 pragmatic production. The study provides insight into the interaction of L2 motivation and pragmatics learning.
Structural equation modeling (SEM) and meta-analysis (MA) are both powerful techniques employed frequently throughout the social and behavioral sciences, including applied linguistics. Although meta-analytic data are typically analyzed by calculating weighted means or correlation coefficients, other statistical models such as SEM can also be applied (Schoemann, 2016). SEM models gauge conceptualized models vis-à-vis empirical data across a given domain. Despite a considerable expansion of the analytical repertoire in applied linguistics in recent years (Gass, Loewen, & Plonsky, 2021), this particular technique has yet to be formally introduced or applied. The present methods tutorial, therefore, aims to introduce MASEM to applied linguistics. In doing so, we provide a conceptual rationale for MASEM, an outline of major stages involved, and a worked example of how MASEM might be utilized in the field, along with the data and code necessary for re-running all analyses.
Complex Dynamic Systems Theory (CDST), an instantiation in applied linguistics of complexity epistemology that transcends disciplinary boundaries, has gained much traction and momentum over the last decade, finding expressions in a fast-growing number of empirical second language developmental studies. However, the literature, while rapidly expanding, has displayed much confusion, notably oscillating between invoking CDST as a metatheory and as an object theory. Then, too, the metaphorical genesis of CDST—the metaphorical adoption of complexity epistemology from physical sciences—has seemed to invite miscellaneous interpretations, rendering CDST an ostensibly all-in-one conceptual prism. This article explores the epistemology of CDST, tracing its ontology and examining its role in second language developmental research. This enables a more nuanced understanding of CDST, while at once surfacing critical issues and directions for future research, as it moves toward a pluralistic approach to investigating CDST as a potentially unique lens on second language development.
Based on the theoretical framework of the L2 Motivational Self System (L2MSS), the present study aims to make a methodological contribution to L2 motivation research. With the application of a novel growth mixture modeling (GMM) technique, the study depicted developmental trajectories of three motivational variables (ideal L2 self, ought-to L2 self, and L2 learning experience) of 176 Chinese tertiary-level students over a period of two semesters. Results showed two to three salient classes with typical developmental patterns for the three motivational variables respectively, with which the study gained fresh insights into the developmental processes of motivation beyond the individual level. Our study further established three main multivariate profiles of motivation characterized by a distinct combination of different motivational variables. The findings extend our understanding of motivational dynamics, providing a nuanced picture of emergent motivational trajectories systemically. Additionally, GMM has shown to be an effective and applicable method for the identification of salient patterns in motivation development, which leads to practical implications.
The present study adopted a novel parallel-process GMM technique to research the adaptive interaction between foreign language learners’ learning motivation and emotions, with a view to advancing our understanding of how language learning motivation and emotions (enjoyment and anxiety) adaptively interact with each other over time. The present study, situated in the Chinese EFL learning context, collected learning motivation and emotion data from 176 Chinese EFL learners over a period of two semesters (12 months). The GMM technique adopted in the study identified three developmental profiles of motivation and two of emotions respectively. The study further distilled salient patterns of motivation-emotion interaction over time, patterns significant for designing and implementing pedagogical interventions for motivation enhancement. The parallel-process GMM technique was also proven to be a useful approach to parsing learner variety and learning heterogeneity, efficiently summarizing the complex, dynamic processes of motivation and emotion development.
Evidence-Based Second Language Pedagogy is a cutting-edge collection of empirical research conducted by top scholars focusing on instructed second language acquisition (ISLA) and offering a direct contribution to second language pedagogy by closing the gap between research and practice. Building on the conceptual, state-of-the-art chapters in The Routledge Handbook of Instructed Second Language Acquisition (2017), studies in this volume are organized according to the key components of ISLA: types of instruction, learning processes, learning outcomes, and learner and teacher psychology. The volume responds to pedagogical needs in different L2 teaching and learning settings by including a variety of theoretical frameworks (sociological, psychological, sociocultural, and cognitive), methodologies (qualitative and quantitative), target languages (English, Spanish, and Mandarin), modes of instruction (face-to-face and computer-mediated), targets of instruction (speaking, writing, listening, motivation, and professional development), and instructional settings (second language, foreign language, and heritage language). A novel synthesis of research in the rapidly growing field of ISLA that also covers effective research-based teaching strategies, Evidence-Based Second Language Pedagogy is the ideal resource for researchers, practitioners, and graduate students in SLA, applied linguistics, and TESOL.
• This study examined a classroom intervention designed to increase L2 learners’ motivation within the framework of Dörnyei’s (2005) L2 Motivational Self System.
• The intervention aimed to form and strengthen EFL learners’ future vision related to their profession (i.e., business). The intervention was incorporated into classroom communicative tasks.
• The results showed that the experimental group increased their Ideal L2 Self and Learning Experience but not their Ought-to L2 Self.
• We conclude that further instructional support is needed for L2 learners to transfer their positive self-guides into their actual motivated behaviors.
This book provides practical guidance on research methods and designs that can be applied to Complex Dynamic Systems Theory (CDST) research. It discusses the contribution of CDST to the field of applied linguistics, examines what this perspective entails for research and introduces practical methods and templates, both qualitative and quantitative, for how applied linguistics researchers can design and conduct research using the CDST framework. Introduced in the book are methods ranging from those in widespread use in social complexity, to more familiar methods in use throughout applied linguistics. All are inherently suited to studying both dynamic change in context and interconnectedness. This accessible introduction to CDST research will equip readers with the knowledge to ensure compatibility between empirical research designs and the theoretical tenets of complexity. It will be of value to researchers working in the areas of applied linguistics, language pedagogy and educational linguistics and to scholars and professionals with an interest in second/foreign language acquisition and complexity theory.
The theoretical emphasis within the L2 Motivational Self System has typically been on the two future self-guides representing possible (ideal and ought-to) selves, leaving the third main dimension of the construct, the L2 Learning Experience, somewhat undertheorized. Yet, this third component is not secondary in importance, as evidenced by empirical studies that consistently indicate that the L2 Learning Experience is not only a strong predictor of various criterion measures but is often the most powerful predictor of motivated behavior. This paper begins with an analysis of possible reasons for this neglect and then draws on the notion of student engagement in educational psychology to offer a theoretical framework for the concept. It is proposed that the L2 Learning Experience may be defined as the perceived quality of the learners’ engagement with various aspects of the language learning process.
Doing Replication Research in Applied Linguistics is the only book available to specifically discuss the applied aspects of how to carry out replication studies in Second Language Acquisition. This text takes the reader from seeking out a suitable study for replication, through deciding on the most valuable form of replication approach to its execution, discussion, and writing up for publication. A step-by-step decision-making approach to the activities guides the reader/student through the replication research process from the initial search for a target study to replicate, through the setting up, execution, analysis, and dissemination of the finished work.
In this study we investigate the situated and dynamic nature of the L2 Learning Experience through a newly-purposed instrument called the Language Learning Story Interview—adapted from McAdams’ Life Story Interview (2007). Using critical case sampling, data were collected from an equal number of learners of various L2s (e.g., Arabic, English, Mandarin, Spanish) and analyzed using Qualitative Comparative Analysis (Rihoux & Ragin, 2009). Through our data analysis, we demonstrate how language learners construct overarching narratives of the L2 learning experience and what the characteristic features and components that make up these narratives are. Our results provide evidence for prototypical nuclear scenes (McAdams et al., 2004) as well as core specifications and parameters of learners’ narrative accounts of the L2 learning experience. We discuss how these shape motivation and language learning behavior.
Sustained motivation is crucial to learning a second language (L2), and one way to support this can be through the mental visualisation of ideal L2 selves (Dörnyei & Kubanyiova, 2014). This paper reports on an exploratory study which investigated the possibility of using technology to create representations of language learners’ ideal L2 selves digitally. Nine Chinese learners of L2 English were invited to three semi-structured interviews to discuss their ideal L2 selves and their future language goals, as well as their opinions on several different technological approaches to representing their ideal L2 selves. Three approaches were shown to participants: (a) 2D and 3D animations, (b) Facial Overlay, and (c) Facial Mask. Within these, several iterations were also included (e.g. with/without background or context). Results indicate that 3D animation currently offers the best approach in terms of realism and animation of facial features, and improvements to Facial Overlay could lead to beneficial results in the future. Approaches using the 2D animations and the Facial Mask approach appeared to have little future potential. The descriptive details of learners’ ideal L2 selves also provide preliminary directions for the development of content that might be included in future technology-based interventions.
This article reports the first meta-analysis of the L2 Motivational Self System (Dörnyei, 2005, 2009). A total of 32 research reports, involving 39 unique samples and 32,078 language learners, were meta-analyzed. The results showed that the three components of the L2 Motivational Self System (the ideal L2 self, the ought-to L2 self, and the L2 learning experience) were significant predictors of subjective intended effort (rs = .61, .38, and .41, respectively), though weaker predictors of objective measures of achievement (rs = .20, –.05, and .17). Substantial heterogeneity was also observed in most of these correlations. The results also suggest that the strong correlation between the L2 learning experience and intended effort reported in the literature is, due to substantial wording overlap, partly an artifact of lack of discriminant validity between these two scales. Implications of these results and directions for future research are discussed.
This volume presents a new approach to motivation that focuses on the concept of 'vision'. Drawing on visualisation research in sports, psychology and education, the authors describe powerful ways by which imagining future scenarios in one's mind's eye can promote motivation to learn a foreign language. The book offers a rich selection of motivational strategies that can help students to 'see' themselves as potentially competent language users, to experience the value of knowing a foreign language in their own lives and, ultimately, to invest effort into learning it. Transformational leaders' vision for change is one of the prerequisites of turning language classrooms into motivating learning environments, and the second part of the book therefore focuses on how to ignite language teacher enthusiasm, how to re-kindle it when it may be waning and how to guard it when it is under threat.
The major focus of the thesis is to investigate the complex dynamism of L2 demotivation. It is an attempt to reform previous thinking of demotivation and move the L2 demotivation mainstream research into a new phase that focuses on the complexity of its process and its development. The demotivational, motivational, and remotivational trajectories of language learners were examined through the lens of various key psychological and theoretical constructs including mindset, personality hardiness, learned helplessness, and the L2 Motivational Self System. The thesis consists of two studies that investigated the demotivation of female Saudi university students by using a variety of research methodologies, including qualitative in-depth interviews, quantitative surveys, and structural equation modelling. A primary explorative qualitative study was conducted aiming at examining the Saudi learners’ different explanations of their language learning experiences and their various perceptions of different demotivating factors. Semi-structured interviews were conducted with 13 female learners of English at King Abdulaziz University, Jeddah, Saudi Arabia. Analysis of the qualitative data showed that the language learning mindset played an important role in the language learner’s motivation, demotivation, remotivation, and resilience/vulnerability. However, the relationship between the variables that emerged in the qualitative data needed further investigation in order to be confirmed and generalised to larger populations. A secondary confirmatory quantitative study was carried out aiming at investigating the impact of having a particular language learning mindset on L2 demotivation. Using the key variables that emerged in the qualitative data, a questionnaire was designed and administered to 2044 foundation-year university students.
A number of tests were conducted to investigate (a) the relationships between the variables; (b) the differences between the growth mindset language learners and the fixed mindset language learners; and (c) the differences between the resilient and vulnerable language learners. The quantitative results confirmed all the hypothesised relationships and established an empirical link between the language learning mindset and both L2 demotivation and L2 resilience. Finally, a model that assumed that L2 demotivation can be predicted by the fixed language learning mindset was hypothesised. Structural equation modelling (SEM) was conducted to empirically test the hypothesised model. A set of causal relationships were examined simultaneously. The SEM analysis confirmed all the hypothesised causal relationships and showed that L2 demotivation can be predicted positively and directly by the fixed language learning mindset. It also showed that the fixed language learning mindset can lead to L2 demotivation indirectly via decreasing the ability to create a positive ideal L2 self and increasing L2 disappointment. Although all the studies were conducted in the Saudi context and with female learners, it is hoped that the wealth of data can serve as an empirical point of departure in the realm of investigation of L2 demotivation. Conceptualising L2 demotivation by focusing on the role of the language learning mindset and its contribution to the learners’ perceptions and responses to demotivating factors seems to provide language educators with a new tool to minimise language learners’ demotivation and help them to rebuild their motivation. It also seems to provide future researchers with a new theoretical model to investigate when researching L2 demotivation in different contexts.