The CATESOL Journal 29.2 • 2017 • 81
Insights Into Student Listening From Paused Transcription
BETH SHEPPARD
BRIAN BUTLER
University of Oregon
Listening comprehension is an essential and challenging skill for language learners, and listening instruction can also be a challenge for language instructors, since they have little access to the listening process inside students’ minds. Greater knowledge about what learners perceive when they listen could help language teachers better tailor their instruction to student needs. In this mixed-methods study, students at 2 proficiency levels participated in a listening test based on Field’s paused transcription method (2008a, 2008c, 2011). Results were analyzed quantitatively on the basis of student and text level, word class, and articulation rate. Transcription errors were analyzed qualitatively to identify patterns of mishearing. Paused transcription is recommended as a classroom activity to identify and raise awareness of student listening challenges.
Second language (L2) listening presents major challenges to learners, since the speed and lexical/syntactical choices of spoken discourse are out of the control of the listener. At the same time, listening is an essential skill for learners, since listening can provide many opportunities for continued language learning. For international university students in the US, listening also represents a primary way of accessing necessary information. It is important, therefore, to help incoming international students develop their listening skills as much as possible before they begin their university studies.
What Makes Listening Difficult
To help students develop listening skills in a second language, it is helpful to know what makes listening difficult for them. Some studies have approached this question by asking learners why a text feels difficult. In response to these questions, learners have reported that second language listening is hard for the following reasons (Goh, 2000; Liu, 2002; Renandya & Farrell, 2011):
• The speaker is too fast.
• They do not know all the words.
• They cannot recognize known words in context.
• They cannot focus on the whole message.
• They feel anxious.
Other studies have approached this question by comparing language learner results on listening tests with specific differences in the audio texts. The following text factors have been found to increase the difficulty of L2 listening comprehension (Bloomfield et al., 2011; Brunfaut & Revesz, 2015; Revesz & Brunfaut, 2012):
• Greater lexical range and density;
• More formal, literate discourse structure (reduced redun-
dancy, greater referential cohesion, greater information den-
sity);
• Indirectness (requiring listeners to infer implied meaning);
• Unfamiliar accent;
• Faster articulation rate and reduced pauses.
These are the challenges learners need to overcome as they develop into proficient L2 listeners.
Bottom-Up and Top-Down Listening Processes
Most discussions of second language listening development refer to top-down and bottom-up processes, both of which are essential for listening comprehension. Top-down (knowledge-based, concept-driven) processes involve using knowledge of the world, speech context, and recent co-text to predict or limit possible interpretations of the speaker’s message. Bottom-up (text-based, stimulus-driven) processes involve recognizing phonemes, syllables, words, and relationships between words to decipher the speaker’s message. Top-down and bottom-up processes are used simultaneously by all listeners, but skilled and novice listeners may use them in different ways. In particular, Field (2008d) emphasizes that skilled listeners use top-down processes to amplify and extend the speaker’s message on the basis of automatic and very effective bottom-up processing, while novice listeners use top-down processes to compensate for incomplete bottom-up processing by making reasonable guesses about missed words and phrases.
In this study, we focus on the subset of bottom-up processes by which listeners identify words from the stream of sound. These include phoneme recognition, locating word boundaries, and lexical matching. We will refer to these processes as aural decoding.
Listening Instruction
A good deal of recent discourse (e.g., Field, 2008d; Siegel, 2014; Vandergrift, 2004) has suggested that ESL listening instruction must place a greater focus on the process of listening, rather than just the product of listening in the form of correct answers to comprehension questions. This attention to process can emphasize top-down skills, such as explicit instruction in metacognitive listening strategies (Vandergrift & Goh, 2012), or bottom-up skills, such as diagnosis of specific aural decoding problems followed by practice in those areas (Field, 2008d). A balance of these two approaches seems most likely to meet students’ needs, but the literature indicates an imbalance in current teaching practices, with more attention needed to bottom-up skills (Field, 2008d; Siegel & Siegel, 2015; Vandergrift, 2004).
The ability to quickly and automatically decode the speech stream into known words is a key skill for successful listening. Tsui & Fullilove (1998) found that strong bottom-up skills distinguish stronger from weaker performers on a listening test. To help students improve these skills, Field (2008d) proposed a diagnostic approach in which the teacher ascertains which bottom-up processes are causing challenges and designs short instructional activities to practice precisely these processes. In order to apply a diagnostic approach to listening instruction, however, it is necessary to find out what learners hear when they listen.
The Present Study
We are instructors in a moderately large Intensive English Program (IEP) at a moderately large public university. As at many other universities, our students can begin their university studies when they reach an intermediate to high-intermediate language level. The ability of students at this level to decode connected speech has been found to be remarkably low, with around 60% of words decoded on average, as compared to around 95% for native speakers (Estes, 2014; Field, 2008a, 2008c, 2011).
We were interested in learning more about the decoding ability of our own intermediate-level learners. Past studies have found that learners decode content words more accurately than function words, in spite of the greater frequency of function words. We were interested in this result, and we also wondered how articulation rate would affect decoding, since students often state a belief that they cannot understand when the text is fast. We also hypothesized that students’ specific errors in paused transcription would offer clues to diagnose which subskills of listening were challenging for them, and therefore this method could be a useful tool in the classroom.
Thus our research questions are:
1. How completely do our students decode listening texts at
various levels?
2. Will students decode more content words than function
words?
3. Will students decode more words with a slower articulation
rate?
4. Can students’ transcriptions provide insight into their listen-
ing processes?
Method
Since aural decoding and comprehension occur inside the mind, they cannot be directly observed. Researchers have approached this problem using think-aloud protocols and retrospective interviews (e.g., Goh, 2000; Zielenski, 2008), paused transcription (e.g., Estes, 2014; Field, 2008c), and priming studies (e.g., Cutler, 2012), among others. Paused transcription has the advantage that it focuses specifically on aural decoding, but without divorcing the target phrases from a natural context in connected speech and discourse or preventing learners from also applying top-down processes as they would in natural listening. In paused transcription, subjects are asked to listen to an extended text into which pauses have been inserted at irregular intervals. During each pause, subjects write down the last phrase (4-5 words) that they heard. The written phrases can then be compared to the original text and coded for accuracy.
The rationale for this method is that it taps into a listening process that replicates a real-world one. Subjects listen to the recording with a view to following its meaning, and it is only when a pause occurs that they switch attention to word level. Memory effects are limited by the fact that subjects are asked to transcribe around four or five words – well within the range of Miller’s (1956) seven plus or minus two. Furthermore … listeners retain verbatim word forms until major clause boundaries and only then “wrap them up” by replacing them with representations in propositional form. (Field, 2008b, pp. 16-17)
Participants
Study participants were students in intact listening and speaking classes at a university-based Intensive English Program. Participants (N=77) included 48 upper-level students and 29 midlevel students who spoke Chinese (65.4%), Japanese (10.2%), or Arabic (24.4%) as their first language. They had already studied in the US for an average of about 11 months, and a t-test showed that the length of residence was not significantly different between students in the two levels.
Materials
Three listening texts were used for the paused transcription study. The first two texts were from listening textbooks and graded for easy comprehension at the two proficiency levels. A third text was taken from an authentic university lecture available online. In addition, a very short text was prepared for use as a sample/warm-up activity to clarify the paused transcription procedure.
All three audio texts were similar in length (see Table 1). Each
was structured as an academic talk or lecture, with a relatively infor-
mal tone and some features of oral language (the textbook record-
ings were scripted and performed by actors, but some of these features
were written into the script). All speakers had standard North Ameri-
can accents.
Table 1
Origin, Topic, and Length of Listening Texts
          Warm-up           Text 1                Text 2                            Text 3
Origin    Pathways 2        Pathways 2            Learn to Listen, Listen to Learn  Open Yale Courses
Topic     Comparing people  Changes in our world  Women and work                    Our relationship to food
Length    0:44              2:58                  3:32                              3:21
Words     104 (142 wpm*)    387 (130 wpm)         498 (141 wpm)                     561 (167 wpm)
Note. *wpm = words per minute.
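The words-per-minute figures in Table 1 follow from each text's word count and running time. As a quick check, the short Python sketch below recomputes them. The function name `words_per_minute` and the "m:ss" input format are illustrative choices, not part of the study's materials.

```python
def words_per_minute(words, length):
    """Speech rate in words per minute for `words` words lasting `length` ("m:ss")."""
    minutes, seconds = length.split(":")
    total_seconds = int(minutes) * 60 + int(seconds)
    return words * 60 / total_seconds

# Word counts and lengths as listed in Table 1.
texts = {
    "Warm-up": (104, "0:44"),
    "Text 1": (387, "2:58"),
    "Text 2": (498, "3:32"),
    "Text 3": (561, "3:21"),
}

for name, (words, length) in texts.items():
    print(name, round(words_per_minute(words, length)))
```

Rounded to whole numbers, the results agree with the rates listed in the table.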
For each audio text, Cobb’s (n.d.) VocabProfiler was used to select four-word phrases for transcription. Twelve phrases were selected from each audio text, for a total of 144 words (see Appendix A). Of these, 141 were found among the 1,000 most commonly used words in English based on the General Service List (West, 1953), and three were among the second thousand words of the General Service List (“dance,” “repeat,” and “probably”). These words were estimated to be familiar to students at both levels. Thus study participants could be expected to be familiar with most or all of the words selected for transcription.
Procedure
The study was conducted as a listening exercise during class time. The first author conducted all sessions of the study. After reading instructions and giving consent in their L1, participants completed a brief questionnaire about their language background and then the warm-up paused transcription activity. They were then instructed to explain the activity to each other in their L1. Once all participants understood the instructions, the three texts were played, always in the same order (Text 1, Text 2, Text 3). Participants wrote their transcriptions on a paper packet. At the end of each audio text, participants rated their comprehension of the text from 1 to 5 and then turned the page for the next audio text.
Three class instructors chose to participate in the study, transcribing in the pauses as their students did. All three had 100% correct transcriptions.
Data Analysis
Each transcribed target word was coded as correct or incorrect. Only the target words (last four words spoken before the beep/pause) were coded and any extra words were ignored. Missed words were coded the same as incorrect words. When words were present but transcribed out of order, they were still coded as correct. Words with morphological errors (generally in endings for tense and number) were coded as correct. Misspelled words were also coded as correct, if they could clearly be identified as the intended word. The first author coded all words and the second author coded a subset of 10%. Interrater agreement was found to be 98.1%. Examples of coding can be found in Table 2.
Table 2
Sample Coding for Target Word Transcriptions
Target word   Transcription   Coded
raised        Raise           correct
raised        Rave            incorrect
woman         Women           correct
dress         Drees           correct
dress         Drac            incorrect
have          had, has        correct
their         The             incorrect
During the process of coding for quantitative analysis, interesting transcriptions were highlighted for qualitative analysis. In addition, an overall difficulty score was calculated for each phrase (an average of the percent correct for the four words), and the most difficult phrases were flagged for further qualitative error analysis. For selected phrases, transcription errors were tallied and categorized. The researchers listened again to the target phrases, made notes about the speaker’s delivery, and speculated about the origin of specific errors. In this process, several broad types of errors emerged as common and significant in the data. All transcriptions of the difficult phrases were then reanalyzed with reference to these error types.
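The phrase-level difficulty score described above (the average of the percent correct for a phrase's four words) is simple to compute once per-word accuracy is known. The Python sketch below shows one way to do so and to flag the hardest phrases. The function names, the phrase labels, and the accuracy values are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean

def phrase_difficulty(word_accuracies):
    """Mean proportion correct across the words of one phrase (lower = harder)."""
    return mean(word_accuracies)

def flag_hardest(phrases, k=3):
    """Return the labels of the k phrases with the lowest scores, hardest first."""
    scored = {label: phrase_difficulty(acc) for label, acc in phrases.items()}
    return sorted(scored, key=scored.get)[:k]

# Hypothetical per-word proportions correct (not data from the study).
example = {
    "phrase A": [0.90, 0.80, 0.85, 0.95],
    "phrase B": [0.40, 0.10, 0.30, 0.60],
    "phrase C": [0.70, 0.50, 0.65, 0.75],
}

print(flag_hardest(example, k=2))
```

A teacher using paused transcription in class could apply the same averaging to a single class set of transcriptions to decide which phrases to revisit.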
Results and Discussion
Research Question 1
How completely do our students decode listening texts at various levels?
With 144 target words and 77 participants, there were 11,088 target tokens. Of these, 7,414 target tokens were coded as correctly transcribed, a correct transcription rate of 67%. Upper-level students (intermediate proficiency) transcribed 73% correctly, while midlevel (preintermediate proficiency) students were successful with 54% of the target tokens. The percent of correctly transcribed tokens by text and student level can be seen in Figure 1.
Figure 1. Percent of tokens correctly transcribed, by text (Text 1, Text 2, Text 3) for upper-level students, midlevel students, and all students.
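The overall transcription rate reported for Research Question 1 is plain arithmetic on the token counts. A minimal Python check, using only the figures stated in the text (the variable names are illustrative):

```python
TARGET_WORDS = 144      # 36 phrases x 4 words
PARTICIPANTS = 77
CORRECT_TOKENS = 7414   # tokens coded as correctly transcribed

total_tokens = TARGET_WORDS * PARTICIPANTS
rate = CORRECT_TOKENS / total_tokens

print(total_tokens, f"{rate:.1%}")
```

The product reproduces the 11,088 target tokens, and the ratio rounds to the reported 67%.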
An ANOVA confirmed that differences in overall transcription accuracy were significant by student group, F(1, 282) = 48.80, p < .001, and by text, F(2, 282) = 24.76, p < .001. Full statistics can be found in Appendix B, Tables 1 and 2.
Both groups of students experienced significant gaps in their aural decoding, with less than three quarters of the words decoded in every group except the upper-level students listening to the easiest text. The upper-level students were a few weeks away from exiting the IEP and beginning university classes, yet they could decode only about 60% of the words in the first four minutes of the first lecture of an undergraduate class (Text 3). A lexical coverage of 90-95% has been found to be sufficient for adequate listening comprehension (Van Zeeland & Schmidt, 2012). We can therefore see that when international university students enter with minimally acceptable English language proficiency, decoding perhaps 60-70% of the words in a typical lecture, they will be at a significant disadvantage in lecture comprehension.
Research Question 2
Will students decode more content words than function words?
Overall, study participants were able to correctly transcribe 76%
of content words and 54% of function words. A t-test conrms that
transcriptions of content words (n=80, M=0.75, SD=0.19) were sig-
nicantly more accurate than those of function words (n=64, M=0.54,
SD=0.24), t(142) = 6.06, p < .001. e results are presented in Figure 2.
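For readers who want to see how a comparison of this kind is computed, the Python sketch below derives an independent-samples t statistic from the summary statistics alone. It assumes a pooled-variance (equal variances) test, which is consistent with the reported 142 degrees of freedom but is not stated in the text, and the function name `pooled_t` is illustrative.

```python
import math

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student's t (equal variances assumed) from two groups' summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Content words vs. function words, using the rounded values reported above.
t_stat, df = pooled_t(0.75, 0.19, 80, 0.54, 0.24, 64)
print(f"t({df}) = {t_stat:.2f}")
```

With the rounded means and standard deviations this yields a value of roughly 5.9 rather than exactly 6.06; the published statistic was presumably computed from unrounded data, so a small gap of this size is to be expected.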
is nding aligns with results of previous studies that have found
that language learners can decode more content words than function
Figure 2. Average transcription accuracy by word type.
100%
80%
60%
40%
20%
0%
Percent correct
content words function words
e CATESOL Journal 29.2 • 2017 • 89
words. ESL students at these levels are likely familiar with all func-
tion words and encounter them frequently, but these words are oen
reduced in speech and also are usually less essential to understand-
ing the overall message of an utterance. In fact, even L1 listeners have
been found to rely on context to fully decode function words (Herron
& Bates, 1997, as cited in Field, 2008c).
With limited available attention, a focus on decoding content words is probably an effective choice for L2 listeners. At times, however, function words can have a significant effect on meaning. Consider, for example, the effect of misunderstanding a preposition or pronoun in the sentence “I bought it for you.” Also, if students can hear and understand function words, then listening becomes an avenue for them to improve their productive language skill through exposure to correct grammar in context. Field (2008c, 2008d) suggests activities to help language students pay attention to function words in listening. For example, teachers can train learners to infer function words after perceiving content words by pausing an audio text (or dictation) before a function word and asking students to predict what word will come next, or teachers can have their students explicitly practice perceiving unstressed function words and suffixes through a variety of targeted dictation exercises.
Research Question 3
Will students decode more known words with a slower articulation
rate?
Language students often state a belief that difficulties in listening comprehension arise from faster audio delivery (e.g., Goh, 2000), but studies on speed and listening comprehension have found mixed results. It appears that pauses are helpful to L2 listeners, and increased speed can negatively affect comprehension, but slower rates do not always improve comprehension and students often misattribute other causes of difficulty to speed (Bloomfield et al., 2011).
In the current study, a simple measure of articulation rate (pronounced syllables divided by phrase time, in syllables per second) was calculated for each four-word target phrase (n=36, M=4.704, SD=0.899). A basic measure of phrase difficulty was calculated by averaging the percent of participants who correctly transcribed each of the four words (n=36, M=0.658, SD=0.161). No significant correlation was found between these two measures, r = -0.253, n = 36, p = .137, indicating a lack of strong relationship between within-phrase articulation rate and success in decoding the words of the phrase. Figure 3 shows the relationship between transcription accuracy and articulation rate for the 36 phrases.
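The reported correlation can be checked from r and n alone. The Python sketch below converts r to a t statistic with n - 2 degrees of freedom and approximates the two-tailed p value by numerically integrating the t density with Simpson's rule. The function names, the integration approach, and the syllable count and duration in the articulation-rate example are illustrative assumptions; the rate is taken to be in syllables per second, as on the axis of Figure 3. None of this is the authors' analysis code.

```python
import math

def articulation_rate(syllables, seconds):
    """Articulation rate of one phrase in syllables per second."""
    return syllables / seconds

def t_from_r(r, n):
    """t statistic for testing a Pearson correlation r from n pairs against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3      # probability mass between 0 and |t|
    return 1 - 2 * central_half

# Hypothetical phrase: 7 pronounced syllables spoken in 1.4 seconds.
print(articulation_rate(7, 1.4))

# Values reported in the text: r = -0.253, n = 36.
t_stat = t_from_r(-0.253, 36)
p_value = two_tailed_p(t_stat, 36 - 2)
print(round(t_stat, 2), round(p_value, 3))
```

With r = -0.253 and n = 36 the t statistic is about -1.52 on 34 degrees of freedom, and the resulting p value is close to the reported .137.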
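Figure 3. Phrasal articulation rate and average transcription accuracy. (Scatterplot titled “Speed and Decoding”; x-axis: target phrase articulation rate in syllables/second; y-axis: percent transcription accuracy of all words in target phrase.)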
This result is not surprising against the background of research mentioned above, but still it might come as a revelation to some teachers and many students. Simply informing students of these findings could have an impact on students’ emotions about listening comprehension. Since listener anxiety has been found to have a powerful effect on comprehension scores (Bloomfield et al., 2011), affective issues are one key to helping students listen more successfully. Finally, when teachers select recorded authentic texts for classroom use, they may often base decisions on “speed” of delivery. These results add to data suggesting that teachers should consider the speaker’s use of pauses rather than overall words per minute or articulation rate.
Research Question 4
Can students’ transcriptions provide insight into their listening
processes?
Qualitative examination of transcription errors led to a variety of insights about participant misunderstandings and gave hints about the listening processes they struggled with. We focused our error analysis on the phrases that proved most difficult for participants, based on average words transcribed correctly. Both researchers examined these phrases, considering the frequency and possible origin of each error.
Several categories of errors emerged that we will discuss individually, giving example participant transcriptions for each. We will also suggest some simple classroom activities that could be used to draw students’ attention to these issues and practice skills (both bottom-up and top-down) that may underlie or support them. The categories are word segmentation, phonemes, unknown words and phrases, and top-down fabrications.
Word Segmentation
One challenge of L2 listening is to locate the beginnings and ends of words, since there are usually no silent spaces between them. Listeners employ several strategies to meet this challenge, including vocabulary knowledge (recognizing one word will also locate the beginning of the next word), knowledge of language-specific rules about which phonemes and combinations of phonemes can appear in word-initial and word-final positions (phonotactics), and strategies involving stress and rhythm. The most effective strategy for listeners of English is to initially assume that each stressed (unreduced) syllable begins a new content word and adjust as needed based on other strategies (Cutler, 2012). For the most part, the word-segmentation errors in our study resulted in transcriptions that also followed this primary strategy. In other words, participants did not incorrectly place stressed syllables in the middle of transcribed words. Three example phrases are analyzed below.
Text 2 phrase 6: “Some of the factors a woman might want to take into account”
Incorrect transcription (N): … taking to account (17)
Error analysis: /tek/ is a stressed syllable, which begins a content word. In this common error, /tek/ is still correctly placed at the beginning of a word. /ɪntu/ is a function word of two unstressed syllables, and students have mistakenly assigned the first unstressed syllable of /ɪntu/ as an unstressed suffix of the preceding content word. This is reasonable from the standpoint of word-segmentation strategy, but syntax and subtle clues in delivery could have helped disambiguate the phrase.
Incorrect transcription (N): … a count (10); … count (4); … a corn, a comet (2); … a(n)- [no following word transcribed] (9)
Error analysis: /kɑʊnt/ is a stressed syllable, so it is reasonable to guess that it will begin a content word and therefore to assume that the preceding /ə/ is a separate function word. Here knowledge of English collocations could help disambiguate the phrase.
Text 1 phrase 5: “Native American music used to be played”
For this phrase it is noteworthy that study participants did not command the grammar in “used to be played”: 70% of all participants were able to transcribe some form of both content words (“use” and “play”), but only 22% were able to transcribe the whole phrase with correct function words and morphemes. Many omitted one or more of the content words (e.g., “used to play,” n=13).
Incorrect transcription (N): Usually play (2); Usually like to play (1); Usually to played (1)
Error analysis: This phrase included four syllables, with a stress on the first and fourth syllables. Like the previous example, the rule of assuming that stressed syllables begin content words resulted in more than one possible interpretation, and these four participants selected an incorrect interpretation that had the same rhythm and vowels, but meant that they transcribed two consonants incorrectly. In addition to the consonants, syntax could have disambiguated this phrase.
Text 1 phrase 1: “Changes take place over time, so we don’t always notice them”
Incorrect transcription (N): We don’t always know this sound (1); We don’t always know the sound (1); We don’t always know this song (1); We don’t always know the change (1); We don’t always know understand (1); We do not always understand (1); So we don’t understand (1); Don’t always don’t the sound (1)
Error analysis: The frequent word-segmentation error represented here is a perception of the second (unstressed) syllable of “notice” as a separate (unstressed) function word. As above, this interpretation follows the basic word-segmentation assumption. Various phonemic changes are associated with this shift in word boundaries, and the results vary in their syntactic and semantic plausibility.
Incorrect transcription (N): We don’t always know this (1); We don’t always know that (3); We don’t always know them (9); Always no them (1); We don’t know (1)
Error analysis: These are similar to the above, except that one syllable is missing, either the unstressed syllable of “notice” or the last function word. It is thus unclear whether they represent word-segmentation errors or a missed word.
Incorrect transcription (N): We don’t know all with them (2)
Error analysis: Here, “always” has been split into two words (and there is a reversal of words/sounds as well).
Incorrect transcription (N): We always listen (1)
Error analysis: Here we see a different segmentation, with the unstressed second syllable of “notice” misperceived as a stressed initial syllable of a different content word (“listen”), along with some phoneme errors.
In most of the clear examples of incorrect word segmentation, participants were found to have maintained the pattern of stressed (unreduced) syllables’ beginning content words. Participants applied a nativelike strategy to segment words, successfully segmenting a great majority of the words they heard. The examples presented here are the clearest incidences of word-segmentation error precisely because they maintain some of the rhythm and phonemes of the original. Less-transparent segmentation errors may underlie other incorrect transcriptions as well.
When listeners misperceive word boundaries, it can cause lasting confusion. For language learners, aural misperception of word boundaries is a more common and longer-lasting phenomenon than for more expert listeners. The learner’s smaller number of known words and uncertainty in phonemic matches can lead to more frequent errors, and a lack of confidence in general comprehension can impede learners’ recognition and correction of previous mistakes in decoding (Field, 2008b).
Instructional Suggestions for Word Segmentation
Dictation: Brief dictation exercises can be an excellent targeted-listening task, as long as the target sentences are spoken with a natural speech rate and style. While maintaining this natural delivery, length, lexical choices, and grammatical complexity can be adjusted to student proficiency levels. Students will practice word segmentation as they listen and transcribe sentences and phrases.
Elicited imitation: This technique is similar to dictation, except that comprehension is displayed via speaking rather than writing. Students listen to phrases spoken naturally and repeat back what they hear. Extremely short phrases may be repeated back phonetically, but with more than a few syllables repetition requires comprehension (see Yan, Maeda, Lv, & Ginther, 2016, for a meta-analysis of elicited imitation as a measure of L2 proficiency).
Paused transcription detectives: With teacher guidance, students can find segmentation errors in their own paused transcription practice and examine the pronunciation differences between the spoken phrase and their transcription, pronouncing and practicing the phrases. They should also examine co-text for semantic or syntactic clues to correct word segmentation.
Phonemes
Research has indicated that word codas are less salient than onsets, and that students have more trouble correctly identifying vowels than consonants (Cross, 2009; Field, 2004; Rost, 2016). The participants in our study did have a tendency to transcribe wrong words beginning with the right sounds, and to transcribe syllables with correct consonants and incorrect vowels. However, we also found opposite examples, in which participants transcribed wrong words ending with the right sounds, and examples in which the vowel was correct but the consonants were inaccurate. Two example phrases are analyzed below.
In the example Text 2 phrase 10, we can see that the /st/ onset of “study” was quite salient, and the final /i/ of the word was also maintained in several of these erroneous transcriptions. The middle of the word was not maintained in any erroneous transcriptions.
For the function word “was,” the first phoneme was maintained in erroneous transcriptions. Participants never mistook this word for a content word, instead substituting other function words beginning with /w/. Both function words in this phrase were often omitted.
Five percent of all participants wrote “down” for “done.” In this case, initial and final consonants were both maintained, but the vowel was not decoded correctly. The erroneous transcription “stone” for “done” may have had some relationship with the /st/ of “study,” but since the full transcription in this case was “stay with stone,” we know that “stone” was an attempt at “done.” The final consonant is correctly decoded, and the middle vowel is similar to the target but still incorrect.
Text 2 phrase 10: “I’d like to tell you about a study that was done”
Target word               Study         That            Was             Done
Incorrect transcriptions  Stay 4        It 1            With 4          down 4
(error, N)                Stiy 1        And 2           Will 1          Stone 1
                          Staied 1      We 1
                          Stains 1      What 1
                          Still 1       The 1
                          Stand 1       Language* 1     Language* 1
                          Story 2       Almost* 1       Almost* 1
                          State 1
                          Outside 1
                          Research 1
Omissions                 16            56              40              30
Correct transcriptions    47            13              30              42
Note. *These two-syllable words seemed to replace both function words.
In the example Text 3 phrase 11, the second word of this phrase,
“wouldn’t,” was the only word with a 0% correct transcription rate in
this study. Forty-two erroneous transcriptions are presented in the
chart. The other 35 participants did not transcribe this word. The great
majority of erroneous transcriptions (39/42) maintain the correct
initial phoneme. Participants who wrote “would” were correct about the
entire first syllable (although the meaning of the sentence will still
be misunderstood), while others were able to transcribe some of the
word-final consonants, for example, “want.”
For “seem,” the most common error was a failure to perceive the
final /m/ sound, resulting in transcriptions of “see,” which indicates
correct perception of the word-initial consonant and the vowel (various
morphological endings added to “see” may have been related to
the application of top-down skills). However, other participants
maintained the word-final consonant but not the vowel (“same”), while
others maintained only the /i/ vowel sound (“think,” “technique”).
More than half of the erroneous transcriptions for the final word
of this phrase, “like,” maintained the correct vowel sound. None
maintained the correct consonants in word-initial or word-final position.
Text 3 phrase 11: “Burning more calories creating a paper than you
guys have too. That wouldn’t seem like”

Target word     That           Wouldn’t       Seem           Like
                Error      N   Error      N   Error      N   Error    N
Incorrect       (Now) I   14   One       13   See*      28   My       6
transcriptions  Then       3   Was        8   Same       3   Why      3
                The        3   Will       5   Think      3   A lot    2
                The        2   Would      5   Say        2   Have     2
                Him        1   Want       4                  Might    2
                It         1   We         3                  Wise     1
                There      1   Can        2                  How      1
                               May        1                  As       1
                               When       1
Omissions                 18             35             14           26
Correct
transcriptions            34              0             27           33

Note. *Some form of “see” (see, seen, sees, seeing).
When students perceive a phoneme incorrectly or ambiguously, it
can lead to identification of the wrong word, as we see in these
examples. Even when it does not lead to incorrect word identification, it can
slow down and complicate aural decoding by introducing additional
competition from “phantom words” (Broersma & Cutler, 2008) into
the process of word recognition. Therefore, teachers should help their
students practice identifying phonemes, focusing as much as possible
on the specific areas where students struggle.
Instructional Suggestions for Phonemes
Vowel/consonant homework: Individual students can work
with phonemes that are difficult for them to distinguish, being
sure to practice with the sounds in a variety of phonetic
contexts. For example, teachers can assign work with
http://www.englishaccentcoach.com/.
Partial dictation: Phrases or sentences are printed with a
blank, and students fill in the missing part. The blanks can be
word codas (e.g., “That woul_____ seem like”), pre-/suffixes
(e.g., “In from larg____ distances”), or word middles (e.g.,
“That wouldn’t s____m like”). It is preferable to concentrate
on one position for the blanks in each short exercise.
Gating and prediction: The teacher can stop the audio text
after the first sound or syllable of a word and have students
predict what the rest might be (e.g., the teacher says, “Food
was raised lo-” and students talk to a partner about what
word might follow, and then they discuss with class). This
activity helps students practice applying top-down skills to
make up for gaps or ambiguities in phoneme perception.
Unknown Words and Phrases
In designing the paused transcription materials, we tried to tar-
get only words that were known to participants to see if they would
decode them in context. However, some unrecognized words and
combinations of words may have been treated as unknown words by
participants. We could infer that this had occurred when participants
wrote letter combinations that did not correspond to any English
word. Here are some examples of single words that appeared to be
unrecognized.
Target word(s)                  Transcriptions
Locally (Text 3 phrase 1:       Recoaly, Ridlly, Grobally, Recloliy, Quackly,
“food was raised locally”)      Workly, Ulgerly, Bigulgle, Locanary, Revly
Distances (Text 3 phrase 6:     Siystances, Digness, Indecnit, Adegescence,
“in from larger distances”)     Destious, Margien
Field (2004) discusses three strategies that learners might select
when they encounter an unknown word in listening. They might take a
phonological approach (attempt to transcribe the sounds they heard),
a lexical approach (attempt to match approximately to a known word),
or a zero approach (no transcription). Each of these approaches has
advantages and disadvantages for learner comprehension. If learn-
ers take a strictly phonological approach, they recognize that a word
has been missed and begin to learn the sounds of the new word, but
they do not take the opportunity to apply schema and make an edu-
cated guess that will support their overall understanding of the text.
If they choose a lexical approach, learners engage actively in trying to
make meaning of the text, but they may forget the provisional nature
of the lexical match and fail to revise their hypothesis when needed.
Field (2004) found that his subjects selected a lexical approach more
frequently than expected, and that lexical matches often were not
semantically appropriate. Finally, a zero approach to new words can
be seen as an instance in which the learner either did not recognize
that another word was spoken or could not remember anything about
that word. These instances may occur when the listener “couldn’t keep
up” with the input, often resulting in a perception that the input was
fast, regardless of its actual speed (see Bloomfield et al., 2015; Goh,
1999). Certainly, increased vocabulary knowledge can help improve
students’ listening comprehension, especially if the vocabulary is well
known in its spoken form (Staehr, 2009; Van Zeeland, 2013; Van Zee-
land & Schmitt, 2012). In fact, aural word recognition in context has
been shown to correlate strongly with general listening comprehen-
sion scores (Matthews & Cheng, 2015).
One of the most difficult phrases for our participants to
transcribe completely was “over an open fire.” It was transcribed with 40%
accuracy, compared to 66-90% accuracy for all other phrases in Text
1. Most participants wrote some words correctly, but very few
transcribed both “over” and “open.” The phrase is a common collocation,
a formulaic expression that may be unfamiliar to many English
language learners.
Text 1 phrase 7: “Instead of cooking over an open fire”

Incorrect transcriptions                     N   Analysis
Open fire                                   20   42 students transcribed “open” but
Cooking (in/with/on) (an/the/0) open fire   10   not “over.”
Cooking (and/or) open (an/the/0) fire        7
Open cooking fire                            1
Open (the/on/a) fire                         4
Cooking over fire                            8   10 students transcribed “over” but
Stopping over the fire                       1   not “open.”
Over and over fire                           1
Cooking over an open fire                    1   Only one student transcribed all
Cooking over and open fire                   1   four words correctly. Two additional
Cooking over open in fire                    1   students transcribed both “over” and
                                                 “open,” but missed the word “an.”

The remaining 22 students omitted both “over” and “open” from
their transcriptions.
Instructional Suggestions for Unknown Words and Phrases
Look up unknown words from listening: Teachers can
dictate sentences that include an unknown word. Students
approximate the spelling to look up the word and compare
meanings to the co-text (Sheppard, 2013). Field (2008b) sug-
gests using proper nouns and even nonwords that conform
to target language phonology in dictation and matching ex-
ercises.
Learn aural forms: Teachers can easily incorporate aural
forms into vocabulary study by having students listen to and
repeat the words, identify syllables and stress, and hear the
target words in the context of phrases and sentences.
Notice new expressions: To encourage students to develop
the habit of noticing and investigating word combinations,
the teacher can pause after speaking or hearing a common
idiom or collocation and ask students to discuss it. Dictation
of common phrases or formulaic expressions can also be
a good method to raise student awareness.
Top-Down Fabrications
In some instances, participant transcriptions had little similarity
to some or all of the four target words, either semantically or
phonetically. Often these phrases were related to previous content from the
audio text. In other cases, learners used the “lexical strategy” for
unrecognized words as described above, selecting a familiar word with
some similar characteristics. In these cases, the resulting phrase often
made sense but did not fit semantically with the co-text. Finally, there
were instances in which participants wrote words or phrases that did
not match the target phonetically but had a similar meaning. These
last instances can be seen as examples of successful application of
top-down skills to repair small gaps in bottom-up processing. Two
example phrases are analyzed below.
Text 3 phrase 6: “Food is shipped in from larger distances”

Incorrect transcriptions, each followed by its error analysis:

Food get different relationship
Food relationship
    The topic of the text is “our relationship with food” and this
    phrase is also part of recent co-text.
Close the relationship
    The phrase “a distant rather than a close relationship” is part of
    less-recent co-text (about 1 minute ago).
Ship to logically places
    Some sounds from “distances” are maintained or nearly maintained
    in “places,” and “logically” has the same initial phoneme as the
    target word. The preposition is completely changed. The phrase
    does not make sense.
In from long distances
    “Long” is a reasonable word for this context. The meaning is not
    changed, even though the participant did not write “larger.” This
    can be seen as a successful semantic interpretation.
Text 3 phrase 1: “They were physically close to it and
psychologically close to it. Food was raised locally”

Incorrect transcriptions, each followed by its error analysis:

Food will increase normally
    Some of the sounds are maintained and some nearly maintained
    (e.g., /i/ for /e/ is a common mishearing), but different, fairly
    sensible word choices are substituted. The phrase makes sense by
    itself but does not fit the co-text.
Food was grown rekoly
The food was grown locally
    Transcription of the third word substitutes a semantically
    sensible alternative for “raised”; in that sense it can be seen as
    successful. In one of the two instances, the last word was not
    recognized (although a number of phonemes are maintained).
The food was reason locally
Food is look locally
    Less-successful substitutions for the third word are seen here. In
    the first instance we see some matching phonemes, and in the
    second perhaps some effect of the following phonemes.
The good was great lovely
The food was lovely
    A different word with several similar sounds is substituted for
    the fourth word of the phrase. In the first example, a
    phonetically similar word is also substituted for “raised.” In the
    second example, “raised” is omitted, leading to a phrase that
    makes sense by itself and could stretch to make sense with the
    co-text so far, but this interpretation will still add challenge
    to interpretation of following co-text.
Food was very (n=3)
    This is a plausible beginning for a sentence in this context, and
    “very” does incorporate some phonemes from both of the words it
    replaces. The missed concept of “locally” will, however, add to
    the challenges of listening in the next sentences.
Applying top-down skills to guess in the face of inadequate de-
coding is a valuable strategy, but learners need to remember that
guesses may need to be revised in light of further input. Mispercep-
tion of words in a key sentence can lead some learners to maintain
incorrect beliefs about the topic of a text even when further co-text
makes it clear that something is wrong. Field (2008b) suggests that
this may occur when learners do not trust their comprehension of
later co-text enough to discard their investment in what they heard
before, especially since they cannot go back and listen again. Thus
teachers should encourage students to use top-down skills to make
guesses but also remind students to revise those guesses as needed.
Instructional Suggestions for Top-Down Fabrications
Monitor comprehension: Students must learn to check their
understanding of the text-so-far for consistency with what
they think they are understanding in the moment. Teachers
can tell stories of their own misunderstandings or give think-
aloud demonstrations to raise awareness of this point. Teach-
ers can make a habit of asking, “How sure are you?,” along
with other comprehension questions, to develop in students
the habit of assessing their own level of certainty.
Making and checking predictions: A teacher can play the
first part of an audio text, then ask students to make
predictions about the topic and main ideas together with a partner
or group, and then play some more of the text and ask
students to discuss whether and in what ways their predictions
were right or wrong. They can also discuss possible reasons
for misunderstandings.
Metacognitive strategy instruction: Teachers can follow
Vandergrift and Goh’s metacognitive pedagogical sequence
(2012), in which learners are taught to (a) plan for listening,
(b) monitor comprehension, (c) solve problems with
comprehension, and (d) evaluate the outcome.
Using Paused Transcription in the Classroom
The process of examining student errors in paused transcriptions
was enlightening to us as teachers, highlighting common errors and
also giving insights into the misperceptions of individuals. It would
likely be similarly enlightening for other classroom teachers to
examine the patterns of error in paused transcriptions from their students.
Using a short text, teachers could deliberately locate pauses to check
students’ perceptions of certain language features as a diagnostic tool.
It may be even more useful (and more practical) for teachers to have
students examine their own results from a paused transcription
exercise. After the listening activity, teachers could post the full text and
ask students to correct their own answers, with instructions to ignore
spelling errors if the correct word was intended. They could then ask
students to count specific kinds of errors, or simply instruct students
to write and share a reflection on a few errors they found interesting,
speculating about why they made those mistakes.
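For teachers who collect responses electronically, the counting step described above could be roughly automated. The following Python sketch is only an illustration and not the procedure used in this study: the function names are hypothetical, words are aligned with the standard-library difflib module (an assumed choice), and spelling variants are not forgiven, so a misspelled but intended word would be counted as a substitution.

```python
from collections import Counter
from difflib import SequenceMatcher


def label_target_words(target: str, attempt: str) -> list[tuple[str, str]]:
    """Label each target word as 'correct', 'substituted', or 'omitted'.

    Words are aligned with difflib, an illustrative choice that is
    not the scoring procedure used in the study.
    """
    target_words = target.lower().split()
    attempt_words = attempt.lower().split()
    matcher = SequenceMatcher(a=target_words, b=attempt_words, autojunk=False)
    labels = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        for offset, word in enumerate(target_words[i1:i2]):
            if tag == "equal":
                labels.append((word, "correct"))
            elif tag == "replace" and j1 + offset < j2:
                # A different word was written in this position.
                labels.append((word, "substituted"))
            else:
                # Nothing was written for this target word.
                labels.append((word, "omitted"))
    return labels


def count_error_types(target: str, attempt: str) -> Counter:
    """Count how many target words fall under each label."""
    return Counter(label for _, label in label_target_words(target, attempt))


if __name__ == "__main__":
    print(label_target_words("over an open fire", "cooking over fire"))
    print(count_error_types("study that was done", "stay with stone"))
```

Because unmatched words are paired purely by position, the labels are only a first pass. For “stay with stone,” the sketch pairs “stone” with “was” and marks “done” as omitted, whereas a human reader would recognize “stone” as an attempt at “done.” Results like these are best treated as prompts for student reflection rather than as final scores.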
We believe that classroom activities involving analysis of paused
transcription exercises can help teachers and students better under-
stand the challenges of L2 listening and provide guidance for class-
room instruction to improve listening skills. We also believe that such
exercises can help develop an attitude of curiosity about errors that
can facilitate student engagement and reduce listener anxiety,
resulting in a more effective listening classroom.
Conclusions
This study suggests that even known words (or words presumed
to be known; see the discussion of limitations below) often are not
successfully decoded by intermediate-level language learners. These
learners are more likely to decode known words when they are part of
a less challenging text. When words drawn from the same list are part
of a more challenging aural text, they are less successfully decoded.
Content words are decoded more successfully than function words, a
finding that confirms results of previous studies. Finally, faster phrases
are not necessarily harder to decode, in spite of students’ perceptions
about speed and listening challenges (Bloomfield et al., 2011; Goh,
1999; Renandya & Farrell, 2011).
The paused transcription methodology used in this study can
provide useful information about what individual students perceive
when they listen. We recommend that teachers and students employ
brief paused transcription exercises in the classroom to analyze
listening perception for strengths and weaknesses, raise awareness,
and possibly guide instruction. Teachers can choose a short,
level-appropriate audio recording and insert 15-second pauses at the end of
several phrases. There is no need to space the pauses equally; varied
intervals are preferred. If inserting pauses in the recording is a
challenge, the teacher can simply plan locations to pause playback at the
ends of phrases. Students listen to the recording, and in each pause
write the last phrase (4-5 words) that was heard. Finally, the resulting
written phrases are compared to a complete transcript of the audio
recording. Teachers can conduct a simple analysis of student results
to decide what kinds of activities would be helpful—for example, by
checking for a few common categories of errors. Students can analyze
their own results to build awareness of their strengths and weaknesses
and to report their analysis to the teacher and receive advice.
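As one possible form of the “simple analysis” suggested above, a teacher could compute the proportion of target words each student wrote and then rank phrases from hardest to easiest for the class. The Python sketch below is a hedged illustration under stated assumptions: the function names are hypothetical, the scoring rule (a target word counts if it appears anywhere in the attempt) is a simplification rather than the scoring used in this study, and the sample responses for the second phrase are invented.

```python
from collections import Counter


def phrase_accuracy(target: str, attempt: str) -> float:
    """Proportion of target words that also appear in the attempt."""
    target_words = Counter(target.lower().split())
    attempt_words = Counter(attempt.lower().split())
    # Multiset intersection: each target word is credited at most as
    # many times as it occurs in the attempt.
    matched = sum((target_words & attempt_words).values())
    return matched / sum(target_words.values())


def rank_phrases(responses: dict[str, list[str]]) -> list[tuple[str, float]]:
    """Mean accuracy per target phrase, hardest phrase first."""
    means = {
        target: sum(phrase_accuracy(target, a) for a in attempts) / len(attempts)
        for target, attempts in responses.items()
    }
    return sorted(means.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical class data; the second phrase's responses are invented.
    responses = {
        "over an open fire": ["open fire", "cooking over fire"],
        "and make new friends": ["and make new friends", "make new friends"],
    }
    for phrase, mean in rank_phrases(responses):
        print(f"{mean:.2f}  {phrase}")
```

Phrases at the top of the ranking are candidates for follow-up activities. Word order and spelling variants are ignored here, so a teacher would still want to read the low-scoring transcriptions directly before choosing an intervention.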
This study had several limitations. First, we presumed that all
research participants were familiar with the 1,000 most common words
of English. While this probably is mostly true, word knowledge does
vary, even among the most common words. For future paused
transcription studies that target known words, this knowledge should be
explicitly tested in a session after the paused transcription session. The
vocabulary test should target auditory knowledge, not just
familiarity with words in their written form. Second, we do not know how
well participants understood the overall message of the three audio
texts used in this study. It would be valuable for future studies on this
topic to include an assessment of overall test comprehension, perhaps
with a control group who did not do paused transcription, so we can
get a better idea of how the paused transcription methodology might
interact with listening processes. Finally, it would have been
interesting to include a measure of participant confidence for each phrase
transcribed. In this study, we cannot distinguish between errors that
are guesses and errors that are strongly believed by the participant.
Suggestions for interventions could be different in these two cases.
In our discussion, we have proposed a variety of activities to help
students improve specific listening skills. Some of these activities are
drawn from the literature, while others are our ideas. More research
is needed on effectiveness of these specific interventions to improve
listening subskills. In the meantime, we suggest only that teachers try
them out and watch carefully for improvements in student listening.
Authors
Beth Sheppard teaches and develops curriculum for ESL listening and
speaking at the University of Oregon and is involved in teacher training.
Brian Butler teaches academic reading and writing for international stu-
dents at the University of Oregon and uses experimental research meth-
ods to explore and explain the functions of the English article system.
References
Bloomeld, A., Wayland, S., Rhoades, E., Blodgett, A., Linck, J., &
Ross, S. (2011). What makes listening dicult? Factors aecting
second language listening comprehension. College Park: University
of Maryland Center for Advanced Study of Language (CASL).
Broersma, M., & Cutler, A. (2008). Phantom word activation in L2.
System, 36(1), 22-34.
Brunfaut, T., & Revesz, A. (2015). The role of task and listener
characteristics in second language listening. TESOL Quarterly, 49(1),
141-168.
Cobb, T. (n.d.) Web vocabprofile [an adaptation of Heatley, Nation, &
Coxhead. (2002). Range]. Retrieved from
http://www.lextutor.ca/vp/
Cross, J. (2009). Diagnosing the process, text, and intrusion problems
responsible for L2 listeners’ decoding errors. Asian EFL Journal,
11(2), 31-53.
Cutler, A. (2012). Native listening: Language experience and the recog-
nition of spoken words. Cambridge, MA: MIT Press.
Estes, R. (2014). Lexical segmentation in L2 Spanish listening (Doctoral
dissertation). University of California, Davis.
Field, J. (2004). An insight into listeners’ problems: Too much bottom-
up or too much top-down? System, 32(3), 363-377.
Field, J. (2008a). The L2 listener: Type or individual. In Working papers
in English and applied linguistics in honour of Gillian Brown (pp.
11-32). Cambridge, England: RCEAL.
Field, J. (2008b). Revising segmentation hypotheses in first and
second language listening. System, 36(1), 35-51.
Field, J. (2008c). Bricks or mortar: Which part of the input does a sec-
ond language listener rely on? TESOL Quarterly, 42(3), 411-432.
Field, J. (2008d). Listening in the language classroom. Cambridge, Eng-
land: Cambridge University Press.
Field, J. (2011). Into the mind of the academic listener. Journal of Eng-
lish for Academic Purposes, 10(2), 102-112.
Goh, C. (1999). How much do learners know about the factors that
influence their listening comprehension? Hong Kong Journal of
Applied Linguistics, 4(1), 17-40.
Goh, C. (2000). A cognitive perspective on language learners’ listen-
ing comprehension problems. System, 28(1), 55-75.
Herron, D., & Bates, E. (1997). Sentential and acoustic factors in the
recognition of open- and closed-class words. Journal of Memory
and Language, 32(2), 217-239.
Liu, N-f. (2002). Processing problems in L2 listening comprehension of
university students in Hong Kong (Unpublished doctoral
dissertation). Hong Kong Polytechnic University.
Matthews, J., & Cheng, J. (2015). Recognition of high frequency words
from speech as a predictor of L2 listening comprehension. Sys-
tem, 52, 1-13.
Renandya, W., & Farrell, T. (2011). Teacher, the tape is too fast! Exten-
sive listening in ELT. ELT Journal, 65(1), 52-59.
Revesz, A., & Brunfaut, T. (2012). Text characteristics of task input
and difficulty in second language listening comprehension.
Studies in Second Language Acquisition (SSLA), 35(1), 31-65.
Rost, M. (2016). Teaching and researching listening (3rd ed.). Abing-
don, England: Routledge.
Siegel, J. (2014). Exploring L2 listening instruction: Examinations of
practice. ELT Journal, 68(1), 22-30.
Siegel, J., & Siegel, A. (2015). Getting to the bottom of L2 listening
instruction: Making a case for bottom-up activities. SSLLT, 5(4).
Retrieved from
http://pressto.amu.edu.pl/index.php/ssllt/article/view/4322/4386
Sheppard, B. (2013). Defining unknown words from listening.
ORTESOL Journal, 30, 33-34.
Staehr, L. (2009). Vocabulary knowledge and advanced listening com-
prehension in English as a foreign language. Studies in Second
Language Acquisition (SSLA), 31(4), 577-607.
Tsui, A. B. M., & Fullilove, J. (1998). Bottom-up or top-down as a
discriminator of L2 listening performance. Applied Linguistics,
19(4), 432-451.
Van Zeeland, H. (2013). L2 vocabulary knowledge in and out of con-
text: Is it the same for reading and listening? Australian Review of
Applied Linguistics, 36(1), 52-70.
Van Zeeland, H., & Schmitt, N. (2012). Lexical coverage in L1 and
L2 listening comprehension: The same or different from reading
comprehension? Applied Linguistics, 34(4), 457-479.
Vandergri, L. (2004). Listening to learn or learning to listen? Annual
Review of Applied Linguistics, 24, 3-25.
Vandergri, L., & Goh, C. C. M. (2012). Teaching and learning second
language listening: Metacognition in action. New York, NY: Rout-
ledge.
West, M. (1953). A general service list of English words. London, Eng-
land: Longman, Green.
Yan, X., Maeda, Y., Lv, J., & Ginther, A. (2016). Elicited imitation as a
measure of second language proficiency: A narrative review and
meta-analysis. Language Testing, 33(4), 497-528.
Zielinski, B. (2008). The listener: No longer the silent partner in
reduced intelligibility. System, 36, 69-84.
Appendix A
Target Phrases
Target phrase               # content   # function   Syllables/
                            words       words        second
Text 1
don’t always notice them    3           1            3.628
and make new friends        3           1            4.540
most of the dances          2           2            3.918
never done for money        3           1            7.075
used to be played           1           3            4.449
might see a woman           2           2            4.122
over an open fire           2           2            5.076
still a special time        3           1            3.381
women wore long dresses     4           0            3.363
part of our lives           2           2            3.902
think is beautiful today    3           1            4.079
like in the future          1           3            4.575
Text 2
to have an opinion          2           2            6.263
direction of their lives    2           2            4.323
women must now decide       3           1            4.199
to stay at home             2           2            4.188
it is no longer             2           2            4.878
to take into account        2           2            4.598
We knew that men            2           2            4.624
outside of the home         2           2            4.255
to be about equal           2           2            5.391
study that was done         2           2            4.621
women in both groups        3           1            4.990
let me repeat that          2           2            5.519
Text 3
food was raised locally     3           1            3.417
person or one step          3           1            3.455
True in earlier days        3           1            4.648
than a close relationship   2           2            4.593
you can see that            1           3            5.900
in from larger distances    2           2            4.645
where it came from          1           3            4.154
that story is something     2           2            5.045
you probably know this      2           2            5.618
later in the class          2           2            5.464
that wouldn’t seem like     2           2            6.410
go across the room          2           2            6.024
Total                       80          64
Appendix B
Tables of Statistics
Table 1
Descriptive Statistics for Transcription Accuracy
by Student Level and Text Level

               Midlevel students   Upper-level students   Total
               (n = 144)           (n = 144)              (n = 288)
Variable       M      SD           M      SD              M      SD
Text level 1   0.66   0.22         0.84   0.19            0.75   0.22
Text level 2   0.53   0.23         0.75   0.20            0.64   0.24
Text level 3   0.42   0.28         0.61   0.27            0.51   0.29
Total          0.54   0.26         0.73   0.24            0.63   0.27
Table 2
Student Level by Text-Level Analysis of Variance Summary Table

Source                       df   SS        MS      F
Student level                 1      2.68    2.68   48.80*
Text level                    2      2.72    1.36   24.76*
Student level * Text level    2      0.02    0.01    0.18
Error                        30   1668.00   55.60
Total                        35   5793.00

Note. *p < .05.
... A growing body of literature has recognized the significance of diagnosing listening errors in building listening skills (e.g. Cho, 2021;Light & Tephens, 2011;Masako, 1984;Sheppard & Butler, 2017;Wong et al., 2021;Yang & Kang, 2020). These studies demonstrate that identifying errors made by learners is a valuable aid to both learners and teachers to have a better understanding of learners' listening problems, particularly the intricate nature of the listening process. ...
... Given the significance of error diagnosis in listening development, a plethora of research from different contexts has been carried out to gain a better understanding of the listening process based on the analysis of these errors (e.g. Cho, 2021;Light & Tephens, 2011;Lu, 2020;Masako, 1984;Sheppard & Butler, 2017;Wong et al., 2021;Yang & Kang, 2020;Zhang, 2014). One of the earliest investigations was conducted by Masako (1984) in Japan with 55 sophomores. ...
... Later on, most of the researchers in this domain utilized dictation as a tool to collect data for error analysis but with diverse listening materials. Some used passages (Cho, 2021;Lu, 2021;Light & Tephens, 2011); some used sentences or chunks (Yang & Kang, 2020;Wong et al., 2021), and some used conversations (Lu, 2021;Sheppard & Butler, 2017). At chunk level, Yang and Kang (2020) allowed 22 Korean university students to complete self-annotated transcription and note down their listening difficulties. ...
Article
Full-text available
Research has shown that error analysis (EA) can be a valuable tool for linguistic scholars to collect useful information on second language (L2) acquisition. In the domain of L2 listening development, identifying patterns of learners' erroneous output allows both teachers and students to have an overview of learners' listening problems. On this premise, remedial actions can be taken for the achievement of effective listening comprehension. Having said that, the goal of this study is to investigate the common types of listening errors made by 12 EFL university students at a private university in Vietnam. Specifically, the study attempts to seek an understanding of how listeners process speech at chunk level and how their listening transcriptions reflect their listening processes. Sixty chunks extracted as a separate clip from 15 dialogues with basic features of the oral language were embedded in listening tasks on a self-access online platform. Error analysis of a total of 720 transcriptions reveals that chunks containing errors occupied 44 percent, suggesting that at the chunk level, students still struggled to construct the meanings of the aural input. Major listening errors identified are related to sound misperception, including confusion, omission, addition, and misformation. The findings of this study stress the significance of respecting learners' meaning-making mechanisms in the listening process by giving listeners more control in accessing listening materials. On top of that, it highlights the priority of listening at the chunk level without contextual clues at the earlier stages of listening, which can be a head start for their listening development. More implications for language teachers and researchers in listening are also discussed.
... Investigations of L2 learners' capacities to segment and extract meaning from samples of connected speech suggest that phonological modification is strongly associated with listening ability (Field, 2008a;Lange, 2018). For example, Sheppard and Butler (2017) used paused transcription tasks to investigate the capacity of 77 L2 learners to segment strings of four or five words in connected speech. Results indicated that only 67% of the words were correctly transcribed. ...
... In order to ensure that the target phrases were representative of authentic language in connected speech, each phrase was designed to contain one of three types of phonological modification: reduced function words, transitions between words (i.e., assimilation and elision) or linking (i.e., liaison). These categories of co-articulation are known to be problematic for L2 learners (Sheppard & Butler, 2017;Wong et al., 2017). The target item length was set at three words to reduce the difficulty of the transcription task while adequately representing phonological modification occurring between words. 2 ...
Article
Full-text available
The capacity to perceive and meaningfully process foreign or second language (L2) words from the aural modality is a fundamentally important aspect of successful L2 listening. Despite this, the relationships between L2 listening and learners' capacity to process aural input at the lexical level has received relatively little research focus. This study explores the relationships between measures of aural vocabulary , lexical segmentation and two measures of L2 listening comprehension (i.e., TOEIC & Eiken Pre-2) among a cohort of 130 tertiary level English as a foreign language (EFL) Japanese learners. Multiple regression modelling indicated that in combination, aural knowledge of vocabulary at the first 1,000-word level and lexical segmentation ability could predict 34% and 38% of total variance observed in TOEIC listening and Eiken Pre-2 listening scores respectively. The findings are used to provide some preliminary recommendations for building the capacity of EFL learners to process aural input at the lexical level.
... This practice could have been inspired in the idea of deep listening, which consisted in involving students in activities that require a high degree of concentration while listening and working through exercises at the same time, mainly dictation activities (Clark, 1993). This activity would be classified as bottom-up, focusing on the recognition of words and the message conveyed rather than on interpretations of the message -Topdown - (Sheppard & Butler, 2017). In words of Hoskins, Sasaki and Johnson (2004:3), the deep listening dictation exercise involves "transcribing while listening to an academic lecture followed by carefully correcting the transcription and then listening reflectively while reading the self-corrected transcription". ...
... In fact, it has been previously used by other scholars, who invited their students to listen to music and transcribe songs in the target language (Clark, 1993;Hoskins, Sasaki & Johnson, 2004;Rowe, 2012). This involved deep listening, which require a high degree of concentration in tasks that could be classified as bottom-up (Clark, 1993;Sheppard & Butler, 2017). As it has been illustrated, the transcribing approach mainly helps learners enhance their listening skills and acquire new language forms (Lynch, 2001;Stillwell et al., 2010); although it could also have some benefits on the other language skills. ...
Book
Full-text available
As in the previous volumes, this one is divided into two parts, discourse analysis and language teaching and learning, the two main pillars of applied linguistics. It brings together nine chapters covering different areas such as the discourse of comics, data mining, linguistic variation, active methodologies, the integration of digital technologies, translation, and computer-assisted learning, among others.
... However, in practice this is problematic, as it is often difficult to pinpoint the origin of a comprehension breakdown. Sheppard and Butler (2017) do this for bottom-up problems, and it is a fascinating study; but how feasible is it in a real-life classroom? How do we know that we are not making wrong assumptions about why learners did not understand? ...
Article
Full-text available
The authors conduct a duoethnographic exploration of listening pedagogy relating to authentic listening courses they taught in Italy and Japan respectively. Themes explored include how authenticity is operationalised and how it relates to the politics of text selection. Whether the Comprehension Approach (CA) (Field, 2008) is actually rejected by teachers is examined and discussed in relation to the difficulties and feasibility of teaching listening with a process approach. Learner motivation and how to manage and mitigate demotivation is discussed, while attribution theory (Weiner, 1985) is used to illustrate ways that learners may be taught to approach difficulty in texts. Additionally, feelings of 'impostor syndrome' and the generalizability of listening research to classroom instruction are considered. Implications relate to the accessibility of research to teachers, and whether partially implemented research recommendations are pedagogically viable. The duoethnography concludes by noting the potential of learner autonomy in mitigating instruction time constraints, the conflicts between skill instruction and listening for language acquisition and the possibilities of attribution theory for improved toleration of listening difficulties. The viability or otherwise of a process approach to listening instruction is discussed but left unsolved.
... For L2 listeners however, identifying words in connected speech is particularly difficult (Field, 2008b) as their ability to process aural input is not automatized to the same level as it is for L1 listeners (see Segalowitz, 2010, for details). Aural decoding, as is defined by Sheppard and Butler (2017), includes recognizing phonemes, locating word boundaries, and segmenting speech streams into words. According to Field (2008a), decoding means to translate "the speech signal into speech sounds, words and clauses, and finally into literal meaning" (p. ...
Article
Research on second language (L2) learners’ aural decoding, a bottom-up process in listening comprehension, has not been given enough attention. To help researchers and teachers understand the aural decoding process, this study investigates the relationship between aural decoding and L2 listening comprehension involving 42 second-year students majoring in English in a Chinese university. Findings indicate that: 1) there was a strong correlation between aural decoding and L2 listening comprehension (r = 0.69, p < .01); 2) a threshold of aural decoding of around 80% of decoding scores may lead to good L2 listening comprehension; 3) aural decoding scores may predict 46.9% of the variance in L2 listening comprehension. Additionally, results show that the most frequent decoding errors tend to be those which have no similarities to the input and those which are phonetically similar to the input. The most common reason leading to such errors seems to be that learners encountered words they had never heard before. Implications for pedagogical practice in L2 classrooms are discussed.
Article
This study employed cognitive diagnostic modeling to examine whether learners' performance on the common subskills of listening and reading varied across modalities and performance levels, aiming to provide a better understanding of the similarities and differences between listening and reading in the Chinese EFL context. Specifically, we retrofitted a large-scale EFL test taken by 797 non-English-major undergraduates. We utilized the G-DINA package in R to obtain test takers’ mastery patterns of global and local subskills in the listening and reading tests and further compared them through a mixed-design ANOVA. The results showed that the comprehension subskills were manifested similarly in listening and reading, but a modality effect did exist. Learners generally performed worse in listening and their mastery status of local and global skills was significantly different across modalities in that learners fulfilled global tasks better in listening and local tasks better in reading. The high-performing group mastered global skills better in listening and local skills better in reading while the low-performing group mastered global skills better in both listening and reading. The findings of the study provide backing for a modality effect in L2 comprehension, encouraging comprehension theorists and language teachers to reconsider the value of the modality-specific characteristics.
Article
Current second language (L2) listening research has lacked detailed accounts of L2 listeners’ difficulties comprehending texts comprising orthographically known lexis. In the current study, 15 first language (L1) Japanese English language learners of three English proficiency levels listened to sentences and a narrative text. A two‐task diagnostic procedure using L1 recalls and L2 repetitions was employed to understand how orthographically known lexis was often misinterpreted over the course of multiple listening opportunities. Evidence from transcripts showed that the factors likely causing listening comprehension difficulty were L1 phonological influence, English connected speech modifications, and misinterpretation of top‐down contextual information. The study results show that even texts comprising high‐frequency vocabulary or other orthographically known lexis can be persistently difficult for L2 listeners to comprehend. The results thus challenge some current assumptions in L2 listening literature about the comprehensibility of texts with high‐frequency vocabulary or orthographically known lexis.
Article
In L2 listening research, there is debate over the roles of bottom-up and top-down processing in comprehension. However, little previous research has examined the relationship between listeners' ability to aurally decode individual words (an aspect of bottom-up processing) and their overall comprehension of listening texts (believed to be the result of both bottom-up and top-down processes). Additionally, the extent to which L2 listeners use context to compensate for aural decoding difficulties is unclear. In this study, 25 intermediate to advanced learners of Spanish (L1 English) listened to excerpts of three stories in Spanish. They completed a recall protocol in their L1 and a transcription task in the L2 for each story. Results indicate that good comprehension (defined as a score of ≥70% on the recall protocol) was sometimes achieved when participants correctly decoded as few as 73.5% of the words in the story, but only scores of ≥93.0% on the transcription task were consistently associated with good comprehension, which provides preliminary evidence suggesting a strong relationship between accurate aural decoding and good comprehension. Additionally, only 16.6% of aural decoding errors were semantically appropriate to the context, which suggests a limited role for context in participants’ decoding of the input.
Article
Full-text available
This paper argues for the incorporation of bottom-up activities for English as a foreign language (EFL) listening. It discusses theoretical concepts and pedagogic options for addressing bottom-up aural processing in the EFL classroom as well as how and why teachers may wish to include such activities in lessons. This discussion is augmented by a small-scale classroom-based research project that investigated six activities targeting learners' bottom-up listening abilities. Learners studying at the lower-intermediate level of a compulsory EFL university course were divided into a treatment group (n = 21) and a contrast group (n = 32). Each group listened to the same audio material and completed listening activities from an assigned textbook. The treatment group also engaged in a set of six bottom-up listening activities using the same material. This quasi-experimental study used dictation and listening proficiency tests before and after the course. Between-group comparisons of t-test results of dictation and listening proficiency tests indicated that improvements for the treatment group were probably due to the BU intervention. In addition, results from a posttreatment survey suggested that learners value explicit bottom-up listening instruction.
Article
Full-text available
Elicited imitation (EI) has been widely used to examine second language (L2) proficiency and development and was an especially popular method in the 1970s and early 1980s. However, as the field embraced more communicative approaches to both instruction and assessment, the use of EI diminished, and the construct-related validity of EI scores as a representation of language proficiency was called into question. Current uses of EI, while not discounting the importance of communicative activities and assessments, tend to focus on the importance of processing and automaticity. This study presents a systematic review of EI in an effort to clarify the construct and usefulness of EI tasks in L2 research. The review underwent two phases: a narrative review and a meta-analysis. We surveyed 76 theoretical and empirical studies from 1970 to 2014, to investigate the use of EI in particular with respect to the research/assessment context and task features. The results of the narrative review provided a theoretical basis for the meta-analysis. The meta-analysis utilized 24 independent effect sizes based on 1089 participants obtained from 21 studies. To investigate evidence of construct-related validity for EI, we examined the following: (1) the ability of EI scores to distinguish speakers across proficiency levels; (2) correlations between scores on EI and other measures of language proficiency; and (3) key task features that moderate the sensitivity of EI. Results of the review demonstrate that EI tasks vary greatly in terms of task features; however, EI tasks in general have a strong ability to discriminate between speakers across proficiency levels (Hedges’ g = 1.34). Additionally, construct, sentence length, and scoring method were identified as moderators for the sensitivity of EI. 
Findings of this study provide supportive construct-related validity evidence for EI as a measure of L2 proficiency and inform appropriate EI task development and administration in L2 research and assessment.
Article
Full-text available
Factors affecting second language listening comprehension ... understand more of what they hear when they are listening to their non-native language. Further, listeners who effectively use metacognitive strategies (that is, those who are aware of and use effective strategies, such as avoiding mental translation) demonstrate better L2 listening comprehension. In addition to these general cognitive abilities, a number of factors pertaining to experience with the L2 influence listening skill. These factors include the amount of prior exposure to the language, familiarity with and an ability to understand the non-native language's phonology, vocabulary size, and background knowledge about the topic, text, structure, schema, and culture. Familiarity with the L2 changes the extent to which the L2 listener uses top-down or bottom-up strategies in listening. For example, expert listeners ... Purpose: To establish what is currently known about factors that affect foreign language listening comprehension, with a focus on characteristics of the listener, passage, and testing conditions. Conclusions: Research on second language (L2) listening comprehension strongly supports the importance of a number of factors; for example, a listener's working memory capacity or the density of information in a passage. Much of the research, however, reports weak or inconclusive results, leaving the importance of many factors and interactions among factors unresolved and in need of further investigation. Relevance: Identifying the factors that affect L2 listening comprehension will help Defense Language Institute Proficiency Test (DLPT) designers anticipate how qualities of created materials and selected authentic materials will impact listening comprehension.
Book
This book challenges the orthodox approach to the teaching of second language listening, which is based upon the asking and answering of comprehension questions. It critically examines the practices and assumptions associated with this approach and suggests ways of revising them. The book's central argument is that a preoccupation with the notion of 'comprehension' has led teachers to focus on the product of listening in the form of answers to questions, ignoring the listening process itself. The author provides an informed account of the psychological processes which make up the skill of listening, and analyses the characteristics of the speech signal from which listeners have to construct a message. Drawing upon this information, the book proposes a radical alternative to the comprehension approach and provides for intensive small-scale practice in aspects of listening that are perceptually or cognitively demanding for the listener.
Article
This reader-friendly text, firmly grounded in listening theories and supported by recent research findings, offers a comprehensive treatment of concepts and knowledge related to teaching second language (L2) listening, with a particular emphasis on metacognition.
Article
The vast majority of second language (L2) vocabulary research focuses on learners' knowledge of isolated word forms. However, it is unclear to what extent this knowledge can be used as an indicator of knowledge in context (i.e. reading and listening). This study aims to shed light on this issue by comparing ESL learners' knowledge of the meaning of isolated words ('decontextual knowledge') with their knowledge of the same words in both reading and listening ('contextual knowledge'). Decontextual knowledge was measured in a free recall interview. Contextual knowledge was measured through a task in which participants paraphrased sentences containing the target items from both a written and spoken narrative. Results showed that learners' decontextual and contextual knowledge agreed in 65% of the cases. This indicates a considerable gap between the two, and emphasises that scores on decontextualised vocabulary tests should not be used as predictors of learners' vocabulary knowledge in context. In addition, learners demonstrated significantly better knowledge of word meaning in the reading than in the listening mode, which may be due to processing difficulties in listening as well as better inferencing opportunities in reading. Two additional factors found to affect both decontextual and contextual knowledge are word frequency and learners' vocabulary size.
Article
An argument that the way we listen to speech is shaped by our experience with our native language. © 2012 Massachusetts Institute of Technology. All rights reserved.
Article
This paper contributes to L2 listening pedagogy by exploring listening instruction and examining teachers’ authentic listening lessons. Listening instruction has yet to be investigated systematically, and the literature has typically relied on anecdotal and intuitive accounts of what takes place in listening lessons. Therefore, this paper reports on a practical investigation into listening pedagogy through a review of 30 listening lessons taught and recorded by ten EFL instructors in Japan. Lesson content was transcribed and coded according to a priori categories informed by the literature. These categories included, among others, comprehension questions, bottom-up listening activities, and metacognitive listening strategies. Results revealed some teachers using a range of techniques while others limited their teaching to product-based approaches. The paper provides empirical descriptions of L2 listening instruction in practice and discusses pedagogic implications stemming from the results, including suggestions for how language teachers can expand their repertoires for the teaching of listening.