Article

Efficiency of spoken word recognition slows across the adult lifespan


Abstract

Spoken word recognition is a critical hub during language processing, linking hearing and perception to meaning and syntax. Words must be recognized quickly and efficiently as speech unfolds to be successfully integrated into conversation. This makes word recognition a computationally challenging process even for young, normal-hearing adults. Older adults often experience declines in hearing and cognition, which could be linked by age-related declines in the cognitive processes specific to word recognition. However, it is unclear whether changes in word recognition across the lifespan can be accounted for by hearing or domain-general cognition. Participants (N = 107) responded to spoken words in a Visual World Paradigm task while their eyes were tracked to assess the real-time dynamics of word recognition. We examined several indices of word recognition from early adolescence through older adulthood (ages 11-78). The timing and proportion of eye fixations to target and competitor images reveal that spoken word recognition became more efficient through age 25 and began to slow in middle age, accompanied by declines in the ability to resolve competition (e.g., suppressing sandwich to recognize sandal). There was a unique effect of age even after accounting for differences in inhibitory control, processing speed, and hearing thresholds. This suggests a limited age range where listeners perform at their peak.
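To make the fixation-based indices concrete: analyses of this kind typically fit a parametric function to the proportion of target fixations over time, and test age effects on the fitted parameters. The sketch below (Python, simulated data; the function form and parameter names are illustrative assumptions, not the authors' code) shows how a four-parameter logistic yields a per-listener crossover (timing) and slope (activation rate); a later crossover or lower asymptote would index less efficient recognition.

```python
# Minimal sketch: fit a four-parameter logistic to target-fixation
# proportions from a visual world task to extract timing and rate indices.
# Data are simulated; this is illustrative, not the authors' analysis code.
import numpy as np
from scipy.optimize import curve_fit

def logistic4(t, lower, upper, crossover, slope):
    # Lower/upper asymptotes, crossover time (ms), and slope at crossover.
    return lower + (upper - lower) / (1.0 + np.exp(-slope * (t - crossover)))

# One simulated listener, fixation proportion sampled every 20 ms.
t = np.arange(200, 2000, 20.0)
rng = np.random.default_rng(0)
truth = logistic4(t, 0.05, 0.90, 800.0, 0.012)
observed = np.clip(truth + rng.normal(0, 0.03, t.size), 0.0, 1.0)

params, _ = curve_fit(logistic4, t, observed, p0=[0.05, 0.9, 700.0, 0.01])
lower, upper, crossover, slope = params
print(f"crossover = {crossover:.0f} ms, slope = {slope:.4f}, "
      f"asymptote = {upper:.2f}")
```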


... One prominent account suggests that social engagement plays a crucial role in maintaining cognitive function [21][22][23][24][25][26][27][28][29] . However, difficulties in language processing, which declines with age 30 , could compound the impact of hearing loss on social disengagement; alternatively, strong language skills could buffer the functional consequences of hearing loss. ...
... While the VWP can characterize many aspects of word recognition 62,63 , this variant—which focuses on competition among candidates—has been influential because of its ability to capture the most important mechanism that undergirds most theories of word recognition: competition 15,64,65 . It has been in wide use across populations including children 66,67 , older adults 30 , people with developmental language disorder 61 , and multilinguals 68 , as well as NH listeners in challenging conditions 17,18 . Thus, it offers a consensus diagnostic of how the competition process that underlies word recognition varies across listeners. ...
... However, their small samples precluded any analysis of individual differences that could link these profiles to outcomes (e.g., to determine if it is beneficial to wait-and-see) or identify factors that lead listeners to sustain activation or wait and see. The present study thus incorporated a larger and thoroughly characterized sample of CI users (N = 101), alongside new analyses of a previously reported lifespan sample of listeners without major hearing loss (N = 107) 30 , to address three questions. ...
Article
Full-text available
Word recognition is a gateway to language, linking sound to meaning. Prior work has characterized its cognitive mechanisms as a form of competition between similar-sounding words. However, it has not identified dimensions along which this competition varies across people. We sought to identify these dimensions in a population of cochlear implant users with heterogeneous backgrounds and audiological profiles, and in a lifespan sample of people without hearing loss. Our study characterizes the process of lexical competition using the Visual World Paradigm. A principal component analysis reveals that people’s ability to resolve lexical competition varies along three dimensions that mirror prior small-scale studies. These dimensions capture the degree to which lexical access is delayed (“Wait-and-See”), the degree to which competition fully resolves (“Sustained-Activation”), and the overall rate of activation. Each dimension is predicted by different auditory skills and demographic factors (onset of deafness, age, cochlear implant experience). Moreover, each dimension predicts outcomes (speech perception in quiet and noise, subjective listening success) over and above auditory fidelity. Higher degrees of Wait-and-See and Sustained-Activation predict poorer outcomes. These results suggest the mechanisms of word recognition vary along a few underlying dimensions which help explain variable performance among listeners encountering auditory challenge.
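A hedged sketch of the dimensionality-reduction step this abstract describes: per-listener indices derived from competition curves are entered into a principal component analysis, and each listener gets a score on each resulting dimension. The column meanings and data below are invented for illustration; only the PCA structure reflects the abstract.

```python
# Illustrative PCA over per-listener VWP indices (simulated, not real data).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# One row per listener; columns are hypothetical curve-derived indices,
# e.g. target crossover, target slope, competitor peak height,
# competitor peak time, and final competitor offset level.
indices = rng.normal(size=(101, 5))

z = StandardScaler().fit_transform(indices)   # standardize before PCA
pca = PCA(n_components=3)
scores = pca.fit_transform(z)                 # per-listener dimension scores
print("variance explained:", pca.explained_variance_ratio_)
print("loadings:\n", pca.components_)         # interpret dimensions from these
```

In the study, the loadings are what license labels like Wait-and-See or Sustained-Activation; the scores can then predict outcomes in a regression.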
... Rigler et al. (2015) also found that both 9- and 16-year-olds eventually achieved the same level of asymptotic fixations, suggesting no development in competition resolution during this window. These results also push the developmental window up through age 16 (see also McMurray et al., 2018), and later work suggests as late as 25 (Colby & McMurray, 2023). Apfelbaum et al. (2023) replicated the Rigler et al. (2015) study but with a finer division of ages (three age groups, from 7 to 15 years of age). ...
... We used this approach to address four research questions. 1) What is the time-course and nature of the development of wordform recognition? Although prior studies suggest that resolving phonological competition continues to develop until age 25 (Apfelbaum et al., 2023; Colby & McMurray, 2023; Rigler et al., 2015), these experiments used a small age range and short stimuli. Therefore, we built on this work with a wider age sample and with longer, bisyllabic stimuli which include more phonetic overlap. ...
Article
Prior research suggests that real-time phonological competition processes are stabilized in early childhood (Fernald et al., 2006). However, recent work suggests that development of these processes continues throughout adolescence (Huang & Snedeker, 2011; Rigler et al., 2015). This study aimed to investigate whether these developmental changes are based solely within the lexical system or are due to domain-general changes. This study also aimed to investigate the development of real-time lexical-semantic activation. We captured semantic activation, phonological competition, and non-linguistic domain-general processing skills using two Visual World Paradigm experiments in 43 7-9-year-olds, 42 10-13-year-olds, and 30 16-17-year-olds. Older children were quicker to fixate the target word and exhibited earlier onset and offset of fixations to both semantic and phonological competitors. Visual/cognitive skills explained significant, but not all, variance in the development of these effects. Developmental changes in semantic activation were largely attributable to changes in phonological processing. These results suggest that the concurrent development of linguistic processes and broader visual/cognitive skills leads to developmental changes in real-time phonological competition, while semantic activation is stable across these ages.
... This pattern has been colloquially termed sustained activation [93,96,100]. Elderly listeners, even with NH, also show evidence of this [101], and priming studies show the same in young adult NH listeners exposed to diverse dialects [102]. ...
Preprint
Speech processing requires listeners to map temporally unfolding input to words. There has been consensus around the principles governing this process: lexical items are activated immediately and incrementally as speech arrives, perceptual and lexical representations rapidly decay to make room for new information, and lexical entries are temporally structured. In this framework, speech processing is tightly coupled to the temporally unfolding input. However, recent work challenges this: low-level auditory and higher-level lexical representations do not decay but are retained over long durations; speech perception may require encapsulated memory buffers; lexical representations are not strictly temporally structured; and listeners can delay lexical access substantially in some circumstances. These findings argue for a deep revision to models of word recognition.
... In fact, half of our sample scored above the maximum LexTALE score reported in the Sarrett study. Inability to efficiently resolve lexical competition has been associated with language disorders (McMurray et al., 2010) and aging (Colby & McMurray, 2023). Therefore, lasting coactivation at ...
Article
Full-text available
During spoken word recognition, words that are related phonologically (e.g., dog and dot) and words that are related semantically (e.g., dog and bear) are known to become active within the first second of word recognition. The time course of activation and resolution of these competing words changes as a function of linguistic knowledge. This preregistered study aimed to examine how a less commonly used linguistic predictor, percent lifetime language exposure, affects the time course of target and competitor activation in an eye-tracking visual world paradigm. Lifetime exposure was expected to capture variability in the representations and processes that contribute to individual differences in spoken word recognition. Results show that, when lifetime exposure to French was treated as a continuous measure, more lifetime exposure was related to target fixations and slightly related to early phonological coactivation, but not related to semantic coactivation. These analyses demonstrate how generalized additive mixed models might help examine time course data with more continuous linguistic variables. Exploratory analyses looked at the amount of variance captured by three linguistic experience predictors (lifetime French exposure, recent French exposure, French vocabulary) on indices of target, phonological, and semantic fixations and identified vocabulary size as most frequently explaining significant variance, but the pattern of results did not differ from that of lifetime language exposure. These findings suggest that lifetime language exposure may not fully capture subtle differences in linguistic experience that affect lexical coactivation, such as those brought upon by differences in exposure trajectories across the lifetime or differences in the setting of language exposure.
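The abstract explicitly recommends generalized additive (mixed) models for time-course data with continuous predictors. The original analyses were presumably fit in R (e.g., mgcv); the sketch below uses Python's pygam on simulated data purely to show the model structure: smooths of time and of exposure, plus a time-by-exposure interaction surface. All names and values are illustrative assumptions.

```python
# Illustrative GAM for fixation time-course data with a continuous
# exposure predictor. Simulated data; not the authors' R/mgcv models.
import numpy as np
from pygam import LinearGAM, s, te

rng = np.random.default_rng(2)
n = 4000
time = rng.uniform(0, 1500, n)       # ms from word onset
exposure = rng.uniform(0, 100, n)    # hypothetical % lifetime French exposure
# Toy outcome: fixation proportion rising earlier with more exposure.
y = 1 / (1 + np.exp(-(time - 900 + 3 * exposure) / 150)) \
    + rng.normal(0, 0.05, n)

X = np.column_stack([time, exposure])
# Smooth of time, smooth of exposure, and their interaction surface.
gam = LinearGAM(s(0) + s(1) + te(0, 1)).fit(X, y)
gam.summary()
```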
... The diagnosis of listening problems may be influenced by these skills, as they may decline with age or be disrupted by developmental disorders even in normal-hearing individuals. 16,17 Therefore, single word tasks may serve a valuable role in controlling some of this nonperceptual variability and contributing to research and clinical resources. The Iowa Test of Consonant Perception (ITCP) was recently developed to overcome these concerns. ...
Article
The Iowa Test of Consonant Perception is a single-word closed-set speech-in-noise test with well-balanced phonetic features. The current study aimed to establish a U.K. version of the test (ITCP-B) based on Southern Standard British English. We conducted a validity test in two sessions with 46 participants. The ITCP-B demonstrated excellent test-retest reliability, cross-talker validity, and good convergent validity. These findings suggest that the ITCP-B is a reliable measure of speech-in-noise perception. The test can be used to facilitate comparative or combined studies in the U.S. and U.K. All materials (application and scripts) to run the ITCP-B/ITCP are freely available online.
... This pattern has been colloquially termed sustained activation (Figure 5C, D) [79; 84; 86]. Elderly listeners also show evidence of this sustained activation even with normal hearing [88]. ... [Figure 5: Profiles of word recognition.]
Preprint
Speech understanding requires listeners to map temporally unfolding speech to lexical items. For decades there has been consensus around the principles governing this process: lexical items are activated immediately as the input arrives, acoustic information flows to word recognition in a continuous cascade, perceptual and lexical representations rapidly decay to make room for new information, and lexical entries are temporally ordered. In this view speech processing is tightly coupled to the input (following “left-to-right” processing). However, recent work challenges this view: listeners often revise earlier decisions, they maintain low-level information for some time; lexical access may be delayed in some circumstances; and lexical representations are not fully ordered. These findings argue for a deep revision of models of language processing.
Article
Word recognition is generally thought to be supported by an automatic process of lexical competition, at least in normal hearing young adults. When listening becomes challenging, either due to properties of the environment (noise) or the individual (hearing loss), the dynamics of lexical competition change and word recognition can feel effortful and fatiguing. In cochlear implant users, several dimensions of lexical competition have been identified that capture the timing of the onset of lexical competition (Wait-and-See), the degree to which competition is fully resolved (Sustained Activation), and how quickly lexical candidates are activated (Activation Rate). It is unclear, however, how these dimensions relate to listening effort. To address this question, a group of cochlear implant users (N = 79) completed a pupillometry task to index effort and a visual world paradigm task to index the dynamics of lexical competition as part of a larger battery of clinical and experimental tasks. Listeners who engaged more effort, as indexed by peak pupil size difference score, fell lower along the Wait-and-See dimension, suggesting that these listeners are engaging effort to be less Wait-and-See (or to begin the process of lexical competition earlier). Listeners who engaged effort earlier had better word and sentence recognition outcomes. The timing of effort was predicted by age and spectral fidelity, but no audiological or demographic factors predicted peak pupil size difference. The dissociation between the magnitude of engaged effort and the timing of effort suggests they serve different functions in spoken word recognition.
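For readers unfamiliar with the pupillometry index: a peak pupil size difference score is typically computed by baseline-correcting each trial and taking the maximum of the averaged trace within an analysis window. The sketch below is a generic version under assumed window and baseline choices, not the authors' exact pipeline.

```python
# Generic baseline-corrected peak pupil dilation. Window and baseline
# boundaries are assumptions for illustration; data are simulated.
import numpy as np

def peak_pupil_dilation(trials, times, baseline=(-200, 0), window=(300, 2500)):
    """trials: (n_trials, n_samples) pupil size; times: ms from word onset."""
    base = (times >= baseline[0]) & (times < baseline[1])
    win = (times >= window[0]) & (times <= window[1])
    corrected = trials - trials[:, base].mean(axis=1, keepdims=True)
    return corrected[:, win].mean(axis=0).max()   # peak of the mean trace

times = np.arange(-500, 3000, 20)
rng = np.random.default_rng(3)
# 60 simulated trials with a dilation peak near 1200 ms.
trials = rng.normal(0, 0.02, (60, times.size)) \
    + 0.1 * np.exp(-((times - 1200) / 600) ** 2)
print(f"peak dilation: {peak_pupil_dilation(trials, times):.3f} (a.u.)")
```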
Chapter
Full text available at: https://authors.elsevier.com/a/1jbdyI8Pe%7EOHl
Article
In typical adults, recognizing both spoken and written words is thought to be served by a process of competition between candidates in the lexicon. In recent years, work has used eye-tracking in the visual world paradigm to characterize this competition process over development. It has shown that both spoken and written word recognition continue to develop through adolescence (Rigler et al., 2015). It is still unclear what drives these changes in real-time word recognition over the school years, as there are dramatic changes in language, the onset of reading instruction, and gains in domain general function during this time. This study began to address these issues by asking whether changes in real-time word recognition derive from changes in overall language and reading ability or reflect more general age-related development. This cross-sectional study examined 278 school-age children (Grades 1-3) using the Visual World Paradigm to assess both spoken and written word recognition, along with multiple measures of language, reading and phonology. A structural equation model applied to these ability measures found three factors representing language, reading, and phonology. Multiple regression analyses were used to understand how these three factors relate to real-time spoken and written word recognition as well as a non-linguistic variant of the VWP intended to capture decision speed, eye-movement factors, and other non-language/reading differences. We found that for both spoken and written word recognition, the speed of activating target words in both domains was more closely tied to the relevant ability (e.g., reading for written word recognition) than was age. We also examined competition resolution (how fully competitors were suppressed late in processing). Here, spoken word recognition showed only small developmental effects that were only related to phonological processing, suggesting links to developmental language disorder. However, in written word recognition, competitor resolution showed large impacts of development which were strongly linked to reading. This suggests the dimensionality of real-time lexical processing may differ across domains. Importantly, neither spoken nor written word recognition is fully described by changes in non-linguistic skills assessed with the non-linguistic VWP, and the non-linguistic VWP was linked to differences in language and reading. These findings suggest that spoken and written word recognition continue to develop past the first years of life and are mostly driven by ability and not only by overall maturation.
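A minimal sketch of the measurement model this abstract describes: latent language, reading, and phonology factors estimated from observed measures via SEM. The indicator names and simulated data below are hypothetical stand-ins for the study's battery; only the three-factor structure comes from the abstract.

```python
# Illustrative three-factor measurement model with semopy.
# Indicators are hypothetical; data are simulated so indicators correlate.
import numpy as np
import pandas as pd
from semopy import Model

rng = np.random.default_rng(4)
n = 278
g = rng.normal(size=n)  # shared ability signal
data = pd.DataFrame({name: g + rng.normal(0, 0.8, n) for name in
                     ["vocab", "grammar", "decoding", "fluency",
                      "elision", "blending"]})

desc = """
Language  =~ vocab + grammar
Reading   =~ decoding + fluency
Phonology =~ elision + blending
"""
model = Model(desc)
model.fit(data)
print(model.inspect())   # loadings and factor covariances
```

In the study proper, the resulting factor scores would then enter regressions predicting the real-time VWP indices.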
Article
Full-text available
As a spoken word unfolds over time, similar sounding words (cap and cat) compete until one word "wins". Lexical competition becomes more efficient from infancy through adolescence. We examined one potential mechanism underlying this development: lexical inhibition, by which activated candidates suppress competitors. In Experiment 1, younger (7-8 years) and older (12-13 years) children heard words (cap) in which the onset was manipulated to briefly boost competition from a cohort competitor (cat). This was compared to a condition with a nonword (cack) onset that would not inhibit the target. Words were presented in a visual world task during which eye movements were recorded. Both groups showed less looking to the target when perceiving the competitor-splice relative to the nonword-splice, showing engagement of lexical inhibition. Exploratory analyses of linguistic adaptation across the experiment revealed that older children demonstrated consistent lexical inhibition across the experiment and younger children did not, initially showing no effect in the first half of trials and then a robust effect in the latter half. In Experiment 2, adults also displayed consistent lexical inhibition in the same task. These findings suggest that younger children do not consistently engage lexical inhibition in typical listening but can quickly bring it online in response to certain linguistic experiences. Computational modeling showed that age-related differences are best explained by increased engagement of inhibition rather than growth in activation. These findings suggest that continued development of lexical inhibition in later childhood may underlie increases in efficiency of spoken word recognition.
Article
Full-text available
Theories of adult cognitive development classically distinguish between fluid abilities, which require effortful processing at the time of assessment, and crystallized abilities, which require the retrieval and application of knowledge. On average, fluid abilities decline throughout adulthood, whereas crystallized abilities show gains into old age. These diverging age trends, along with marked individual differences in rates of change, have led to the proposition that individuals might compensate for fluid declines with crystallized gains. Here, using data from two large longitudinal studies, we show that rates of change are strongly correlated across fluid and crystallized abilities. Hence, individuals showing greater losses in fluid abilities tend to show smaller gains, or even losses, in crystallized abilities. This observed commonality between fluid and crystallized changes places constraints on theories of compensation and directs attention toward domain-general drivers of adult cognitive decline and maintenance.
Article
Full-text available
Listeners generally categorize speech sounds in a gradient manner. However, recent work, using a visual analogue scaling (VAS) task, suggests that some listeners show more categorical performance, leading to less flexible cue integration and poorer recovery from misperceptions (Kapnoula et al., 2017, 2021). We asked how individual differences in speech gradiency can be reconciled with the well-established gradiency in the modal listener, showing how VAS performance relates to both Visual World Paradigm and EEG measures of gradiency. We also investigated three potential sources of these individual differences: inhibitory control; lexical inhibition; and early cue encoding. We used the N1 ERP component to track pre-categorical encoding of Voice Onset Time (VOT). The N1 linearly tracked VOT, reflecting a fundamentally gradient speech perception; however, for less gradient listeners, this linearity was disrupted near the boundary. Thus, while all listeners are gradient, they may show idiosyncratic encoding of specific cues, affecting downstream processing.
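One way to quantify the linearity implied by the N1 result: compare, per listener, a linear fit of N1 amplitude on VOT against a categorical (step) fit at the category boundary. The sketch below simulates a gradient listener; the boundary, values, and comparison metric are assumptions for illustration, not the paper's analysis.

```python
# Compare linear vs. step fits of N1 amplitude across a VOT continuum.
# Simulated "gradient" listener; illustrative only.
import numpy as np

rng = np.random.default_rng(5)
vot = np.tile(np.arange(0, 45, 5), 20)   # ms, repeated trials per step

def r_squared(y, yhat):
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

# Gradient listener: N1 amplitude tracks VOT linearly.
n1 = -4 + 0.05 * vot + rng.normal(0, 0.3, vot.size)

linear_fit = np.poly1d(np.polyfit(vot, n1, 1))(vot)
# Step fit: separate means on each side of an assumed 20 ms boundary.
step_fit = np.where(vot < 20, n1[vot < 20].mean(), n1[vot >= 20].mean())
print(f"linear R² = {r_squared(n1, linear_fit):.3f}, "
      f"step R² = {r_squared(n1, step_fit):.3f}")
```

A less gradient listener, on this logic, would show the step fit closing the gap on (or beating) the linear fit near the boundary.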
Article
Full-text available
This study assessed the effects of age, word frequency, and background noise on the time course of lexical activation during spoken word recognition. Participants (41 young adults and 39 older adults) performed a visual world word recognition task while we monitored their gaze position. On each trial, four phonologically unrelated pictures appeared on the screen. A target word was presented auditorily following a carrier phrase (“Click on ________”), at which point participants were instructed to use the mouse to click on the picture that corresponded to the target word. High- and low-frequency words were presented in quiet to half of the participants. The other half heard the words in a low level of noise in which the words were still readily identifiable. Results showed that, even in the absence of phonological competitors in the visual array, high-frequency words were fixated more quickly than low-frequency words by both listener groups. Young adults were generally faster to fixate on targets compared to older adults, but the pattern of interactions among noise, word frequency, and listener age showed that older adults’ lexical activation largely matches that of young adults in a modest amount of noise.
Article
Full-text available
Understanding speech when background noise is present is a critical everyday task that varies widely among people. A key challenge is to understand why some people struggle with speech-in-noise perception, despite having clinically normal hearing. Here, we developed new figure-ground tests that require participants to extract a coherent tone pattern from a stochastic background of tones. These tests dissociated variability in speech-in-noise perception related to mechanisms for detecting static (same-frequency) patterns and those for tracking patterns that change frequency over time. In addition, elevated hearing thresholds that are widely considered to be ‘normal’ explained significant variance in speech-in-noise perception, independent of figure-ground perception. Overall, our results demonstrate that successful speech-in-noise perception is related to audiometric thresholds, fundamental grouping of static acoustic patterns, and tracking of acoustic sources that change in frequency. Crucially, speech-in-noise deficits are better assessed by measuring central (grouping) processes alongside audiometric thresholds.
Article
Full-text available
The field of cognitive aging has seen considerable advances in describing the linguistic and semantic changes that happen during the adult life span to uncover the structure of the mental lexicon (i.e., the mental repository of lexical and conceptual representations). Nevertheless, there is still debate concerning the sources of these changes, including the role of environmental exposure and several cognitive mechanisms associated with learning, representation, and retrieval of information. We review the current status of research in this field and outline a framework that promises to assess the contribution of both ecological and psychological aspects to the aging lexicon.
Article
Full-text available
Individual differences in working memory capacity have been gaining recognition as playing an important role in speech comprehension, especially in noisy environments. Using the visual world eye-tracking paradigm, a recent study by Hadar and coworkers found that online spoken word recognition was slowed when listeners were required to retain in memory a list of four spoken digits (high load) compared with only one (low load). In the current study, we recognized that the influence of a digit preload might be greater for individuals who have a more limited memory span. We compared participants with higher and lower memory spans on the time course for spoken word recognition by testing eye-fixations on a named object, relative to fixations on an object whose name shared phonology with the named object. Results show that when a low load was imposed, differences in memory span had no effect on the time course of preferential fixations. However, with a high load, listeners with lower span were delayed by ∼550 ms in discriminating target from sound-sharing competitors, relative to higher span listeners. This supports the assumption that the interference effect of a memory preload is not a fixed value; rather, its effect is greater for individuals with a smaller memory span. Interestingly, span differences affected the timeline for spoken word recognition in noise, but not offline accuracy. This highlights the significance of using eye-tracking as a measure for online speech processing. Results further emphasize the importance of considering differences in cognitive capacity, even when testing normal-hearing young adults.
Article
Full-text available
Purpose This study examined whether older adults remain perceptually flexible when presented with ambiguities in speech in the absence of lexically disambiguating information. We expected older adults to show less perceptual learning when top-down information was not available. We also investigated whether individual differences in executive function predicted perceptual learning in older and younger adults. Method Younger (n = 31) and older adults (n = 27) completed 2 perceptual learning tasks composed of a pretest, exposure, and posttest phase. Both learning tasks exposed participants to clear and ambiguous speech tokens, but crucially, the lexically guided learning task provided disambiguating lexical information whereas the distributional learning task did not. Participants also performed several cognitive tasks to investigate individual differences in working memory, vocabulary, and attention-switching control. Results We found that perceptual learning is maintained in older adults, but that learning may be stronger in contexts where top-down information is available. Receptive vocabulary scores predicted learning across both age groups and in both learning tasks. Conclusions Implicit learning is maintained with age across different learning conditions but remains stronger when lexically biasing information is available. We find that receptive vocabulary is relevant for learning in both types of learning tasks, suggesting the importance of vocabulary knowledge for adapting to ambiguities in speech.
Article
Full-text available
We have previously shown that older adults hyper-bind, or form more extraneous associations than younger adults. In this study, we aimed to both replicate the original implicit transfer effect and to test whether younger adults show evidence of hyper-binding when informed about the relevance of past information. Our results suggest that regardless of the test conditions, younger adults do not hyper-bind. In contrast, older adults showed hyper-binding under (standard) implicit instructions, but not when made aware of a connection between tasks. These results replicate the original hyper-binding effect and reiterate its implicit nature.
Article
Full-text available
The presence of noise and interfering information can pose major difficulties during speech perception, particularly for older adults. Analogously, interference from similar representations during retrieval is a major cause of age-related memory failures. To demonstrate a suppression mechanism that underlies such speech and memory difficulties, we tested the hypothesis that interference between targets and competitors is resolved by suppressing competitors, thereby rendering them less intelligible in noise. In a series of experiments using a paradigm adapted from Healey, Hasher, and Campbell (2013), we presented a list of words that included target/competitor pairs of orthographically similar words (e.g., ALLERGY and ANALOGY). After a delay, participants solved fragments (e.g., A_L__GY), some of which resembled both members of the target/competitor pair, but could only be completed by the target. We then assessed the consequence of having successfully resolved this interference by asking participants to identify words in noise, some of which included the rejected competitor words from the previous phase. Consistent with a suppression account of interference resolution, younger adults reliably demonstrated reduced identification accuracy for competitors, indicating that they had effectively rejected, and therefore suppressed, competitors. In contrast, older adults showed increased accuracy for competitors relative to young adults. Such results suggest that older adults' reduced ability to suppress these representations resulted in sustained access to lexical traces, subsequently increasing perceptual identification of such items. We discuss these findings within the framework of inhibitory control theory in cognitive aging and its implications for age-related changes in speech perception.
Article
Full-text available
In spite of the rapidity of everyday speech, older adults tend to keep up relatively well in day-to-day listening. In laboratory settings older adults do not respond as quickly as younger adults in off-line tests of sentence comprehension, but the question is whether comprehension itself is actually slower. Two unique features of the human eye were used to address this question. First, we tracked eye-movements as 20 young adults and 20 healthy older adults listened to sentences that referred to one of four objects pictured on a computer screen. Although the older adults took longer to indicate the referenced object with a cursor-pointing response, their gaze moved to the correct object as rapidly as that of the younger adults. Second, we concurrently measured dilation of the pupil of the eye as a physiological index of effort. This measure revealed that although poorer hearing acuity did not slow processing, success came at the cost of greater processing effort.
Article
Full-text available
The Fifth Eriksholm Workshop on "Hearing Impairment and Cognitive Energy" was convened to develop a consensus among interdisciplinary experts about what is known on the topic, gaps in knowledge, the use of terminology, priorities for future research, and implications for practice. The general term cognitive energy was chosen to facilitate the broadest possible discussion of the topic. It goes back to early accounts of the effects of attention on perception, which used the term psychic energy for the notion that limited mental resources can be flexibly allocated among perceptual and mental activities. The workshop focused on three main areas: (1) theories, models, concepts, definitions, and frameworks; (2) methods and measures; and (3) knowledge translation. We defined effort as the deliberate allocation of mental resources to overcome obstacles in goal pursuit when carrying out a task, with listening effort applying more specifically when tasks involve listening. We adapted Kahneman's seminal (1973) Capacity Model of Attention to listening and proposed a heuristically useful Framework for Understanding Effortful Listening (FUEL). Our FUEL incorporates the well-known relationship between cognitive demand and the supply of cognitive capacity that is the foundation of cognitive theories of attention. Our FUEL also incorporates a motivation dimension based on complementary theories of motivational intensity, adaptive gain control, and optimal performance, fatigue, and pleasure. Using a three-dimensional illustration, we highlight how listening effort depends not only on hearing difficulties and task demands but also on the listener's motivation to expend mental effort in the challenging situations of everyday life.
Article
Full-text available
Language learning is generally described as a problem of acquiring new information (e.g., new words). However, equally important are changes in how the system processes known information. For example, a wealth of studies has suggested dramatic changes over development in how efficiently children recognize familiar words, but it is unknown what kind of experience-dependent mechanisms of plasticity give rise to such changes in real-time processing. We examined the plasticity of the language processing system by testing whether a fundamental aspect of spoken word recognition, lexical interference, can be altered by experience. Adult participants were trained on a set of familiar words over a series of 4 tasks. In the high-competition (HC) condition, tasks were designed to encourage coactivation of similar words (e.g., net and neck) and to require listeners to resolve this competition. Tasks were similar in the low-competition (LC) condition, but did not enhance this competition. Immediately after training, interlexical interference was tested using a visual world paradigm task. Participants in the HC group resolved interference to a fuller degree than those in the LC group, demonstrating that experience can shape the way competition between words is resolved. TRACE simulations showed that the observed late differences in the pattern of interference resolution can be attributed to differences in the strength of lexical inhibition. These findings inform cognitive models in many domains that involve competition/interference processes, and suggest an experience-dependent mechanism of plasticity that may underlie longer term changes in processing efficiency associated with both typical and atypical development.
Article
Full-text available
This study investigated the developmental time course of spoken word recognition in older children using eye tracking to assess how the real-time processing dynamics of word recognition change over development. We found that 9-year-olds were slower to activate the target words and showed more early competition from competitor words than 16-year-olds; however, both age groups ultimately fixated targets to the same degree. This contrasts with a prior study of adolescents with language impairment (McMurray, Samelson, Lee, & Tomblin, 2010) that showed a different pattern of real-time processes. These findings suggest that the dynamics of word recognition are still developing even at these late ages, and developmental changes may derive from different sources than individual differences in relative language ability.
Article
Full-text available
This study examined the temporal dynamics of spoken word recognition in noise and background speech. In two visual-world experiments, English participants listened to target words while looking at four pictures on the screen: a target (e.g. candle), an onset competitor (e.g. candy), a rhyme competitor (e.g. sandal), and an unrelated distractor (e.g. lemon). Target words were presented in quiet, mixed with broadband noise, or mixed with background speech. Results showed that lexical competition changes throughout the observation window as a function of what is presented in the background. These findings suggest that, rather than being strictly sequential, stream segregation and lexical competition interact during spoken word recognition.
Article
Full-text available
Audiovisual (AV) speech perception is the process by which auditory and visual sensory signals are integrated and used to understand what a talker is saying during face-to-face communication. This form of communication is markedly superior to speech perception in either sensory modality alone. However, there are additional lexical factors that are affected by age-related cognitive changes that may contribute to differences in AV perception. In the current study, we extended an existing model of spoken word identification to the AV domain, and examined the cognitive factors that contribute to age-related and individual differences in AV perception of words varying in lexical difficulty (i.e., on the basis of competing items). Young (n = 49) and older adults (n = 50) completed a series of cognitive inhibition tasks and a spoken word identification task. The words were presented in auditory-only, visual-only, and AV conditions, and were equally divided into lexically hard (words with many competitors) and lexically easy (words with few competitors). Overall, young adults demonstrated better inhibitory abilities and higher identification performance than older adults. However, whereas no relationship was observed between inhibitory abilities and AV word identification performance in young adults, there was a significant relationship between Stroop interference and AV identification of lexically hard words in older adults. These results are interpreted within the framework of existing models of spoken-word recognition with implications for how cognitive deficits in older adults contribute to speech perception.
Article
Full-text available
Clinical audiometry has long focused on determining the detection thresholds for pure tones, which depend on intact cochlear mechanics and hair cell function. Yet many listeners with normal hearing thresholds complain of communication difficulties, and the causes for such problems are not well understood. Here, we explore whether normal-hearing listeners exhibit such suprathreshold deficits, affecting the fidelity with which subcortical areas encode the temporal structure of clearly audible sound. Using an array of measures, we evaluated a cohort of young adults with thresholds in the normal range to assess both cochlear mechanical function and temporal coding of suprathreshold sounds. Listeners differed widely in both electrophysiological and behavioral measures of temporal coding fidelity. These measures correlated significantly with each other. Conversely, these differences were unrelated to the modest variation in otoacoustic emissions, cochlear tuning, or the residual differences in hearing threshold present in our cohort. Electroencephalography revealed that listeners with poor subcortical encoding had poor cortical sensitivity to changes in interaural time differences, which are critical for localizing sound sources and analyzing complex scenes. These listeners also performed poorly when asked to direct selective attention to one of two competing speech streams, a task that mimics the challenges of many everyday listening environments. Together with previous animal and computational models, our results suggest that hidden hearing deficits, likely originating at the level of the cochlear nerve, are part of “normal hearing”.
Article
Full-text available
This study examined age-related differences in the ability to judge one's vocabulary. Young, middle-age, and older adults completed a multiple-choice test of vocabulary, judged their confidence in each answer, and estimated their overall performance. Older adults performed better and were more confident in their knowledge than were the other 2 groups. Importantly, relative to young adults, older adults demonstrated better calibration both on item-by-item confidence judgments and on global estimates. Resolution, as defined by correlations between item-by-item performance and confidence judgments, was age-invariant. We suggest that age-related accumulation of vocabulary is accompanied by enhanced perception of mastery in one's knowledge.
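Calibration and resolution, as used here, are straightforward to compute. The sketch below uses a Pearson correlation for resolution where metamemory work often uses a gamma correlation; responses are simulated for illustration.

```python
# Calibration (mean confidence minus mean accuracy) and resolution
# (item-by-item confidence-accuracy correlation). Simulated responses.
import numpy as np

rng = np.random.default_rng(6)
accuracy = rng.integers(0, 2, 200).astype(float)   # 1 = correct answer
confidence = np.clip(0.5 + 0.3 * accuracy + rng.normal(0, 0.15, 200), 0, 1)

calibration = confidence.mean() - accuracy.mean()     # 0 = well calibrated
resolution = np.corrcoef(confidence, accuracy)[0, 1]  # higher = better
print(f"calibration bias = {calibration:+.3f}, resolution r = {resolution:.3f}")
```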
Article
Full-text available
All words of the languages we know are stored in the mental lexicon. Psycholinguistic models describe in which format lexical knowledge is stored and how it is accessed when needed for language use. The present article summarizes key findings in spoken-word recognition by humans and describes how models of spoken-word recognition account for them. Although current models of spoken-word recognition differ considerably in the details of implementation, there is general consensus among them on at least three aspects: multiple word candidates are activated in parallel as a word is being heard, activation of word candidates varies with the degree of match between the speech signal and stored lexical representations, and activated candidate words compete for recognition. No consensus has been reached on other aspects such as the flow of information between different processing levels, and the format of stored prelexical and lexical representations.
Article
Full-text available
As otherwise healthy adults age, their performance on cognitive tests tends to decline. This change is traditionally taken as evidence that cognitive processing is subject to significant declines in healthy aging. We examine this claim, showing current theories over-estimate the evidence in support of it, and demonstrating that when properly evaluated, the empirical record often indicates that the opposite is true. To explain the disparity between the evidence and current theories, we show how the models of learning assumed in aging research are incapable of capturing even the most basic of empirical facts of “associative” learning, and lend themselves to spurious discoveries of “cognitive decline.” Once a more accurate model of learning is introduced, we demonstrate that far from declining, the accuracy of older adults' lexical processing appears to improve continuously across the lifespan. We further identify other measures on which performance does not decline with age, and show how these different patterns of performance fit within an overall framework of learning. Finally, we consider the implications of our demonstrations of continuous and consistent learning performance throughout adulthood for our understanding of the changes in underlying brain morphology that occur during the course of cognitive development across the lifespan.
Article
Full-text available
This study investigates the extent to which age-related language processing difficulties are due to a decline in sensory processes or to a deterioration of cognitive factors, specifically, attentional control. Two facets of attentional control were examined: inhibition of irrelevant information and divided attention. Younger and older adults were asked to categorize the initial phoneme of spoken syllables ("Was it m or n?"), trying to ignore the lexical status of the syllables. The phonemes were manipulated to range in eight steps from m to n. Participants also did a discrimination task on syllable pairs ("Were the initial sounds the same or different?"). Categorization and discrimination were performed under either divided attention (concurrent visual-search task) or focused attention (no visual task). The results showed that even when the younger and older adults were matched on their discrimination scores: (1) the older adults had more difficulty inhibiting lexical knowledge than did younger adults, (2) divided attention weakened lexical inhibition in both younger and older adults, and (3) divided attention impaired sound discrimination more in older than younger listeners. The results confirm the independent and combined contribution of sensory decline and deficit in attentional control to language processing difficulties associated with aging. The relative weight of these variables and their mechanisms of action are discussed in the context of theories of aging and language.
Article
Full-text available
How do we map the rapid input of spoken language onto phonological and lexical representations over time? Attempts at psychologically-tractable computational models of spoken word recognition tend either to ignore time or to transform the temporal input into a spatial representation. TRACE, a connectionist model with broad and deep coverage of speech perception and spoken word recognition phenomena, takes the latter approach, using exclusively time-specific units at every level of representation. TRACE reduplicates featural, phonemic, and lexical inputs at every time step in a large memory trace, with rich interconnections (excitatory forward and backward connections between levels and inhibitory links within levels). As the length of the memory trace is increased, or as the phoneme and lexical inventory of the model is increased to a realistic size, this reduplication of time- (temporal position) specific units leads to a dramatic proliferation of units and connections, raising the question of whether a more efficient approach is possible. Our starting point is the observation that models of visual object recognition, including visual word recognition, have grappled with the problem of spatial invariance, and arrived at solutions other than a fully-reduplicative strategy like that of TRACE. This inspires a new model of spoken word recognition that combines time-specific phoneme representations similar to those in TRACE with higher-level representations based on string kernels: temporally independent (time invariant) diphone and lexical units. This reduces the number of necessary units and connections by several orders of magnitude relative to TRACE. Critically, we compare the new model to TRACE on a set of key phenomena, demonstrating that the new model inherits much of the behavior of TRACE and that the drastic computational savings do not come at the cost of explanatory power.
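To illustrate the competition dynamics that TRACE-style interactive-activation models implement (bottom-up excitation plus within-level lateral inhibition), here is a deliberately tiny simulation. The two-word lexicon, update rule, and parameter values are toy assumptions, not TRACE's actual equations or parameterization.

```python
# Toy interactive-activation dynamics: two cohort competitors receive equal
# input until a disambiguating phoneme arrives, then lateral inhibition
# suppresses the loser. Illustrative only; not TRACE itself.
import numpy as np

words = ["candle", "candy"]
activation = np.zeros(2)
inhibition = 0.3   # within-level lateral inhibition strength
decay = 0.1

for step in range(40):
    # Ambiguous input ("cand...") until step 15, then "candle" wins support.
    bottom_up = np.array([0.5, 0.5]) if step < 15 else np.array([0.7, 0.1])
    lateral = inhibition * (activation.sum() - activation)  # others inhibit me
    activation += 0.1 * (bottom_up - lateral - decay * activation)
    activation = np.clip(activation, 0.0, 1.0)

for word, act in zip(words, activation):
    print(f"{word}: {act:.2f}")
```

Weakening the inhibition parameter leaves the competitor partially active at the end, the signature of the sustained-activation profiles discussed elsewhere on this page.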
Article
Full-text available
Though much is known about how words are recognized, little research has focused on how a degraded signal affects the fine-grained temporal aspects of real-time word recognition. The perception of degraded speech was examined in two populations with the goal of describing the time course of word recognition and lexical competition. Thirty-three postlingually deafened cochlear implant (CI) users and 57 normal hearing (NH) adults (16 in a CI-simulation condition) participated in a visual world paradigm eye-tracking task in which their fixations to a set of phonologically related items were monitored as they heard one item being named. Each degraded-speech group was compared with a set of age-matched NH participants listening to unfiltered speech. CI users and the simulation group showed a delay in activation relative to the NH listeners, and there is weak evidence that the CI users showed differences in the degree of peak and late competitor activation. In general, though, the degraded-speech groups behaved statistically similarly with respect to activation levels.
Article
Full-text available
Purpose Researchers have begun to use eye tracking in the visual world paradigm (VWP) to study clinical differences in language processing, but the reliability of such laboratory tests has rarely been assessed. In this article, the authors assess test–retest reliability of the VWP for spoken word recognition. Methods Participants performed an auditory VWP task in repeated sessions and a visual-only VWP task in a third session. The authors performed correlation and regression analyses on several parameters to determine which reflect reliable behavior and which are predictive of behavior in later sessions. Results Results showed that the fixation parameters most closely related to timing and degree of fixations were moderately to strongly correlated across days, whereas the parameters related to rate of increase or decrease of fixations to particular items were less strongly correlated. Moreover, when including factors derived from the visual-only task, the performance of the regression model was at least moderately correlated with Day 2 performance on all parameters (R > .30). Conclusion The VWP is stable enough (with some caveats) to serve as an individual measure. These findings suggest guidelines for future use of the paradigm and for areas of improvement in both methodology and analysis.
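The reliability logic here is easy to reproduce: correlate each fitted VWP parameter across sessions. A minimal sketch with one simulated "stable" and one "unstable" parameter:

```python
# Test-retest correlation of fitted VWP parameters across two sessions.
# Simulated values for illustration; not the study's data.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(7)
n = 40
day1_crossover = rng.normal(800, 100, n)
day2_crossover = day1_crossover + rng.normal(0, 60, n)  # stable parameter
day1_slope = rng.normal(0.01, 0.003, n)
day2_slope = rng.normal(0.01, 0.003, n)                 # unstable parameter

for name, a, b in [("crossover", day1_crossover, day2_crossover),
                   ("slope", day1_slope, day2_slope)]:
    r, p = pearsonr(a, b)
    print(f"{name}: test-retest r = {r:.2f} (p = {p:.3f})")
```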
Article
Full-text available
Purpose In this study, the authors investigated how acoustic distortion affected younger and older adults' use of context in a lexical decision task. Method The authors measured lexical decision reaction times (RTs) when intact target words followed acoustically distorted sentence contexts. Contexts were semantically congruent, neutral, or incongruent. Younger adults (n = 216) were tested on three distortion types: low-pass filtering, time compression, and masking by multitalker babble, using two amounts of distortion selected to control for word recognition accuracy. Older adults (n = 108) were tested on two amounts of time compression and one low-pass filtering condition. Results For both age groups, there was robust facilitation by congruent contexts but minimal inhibition by incongruent contexts. Facilitation decreased as distortion increased. Older listeners had slower RTs than younger listeners, but this difference was smaller in congruent than in neutral or incongruent conditions. After controlling for word recognition accuracy, older listeners' RTs were slower in time-compressed than in low-pass filtering conditions, but younger listeners performed similarly in both conditions. Conclusions These RT results highlight the interdependence between bottom-up sensory and top-down semantic processing. Consistent with previous findings based on accuracy measures, compared with younger adults, older adults were disproportionately slowed when speech was time compressed but more facilitated by congruent contexts.
Article
Full-text available
Participants' eye movements were monitored as they followed spoken instructions to click on a pictured object with a computer mouse (e.g., "click on the net"). Participants were slower to fixate the target picture when the onset of the target word came from a competitor word (e.g., ne(ck)t) than from a nonword (e.g., ne(p)t), as predicted by models of spoken-word recognition that incorporate lexical competition. This was found whether the picture of the competitor word (e.g., the picture of a neck) was present on the display or not. Simulations with the TRACE model captured the major trends of fixations to the target and its competitor over time. We argue that eye movements provide a fine-grained measure of lexical activation over time, and thus reveal effects of lexical competition that are masked by response measures such as lexical decisions.
Article
Full-text available
Ambiguity resolution is a central problem in language comprehension. Lexical and syntactic ambiguities are standardly assumed to involve different types of knowledge representations and be resolved by different mechanisms. An alternative account is provided in which both types of ambiguity derive from aspects of lexical representation and are resolved by the same processing mechanisms. Reinterpreting syntactic ambiguity resolution as a form of lexical ambiguity resolution obviates the need for special parsing principles to account for syntactic interpretation preferences, reconciles a number of apparently conflicting results concerning the roles of lexical and contextual information in sentence processing, explains differences among ambiguities in terms of ease of resolution, and provides a more unified account of language comprehension than was previously available.
Article
Full-text available
Background: Older adults, especially those with reduced hearing acuity, can make good use of linguistic context in word recognition. Less is known about the effects of the weighted distribution of probable target and nontarget words that fit the sentence context (response entropy). The present study examined the effects of age, hearing acuity, linguistic context, and response entropy on spoken word recognition. Methods: Participants were 18 older adults with good hearing acuity (M age = 74.3 years), 18 older adults with mild-to-moderate hearing loss (M age = 76.1 years), and 18 young adults with age-normal hearing (M age = 19.6 years). Participants heard sentence-final words using a word-onset gating paradigm, in which words were heard with increasing amounts of onset information until they could be correctly identified. Degrees of context varied from a neutral context to a high context condition. Results: Older adults with poor hearing acuity required a greater amount of word onset information for recognition of words when heard in a neutral context compared with older adults with good hearing acuity and young adults. This difference progressively decreased with an increase in words' contextual probability. Unlike the young adults, both older adult groups' word recognition thresholds were sensitive to response entropy. Response entropy was not affected by hearing acuity. Conclusion: Increasing linguistic context mitigates the negative effect of age and hearing loss on word recognition. The effect of response entropy on older adults' word recognition is discussed in terms of an age-related inhibition deficit.
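Response entropy, as defined here, is the Shannon entropy of the distribution of completions participants produce for a sentence frame. A minimal sketch with invented response counts:

```python
# Shannon entropy over sentence-completion response distributions.
# Counts are invented for illustration.
import numpy as np

def response_entropy(counts):
    p = np.asarray(counts, dtype=float)
    p = p[p > 0] / p.sum()
    return -(p * np.log2(p)).sum()   # bits

# High-constraint context: one dominant completion -> low entropy.
print(f"high context: {response_entropy([18, 1, 1]):.2f} bits")
# Neutral context: many plausible completions -> high entropy.
print(f"neutral context: {response_entropy([4, 4, 3, 3, 3, 3]):.2f} bits")
```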
Article
Full-text available
Models of spoken word recognition assume that words are represented as sequences of phonemes. We evaluated this assumption by examining phonemic anadromes, words that share the same phonemes but differ in their order (e.g., sub and bus). Using the visual-world paradigm, we found that listeners show more fixations to anadromes (e.g., sub when bus is the target) than to unrelated words (well) and to words that share the same vowel but not the same set of phonemes (sun). This contrasts with the predictions of existing models and suggests that words are not defined as strict sequences of phonemes.
Article
Full-text available
Using the cross-modal priming paradigm, we attempted to determine whether semantic representations for word-final morphemes embedded in multisyllabic words (e.g., /lak/ in /hεmlak/) are independently activated in memory. That is, we attempted to determine whether the auditory prime, /hεmlak/, would facilitate lexical decision times to the visual target, key, even when the recognition point for /hεmlak/ occurred prior to the end of the word, which should ensure deactivation of all lexical candidates. In the first experiment, a gating task was used in order to ensure that the multisyllabic words could be identified prior to their offsets. In the second experiment, lexical decision times for visually presented targets following spoken monosyllabic primes (e.g., /lak/-key) were compared with reaction times for the same visual targets following multisyllabic pairs (/hεmlak/-key). Significant priming was found for both the monosyllabic and the multisyllabic conditions. The results support a recognition strategy that initiates lexical access at strong syllables (Cutler & Norris, 1988) and operates according to a principle of delayed commitment (Marr, 1982).
Article
Full-text available
When identifying spoken words, older listeners may have difficulty resolving lexical competition or may place a greater weight on factors like lexical frequency. To obtain information about age differences in the time course of spoken word recognition, young and older adults' eye movements were monitored as they followed spoken instructions to click on objects displayed on a computer screen. Older listeners were more likely than younger listeners to fixate high-frequency displayed phonological competitors. However, degradation of auditory quality in younger listeners does not reproduce this result. These data are most consistent with an increased role for lexical frequency with age.
Article
Full-text available
Objective: To determine whether hearing loss is associated with incident all-cause dementia and Alzheimer disease (AD). Design: Prospective study of 639 individuals who underwent audiometric testing and were dementia free in 1990 to 1994. Hearing loss was defined by a pure-tone average of hearing thresholds at 0.5, 1, 2, and 4 kHz in the better-hearing ear (normal, <25 dB [n = 455]; mild loss, 25-40 dB [n = 125]; moderate loss, 41-70 dB [n = 53]; and severe loss, >70 dB [n = 6]). Diagnosis of incident dementia was made by consensus diagnostic conference. Cox proportional hazards models were used to model time to incident dementia according to severity of hearing loss and were adjusted for age, sex, race, education, diabetes mellitus, smoking, and hypertension. Setting: Baltimore Longitudinal Study of Aging. Participants: Six hundred thirty-nine individuals aged 36 to 90 years. Main Outcome Measures: Incident cases of all-cause dementia and AD until May 31, 2008. Results: During a median follow-up of 11.9 years, 58 cases of incident all-cause dementia were diagnosed, of which 37 cases were AD. The risk of incident all-cause dementia increased log linearly with the severity of baseline hearing loss (1.27 per 10-dB loss; 95% confidence interval, 1.06-1.50). Compared with normal hearing, the hazard ratio (95% confidence interval) for incident all-cause dementia was 1.89 (1.00-3.58) for mild hearing loss, 3.00 (1.43-6.30) for moderate hearing loss, and 4.94 (1.09-22.40) for severe hearing loss. The risk of incident AD also increased with baseline hearing loss (1.20 per 10 dB of hearing loss) but with a wider confidence interval (0.94-1.53). Conclusions: Hearing loss is independently associated with incident all-cause dementia. Whether hearing loss is a marker for early-stage dementia or is actually a modifiable risk factor for dementia deserves further study.
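The two quantitative ingredients of this abstract, the four-frequency pure-tone average and the Cox proportional hazards model, can be sketched as follows. Data are simulated, covariates are reduced to age for brevity (the study also adjusted for sex, race, education, and vascular factors), and the toy coefficient is chosen to roughly echo the reported hazard ratio of 1.27 per 10 dB.

```python
# Pure-tone average plus a Cox proportional hazards model of time to
# dementia. Simulated data; illustrative only, not the study's analysis.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

def pure_tone_average(thresholds_db):
    """Mean threshold at 0.5, 1, 2, and 4 kHz in the better ear (dB HL)."""
    return np.mean(thresholds_db)

rng = np.random.default_rng(8)
n = 639
pta = rng.uniform(0, 70, n)
age = rng.uniform(36, 90, n)
# Toy log-linear risk: 0.024 per dB ~ HR of exp(0.24) ~ 1.27 per 10 dB.
hazard = 0.01 * np.exp(0.024 * pta + 0.05 * (age - 60))
time_to_event = rng.exponential(1 / hazard)

df = pd.DataFrame({
    "duration": np.minimum(time_to_event, 12.0),   # ~12-year follow-up
    "dementia": (time_to_event <= 12.0).astype(int),
    "pta_per10dB": pta / 10,                       # coefficient per 10 dB
    "age": age,
})
cph = CoxPHFitter().fit(df, duration_col="duration", event_col="dementia")
cph.print_summary()
```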
Article
Classic psycholinguistics seeks universal language mechanisms for all people, emphasizing the “modal” listener: hearing, neurotypical, monolingual, and young adults. Applied psycholinguistics then characterizes differences in terms of their deviation from the modal. This mirrors naturalist philosophies of health which presume a normal function, with illness as a deviation. In contrast, normative positions argue that illness is partially culturally derived. It occurs when a person cannot meet socio-culturally defined goals, separating differences in biology (disease) from socio-cultural function (illness). We synthesize this with mechanistic functionalist views in which language emerges from diverse lower-level mechanisms with no one-to-one mapping to function (termed the functional mechanistic normative approach). This challenges primarily psychometric approaches—which are culturally defined—suggesting a process-based approach may yield more insight. We illustrate this with work on word recognition across multiple domains: cochlear implant users, children, language disorders, L2 learners, and aging. This work investigates each group’s solutions to the problem of word recognition as interesting in its own right. Variation in the process is value-neutral, and psychometric measures complement this, reflecting fit with cultural expectations (disease vs. illness). By examining variation in processing across people with a variety of skills and goals, we arrive at deeper insight into fundamental principles.
Article
Words are fundamental to language, linking sound, articulation, and spelling to meaning and syntax; and lexical deficits are core to communicative disorders. Work in language acquisition commonly focuses on how lexical knowledge—knowledge of words’ sound patterns and meanings—is acquired. But lexical knowledge is insufficient to account for skilled language use. Sophisticated real-time processes must decode the sound pattern of words and interpret them appropriately. We review work that bridges this gap by using sensitive real-time measures (eye tracking in the visual world paradigm) of school-age children’s processing of highly familiar words. This work reveals that the development of word recognition skills can be characterized by changes in the rate at which decisions unfold in the lexical system (the activation rate). Moreover, contrary to the standard view that these real-time skills largely develop during infancy and toddlerhood, they develop slowly, at least through adolescence. In contrast, language disorders can be linked to differences in the ultimate degree to which competing interpretations are suppressed (competition resolution), and these differences can be mechanistically linked to deficits in inhibition. These findings have implications for real-world problems such as reading difficulties and second-language acquisition. They suggest that developing accurate, flexible, and efficient processing is just as important a developmental goal as is acquiring language knowledge.
Article
Efficient word recognition depends on the ability to overcome competition from overlapping words. The nature of the overlap depends on the input modality: spoken words have temporal overlap from other words that share phonemes in the same positions, whereas written words have spatial overlap from other words with letters in the same places. It is unclear how these differences in input format affect the ability to recognize a word and the types of competitors that become active while doing so. This study investigates word recognition in both modalities in children between 7 and 15 years of age. Children completed a visual-world paradigm eye-tracking task that measured competition from words with several types of overlap, using identical word lists between modalities. Results showed correlated developmental changes in the speed of target recognition in both modalities. Additionally, developmental changes were seen in the efficiency of competitor suppression for some competitor types in the spoken modality. These data reveal some developmental continuity in the process of word recognition independent of modality, but also some instances of independence in how competitors are activated. Stimuli, data, and analyses from this project are available at: https://osf.io/eav72
Article
A common critique of the Visual World Paradigm (VWP) in psycholinguistic studies is that what is designed as a measure of language processes is meaningfully altered by the visual context of the task. This is crucial, particularly in studies of spoken word recognition, where the displayed images are usually seen as just a part of the measure and are not of fundamental interest. Many variants of the VWP allow participants to sample the visual scene before a trial begins. However, this could bias their interpretations of the later speech or even lead to abnormal processing strategies (e.g., comparing the input to only preactivated working memory representations). Prior work has focused only on whether preview duration changes fixation patterns. However, preview could affect a number of processes, such as visual search, that would not challenge the interpretation of the VWP. The present study uses a series of targeted manipulations of the preview period to ask if preview alters looking behavior during a trial, and why. Results show that evidence of incremental processing and phonological competition seen in the VWP are not dependent on preview, and are not enhanced by manipulations that directly encourage phonological prenaming. Moreover, some forms of preview can eliminate nuisance variance deriving from object recognition and visual search demands in order to produce a more sensitive measure of linguistic processing. These results deepen our understanding of how the visual scene interacts with language processing to drive fixation patterns in the VWP, and reinforce the value of the VWP as a tool for measuring real-time language processing. Stimuli, data and analysis scripts are available at https://osf.io/b7q65/.
Chapter
This chapter considers the consequences of aging on communication and the ability to age well in terms of participation in everyday life. Interventions designed to reduce the negative effects of age-related hearing impairment on communication and participation are also described. These interventions span technological, behavioral, and environmental approaches. Based on correlations of pure-tone thresholds with measured speech recognition and with self-report surveys of communication, it is estimated that the inaudibility of speech may fully explain the speech-communication difficulties experienced by about half of older adults. For these individuals, hearing aids compensating for this inaudibility may be sufficient to remediate the speech-communication problems they experience. For the remainder of older adults, however, their problems are more complex and may be attributable to auditory neural, central-auditory, or cognitive deficits, with or without accompanying pure-tone hearing loss. Furthermore, psychological and social adjustment to hearing loss may require nontechnological solutions. Such adjustment may be complicated by the interface between hearing loss and other age-related comorbidities that affect optimal participation and aging well. Devices such as hearing aids in and of themselves are likely to be insufficient to remediate the difficulties experienced by these older adults. Complementary or supplementary interventions are needed to fully address their functional deficits and to reduce the participation restrictions or activity limitations experienced by older adults with hearing loss and speech-communication difficulties.
Article
Although lexical competition has been ubiquitously observed in spoken word recognition, less is known about whether lexical competitors interfere with the recognition of the target and how such lexical interference is resolved. The present study examined whether lexical competitors overlapping in output with the target would interfere with its recognition, and tested an underestimated hypothesis that domain-general inhibitory control contributes to the resolution of lexical interference. Specifically, in this study, a Visual World Paradigm was used to assess the temporal dynamics of lexical activations while participants moved the mouse cursor to the written word form of the spoken word they heard. By using Chinese characters, the orthographic similarity between the lexical competitor and target was manipulated independently of their phonological overlap. The results demonstrated that behavioral performance in the similar condition was poorer compared to that in the control condition, and that individuals with better inhibitory control (having a smaller Stroop interference effect) exhibited weaker activation of orthographic competitors (mouse trajectories less attracted by the orthographic competitors). The implications of these findings for our understanding of lexical interference and its resolution in spoken word recognition were discussed.
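Because the Stroop interference effect serves as the individual-differences index here, it may help to state the conventional score; note that the exact baseline condition used in this study is not specified in the abstract:

```latex
% Conventional Stroop interference score (smaller = better inhibitory control).
\[
  \text{Stroop interference}
  \;=\;
  \overline{RT}_{\text{incongruent}} \;-\; \overline{RT}_{\text{congruent}}
\]
```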
Article
In two eye-tracking experiments using the Visual World Paradigm, we examined how listeners recognize words when faced with speech at lower intensities (40, 50, and 65 dBA). After hearing the target word, participants (n = 32) clicked the corresponding picture from a display of four images - a target (e.g., money), a cohort competitor (e.g., mother), a rhyme competitor (e.g., honey) and an unrelated item (e.g., whistle) - while their eye movements were tracked. For slightly soft speech (50 dBA), listeners demonstrated an increase in cohort activation, whereas for rhyme competitors, activation started later and was sustained longer in processing. For very soft speech (40 dBA), listeners waited until later in processing to activate potential words, as illustrated by a decrease in activation for cohorts and an increase in activation for rhymes. Further, the extent to which words were considered depended on word length (mono- vs. bi-syllabic words), and speech-extrinsic factors such as the surrounding listening environment. These results advance current theories of spoken word recognition by considering a range of speech levels more typical of everyday listening environments. From an applied perspective, these results motivate models of how individuals who are hard of hearing approach the task of recognizing spoken words.
Article
In three experiments, we examined priming effects where primes were formed by transposing the first and last phoneme of tri‐phonemic target words (e.g., /byt/ as a prime for /tyb/). Auditory lexical decisions were found not to be sensitive to this transposed‐phoneme priming manipulation in long‐term priming (Experiment 1), with primes and targets presented in two separate blocks of stimuli and with unrelated primes used as the control condition (/mul/‐/tyb/), while a long‐term repetition priming effect was observed (/tyb/‐/tyb/). However, a clear transposed‐phoneme priming effect was found in two short‐term priming experiments (Experiments 2 and 3), with primes and targets presented in close temporal succession. The transposed‐phoneme priming effect was found when unrelated prime‐target pairs (/mul/‐/tyb/) were used as the control and, more importantly, when prime‐target pairs sharing the medial vowel (/pys/‐/tyb/) served as the control condition, thus indicating that the effect is not due to vocalic overlap. Finally, in Experiment 3, a transposed‐phoneme priming effect was found when primes sharing the medial vowel plus one consonant in an incorrect position with the targets (/byl/‐/tyb/) served as the control condition, and this condition did not differ significantly from the vowel‐only condition. Altogether, these results provide further evidence for a role for position‐independent phonemes in spoken word recognition, such that a phoneme at a given position in a word also provides evidence for the presence of words that contain that phoneme at a different position.
Article
Spoken language unfolds over time. Consequently, there are brief periods of ambiguity, when incomplete input can match many possible words. Typical listeners solve this problem by immediately activating multiple candidates which compete for recognition. In two experiments using the visual world paradigm, we examined real-time lexical competition in prelingually deaf cochlear implant (CI) users, and normal hearing (NH) adults listening to severely degraded speech. In Experiment 1, adolescent CI users and NH controls matched spoken words to arrays of pictures including pictures of the target word and phonological competitors. Eye movements to each referent were monitored as a measure of how strongly that candidate was considered over time. Relative to NH controls, CI users showed a large delay in fixating any object, less competition from onset competitors (e.g., sandwich after hearing sandal), and increased competition from rhyme competitors (e.g., candle after hearing sandal). Experiment 2 observed the same pattern with NH listeners hearing highly degraded speech. These studies suggest that, in contrast to all prior studies of word recognition in typical listeners, listeners recognizing words in severely degraded conditions can exhibit a substantively different pattern of dynamics, waiting to begin lexical access until substantial information has accumulated.
Article
This review article considers some of the age-related changes in cognition that are likely to interact with hearing, listening effort, and cognitive energy. The focus of the review is on normative age-related changes in cognition; however, consideration is also given to older adults who experience clinically significant deficits in cognition, such as persons with Alzheimer's disease or who may be in a preclinical stage of dementia (mild cognitive impairment). The article distinguishes between the assessment of cognitive function for clinical versus research purposes. It reviews the goal of cognitive testing in older adults and discusses the challenges of validly assessing cognition in persons with sensory impairments. The article then discusses the goals of assessing specific cognitive functions (processing speed and attentional processes) for the purpose of understanding their relationships with listening effort. Finally, the article highlights certain concepts that are likely to be relevant to listening effort and cognitive energy, including some issues that have not yet received much attention in this context (e.g., conation, cognitive reserve, and second language speech processing).
Article
Listening to speech in noise can be exhausting, especially for older adults with impaired hearing. Pupil dilation is thought to track the difficulty associated with listening to speech at various intelligibility levels for young and middle-aged adults. This study examined changes in the pupil response with acoustic and lexical manipulations of difficulty in older adults with hearing loss. Participants identified words at two signal-to-noise ratios (SNRs) among options that could include a similar-sounding lexical competitor. Growth Curve Analyses revealed that the pupil response was affected by an SNR × Lexical competition interaction, such that the response was larger, more delayed, and more sustained in the harder SNR condition, particularly in the presence of lexical competition. Pupillometry detected these effects for correct trials and across reaction times, suggesting that it provides evidence of task difficulty beyond what behavioral measures alone reveal.
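For readers unfamiliar with the method, the sketch below shows the general shape of a growth curve analysis of a pupil time course: orthogonal polynomial time terms crossed with condition factors in a mixed-effects model. The file name, column names, and random-effects structure are hypothetical; the abstract does not give the authors' exact specification.

```python
# Sketch of a growth curve analysis (GCA) of a pupil time course.
# All names below (pupil_timecourse.csv, column names) are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("pupil_timecourse.csv")  # long format: one row per subject x condition x time bin

# Orthogonal polynomial time terms (linear + quadratic), so that the
# intercept, slope, and curvature of the response are uncorrelated.
t = np.sort(df["time_ms"].unique())
raw = np.vander((t - t.mean()) / t.std(), N=3, increasing=True)
Q, _ = np.linalg.qr(raw)  # orthonormal columns: constant, linear, quadratic
codes = pd.DataFrame({"time_ms": t, "ot1": Q[:, 1], "ot2": Q[:, 2]})
df = df.merge(codes, on="time_ms")

# Time course crossed with SNR and lexical competition; random
# intercepts by participant (a deliberately minimal structure).
model = smf.mixedlm("pupil ~ (ot1 + ot2) * snr * competition",
                    df, groups=df["subject"])
fit = model.fit()
print(fit.summary())  # the snr:competition terms index the reported interaction
```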
Article
Audition is often treated as a ‘secondary’ sensory system behind vision in the study of cognitive science. In this review, we focus on three seemingly simple perceptual tasks to demonstrate the complexity of perceptual–cognitive processing involved in everyday audition. After providing a short overview of the characteristics of sound and their neural encoding, we present a description of the perceptual task of segregating multiple sound events that are mixed together in the signal reaching the ears. Then, we discuss the ability to localize the sound source in the environment. Finally, we provide some data and theory on how listeners categorize complex sounds, such as speech. In particular, we present research on how listeners weigh multiple acoustic cues in making a categorization decision. One conclusion of this review is that it is time for auditory cognitive science to be developed to match what has been done in vision, in order for us to better understand how humans communicate with speech and music. WIREs Cogn Sci 2011, 2, 479–489. DOI: 10.1002/wcs.123
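One common way the cue-weighting question in this literature is formalized (a generic sketch, not necessarily the specific models reviewed here) is as a logistic function of a weighted sum of normalized cues, where the fitted weights index how heavily each cue is relied upon:

```python
# Generic cue-weighting model for a two-category speech decision.
# The weights below are made-up values for illustration only.
import numpy as np

def p_voiceless(vot_z, f0_z, w_vot=2.0, w_f0=0.8, bias=0.0):
    """Probability of a 'voiceless' response from two z-scored cues:
    voice onset time (VOT) and fundamental frequency (f0)."""
    z = w_vot * vot_z + w_f0 * f0_z + bias
    return 1.0 / (1.0 + np.exp(-z))

# A cue with a larger weight moves the response more per unit change:
print(p_voiceless(1.0, 0.0))  # ~0.88: one SD of VOT strongly favors 'voiceless'
print(p_voiceless(0.0, 1.0))  # ~0.69: the same change in f0 moves it less
```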
Article
Previous work has shown that a back-propagation network with recurrent connections can successfully model many aspects of human spoken word recognition (Norris, 1988, 1990, 1992, 1993). However, such networks are unable to revise their decisions in the light of subsequent context. TRACE (McClelland & Elman, 1986), on the other hand, manages to deal appropriately with following context, but only by using a highly implausible architecture that fails to account for some important experimental results. A new model is presented which displays the more desirable properties of each of these models. In contrast to TRACE, the new model is entirely bottom-up and can readily perform simulations with vocabularies of tens of thousands of words.
Article
We describe a model called the TRACE model of speech perception. The model is based on the principles of interactive activation. Information processing takes place through the excitatory and inhibitory interactions of a large number of simple processing units, each working continuously to update its own activation on the basis of the activations of other units to which it is connected. The model is called the TRACE model because the network of units forms a dynamic processing structure called “the Trace,” which serves at once as the perceptual processing mechanism and as the system's working memory. The model is instantiated in two simulation programs. TRACE I, described in detail elsewhere, deals with short segments of real speech, and suggests a mechanism for coping with the fact that the cues to the identity of phonemes vary as a function of context. TRACE II, the focus of this article, simulates a large number of empirical findings on the perception of phonemes and words and on the interactions of phoneme and word perception. At the phoneme level, TRACE II simulates the influence of lexical information on the identification of phonemes and accounts for the fact that lexical effects are found under certain conditions but not others. The model also shows how knowledge of phonological constraints can be embodied in particular lexical items but can still be used to influence processing of novel, nonword utterances. The model also exhibits categorical perception and the ability to trade cues off against each other in phoneme identification. At the word level, the model captures the major positive feature of Marslen-Wilson's COHORT model of speech perception, in that it shows immediate sensitivity to information favoring one word or set of words over others. At the same time, it overcomes a difficulty with the COHORT model: it can recover from underspecification or mispronunciation of a word's beginning. TRACE II also uses lexical information to segment a stream of speech into a sequence of words and to find word beginnings and endings, and it simulates a number of recent findings related to these points. The TRACE model has some limitations, but we believe it is a step toward a psychologically and computationally adequate model of the process of speech perception.
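The interactive-activation dynamics described here can be stated very compactly. Below is a minimal sketch of the update rule such models use; the parameter values and the two-unit 'lexicon' are ours for illustration, not the published TRACE implementation:

```python
# Minimal interactive-activation dynamics: two word units compete via
# mutual inhibition while receiving bottom-up input. Illustrative only.
import numpy as np

REST, MIN_A, MAX_A, DECAY = -0.1, -0.2, 1.0, 0.1

def iac_step(a, w, ext):
    """One update: excitatory/inhibitory net input plus decay toward rest."""
    net = w @ np.clip(a, 0, None) + ext  # only active units send output
    grow = net > 0
    delta = np.where(grow, net * (MAX_A - a), net * (a - MIN_A))
    return np.clip(a + delta - DECAY * (a - REST), MIN_A, MAX_A)

# Two word units that inhibit each other; input slightly favors unit 0.
w = np.array([[0.0, -0.3],
              [-0.3, 0.0]])
a = np.full(2, REST)
for _ in range(40):
    a = iac_step(a, w, ext=np.array([0.25, 0.15]))
print(a)  # unit 0 settles high (~0.7); unit 1 is suppressed below rest
```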
Article
A fundamental problem in the study of human spoken word recognition concerns the structural relations among the sound patterns of words in memory and the effects these relations have on spoken word recognition. In the present investigation, computational and experimental methods were employed to address a number of fundamental issues related to the representation and structural organization of spoken words in the mental lexicon and to lay the groundwork for a model of spoken word recognition. Using a computerized lexicon consisting of transcriptions of 20,000 words, similarity neighborhoods for each of the transcriptions were computed. Among the variables of interest in the computation of the similarity neighborhoods were: 1) the number of words occurring in a neighborhood, 2) the degree of phonetic similarity among the words, and 3) the frequencies of occurrence of the words in the language. The effects of these variables on auditory word recognition were examined in a series of behavioral experiments employing three experimental paradigms: perceptual identification of words in noise, auditory lexical decision, and auditory word naming. The results of each of these experiments demonstrated that the number and nature of words in a similarity neighborhood affect the speed and accuracy of word recognition. A neighborhood probability rule was developed that adequately predicted identification performance. This rule, based on Luce's (1959) choice rule, combines stimulus word intelligibility, neighborhood confusability, and frequency into a single expression. Based on this rule, a model of auditory word recognition, the neighborhood activation model, was proposed. This model describes the effects of similarity neighborhood structure on the process of discriminating among the acoustic-phonetic representations of words in memory. The results of these experiments have important implications for current conceptions of auditory word recognition in normal and hearing impaired populations of children and adults.
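The "single expression" referred to above has a well-known Luce-style form; schematically (our notation, simplified from the published rule):

```latex
% Schematic neighborhood probability rule: identification probability for
% a stimulus word S with n neighbors N_j, where p(.) is the word's
% intelligibility (stimulus/neighbor probability) and f its frequency.
\[
  p(\mathrm{ID})
  \;=\;
  \frac{p(S)\, f_{S}}
       {p(S)\, f_{S} \;+\; \displaystyle\sum_{j=1}^{n} p(N_j)\, f_{N_j}}
\]
```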