Robert E. Remez’s scientific contributions


Publications (21)


Santa Claus, the Tooth Fairy, and Auditory‐Visual Integration
Article · April 2021 · 14 Reads · 2 Citations · Jennifer S. Pardo · Lynne C. Nygaard · [...]

Speech perception and comprehension are critical for human communication. This chapter aims to evaluate the claim that auditory‐visual (AV) integration is an ability or skill that is predictive of AV benefit and that can explain both age and individual differences in the auditory‐visual benefit. One clear prediction that emerges from considering auditory‐visual integration an ability is that measures of enhancement should be correlated across different types of speech materials. One promising direction for improving the ability to quantify integration is neuroimaging of unimodal and bimodal speech perception. There has been little systematic research examining the developmental time course of lip‐reading across the adult life span. The suggestion that access to unimodal speech information is the principal determinant of the AV benefit has a number of important clinical implications. Neuroimaging studies are providing new insights into the mechanisms and structures mediating the AV benefit.
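
Since the claim under evaluation is that a single integration ability predicts AV benefit, it helps to see how enhancement is typically quantified. Below is a minimal Python sketch using the widely cited visual-enhancement score VE = (AV − A)/(1 − A); the listener scores are invented for illustration, and the correlation test mirrors the chapter's prediction that VE should correlate across materials if integration is a stable ability.

```python
import numpy as np

def visual_enhancement(a_only: np.ndarray, av: np.ndarray) -> np.ndarray:
    """Proportion of the room for improvement over audio-only that AV captures."""
    return (av - a_only) / (1.0 - a_only)

# Hypothetical proportion-correct scores for six listeners on two material types.
a_words  = np.array([0.40, 0.55, 0.62, 0.48, 0.70, 0.35])
av_words = np.array([0.66, 0.80, 0.78, 0.72, 0.88, 0.59])
a_sents  = np.array([0.52, 0.60, 0.71, 0.50, 0.75, 0.44])
av_sents = np.array([0.77, 0.83, 0.90, 0.70, 0.93, 0.68])

ve_words = visual_enhancement(a_words, av_words)
ve_sents = visual_enhancement(a_sents, av_sents)

# If AV integration is a stable listener ability, enhancement should be
# correlated across material types; r near zero argues against that claim.
r = np.corrcoef(ve_words, ve_sents)[0, 1]
print("VE (words):     ", ve_words.round(2))
print("VE (sentences): ", ve_sents.round(2))
print(f"cross-material correlation r = {r:.2f}")
```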


Phonotactics in Spoken‐Word Recognition

April 2021 · 107 Reads · 9 Citations

Phonotactics describes which phonological segments, and which sequences of segments, are legal or illegal in a given language. Once the word‐learning process has been triggered by a novel word with low phonotactic probability, neighborhood density acts on the fragile, nascent representation of the word form, helping to integrate the newly learned word into the extant lexicon. Several previous models of spoken‐word recognition proposed that acoustic‐phonetic input is mapped directly onto lexical word forms. A more recent approach, using the computational tools of network science, offers an alternative way to account for the influences of phonotactic probability and neighborhood density on spoken‐word recognition. The chapter describes a few of the diagnostic advances that have been made by considering phonotactic information, and provides an example of how phonotactic information could inform treatment.
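
To make the two lexical variables concrete, here is a minimal Python sketch over a hypothetical toy lexicon with simplified one-character phoneme transcriptions. Phonotactic probability is approximated by average biphone frequency, and neighborhood density by the standard one-phoneme substitution/addition/deletion count; both are simplified stand-ins for the corpus-based measures used in the literature.

```python
from collections import Counter

# Hypothetical toy lexicon: orthographic word -> simplified phoneme string
# (one character per phoneme; '&' stands in for the vowel in "cat").
LEXICON = {"cat": "k&t", "bat": "b&t", "cab": "k&b", "cast": "k&st", "dog": "dOg"}

def biphone_probability(form: str, lexicon: dict) -> float:
    """Average lexicon-wide frequency of the form's biphones (a rough
    stand-in for phonotactic probability)."""
    counts = Counter(w[i:i + 2] for w in lexicon.values() for i in range(len(w) - 1))
    total = sum(counts.values())
    biphones = [form[i:i + 2] for i in range(len(form) - 1)]
    return sum(counts[b] / total for b in biphones) / len(biphones)

def one_phoneme_apart(a: str, b: str) -> bool:
    """True if b is reachable from a by one substitution, addition, or deletion."""
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    short, long_ = sorted((a, b), key=len)
    return (len(long_) - len(short) == 1
            and any(long_[:i] + long_[i + 1:] == short for i in range(len(long_))))

def neighborhood_density(form: str, lexicon: dict) -> int:
    """Number of lexicon entries that are phonological neighbors of the form."""
    return sum(one_phoneme_apart(form, w) for w in lexicon.values() if w != form)

# A novel form with low probability and few neighbors is easy to spot as new;
# its dense or sparse neighborhood then shapes how it settles into the lexicon.
print(biphone_probability("k&t", LEXICON), neighborhood_density("k&t", LEXICON))
```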


Speaker Normalization in Speech Perception

April 2021 · 58 Reads · 55 Citations

People have different acoustic voice signatures because of the unique physiology of the jaw, tongue, lips, and throat. The largest acoustic differences between talkers are those among men, women, and children. Much of the research on talker normalization has focused on understanding how listeners map the acoustic properties of speech produced by men and women onto talker‐independent linguistic representations. This chapter reviews some practical vowel‐normalization methods and discusses the perceptual processes that listeners may use to accomplish speech recognition in the face of talker variation. It focuses on segment‐internal talker cues (intrinsic normalization) and provides a general overview of cortical organization for speech perception. An important difference between intrinsic and extrinsic normalization is that the latter requires the system to achieve and maintain a stable representation of the talker and their acoustic voice properties, so as to provide a frame of reference for further interpretation.
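
As a concrete example of an extrinsic method of the sort the chapter reviews, here is a minimal Python sketch of Lobanov (1971) z-score normalization, which rescales each talker's formants by that talker's own mean and standard deviation, removing vocal-tract-size differences while preserving the relative positions of the vowels. The formant values below are invented for illustration.

```python
import numpy as np

def lobanov(formants: np.ndarray) -> np.ndarray:
    """z-score each formant column within a single talker's vowel tokens."""
    return (formants - formants.mean(axis=0)) / formants.std(axis=0)

# Rows: /i/, /a/, /u/; columns: F1, F2 (Hz) for two hypothetical talkers.
adult_male = np.array([[300, 2200], [700, 1200], [320,  900]], dtype=float)
child      = np.array([[450, 3100], [1000, 1700], [480, 1300]], dtype=float)

# After normalization the two vowel spaces are directly comparable,
# even though the raw formant frequencies differ substantially.
print(np.round(lobanov(adult_male), 2))
print(np.round(lobanov(child), 2))
```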


Speech Perception and Reading Ability
Article · Full-text available · April 2021 · 170 Reads

This chapter focuses on research investigating whether speech perception, the initial encoding of linguistic input, may be a plausible source of reading difficulty: several of the deficits accompanying poor reading achievement could conceivably stem from deficits in the underlying quality of phoneme percepts. Research conducted to investigate whether speech‐perception deficits are common among struggling readers has primarily used three measures of speech perception: categorical perception, nonword repetition, and speech in noise. Speech repetition has frequently been used to investigate phonological processes related to reading ability. Research aimed at investigating listening difficulties in noisy backgrounds highlights the impact of the spectral characteristics of the masker on the perception of a simultaneous speech signal. A further area of research using the categorical perception procedure has investigated the influence of developmental weighting strategies on identification responses. The majority of studies have examined the relationship between speech perception in noise and reading ability using cross‐sectional designs.


Speech Perception by Children

April 2021 · 70 Reads · 1 Citation

This chapter reviews research on children's speech perception that contributes to an alternative view. The reports described thus far adhered to the view of the phoneme as the principal unit of language processing for adults and infants alike. Children may be predisposed to acquire new vocabulary items at early ages that are as acoustically distinct as possible from words already in their lexicons. One factor defining the experiences that help infants and young children acquire the attentional and organizational skills needed to recognize phonemes is that those experiences are fundamentally social in nature. Evidence that young children rely on less detailed acoustic patterns than adults do for lexical representation and retrieval can be gathered from several experiments. Some people manage to perform language functions, including speech perception, without phonemic segments. It is proposed that the model leading to this dual proficiency can be termed the structural refinement and differentiation model.


Perceptual Control of Speech

April 2021 · 37 Reads · 2 Citations

Like singing, speech is governed by a control system that requires sensory information about the effects of its actions, and the major source of this sensory feedback is the auditory system. This chapter addresses a number of issues related to the perceptual control of speech production. The study of postlingually deafened individuals represents the best window onto the role played by auditory feedback in a well‐developed human control system. The chapter reviews what is known about the neural processing of self‐produced sound. This includes work on corollary discharge or efference copy, as well as studies showing cortical suppression during vocalizing. The chapter addresses the topic of vocal learning and the general question about the relationship between speech perception and speech production. One of the key requirements for successful reinforcement learning is exploration. Sampling the control space allows the organism to learn the value of a range of different actions.
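
The closing point about exploration can be illustrated with the textbook epsilon-greedy scheme. This is a minimal sketch with hypothetical actions and reward probabilities, standing in for sampling a control space, not anything taken from the chapter itself.

```python
import random

# Hypothetical action set with unknown (to the learner) reward probabilities.
REWARD_PROB = {"action_a": 0.2, "action_b": 0.8, "action_c": 0.5}
values = {a: 0.0 for a in REWARD_PROB}   # running value estimates
counts = {a: 0 for a in REWARD_PROB}
EPSILON = 0.1                            # fraction of trials spent exploring

for _ in range(5000):
    if random.random() < EPSILON:        # explore: sample the control space
        action = random.choice(list(REWARD_PROB))
    else:                                # exploit: pick the best-known action
        action = max(values, key=values.get)
    reward = 1.0 if random.random() < REWARD_PROB[action] else 0.0
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]  # incremental mean

print({a: round(v, 2) for a, v in values.items()})  # estimates approach true probs
```

Without the exploratory trials, the learner would lock onto whichever action happened to pay off first and never learn the value of the alternatives, which is the point the abstract makes about sampling the control space.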


Perceptual Learning of Accented Speech

April 2021 · 127 Reads · 18 Citations

Individuals speaking in a second language tend to use the language in ways that differ from native speakers. As listeners build representations of nonnative‐accented speech, the need for explicit processing should decrease, and fewer attentional resources should be necessary for listeners to access the lexical items intended by nonnative speakers. A growing body of work suggests that listeners can adapt to nonnative speech after both long‐ and short‐term exposure to these speech varieties. The influences of accent strength and listener experience on accuracy and processing speed were gradient and nonlinear. An important issue that has drawn increasing attention is how accent adaptation may change across the life span. Many divergences from native norms in nonnative speech involve shifts in category boundaries rather than category mismatches. Although nonnative speech introduces variability into the speech signal, it is well established that native speech also contains substantial variability.


A Comprehensive Approach to Specificity Effects in Spoken‐Word Recognition

April 2021 · 55 Reads · 2 Citations

This chapter argues for a comprehensive approach to research on specificity effects in spoken‐word recognition. It focuses on findings demonstrating the roles that the talker, the speech signal, the listener, and the context play in indexical specificity effects in spoken‐word recognition. The chapter discusses theoretical frameworks and new research questions that emerge as researchers embrace such a comprehensive approach. It describes specificity effects associated with environmental background sounds and notes that spoken‐word recognition is influenced by regional variations in speech. The chapter also focuses on empirical tests of the time‐course and attention‐based hypotheses, and discusses whether specificity effects emanate from the mental lexicon or from a more general memory system. Interdisciplinary collaborations are crucial for the development of new ideas; consequently, it is important to encourage such efforts.


Perceptual Integration of Linguistic and Non‐Linguistic Properties of Speech

April 2021 · 128 Reads · 24 Citations

Speech is a complex auditory signal that contains multiple layers of linguistic and non‐linguistic structure. This chapter discusses empirical and theoretical work examining the extent to which linguistic and non‐linguistic properties are independently processed and represented. It considers research examining the impact of socially conditioned and linguistically relevant variation on the perception of speech, and reviews how familiarity with this lawful variation affects listeners’ perception of both linguistic and non‐linguistic forms. The chapter argues that variation due to talker and other factors is highly informative and has perceptual consequences for linguistic processing, that this variation is integral to the representation and processing of spoken language, and that models of speech perception must therefore include mechanisms for tracking and representing informative variation in linguistic form. A seminal demonstration of the dependence of talker recognition on phonetic instantiation comes from the investigation of talker identification from sinewave replicas of speech.


Word Stress in Speech Perception

April 2021 · 104 Reads · 23 Citations

Languages where stress placement in words can vary are said to have “lexical stress.” In lexical‐stress languages, the stress pattern of every polysyllabic word is lexically determined, that is, is part of the phonological representation of how speakers ought to produce the word. This chapter considers issues of vocabulary structure that could influence the perceptual relevance of lexical stress in a similar way. Fixed‐stress languages clearly vary in the extent to which their phonology and vocabulary encourage any perceptual role for word stress. The phonemic and syllabic properties of a language's vocabulary thus have direct implications for speech perception. The vocabulary statistics are based on measures of overlap, on the assumption that interword competition is the primary testbed for whether a particular factor will play a useful role in the identification of spoken words. Listeners can afford to let suprasegmental information regarding word identity be outweighed by segmental information.


Citations (15)


... In addition, the articulatory movements of the lips, tongue, and jaw are well known to play a crucial role in speech perception (e.g., Bicevskis et al. 2016; Chandrasekaran et al. 2009; Hartcher-O'Brien et al. 2017; Macaluso et al. 2016; Peelle and Sommers 2015; Rosenblum and Dorsi 2021; Schwartz et al. 2012; Turk 2014). According to Liberman's Motor Theory of Speech Perception, listeners discern speech by linking acoustic signals to articulatory motor commands, which aid in organizing speech sounds into phonetic categories despite variations among speakers (A. ...

Reference:

Segmenting Speech: The Role of Resyllabification in Spanish Phonology
Primacy of Multimodal Speech Perception for the Brain and Science
Citing Article · April 2021

... One facet of this preparedness is a perceptual system that is sensitive to the sensory features of speech sounds at birth, including multimodal features that co-occur when making those sounds. There is an intrinsic link between the sounds of speech and the articulatory movements that produce those sounds [3][4][5], and this link between perceptual and motor control processes is relevant from early development to adulthood [6]. Correspondingly, neural models of adult speech perception [7-9] and production [10,11] specify a central link between auditory and motor transformations. ...

Perceptual Control of Speech
Citing Article · April 2021

... Given the difficulty of defining a connection between signs/nodes to form a single-layer network of ASL similar to the single-layer phonological network of English [3], researchers interested in using the powerful quantitative tools of cognitive network science to examine ASL might consider alternative network architectures. For example, the authors of [69] examined how several different network architectures might account for the well-studied effects on language processing of phonotactic probability, or the frequency with which phonological segments and sequences of segments appear in words (for a review, see [70]). One type of architecture they examined was a bipartite network, a type of multilayer network with two different types of nodes, where connections run between nodes in one layer and nodes in the other layer, but not within the same layer [71]. ...

Phonotactics in Spoken‐Word Recognition
Citing Article · April 2021
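
The bipartite architecture described in the excerpt above is straightforward to prototype. The sketch below assumes the networkx library and a hypothetical three-word toy lexicon: word nodes occupy one layer, segment nodes the other, edges run only between layers, and a weighted projection onto the word layer recovers word-to-word links through shared segments.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Hypothetical toy lexicon: word -> list of phonological segments (ARPABET-like).
LEXICON = {"cat": ["k", "ae", "t"], "bat": ["b", "ae", "t"], "dog": ["d", "aa", "g"]}

G = nx.Graph()
G.add_nodes_from(LEXICON, bipartite="word")
for word, segments in LEXICON.items():
    for seg in set(segments):
        G.add_node(seg, bipartite="segment")
        G.add_edge(word, seg)  # edges cross layers only, never within a layer

# Project onto the word layer: two words are linked if they share a segment,
# with edge weights counting the shared segment types.
words = [n for n, d in G.nodes(data=True) if d["bipartite"] == "word"]
P = bipartite.weighted_projected_graph(G, words)
print(list(P.edges(data=True)))  # e.g. [('cat', 'bat', {'weight': 2})]
```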

... While valuable and informative as global measures of outcomes and benefit, the conventional, descriptive speech recognition tests routinely used in the audiology clinic were never designed to study individual differences and variability in outcomes and do not have the sensitivity and specificity to provide new insights into the underlying information processing operations and mechanisms of action employed in speech recognition and spoken language understanding (Pisoni et al., 2008). Moreover, these conventional clinical tests of speech recognition lack strong theoretical motivation and rationale (Pisoni, 2021). ...

Cognitive Audiology
Citing Article · April 2021

... Instead, they are motivated by morphophonology and associated with specific syllables, with their position determined by the stressed syllable. Finally, lexical stress, which is of particular interest to the present study, is a structural property of a word that specifies which syllable in the word is more prominent than any of the others (Cutler & Jesse, 2021; Zora, 2016; Cutler, 2005, 2015), and stressed syllables typically have a longer duration, greater intensity, and/or higher f0 than unstressed syllables. Different placement of stress occasionally creates lexically distinct minimal pairs (Cutler & Jesse, 2021; Cutler, 2015), as in the English stress-alternating homophones úpset (noun) and upsét (verb), carrying strong-weak (trochaic) and weak-strong (iambic) stress patterns, respectively. ...

Word Stress in Speech Perception
Citing Article · April 2021

... Phonological features grouped into different categories (major class features, laryngeal features, manner features, and place features) may affect accent rating, in that sensitivity to the accent could derive from 'feature boundary violation'. Researchers have proposed a 'phonological feature geometry' (Figure 7), in which a higher order in the feature hierarchy corresponds to a heavier feature weight (LaCharité and Prévost, 1999; Mah and Archibald, 2003; De Jong and Hao, 2018; Jones and Schnupp, 2021), and an 'L1-L2 feature boundary violation' involving an articulator node would be more difficult to acquire than one involving a terminal node. This indicates that the greater the weight on the 'L1-L2 feature boundary violation', the more likely a non-native substitution is to be dissimilar from the target segment from the native listeners' perspective, and therefore to be judged as more heavily accented. ...

How Does the Brain Represent Speech?

... Previous research has found that VOT, closure duration, and CF0 are among the perceptible acoustic cues to the voicing distinction, and that each cue's relative influence on the listener's percept remains unclear (Lisker, 1957; Port and Dalby, 1982; Whalen et al., 1993). There is still debate over the weighting of cues, especially whether one acoustic cue predominates over another (Raphael, 2021). Listeners may employ any and all cues that are available in the speech stream; however, some cues are more effective in perception than others (Diehl and Kluender, 1989). ...

Acoustic Cues to the Perception of Segmental Phonemes
Citing Article · April 2021

... The role of family dialect as an influencing mechanism for the development of English pronunciation will arise from the principle that "the various dialects, or main forms of speech, being the first developed and the most frequently used in early language acquisition, lay down a phonetic pattern which is not easily modified in subsequent life" [39]. This can be especially noticeable in respondents who report using little Mandarin at home. ...

Perception of Dialect Variation
Citing Article · April 2021

... Mirroring the findings observed for intersubject variability, the speakers' auditory /ba/ (or visual /ga/) tokens that were frequently misperceived as "da" were more potent in evoking the McGurk illusion on conflicting audiovisual trials (A_ba + V_ga). Ample evidence has shown that speech categorization depends heavily on the speakers' acoustic features (i.e., voice onset time, F2, and F2 transitions, etc.) and facial articulatory features (Bent & Holt, 2017; Jiang & Bernstein, 2011; Nusbaum & Magnuson, 1997; Olmstead et al., 2020), which vary between speakers due to biological and social factors (Johnson & Sjerps, 2021; Kleinschmidt, 2019; Pisoni, 1993). These findings suggest that observers weight auditory and visual information by taking into account several sources of variability: sensory observation noise and within-category variability (e.g., variations ...).
[Fig. 4: Main results of Experiment 3. (a) Box plot showing the distribution of McGurk illusion susceptibility across speakers (raw data and rank-transformed data).] ...

Speaker Normalization in Speech Perception
Citing Article · April 2021