
István Winkler
Research Centre for Natural Sciences · Institute of Cognitive Neuroscience and Psychology
PhD, DSc
About
Publications: 313
Reads: 42,184
Citations: 17,125
Additional affiliations
September 2008 - August 2016
June 2005 - present: Research Centre for Natural Sciences, Hungarian Academy of Sciences
Position: Consultant
Description: Previously Institute of Psychology, Hungarian Academy of Sciences
Publications (313)
In most cultures, infant-directed speech (IDS) is used to communicate with young children. The main role IDS plays in parent-child interactions appears to change over time, from conveying emotion to facilitating language acquisition. There is EEG evidence for the discrimination of IDS from adult-directed speech (ADS) at birth; however, less is known...
A crucial skill in infant language acquisition is learning of the native language phonemes. This requires the ability to group complex sounds into distinct auditory categories based on their shared features. Problems in phonetic learning have been suggested to underlie language learning difficulties in dyslexia, a developmental reading-skill defici...
Newborn infants have been shown to extract temporal regularities from sound sequences, both in the form of learning regular sequential properties, and extracting periodicity in the input, commonly referred to as a beat. However, these two types of regularities are often indistinguishable in isochronous sequences, as both statistical learning and be...
Infants are able to extract words from speech early in life. Here we show that the quality of forming longer-term representations for word forms at birth predicts expressive language ability at the age of two years. Seventy-five neonates were familiarized with two spoken disyllabic pseudowords. We then tested whether the neonate brain predicts the...
Hearing is one of the earliest senses to develop and is quite mature by birth. Contemporary theories assume that regularities in sound are exploited by the brain to create internal models of the environment. Through statistical learning, internal models extrapolate from patterns to predictions about subsequent experience. In adults, altered brain r...
Many aspects of cognitive ability and brain function that change as we age look like deficits on account of measurable differences in comparison to younger adult groups. One such difference occurs in auditory sensory responses that index perceptual learning. Meta-analytic findings show reliable age-related differences in auditory responses to repet...
The effects of lexical meaning and lexical familiarity on auditory deviance detection were investigated by presenting oddball sequences of words, while participants ignored the stimuli. Stimulus sequences were composed of words that were varied in word class (nouns vs. functions words) and frequency of language use (high vs. low frequency) in a fac...
Infants are able to extract words from speech early in life. Here we show that the quality of word-form learning at birth predicts language development at the age of two years. Seventy-five neonates were familiarized with two spoken disyllabic pseudowords. We then tested whether the neonate brain predicts the second syllable from the first one by p...
People with normal hearing can usually follow one of several concurrent speakers. Speech tempo affects both the separation of concurrent speech streams and information extraction from them. The current study varied the tempo of two concurrent speech streams to investigate these processes in a multi-talker situation. Listeners performed a target-...
Introduction: The global COVID-19 pandemic has affected the economy, daily life, and mental/physical health. The latter includes the use of electroencephalography (EEG) in clinical practice and research. We report a survey of the impact of COVID-19 on the use of clinical EEG in practice and research in several countries, and the recommendations of a...
Speech unfolds at different time scales. Therefore, neuronal mechanisms involved in speech processing should likewise operate at different (corresponding) time scales. The present study aimed to identify speech units relevant for selecting speech streams in a multi-talker situation. Functional connectivity was extracted from the continuous EEG whil...
A study by Tóth, Kocsis, Háden, Szerafin, Shinn-Cunningham, and Winkler [Neuroimage 141, 108 − 119 (2016)] reported that spatial cues (such as interaural time differences, ITDs) that differentiate the perceived sound source directions of a target tone sequence (figure) from simultaneous distracting tones (background) did not improve the ability of par...
Acoustic predictability has been shown to affect auditory stream segregation, while linguistic predictability is known to be an important factor in speech comprehension. We tested the effects of linguistic predictability on auditory stream segregation and target detection by assessing the event-related potentials elicited by targets and distractors...
Fatigue is a core symptom in many psychological disorders and it can strongly influence everyday productivity. As fatigue effects have been typically demonstrated after long hours of time on task, it was surprising that in a previous study, we accidentally found a decline of temporal order judgment (TOJ) performance within 5–8 min. After replicatin...
The task of making sense of the world around us is supported by brain processes that simplify the environment. For example, repetitive patterns of sensory input help us to predict future events. This study builds on work, suggesting that sensory predictions are heavily influenced by first impressions. We presented healthy adults with a sequence com...
Human listeners can follow the voice of one speaker while several others are talking at the same time. This process requires segregating the speech streams from each other and continuously directing attention to the target stream. We investigated the functional brain networks underlying this ability. Two speech streams were presented simultaneously...
The formation of auditory events requires integration between successive sounds. There is a temporal limit below which a single sound event is perceived while above which a second perceptual event is formed. Behavioral studies applying the Temporal Order Judgment paradigm showed that this boundary is between 20 and 70 ms. Here we provide event-rela...
Human listeners can focus on one speech stream out of several concurrent ones. The present study aimed to assess the whole-brain functional networks underlying a) the process of focusing attention on a single speech stream vs. dividing attention between two streams and b) speech processing on different time-scales and depth. Two spoken narratives w...
Supplementary materials:
- NIRS FC networks significantly affected by TASK TYPE: connections stronger for the tracking than for the detection task (tracking-task-specific networks), with the regional distribution of the functional connections shown per panel (TIF)
- Grand average spectral power: spectral density shown for all 64 channels separately, with the scalp distribution of power at 1 Hz, 10 Hz, and 20 Hz plotted above the diagram (TIF)
- Supplementary information on EEG FC and behavioral data correlation analysis (DOCX)
- Summary of the number of connections within the subnetworks showing significant ATTENTION or TASK TYPE effects, separately for the three EEG bands; each line represents a ROI, grouped by brain lobe (Frontal, Cingular, Temporal, and Parietal) (DOCX)
- Results of the assessment of source localization accuracy: ROI-pair distances above the 15 mm and 20 mm thresholds plotted as lines connecting the corresponding ROI centers; the threshold was defined as more than half of a ROI's voxel source activity being unreliably attributable to another ROI (DOCX)
- Assessment of EEG source localization accuracy (DOCX)
- Effect size measure of FC statistics (DOCX)
- Source regions and their abbreviations for EEG and NIRS sources, grouped according to large-scale anatomical areas (DOCX)
- Results of the post hoc pairwise dependent-sample t-tests performed on the average subnetwork connectivity strength values (Student's t, degrees of freedom, p, and Cohen's d effect size) (DOCX)
- Summary of the number of connections within the subnetworks showing a significant TASK TYPE effect for the NIRS deoxygenated hemoglobin concentration; ROIs grouped by brain lobe (DOCX)
- NIRS functional connectivity analysis (DOCX)
- Supplementary results: the effect of LOCATION (DOCX)
- EEG source localization accuracy results: all ROI pairs above the threshold degree of overlap, reported separately for 15 and 20 mm localization-error distances (DOCX)
- Stimuli: syntactic violation (DOCX)
- Supplementary results: NIRS deoxygenated hemoglobin concentration (DOCX)
- Statistical analysis: extended description of the NBS statistics (DOCX)
- NIRS recording and preprocessing (DOCX)
- Supplementary results: behavioral responses, testing the effects of location (DOCX)
- Supplementary results: the effect of the DETECTION TASK TYPE on EEG FC (DOCX)
Large-scale functional brain network correlates of speech predictability effects on speaker separation
The predictability of speech influences the quality of comprehension, especially in noisy environments. We explored the large-scale functional brain networks underlying speech perception in the presence of an interfering stream while varying sema...
Grouping distinct, temporally separated sounds is assumed to follow Gestalt principles, such as similarity or proximity. In the auditory streaming paradigm, the probability of perceiving all sounds as part of the same repeating pattern (the integrated percept) increases when the interstimulus interval (ISI) is increased from medium to long interval...
Auditory perceptual inference engages learning of complex statistical information about the environment. Inferences help us simplify perception by highlighting what can be predicted on the basis of prior learning (through the formation of internal "prediction" models) and what might be new, potentially necessitating an investment of resources to...
The notion of automatic syntactic analysis received support from some event-related potential (ERP) studies. However, none of these studies tested syntax processing in the presence of a concurrent speech stream. Here we present two concurrent continuous speech streams, manipulating two variables potentially affecting speech processing in a fully cr...
The dynamics of perceptual bistability, the phenomenon in which perception switches between different interpretations of an unchanging stimulus, are characterised by very similar properties across a wide range of qualitatively different paradigms. This suggests that perceptual switching may be triggered by some common source. However, it is also po...
Predictive coding is arguably the currently dominant theoretical framework for the study of perception. It has been employed to explain important auditory perceptual phenomena, and it has inspired theoretical, experimental, and computational modelling efforts aimed at describing how the auditory system parses the complex sound input into meaningful...
In perceptual multi-stability, perception stochastically switches between alternative interpretations of the stimulus allowing examination of perceptual experience independent of stimulus parameters. Previous studies found that listeners show temporally stable idiosyncratic switching patterns when listening to a multi-stable auditory stimulus, such...
This paper reports on a preliminary study carried out to examine whether the phoneme classes defined by linguists elicit distinguishable electroencephalographic (EEG) responses from the brain. To this end event-related potentials (ERP) were recorded in response to three-syllabic nonsense words (non-words) with three consonant-vowel transitions (CVC...
The organization of functional brain networks changes across human lifespan. The present study analyzed functional brain networks in healthy full-term infants (N = 139) within 1-6 days from birth by measuring neural synchrony in EEG recordings during quiet sleep. Large-scale phase synchronization was measured in six frequency bands with the Phase L...
Supplementary material includes a supplementary text, three figures, and nine tables
https://figshare.com/s/0531ebc55884a85dd38d
Multistability in perception is a powerful tool for investigating sensory–perceptual transformations, because it produces dissociations between sensory inputs and subjective experience. Spontaneous switching between different perceptual objects occurs during prolonged listening to a sound sequence of tone triplets or repeated words (termed auditory...
Auditory scene analysis (ASA) refers to the process(es) of parsing the complex acoustic input into auditory perceptual objects representing either physical sources or temporal sound patterns, such as melodies, which contributed to the sound waves reaching the ears. A number of new computational models accounting for some of the perceptual phenomen...
Significance statement:
Our research presents the first definite evidence for the auditory system prioritizing transitional probabilities over probabilities of individual sensory events. Forming representations for transitional probabilities paves the way for predictions of upcoming sounds. Several recent theories suggest that predictive processin...
Background and aims: Syntax needs to be processed in order to understand a sentence. Here we studied the event-related brain potentials (ERPs) elicited by syntactic violations: the left anterior negativity (LAN) and the P600. The LAN signals morphosyntactic violations, whereas the P600 is assumed to reflect structural reanalysis when encountering grammatically...
The auditory environment typically comprises several simultaneously active sound sources. In contrast to the perceptual segregation of two concurrent sounds, the perception of three simultaneous sound objects has not yet been studied systematically. We conducted two experiments in which participants were presented with complex sounds containing sou...
Multi-stability refers to the phenomenon of perception stochastically switching between possible interpretations of an unchanging stimulus. Despite considerable variability, individuals show stable idiosyncratic patterns of switching between alternative perceptions in the auditory streaming paradigm. We explored correlates of the individual switchi...
Supplementary tables:
- Spearman rank-order correlation coefficients between the perceptual and the executive functions, personality traits, and creativity (MDS X/Y/Z = the first/second/third dimension of the MDS; Duration of integrated = average phase duration of the integrated percept in seconds; Duration o...)
- Descriptive statistics of the measured variables (Mean (SD), Min, Max; α = Cronbach's alpha for the personality questionnaires and inter-rater reliability for the creativity tasks; MDS X = the first dimension of the MDS; M...)
The ability to isolate a single sound source among concurrent sources is crucial for veridical auditory perception. The present study investigated the event-related oscillations evoked by complex tones, which could be perceived as a single sound and tonal complexes with cues promoting the perception of two concurrent sounds by inharmonicity, onset...
While subjective reports provide a direct measure of perception, their validity is not self-evident. Here, the authors tested three possible biasing effects on perceptual reports in the auditory streaming paradigm: errors due to imperfect understanding of the instructions, voluntary perceptual biasing, and susceptibility to implicit expectations. (...
Although first impressions are known to impact decision-making and to have prolonged effects on reasoning, it is less well known that the same type of rapidly formed assumptions can explain biases in automatic relevance filtering outside of deliberate behaviour. This paper features two studies in which participants have been asked to ignore sequenc...
In the adult auditory system, deviant detection and updating the representation of the environment is reflected by the event-related potential (ERP) component termed the mismatch negativity (MMN). MMN is elicited when a rare-pitch deviant stimulus is presented amongst frequent standard pitch stimuli. The same stimuli also elicit a similar discrimin...
Complex first, simple later: Higher-order auditory capabilities in preverbal infants
The perceptual resolution of basic auditory features is much lower in young infants than in adults. However, regarding higher-order auditory capabilities, infants perform qualitatively similarly to adults. Infants are competent perceivers of sound. They form audit...
In this work, we compare two skewness-based salient event detector algorithms, which can detect transients in human speech signals. Speech transients are characterized by rapid changes in signal energy. The purpose of this study was to compare the identification of transients by two different methods based on skewness calculation in order to develo...
Methods (Fig 2): a. Striped blocks contained a 30 ms standard and a 60 ms deviant sound; these roles were reversed in grey blocks. Conditions were distinguished by the stability of the relative tone probabilities (2.4 min versus 0.8 min). b. A halves analysis was conducted to compare MMN size imm...
Communication by sounds requires that the communication channels (i.e., speech/speakers and other sound sources) have been established. This allows listeners to separate concurrently active sound sources, to track their identity, to assess the type of message arriving from them, and to decide whether and when to react (e.g., reply to the message). We propose t...
Most people are able to recognise familiar tunes even when played in a different key. It is assumed that this depends on a general capacity for relative pitch perception; the ability to recognise the pattern of inter-note intervals that characterises the tune. However, when healthy adults are required to detect rare deviant melodic patterns in a se...
By measuring event-related brain potentials (ERPs), the authors tested the sensitivity of the newborn auditory cortex to sound lateralization and to the most common cues of horizontal sound localization.
Sixty-eight healthy full-term newborn infants were presented with auditory oddball sequences composed of frequent and rare noise segments in four...
The notion of predictive sound processing suggests that the auditory system prepares for upcoming sounds once it has detected regular features within a sequence. Here we investigated whether predictive processes are operating at birth in the human auditory system. Event-related potentials (ERP) were recorded from healthy newborns to occasional asce...
Separating concurrent sounds is fundamental for a veridical perception of one's auditory surroundings. Sound components that are harmonically related and start at the same time are usually grouped into a common perceptual object, whereas components that are not in harmonic relation or have different onset times are more likely to be perceived in te...