Article

Pitch-class distribution modulates the statistical learning of atonal chord sequences


... In particular, the Competitive Chunker [25], PARSER [26], Information Dynamics of Music (IDyOM) [27], and n-gram models [28] rest on the hypothesis that music is acquired by concatenating chunks. Computational studies calculate statistical distributions in music, devise corresponding models, and then evaluate the validity of these models through neurological and behavioural experiments [27,29,30]. Notably, SL in Markov models, which correspond to n-gram models based on conditional probability [31], overlaps with SL in many other fields of study, such as neuroscience, behavioural science, and computational science. ...
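The n-gram/Markov computation these models share can be sketched in a few lines. This is an illustrative Python sketch, not any of the cited implementations: a first-order model estimates P(next | current) by counting adjacent pairs in a sequence of discrete events.

```python
from collections import Counter, defaultdict

def transition_probabilities(sequence):
    """Estimate first-order transitional probabilities P(next | current)
    from a sequence of discrete events (e.g., pitches or chords)."""
    pair_counts = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        pair_counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(counts.values()) for nxt, n in counts.items()}
        for current, counts in pair_counts.items()
    }

# Toy pitch sequence (hypothetical data): C is usually followed by E,
# occasionally by G.
melody = ["C", "E", "C", "E", "C", "G", "C", "E"]
tp = transition_probabilities(melody)
```

An n-gram model of order n is the same computation with the "current" key widened to a tuple of the n most recent events.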
... In other words, distinct statistical knowledge of high- and low-pitch sequences can be acquired simultaneously. Another neurophysiological study suggested that SL is also possible for harmony sequences in which the highest and lowest pitches are randomly distributed without regularity [29]. Together, neural studies support the hypothesis that the SL of the melody and the SL of the bass line interact with, and are partly independent of, each other within the framework of the Gestalt principle in music [60]. ...
... In summary, this study may suggest that the SL of the melody and bass line correlate with and are partly independent of each other in terms of TP distribution. These findings may also be in agreement with the hypothesis in neural studies that the SL of the melody and bass line interact with and are partly independent of each other [29,61,65]. In the present studies, it was expected that this would occur based on some very specific findings in the neuroscience literature, but a previous neural study also suggested that SL could be modulated by music-specific features such as tonal mode and key [29]. ...
Article
Statistical learning is the ability to learn based on transitional probability (TP) in sequential information, which has been considered to contribute to creativity in music. The interdisciplinary theory of statistical learning examines statistical learning as a mechanism of human learning. This study investigated how the TP distribution and the conditional entropy of the TPs of the melody and bass line in music interact with each other, using the highest and lowest pitches in Beethoven's piano sonatas and Johann Sebastian Bach's Well-Tempered Clavier. Results for the two composers were similar. First, the results detected specific statistical characteristics that are unique to each melody and bass line as well as general statistical characteristics that are shared between the melody and bass line. Additionally, a correlation of the conditional entropies sampled from the TP distribution could be detected between the melody and bass line. This suggests that the variability of entropies interacts between the melody and bass line. In summary, this study suggested that the TP distributions and entropies of the melody and bass line interact with but are partly independent of each other.
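The quantity the abstract relates to the TP distribution, conditional entropy H(next | current), can be illustrated with a minimal Python sketch. The sequences here are toy data, not the Beethoven or Bach corpora used in the study; the same function would be applied separately to the melody (highest-pitch) and bass (lowest-pitch) lines before correlating the results.

```python
import math
from collections import Counter, defaultdict

def conditional_entropy(sequence):
    """Conditional entropy H(next | current) in bits, computed from the
    empirical first-order transitional-probability distribution."""
    pair_counts = defaultdict(Counter)
    for a, b in zip(sequence, sequence[1:]):
        pair_counts[a][b] += 1
    total = sum(sum(c.values()) for c in pair_counts.values())
    h = 0.0
    for a, counts in pair_counts.items():
        n_a = sum(counts.values())
        p_a = n_a / total
        for n_ab in counts.values():
            p_b_given_a = n_ab / n_a
            h -= p_a * p_b_given_a * math.log2(p_b_given_a)
    return h

# Hypothetical melody and bass lines; in the study these would be the
# highest- and lowest-pitch sequences of each piece.
melody = ["C", "E", "G", "E", "C", "G", "E", "C"]
bass = ["C", "C", "F", "C", "G", "C", "F", "C"]
print(conditional_entropy(melody), conditional_entropy(bass))
```

A fully deterministic sequence (every context has exactly one continuation) has zero conditional entropy; higher values mean less predictable continuations.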
... Using a Markov chain [11,13,14,16,17] or a word segmentation task [10,12,15,17], previous studies demonstrated that auditory statistical learning can be reflected in auditory event-related responses, such as P1/P1m [12,16], N1/N1m or mismatch negativity/mismatch field [10–15,17], and P2/P2m [11], peaking in a latency range from 50 to 200 ms after stimulus onset. Some studies suggest that the relationship between the learning effect and P1m involves music expertise and specialised training experience [12,16]. ...
... With this prediction for upcoming tones, tones with higher transitional probability (i.e., more predictable tones) lead to decreased amplitudes and shortened latencies of neural responses. In contrast, tones with lower transitional probability (i.e., less predictable tones) lead to increased neural response amplitudes [10–17]. In the present study, participants were presented with two simultaneous tone sequences containing tones with higher and lower transitional probabilities (i.e., frequent and rare tones, respectively). ...
... Paraskevopoulos et al. demonstrated that, in the initial phase of statistical learning, learning effects on P1, but not N1, were larger in musicians compared with non-musicians [12]. In our previous study, the statistical learning of chord sequences was reflected in P1 [16]. Another study reported that, in learning chord progressions with conditional probability, the learning effects on later responses such as the early anterior negativity (EAN: 150–250 ms) [35,36] were facilitated by musical training [37]. ...
Article
Full-text available
When we are exposed to a novel stimulus sequence, we can learn the sequence by extracting a statistical structure that is potentially embedded in it. This mechanism is called statistical learning, and it is considered a fundamental and domain-general process that is innate in humans. In the real-world environment, humans are inevitably exposed to auditory sequences that often overlap with one another, such as speech sound streams from multiple speakers or entangled melody lines generated by multiple instruments. The present study investigated how single and dual attention modulates brain activity reflecting statistical learning when two auditory sequences are presented simultaneously. The results demonstrated that the effect of statistical learning was reflected in more pronounced neural activity when listeners paid attention to only one sequence and ignored the other, rather than paying attention to both sequences. Biased attention may thus be an essential strategy when learners are exposed to multiple information streams.
... Finally, the SL effects manifest as a difference in amplitudes between stimuli with lower and higher TPs. A body of studies detected SL effects on ERP/ERF components such as P50 (Paraskevopoulos et al., 2012; Daikoku et al., 2016, 2017c; Daikoku and Yumoto, 2017), N100 (Sanders et al., 2002; Furl et al., 2011; Daikoku et al., 2014, 2015, 2017c), mismatch negativity (MMN; Koelsch et al., 2016; François et al., 2017; Moldwin et al., 2017), P200 (Cunillera et al., 2006; De Diego Balaguer et al., 2007; François and Schön, 2011; Furl et al., 2011), P300 (Batterink et al., 2015), and N400 (Sanders et al., 2002; Cunillera et al., 2006, 2009; François and Schön, 2011; François et al., 2013, 2014). Compared with later auditory responses, the earlier auditory responses that peak at 20–80 ms (e.g., P50) have been attributed to parallel cortico-cortical or thalamo-cortical connections between the primary auditory cortex and the superior temporal gyrus (Adler et al., 1982). ...
... The results were consistent with a body of previous studies on SL: the brain learned the TPs of the sequences, predicted stimuli with a high TP (i.e., frequent stimuli), and inhibited the neural response to those stimuli. The SL effects ultimately manifest as a difference in amplitudes between stimuli with high and low TPs (François et al., 2013, 2017; Paraskevopoulos et al., 2012; Daikoku et al., 2014, 2015, 2016; Koelsch et al., 2016). These SL effects (i.e., difference amplitudes between stimuli with high and low TPs), however, could not be detected after the Markov chains of the two sequences were exchanged in the last portion. ...
... SL is reflected in the early component of P1 (Paraskevopoulos et al., 2012; Daikoku et al., 2016, 2017c) as well as in the ...

[Figure 4 caption] Mean peak amplitudes (upper) and latencies (lower) of P1m. Green bars: responses to chords consisting of two rare tones in the attended and ignored sequences; pink bars: responses to chords consisting of a rare tone in the attended sequence and a frequent tone in the ignored sequence; blue bars: responses to chords consisting of a frequent tone in the attended sequence and a rare tone in the ignored sequence; black bars: responses to chords consisting of two frequent tones in the attended and ignored sequences. Asterisks indicate significant differences (p < 0.05, Bonferroni-corrected).
Article
Full-text available
In an auditory environment, humans are frequently exposed to overlapping sound sequences such as those made by human voices and musical instruments, and we can acquire information embedded in these sequences via attentional and nonattentional accesses. Whether the knowledge acquired by attentional accesses interacts with that acquired by nonattentional accesses is unknown, however. The present study examined how the statistical learning (SL) of two overlapping sound sequences is reflected in neurophysiological and behavioral responses, and how the learning effects are modulated by attention to each sequence. SL in this experimental paradigm was reflected in a neuromagnetic response predominantly in the right hemisphere, and the learning effects were not retained when attention to the tone streams was switched during the learning session. These results suggest that attentional and nonattentional learning scarcely interact with each other and that there may be a specific system for nonattentional learning, which is independent of attentional learning.
... As a result of the implicit nature of SL, however, humans cannot verbalize exactly what they statistically learn. Nonetheless, a body of evidence indicates that neurophysiological and behavioural responses can unveil musical and linguistic SL effects [14,32,34–44] within the framework of predictive coding [20]. Furthermore, recent studies have detected the effects of musical training on the linguistic SL of words [41,43,45–47], as well as interactions between musical and linguistic SL [10] and between auditory and visual SL [44,48–50]. ...
... Although many studies of word segmentation detected SL effects on the N400 component [43,46,88,89,93,94,105], which is generally considered to reflect semantic meaning in language and music [106–108], the auditory brainstem response (ABR) [96], P50 [41], N100 [94], mismatch negativity (MMN) [40,44,98], P200 [46,89,105], N200-250 [44,47], and P300 [83] have also been reported to reflect SL effects (Table 1). In addition, other studies using Markov models reported that SL is reflected in the P50 [14,36,37], N100 [10,14,32,35], and P200 components [35]. Compared with later auditory responses such as the N400, the auditory responses that peak earlier than 10 ms after stimulus presentation (e.g., ABR) and at 20–80 ms, around the P50 latency, have been attributed to parallel thalamo-cortical connections or cortico-cortical connections between the primary auditory cortex and the superior temporal gyrus [109]. ...
... Neurophysiological effects of word segmentation, such as the N400, reflecting semantic meaning in language [106–108], may be associated with the neural basis underlying linguistic functions as well as statistical computation itself. On the other hand, our previous study using a first-order Markov model [36] could not detect an N400, possibly because of the stimulus onset asynchrony of the sequences (i.e., 500 ms). Future studies are needed to verify SL effects on the N400 using the Markov model. ...
Article
Full-text available
Statistical learning (SL) is a method of learning based on the transitional probabilities embedded in sequential phenomena such as music and language. It has been considered an implicit and domain-general mechanism that is innate in the human brain and that functions independently of the intention to learn and awareness of what has been learned. SL is an interdisciplinary notion that incorporates information technology, artificial intelligence, musicology, and linguistics, as well as psychology and neuroscience. A body of recent studies has suggested that SL can be reflected in neurophysiological responses within the framework of information theory. This paper reviews a range of work on SL in adults and children that suggests overlapping and independent neural correlates of music and language, and that indicates impairments of SL. Furthermore, this article discusses the relationships between the order of transitional probabilities (TPs) (i.e., the hierarchy of local statistics) and entropy (i.e., global statistics) with regard to SL strategies in the human brain; emphasises the importance of information-theoretical approaches to understanding domain-general, higher-order, and global SL covering both real-world music and language; and proposes promising approaches for applications in therapy and pedagogy from the perspectives of psychology, neuroscience, computational studies, musicology, and linguistics.
... In addition, these statistical learning effects reflected in neural responses could be observed in both ERPs and their magnetic counterparts. This neural effect could be observed in responses at approximately 50 ms (i.e., P1/P1m) (Paraskevopoulos et al., 2012; Daikoku et al., 2016), 100 ms (i.e., N1/N1m) (Abla et al., 2008; Furl et al., 2011; Daikoku et al., 2014, 2015; Koelsch et al., 2016), and 200 ms (i.e., P2/P2m) (Furl et al., 2011) after stimulus onset. ...
... As a result, the differences in the amplitude and latency of the responses to tones between higher and lower transitional probabilities could occur during statistical learning. In previous studies, such differences could be detected in P1m (Paraskevopoulos et al., 2012;Daikoku et al., 2016) and N1 responses (Abla et al., 2008) when participants listened to a continuous concatenation of tone words. Because tone transitions between words were more variable than within words, the amplitudes for the final tone within words were significantly lower than those for the initial tone, after learning word borders (i.e., word segmentation). ...
Article
Previous neural studies have supported the hypothesis that statistical learning mechanisms are used broadly across different domains such as language and music. However, these studies have only investigated a single aspect of statistical learning at a time, such as recognizing word boundaries or learning word-order patterns. In this study, we investigated how the two levels of statistical learning, for recognizing word boundaries and for word ordering, could be reflected in neuromagnetic responses, and how acquired statistical knowledge is reorganised when the syntactic rules are revised. Neuromagnetic responses to a Japanese-vowel sequence (a, e, i, o, and u), presented every 0.45 s, were recorded from 14 right-handed Japanese participants. The vowel order was constrained by a Markov stochastic model such that five nonsense words (aue, eao, iea, oiu, and uoi) were chained with an either-or rule: the probability of the forthcoming word was statistically defined (80% for one word; 20% for the other) by the most recent two words. All of the word transition probabilities (80% and 20%) were switched in the middle of the sequence. In the first and second quarters of the sequence, the neuromagnetic responses to the words that appeared with a higher transitional probability were significantly reduced compared with those that appeared with a lower transitional probability. After the word transition probabilities were switched, the response reduction was replicated in the last quarter of the sequence. The responses to the final vowels in the words were significantly reduced compared with those to the initial vowels in the last quarter of the sequence. The results suggest that both within-word and between-word statistical learning are reflected in neural responses. The present study supports the hypothesis that listeners learn larger structures, such as phrases, first and subsequently extract smaller structures, such as words, from the learned phrases.
The present study provides the first neurophysiological evidence that the correction of statistical knowledge requires more time than the acquisition of new statistical knowledge.
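The stimulus design can be sketched roughly as follows. As a simplifying assumption, the next word in this sketch depends only on the most recent word, whereas the study conditioned on the most recent two words; the word inventory and the 80%/20% rule with a mid-sequence switch follow the abstract.

```python
import random

# Hypothetical frequent/rare successor table for the five nonsense words.
words = {"aue": ("eao", "iea"), "eao": ("iea", "oiu"),
         "iea": ("oiu", "uoi"), "oiu": ("uoi", "aue"),
         "uoi": ("aue", "eao")}

def generate(n_words, p_frequent=0.8, seed=0):
    """Chain n_words nonsense words: each word is followed by its 'frequent'
    successor with probability p_frequent, else by its 'rare' successor."""
    rng = random.Random(seed)
    seq = ["aue"]
    for _ in range(n_words - 1):
        frequent, rare = words[seq[-1]]
        seq.append(frequent if rng.random() < p_frequent else rare)
    return seq

stream = generate(1000)
# Switching all transition probabilities mid-sequence amounts to generating
# the second half with the roles reversed (p_frequent = 0.2).
stream += generate(1000, p_frequent=0.2, seed=1)
```

Neural responses would then be compared between words that arrived via the 80% transitions and those that arrived via the 20% transitions, before and after the switch.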
... The Markov chain, first reported by Markov [19], is a mathematical system in which the probability of the forthcoming state is statistically defined only by the latest state. The use of Markov chains embedded in tone sequences allows us to verify the statistical structure of music [12–18] and statistical learning and knowledge in humans [20–23]. ...
... A tone with a higher transitional probability in a musical score may be one that a composer is more likely to choose compared with other tones with lower transitional probability [20–23,58–60]. Thus, the transitional probability matrix calculated from music may represent the characteristics of a composer's statistical knowledge by which a forthcoming tone is implicitly defined. ...
Article
Learning and knowledge of the transitional probabilities in sequences such as music, called statistical learning and statistical knowledge, are considered implicit processes that occur without the intention to learn or awareness of what one knows. This implicit statistical knowledge can alternatively be expressed via an abstract medium such as a musical melody, which suggests that this knowledge is reflected in melodies written by a composer. This study investigates how statistics in music vary over a composer's lifetime. The transitional probabilities of the highest-pitch sequences in Ludwig van Beethoven's piano sonatas were calculated based on different hierarchical Markov models. Each interval pattern was ordered by sonata opus number. The transitional probabilities of sequential patterns that are universal in music gradually decreased, suggesting that time-course variations of statistics in music reflect time-course variations of a composer's statistical knowledge. This study sheds new light on novel methodologies that may be able to evaluate the time-course variation of a composer's implicit knowledge using musical scores.
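A hierarchical (nth-order) Markov model over pitch intervals of the kind described can be sketched as follows; the pitch line is toy data, and conditioning on interval patterns rather than absolute pitches is one common choice, not necessarily the study's exact encoding.

```python
from collections import Counter, defaultdict

def nth_order_tp(pitches, order):
    """Transitional probabilities P(next interval | previous `order` intervals),
    i.e., an nth-order Markov model over pitch intervals."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    counts = defaultdict(Counter)
    for i in range(len(intervals) - order):
        context = tuple(intervals[i:i + order])
        counts[context][intervals[i + order]] += 1
    return {ctx: {iv: n / sum(c.values()) for iv, n in c.items()}
            for ctx, c in counts.items()}

# Hypothetical MIDI pitches of a highest-pitch line.
line = [60, 62, 64, 62, 60, 62, 64, 65, 64, 62, 60]
tp2 = nth_order_tp(line, order=2)
```

Raising `order` yields the "different hierarchical Markov models" of the abstract: richer contexts, at the cost of sparser counts per context.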
... A number of studies explored the neural correlates of statistical learning by assessing event-related potential (ERP) or event-related magnetic field (ERF) components such as the N400 [9–13] or the N100/N100m [9,13–15]. Earlier components have also been reported, such as auditory brainstem responses (ABR) [16] or the P50m [17,18]. ...
Article
Full-text available
How do listeners respond to prediction errors within a patterned sequence of sounds? To answer this question, we carried out a statistical learning study using electroencephalography (EEG). In a continuous auditory stream of sound triplets, the deviations were either (a) statistical, in terms of transitional probability, (b) physical, due to a change in sound location (left or right speaker), or (c) double deviants, i.e., a combination of the two. Statistical and physical deviants elicited a statistical mismatch negativity and a physical MMN, respectively. Most importantly, we found that the effects of statistical and physical deviants interacted (the statistical MMN was smaller when co-occurring with a physical deviant). The results show, for the first time, that the processing of prediction errors due to statistical learning is affected by prediction errors due to physical deviance. Our findings thus show that the statistical MMN interacts with the physical MMN, implying that prediction-error processing due to physical sound attributes suppresses the processing of learned statistical properties of sounds.
... The time interval does not depend on a beat but rather likely depends on the preceding phrases and their meaning. The SL effect on neural responses has mainly been detected using spectral sequences with a consistent inter-onset interval (IOI) for pitch (Furl et al., 2011), formants (Daikoku et al., 2015), timbre (Koelsch et al., 2016; Tsogli et al., 2019), musical chords (Daikoku et al., 2016), and speech (Daikoku et al., 2017a,b). To the best of our knowledge, however, few studies have addressed how the SL of "temporal" patterns such as rhythm and meter is reflected in neurophysiological responses. ...
... According to the previous studies, learning effects on P1 responses were not correlated with the other neural responses such as N1 (Boutros et al., 1995;Boutros and Belger, 1999;Kisley et al., 2004). SL effects on P1 responses were more detectable than N1 responses in musical chord sequences (Daikoku et al., 2016). These findings suggest that P1 responses are modulated by music training (Kizkin et al., 2006;Wang et al., 2009). ...
... Statistical learning, which occurs even in sleeping neonates and infants [4,5], is considered an implicit and innate system in humans. Evidence has suggested that, even if humans are unaware of what they learn, neurophysiological responses disclose statistical learning effects [6–13]. This learned knowledge can in part be expressed via an abstract medium such as a musical melody [14–16]. ...
... Sixteen transitional patterns with two tones that appear in all pieces of music were detected ((0,0), (0,1), (0,−1), (0,2), (0,−2), (0,3), (0,−3), (0,4), (0,−4), (0,5), (0,−5), (0,6), (0,7), (0,−7), (0, 9), and (0,−9)). Using these transitional patterns, a multiple linear regression was carried out to predict the chronological order in which music was played by each musician, based on the TPs of the 16 transitional patterns. ...
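The regression step might look like the following ordinary-least-squares sketch. The data are synthetic, and the study's 16 transitional patterns are reduced to 5 hypothetical features purely for illustration; the design is the same, predicting the chronological order of pieces from per-piece TP values.

```python
import numpy as np

# Hypothetical data: one row per piece, one column per transitional pattern.
rng = np.random.default_rng(0)
n_pieces, n_patterns = 30, 5
X = rng.random((n_pieces, n_patterns))      # TP of each pattern in each piece
true_coef = np.array([3.0, -2.0, 0.5, 0.0, 1.0])
order = X @ true_coef + 4.0                 # chronological order (noise-free toy)

# Ordinary least squares: prepend an intercept column and solve.
A = np.column_stack([np.ones(n_pieces), X])
coef, *_ = np.linalg.lstsq(A, order, rcond=None)
# coef[0] is the intercept; coef[1:] are the pattern weights.
```

On real data the fit would be assessed with R² and per-coefficient significance rather than recovered exactly as in this noise-free toy.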
Article
Full-text available
Statistical learning is an innate function of the brain and is considered essential for producing and comprehending structured information such as music. Within the framework of statistical learning, the brain has the ability to calculate the transitional probabilities of sequences such as speech and music, and to predict a future state using the learned statistics. This paper computationally examines whether and how statistical learning and knowledge partially contribute to musical representation in jazz improvisation. The results represent the time-course variations in a musician's statistical knowledge. Furthermore, the findings show that improvisational musical representation might be susceptible to higher- but not lower-order statistical knowledge (i.e., knowledge of higher-order transitional probability). The evidence also demonstrates the individuality of improvisation for each improviser, which in part depends on statistical knowledge. Thus, this study suggests that statistical properties in jazz improvisation underlie the individuality of musical representation.
... Furthermore, based on the internalized statistical model, it can predict a future state and optimize action to achieve a given goal (Monroy et al., 2017a,c) and to resolve the uncertainty of information (Friston, 2010). SL has also been thought to contribute to the encoding of the complexity of information (Hasson, 2017) and to the acquisition of musical and linguistic knowledge, including tonality (Daikoku et al., 2016) and syntax (Daikoku et al., 2017a). ...
Article
Full-text available
Statistical learning is a learning mechanism based on transition probability in sequences such as music and language. Recent computational and neurophysiological studies suggest that statistical learning contributes to production, action, and musical creativity as well as to prediction and perception. The present study investigated how statistical structure interacts with tonality in music based on various-order statistical models. To verify this in all 24 major and minor keys, the transition probabilities of the sequences containing the highest pitches in Bach's Well-Tempered Clavier, which is a collection of two series (No. 1 and No. 2) of preludes and fugues in all 24 major and minor keys, were calculated based on nth-order Markov models. The transition probabilities of each sequence were compared among tonalities (major and minor), the two series (No. 1 and No. 2), and music types (prelude and fugue). Differences in statistical characteristics between major and minor keys were detected in lower- but not higher-order models. The results also showed that statistical knowledge in music might be modulated by tonality and composition period. Furthermore, a principal component analysis detected shared components among related keys, suggesting that tonality modulates the statistical characteristics of music. The present study may suggest that there are at least two types of statistical knowledge in music, one dependent on and one independent of tonality.
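The principal component analysis across the 24 keys can be sketched via the singular value decomposition. The matrix below is random placeholder data standing in for the actual per-key TP vectors; the shape (one row per key, one column per TP feature) is the assumed layout.

```python
import numpy as np

def principal_components(X, k=2):
    """PCA via SVD: rows are observations (e.g., one TP vector per key),
    columns are features (e.g., TPs of interval patterns)."""
    Xc = X - X.mean(axis=0)                  # centre each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:k].T                   # projection of each key
    explained = s[:k] ** 2 / np.sum(s ** 2)  # variance ratio per component
    return scores, explained

# Placeholder matrix: 24 keys x 10 hypothetical TP features.
rng = np.random.default_rng(1)
X = rng.random((24, 10))
scores, explained = principal_components(X, k=2)
```

Related keys sharing components would show up as nearby points in the `scores` plane.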
... PARSER (Perruchet and Vinter, 1998), Competitive Chunker (Servan-Schreiber and Anderson, 1990), Information Dynamics of Music (IDyOM) (Pearce and Wiggins, 2012), Information Dynamics of Thinking (IDyOT) (Wiggins, 2018), and other Markovian models, including the n-gram and nth-order Markov models (Daikoku, 2018b), can implement the chunking hypothesis that learning is based on extracting, storing, and combining small chunks. In particular, information-theoretical models including Markovian processes have been applied to neurophysiological studies of SL in the human brain as well as to computational simulation (Pearce et al., 2010a; Pearce and Wiggins, 2012; Daikoku et al., 2014, 2016, 2017a, 2018; Daikoku, 2016, 2018; Daikoku and Yumoto, 2017; Daikoku, 2018c). These neurophysiological experiments showed consistent evidence: neural activities for stimuli with high information content (i.e., low probability) are larger than those for stimuli with low information content (i.e., high probability). ...
Article
Full-text available
The brain models music as a hierarchy of dynamical systems that encode probability distributions and complexity (i.e., entropy and uncertainty). Through musical experience over a lifetime, a human is intrinsically motivated to optimize the internalized probabilistic model for efficient information processing and uncertainty resolution, which has been regarded as rewarding. Human behavior, however, does not always aim at efficiency; people sometimes act inefficiently in order to maximize the reward of uncertainty resolution. Previous studies suggest that the drive for novelty-seeking behavior (highly uncertain phenomena) reflects human curiosity, and that curiosity rewards encourage humans to create and learn new regularities. That is to say, although the brain generally minimizes the uncertainty of musical structure, we sometimes derive pleasure from music with an uncertain structure owing to curiosity-driven novelty seeking, by which we anticipate the resolution of uncertainty. Few studies, however, have investigated how curiosity about uncertainty and novelty-seeking behavior modulates musical creativity. The present study investigated how the probabilistic model and the uncertainty in music fluctuate over a composer's lifetime (all 32 piano sonatas by Ludwig van Beethoven). In the late periods of the composer's lifetime, the transitional probabilities (TPs) of sequential patterns that ubiquitously appear in all of his music (familiar phrases) decreased, whereas the uncertainties of the whole structure increased. Furthermore, these findings were more prominent in higher- rather than lower-order models of the TP distribution. This may suggest that the higher-order probabilistic model is susceptible to experience and psychological phenomena over a composer's lifetime. The present study is the first to suggest a fluctuation of the uncertainty of musical structure over a composer's lifetime. It suggests that human curiosity about uncertainty and novelty-seeking behavior may modulate optimization and creativity in the human brain.
... We hypothesize that this deep statistical learning has a potential link to statistical creativity. This hypothesis has been investigated in neural (Daikoku et al., 2016, 2017) and computational studies (Daikoku, 2018b, 2019b). One useful model of creativity comes from musical improvisation, in which musicians spontaneously create novel melodies and rhythms. ...
Article
Full-text available
Creativity is part of human nature and is commonly understood as a phenomenon whereby something original and worthwhile is formed. Owing to this ability, humans can produce innovative information that often facilitates growth in our society. Creativity also contributes to esthetic and artistic productions, such as music and art. However, the mechanism by which creativity emerges in the brain remains debatable. Recently, a growing body of evidence has suggested that statistical learning contributes to creativity. Statistical learning is an innate and implicit function of the human brain and is considered essential for brain development. Through statistical learning, humans can produce and comprehend structured information, such as music. It is thought that creativity is linked to acquired knowledge, but so-called “eureka” moments often occur unexpectedly under subconscious conditions, without the intention to use the acquired knowledge. Given that a creative moment is intrinsically implicit, we postulate that some types of creativity can be linked to implicit statistical knowledge in the brain. This article reviews neural and computational studies on how creativity emerges within the framework of statistical learning in the brain (i.e., statistical creativity). Here, we propose a hierarchical model of statistical learning: statistically chunking into a unit (hereafter, shallow statistical learning) and combining several units (hereafter, deep statistical learning). We suggest that deep statistical learning contributes dominantly to statistical creativity in music. Furthermore, the temporal dynamics of perceptual uncertainty can be another potential causal factor in statistical creativity. Considering that statistical learning is fundamental to brain development, we also discuss how typical versus atypical brain development modulates hierarchical statistical learning and statistical creativity.
We believe that this review will shed light on the key roles of statistical learning in musical creativity and facilitate further investigation of how creativity emerges in the brain.
... The notion of statistical learning (SL) (Saffran et al., 1996), which includes both informatics and neurophysiology (Harrison et al., 2006; Pearce and Wiggins, 2012), involves the hypothesis that our brain automatically codes the nth-order transitional probabilities (TPs) embedded in sequential phenomena such as music and language (i.e., local statistics at nth-order levels) (Daikoku et al., 2016, 2017b; Daikoku and Yumoto, 2017), grasps the entropy/uncertainty of the TP distribution (i.e., global statistics) (Hasson, 2017), predicts the future state based on the internalized nth-order statistical model (Daikoku et al., 2014; Yumoto and Daikoku, 2016), and continually updates the model to adapt to the variable external environment (Daikoku et al., 2012, 2017d). The concept of nth-order SL in the brain is underpinned by information theory (Shannon, 1951) involving n-gram or Markov models. ...
Article
Full-text available
Recent neurophysiological and computational studies have proposed the hypothesis that our brain automatically codes the nth-order transitional probabilities (TPs) embedded in sequential phenomena such as music and language (i.e., local statistics at the nth-order level), grasps the entropy of the TP distribution (i.e., global statistics), and predicts the future state based on the internalized nth-order statistical model. This mechanism is called statistical learning (SL). SL is also believed to contribute to the creativity involved in musical improvisation. The present study examines how local statistics, global statistics, and different levels of orders (mutual information) interact in musical improvisation. Interactions among local statistics, global statistics, and hierarchy were detected in higher-order SL models of pitches, but not in lower-order SL models of pitches or in SL models of rhythms. These results suggest that the information-theoretical phenomena of local and global statistics at each order may be reflected in improvisational music. The present study proposes a novel methodology for evaluating musical creativity associated with SL based on information theory.
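Mutual information, the quantity used above to relate different orders of the SL model, can be illustrated with a small self-contained sketch (the function name and the toy joint distributions are hypothetical, not the study's actual code):

```python
from math import log2

def mutual_information(joint):
    """I(X;Y) in bits, from a joint probability table {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Two binary variables that always agree share one full bit of information...
print(mutual_information({(0, 0): 0.5, (1, 1): 0.5}))  # 1.0
# ...while independent variables share none.
print(mutual_information({(x, y): 0.25 for x in (0, 1) for y in (0, 1)}))  # 0.0
```

Applied to SL models, X and Y would be, for example, the predictions of a lower-order and a higher-order Markov model over the same sequence.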
... Because of the implicitness of statistical learning and knowledge, humans are unaware of exactly what they learn (Daikoku et al., 2014). Nonetheless, neurophysiological and behavioral responses disclose implicit learning effects (Francois and Schön, 2011; François et al., 2013; Daikoku et al., 2015, 2016, 2017a; Koelsch et al., 2016; Daikoku, 2016, 2018). When the brain implicitly encodes the TP distributions inherent in dynamical phenomena, several things follow automatically: a probable future state with a higher TP is expected, facilitating optimisation of performance based on the encoded statistics even though the learner cannot describe the knowledge (Broadbent, 1977; Berry and Broadbent, 1984; Green and Hecht, 1992; Williams, 2005; Rebuschat and Williams, 2012), and the neurophysiological response to predictable external stimuli is inhibited, yielding efficient, low-entropy neural processing consistent with predictive coding (Daikoku, 2018b). ...
Article
Full-text available
It has been suggested that musical creativity is mainly formed by implicit knowledge. However, the types of spectro-temporal features and the depth of the implicit knowledge that form the individuality of improvisation are unknown. This study investigated spectro-temporal statistics among musicians using various-order Markov models of implicit statistical learning. The results suggested that lower-order models of implicit knowledge represent general characteristics shared among musicians, whereas higher-order models detect specific characteristics unique to each musician. Second, individuality may essentially be formed by pitch rather than rhythm, although rhythm may allow the individuality of pitches to strengthen. Third, time-course variation of musical creativity formed by implicit knowledge and uncertainty (i.e., entropy) may occur over a musician's lifetime. Thus, the individuality of improvisational creativity may be formed by deep rather than superficial implicit knowledge of pitches, and it may shift over a musician's lifetime through experience and training.
... There is considerable evidence that the reproduction of a time interval depends significantly on several factors, including stimuli, age, gender, learning abilities, and cognitive function (Szelag, 1997; Szelag et al., 1998, 2002, 2004a; Kanabus et al., 2004). The cerebral cortex is a learning machine that can work regardless of attention and domain of learning (Daikoku et al., 2012, 2014, 2015, 2016, 2017a; Koelsch et al., 2016; Daikoku and Yumoto, 2017; Daikoku, 2018), and that can be developed by cognitive and motor learning throughout life (Merzenich et al., 1996). Szelag et al. (1998) suggested that the prefrontal cortex, which is responsible for the developmental effect, also plays an important role in temporal processing and is involved in working memory up to around 3 s. ...
Article
Full-text available
How the human brain perceives time intervals is a fascinating topic that has been explored in many fields of study. This study examined how time intervals are replicated in three conditions: with no internalized cue (PT), with an internalized cue without a beat (AS), and with an internalized cue with a beat (RS). In PT, participants accurately reproduced time intervals up to approximately 3 s. Over 3 s, however, the reproduction errors became increasingly negative. In RS, longer presentations of over 5.6 s and 13 beats induced accurate time intervals in reproductions. This suggests that longer exposure to a beat leads to stable internalization and efficient sensorimotor processing of perception and reproduction. In AS, the results up to approximately 3 s were similar to those of RS, whereas over 3 s they shifted and became similar to those of PT. The time intervals between the first two stimuli indicate that the strategy of time-interval reproduction in AS may shift from that of RS to that of PT. The neural basis underlying the reproduction of time intervals without a beat may depend on the length of the time interval between adjacent stimuli in sequences.
... Adding a probabilistic approach to that would certainly increase the accuracy of the estimation. Here, simple probabilistic approaches were used, which have been found to be successful in this field as well as in others [10][11][12][13][14]. By combining these two methods, this paper proposes a novel approach to detecting musical chords. ...
Article
Full-text available
This paper presents a method to extract guitar chords from a given audio file using a probabilistic approach called maximum likelihood estimation. The audio file is split into smaller clips, each of which is transformed from the time domain into the frequency domain using the Fourier transform. The known frequencies of musical notes serve as reference frequencies. A chord is essentially a combination of multiple frequencies. The Fourier transform allows us to identify the frequencies that dominate a clip; we match these dominant frequencies against the reference frequencies to determine which notes they belong to. The notes found in each clip thus yield a specific chord. If we fail to obtain a chord for a sample clip, we fall back on maximum likelihood estimation to approximate the musical chord with a high level of accuracy.
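The core of the pipeline described above, correlating a clip against known note frequencies and keeping the dominant ones, can be sketched without any DSP library. This is a simplified single-clip illustration; the note subset, threshold, and function names are assumptions, and the paper's maximum-likelihood fallback is not reproduced here.

```python
from math import sin, cos, pi, hypot

# Reference frequencies (Hz) for a few equal-tempered notes (assumed subset).
NOTE_FREQS = {"C4": 261.63, "D4": 293.66, "E4": 329.63,
              "G4": 392.00, "A4": 440.00, "B4": 493.88}

def magnitude_at(signal, sr, freq):
    """Fourier magnitude of `signal` (sample rate `sr`) at one candidate frequency."""
    re = sum(s * cos(2 * pi * freq * i / sr) for i, s in enumerate(signal))
    im = sum(s * sin(2 * pi * freq * i / sr) for i, s in enumerate(signal))
    return hypot(re, im) / len(signal)

def detect_chord(signal, sr, threshold=0.25):
    """Return the note names whose reference frequency dominates the clip."""
    return sorted(name for name, f in NOTE_FREQS.items()
                  if magnitude_at(signal, sr, f) >= threshold)

# Half a second of a synthetic C-major triad (C4 + E4 + G4) at 4 kHz.
sr = 4000
clip = [sin(2 * pi * 261.63 * i / sr) + sin(2 * pi * 329.63 * i / sr)
        + sin(2 * pi * 392.00 * i / sr) for i in range(sr // 2)]
print(detect_chord(clip, sr))  # ['C4', 'E4', 'G4']
```

Evaluating the Fourier magnitude only at the reference frequencies (rather than a full FFT) is enough here, since the chord vocabulary fixes the candidate frequencies in advance.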
... Moreover, recent studies indicate that corticofugal projections, which, in line with the predictive coding framework, modulate the responsiveness of sub-cortical auditory regions 57, may play an important role in statistical learning 30. The STG and the primary auditory cortex are related to such top-down connections 58, and the fact that they show an enhanced contribution to statistical learning of tones, as indicated by the connectivity of the STG as well as the P50 response presented in our previous study 11 and in a recent study by Daikoku et al. 59, may support this interpretation. The augmented P50 response of musicians and the increased connectivity of the STG with other cortical regions in the group of musicians may further indicate that long-term musical training is related to enhanced top-down shaping of low-level auditory regions. ...
Article
Full-text available
Statistical learning is a cognitive process of great importance for the detection and representation of environmental regularities. Complex cognitive processes such as statistical learning usually emerge as a result of the activation of widespread cortical areas functioning in dynamic networks. The present study investigated the cortical large-scale network supporting statistical learning of tone sequences in humans. The reorganization of this network related to musical expertise was assessed via a cross-sectional comparison of a group of musicians to a group of non-musicians. The cortical responses to a statistical learning paradigm incorporating an oddball approach were measured via Magnetoencephalographic (MEG) recordings. Large-scale connectivity of the cortical activity was calculated via a statistical comparison of the estimated transfer entropy in the sources’ activity. Results revealed the functional architecture of the network supporting the processing of statistical learning, highlighting the prominent role of informational processing pathways that bilaterally connect superior temporal and intraparietal sources with the left IFG. Musical expertise is related to extensive reorganization of this network, as the group of musicians showed a network comprising of more widespread and distributed cortical areas as well as enhanced global efficiency and increased contribution of additional temporal and frontal sources in the information processing pathway.
... Apart from behavioral indicators, researchers have also used neurophysiological markers to show the presence of long-term knowledge about musical syntax 17-20 and common melodies 21,22 . Some have also used such markers to show the presence of knowledge about musical regularities underlying the stimuli presented within an experiment [23][24][25][26][27][28][29] . However, it is unknown what happens when the latter type of knowledge, i.e., knowledge developed within an experiment, becomes the former type of knowledge, i.e., knowledge that can be activated ...
Article
Full-text available
Most listeners possess sophisticated knowledge about the music around them without being aware of it or its intricacies. Previous research shows that we develop such knowledge through exposure. This knowledge can then be assessed using behavioral and neurophysiological measures. It remains unknown, however, which neurophysiological measures accompany the development of musical long-term knowledge. In this series of experiments, we first identified a potential ERP marker of musical long-term knowledge by comparing EEG activity following musically unexpected and expected tones within the context of known music (n = 30). We then validated the marker by showing that it does not differentiate between such tones within the context of unknown music (n = 34). In a third experiment, we exposed participants to unknown music (n = 40) and compared EEG data before and after exposure to explore effects of time. Although listeners' behavior indicated musical long-term knowledge, we did not find any effects of time on the ERP marker. Instead, the relationship between behavioral and EEG data suggests musical long-term knowledge may have formed before we could confirm its presence through behavioral measures. Listeners are thus not only knowledgeable about music but seem to also be incredibly fast music learners.
... In other words, improvement in physical fitness could increase the neural resources available for statistical learning as well as for physical exercise. Incidental statistical learning is considered a domain-general and implicit learning process innate to humans, regardless of the learner's age [1,2], suggesting that we may constantly and unconsciously perform statistical learning of sequential stimuli such as language and music [34,35]. Some researchers have reported that individuals with dyslexia find statistical learning more difficult than healthy learners do [36,37]. ...
Article
In real-world auditory environments, humans are exposed to overlapping auditory information, such as human voices and musical instruments, even during routine physical activities such as walking and cycling. The present study investigated how concurrent physical exercise affects incidental and intentional learning of overlapping auditory streams, and whether physical fitness modulates learning performance. Participants were divided into lower- and higher-fitness groups of 11 each, based on their VO2max values. They were presented with two simultaneous auditory sequences, each with a distinct statistical regularity (i.e., statistical learning), both while pedalling on a bike and while sitting on the bike at rest. In Experiment 1, they were instructed to attend to one of the two sequences and ignore the other. In Experiment 2, they were instructed to attend to both sequences. After exposure to the sequences, learning effects were evaluated with a familiarity test. In Experiment 1, statistical learning of the ignored sequences during concurrent pedalling was better in participants with high than with low physical fitness, whereas for the attended sequence there was no significant difference between the fitness groups. Furthermore, there was no significant effect of physical fitness on learning at rest. In Experiment 2, participants with both high and low physical fitness performed intentional statistical learning of the two simultaneous sequences in both the exercise and rest sessions. Improved physical fitness might facilitate incidental, but not intentional, statistical learning of simultaneous auditory sequences during concurrent physical exercise.
Article
Statistical learning allows comprehension of structured information, such as that in language and music. The brain computes a sequence's transition probability and predicts future states to minimise sensory reaction and derive entropy (uncertainty) from sequential information. Neurophysiological studies have revealed that early event-related neural responses (P1 and N1) reflect statistical learning − when the brain encodes transition probability in stimulus sequences, it predicts an upcoming stimulus with a high transition probability and suppresses the early event-related responses to a stimulus with a high transition probability. This amplitude difference between high and low transition probabilities reflects statistical learning effects. However, how a sequence's transition probability ratio affects neural responses contributing to statistical learning effects remains unknown. This study investigated how transition-probability ratios or conditional entropy (uncertainty) in auditory sequences modulate the early event-related neuromagnetic responses of P1m and N1m. Sequence uncertainties were manipulated using three different transition-probability ratios: 90:10%, 80:20%, and 67:33% (conditional entropy: 0.47, 0.72, and 0.92 bits, respectively). Neuromagnetic responses were recorded when participants listened to sequential sounds with these three transition probabilities. Amplitude differences between lower and higher probabilities were larger in sequences with transition-probability ratios of 90:10% and smaller in sequences with those of 67:33%, compared to sequences with those of 80:20%. This suggests that the transition-probability ratio finely tunes P1m and N1m. Our study also showed larger amplitude differences between frequent- and rare-transition stimuli in P1m than in N1m. This indicates that information about transition-probability differences may be calculated in earlier cognitive processes.
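The conditional-entropy values quoted above (0.47, 0.72, and 0.92 bits) follow directly from the binary Shannon entropy of each transition-probability split. A short sketch (hypothetical helper name) reproduces them, assuming the 67:33% ratio stands for an exact 2/3 : 1/3 split:

```python
from math import log2

def binary_entropy(p):
    """Shannon entropy (bits) of a two-way transition-probability split p : (1 - p)."""
    return -(p * log2(p) + (1 - p) * log2(1 - p))

# The three ratios used in the study; 67:33% is treated as an exact 2/3 : 1/3 split.
for label, p in (("90:10%", 0.90), ("80:20%", 0.80), ("67:33%", 2 / 3)):
    print(f"{label} -> {binary_entropy(p):.2f} bits")
# 90:10% -> 0.47 bits
# 80:20% -> 0.72 bits
# 67:33% -> 0.92 bits
```

The entropy rises toward its 1-bit maximum as the split approaches 50:50, which is why the 67:33% sequences are the most uncertain of the three.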
Article
Statistical learning, the process of tracking distributional information and discovering embedded patterns, is traditionally regarded as a form of implicit learning. However, recent studies proposed that both implicit (attention-independent) and explicit (attention-dependent) learning systems are involved in statistical learning. To understand the role of attention in statistical learning, the current study investigates the cortical processing of distributional patterns in speech across local and global contexts. We then ask how these cortical responses relate to statistical learning behavior in a word segmentation task. We found Event-Related Potential (ERP) evidence of pre-attentive processing of both the local (mismatching negativity) and global distributional information (late discriminative negativity). However, as speech elements became less frequent and more surprising, some participants showed an involuntary attentional shift, reflected in a P3a response. Individuals who displayed attentive neural tracking of distributional information showed faster learning in a speech statistical learning task. These results suggest that an involuntary attentional shift might play a facilitatory, but not essential, role in statistical learning.
Article
Full-text available
Within the framework of statistical learning, many behavioural studies investigated the processing of unpredicted events. However, surprisingly few neurophysiological studies are available on this topic, and no statistical learning experiment has investigated electroencephalographic (EEG) correlates of processing events with different transition probabilities. We carried out an EEG study with a novel variant of the established statistical learning paradigm. Timbres were presented in isochronous sequences of triplets. The first two sounds of all triplets were equiprobable, while the third sound occurred with either low (10%), intermediate (30%), or high (60%) probability. Thus, the occurrence probability of the third item of each triplet (given the first two items) was varied. Compared to high-probability triplet endings, endings with low and intermediate probability elicited an early anterior negativity that had an onset around 100 ms and was maximal at around 180 ms. This effect was larger for events with low than for events with intermediate probability. Our results reveal that, when predictions are based on statistical learning, events that do not match a prediction evoke an early anterior negativity, with the amplitude of this mismatch response being inversely related to the probability of such events. Thus, we report a statistical mismatch negativity (sMMN) that reflects statistical learning of transitional probability distributions that go beyond auditory sensory memory capabilities.
Article
Full-text available
We explore the capacity for music in terms of five questions: (1) What cognitive structures are invoked by music? (2) What are the principles that create these structures? (3) How do listeners acquire these principles? (4) What pre-existing resources make such acquisition possible? (5) Which aspects of these resources are specific to music, and which are more general? We examine these issues by looking at the major components of musical organization: rhythm (an interaction of grouping and meter), tonal organization (the structure of melody and harmony), and affect (the interaction of music with emotion). Each domain reveals a combination of cognitively general phenomena, such as gestalt grouping principles, harmonic roughness, and stream segregation, with phenomena that appear special to music and language, such as metrical organization. These are subtly interwoven with a residue of components that are devoted specifically to music, such as the structure of tonal systems and the contours of melodic tension and relaxation that depend on tonality. In the domain of affect, these components are especially tangled, involving the interaction of such varied factors as general-purpose aesthetic framing, communication of affect by tone of voice, and the musically specific way that tonal pitch contours evoke patterns of posture and gesture.
Article
Full-text available
The aim of the present study was to identify a specific neuronal correlate underlying the pre-attentive auditory stream segregation of subsequent sound patterns alternating in spectral or temporal cues. Fifteen participants with normal hearing were presented with series of two consecutive ABA auditory tone-triplet sequences, the initial triplets being the Adaptation sequence and the subsequent triplets being the Test sequence. In the first experiment, the frequency separation (delta-f) between A and B tones in the sequences was varied by 2, 4 and 10 semitones. In the second experiment, a constant delta-f of 6 semitones was maintained but the Inter-Stimulus Intervals (ISIs) between A and B tones were varied. Auditory evoked magnetic fields (AEFs) were recorded using magnetoencephalography (MEG). Participants watched a muted video of their choice and ignored the auditory stimuli. In a subsequent behavioral study, both MEG experiments were replicated to provide information about the participants' perceptual state. MEG measurements showed a significant increase in the amplitude of the B-tone-related P1 component of the AEFs as delta-f increased. This effect was seen predominantly in the left hemisphere. A significant increase in the amplitude of the N1 component was only obtained for a Test sequence delta-f of 10 semitones with a prior Adaptation sequence of 2 semitones. This effect was more pronounced in the right hemisphere. The additional behavioral data indicated an increased probability of two-stream perception for delta-f = 4 and delta-f = 10 semitones with a preceding Adaptation sequence of 2 semitones. However, neither the neural activity nor the perception of the successive streaming sequences was modulated when the ISIs were varied. Our MEG experiment demonstrated differences in the behavior of P1 and N1 components during the automatic segregation of sounds when induced by an initial Adaptation sequence.
The P1 component appeared enhanced in all Test conditions, thus demonstrating the preceding context effect, whereas N1 was specifically modulated only by large delta-f Test sequences induced by a preceding small delta-f Adaptation sequence. These results suggest that the P1 and N1 components represent at least partially different systems underlying the neural representation of auditory streaming.
Article
Full-text available
Adults and infants can use the statistical properties of syllable sequences to extract words from continuous speech. Here we present a review of a series of electrophysiological studies investigating (1) Speech segmentation resulting from exposure to spoken and sung sequences (2) The extraction of linguistic versus musical information from a sung sequence (3) Differences between musicians and non-musicians in both linguistic and musical dimensions. The results show that segmentation is better after exposure to sung compared to spoken material and moreover, that linguistic structure is better learned than the musical structure when using sung material. In addition, musical expertise facilitates the learning of both linguistic and musical structures. Finally, an electrophysiological approach, which directly measures brain activity, appears to be more sensitive than a behavioral one.
Article
Full-text available
Recent electrophysiological and neuroimaging studies have explored how and where musical syntax in Western music is processed in the human brain. An inappropriate chord progression elicits an event-related potential (ERP) component called an early right anterior negativity (ERAN) or simply an early anterior negativity (EAN) in an early stage of processing the musical syntax. Though the possible underlying mechanism of the EAN is assumed to be probabilistic learning, the effect of the probability of chord progressions on the EAN response has not been previously explored explicitly. In the present study, the empirical conditional probabilities in a Western music corpus were employed as an approximation of the frequencies in previous exposure of participants. Three types of chord progression were presented to musicians and non-musicians in order to examine the correlation between the probability of chord progression and the neuromagnetic response using magnetoencephalography (MEG). Chord progressions were found to elicit early responses in a negatively correlating fashion with the conditional probability. Observed EANm (as a magnetic counterpart of the EAN component) responses were consistent with the previously reported EAN responses in terms of latency and location. The effect of conditional probability interacted with the effect of musical training. In addition, the neural response also correlated with the behavioral measures in the non-musicians. Our study is the first to reveal the correlation between the probability of chord progression and the corresponding neuromagnetic response. The current results suggest that the physiological response is a reflection of the probabilistic representations of the musical syntax. Moreover, the results indicate that the probabilistic representation is related to the musical training as well as the sensitivity of an individual.
Article
Full-text available
Used the probe tone method to quantify the perceived hierarchy of tones of North Indian music. Indian music is tonal and has many features in common with Western music. However, the primary means of expressing tonality in Indian music is through melody, whereas in Western music it is through harmony (the use of chords). Probe tone ratings were given by Indian and Western listeners in the context of 10 North Indian rags (a standard set of melodic forms). These ratings confirmed the predicted hierarchical ordering. Both groups of listeners gave the highest ratings to the tonic and the 5th degree of the scale. The ratings of both groups of listeners generally reflected the pattern of tone durations in the musical contexts. This result suggests that the distribution of tones in music is a psychologically effective means of conveying the tonal hierarchy to listeners, whether or not they are familiar with the musical tradition. Only the Indian listeners were sensitive to the scales (thats) underlying the rags. There was little evidence that Western listeners assimilated the pitch materials to the major and minor diatonic system of Western music.
Article
Full-text available
The present study investigated different aspects of auditory language comprehension. The sentences which were presented as connected speech were either correct or incorrect including a semantic error (selectional restriction), a morphological error (verb inflection), or a syntactic error (phrase structure). After each sentence, a probe word was presented auditorily, and subjects had to decide whether this word was part of the preceding sentence or not. Event-related brain potentials (ERPs) were recorded from 7 scalp electrodes. The ERPs evoked by incorrect sentences differed significantly from the correct ones as a function of error type. Semantic anomalies evoked a 'classical' N400 pattern. Morphological errors elicited a pronounced negativity between 300 and 600 ms followed by a late positivity. Syntactic errors, in contrast, evoked an early negativity peaking around 180 ms followed by a negativity around 400 ms. The early negativity was only significant over the left anterior electrode. The present data demonstrate that linguistic errors of different categories evoke different ERP patterns. They indicate that with using connected speech as input, different aspects of language comprehension processes cannot only be described with respect to their temporal structure, but eventually also with respect to possible brain systems subserving these processes.
Book
Geoffrey Chaucer is the best-known and most widely read of all medieval British writers, famous for his scurrilous humour and biting satire against the vices and absurdities of his age. Yet he was also a poet of passionate love, sensitive to issues of gender and sexual difference, fascinated by the ideological differences between the pagan past and the Christian present, and a man of science, knowledgeable in astronomy, astrology and alchemy. This concise book is an ideal starting point for study of all his major poems, particularly The Canterbury Tales, to which two chapters are devoted. It offers close readings of individual texts, presenting various possibilities for interpretation, and includes discussion of Chaucer's life, career, historical context and literary influences. An account of the various ways in which he has been understood over the centuries leads into an up-to-date, annotated guide to further reading.
Article
The majority of studies on music processing in children used simple musical stimuli. Here, primary school children judged the appropriateness of musical closure in expressive polyphonic music, while high-density electroencephalography was recorded. Stimuli ended either regularly or contained refined in-key harmonic transgressions at closure. The children discriminated the transgressions well above chance. Regular and transgressed endings evoked opposite scalp voltage configurations peaking around 400 ms after stimulus onset, with bilateral frontal negativity for regular and centro-posterior negativity (CPN) for transgressed endings. A positive correlation could be established between the strength of the CPN response and rater sensitivity (d-prime). We also investigated whether the capacity to discriminate the transgressions was supported by auditory domain-specific or general cognitive mechanisms, and found that working memory capacity predicted transgression discrimination. The latency and distribution of the CPN are reminiscent of the N400, typically observed in response to semantic incongruities in language. Our observation is therefore intriguing, as the CPN occurred here within an intra-musical context, without any symbols referring to the external world. Moreover, the harmonic in-key transgressions that we implemented may be considered syntactical, as they transgress structural rules. Such structural incongruities in music are typically followed by an early right anterior negativity (ERAN) and an N5, but not so here. Putative contributive sources of the CPN were localized in left pre-motor, mid-posterior cingulate and superior parietal regions of the brain that can be linked to integration processing. These results suggest that, at least in children, processing of syntax and meaning may coincide in complex intra-musical contexts.
Article
The condition of congenital amusia, commonly known as tone-deafness, has been described for more than a century, but has received little empirical attention. In the present study, a research effort has been made to document in detail the behavioural manifestations of congenital amusia. A group of 11 adults, fitting stringent criteria of musical disabilities, were examined in a series of tests originally designed to assess the presence and specificity of musical disorders in brain-damaged patients. The results show that congenital amusia is related to severe deficiencies in processing pitch variations. The deficit extends to impairments in music memory and recognition as well as in singing and the ability to tap in time to music. Interestingly, the disorder appears specific to the musical domain. Congenitally amusic individuals process and recognize speech, including speech prosody, common environmental sounds, and human voices as well as control subjects do. Thus, the present study convincingly demonstrates the existence of congenital amusia as a new class of learning disabilities that affect musical abilities.
Article
Humans process music even without conscious effort according to implicit knowledge about syntactic regularities. Whether such automatic and implicit processing is modulated by veridical knowledge has remained unknown in previous neurophysiological studies. This study investigates this issue by testing whether the acquisition of veridical knowledge of a music-syntactic irregularity (acquired through supervised learning) modulates early, partly automatic, music-syntactic processes (as reflected in the early right anterior negativity, ERAN), and/or late controlled processes (as reflected in the late positive component, LPC). Excerpts of piano sonatas with syntactically regular and less regular chords were presented repeatedly (ten times) to non-musicians and amateur musicians. Participants were informed by a cue as to whether the following excerpt contained a regular or less regular chord. Results showed that the repeated exposure to several presentations of regular and less regular excerpts did not influence the ERAN elicited by less regular chords. By contrast, amplitudes of the LPC (as well as of the P3a evoked by less regular chords) decreased systematically across learning trials. These results reveal that late controlled, but not early (partly automatic), neural mechanisms of music-syntactic processing are modulated by repeated exposure to a musical piece.
Article
In our previous study (Daikoku, Yatomi, & Yumoto, 2014), we demonstrated that the N1m response could be a marker for the statistical learning process of pitch sequence, in which each tone was ordered by a Markov stochastic model. The aim of the present study was to investigate how the statistical learning of music- and language-like auditory sequences is reflected in the N1m responses based on the assumption that both language and music share domain generality. By using vowel sounds generated by a formant synthesizer, we devised music- and language-like auditory sequences in which higher-ordered transitional rules were embedded according to a Markov stochastic model by controlling fundamental (F0) and/or formant frequencies (F1-F2). In each sequence, F0 and/or F1-F2 were spectrally shifted in the last one-third of the tone sequence. Neuromagnetic responses to the tone sequences were recorded from 14 right-handed normal volunteers. In the music- and language-like sequences with pitch change, the N1m responses to the tones that appeared with higher transitional probability were significantly decreased compared with the responses to the tones that appeared with lower transitional probability within the first two-thirds of each sequence. Moreover, the amplitude difference was even retained within the last one-third of the sequence after the spectral shifts. However, in the language-like sequence without pitch change, no significant difference could be detected. The pitch change may facilitate the statistical learning in language and music. Statistically acquired knowledge may be appropriated to process altered auditory sequences with spectral shifts. The relative processing of spectral sequences may be a domain-general auditory mechanism that is innate to humans.
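The Markov-chain design described above can be sketched in a few lines. This is a minimal illustration, not the study's actual stimulus code: the three tone labels and the 0.8/0.1 transition probabilities are assumptions, chosen only to show how tones with high versus low transitional probability arise from a first-order (bigram) model.

```python
import random

# Hypothetical first-order (bigram) transition table over three tones:
# from each tone, one continuation is frequent (p = 0.8), the others rare (p = 0.1).
TRANSITIONS = {
    "A": {"B": 0.8, "C": 0.1, "A": 0.1},
    "B": {"C": 0.8, "A": 0.1, "B": 0.1},
    "C": {"A": 0.8, "B": 0.1, "C": 0.1},
}

def generate_sequence(length, seed=0):
    """Sample a tone sequence from the Markov chain."""
    rng = random.Random(seed)
    seq = ["A"]
    while len(seq) < length:
        probs = TRANSITIONS[seq[-1]]
        tones, weights = zip(*probs.items())
        seq.append(rng.choices(tones, weights=weights)[0])
    return seq

def transitional_probabilities(seq):
    """Estimate P(next | current) from bigram counts."""
    counts = {}
    for cur, nxt in zip(seq, seq[1:]):
        counts.setdefault(cur, {}).setdefault(nxt, 0)
        counts[cur][nxt] += 1
    return {cur: {nxt: n / sum(d.values()) for nxt, n in d.items()}
            for cur, d in counts.items()}

seq = generate_sequence(5000)
tp = transitional_probabilities(seq)
```

A listener (or model) that tracks these bigram statistics recovers the high-probability transitions, which is the contrast the N1m findings above exploit.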
Article
The majority of studies on music processing in children used simple musical stimuli. Here, primary schoolchildren judged the appropriateness of musical closure in expressive polyphonic music, while high-density electroencephalography was recorded. Stimuli ended either regularly or contained refined in-key harmonic transgressions at closure. The children discriminated the transgressions well above chance. Regular and transgressed endings evoked opposite scalp voltage configurations peaking around 400 ms after stimulus onset with bilateral frontal negativity for regular and centro-posterior negativity (CPN) for transgressed endings. A positive correlation could be established between strength of the CPN response and rater sensitivity (d-prime). We also investigated whether the capacity to discriminate the transgressions was supported by auditory domain-specific or general cognitive mechanisms, and found that working memory capacity predicted transgression discrimination. Latency and distribution of the CPN are reminiscent of the N400, typically observed in response to semantic incongruities in language. Therefore our observation is intriguing, as the CPN occurred here within an intra-musical context, without any symbols referring to the external world. Moreover, the harmonic in-key transgressions that we implemented may be considered syntactical as they transgress structural rules. Such structural incongruities in music are typically followed by an early right anterior negativity (ERAN) and an N5, but not so here. Putative contributive sources of the CPN were localized in left pre-motor, mid-posterior cingulate and superior parietal regions of the brain that can be linked to integration processing. These results suggest that, at least in children, processing of syntax and meaning may coincide in complex intra-musical contexts.
Article
The P600 component in Event Related Potential research has been hypothesised to be associated with syntactic reanalysis processes. We, however, propose that the P600 is not restricted to reanalysis processes, but reflects difficulty with syntactic integration processes in general. First, we discuss this integration hypothesis in terms of a sentence processing model proposed elsewhere. Next, in Experiment 1, we show that the P600 is elicited in grammatical, non-garden path sentences in which integration is more difficult (i.e., "who" questions) relative to a control sentence ("whether" questions). This effect is replicated in Experiment 2. Furthermore, we directly compare the effect of difficult integration in grammatical sentences to the effect of agreement violations. The results suggest that the positivity elicited in "who" questions and the P600-effect elicited by agreement violations have partly overlapping neural generators. This supports the hypothesis that similar cognitive processes, i.e., integration, are involved in both first pass analysis of "who" questions and dealing with ungrammaticalities (reanalysis).
Article
To understand the world around us, continuous streams of information including speech must be segmented into units that can be mapped onto stored representations. Recent evidence has shown that event-related potentials (ERPs) can index the online segmentation of sound streams. In the current study, listeners were trained to recognize sequences of three nonsense sounds that could not easily be rehearsed. Beginning 40 ms after onset, sequence-initial sounds elicited a larger amplitude negativity after compared to before training. This difference was not evident for medial or final sounds in the sequences. Across studies, ERP segmentation effects are remarkably similar regardless of the available segmentation cues and nature of the continuous streams. These results indicate the preferential processing of sequence-initial information is not domain specific and instead implicate a more general cognitive mechanism such as temporally selective attention.
Article
The magnetic field pattern over the temporal area of the scalp 100 ms following the onset of a tone burst stimulus provides evidence for neuronal activity in auditory primary and association cortices that overlap in time. Habituation studies indicate that onset and offset features of a tone produce activation traces in primary cortex that are at least partially common, but only the onset produces an appreciable trace in association cortex. The characteristic time constant for the decay of the latter's activation trace is several seconds longer than for the former.
Conference Paper
The goal of information extraction is to extract database records from text or semi-structured sources. Traditionally, information extraction proceeds by first segmenting each candidate record separately, and then merging records that refer to the same entities. While computationally efficient, this approach is suboptimal, because it ignores the fact that segmenting one candidate record can help to segment similar ones. For example, resolving a well-segmented field with a less clear one can disambiguate the latter's boundaries. In this paper we propose a joint approach to information extraction, where segmentation of all records and entity resolution are performed together in a single integrated inference process. While a number of previous authors have taken steps in this direction (e.g., Pasula et al. (2003), Wellner et al. (2004)), to our knowledge this is the first fully joint approach. In experiments on the CiteSeer and Cora citation matching datasets, joint inference improved accuracy, and our approach outperformed previous ones. Further, by using Markov logic and the existing algorithms for it, our solution consisted mainly of writing the appropriate logical formulas, and required much less engineering than previous ones.
Conference Paper
Machine learning approaches to coreference resolution are typically supervised, and require expensive labeled data. Some unsupervised approaches have been proposed (e.g., Haghighi and Klein (2007)), but they are less accurate. In this paper, we present the first unsupervised approach that is competitive with supervised ones. This is made possible by performing joint inference across mentions, in contrast to the pairwise classification typically used in supervised methods, and by using Markov logic as a representation language, which enables us to easily express relations like apposition and predicate nominals. On MUC and ACE datasets, our model outperforms Haghighi and Klein's one using only a fraction of the training data, and often matches or exceeds the accuracy of state-of-the-art supervised models.
Article
Entity resolution is the problem of determining which records in a database refer to the same entities, and is a crucial and expensive step in the data mining process. Interest in it has grown rapidly in recent years, and many approaches have been proposed. However, they tend to address only isolated aspects of the problem, and are often ad hoc. This paper proposes a well-founded, integrated solution to the entity resolution problem based on Markov logic. Markov logic combines first-order logic and probabilistic graphical models by attaching weights to first-order formulas, and viewing them as templates for features of Markov networks. We show how a number of previous approaches can be formulated and seamlessly combined in Markov logic, and how the resulting learning and inference problems can be solved efficiently. Experiments on two citation databases show the utility of this approach, and evaluate the contribution of the different components.
Article
We propose a simple approach to combining first-order logic and probabilistic graphical models in a single representation. A Markov logic network (MLN) is a first-order knowledge base with a weight attached to each formula (or clause). Together with a set of constants representing objects in the domain, it specifies a ground Markov network containing one feature for each possible grounding of a first-order formula in the KB, with the corresponding weight. Inference in MLNs is performed by MCMC over the minimal subset of the ground network required for answering the query. Weights are efficiently learned from relational databases by iteratively optimizing a pseudo-likelihood measure. Optionally, additional clauses are learned using inductive logic programming techniques. Experiments with a real-world database and knowledge base in a university domain illustrate the promise of this approach.
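The definition above (weighted first-order formulas as templates for Markov network features, with P(x) proportional to the exponentiated weighted count of true groundings) can be made concrete with a toy example. The sketch below uses the Smokes/Cancer formula often used to illustrate MLNs; the two-person domain and the weight 1.5 are arbitrary assumptions, and the network is small enough to normalize by brute-force enumeration rather than MCMC.

```python
import itertools
import math

# Toy Markov logic network with a single weighted formula:
#   w : Smokes(x) => Cancer(x)
# The domain, predicate names, and weight are illustrative, not from the paper.
PEOPLE = ["Anna", "Bob"]
W = 1.5  # weight attached to the formula

def n_true_groundings(world):
    """Count groundings of Smokes(x) => Cancer(x) satisfied in a world.
    `world` maps ground atoms like ('Smokes', 'Anna') to booleans."""
    return sum(
        1 for p in PEOPLE
        if (not world[("Smokes", p)]) or world[("Cancer", p)]
    )

def log_weight(world):
    """Unnormalized log-probability: weight times number of true groundings."""
    return W * n_true_groundings(world)

# Enumerate all possible worlds over the 4 ground atoms and normalize exactly.
atoms = [(pred, p) for pred in ("Smokes", "Cancer") for p in PEOPLE]
worlds = [dict(zip(atoms, vals))
          for vals in itertools.product([False, True], repeat=len(atoms))]
Z = sum(math.exp(log_weight(w)) for w in worlds)

def prob(world):
    """Normalized probability of a possible world."""
    return math.exp(log_weight(world)) / Z
```

Worlds that satisfy more groundings of the formula receive exponentially more probability mass; in a realistic MLN the enumeration over worlds is intractable, which is why the paper resorts to MCMC over the relevant ground subnetwork.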
Book
The curiosity of this book is first manifested by the pleonasm in its title. All theories are generative; no matter what the subject is, a theory necessarily propounds that from certain structures postulated to be basic other, usually more complex, structures are generable by identified procedures. A theory of tonality could start, and historically more than one has started, with an overtone series as the basic structure. From this, scales have been generated by applying specified procedures. From scales, so-called fundamental basses have been obtained by other procedures. From fundamental basses, grammatically permissible chord structures have been derived, from which theorists have claimed that by application of various contrapuntal procedures whole movements of actual compositions would result.
Article
This study aimed to assess the effect of musical training on statistical learning of tone sequences using magnetoencephalography (MEG). Specifically, MEG recordings were used to investigate the neural and functional correlates of the pre-attentive ability to detect deviance from a statistically learned tone sequence. The effect of long-term musical training on this ability was investigated by comparing the mismatch negativity (MMN) of musicians to that of non-musicians. Both groups (musicians and non-musicians) showed an MMN response to the deviants, and this response differed between them in neither amplitude nor latency. Another interesting finding of this study is that both groups revealed a significant difference between the standards and the deviants in the P50 response, and this difference was significantly larger in the group of musicians. The larger difference in the musicians indicates that intensive, specialized and long-term exercise can enhance the ability of the auditory cortex to discriminate new auditory events from previously learned ones according to transitional probabilities. A behavioral discrimination task between the standard and the deviant sequences followed the MEG measurement. The behavioral results indicated that the detection of deviance was not explicitly learned by either group, probably due to the lack of attentional resources. These findings provide valuable insights into the functional architecture of statistical learning.
Article
Magnetic interference signals often hamper analysis of magnetoencephalographic (MEG) measurements. Artifact sources in the proximity of the sensors cause strong and spatially complex signals that are particularly challenging for the existing interference-suppression methods. Here we demonstrate the performance of the temporally extended signal space separation method (tSSS) in removing strong interference caused by external and nearby sources on auditory-evoked magnetic fields, whose sources are well established. The MEG signals were contaminated by normal environmental interference, by artificially produced additional external interference, and by nearby artifacts produced by a piece of magnetized wire in the subject's lip. After tSSS processing, even the single-trial auditory responses had a signal-to-noise ratio good enough for detailed waveform and source analysis. Waveforms and source locations of the tSSS-reconstructed data were in good agreement with the responses from the control condition without extra interference. Our results demonstrate that tSSS is a robust and efficient method for removing a wide range of different types of interference signals in neuromagnetic multichannel measurements.
Article
How do listeners learn about the statistical regularities underlying musical harmony? In traditional Western music, certain chords predict the occurrence of other chords: Given a particular chord, not all chords are equally likely to follow. In Experiments 1 and 2, we investigated whether adults make use of statistical information when learning new musical structures. Listeners were exposed to a novel musical system containing phrases generated using an artificial grammar. This new system contained statistical structure quite different from Western tonal music. Our results suggest that learners take advantage of the statistical patterning of chords to acquire new musical structures, similar to learning processes previously observed for language learning.
Article
During auditory perception, we are required to abstract information from complex temporal sequences such as those in music and speech. Here, we investigated how higher-order statistics modulate the neural responses to sound sequences, hypothesizing that these modulations are associated with higher levels of the peri-Sylvian auditory hierarchy. We devised second-order Markov sequences of pure tones with uniform first-order transition probabilities. Participants learned to discriminate these sequences from random ones. Magnetoencephalography was used to identify evoked fields in which second-order transition probabilities were encoded. We show that improbable tones evoked heightened neural responses after 200 ms post-tone onset during exposure at the learning stage or around 150 ms during the subsequent test stage, originating near the right temporoparietal junction. These signal changes reflected higher-order statistical learning, which can contribute to the perception of natural sounds with hierarchical structures. We propose that our results reflect hierarchical predictive representations, which can contribute to the experiences of speech and music.
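The stimulus construction described above, second-order structure with uniform first-order transition probabilities, can be sketched as follows. The three tone states and the 0.8/0.1 probabilities are illustrative assumptions, not the study's materials; the point is that the preferred continuation depends on the previous two tones, so conditioning on only one tone yields flat (1/3) statistics while two-tone contexts remain highly predictive.

```python
import random
from collections import defaultdict

TONES = (0, 1, 2)  # three pure-tone "states" (labels are illustrative)

def second_order_probs(prev2, prev1):
    """For each two-tone context, one continuation is frequent (p = 0.8).
    The preferred tone depends on BOTH previous tones, so first-order
    statistics P(next | prev1) average out to uniform (1/3 each)."""
    preferred = (prev2 + prev1) % 3
    return {t: (0.8 if t == preferred else 0.1) for t in TONES}

def generate(length, seed=1):
    """Sample a second-order Markov tone sequence."""
    rng = random.Random(seed)
    seq = [rng.choice(TONES), rng.choice(TONES)]
    while len(seq) < length:
        p = second_order_probs(seq[-2], seq[-1])
        seq.append(rng.choices(TONES, weights=[p[t] for t in TONES])[0])
    return seq

def first_order_tp(seq):
    """Estimate P(next | current) from bigram counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for cur, nxt in zip(seq, seq[1:]):
        counts[cur][nxt] += 1
    return {cur: {t: d[t] / sum(d.values()) for t in TONES}
            for cur, d in counts.items()}
```

Under this construction the uniform distribution over tone pairs is stationary, so a listener tracking only bigrams sees no structure; only a learner of higher-order statistics can discriminate these sequences from random ones, which is the contrast the MEG experiment exploits.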
Article
To learn a new language, it is necessary for the learner to succeed in segmenting the continuous stream of sounds into significant units. Previous behavioral studies have shown that it is possible to segment a language or musical stream based only on probabilities of occurrence between adjacent syllables/tones. Here we used a sung language and tested participants' learning of both linguistic and musical structures while recording electroencephalography. Although behavioral results showed learning of the linguistic structure only, event-related potential results for both dimensions showed a negative component sensitive to the degree of familiarity of items. We discuss this component as an index of lexical search, also pointing to the greater sensitivity of the event-related potentials compared to the behavioral responses.
Article
The event-related potential (ERP) component mismatch negativity (MMN) is a neural marker of human echoic memory. MMN is elicited by deviant sounds embedded in a stream of frequent standards, reflecting the deviation from an inferred memory trace of the standard stimulus. The strength of this memory trace is thought to be proportional to the number of repetitions of the standard tone, visible as the progressive enhancement of MMN with number of repetitions (MMN memory-trace effect). However, no direct ERP correlates of the formation of echoic memory traces are currently known. This study set out to investigate changes in ERPs to different numbers of repetitions of standards, delivered in a roving-stimulus paradigm in which the frequency of the standard stimulus changed randomly between stimulus trains. Normal healthy volunteers (n = 40) were engaged in two experimental conditions: during passive listening and while actively discriminating changes in tone frequency. As predicted, MMN increased with increasing number of standards. However, this MMN memory-trace effect was caused mainly by enhancement with stimulus repetition of a slow positive wave from 50 to 250 ms poststimulus in the standard ERP, which is termed here "repetition positivity" (RP). This RP was recorded from frontocentral electrodes when participants were passively listening to or actively discriminating changes in tone frequency. RP may represent a human ERP correlate of rapid and stimulus-specific adaptation, a candidate neuronal mechanism underlying sensory memory formation in the auditory cortex.
Article
Recognizing melody in music involves detection of both the pitch intervals and the silence between sequentially presented sounds. This study tested the hypothesis that active musical training in adolescents facilitates the ability to passively detect sequential sound patterns compared to musically non-trained age-matched peers. Twenty adolescents, aged 15-18 years, were divided into groups according to their musical training and current experience. A fixed order tone pattern was presented at various stimulus rates while electroencephalogram was recorded. The influence of musical training on passive auditory processing of the sound patterns was assessed using components of event-related brain potentials (ERPs). The mismatch negativity (MMN) ERP component was elicited under different stimulus onset asynchrony (SOA) conditions in musicians than in non-musicians, indicating that musically active adolescents were able to detect sound patterns across longer time intervals than their age-matched peers. Musical training facilitates detection of auditory patterns, enabling automatic recognition of sequential sound patterns over longer time periods than in musically non-trained peers.
Article
Experiencing repeatedly presented auditory stimuli during magnetoencephalographic (MEG) recording may affect how the sound is processed in the listener's brain and may modify auditory evoked responses over the time course of the experiment. Amplitudes of N1 and P2 responses have been proposed as indicators for the outcome of training and learning studies. In this context the effect of merely sound experience on N1 and P2 responses was studied during two experimental sessions on different days with young, middle-aged, and older participants passively listening to speech stimuli and a noise sound. N1 and P2 were characterized as functionally distinct responses with P2 sources located more anterior than N1 in auditory cortices. N1 amplitudes decreased continuously during each recording session, but completely recovered between sessions. In contrast, P2 amplitudes were fairly constant within a session but increased from the first to the second day of MEG recording. Whereas N1 decrease was independent of age, the amount of P2 amplitude increase diminished with age. Temporal dynamics of N1 and P2 amplitudes were interpreted as reflecting neuroplastic changes along different time scales. The long lasting increase in P2 amplitude indicates that the auditory P2 response is potentially an important physiological correlate of perceptual learning, memory, and training.
Article
The early right anterior negativity (ERAN) is an event-related potential (ERP) reflecting processing of music-syntactic information, that is, of acoustic information structured according to abstract and complex regularities. The ERAN is usually maximal between 150 and 250 ms, has anterior scalp distribution (and often right-hemispheric weighting), can be modified by short- and long-term musical experience, can be elicited under ignore conditions, and emerges in early childhood. Main generators of the ERAN appear to be located in inferior fronto-lateral cortex. The ERAN resembles both the physical MMN and the abstract feature MMN in a number of properties, but the cognitive mechanisms underlying ERAN and MMN partly differ: Whereas the generation of the MMN is based on representations of regularities of intersound relationships that are extracted online from the acoustic environment, the generation of the ERAN relies on representations of music-syntactic regularities that already exist in a long-term memory format. Other processes, such as predicting subsequent acoustic events and comparing new acoustic information with the predicted sound, presumably overlap strongly for MMN and ERAN.
Article
In the dominant aesthetic theory, composers are said to use unpredictable events to tease the listener, and make music optimally challenging and therefore aesthetically pleasing. We tested this claim that events optimally discrepant from a schema will be most pleasing. Experts and novices evaluated harmonic progressions at seven levels of syntactic prototypicality. Four results emerged: (1) even novices were extremely sensitive to syntactic atypicality; (2) all subjects found atypical progressions more interesting and complex; (3) novices and undergraduate music students preferred harmonic prototypes, contrary to most aesthetic theories; (4) only music graduate students preferred atypical progressions. We discuss the striking sensitivity of novices to harmonic syntax. We describe differences between an aesthetic theory based on information and uncertainty, and one based on schemas and schema divergence. We also consider the tonal conservatism of most subjects. This conservatism constrains aesthetic theories, and may have implications for music's stylistic evolution.
Middle latency responses (MLRs), in the 10-100 msec latency range evoked by click stimuli, were examined in two sets of 7 adult subjects utilizing 5 randomly ordered rates of stimulus presentation: 0.5/sec, 1/sec, 5/sec, 8/sec and 10/sec. Evoked potentials were collected in 250 trial averages for each rate, and a replication across rates yielded 500 trial averages. Peak-to-peak measurements for Pa-Nb and P1-Nb components revealed that the P1 component was reduced in amplitude or absent at the faster rates, while the amplitude of the Pa component remained unchanged across rates. In addition, the latency of Pa was significantly longer for the faster rates of stimulation. These findings were similar across both mastoid and sternovertebral references. Taken together with previous work, these data suggest that the human Pa and P1 potentials reflect different generator systems. Moreover, the physiological similarities between the human P1 potential and the cat wave A suggest that in the human, as in the cat, this potential may be generated within the ascending reticular activating system, whereas the physiological similarities between the human Pa and the cat wave 7, as well as previous clinical data, suggest an auditory cortex origin of this component.
Article
The need for a simply applied quantitative assessment of handedness is discussed and some previous forms reviewed. An inventory of 20 items with a set of instructions and response- and computational-conventions is proposed and the results obtained from a young adult population numbering some 1100 individuals are reported. The separate items are examined from the point of view of sex, cultural and socio-economic factors which might appertain to them and also of their inter-relationship to each other and to the measure computed from them all. Criteria derived from these considerations are then applied to eliminate 10 of the original 20 items and the results recomputed to provide frequency-distribution and cumulative frequency functions and a revised item-analysis. The difference of incidence of handedness between the sexes is discussed.
Article
The action of CNS inhibitory neuronal mechanisms was tested in acutely psychotic unmedicated schizophrenic patients and in normal controls. An early positive component of the auditory average evoked response recorded at the vertex 50 msec after a click stimulus was studied. Stimuli were delivered at 10-sec intervals to establish a base-line response. Inhibitory mechanisms were then tested using a conditioning-testing paradigm by assessing the change in response to a second stimulus following the first at either 0.5, 1.0, or 2.0-sec intervals. At the 0.5-sec interval, normal controls had over a 90% mean decrement in response, whereas schizophrenics showed less than a 15% mean decrement. At 2-sec intervals, responses from normals were still 30 to 50% diminished, but those from schizophrenics showed an increased response to the stimulus compared to base line. The data suggest that normally present inhibitory mechanisms are markedly reduced in schizophrenics. Failure of these inhibitory mechanisms may be responsible for the defects in sensory gating which are thought to be an important part of the pathophysiology of schizophrenia.
Article
In a sentence reading task, words that occurred out of context were associated with specific types of event-related brain potentials. Words that were physically aberrant (larger than normal) elicited a late positive series of potentials, whereas semantically inappropriate words elicited a late negative wave (N400). The N400 wave may be an electrophysiological sign of the "reprocessing" of semantically anomalous information.
The goal of this study is to determine and localize the generators of different components of middle latency auditory evoked potentials (MLAEPs) through intracerebral recording in auditory cortex in man (Heschl's gyrus and planum temporale). The present results show that the generators of components at 30, 50, 60 and 75 msec latency are distributed medio-laterally along Heschl's gyrus. The 30 msec component is generated in the dorso-postero-medial part of Heschl's gyrus (primary area) and the 50 msec component is generated laterally in the primary area. The generators of the later components (60-75 msec) are localized in the lateral part of Heschl's gyrus that forms the secondary areas. The localization of N100 generators is discussed.
Article
Sensory gating is a complex, multistage, multifaceted physiological function believed to be protecting higher cortical centers from being flooded with incoming irrelevant sensory stimuli. Failure of such mechanisms is hypothesized as one of the mechanisms underlying the development of psychotic states. Attenuation of the amplitude of the P50 evoked potential component with stimulus repetition is widely used to study sensory gating. In the current study, we investigated the responsiveness of the P50 component to changes in the physical characteristics of ongoing trains of auditory stimuli. Forty normal volunteers were studied in a modified oddball paradigm. At all cerebral locations studied, P50 amplitudes were higher in response to infrequent stimuli. We postulate that the increase in P50 amplitude reflects the system's recognition of novel stimuli or "gating in" of sensory input. The ratio of the amplitude of the responses to the infrequent stimuli to those of the frequent stimuli was significantly higher for the posterior temporal regions. This finding provides further evidence that the temporal lobes may be significantly involved in sensory gating processes. Although this study only included normal subjects, the data generated contribute to the understanding of sensory gating mechanisms that may be relevant to psychotic states.
Article
Evoked potential (EP) changes accompanying dementing processes have been documented in a number of studies. However, EPs have not been studied in subjects who are at heightened risk for the development of Alzheimer's Disease (AD). Nineteen volunteers with no immediate family members with a history of AD and 33 healthy subjects with at least one first-degree relative with AD were studied. Of the 33 subjects with a positive family history of AD, the illness of the sick relative was classified as possible AD in 10 subjects, probable AD in 17 subjects, and definite (autopsy-proven) AD in 6 subjects. Mid-latency evoked potentials (P50, N100, and P200) and P300 event-related potentials were recorded in an oddball paradigm. The amplitudes of the P50 responses to the frequent stimuli and of the P300 responses were significantly higher in the subjects whose relatives had definite AD as compared with the other three groups. The amplitude of the N100 component was also larger in the same group, but the difference was only statistically significant from the group of healthy volunteers without a family history of AD. A process of increased sensitivity to incoming stimuli may be reflected in the increased P50, N100, and P300 amplitudes in the subjects at increased risk for developing AD.
Article
This paper presents a model describing the temporal and neurotopological structure of syntactic processes during comprehension. It postulates three distinct phases of language comprehension, two of which are primarily syntactic in nature. During the first phase the parser assigns the initial syntactic structure on the basis of word category information. These early structural processes are assumed to be subserved by the anterior parts of the left hemisphere, as event-related brain potentials show this area to be maximally activated when phrase structure violations are processed and as circumscribed lesions in this area lead to an impairment of the on-line structural assignment. During the second phase lexical-semantic and verb-argument structure information is processed. This phase is neurophysiologically manifest in a negative component in the event-related brain potential around 400 ms after stimulus onset which is distributed over the left and right temporo-parietal areas when lexical-semantic information is processed and over left anterior areas when verb-argument structure information is processed. During the third phase the parser tries to map the initial syntactic structure onto the available lexical-semantic and verb-argument structure information. In case of an unsuccessful match between the two types of information reanalyses may become necessary. These processes of structural reanalysis are correlated with a centroparietally distributed late positive component in the event-related brain potential.
Article
Learners rely on a combination of experience-independent and experience-dependent mechanisms to extract information from the environment. Language acquisition involves both types of mechanisms, but most theorists emphasize the relative importance of experience-independent mechanisms. The present study shows that a fundamental task of language acquisition, segmentation of words from fluent speech, can be accomplished by 8-month-old infants based solely on the statistical relationships between neighboring speech sounds. Moreover, this word segmentation was based on statistical learning from only 2 minutes of exposure, suggesting that infants have access to a powerful mechanism for the computation of statistical properties of the language input.
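The computation the infants are credited with, tracking transitional probabilities between neighboring syllables and positing word boundaries where they dip, can be sketched as follows. The three nonsense words echo the familiar Saffran-style design, but the stream generator and the 0.9 boundary threshold are assumptions of this illustration, not the study's materials.

```python
import random
from collections import defaultdict

# Hypothetical artificial language: three trisyllabic "words" concatenated in
# random order, so word-internal TP = 1.0 and TP at word boundaries is ~0.5.
WORDS = ["tupiro", "golabu", "bidaku"]

def make_stream(n_words, seed=0):
    """Build a continuous syllable stream with no immediate word repeats."""
    rng = random.Random(seed)
    syls, prev = [], None
    for _ in range(n_words):
        w = rng.choice([x for x in WORDS if x != prev])
        prev = w
        syls.extend(w[i:i + 2] for i in range(0, 6, 2))  # split into syllables
    return syls

def transitional_probs(stream):
    """Estimate P(next syllable | current syllable) from bigram counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(stream, stream[1:]):
        counts[a][b] += 1
    return {a: {b: n / sum(d.values()) for b, n in d.items()}
            for a, d in counts.items()}

def segment(stream, tp, threshold=0.9):
    """Insert a word boundary wherever the forward TP dips below threshold."""
    words, cur = [], [stream[0]]
    for a, b in zip(stream, stream[1:]):
        if tp[a].get(b, 0) < threshold:
            words.append("".join(cur))
            cur = []
        cur.append(b)
    words.append("".join(cur))
    return words
```

Because word-internal transitions are deterministic while cross-boundary transitions split their probability mass, thresholding the TP profile recovers the words exactly, mirroring the statistic the 8-month-olds are argued to exploit.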