Yang Xu

  • University of Toronto

About

98
Publications
15,828
Reads
1,136
Citations
Current institution
University of Toronto

Publications (98)
Article
A key function of the lexicon is to express novel concepts as they emerge over time through a process known as lexicalization. The most common lexicalization strategies are the reuse and combination of existing words, but they have typically been studied separately in the areas of word meaning extension and word formation. Here, we offer an informa...
Article
Full-text available
Morality is central to social well-being and cognition, and moral lexicon is a key device for human communication of moral concepts and experiences. How was the moral lexicon formed? We explore this open question and hypothesize that words evolved to take on abstract moral meanings from concrete and grounded experiences. We test this hypothesis by...
Article
A defining property of human language is the creative use of words to express multiple meanings through word meaning extension. Such lexical creativity is manifested at different timescales, ranging from language development in children to the evolution of word meanings over history. We explored whether different manifestations of lexical creativit...
Conference Paper
Full-text available
Moral norms vary across cultures. A recent line of work suggests that English large language models contain human-like moral biases, but these studies typically do not examine moral variation in a diverse cultural setting. We investigate the extent to which monolingual English language models contain knowledge about moral norms in different countri...
Article
Full-text available
Slang is a common type of informal language, but its flexible nature and paucity of data resources present challenges for existing natural language systems. We take an initial step toward machine generation of slang by developing a framework that models the speaker’s word choice in slang context. Our framework encodes novel slang meaning by relatin...
Preprint
Full-text available
Humans can make moral inferences from multiple sources of input. In contrast, automated moral inference in artificial intelligence typically relies on language models with textual input. However, morality is conveyed through modalities beyond language. We present a computational framework that supports moral inference from natural images, demonstra...
Article
The lexicon is an evolving symbolic system that expresses an unbounded set of emerging meanings with a limited vocabulary. As a result, words often extend to new meanings. Decades of research have suggested that word meaning extension is non-arbitrary, and recent work formalizes this process as cognitive models of semantic chaining whereby emerging...
Article
Automated moral inference is an emerging topic of critical importance in artificial intelligence. The contemporary approach typically relies on language models to infer moral relevance or moral properties of a concept. This approach demands complex parameterization and costly computation, and it tends to disconnect with existing psychological accou...
Preprint
Full-text available
A key function of the lexicon is to express novel concepts as they emerge over time through a process known as lexicalization. The most common lexicalization strategies are the reuse and combination of existing words, but they have typically been studied separately in the areas of word meaning extension and word formation. Here we offer an informat...
Article
Morality is central to social well-being and cognition, and moral lexicon is a key device for human communication of moral concepts and experiences. How was the moral lexicon formed? We explore this open question and hypothesize that words evolved to take on abstract moral meanings from concrete and grounded experiences. We test this hypothesis by...
Article
Theorists have argued that morality builds on several core modular foundations. When do different moral foundations emerge in life? Prior work has explored the conceptual development of different aspects of morality in childhood. Here, we offer an alternative approach to investigate the developmental emergence of moral foundations through the lexic...
Article
Full-text available
Categorization is ubiquitous in human cognition and society, and shapes how we perceive and understand the world. Because categories reflect the needs and perspectives of their creators, no category system is entirely objective, and inbuilt biases can have harmful social consequences. Here we propose methods for measuring biases in hierarchical sys...
Preprint
Full-text available
Humans often make creative use of words to express novel senses. A long-standing effort in natural language processing has been focusing on word sense disambiguation (WSD), but little has been explored about how the sense inventory of a word may be extended toward novel meanings. We present a paradigm of word sense extension (WSE) that enables word...
Preprint
Full-text available
Moral norms vary across cultures. A recent line of work suggests that English large language models contain human-like moral biases, but these studies typically do not examine moral variation in a diverse cultural setting. We investigate the extent to which monolingual English language models contain knowledge about moral norms in different countri...
Conference Paper
Full-text available
Humans often make creative use of words to express novel senses. A long-standing effort in natural language processing has been focusing on word sense disambiguation (WSD), but little has been explored about how the sense inventory of a word may be extended toward novel meanings. We present a paradigm of word sense extension (WSE) that enables word...
Article
Full-text available
Semantic change is attested commonly in the historical development of lexicons across the world's languages. Extensive research has sought to characterize regularity in semantic change, but existing studies have typically relied on manual approaches or the analysis of a restricted set of languages. We present a large-scale computational analysis to...
Article
Full-text available
Semantic change is attested commonly in the historical development of lexicons across the world's languages. Extensive research has sought to characterize regularity in semantic change, but existing studies have typically relied on manual approaches or the analysis of a restricted set of languages. We present a large-scale computational analysis to...
Article
Full-text available
Scientific progress, or scientific change, has been an important topic in the philosophy and history of science. Previous work has developed quantitative approaches to characterize the progression of science in different fields, but how individual scientists make progress through their careers is not well understood at a comprehensive scale. We cha...
Article
Full-text available
Humans can flexibly extend word usages across different grammatical classes, a phenomenon known as word class conversion. Noun-to-verb conversion, or denominal verb (e.g., to Google a cheap flight), is one of the most prevalent forms of word class conversion. However, existing natural language processing systems are impoverished in interpreting and...
Conference Paper
Full-text available
The meaning of a slang term can vary in different communities. However, slang semantic variation is not well understood and under-explored in the natural language processing of slang. One existing view argues that slang semantic variation is driven by culture-dependent communicative needs. An alternative view focuses on slang's social functions sug...
Preprint
Full-text available
The meaning of a slang term can vary in different communities. However, slang semantic variation is not well understood and under-explored in the natural language processing of slang. One existing view argues that slang semantic variation is driven by culture-dependent communicative needs. An alternative view focuses on slang's social functions sug...
Article
Full-text available
Languages vary considerably in syntactic structure. About 40% of the world's languages have subject-verb-object order, and about 40% have subject-object-verb order. Extensive work has sought to explain this word order variation across languages. However, the existing approaches are not able to explain coherently the frequency distribution and evolu...
Article
Gender associations have been a long‐standing research topic in psychological and social sciences. Although it is known that children learn aspects of gender associations at a young age, it is not well understood how they might emerge through the course of development. We investigate whether gender associations, such as the association of dresses w...
Preprint
Full-text available
Gender associations have been a long-standing research topic in psychological and social sciences. Although it is known that children learn aspects of gender association at a young age, it is not well understood how they might emerge through the course of development. We investigate whether gender associations, such as the association of dresses wi...
Preprint
Full-text available
Humans can flexibly extend word usages across different grammatical classes, a phenomenon known as word class conversion. Noun-to-verb conversion, or denominal verb (e.g., to Google a cheap flight), is one of the most prevalent forms of word class conversion. However, existing natural language processing systems are impoverished in interpreting and...
Conference Paper
Full-text available
Slang is a predominant form of informal language making flexible and extended use of words that is notoriously hard for natural language processing systems to interpret. Existing approaches to slang interpretation tend to rely on context but ignore semantic extensions common in slang word usage. We propose a semantically informed slang interpretati...
Preprint
Full-text available
Slang is a predominant form of informal language making flexible and extended use of words that is notoriously hard for natural language processing systems to interpret. Existing approaches to slang interpretation tend to rely on context but ignore semantic extensions common in slang word usage. We propose a semantically informed slang interpretati...
Preprint
Full-text available
In lexicalist linguistic theories, argument structure is assumed to be predictable from the meaning of verbs. As a result, the verb is the primary determinant of the meaning of a clause. In contrast, construction grammarians propose that argument structure is encoded in constructions (or form-meaning pairs) that are distinct from verbs. Decades of...
Preprint
Full-text available
Contextualized word embeddings have demonstrated state-of-the-art performance in various natural language processing tasks including those that concern historical semantic change. However, language models such as BERT were trained primarily on contemporary corpus data. To investigate whether training on historical corpus data improves diachronic sem...
Article
Significance: Grammatical marking of features such as number, tense, and evidentiality varies widely across languages. Despite this variation, we show that grammatical markers support efficient information transfer from speakers to listeners. We apply a formal model of communication to data from dozens of languages and find that grammatical marking...
Article
Full-text available
Humans possess the unique ability to communicate emotions through language. Although concepts like anger or awe are abstract, there is a shared consensus about what these English emotion words mean. This consensus may give the impression that their meaning is static, but we propose this is not the case. We cannot travel back to earlier periods to s...
Preprint
Full-text available
Natural language relies on a finite lexicon to express an unbounded set of emerging ideas. One result of this tension is the formation of new compositions, such that existing linguistic units can be combined with emerging items into novel expressions. We develop a framework that exploits the cognitive mechanisms of chaining and multimodal knowledge...
Conference Paper
Full-text available
Natural language relies on a finite lexicon to express an unbounded set of emerging ideas. One result of this tension is the formation of new compositions, such that existing linguistic units can be combined with emerging items into novel expressions. We develop a framework that exploits the cognitive mechanisms of chaining and multimodal knowledge...
Preprint
Full-text available
Morality plays an important role in social well-being, but people's moral perception is not stable and changes over time. Recent advances in natural language processing have shown that text is an effective medium for informing moral change, but no attempt has been made to quantify the origins of these changes. We present a novel unsupervised framew...
Preprint
Full-text available
Humans possess the unique ability to communicate emotions through language. Although concepts like anger or awe are abstract, there is a shared consensus about what these English emotion words mean. This consensus may give the impression that their meaning is static, but we propose this is not the case. We cannot travel back to earlier periods to s...
Preprint
Full-text available
As the numbers of submissions to conferences grow quickly, the task of assessing the quality of academic papers automatically, convincingly, and with high accuracy attracts increasing attention. We argue that studying interpretable dimensions of these submissions could lead to scalable solutions. We extract a collection of writing features, and con...
Preprint
Full-text available
The use of euphemisms is a known driver of language change. It has been proposed that women use euphemisms more than men. Although there have been several studies investigating gender differences in language, the claim about euphemism usage has not been tested comprehensively through time. If women do use euphemisms more, this could mean that women...
Preprint
Transformer language models have shown remarkable ability in detecting when a word is anomalous in context, but likelihood scores offer no information about the cause of the anomaly. In this work, we use Gaussian models for density estimation at intermediate layers of three language models (BERT, RoBERTa, and XLNet), and evaluate our method on BLiM...
Preprint
Functionalist accounts of language suggest that forms are paired with meanings in ways that support efficient communication. Previous work on grammatical marking suggests that word forms have lengths that enable efficient production, and work on the semantic typology of the lexicon suggests that word meanings represent efficient partitions of seman...
Preprint
Slang is a common type of informal language, but its flexible nature and paucity of data resources present challenges for existing natural language systems. We take an initial step toward machine generation of slang by developing a framework that models the speaker's word choice in slang context. Our framework encodes novel slang meaning by relatin...
Article
Overextension—the phenomenon that children extend known words to describe referents outside their vocabulary—is a hallmark of lexical innovation in early childhood. Overextension is a subject of extensive inquiry in linguistics and developmental psychology, but there exists no coherent formal account of this phenomenon. We develop a general computa...
Conference Paper
Full-text available
The use of euphemisms is a known driver of language change. It has been proposed that women use euphemisms more than men. Although there have been several studies investigating gender differences in language, the claim about euphemism usage has not been tested comprehensively through time. If women do use euphemisms more, this could mean that women...
Preprint
Full-text available
Semantic shifts can reflect changes in beliefs across hundreds of years, but it is less clear whether trends in fast-changing communities across a short time can be detected. We propose semantic coordinates analysis, a method based on semantic shifts, that reveals changes in language within publications of a field (we use AI as example) across a sh...
Conference Paper
Full-text available
We present a methodological framework for inferring symmetry of verb predicates in natural language. Empirical work on predicate symmetry has taken two main approaches. The feature-based approach focuses on linguistic features pertaining to symmetry. The context-based approach denies the existence of absolute symmetry but instead argues that such i...
Preprint
We present a methodological framework for inferring symmetry of verb predicates in natural language. Empirical work on predicate symmetry has taken two main approaches. The feature-based approach focuses on linguistic features pertaining to symmetry. The context-based approach denies the existence of absolute symmetry but instead argues that such i...
Preprint
Overextension—the phenomenon that children extend known words to describe referents outside their vocabulary—is a hallmark of lexical innovation in early childhood. Overextension is a subject of extensive inquiry in linguistics and developmental psychology, but there exists no coherent formal account of this phenomenon. We develop a general computa...
Preprint
Full-text available
Word class flexibility refers to the phenomenon whereby a single word form is used across different grammatical categories. Extensive work in linguistic typology has sought to characterize word class flexibility across languages, but quantifying this phenomenon accurately and at scale has been fraught with difficulties. We propose a principled meth...
Article
We explore how linguistic categories extend over time as novel items are assigned to existing categories. As a case study we consider how Chinese numeral classifiers were extended to emerging nouns over the past half century. Numeral classifiers are common in East and Southeast Asian languages, and are prominent in the cognitive linguistics literat...
Preprint
Full-text available
Developing moral awareness in intelligent systems has shifted from a topic of philosophical inquiry to a critical and practical issue in artificial intelligence over the past decades. However, automated inference of everyday moral situations remains an under-explored problem. We present a text-based approach that predicts people's intuitive judgmen...
Article
Full-text available
Languages differ qualitatively in their numeral systems. At one extreme, some languages have a small set of number terms, which denote approximate or inexact numerosities; at the other extreme, many languages have forms for exact numerosities over a very large range, through a recursively defined counting system. Why do numeral systems vary as they...
Preprint
Lexical semantic typology has identified important cross-linguistic generalizations about the variation and commonalities in polysemy patterns: how languages package up meanings into words. Recent computational research has enabled investigation of lexical semantics at a much larger scale, but little work has explored lexical typology across seman...
Preprint
We explore how linguistic categories extend over time as novel items are assigned to existing categories. As a case study we consider how Chinese numeral classifiers were extended to emerging nouns over the past half century. Numeral classifiers are common in East and Southeast Asian languages, and are prominent in the cognitive linguistics literat...
Preprint
Chinese dynastic histories form a large continuous linguistic space of approximately 2000 years, from the 3rd century BCE to the 18th century CE. The histories are documented in Classical (Literary) Chinese in a corpus of over 20 million characters, suitable for the computational analysis of historical lexicon and semantic change. However, there is...
Article
Full-text available
In natural language, multiple meanings often share a single word form, a phenomenon known as colexification. Some sets of meanings are more frequently colexified across languages than others, but the source of this variation is not well understood. We propose that cross-linguistic variation in colexification frequency is non-arbitrary and reflects...
Preprint
Full-text available
In natural language, multiple meanings often share a single word form, a phenomenon known as colexification. Some sets of meanings are more frequently colexified across languages than others, but the source of this variation is not well understood. We propose that cross-linguistic variation in colexification frequency is non-arbitrary and reflects...
Preprint
Full-text available
We present a text-based framework for investigating moral sentiment change of the public via longitudinal corpora. Our framework is based on the premise that language use can inform people's moral perception toward right or wrong, and we build our methodology by exploring moral biases learned from diachronic word embeddings. We demonstrate how a pa...
Preprint
One way that languages are able to communicate a potentially infinite set of ideas through a finite lexicon is by compressing emerging meanings into words, such that over time, individual words come to express multiple, related senses of meaning. We propose that overarching communicative and cognitive pressures have created systematic directionalit...
Preprint
Human language relies on a finite lexicon to express a potentially infinite set of ideas. A key result of this tension is that words acquire novel senses over time. However, the cognitive processes that underlie the historical emergence of new word senses are poorly understood. Here, we present a computational framework that formalizes competing views o...
Article
Significance: How do words develop new senses? Unlike changes in sound or grammar where there are rich formal characterizations, semantic change is poorly understood. Changes in meaning are often considered intractable, with sparse attempts at formalizing and evaluating the principles against historical data at scale. We present a data-enriched form...
Article
Crosslinguistic research on domains including kinship, color, folk biology, number, and spatial relations has documented the different ways in which languages carve up the world into named categories. Although word meanings vary widely across languages, unrelated languages often have words with similar or identical meanings, and many logically poss...
Article
Previous research has proposed an adaptive cue combination view of the development of human spatial reorientation (Newcombe & Huttenlocher, 2006), whereby information from multiple sources is combined in a weighted fashion in localizing a target, as opposed to being modular and encapsulated (Hermer & Spelke, 1996). However, no prior work has formal...
Article
One way that languages are able to communicate a potentially infinite set of ideas through a finite lexicon is by compressing emerging meanings into words, such that over time, individual words come to express multiple, related senses of meaning. We propose that overarching communicative and cognitive pressures have created systematic directionalit...
Article
Full-text available
Humans are experts at face individuation. Although previous work has identified a network of face-sensitive regions and some of the temporal signatures of face processing, as yet, we do not have a clear understanding of how such face-sensitive regions support learning at different time points. To study the joint spatio-temporal neural basis of face...
Article
Full-text available
The Sapir‐Whorf hypothesis holds that human thought is shaped by language, leading speakers of different languages to think differently. This hypothesis has sparked both enthusiasm and controversy, but despite its prominence it has only occasionally been addressed in computational terms. Recent developments support a view of the Sapir‐Whorf hypothe...
Conference Paper
Full-text available
What forces have shaped the evolution of the lexicon? Languages evolve under the pressure of having to communicate an unbounded set of ideas using a finite set of linguistic structures. This suggests why the transmission of ideas should be compressed such that one word will develop multiple senses. Previous theory also suggests how a word might dev...
Article
Full-text available
The Sapir-Whorf hypothesis holds that our thoughts are shaped by our native language, and that speakers of different languages therefore think differently. This hypothesis is controversial in part because it appears to deny the possibility of a universal groundwork for human cognition, and in part because some findings taken to support it have not...
Article
Semantic categories in the world’s languages often reflect a historical process of chaining: A name for one referent is extended to a conceptually related referent, and from there on to other referents, producing a chain of exemplars that all bear the same name. The beginning and end points of such a chain might in principle be rather dissimilar. T...
Conference Paper
Full-text available
Semantic categories in the world's languages often reflect a historical process of chaining: A name for one idea is extended to a conceptually related idea, and from there on to other ideas, producing a chain of concepts that all bear the same name. The beginning and end points of such a chain might in principle be conceptually rather dissimilar. T...
Article
Full-text available
Humans are remarkably proficient at categorizing visually-similar objects. To better understand the cortical basis of this categorization process, we used magnetoencephalography (MEG) to record neural activity while participants learned-with feedback-to discriminate two highly-similar, novel visual categories. We hypothesized that although prefront...
Data
Humans are remarkably proficient at categorizing visually-similar objects. To better understand the cortical basis of this categorization process, we used magnetoencephalography (MEG) to record neural activity while participants learned–with feedback–to discriminate two highly-similar, novel visual categories. We hypothesized that although prefront...
Data
Humans are remarkably proficient at categorizing visually-similar objects. To better understand the cortical basis of this categorization process, we used magnetoencephalography (MEG) to record neural activity while participants learned–with feedback–to discriminate two highly-similar, novel visual categories. We hypothesized that although prefront...
Article
Identifying brain regions with high differential response under multiple experimental conditions is a fundamental goal of functional imaging. In many studies, regions of interest (ROIs) are not determined a priori but are instead discovered from the data, a process that requires care because of the great potential for false discovery. An additional...
Article
Full-text available
Magnetoencephalography (MEG) enables a noninvasive interface with the brain that is potentially capable of providing movement-related information similar to that obtained using more invasive neural recording techniques. Previous studies have shown that movement direction can be decoded from multichannel MEG signals recorded in humans performing wri...
Chapter
Full-text available
We present a cluster-based decoding algorithm for discovering regions of interest (ROIs) from EEG/MEG data in source space (or optimal cluster of sources) and predicting multiple conditions in a single experimental trial. Our algorithm automatically identifies contiguous brain regions that yield maximum mean test statistics from hypothesis tests ov...
Article
Full-text available
Although the use of clustering methods has rapidly become one of the standard computational approaches in the literature of microarray gene expression data analysis, little attention has been paid to uncertainty in the results obtained. We present an R/Bioconductor port of a fast novel algorithm for Bayesian agglomerative hierarchical clustering an...
Data
Figure 4. Gene clustering dendrogram for the NASC data.
Data
Full-text available
Figure 3. Condition clustering dendrogram for the NASC data.
Data
Full-text available
GO annotations for BHC clusters. Statistically significantly over-represented GO annotations for BHC clusters (Bonferroni-corrected p-value < 0.05)
Data
Full-text available
GO annotations for agglomerative hierarchical clustering. Statistically significantly over-represented GO annotations for clusters manually identified from agglomerative hierarchical clustering (Bonferroni-corrected p-value < 0.05)
Data
LeafDisparity values for the NASC experiments. The BHC clustering dendrogram is compared to a standard hierarchical method using uncentred correlation coefficients and complete linkage.
Data
Full-text available
Figure 2. Gene clustering dendrogram of a subset of the Ideker et al. data, showing leaf harmony values
Data
Full-text available
Table 1 – Speed-trial of the BHC algorithm. Trials were based on the NASC data (880 genes, 31 features), clustering over genes. In each case, the data were duplicated or a subset of genes taken as appropriate to get the required number of genes and features. All trials were run on a single 2 GHz CPU core on a MacBook Pro laptop.
Data
Full-text available
Table 2. Data discretisation for NASC experiment clustering
Data
Full-text available
Table 3. Data discretisation for NASC gene clustering
Data
BHC cluster membership. BHC cluster membership
Article
Full-text available
The Dirichlet process mixture (DPM) is a widely used model for clustering and for general nonparametric Bayesian density estimation. Unfortunately, like in many statistical models, exact inference in a DPM is intractable, and approximate methods are needed to perform efficient inference. While most attention in the literature has been placed on...
