About
779
Publications
241,935
Reads
6,725
Citations
Introduction
I am intrigued by what we can learn from language,
How language reflects our interaction with the living environment through cognition,
How language links, clusters, and excludes different people,
Not just by what is said, how and why, but also what is expressed or not said,
As well as what is heard and understood, and how it affects the listener,
And, in sum, how language big data reflects collective human behaviors in time and space.
Language is the footprint humankind leaves on its journey through time and space.
Current institution
Additional affiliations
September 2003 - June 2009
Education
August 1982 - August 1986
August 1976 - June 1980
Publications (779)
Interactions among the environment, humans and language underlie many of the most pressing challenges we face today. This study investigates the use of different verbs to encode various weather events in Sinitic languages, a language family spoken over a wide range of climates and with 3000 years of continuous textual documentation. We propose to s...
This paper adopts models from epidemiology to account for the development and decline of neologisms based on internet usage. The research design focuses on the issue of whether a host-driven epidemic model is well-suited to explain human behavior regarding neologisms. We extracted the search frequency data from Google Trends that covers the ninety...
This paper proposes a textual analytics approach to the discovery of trends and variations in social development. Specifically, we have designed a linguistic index that measures the marked usage of gendered modifiers in the Chinese language; this predicts the degree of occupational gender segregation by identifying the unbalanced distribution of ma...
This paper investigates the emergence of COVID-19 neologisms. It focuses on the strategies used to coin emerging neologisms, the relationship between the strategies and the usage preferences, as well as the correlation between internet usage data and epidemiological data. The internet usage data were collected from December 2019 to June 2020 from t...
Sentiment analysis helps give artificial intelligence systems the ability to understand human attitudes in texts. In this area, text sentiment is usually signaled by a few indicative words that convey affective meanings and arouse readers’ collective emotions. However, most existing sentiment analysis models have predominantly featured...
Type theories provide a formal foundation for logic, mathematics, and computing. They have also been applied to the study of language in formal syntax and semantics, such as categorial grammar and Montague semantics. In this paper, we adopt insights from the type-theoretical definition of grammatical categories to model different registers in Manda...
Disyllabic coordinate compounds represent one of the most common lexical types in Mandarin and are particularly challenging for second-language (L2) learners to acquire, especially in terms of the correct ordering of the two roots. In this study, we investigated the errors made by L2 learners in using disyllabic coordinate compounds in the Hanyu Sh...
Dense large language models (LLMs) face critical efficiency bottlenecks as they rigidly activate all parameters regardless of input complexity. While existing sparsity methods (static pruning or dynamic activation) address this partially, they either lack adaptivity to contextual or model structural demands or incur prohibitive computational overhead...
Neologisms reflect new ideas or new concepts in our life and play an important role in cultural transmission and the vitality of human language. The explosion of neologisms, especially in the past two decades, can also be ascribed to the popularity and accessibility of digital content and social media. In this paper, we focus on the issue of how ne...
The prediction of lexical complexity in context is assuming an increasing relevance in Natural Language Processing research, since identifying complex words is often the first step of text simplification pipelines. To the best of our knowledge, though, datasets annotated with complex words are available only for English and for a limited number of...
Linguistic synesthesia as a productive figurative language usage has received little attention in the field of Natural Language Processing (NLP). Although linguistic synesthesia is similar to metaphor in that it involves conceptual mappings and is highly useful in NLP tasks such as sentiment analysis and stance detection, the well-stud...
Conceptual metaphors are one of many linguistic devices that can potentially encode and reinforce gender stereotypes. However, little is known about how metaphors encode gender stereotypes, and in previous literature the concept of “gendered metaphor” has been mostly assumed rather than attested. We take the first step to tackle this issue by exami...
Despite being spoken by a large population of speakers worldwide, Cantonese is under-resourced in terms of the data scale and diversity compared to other major languages. This limitation has excluded it from the current “pre-training and fine-tuning” paradigm that is dominated by Transformer architectures. In this paper, we provide a comprehensive...
The final version of this preprint was accepted by the journal Language Resources and Evaluation on Apr 19, 2024.
The impact of emotion on prosody in the context of speech communication has yielded inconclusive results when it comes to the prosodic patterns associated with high-arousal emotions of different emotional valences, such as “Happy” and “Anger”. To clarify the existing ambiguity, this study utilized an emotional speech database to examine prosodic me...
In this paper, gender-related variations in the semi-institutional discourse are examined. We investigate the cross-gender conversation in the talk show Behind the Headline with Wentao, a corpus of around 88,000 words of Mandarin Chinese conversation and identify gender variation between female and male guests. We explore the turn-taking features a...
With the development of information technologies, our world currently faces an overwhelming mass of neologisms. Therefore, the study of neologisms has become an important research topic in recent years [1]. In this research, we investigate the factors that facilitate the efficient propagation of Chinese neologisms, based on Internet usage data...
Recent studies indicated a trend of quantifying lexical semantic changes with distributional models. In this study, we investigated whether state-of-the-art language models can tell us the story of how a word developed its senses over time. Specifically, we exploited the BERT model to obtain sense representations and quantitatively track usage chan...
Mandarin alphabetical words (MAWs) refer to the code-mixing of Romanized letters and characters such as X光 ‘X-ray’ in the Mandarin lexicon. Previous studies have mainly focused on MAWs’ formation but lacked empirical evidence regarding their morpho-syntactic behaviours. Classifiers have been used to infer nominals’ semantic properties and character...
We explore argument realization in the resultative V-de construction under the framework of the Theta System. We find that the theta grids of the resultative V-de construction are of two types, i.e., ([+c-m], [-c]) and ([+cm], [-m]), depending on the (a-)thematic relation between the verb and second/internal argument. Crucially, the external argu...
This chapter will introduce a Chinese event-based emotion corpus and explore an important task in emotion analysis—emotion cause detection. The corpus design and the data collection and annotation procedures will be described, as will the emotion cause detection task, which aims to detect the triggering cause of an emotion automatically. We regard...
The empirical theory of sense is one of the most challenging and least studied topics in computational lexical semantics. Previous studies have focused on disambiguation and sense tagging, both of which rely on prior knowledge of the inventory of possible senses of a word. However, because prior lexical knowledge cannot be assumed, and linguistic b...
This chapter provides a comprehensive overview of the co-development of Chinese language resources and Chinese language processing in the past three decades. The overview highlights the contribution of the Institute of Computational Linguistics at Peking University and the CKIP group at Academia Sinica, as they are the two groups that constructed m...
In this chapter, we will introduce the language resources developed by the Chinese Knowledge Information Processing (CKIP) Group at Academia Sinica in Taiwan over the past 30 years. These include monolingual and bilingual lexical knowledge bases (CKIP lexical knowledge base, Hantology, Chinese WordNet, Sinica BOW, and E-HowNet), Chinese grammar (In...
The ability to automatically segment and PoS tag any Chinese text at any time with high accuracy and recall is a prerequisite for the online processing of Chinese texts. While this goal is within reach, it has yet to be attained even after more than 30 years of Chinese language processing research. Most recent achievements in Chinese adopt either s...
This chapter will present a collective effort to compile a comprehensive repository of accessible Chinese language resources that can be used online, licensed for use, or accessed in published form. The compendium will be presented in three parts according to each language resource’s type of accessibility, which is a direct consequence of the type...
This paper examines how conceptualizations of 'election' have changed in post-colonial Hong Kong. Drawing on Burgers' (2016) approach to model how metaphors change their focus on social topics over time, we study the uses of ELECTION metaphors in speeches by government leaders. These changes are classified as either fundamental changes (the use of...
This study investigates the general public’s concerns about COVID-19 vaccination by their comments in social media (YouTube) with NLP techniques and time series analysis. A set of keywords are traced in order to better understand the changes in public opinion and responses at different stages of the pandemic, as well as the influences of fake news....
Understanding the nature of meaning and its extensions (with metaphor as one typical kind) has been one core issue in figurative language study since Aristotle’s time. This research takes a computational cognitive perspective to model metaphor based on the assumption that meaning is perceptual, embodied, and encyclopedic. We model word meaning repr...
This study investigates the use of metaphors and the prospect of gain/loss conveyed in the coverage of the pandemic in a leading conventional news outlet in Macau. We discovered that war metaphors have predominantly been used in reports in the Macau Daily News, and have identified three sets of lexical expressions used in these metaphors. The main...
Linguistic synesthesia links two concepts from two distinct sensory domains and creates conceptual conflicts at the level of embodied cognition. Previous studies focused on constraints on the directionality of synesthetic mapping as a way to establish the conceptual hierarchy among the five senses (i.e., vision, hearing, taste, smell, and touch). T...
Nouns in human languages mostly profile concrete and abstract entities. But how much eventive information can be found in nouns? Will such eventive information found in sensory nouns have anything to do with the cognitive representation of the basic human senses? Importantly, is there any ontological and/or cognitive motivation that can account for...
Written for beginning learners of the language, this concise introduction to Chinese grammar assumes only a basic knowledge of Chinese, and no knowledge of grammatical terminology and practices. Comparing Chinese grammar patterns and rules with those of English, and illustrated with a wealth of real-life examples, it allows learners to understand t...
Recent studies suggest an increasing interest in detecting lexical semantic changes in the context of distributional semantics. However, most proposals have been implemented with English datasets but not much with Chinese data. This paper thus presents an exploratory study using the popular Skip-gram models and post-processing operations to obtain...
In this study, we examine the variations in the alternative pattern of light verb construction between Taiwan and Mainland Mandarin, based on a statistical approach using large-scale comparable corpora. The results show that these two variants display significant differences in preference for introducing the theme of the taken complement of the light ver...
This research aims to explore the possibility of extracting language concepts and semantic associates through the diachronic construction of cross-linguistic knowledge resources based on the framework of an existing fine-grained ontological system developed by the Sinica BOW Tang 300 poems project. The authors attempt to provide language educators...
This study draws on corpus methodology to investigate people’s reactions to COVID-19 vaccination using the data of Macau netizens’ comments on a YouTube channel. Four main topics under discussion were identified based on the word lists. Meanwhile, people were concerned about the activity of vaccines and were also engaged in heated debates on both d...
Gender and language and gendered language are two important topics where linguistic studies have great societal impacts. These two topics have not been well studied in Mandarin Chinese due to its lack of grammatical gender marking. We explore the gendered usage of Mandarin Chinese sentence-final particle (SFP) in casual conversation to address this...
The use of transitive and intransitive verbs in Chinese grammar is introduced in this chapter. In particular, separable verbs (e.g., verb-object compounds), as a unique Chinese construction, are described in detail. In addition, special attention is paid to the expression of the temporal features of activities by suffixing the verbal aspects 了 le,...
The proliferation of COVID-19 fake news on social media poses a severe threat to the health information ecosystem. We show that affective computing can make significant contributions to combat this infodemic. Given that fake news is often presented with emotional appeals, we propose a new perspective on the role of emotion in the attitudes, percept...
The present paper explores the synchronic variations and diachronic changes in political discourses in Hong Kong (HK) and in the Mainland of the People’s Republic of China (PRC). The relationships between the lengths of linguistic constructs and their immediate constituents (including sentences and clauses, and clauses and words) are fitted using the function y...