Zhiliang Tian's research while affiliated with The Hong Kong University of Science and Technology and other places

Publications (17)

Preprint
Lifelong learning aims to accumulate knowledge and alleviate catastrophic forgetting when learning tasks sequentially. However, existing lifelong language learning methods only focus on the supervised learning setting. Unlabeled data, which can be easily accessed in real-world scenarios, are underexplored. In this paper, we explore a novel setting,...
Preprint
In knowledge distillation, a student model is trained with supervision from both the knowledge of a teacher and observations drawn from a training data distribution. The teacher's knowledge is assumed to encode inter-class relations that provide meaningful supervision to the student; hence, much effort has been put into finding such knowledge...
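As a point of reference, below is a minimal sketch of the classic distillation objective (Hinton et al., 2015), which mixes softened teacher targets with hard labels; the temperature T and weight alpha are illustrative defaults, and this is the standard formulation rather than the specific method of the preprint.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Classic distillation loss: soft teacher targets + hard labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale by T^2 so gradients stay comparable across temperatures
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# toy usage with random logits and labels
s, t = torch.randn(4, 10), torch.randn(4, 10)
loss = kd_loss(s, t, torch.randint(0, 10, (4,)))
```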
Preprint
Full-text available
Lifelong learning (LL) is vital for advanced task-oriented dialogue (ToD) systems. To address the catastrophic forgetting issue of LL, generative replay methods are widely employed to consolidate past knowledge with generated pseudo samples. However, most existing generative replay methods use only a single task-specific token to control their mode...
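To make the replay mechanism concrete, here is a toy sketch of generative replay, with a per-task Gaussian standing in for the learned pseudo-sample generator; a real ToD system would instead generate pseudo dialogues with the language model itself, and all names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
generators = {}  # task id -> (mean, cov): toy generative model per task

def fit_generator(task_id, X):
    # stand-in for training a pseudo-sample generator on the task's data
    generators[task_id] = (X.mean(axis=0), np.cov(X.T))

def pseudo_samples(task_id, n):
    mean, cov = generators[task_id]
    return rng.multivariate_normal(mean, cov, size=n)

for t in range(3):                                       # tasks arrive sequentially
    X_t = rng.normal(loc=t, scale=0.5, size=(200, 2))    # current task data
    replay = [pseudo_samples(s, 50) for s in range(t)]   # replay earlier tasks
    X_train = np.vstack([X_t, *replay]) if replay else X_t
    # ... train the model on X_train to consolidate old and new knowledge ...
    fit_generator(t, X_t)
```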
Chapter
Face-to-face communication leads to better interactions between speakers than text-to-text conversation, since the speakers can capture both textual and visual signals. The image-grounded emotional response generation (IgERG) task requires chatbots to generate a response with an understanding of both the textual context and the speakers' emotions in visual...
Preprint
Building natural language processing (NLP) models is challenging in low-resource scenarios where only limited data are available. Optimization-based meta-learning algorithms achieve promising results in low-resource scenarios by adapting a well-generalized model initialization to handle new tasks. Nonetheless, these approaches suffer from the me...
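For intuition, here is a minimal first-order MAML loop on a toy family of 1-D regression tasks; the learning rates and the task distribution are illustrative, and this shows only the generic adapt-then-update pattern, not this preprint's method.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task():
    # toy task: y = a*x + b with task-specific (a, b); bias folded into X
    a, b = rng.uniform(-2, 2, size=2)
    x = rng.uniform(-1, 1, size=16)
    X = np.stack([x, np.ones(16)], axis=1)
    return X, a * x + b

def grad(w, X, y):
    # gradient of mean squared error for the linear model X @ w
    return 2 * X.T @ (X @ w - y) / len(y)

w_meta = np.zeros(2)      # shared initialization being meta-learned
alpha, beta = 0.1, 0.01   # inner (adaptation) and outer (meta) step sizes

for step in range(2000):
    X, y = make_task()
    w_task = w_meta - alpha * grad(w_meta, X, y)  # adapt to the sampled task
    w_meta -= beta * grad(w_task, X, y)           # first-order meta-update
```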
Preprint
Full-text available
Text style transfer aims to alter the style (e.g., sentiment) of a sentence while preserving its content. A common approach is to map a given sentence to a content representation that is free of style; the content representation is then fed to a decoder together with a target style. Previous methods that filter style completely remove tokens with style at the...
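A toy illustration of the filter-then-decode pipeline the abstract describes, with a hand-written lexicon and a lookup table standing in for the learned style filter and style-conditioned decoder; it also shows how complete removal discards the style tokens entirely.

```python
# hypothetical stand-ins for learned components
STYLE_TOKENS = {"awful", "terrible", "great", "wonderful"}
TARGET_TOKEN = {"positive": "great", "negative": "awful"}

def filter_style(sentence):
    # map the sentence to a style-free content representation (token list)
    return [t for t in sentence.split() if t.lower() not in STYLE_TOKENS]

def restyle(sentence, target_style):
    # "decode" the content with a token of the target style appended
    return " ".join(filter_style(sentence) + [TARGET_TOKEN[target_style]])

print(restyle("the food was awful", "positive"))  # -> "the food was great"
```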
Preprint
Full-text available
Personalized conversation models (PCMs) generate responses according to speaker preferences. Existing personalized conversation tasks typically require models to extract speaker preferences from user descriptions or their conversation histories, which are scarce for newcomers and inactive users. In this paper, we propose a few-shot personalized con...
Article
Personalized conversation models (PCMs) generate responses according to speaker preferences. Existing personalized conversation tasks typically require models to extract speaker preferences from user descriptions or their conversation histories, which are scarce for newcomers and inactive users. In this paper, we propose a few-shot personalized con...
Preprint
Neural conversation models are known to generate appropriate but non-informative responses in general. A scenario where informativeness can be significantly enhanced is Conversing by Reading (CbR), where conversations take place with respect to a given external document. In previous work, the external document is utilized by (1) creating a context-...
Chapter
Despite the popularity of deep learning, structure learning for deep models remains a relatively under-explored area. In contrast, structure learning has been studied extensively for probabilistic graphical models (PGMs). In particular, an efficient algorithm has been developed for learning a class of tree-structured PGMs called hierarchical latent...

Citations

... AutoCite jointly learns citation recommendation and context generation based on paper representations encoded with both the citation-graph structure and textual contexts. In addition to citation graphs, one approach (Tian et al. 2021) utilizes social networks for the few-shot personalized conversation task and improves generation performance. ...
... We have not investigated lifelong learning in the low-resource setting where only limited labeled data are available. In future work, we will consider combining PCLL with meta-learning (Zhao et al., 2022a) to extend our framework to a lifelong few-shot learning setting. We will also extend previous approaches that use unlabeled data (Zhang et al., 2020a) to build lifelong learning dialogue models. ...
... Style transfer is a popular task in natural language processing, referring to the process of converting a given sentence with a certain style (e.g., sentiment) into another style while retaining the original content (Shen et al., 2017; Fu et al., 2018; John et al., 2018; Lee et al., 2021). Recently, deep learning has become the dominant approach to text style transfer. ...
... As human conversations are almost always grounded in external knowledge, the absence of knowledge grounding has become one of the major gaps between current open-domain dialog systems and real human conversations [8, 24, 35]. A series of works [20, 29] focused on generating a response based on the interaction between the context and unstructured document knowledge, while a few others [22, 33] introduced knowledge graphs into conversations. These models, however, usually underperform in low-resource settings. ...
... Memory networks for open-domain dialogue systems. Tian et al. (2019) proposed a knowledge-grounded chit-chat system. A memory network was used to store query-response pairs, and at the response generation stage, the generator produced the response conditioned on both the input query and the memory pairs. ...
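A minimal sketch of the memory read described above, assuming dot-product addressing over stored query-response embeddings; the random vectors stand in for a trained encoder, and the cited system's exact scoring may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
mem_q = rng.normal(size=(100, 64))  # embeddings of stored queries
mem_r = rng.normal(size=(100, 64))  # embeddings of their responses

def read_memory(query_vec):
    # attend over stored queries, then blend the paired responses
    scores = mem_q @ query_vec
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ mem_r  # memory summary fed to the generator

memory_context = read_memory(rng.normal(size=64))
```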
... 2. Diversity. Dist-n evaluates the proportion of distinct n-grams in the generated responses (Li et al. 2016a; Song et al. 2017). 3. Consistency. ...
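One common implementation of Dist-n, taken as the ratio of distinct n-grams to all n-grams in the generated responses:

```python
def dist_n(responses, n):
    """Dist-n: distinct n-grams / total n-grams over generated responses."""
    ngrams = [tuple(toks[i:i + n])
              for toks in (r.split() for r in responses)
              for i in range(len(toks) - n + 1)]
    return len(set(ngrams)) / max(len(ngrams), 1)

print(dist_n(["i am fine", "i am good"], 2))  # 3 distinct of 4 bigrams -> 0.75
```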
... Serban et al. [28] construct a conversational system that maps a context to a distributed vector encoding its meaning and then generates a response from that vector. Yan et al. [29] view the hierarchical modelling of contextual information as a recurrent encoding process. In addition, a weighted sequence (WSeq) attention model for HRED has been proposed, which explicitly weights context vectors according to their relevance to the query. ...
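A sketch in the spirit of the WSeq weighting, assuming cosine-similarity scores normalized with a softmax; the published model's exact scoring and normalization may differ.

```python
import numpy as np

def weight_contexts(context_vecs, query_vec):
    # score each context utterance by cosine similarity to the query
    sims = context_vecs @ query_vec / (
        np.linalg.norm(context_vecs, axis=1) * np.linalg.norm(query_vec))
    weights = np.exp(sims) / np.exp(sims).sum()  # softmax over utterances
    return weights @ context_vecs                # relevance-weighted context

rng = np.random.default_rng(0)
ctx = rng.normal(size=(5, 32))   # 5 encoded context utterances
summary = weight_contexts(ctx, rng.normal(size=32))
```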