Yinhe Zheng

Yinhe Zheng
Alibaba Group · Conversational AI Group

Doctor of Engineering
Collaboration Welcomed

About

62
Publications
4,146
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
565
Citations
Introduction
Working in NLP, especially the topics related to Dialogue systems.
Additional affiliations
January 2019 - present
Tsinghua University
Position
  • PostDoc Position
Description
  • Natural language processing; Dialogue system
September 2012 - July 2017
China University of Geosciences (Beijing)
Position
  • PhD
Description
  • Engineering computing

Publications

Publications (62)
Article
Full-text available
In traditional block theory, the removability and stability of rock blocks are analyzed independently; that is, the stability of a removable block is analyzed in detail, and nonremovable blocks are regarded as stable. However, in practical situations, nonremovable blocks may pose more danger than removable blocks. This paper presents a unified meth...
Article
This paper presents a block identification method for jointed rock masses that is based on exact arithmetic. Block structures are constructed accurately without introducing rounding errors, and therefore the robustness of the block identification algorithm is guaranteed. A rational number type is defined and the basic arithmetic operations of the r...
Preprint
Endowing dialogue systems with personas is essential to deliver more human-like conversations. However, this problem is still far from well explored due to the difficulties of both embodying personalities in natural languages and the persona sparsity issue observed in most dialogue corpora. This paper proposes a pre-training based personalized dial...
Article
Natural Language Understanding (NLU) is a vital component of dialogue systems, and its ability to detect Out-of-Domain (OOD) inputs is critical in practical applications, since the acceptance of the OOD input that is unsupported by the current system may lead to catastrophic failure. However, most existing OOD detection methods rely heavily on manu...
Preprint
Full-text available
With the emergence of increasingly powerful large language models, there is a burgeoning interest in leveraging these models for casual conversation and role-play applications. However, existing conversational and role-playing datasets often fail to capture the diverse and nuanced interactions typically exhibited by real-world role-play participant...
Preprint
Real-world data often have an open long-tailed distribution, and building a unified QA model supporting various tasks is vital for practical QA applications. However, it is non-trivial to extend previous QA approaches since they either require access to seen tasks of adequate samples or do not explicitly model samples from unseen tasks. In this pap...
Preprint
Lifelong learning (LL) is an important ability for NLP models to learn new tasks continuously. Architecture-based approaches are reported to be effective implementations for LL models. However, it is non-trivial to extend previous approaches to domain incremental LL scenarios since they either require access to task identities in the testing phase...
Preprint
Out-of-distribution (OOD) detection is essential for the reliable and safe deployment of machine learning systems in the real world. Great progress has been made over the past years. This paper presents the first review of recent advances in OOD detection with a particular focus on natural language processing approaches. First, we provide a formal...
Preprint
Out-of-Domain (OOD) intent detection is vital for practical dialogue systems, and it usually requires considering multi-turn dialogue contexts. However, most previous OOD intent detection approaches are limited to single dialogue turns. In this paper, we introduce a context-aware OOD intent detection (Caro) framework to model multi-turn contexts in...
Article
Full-text available
Introduction: The growing demand for mental health support has highlighted the importance of conversational agents as human supporters worldwide and in China. These agents could increase availability and reduce the relative costs of mental health support. The provided support can be divided into two main types: cognitive and emotional. Existing wo...
Preprint
Full-text available
Empathetic dialogue is a human-like behavior that requires the perception of both affective factors (e.g., emotion status) and cognitive factors (e.g., cause of the emotion). Besides concerning emotion status in early work, the latest approaches study emotion causes in empathetic dialogue. These approaches focus on understanding and duplicating emo...
Preprint
Lifelong learning aims to accumulate knowledge and alleviate catastrophic forgetting when learning tasks sequentially. However, existing lifelong language learning methods only focus on the supervised learning setting. Unlabeled data, which can be easily accessed in real-world scenarios, are underexplored. In this paper, we explore a novel setting,...
Preprint
Full-text available
Out-of-Domain (OOD) intent detection is important for practical dialog systems. To alleviate the issue of lacking OOD training samples, some works propose synthesizing pseudo OOD samples and directly assigning one-hot OOD labels to these pseudo samples. However, these one-hot labels introduce noises to the training process because some hard pseudo...
Preprint
Dialogue contradiction is a critical issue in open-domain dialogue systems. The contextualization nature of conversations makes dialogue contradiction detection rather challenging. In this work, we propose a benchmark for Contradiction Detection in Chinese Conversations, namely CDConv. It contains 12K multi-turn conversations annotated with three t...
Preprint
Full-text available
Lifelong learning (LL) is vital for advanced task-oriented dialogue (ToD) systems. To address the catastrophic forgetting issue of LL, generative replay methods are widely employed to consolidate past knowledge with generated pseudo samples. However, most existing generative replay methods use only a single task-specific token to control their mode...
Preprint
Full-text available
The growing demand for mental health support has highlighted the importance of conversational agents as human supporters worldwide and in China. These agents could increase availability and reduce the relative costs of mental health support. The provided support can be divided into two main types: cognitive and emotional support. Existing work on t...
Preprint
QA models with lifelong learning (LL) abilities are important for practical QA applications, and architecture-based LL methods are reported to be an effective implementation for these models. However, it is non-trivial to extend previous approaches to QA tasks since they either require access to task identities in the testing phase or do not explic...
Article
Pre-trained models have proved to be powerful in enhancing task-oriented dialog systems. However, current pre-training methods mainly focus on enhancing dialog understanding and generation tasks while neglecting the exploitation of dialog policy. In this paper, we propose GALAXY, a novel pre-trained dialog model that explicitly learns dialog policy...
Preprint
Full-text available
We have noticed that Marek et al. (2021) try to re-implement our paper Zheng et al. (2020a) in their work "OodGAN: Generative Adversarial Network for Out-of-Domain Data Generation". Our paper proposes a model to generate pseudo OOD samples that are akin to IN-Domain (IND) input utterances. These pseudo OOD samples can be used to improve the OOD det...
Preprint
Full-text available
Many studies on dialog emotion analysis focus on utterance-level emotion only. These models hence are not optimized for dialog-level emotion detection, i.e. to predict the emotion category of a dialog as a whole. More importantly, these models cannot benefit from the context provided by the whole dialog. In real-world applications, annotations to d...
Preprint
Building models of natural language processing (NLP) is challenging in low-resource scenarios where only limited data are available. Optimization-based meta-learning algorithms achieve promising results in low-resource scenarios by adapting a well-generalized model initialization to handle new tasks. Nonetheless, these approaches suffer from the me...
Preprint
Distinct is a widely used automatic metric for evaluating the diversity of language generation tasks. However, we observe that the original approach to calculating distinct scores has evident biases that tend to add higher penalties to longer sequences. In this paper, we refine the calculation of distinct scores by re-scaling the number of distinct...
Preprint
Full-text available
Pre-trained models have proved to be powerful in enhancing task-oriented dialog systems. However, current pre-training methods mainly focus on enhancing dialog understanding and generation tasks while neglecting the exploitation of dialog policy. In this paper, we propose GALAXY, a novel pre-trained dialog model that explicitly learns dialog policy...
Article
This paper proposes a dual-supervised uncertainty inference (DS-UI) framework for improving Bayesian estimation-based UI in DNN-based image recognition. In the DS-UI, we combine the classifier of a DNN, i.e. , the last fully-connected (FC) layer, with a mixture of Gaussian mixture models (MoGMM) to obtain an MoGMM-FC layer. Unlike existing UI met...
Preprint
Unsupervised domain adaptation (UDA) with pre-trained language models (PrLM) has achieved promising results since these pre-trained models embed generic knowledge learned from various domains. However, fine-tuning all the parameters of the PrLM on a small domain-specific corpus distort the learned generic knowledge, and it is also expensive to depl...
Preprint
Full-text available
Grounded dialogue models generate responses that are grounded on certain concepts. Limited by the distribution of grounded dialogue data, models trained on such data face the transferability challenges in terms of the data distribution and the type of grounded concepts. To address the challenges, we propose the grounded minimal editing framework, w...
Preprint
Full-text available
Incorporating multi-modal contexts in conversation is an important step for developing more engaging dialogue systems. In this work, we explore this direction by introducing MMChat: a large scale multi-modal dialogue corpus (32.4M raw dialogues and 120.84K filtered dialogues). Unlike previous corpora that are crowd-sourced or collected from fictiti...
Preprint
Full-text available
Although pre-trained language models have remarkably enhanced the generation ability of dialogue systems, open-domain Chinese dialogue systems are still limited by the dialogue data and the model size compared with English ones. In this paper, we propose EVA, a Chinese dialogue system that contains the largest Chinese pre-trained dialogue model wit...
Preprint
Full-text available
In this paper, we propose to combine pretrained language models with the modular dialogue paradigm for open-domain dialogue modeling. Our method, semantic-enhanced finetuning, instantiates conversation understanding, planning, and response generation as a language model finetuning task. At inference, we disentangle semantic and token variations by...
Preprint
Full-text available
Neural dialogue generation models trained with the one-hot target distribution suffer from the over-confidence issue, which leads to poor generation diversity as widely reported in the literature. Although existing approaches such as label smoothing can alleviate this issue, they fail to adapt to diverse dialog contexts. In this paper, we propose a...
Article
Generating stylized responses is essential to build intelligent and engaging dialogue systems. However, this task is far from well-explored due to the difficulties of rendering a particular style in coherent responses, especially when the target style is embedded only in unpaired texts that cannot be directly used to train the dialogue model. This...
Preprint
Personalised response generation enables generating human-like responses by means of assigning the generator a social identity. However, pragmatics theory suggests that human beings adjust the way of speaking based on not only who they are but also whom they are talking to. In other words, when modelling personalised dialogues, it might be favourab...
Chapter
The advancements of neural dialogue generation models show promising results on modeling short-text conversations. However, training such models usually needs a large-scale high-quality dialogue corpus, which is hard to access. In this paper, we present a large-scale cleaned Chinese conversation dataset LCCC, which contains a base version (6.8 mill...
Preprint
Generating stylized responses is essential to build intelligent and engaging dialogue systems. However, this task is far from well-explored due to the difficulties of rendering a particular style in coherent responses, especially when the target style is embedded only in unpaired texts that cannot be directly used to train the dialogue model. This...
Preprint
Recent advances in open-domain dialogue systems rely on the success of neural models that are trained on large-scale data. However, collecting large-scale dialogue data is usually time-consuming and labor-intensive. To address this data dilemma, we propose a novel data augmentation method for training open-domain dialogue models by utilizing unpair...
Preprint
The advancements of neural dialogue generation models show promising results on modeling short-text conversations. However, training such models usually needs a large-scale high-quality dialogue corpus, which is hard to access. In this paper, we present a large-scale cleaned Chinese conversation dataset, LCCC, which contains a base version (6.8mill...
Article
Endowing dialogue systems with personas is essential to deliver more human-like conversations. However, this problem is still far from well explored due to the difficulties of both embodying personalities in natural languages and the persona sparsity issue observed in most dialogue corpora. This paper proposes a pre-training based personalized dial...
Preprint
In natural language understanding components, detecting out-of-domain (OOD) inputs is important for dialogue systems since wrongly accepting these OOD utterances that are not currently supported may lead to catastrophic failures of the entire system. Entropy regularization is an effective solution to avoid such failures, however, its computation he...
Article
Analyzing the stability of rock blocks is important in rock engineering. Most block analysis methods assume the blocks are formed by planar fractures whereas the stability of blocks formed by curved fractures are rarely investigated. However, large fractures observed in field are often curved. This paper presents a method for analyzing the stabilit...
Article
Full-text available
Determining rock‐block structures is important for modeling jointed rock masses. This paper presents a method to construct the boundary structures of complex blocks by eliminating fictitious surfaces. The planes involved in forming a complex block are retrieved, and the boundary polygons on each plane are determined using a series of Boolean operat...
Preprint
Full-text available
Endowing a dialogue system with particular personality traits is essential to deliver more human-like conversations. However, due to the challenge of embodying personality via language expression and the lack of large-scale persona-labeled dialogue data, this research problem is still far from well-studied. In this paper, we investigate the problem...
Article
Since the concept of representative elementary volume (REV) of fractured rock masses is fundamental and significant in rock mechanics, it is worth studying by researchers. This paper critically reviews the latest research achievements in the field of REV of fractured rock masses. Three main aspects are discussed, which are research viewpoints and i...
Article
The removability and stability analysis of rock block are carried out independently in the classical block theory which tacitly approves the nonremovable block being absolutely stable and thus skips the stability analysis. However, in the practical engineering project, the "rock bridge" preventing the movement of rock block may crack, resulting the...
Article
Three-dimensional(3D) fracture connectivity, which can comprehensively reflect the fracture features, is a key parameter for evaluating the stability of engineering rock masses. For the purpose of determining a representative 3D connectivity value, the scale effect of 3D connectivity is investigated based on projection method and the representative...

Network

Cited By