Sheng Gao

  • Beijing University of Posts and Telecommunications

About

78
Publications
8,231
Reads
1,263
Citations
Current institution
Beijing University of Posts and Telecommunications

Publications (78)
Article
Full-text available
In zero-shot relation extraction, existing methods usually learn semantic features from seen relations to infer unseen relations. However, because there is no instance of unseen relation that can be used for training, it is still a challenge for the existing models to learn the semantic gap between seen relations and unseen relations, resulting in...
Article
In this work, we show that coreference resolution and dropped pronoun recovery are two strongly related tasks in Chinese conversations, as recovering a dropped pronoun first requires exploring the referent of the pronoun. Meanwhile, the omitted entity mention should be recovered before its coreferences are resolved. This motivates us to propo...
Chapter
Privacy protection is an essential issue in biomedical natural language processing (BioNLP). Recently, some researchers apply federated learning (FL) in BioNLP to protect the privacy of biomedical data. However, their methods are only applicable for small NLP models, whose effectiveness is heavily limited in processing biomedical data. In this pape...
Preprint
In this paper, we present a neural model for joint dropped pronoun recovery (DPR) and conversational discourse parsing (CDP) in Chinese conversational speech. We show that DPR and CDP are closely related, and a joint model benefits both tasks. We refer to our model as DiscProReco, and it first encodes the tokens in each utterance in a conversation...
Article
Relation extraction has been an active research interest in the field of Natural Language Processing (NLP). The past works primarily focused on a corpus of formal text which is inherently non-dialogic. Recently, the dialogue-based relation extraction task, which detects relations among speaker-aware entities scattering in dialogues, has been gradua...
Preprint
Pronouns are often dropped in Chinese conversations and recovering the dropped pronouns is important for NLP applications such as Machine Translation. Existing approaches usually formulate this as a sequence labeling task of predicting whether there is a dropped pronoun before each token and its type. Each utterance is considered to be a sequence a...
Article
Full-text available
Medication recommendation based on Electronic Health Records (EHRs) is an important research direction, which aims to make prescription recommendations according to EHRs of patients. Most existing methods either only make recommendation through EHRs of the current admission while ignoring the patient’s historical records, or fail to fully consider...
Article
Relation classification is an important semantic processing task in the field of Natural Language Processing (NLP). The past works mainly focused on binary relations in a single sentence. Recently, cross-sentence N-ary relation classification, which detects relations among n entities across multiple sentences, has been arousing people’s interests....
Article
Outline extraction has been widely applied in online consultation to help experts quickly understand individual cases. Given a specific case described as unstructured plain text, outline extraction aims to make a summary for this case by answering a set of questions, which in fact is a new type of machine reading comprehension task. Inspired by a r...
Chapter
Dialogue State Tracking is one of the most important components in task-oriented dialog systems, which can update the dialogue state and accurately estimate the compact state representation. This paper introduces a multi-level feature combination to capture correlation on the basis of the entire dialog and slots. Based on the Memory-Network [1], whic...
Chapter
In this paper, we study the problem of multi-choice reading comprehension, which requires a machine to select the correct answer from a set of candidates based on the given passage and question. Most existing approaches focus on designing sophisticated attention to model the interactions of the sequence triplets (passage, question and candidate opt...
Chapter
Dropped pronoun recovery, which aims to detect the type of pronoun dropped before each token, plays a vital role in many applications such as Machine Translation and Information Extraction. Recently, deep neural networks have been applied to this task. Though promising improvements have been observed, these methods recover dropped pronouns from the...
Article
Full-text available
When people do the reading comprehension, they often try to find the words from the passages which are similar to the question words first. Then people deduce the answer based on the context around these similar words. Therefore, the position information may be helpful in finding the answer rapidly and is useful for reading comprehension. However,...
Conference Paper
In this paper we propose a novel reinforcement learning based model for named entity recognition (NER), referred to as MM-NER. Inspired by the methodology of the AlphaGo Zero, MM-NER formalizes the problem of named entity recognition with a Monte-Carlo tree search (MCTS) enhanced Markov decision process (MDP) model, in which the time steps correspo...
Preprint
Full-text available
Pronouns are often dropped in Chinese sentences, and this happens more frequently in conversational genres as their referents can be easily understood from context. Recovering dropped pronouns is essential to applications such as Information Extraction where the referents of these dropped pronouns need to be resolved, or Machine Translation when Ch...
Preprint
Next basket recommendation, which aims to predict the next few items that a user will most probably purchase given their historical transactions, plays a vital role in market basket analysis. From the viewpoint of an item, an item could be purchased by different users together with different items, for different reasons. Therefore, an ideal recommender sy...
Article
Full-text available
One of the major challenges in building a task-oriented dialogue system is that dialogue state transition frequently happens between multiple domains such as booking hotels or restaurants. Recently, the encoder-decoder model based on the end-to-end neural network has become an attractive approach to meet this challenge. However, it usually requires a s...
Conference Paper
Convolutional Neural Networks (CNNs) are widely used in NLP tasks for their powerful ability to capture n-gram features. An effective way to improve CNN performance is to incorporate prior knowledge or external resources like topic distribution. This paper considers the sentence classification problem as a semantic matching problem and proposes a topic m...
Article
Full-text available
Relation classification is a crucial ingredient in numerous information extraction systems and has attracted a great deal of attention in recent years. Traditional approaches largely rely on feature engineering and suffer from the limitations of domain adaption and the error propagation. To overcome the above problems, many deep neural-network-base...
Preprint
Dropout is used to avoid overfitting by randomly dropping units from the neural networks during training. Inspired by dropout, this paper presents GI-Dropout, a novel dropout method integrating with global information to improve neural networks for text classification. Unlike the traditional dropout method in which the units are dropped randomly ac...
Chapter
Visual relation, such as “person holds dog” is an effective semantic unit for image understanding, as well as a bridge to connect computer vision and natural language. Recent work has been proposed to extract the object features in the image with the aid of respective textual description. However, very little work has been done to combine the multi...
Preprint
In this paper we propose a novel reinforcement learning based model for sequence tagging, referred to as MM-Tag. Inspired by the success and methodology of the AlphaGo Zero, MM-Tag formalizes the problem of sequence tagging with a Monte Carlo tree search (MCTS) enhanced Markov decision process (MDP) model, in which the time steps correspond to the...
Article
Full-text available
In this paper, we address the problem of Entity Linking (EL) as aligning a textual mention to the referent entity in a knowledge base (e.g., Freebase). Most previous studies on EL mainly focus on designing various feature representations for the mentions and entities. However, these handcrafted features often ignore the internal meanings of words o...
Conference Paper
Image retrieval based on deep hashing methods has attracted more and more attention from both academia and industry, due to the outstanding performance of deep neural networks in various computer vision tasks. However, most of the hashing methods are designed to learn simple similarity only for single-label image retrieval, thus cannot work well...
Article
Traditional recommender systems employ a history of item preferences by a set of users for recommending items of interest to a given user. Matrix factorization based models have achieved the state-of-the-art success in the personal recommendation tasks by aiming at predicting ratings through learning latent factors of users and items via the rating...
Article
Knowledge graph (KG) embedding aims at learning the latent semantic representations for entities and relations. However, most existing approaches can only be applied to KG completion, so cannot identify relations including unseen entities (or Out-of-KG entities). In this paper, motivated by the zero-shot learning, we propose a novel model, namely J...
Article
The increasing volume of data available on e-commerce websites inevitably leads to the information overload problem. Most existing approaches have been proposed to recommend personally significant and interesting items to users on e-commerce websites, by estimating the unknown rating that a user may give an unrated item, i.e., rating prediction. However,...
Article
Electronic Health Records (EHRs) refer to a collection of patient data, including diagnosis, medical history, medication, allergies, etc., mostly contained in the form of unstructured text. EHRs are designed to capture the state of a patient over time, thus the temporal information is crucial. Most previous works processing time in EHRs narrative f...
Article
With the increase of social networking websites and the interaction frequency among users, the prediction of information diffusion is required to support effective generalization and efficient inference in the context of social big data era. However, the existing models either rely on expensive probabilistic modeling of information diffusion based...
Article
Full-text available
Knowledge base is a very important database for knowledge management, which is very useful for Question Answering, Query Expansion and other AI tasks. However, due to the fast-growing knowledge on the web and not all common knowledge expressed in the text is explicit, the knowledge base always suffer from incompleteness. Recently many researchers are try...
Article
Recommender systems have been recognized as a superior way of solving the personal information overload problem. Rating, as an evaluation criterion revealing how much a customer likes a product, has been a foundation of recommender systems for a long period based on the popular latent factor models. However, review texts as the valuable user generated co...
Conference Paper
The mobile Internet brings tremendous opportunities for researchers to analyze user mobility pattern, which is of great importance for Internet Service Providers (ISP) to provide better location-based services. This paper focuses on predicting user mobility patterns based on their different mobility characteristics. For that, we collect real-world...
Conference Paper
Full-text available
AskStory is a company providing an e-recruitment service where job seekers find a variety of job openings. This paper discusses an approach to recommending job openings attractive to job seekers.
Article
Traditionally, pattern-based relation extraction methods are usually based on iterative bootstrapping model which generally implies semantic drift or low recall problem. In this paper, we present a novel semantic bootstrapping framework that uses semantic information of patterns and flexible match method to address such problem. We introduce formal...
Article
A knowledge base of triples like (subject entity, predicate relation, object entity) is a very important resource for knowledge management. It is very useful for human-like reasoning, query expansion, question answering (Siri) and other related AI tasks. However, such a knowledge base often suffers from incompleteness due to a large volume of increa...
Article
We propose VecLP, a novel Internet Video recommendation system working for Live TV Programs in this paper. Given little information on the live TV programs, our proposed VecLP system can effectively collect necessary information on both the programs and the subscribers as well as a large volume of related online videos, and then recommend the relev...
Article
Cross-domain recommendation has been proposed to transfer user behavior pattern by pooling together the rating data from multiple domains to alleviate the sparsity problem appearing in single rating domains. However, previous models only assume that multiple domains share a latent common rating pattern based on the user-item co-clustering. To captu...
Article
Measuring the similarity of patterns is the key in pattern-based approaches in relation extraction. Most existing methods generally rely on inflexible pattern similarity measurements which often lead to low recall. In this work, a novel kernel-based model is proposed to address this problem. Depending on the pattern similarities produced by our bot...
Article
Full-text available
Cross-domain recommendation has been proposed to transfer user behavior pattern by pooling together the rating data from multiple domains to alleviate the sparsity problem appearing in single rating domains. However, previous models only assume that multiple domains share a latent common rating pattern based on the user-item co-clustering. To captu...
Article
Full-text available
In Bayesian analysis of a statistical model, the predictive distribution is obtained by marginalizing over the parameters with their posterior distributions. Compared to the frequently used point estimate plug-in method, the predictive distribution leads to a more reliable result in calculating the predictive likelihood of the new upcoming data, es...
Article
Full-text available
Cyber-physical systems (CPS) are often characterized as smart systems, which intelligently interact with other systems across information and physical interfaces. An increased dependence on CPS led to the collection of a vast amount of human-centric data, which brings the information overload problem across multiple domains. Recommender systems in...
Conference Paper
Recommender systems always aim to provide recommendations for a user based on historical ratings collected from a single domain (e.g., movies or books) only, which may suffer from the data sparsity problem. Recently, several recommendation models have been proposed to transfer knowledge by pooling together the rating data from multiple domains to a...
Conference Paper
In this paper we address the problem of modelling relational data, which has appeared in many applications such as social network analysis, recommender systems and bioinformatics. Previous studies either consider latent feature based models to do link prediction in the relational data but disregarding local structure in the network, or focus exclus...
Conference Paper
With the growth of online social networks, people always form friendship networks among their social neighborhood, and also associate themselves with circles or communities due to their common interest. Thus there are two related networks: the friendship network among users as well as the affiliation network between users and circles. In th...
Conference Paper
Full-text available
In this paper we address the problem of link prediction in networked data, which appears in many applications such as social network analysis or recommender systems. Previous studies either consider latent feature based models but disregarding local structure in the network, or focus exclusively on capturing local structure of objects based on late...
Article
Full-text available
In this paper we address the problem of modeling relational data, which appear in many applications such as social network analysis, recommender systems and bioinformatics. Previous studies either consider latent feature based models but disregarding local structure in the network, or focus exclusively on capturing local structure of objects based...
Article
This paper aims at the problem of link pattern prediction in collections of objects connected by multiple relation types, where each type may play a distinct role. While common link analysis models are limited to single-type link prediction, we attempt here to capture the correlations among different relation types and reveal the impact of various...
Conference Paper
In this paper we address the problem of temporal link prediction, i.e., predicting the appearance of new links, in time-evolving networks. This problem appears in applications such as recommender systems, social network analysis or citation analysis. Link prediction in time-evolving networks is usually based on the topological structure of the netw...
Conference Paper
We address the problem of link prediction in collections of objects connected by multiple relation types, where each type may play a distinct role. While traditional link prediction models are limited to single-type link prediction we attempt here to jointly model and predict the multiple relation types, which we refer to as the Link Pattern Predic...
Conference Paper
Many real-world datasets can be considered as a linked collection of objects with multi-type relations, where each type of relations may play a distinct role. In this paper, we address the problem of link prediction in such multi-relational networks. While traditional link prediction methods are limited to single-type link prediction we attempt her...
