About
174
Publications
20,289
Reads
2,394
Citations
Introduction
Current institution
Beihang University
Publications (174)
In-Context Learning (ICL) empowers Large Language Models (LLMs) for rapid task adaptation without Fine-Tuning (FT), but its reliance on demonstration selection remains a critical challenge. While many-shot ICL shows promising performance through scaled demonstrations, the selection method for many-shot demonstrations remains limited to random selec...
The performance of Large Language Models (LLMs) is intrinsically linked to the quality of their training data. Although several studies have proposed methods for high-quality data selection, they do not consider the importance of knowledge richness in text corpora. In this paper, we propose a novel and gradient-free High-Knowledge Scorer (HKS) to sel...
Retrieval Question Answering (ReQA) is a pivotal task in biomedical natural language processing, where bi-encoders are a commonly employed solution due to their efficiency in retrieving answers from large candidate pools. However, bi-encoders fall short in capturing fine-grained interactions between questions and answers, a limitation that is eve...
Aligning Large Language Models (LLMs) with general human preferences has proved crucial in improving the interaction quality between LLMs and humans. However, human values are inherently diverse among different individuals, making it insufficient to align LLMs solely with general preferences. To address this, personalizing LLMs according to ind...
Recommender systems usually learn user interests from various user behaviors, including clicks and post-click behaviors (e.g., like and favorite, which reflect the true interests of users). However, these behaviors inevitably exhibit popularity bias, leading to some unfairness issues: 1) for items with similar quality, more popular ones get more ex...
Text ranking has witnessed significant advancements, attributed to the use of dual-encoders enhanced by Pre-trained Language Models (PLMs). Given the proliferation of available PLMs, selecting the most effective one for a given dataset has become a non-trivial challenge. As a promising alternative to human intuition and brute-force fine-tuni...
Knowledge Tracing (KT) aims to predict students’ future performances based on their former exercises and additional information in educational settings. KT has received much attention since it provides personalized experiences in educational situations. Simultaneously, the autoregressive modeling on the sequence of former exercises has been proven...
Self-attention, which allows transformers to capture deep bidirectional contexts, plays a vital role in BERT-like pre-trained language models. However, the maximum likelihood pre-training objective of BERT may produce an anisotropic word embedding space, which leads to biased attention scores for high-frequency tokens, as they are very close to eac...
Personalized education, tailored to individual student needs, leverages educational technology and artificial intelligence (AI) in the digital age to enhance learning effectiveness. The integration of AI in educational platforms provides insights into academic performance, learning preferences, and behaviors, optimizing the personal learning proces...
Collaborative filtering (CF) is an essential technique in recommender systems that provides personalized recommendations by only leveraging user-item interactions. However, most CF methods represent users and items as fixed points in the latent space, lacking the ability to capture uncertainty. While probabilistic embedding is proposed to integrat...
Knowledge Tracing (KT) aims to predict students' future performances based on their former exercises and additional information in educational settings. KT has received significant attention since it facilitates personalized experiences in educational situations. Simultaneously, the autoregressive modeling on the sequence of former exercises has be...
Knowledge Tracing (KT) aims to predict students’ future performances based on their former exercises and additional information in educational settings. KT has received much attention since it provides personalized experiences in educational situations. The autoregressive modeling on the sequence of former exercises has been proven effective for...
Equipped with Chain-of-Thought (CoT), Large language models (LLMs) have shown impressive reasoning ability in various downstream tasks. Even so, suffering from hallucinations and the inability to access external knowledge, LLMs often come with incorrect or unfaithful intermediate reasoning steps, especially in the context of answering knowledge-int...
Introducing knowledge graphs (KGs) into recommendation systems can improve their performance, while reinforcement learning (RL) methods can help utilize graph data for recommendation. We investigate existing RL-based methods for recommendation on KGs, and find that such approaches do not make full use of information from user reviews. Introducing u...
User interest modeling is crucial for personalized news recommendation. Existing personalized news recommendation methods usually take the news data as the minimum interest modeling unit when modeling users’ interests. They ignore the low-level and high-level signals from users’ behaviors. In this paper, we propose a news recommendation method com...
Contribution: In this study, an object tuple model has been proposed, and a quasi-experimental study on its usage in an introductory programming language course has been reported. This work can be adopted by all C language teachers and students in lear...
Distractor generation is one of the most important and challenging tasks in the automatic generation of multiple choice questions. Previous studies usually use a few ground truth distractors as training samples, which ignores more potential usable distractors, where the strong generation ability of deep learning models might not be fully utilized....
Limited by the corpus size and the annotation cost, biomedical question answering (BioQA) is a task of great research value. To generate professional biomedical answers, we first propose a text-to-text multi-task question generation model, which improves the accuracy of domain question generation with two auxiliary tasks. Based on this, a multi-tas...
The long-standing one-to-many issue of open-domain dialogues poses significant challenges for automatic evaluation methods, i.e., there may be multiple suitable responses which differ in semantics for a given conversational context. To tackle this challenge, we propose a novel learning-based automatic evaluation metric (CMN), which can robust...
Bo Zhao, Jun Bai, Chen Li, [...], Zhang Xiong
Question answering (QA) plays a vital role in biomedical natural language processing. Among question answering tasks, the retrieval question answering (ReQA) aims to directly retrieve the correct answer from candidates and has attracted much attention in the community for its efficiency. Recently, researchers have introduced ReQA into the biomedica...
Large Transformer-based Pretrained Language Models (PLMs) dominate almost all Natural Language Processing (NLP) tasks. Nevertheless, they still make mistakes from time to time. For a model deployed in an industrial environment, fixing these mistakes quickly and robustly is vital to improve user experiences. Previous works formalize such problems as...
Explanations play an essential role in helping users evaluate results from recommender systems. Various natural language generation methods have been proposed to generate explanations for the recommendation. However, they usually suffer from two problems. First, since user-provided review text contains noisy data, the generated explanations may be...
Multiple Choice Questions (MCQs) are a widely adopted approach to learning assessment. Recently, the automatic generation of MCQs has become a popular research area. In this task, Distractor Ranking (DR) is one of the most meaningful and challenging sub-tasks, where the DR models learn to select high-quality distractors from numerous candi...
Retrieval Question Answering (ReQA) is an essential mechanism of information sharing which aims to find the answer to a posed question from large-scale candidates. Currently, the most efficient solution is Dual-Encoder which has shown great potential in the general domain, while it still lacks research on biomedical ReQA. Obtaining a robust Dual-En...
Variational autoencoders (VAEs) are one of the powerful unsupervised learning frameworks in NLP for latent representation learning and latent-directed generation. The classic optimization goal of VAEs is to maximize the Evidence Lower Bound (ELBo), which consists of a conditional likelihood for generation and a negative Kullback-Leibler (KL) diverg...
A conversational recommender system is designed to proactively elicit users' preferences through dialogue, which can effectively improve both the user experience and the accuracy of recommendation compared with traditional static recommender systems. As a powerful technique to flexibly produce context-dependent responses, generative di...
Recommender systems usually learn user interests from various user behaviors, including clicks and post-click behaviors (e.g., like and favorite). However, these behaviors inevitably exhibit popularity bias, leading to some unfairness issues: 1) for items with similar quality, more popular ones get more exposure; and 2) even worse the popular items...
Semantic search for candidate retrieval is an important yet neglected problem in retrieval-based chatbots, which aims to select a set of candidate responses efficiently from a large pool. The existing bottleneck is ensuring that the model architecture satisfies two requirements: 1) rich interactions between a query and a response to produce query-relevant res...
Retrieval question answering (ReQA) is an essential mechanism to automatically satisfy the users’ information needs and overcome the problem of information overload. As a promising solution to achieve fast retrieval from large-scale candidate answers, dual-encoder framework has been widely studied to improve its representation quality for text in t...
Knowledge graph embedding (KGE) aims to learn low-dimensional vector representations of entities and relations based on the observed triples. When dealing with surrounding information, recent models either ignore the interactions between triples within the knowledge graph or use too many parameters to take the surrounding information into the...
Dual-Encoders are a promising mechanism for answer retrieval in question answering (QA) systems. Currently, most conventional Dual-Encoders learn the semantic representations of questions and answers merely through a matching score. Researchers have proposed introducing QA interaction features into the scoring function, but at the cost of low efficiency in in...
Biomedical factoid question answering is an essential application for biomedical information sharing. Recently, neural network based approaches have shown remarkable performance for this task. However, due to the scarcity of annotated data, which requires intensive expert knowledge to produce, training a robust model on limited-scale biomedical datasets...
The traditional end-to-end Neural Question Generation (NQG) models tend to generate generic and bland questions, as there are two obscure points: 1) the modifications of the answer in the context can be used as the clues to the answer mentioned in the question, while they are generally not unique and can be used independently for generating diverse...
Named entity recognition (NER) is one of the most fundamental tasks in a variety of natural language applications. Due to the lack of delimiters in the Chinese language, the Chinese NER task has suffered from a shortage of word boundary information. Recently, incorporating word information has been proven an effective mechanism to alleviate thi...
With the development of the Internet, e-learning has become a new trend for education. However, unlike traditional learning that is face-to-face, e-learning systems construct an environment where learners control their learning process. Many issues have occurred in online learning systems, such as low efficiency, high dropout rates, poor grades and...
With the development of Internet technologies and the increasing demand for knowledge, increasingly more people choose online learning platforms as a way to acquire knowledge. However, the rapid growth in the types and number of courses makes it difficult for people to make choices, which leads to a series of problems, such as unsystematic learning...
Knowledge Tracing aims to model a student’s knowledge state from her past learning interactions and predict her performance in the future. Although structures such as positional encoding or forgetting gate have already been used in Knowledge Tracing models, positional information with great potential is not fully utilized. In this paper, we propose a P...
Background
Biomedical question answering (QA) is a sub-task of natural language processing in a specific domain, which aims to answer a question in the biomedical field based on one or more related passages and can provide people with accurate healthcare-related information. Recently, a lot of approaches based on the neural network and large scale...
Biomedical factoid question answering is an important task in biomedical question answering applications. It has attracted much attention because of the reliability of its answers. In a question answering system, good word representations are very important, and a proper word embedding can usually improve system performance significantly....
Interpretability is a significant aspect of the distributed word representation learning model. Although the most advanced pretrained models have achieved the best results till date, the interpretability of a pretrained model is difficult to explain clearly. For this reason, based on the interpretability of distributed word embeddings, this paper p...
One of the key challenges for creating a successful chat bot is to find an effective way to learn from human-human conversation data. Recently, a few neural network based dialog models, including the RNN language model (RNNLM) and the hierarchical recurrent encoder-decoder (HRED) model have shown promising results on dialog response generation. How...
Lexical Answer Type (LAT) prediction is an essential part of question classification. It aims to assign certain lexical answer type to the questions to narrow down the search space and improve the classifier’s performance. LAT prediction is a challenge in the biomedical domain since it is more of a multi-label classification question, which means e...
In recurrent language models, using the class hierarchy of the vocabulary is a major direction for overcoming the over-large vocabulary issue, yet the hierarchy is not aligned within the models, including the embedding, hidden and softmax layers. Currently most methods employ the hierarchical information in embedding and/or softmax layers. It is interest...
This paper proposes a sentiment analysis framework based on ranking learning. The framework utilizes BERT model pre-trained on large-scale corpora to extract text features and has two sub-networks for different sentiment analysis tasks. The first sub-network of the framework consists of multiple fully connected layers and intermediate rectified lin...
Smart learning systems provide relevant learning resources as a personalized bespoke package for learners based on their pedagogical needs and individual preferences. This paper introduces a learning style model to represent features of online learners. It also presents an enhanced recommendation method named Adaptive Recommendation based on Online...
Purpose
The purpose of this paper is to propose an attention alignment method for opinion mining of massive open online course (MOOC) comments. Opinion mining is essential for MOOC applications. In this study, the authors analyze some of the attention heads of bidirectional encoder representations from transformers (BERT) and explore how to use these at...
In recent years, teaching machines to ask meaningful and coherent questions has attracted considerable attention in natural language processing. Question generation has found wide applications in areas such as education (testing knowledge) and chatbots (enhancing interaction). Following previous studies on conversational question generation, we pro...
MmWave communication suffers from severe path loss due to high frequency and is sensitive to blockages because of high penetration loss, especially in mobile communication scenarios. It highly depends on line-of-sight channels and narrow beams, and thus efficient beam tracking and beam alignment are necessary techniques to maintain robust communica...
Response generation is an important direction in conversation systems. Currently a lot of approaches have been proposed and achieved significant improvement. However, an important limitation has been widely realized as most models tend to generate general answers. To cope with this limitation, besides the needs of more sophisticated generation mode...
Answer selection is one of the most important techniques in question answering applications since it can improve the user experience to a large extent. To achieve a better answer selection performance, a fundamental approach is to better understand the answers and questions. In this research, motivated by the Ebbinghaus Forgetting Curve which indic...
In programming courses, the traditional assessment approach tends to evaluate student performance by scoring one or more project-level summative assignments. This approach no longer meets the requirements of a quality programming language education. Based on an upgraded peer code review model, we propose a formative assessment approach to assess th...
Biomedical event extraction plays an important role in the extraction of biological information from large-scale scientific publications. However, most state-of-the-art systems separate this task into several steps, which leads to cascading errors. In addition, it is complicated to generate features from syntactic and dependency analysis separately...
The dialogue response generation is a challenging task in chatbot applications. Recently neural-network-based dialogue models, including the sequence-to-sequence model and the RNN language models, are able to generate fluent and grammatically compliant responses, while there is a major limitation that most of the responses generated by these models...
Emotion cause extraction is one of the most important applications in natural language processing tasks. It is a difficult challenge due to the complex semantic information between emotion description and the whole document. Previous approaches have revealed that clause is an important indicator for emotion-cause extraction. As such selecting suita...
Highly mature service-oriented architecture systems have great flexibility and reusability, and can align business processes and information technologies with high quality. Service identification plays a key role in this respect. Further, of the different methods employed, the most popular and preferred is process-oriented service identification. H...
Purpose
The purpose of this paper is to propose an approach to incorporate contextual information into collaborative filtering (CF) based on the restricted Boltzmann machine (RBM) and deep belief networks (DBNs). Traditionally, neither the RBM nor its derivative model has been applied to modeling contextual information. In this work, the authors a...
Extracting representative topics and improving the extraction performance is rather challenging. In this work, we formulate a novel problem, called Interactive Area Topics Extraction, and propose a learning interactive topics extraction (LITE) model to regard this problem as a sequential decision making process and construct an end-to-end framework...
Social recommendation has attracted increasing attention over the years due to the potential value of social relations, which can be harnessed to mitigate the dilemma of data sparsity in traditional recommender systems. However, recent studies show that social recommenders fail in the practical use in industry for the reason that some problems in s...
Recent advances in sequence-to-sequence learning reveal a purely data-driven approach to the response generation task. Despite its diverse applications, existing neural models are prone to producing short and generic replies, making it infeasible to tackle open-domain challenges. In this research, we analyze this critical issue in light of the mode...
Neural network methods have achieved promising results for document-level sentiment classification. Since the rise of Web 2.0, a growing number of websites provide users with voting and feedback systems (also called social feedback systems). However, most existing sentiment classification models only focus on text information while ignoring the...