Ji-Rong Wen

Renmin University of China | RUC · School of Information

PhD

About

482
Publications
81,261
Reads
12,988
Citations
Citations since 2017
286 Research Items
6608 Citations
[Chart: citations since 2017, by year, 2017 to 2023; vertical axis 0 to 2,000]
Introduction
Additional affiliations
July 1999 - September 2013
Microsoft
Position
  • Senior Researcher, Group Manager

Publications (482)
Article
Full-text available
Fine-grained classification with few labeled samples has urgent needs in practice since fine-grained samples are more difficult and expensive to collect and annotate. Standard few-shot learning (FSL) focuses on generalising across seen and unseen classes, where the classes are at the same level of granularity. Therefore, when applying existing FSL...
Article
Legal judgment prediction (LJP) is a fundamental task of legal artificial intelligence. It aims to automatically predict the judgment results of legal cases. Three typical subtasks are relevant law article prediction, charge prediction, and term of penalty prediction. Due to the wide range of potential applications, LJP has attracted a great deal o...
Article
Web search provides a promising way for people to obtain information and has been extensively studied. With the surge of deep learning and large-scale pre-training techniques, various neural information retrieval models are proposed, and they have demonstrated the power for improving search (especially, the ranking) quality. All these existing sear...
Preprint
To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs). To be comprehensive, our library covers 13 common text generation tasks and their corresponding 83 datasets and further incorporates 45 PLMs covering general, translation,...
Preprint
Although pre-trained language models (PLMs) have shown impressive performance by text-only self-supervised training, they are found to lack visual semantics or commonsense, e.g., sizes, shapes, and colors of commonplace objects. Existing solutions often rely on explicit images for visual knowledge augmentation (requiring time-consuming retrieval or...
Preprint
Dense retrieval aims to map queries and passages into low-dimensional vector space for efficient similarity measuring, showing promising effectiveness in various large-scale retrieval tasks. Since most existing methods commonly adopt pre-trained Transformers (e.g. BERT) for parameter initialization, some work focuses on proposing new pre-training t...
Article
Full-text available
Many current deep learning approaches make extensive use of backbone networks pre-trained on large datasets like ImageNet, which are then fine-tuned to perform a certain task. In remote sensing, the lack of comparable large annotated datasets and the wide diversity of sensing platforms impedes similar developments. In order to contribute towards th...
Article
Deep semantic matching aims to discriminate the relationship between documents based on deep neural networks. In recent years, it has become increasingly popular to organize documents with a graph structure, then leverage both the intrinsic document features and the extrinsic neighbor features to derive discrimination. Most of the existing works mainl...
Preprint
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question on a large-scale Knowledge Graph (KG). To cope with the vast search space, existing work usually adopts a two-stage approach: it first retrieves a relatively small s...
Preprint
Deep semantic matching aims to discriminate the relationship between documents based on deep neural networks. In recent years, it has become increasingly popular to organize documents with a graph structure, then leverage both the intrinsic document features and the extrinsic neighbor features to derive discrimination. Most of the existing works mainl...
Preprint
Full-text available
RecBole has recently attracted increasing attention from the research community. As the number of users has grown, we have received a number of suggestions and update requests. This motivates us to make some significant improvements to our library, so as to meet the user requirements and contribute to the research community. In order to show...
Preprint
Text retrieval is a long-standing research topic on information seeking, where a system is required to return relevant information resources to users' queries in natural language. From classic retrieval methods to learning-based ranking functions, the underlying retrieval models have continually evolved with the ever-lasting technical innovati...
Preprint
Full-text available
With the growth of high-dimensional sparse data in web-scale recommender systems, the computational cost to learn high-order feature interaction in CTR prediction task largely increases, which limits the use of high-order interaction models in real industrial applications. Some recent knowledge distillation based methods transfer knowledge from com...
Preprint
We study the text generation task under the approach of pre-trained language models (PLMs). Typically, an auto-regressive (AR) method is adopted for generating texts in a token-by-token manner. Despite many advantages of AR generation, it usually suffers from inefficient inference. Therefore, non-autoregressive (NAR) models are proposed to generate...
Preprint
Sampling proper negatives from a large document pool is vital to effectively train a dense retrieval model. However, existing negative sampling strategies suffer from the uninformative or false negative problem. In this work, we empirically show that according to the measured relevance scores, the negatives ranked around the positives are generally...
Preprint
Full-text available
To develop effective and efficient graph similarity learning (GSL) models, a series of data-driven neural algorithms have been proposed in recent years. Although GSL models are frequently deployed in privacy-sensitive scenarios, the user privacy protection of neural GSL models has not drawn much attention. To comprehensively understand the privacy...
Preprint
Legal case matching, which automatically constructs a model to estimate the similarities between the source and target cases, has played an essential role in intelligent legal systems. Semantic text matching models have been applied to the task where the source and target legal cases are considered as long-form text documents. These general-purpose...
Preprint
Full-text available
We focus on the setting of contextual batched bandit (CBB), where a batch of rewards is observed from the environment in each episode. But the rewards of the non-executed actions are unobserved (i.e., partial-information feedback). Existing approaches for CBB usually ignore the rewards of the non-executed actions, resulting in feedback information...
Preprint
While self-supervised learning techniques are often used to mine implicit knowledge from unlabeled data via modeling multiple views, it is unclear how to perform effective representation learning in a complex and inconsistent context. To this end, we propose a methodology, specifically consistency and complementarity network (CoCoNet), which avai...
Preprint
Although artificial intelligence (AI) has made significant progress in understanding molecules in a wide range of fields, existing models generally acquire the single cognitive ability from the single molecular modality. Since the hierarchy of molecular knowledge is profound, even humans learn from different modalities including both intuitive diag...
Preprint
Users' search tasks have become increasingly complicated, requiring multiple queries and interactions with the results. Recent studies have demonstrated that modeling the historical user behaviors in a session can help understand the current search intent. Existing context-aware ranking models primarily encode the current session sequence (from the...
Preprint
Document retrieval has been extensively studied within the index-retrieve framework for decades, which has withstood the test of time. Unfortunately, such a pipelined framework limits the optimization of the final retrieval quality, because indexing and retrieving are separated stages that cannot be jointly optimized in an end-to-end manner. In or...
Preprint
Full-text available
Multimodal learning, especially large-scale multimodal pre-training, has developed rapidly over the past few years and led to the greatest advances in artificial intelligence (AI). Despite its effectiveness, understanding the underlying mechanism of multimodal pre-training models still remains a grand challenge. Revealing the explainability of such...
Preprint
Full-text available
Person-job fit is the core technique of online recruitment platforms, which can improve the efficiency of recruitment by accurately matching the job positions with the job seekers. Existing works mainly focus on modeling the unidirectional process or overall matching. However, recruitment is a two-way selection process, which means that both candid...
Preprint
We propose a video feature representation learning framework called STAR-GNN, which applies a pluggable graph neural network component on a multi-scale lattice feature graph. The essence of STAR-GNN is to exploit both the temporal dynamics and spatial contents as well as visual connections between regions at different scales in the frames. It model...
Article
Personalized search tailors document ranking lists for each individual user based on her interests and query intent to better satisfy the user’s information need. Many personalized search models have been proposed. They first build a user interest profile from the user’s search history, and then re-rank the documents based on the personalized match...
Article
To characterize complex and heterogeneous side information in recommender systems, heterogeneous information network (HIN) has shown superior performance and attracted much research attention. In HIN, the rich entities, relations and paths can be utilized to model the correlations of users and items; such a task setting is often called HIN-based re...
Preprint
Full-text available
As an essential operation of legal retrieval, legal case matching plays a central role in intelligent legal systems. This task has a high demand on the explainability of matching results because of its critical impacts on downstream applications -- the matched legal cases may provide supportive evidence for the judgments of target cases and thus in...
Article
In recommender systems, top-N recommendation is an important task with implicit feedback data. Although the recent success of deep learning has largely pushed forward research on top-N recommendation, there are increasing concerns about the appropriate evaluation of recommendation algorithms. It has become urgent to study how recommendation algorithms can...
Preprint
Pre-trained language models (PLMs) have achieved notable success in natural language generation (NLG) tasks. Up to now, most of the PLMs are pre-trained in an unsupervised manner using large-scale general corpus. Meanwhile, an increasing number of models pre-trained with less labeled data showcase superior performance compared to unsupervise...
Preprint
Conversational recommender systems (CRS) aim to proactively elicit user preference and recommend high-quality items through natural language conversations. Typically, a CRS consists of a recommendation module to predict preferred items for users and a conversation module to generate appropriate responses. To develop an effective CRS, it is essentia...
Preprint
Full-text available
In order to support the study of recent advances in recommender systems, this paper presents an extended recommendation library consisting of eight packages for up-to-date topics and architectures. First of all, from a data perspective, we consider three important topics related to data issues (i.e., sparsity, bias and distribution shift), and deve...
Article
Search result diversification aims to generate diversified search results so as to meet the various information needs of users. Most of those existing diversification methods greedily select the optimal documents one-by-one comparing with the selected document sequences. Due to the fact that the information utilities of the candidate documents are...
Preprint
This paper aims to advance the mathematical intelligence of machines by presenting the first Chinese mathematical pre-trained language model (PLM) for effectively understanding and representing mathematical problems. Unlike other standard NLP tasks, mathematical texts are difficult to understand, since they involve mathematical terminology, symbols...
Preprint
Full-text available
In order to develop effective sequential recommenders, a series of sequence representation learning (SRL) methods are proposed to model historical user behaviors. Most existing SRL methods rely on explicit item IDs for developing the sequence models to better capture user preference. Though effective to some extent, these methods are difficult to b...
Preprint
Full-text available
Relevant recommendation is a special recommendation scenario which provides relevant items when users express interest in one target item (e.g., click, like and purchase). Besides considering the relevance between recommendations and the trigger item, the recommendations should also be diversified to avoid information cocoons. However, existing divers...
Article
Full-text available
The fundamental goal of artificial intelligence (AI) is to mimic the core cognitive activities of humans. Despite tremendous success in AI research, most existing methods have only a single cognitive ability. To overcome this limitation and take a solid step towards artificial general intelligence (AGI), we develop a foundation model pre-traine...
Preprint
Full-text available
The learn-to-compare paradigm of contrastive representation learning (CRL), which compares positive samples with negative ones for representation learning, has achieved great success in a wide range of domains, including natural language processing, computer vision, information retrieval and graph learning. While many research works focus on data a...
Article
A persistent data structure, also known as a multiversion data structure in the database literature, is a data structure that preserves all its previous versions as it is updated over time. Every update (inserting, deleting, or changing a data record) to the data structure creates a new version, while all the versions are kept in the data structur...
Preprint
Pretrained language models (PLMs) have made remarkable progress in text generation tasks via fine-tuning. However, it is challenging to fine-tune PLMs in a data-scarce situation. Therefore, it is non-trivial to develop a general and lightweight model that can adapt to various text generation tasks based on PLMs. To fulfill this purpose, the recent pr...
Preprint
Nowadays, pretrained language models (PLMs) have dominated the majority of NLP tasks. However, little research has been conducted on systematically evaluating the language abilities of PLMs. In this paper, we present a large-scale empirical study on general language ability evaluation of PLMs (ElitePLM). In our study, we design four evaluation dimens...
Preprint
Commonsense reasoning in natural language is a desired ability of artificial intelligent systems. For solving complex commonsense reasoning tasks, a typical solution is to enhance pre-trained language models (PTMs) with a knowledge-aware graph neural network (GNN) encoder that models a commonsense knowledge graph (CSKG). Despite the effectiveness,...
Preprint
Recently, contrastive learning has been shown to be effective in improving pre-trained language models (PLM) to derive high-quality sentence representations. It aims to pull close positive examples to enhance the alignment while pushing apart irrelevant negatives for the uniformity of the whole representation space. However, previous works mostly adop...
Preprint
Full-text available
Recent years have witnessed the significant advance in dense retrieval (DR) based on powerful pre-trained language models (PLM). DR models have achieved excellent performance in several benchmark datasets, while they are shown to be not as competitive as traditional sparse retrieval models (e.g., BM25) in a zero-shot retrieval setting. However, in...
Preprint
Full-text available
Personalized dialogue systems explore the problem of generating responses that are consistent with the user's personality, which has raised much attention in recent years. Existing personalized dialogue systems have tried to extract user profiles from dialogue history to guide personalized response generation. Since the dialogue history is usually...
Preprint
Large-scale single-stream pre-training has shown dramatic performance in image-text retrieval. Regrettably, it faces low inference efficiency due to heavy attention layers. Recently, two-stream methods like CLIP and ALIGN with high inference efficiency have also shown promising performance; however, they only consider instance-level alignment betwe...
Article
Recent studies show that historical behaviors (such as queries and their clicks) contained in a search session can benefit the ranking performance of subsequent queries in the session. Existing neural context-aware ranking models usually rank documents based on either latent representations of user search behaviors, or the word-level interactions b...
Preprint
Knowledge-grounded conversation (KGC) shows great potential in building an engaging and knowledgeable chatbot, and knowledge selection is a key ingredient in it. However, previous methods for knowledge selection only concentrate on the relevance between knowledge and dialogue context, ignoring the fact that age, hobby, education and life experience...
Preprint
Full-text available
Unbiased learning to rank has been proposed to alleviate the biases in the search ranking, making it possible to train ranking models with user interaction data. In real applications, search engines are designed to display only the most relevant k documents from the retrieved candidate set. The remaining candidates are discarded. As a consequence, posit...
Article
Full-text available
Key Smart City applications such as traffic management and public security rely heavily on the intelligent processing of video and image data, often in the form of visual retrieval tasks, such as person Re-IDentification (ReID) and vehicle re-identification. For these tasks, Deep Neural Networks (DNNs) have been the dominant solution for the past d...
Preprint
Full-text available
The key of sequential recommendation lies in the accurate item correlation modeling. Previous models infer such information based on item co-occurrences, which may fail to capture the real causal relations, and impact the recommendation performance and explainability. In this paper, we equip sequential recommendation with a novel causal discovery m...
Preprint
Full-text available
Personalized PageRank (PPR) is a popular node proximity metric in graph mining and network research. Given a graph $G=(V,E)$ and a source node $s \in V$, a single-source PPR (SSPPR) query asks for the PPR value $\boldsymbol{\pi}(u)$ with respect to $s$, which represents the relative importance of node $u$ in the context of the source node $s$. Among existing algorithm...
Preprint
The state-of-the-art Mixture-of-Experts (MoE) architecture has achieved several remarkable successes in terms of increasing model capacity. However, widespread adoption of MoE has been hindered by complexity, communication costs, and training instability. Here we present a novel MoE architecture based on matrix product operators (MPO) fro...
Preprint
Web search provides a promising way for people to obtain information and has been extensively studied. With the surge of deep learning and large-scale pre-training techniques, various neural information retrieval models are proposed and they have demonstrated the power for improving search (especially, the ranking) quality. All these existing se...
Article
Personalized PageRank (PPR) is a popular node proximity metric in graph mining and network research. A single-source PPR (SSPPR) query asks for the PPR value of each node on the graph. Due to its importance and wide applications, decades of efforts have been devoted to the efficient processing of SSPPR queries. Among existing algorithms, LocalPush...
Preprint
Recently, deep neural networks such as RNN, CNN and Transformer have been applied in the task of sequential recommendation, which aims to capture the dynamic preference characteristics from logged user behavior data for accurate recommendation. However, in online platforms, logged user behavior data inevitably contains noise, and deep recommen...
Preprint
Full-text available
Explainable recommendation has shown its great advantages for improving recommendation persuasiveness, user satisfaction, system transparency, among others. A fundamental problem of explainable recommendation is how to evaluate the explanations. In the past few years, various evaluation strategies have been proposed. However, they are scattered in...
Preprint
Full-text available
Designing spectral convolutional networks is a challenging problem in graph learning. ChebNet, one of the early attempts, approximates the spectral convolution using Chebyshev polynomials. GCN simplifies ChebNet by utilizing only the first two Chebyshev polynomials while still outperforming it on real-world datasets. GPR-GNN and BernNet demonstrate...
Article
In recommender systems, it is essential to understand the underlying factors that affect user-item interaction. Recently, several studies have utilized disentangled representation learning to discover such hidden factors from user-item interaction data, which shows promising results. However, without any external guidance signal, the learned disent...
Article
Citation count prediction is an important task for estimating the future impact of research papers. Most of the existing works utilize the information extracted from the paper itself. In this article, we focus on how to utilize another kind of useful data signal (i.e., peer review text) to improve both the performance and interpretability of the pr...
Preprint
Text Generation aims to produce plausible and readable text in human language from input data. The resurgence of deep learning has greatly advanced this field by neural generation models, especially the paradigm of pretrained language models (PLMs). Grounding text generation on PLMs is seen as a promising direction in both academia and industry. In...