Lei Xu

Shanghai Jiao Tong University | SJTU · Department of Computer Science and Engineering

Ph.D.

About

566
Publications
44,613
Reads
14,724
Citations
Additional affiliations
October 2008 - present
Institute of Physics, Chinese Academy of Sciences
Position
  • Guest Professor
September 2009 - present
Xidian University
Position
  • Professor
March 2007 - present
Peking University
Position
  • Chang Jiang Chair Professor
Education
February 1982 - December 1986
Tsinghua University
Field of study
  • Pattern Recognition and Signal Processing
March 1978 - December 1981
Harbin Institute of Technology
Field of study
  • Electrical Engineering

Publications

Article
Full-text available
Fatty acid composition (FA) is an important indicator of meat quality in beef cattle. We investigated potential functional candidate genes for FA in beef cattle by integrating genomic and transcriptomic datasets through multiple strategies. In this study, we observed 65 SNPs overlapping with five candidate genes (CCDC57, FASN, HDAC11, ALG14, and ZMA...
Article
Full-text available
As a versatile tool for effectively communicating abstract concepts, freehand sketching has garnered significant attention and has undergone extensive exploration in computer vision. Sketch recognition is one of the fundamental and challenging tasks. Sketches are composed of a limited number of sparse and simple lines, rendering them challenging to...
Article
Generating high-quality and drug-like molecules from scratch within the expansive chemical space presents a significant challenge in the field of drug discovery. In prior research, value-based reinforcement learning algorithms have been employed to generate molecules with multiple desired properties iteratively. The immediate reward was defined as...
Conference Paper
Full-text available
Recent sketch synthesis methods have demonstrated the capability of generating lifelike outcomes. However, these methods directly encode entire sketches, making it challenging to decouple the strokes from the sketches and to control local sketch synthesis, e.g., stroke editing. Besides, the sketch editing task encounters the...
Conference Paper
The effectiveness of model training heavily relies on the quality of available training resources. However, budget constraints often impose limitations on data collection efforts. To tackle this challenge, we introduce causal exploration in this paper, a strategy that leverages the underlying causal knowledge for both data collection and model trai...
Conference Paper
Free-hand sketch, as a versatile medium of communication, can be viewed as a collection of strokes arranged in a spatial layout to convey a concept. Due to the abstract nature of the sketches, changes in stroke position may make them difficult to recognize. Recently, graphic sketch representations have proven effective for representing sketches. However, ex...
Preprint
General intelligence requires quick adaptation across tasks. While existing reinforcement learning (RL) methods have made progress in generalization, they typically assume only distribution changes between source and target domains. In this paper, we explore a wider range of scenarios where both the distribution and environment spaces may change. For...
Preprint
Full-text available
The effectiveness of model training heavily relies on the quality of available training resources. However, budget constraints often impose limitations on data collection efforts. To tackle this challenge, we introduce causal exploration in this paper, a strategy that leverages the underlying causal knowledge for both data collection and model trai...
Article
Full-text available
Reinforcement learning (RL) has been applied to financial portfolio management in recent years. Current studies mostly focus on profit accumulation without much consideration of risk. Some risk-return balanced studies extract features from price and volume data only, which are highly correlated and lack representation of risk features. To tackle...
Article
Prediction of drug-target interactions (DTIs) is a crucial step in drug discovery, and deep learning methods have shown great promise on various DTI datasets. However, existing approaches still face several challenges, including limited labeled data, hidden bias issues, and a lack of generalization ability to out-of-domain data. These challenges hin...
Article
Full-text available
Retrosynthetic planning, which aims to identify synthetic pathways for target molecules from starting materials, is a fundamental problem in synthetic chemistry. Computer-aided retrosynthesis has made significant progress, in which heuristic search algorithms, including Monte Carlo Tree Search (MCTS) and A* search, have played a crucial role. Howev...
Article
Full-text available
De novo molecular generation is a promising approach to drug discovery, building novel molecules from scratch that can bind the target proteins specifically. With the increasing availability of machine learning algorithms and computational power, artificial intelligence (AI) has emerged as a valuable tool for this purpose. Here, we have develop...
Article
Full-text available
Designing 3D molecules with high binding affinity for specific protein targets is crucial in drug design. One challenge is that the atomic interaction between molecules and proteins in 3D space has to be taken into account. However, the existing target-aware methods solely model the joint distribution between the molecules and proteins, disregardin...
Chapter
Existing graph convolutional network (GCN) models for the traveling salesman problem (TSP) cannot generalize well to TSP instances with a larger number of cities than the training samples, and the NP-hard nature of the TSP renders it impractical to use large-scale instances for training. This paper proposes a novel approach that generalizes well a pre-tr...
Article
Full-text available
Multiple trait genomic selection incorporating correlated traits can improve the predictive ability of low‐heritability traits. In this study, we evaluated genomic prediction accuracy using multi‐trait BayesCπ method (MT‐BayesCπ), which allows for a broader range of mixture priors for important traits in beef cattle. We compared the prediction perf...
Conference Paper
Protein binding site prediction is an important prerequisite for the discovery of new drugs. Usually, natural 3D U-Net is adopted as the standard site prediction framework to do per-voxel binary mask classification. However, this scheme only performs feature extraction for single-scale samples, which may bring the loss of global or local informatio...
Chapter
Anti-money laundering (AML) is essential for safeguarding financial systems. One critical way is to monitor the tremendous daily transaction records to filter out suspicious transactions or accounts, which is time consuming and requires rich experience and expert knowledge to construct filtering rules. Deep learning methods have been used to model...
Preprint
Full-text available
Diffusion models have showcased their remarkable capability to synthesize diverse and high-quality images, sparking interest in their application for real image editing. However, existing diffusion-based approaches for local image editing often suffer from undesired artifacts due to the pixel-level blending of the noised target images and diffusion...
Article
Full-text available
Graphic sketch representations are effective for representing sketches. Existing methods take the patches cropped from sketches as the graph nodes, and construct the edges based on sketch's drawing order or Euclidean distances on the canvas. However, the drawing order of a sketch may not be unique, while the patches from semantically related parts...
Article
Deep learning methods have demonstrated promising performance on the NP-hard Graph Matching (GM) problems. However, the state-of-the-art methods usually require the ground-truth labels, which may take extensive human efforts or be impractical to collect. In this paper, we present a robust self-supervised bidirectional learning method (IA-SSGM) to t...
Article
Encoding sketches as Gaussian mixture model (GMM)-distributed latent codes is an effective way to control sketch synthesis. Each Gaussian component represents a specific sketch pattern, and a code randomly sampled from the Gaussian can be decoded to synthesize a sketch with the target pattern. However, existing methods treat the Gaussians as indivi...
Chapter
Adversarial attacks can help to reveal the vulnerability of neural networks. In the text classification domain, synonym replacement is an effective way to generate adversarial examples. However, the number of replacement combinations grows exponentially with the text length, making the search difficult. In this work, we propose an attack method whi...
Chapter
There has been a surge of interest in recent years in developing graph neural network (GNN) based learning methods for the NP-hard traveling salesman problem (TSP). However, the existing methods not only have limited search space but also require a lot of training instances with ground-truth solutions that are time-consuming to compute. In this paper, we p...
Article
Protein binding site prediction is an important prerequisite task of drug discovery and design. Binding sites are very small, irregular, and varied in shape, which makes the prediction very challenging. Standard 3D U-Net has been adopted to predict binding sites but got stuck with unsatisfactory prediction results, incomplete, out-of-bounds, or eve...
Article
Full-text available
The bigger picture: An intracranial aneurysm (IA) is a pathological expansion of a weak area of a blood vessel wall in the brain because of the long-term effects of abnormal blood flow. Epidemiological estimates suggest that approximately 3% of the population has an intracranial aneurysm. While rupture is rare (occurring in less than 1% of cases), a...
Preprint
Full-text available
Graphic sketch representations are effective for representing sketches. Existing methods take the patches cropped from sketches as the graph nodes, and construct the edges based on sketch's drawing order or Euclidean distances on the canvas. However, the drawing order of a sketch may not be unique, while the patches from semantically related parts...
Article
Full-text available
Most existing deep learning methods for graph matching tasks tend to focus on affinity learning in a feedforward fashion to assist the neural network solver. However, the potential benefits of a direct feedback from the neural network solver to the affinity learning are usually underestimated and overlooked. In this paper, we propose a bidirectiona...
Article
Semantic face editing has achieved substantial progress in recent years. However, existing face editing methods, which often encode the entire image into a single code, still have difficulty in enabling flexible editing while keeping high-fidelity reconstruction. The one-code scheme also brings entangled face manipulations and limited flexibility i...
Article
Full-text available
Traditional drug discovery is very laborious, expensive, and time-consuming, due to the huge combinatorial complexity of the discrete molecular search space. Researchers have turned to machine learning methods for help to tackle this difficult problem. However, most existing methods are either virtual screening on the available database of compound...
Article
Identifying synergistic drug combinations (SDCs) is a great challenge due to the combinatorial complexity and the fact that SDC is cell line specific. The existing computational methods either did not consider the cell line specificity of SDC, or did not perform well because they built a model for each cell line independently. In this paper, we present a no...
Chapter
Word substitution based textual adversarial attack is actually a combinatorial optimization problem. Existing greedy search methods are time-consuming due to extensive unnecessary victim model calls in word ranking and substitution. In this work, we propose a learnable attack method which uses neural networks to guide the greedy search to reduce vi...
Preprint
Full-text available
Semantic face editing has achieved substantial progress in recent years. An increasingly popular method, latent space manipulation performs face editing by changing the latent code of an input face, freeing users from the need for painting skills. However, previous latent space manipulation methods usually encode an entire face into a single low-dimensi...
Article
Current face recognition tasks are usually carried out on high-quality face images, but in reality, most face images are captured under unconstrained or poor conditions, e.g., by video surveillance. Existing methods are featured by learning data uncertainty to avoid overfitting the noise, or by adding margins to the angle or cosine space of the nor...
Chapter
Existing face sketch synthesis methods extend the conditional generative adversarial network framework with promising performance. However, they usually pre-train on additional large-scale datasets, and the performance is still not satisfactory. To tackle the issues, we develop a deep bidirectional network based on the least mean square error reconstructi...
Preprint
Full-text available
Current face recognition tasks are usually carried out on high-quality face images, but in reality, most face images are captured under unconstrained or poor conditions, e.g., by video surveillance. Existing methods are featured by learning data uncertainty to avoid overfitting the noise, or by adding margins to the angle or cosine space of the nor...
Article
Most existing reinforcement learning (RL)-based portfolio management models do not take into account the market conditions, which limits their performance in risk-return balancing. In this paper, we propose DeepTrader, a deep RL method to optimize the investment policy. In particular, to tackle the risk-return balancing problem, our model embeds ma...
Article
Existing deep learning methods for graph matching (GM) problems usually considered affinity learning to assist combinatorial optimization in a feedforward pipeline, and parameter learning is executed by back-propagating the gradients of the matching loss. Such a pipeline pays little attention to the possible complementary benefit from the optimizatio...
Article
Learning to synthesize free-hand sketches controllably according to specified categories and sketching styles is a challenging task, due to the lack of training data with category labels and style labels. One choice to control the synthesis is by self-organizing a latent coding space to preserve the similarity of structural patterns of the observed...
Article
The coronavirus disease 2019 (COVID-19) epidemic continues to spread rapidly around the world and nearly 20 million people have been infected. This paper utilises both single-locus analysis and joint-SNPs analysis for detection of significant single nucleotide polymorphisms (SNPs) in the phenotypes of symptomatic vs. asymptomatic, the early collection t...
Article
Full-text available
Least mean square error reconstruction for the self-organizing network (Lmser) was proposed in 1991, featured by a bidirectional architecture with several built-in natures. In this paper, we developed Lmser into CNN based Lmser (CLmser), highlighted by new findings on strengths of two major built-in natures of Lmser, namely duality in connection we...
Article
Full-text available
Runs of homozygosity (ROH) are continuous homozygous regions that generally exist in the DNA sequence of diploid organisms. Identifications of ROH leading to reduction in performance can provide valuable insight into the genetic architecture of complex traits. Here, we evaluated genome-wide patterns of homozygosity and their association with import...
Conference Paper
For the problem of early detection of atrial fibrillation (AF) from electrocardiogram (ECG), it is difficult to capture subject-invariant discriminative features from ECG signals, due to the high variation in ECG morphology across subjects and the noise in ECG. In this paper, we propose a Discrete Biorthogonal Wavelet Transform (DBWT) Based Convol...
Article
Full-text available
Various methods have been proposed for genomic prediction (GP) in livestock. These methods have mainly focused on statistical considerations and did not include genome annotation information. In this study, to improve the predictive performance of carcass traits in Chinese Simmental beef cattle, we incorporated the genome annotation information int...
Preprint
We present a self-learning approach that combines deep reinforcement learning and Monte Carlo tree search to solve the traveling salesman problem. The proposed approach has two advantages. First, it adopts deep reinforcement learning to compute the value functions for decision, which removes the need of hand-crafted features and labelled data. Seco...
Article
There has been a framework sketched for learning deep bidirectional intelligence. The framework has an inbound that features two actions: one is the acquiring action, which gets inputs in appropriate patterns, and the other is A-S cognition, derived from the abbreviated form of words abstraction and self-organization, which abstracts input patterns...
Preprint
Full-text available
Background: Runs of homozygosity (ROH) are continuous homozygous regions that generally exist in the DNA sequence of diploid organisms. Identification of regions of the genome that lead to reduced performance can provide valuable insight into the genetic architecture of complex traits. Here, we evaluated genome-wide patterns of homozygosity and th...
Preprint
Full-text available
Background: Body size traits, as one of the main breeding selection criteria, are widely used to monitor cattle growth and to evaluate the selection response. In this study, body size was defined as body height (BH), body length (BL), hip height (HH), heart size (HS), abdominal size (AS), and cannon bone size (CS). We performed genome-wide associatio...
Article
Full-text available
Causal inference is a powerful modeling tool for explanatory analysis, which might enable current machine learning to become explainable. How to marry causal inference with machine learning to develop eXplainable Artificial Intelligence (XAI) algorithms is one of the key steps towards artificial intelligence 2.0. With the aim of bringing knowled...
Article
Full-text available
Background Fatty acids are important traits that affect meat quality and nutritive values in beef cattle. Detection of genetic variants for fatty acid composition can help to elucidate the genetic mechanism underpinning these traits and promote the improvement of fatty acid profiles. In this study, we performed a genome-wide association study (GWAS...
Article
Full-text available
Non-additive effects play important roles in determining genetic changes with regard to complex traits; however, such effects are usually ignored in genetic evaluation and quantitative trait locus (QTL) mapping analysis. In this study, a two-component genome-based restricted maximum likelihood (GREML) was applied to obtain the additive genetic vari...
Chapter
Deep bidirectional Intelligence (BI) via YIng YAng (IA) system, or shortly Deep IA-BI, is featured by circling A-mapping and I-mapping (or shortly AI circling) that sequentially performs each of five actions. A basic foundation of IA-BI is bidirectional learning that makes the cascading of A-mapping and I-mapping (shortly A-I cascading) approximate...
Chapter
Proposed in 1991, Least Mean Square Error Reconstruction for self-organizing network, shortly Lmser, was a further development of the traditional auto-encoder (AE) by folding the architecture with respect to the central coding layer and thus leading to the features of Duality in Connection Weight (DCW) and Duality in Paired Neurons (DPN), as well a...
Chapter
Neural style transfer has been demonstrated to be powerful in creating artistic images with help of Convolutional Neural Networks (CNN), but continuously controllable transfer is still a challenging task. This paper provides a computational decomposition of the style into basic factors, which aim to be factorized, interpretable representations of t...
Chapter
Medical image segmentation is the premise of many medical image applications including disease diagnosis, anatomy, and radiation therapy. This paper presents a k-Dense-UNet for segmentation of Electron Microscopy (EM) images. Firstly, based on the characteristics of the long skip connection of U-Net and the mechanism of short skip connection of Den...
Article
Full-text available
Correct heartbeat classification from electrocardiogram (ECG) signals is fundamental to the diagnosis of arrhythmia. The recent advancement in deep convolutional neural network (CNN) has renewed the interest in applying deep learning techniques to improve the accuracy of heartbeat classification. So far, the results are not very exciting. Most of t...
Conference Paper
Existing single image super-resolution (SISR) methods usually focus on Low-Resolution (LR) images which are artificially generated from High-Resolution (HR) images by a down-sampling process, but are not robust for unmatched training set and testing set. This paper proposes a GAN Flexible Lmser (GFLmser) network that bidirectionally learns the High...
Article
Full-text available
Genomic selection (GS) has been widely considered as a valuable strategy for enhancing the rate of genetic gain in farm animals. However, the construction of a large reference population is a big challenge for small populations like indigenous cattle. In order to evaluate the potential application of GS for Chinese indigenous cattle, we assessed th...
Article
Full-text available
Genome-wide association studies (GWAS) have commonly been used to identify candidate genes that control economically important traits in livestock. Our objective was to detect potential candidate genes associated mainly with muscle development traits related to dimension of hindquarter in cattle. A next generation sequencing (NGS) dataset to impute...
Article
Full-text available
Objective: Tumor heterogeneity renders identification of suitable biomarkers of gastric cancer (GC) challenging. Here, we aimed to identify prognostic genes of GC using computational analysis. Methods: We first used microarray technology to profile gene expression of GC and paired nontumor tissues from 198 patients. Based on these profiles and p...