About
66
Publications
13,358
Reads
1,670
Citations
Introduction
Xiao Wang currently works at the Kihara Lab in the Department of Computer Science at Purdue University. Xiao mainly focuses on deep learning, computer vision, and bioinformatics. Xiao's recent work, EnAET, has achieved state-of-the-art performance on semi-supervised learning.
Publications (66)
Deep neural networks have been successfully applied to many real-world applications. However, these successes rely heavily on large amounts of labeled data, which is expensive to obtain. Recently, Auto-Encoding Transformation (AET) and MixMatch have been proposed and achieved state-of-the-art results for unsupervised and semi-supervised learning, r...
In this paper, we propose a deep neural network-based car-following model that has two distinctive properties. First, unlike most existing car-following models that take only the instantaneous velocity, velocity difference, and position difference as inputs, this new model takes the velocities, velocity differences, and position differences that we...
RNA plays a crucial role not only in information transfer as messenger RNA during gene expression but also in various biological functions as non-coding RNAs. Understanding mechanical mechanisms of function needs tertiary structure information; however, experimental determination of three-dimensional RNA structures is costly and time-consuming, lea...
Cryo-electron microscopy (cryo-EM) has become a powerful tool for determining the structures of macromolecules, such as proteins and DNA/RNA complexes. While high-resolution cryo-EM maps are increasingly available, there is still a substantial number of maps determined at intermediate or low resolution. These maps present challenges when it comes t...
Cryogenic electron microscopy (cryo-EM) has now been widely used for determining multichain protein complexes. However, modeling a large complex structure, such as those with more than ten chains, is challenging, particularly when the map resolution decreases. Here we present DiffModeler, a fully automated method for modeling large protein complex...
Protein–protein interactions are involved in almost all processes in a living cell and determine the biological functions of proteins. To obtain mechanistic understandings of protein–protein interactions, the tertiary structures of protein complexes have been determined by biophysical experimental methods, such as X-ray crystallography and cryogeni...
The EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein–nucleic acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic (1.9–2.5 Å) resolution. Three published maps were selected as targets: Escherichia coli beta-galactosidase wit...
Cryogenic electron microscopy (cryo-EM) has now been widely used for determining multi-chain protein complexes. However, modeling a complex structure is challenging particularly when the map resolution is low, typically in the intermediate resolution range of 5 to 10 Å. Within this resolution range, even accurate structure fitting is difficult, let...
The EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein/nucleic-acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic (1.9-2.5 Å) resolution. Three published maps were selected as targets: E. coli beta-galactosidase with inhibit...
Three-dimensional structure modeling from maps is an indispensable step for studying proteins and their complexes with cryogenic electron microscopy. Although the resolution of determined cryogenic electron microscopy maps has generally improved, there are still many cases where tracing protein main chains is difficult, even in maps determined at a...
Structure modeling from maps is an indispensable step for studying proteins and their complexes with cryogenic electron microscopy (cryo-EM). Although the resolution of determined cryo-EM maps has generally improved, there are still many cases where tracing protein main-chains is difficult, even in maps determined at a near atomic resolution. Here,...
DNA and RNA play fundamental roles in various cellular processes, where their three-dimensional structures provide information critical to understanding the molecular mechanisms of their functions. Although an increasing number of nucleic acid structures and their complexes with proteins are determined by cryogenic electron microscopy (cryo-EM), st...
RNA not only plays a core role in the central dogma as mRNA between DNA and protein; many non-coding RNAs have also been discovered to have unique and diverse biological functions. As genome sequences become increasingly available and our knowledge of RNA sequences grows, the study of RNA's structure and function has become more demanding....
Reliably scoring and ranking candidate models of protein complexes and assigning their oligomeric state from the structure of the crystal lattice represent outstanding challenges. A community-wide effort was launched to tackle these challenges. The latest resources on protein complexes and interfaces were exploited to derive a benchmark dataset con...
Some cognitive research has discovered that humans accomplish event segmentation as a side effect of event anticipation. Inspired by this discovery, we propose a simple yet effective end-to-end self-supervised learning framework for event segmentation/boundary detection. Unlike the mainstream clustering-based methods, our framework exploits a trans...
As a representative self-supervised method, contrastive learning has achieved great successes in unsupervised training of representations. It trains an encoder by distinguishing positive samples from negative ones given query anchors. These positive and negative samples play critical roles in defining the objective to learn the discriminative encod...
Reliably scoring and ranking candidate models of protein complexes and assigning their oligomeric state from the structure of the crystal lattice represent outstanding challenges. A community-wide effort was launched to tackle these challenges. The latest resources on protein complexes and interfaces were exploited to derive a benchmark dataset con...
As more protein structure models have been determined from cryogenic electron microscopy (cryo-EM) density maps, establishing how to evaluate the model accuracy and how to correct models in cases where they contain errors is becoming crucial to ensure the quality of the structural models deposited in the public database, the PDB. Here, a new protoc...
Representation learning has developed significantly with the advance of contrastive learning methods. Most of those methods benefit from various data augmentations that are carefully designed to maintain their identities so that the images transformed from the same instance can still be retrieved. However, those carefully designed tran...
As more protein structure models have been determined from cryo-electron microscopy (cryo-EM) density maps, establishing how to evaluate the model accuracy and how to correct models in case they contain errors is becoming crucial to ensuring the quality of structure models deposited to the public database, PDB. Here, we present a new protocol for e...
An increasing number of protein structures are being determined by cryogenic electron microscopy (cryo-EM). Although the resolution of determined cryo-EM density maps is improving in general, there are still many cases where amino acids of a protein are assigned with different levels of confidence. Here we developed a method that identifies potenti...
This paper presents the methods that have participated in the SHREC 2022 contest on protein-ligand binding site recognition. The prediction of protein-ligand binding regions is an active research domain in computational biophysics and structural biology and plays a relevant role for molecular docking and drug design. The goal of the contest is to a...
This paper presents the methods that have participated in the SHREC 2022 contest on protein-ligand binding site recognition. The prediction of protein-ligand binding regions is an active research domain in computational biophysics and structural biology and plays a relevant role for molecular docking and drug design. The goal of the contest is to a...
Many recent self-supervised frameworks for visual representation learning are based on certain forms of Siamese networks. Such networks are conceptually symmetric with two parallel encoders, but often practically asymmetric as numerous mechanisms are devised to break the symmetry. In this work, we conduct a formal study on the importance of asymmet...
As a representative self-supervised method, contrastive learning has achieved great successes in unsupervised training of representations. It trains an encoder by distinguishing positive samples from negative ones given query anchors. These positive and negative samples play critical roles in defining the objective to learn the discriminative encod...
Cryo-electron tomography (cryo-ET) is an imaging technique that allows three-dimensional visualization of macro-molecular assemblies under near-native conditions. Cryo-ET comes with a number of challenges, mainly a low signal-to-noise ratio and the inability to obtain images from all angles. Computational methods are key to analyzing cryo-electron tomograms. To...
Osteoclasts are multinucleated cells that exclusively resorb bone matrix proteins and minerals on the bone surface. They differentiate from monocyte/macrophage lineage cells in the presence of osteoclastogenic cytokines such as the receptor activator of nuclear factor-κB ligand (RANKL) and are stained positive for tartrate-resistant acid phosphatas...
Proteins are essential to nearly all cellular mechanisms and are the effectors of the cell's activities. As such, they often interact through their surface with other proteins or other cellular ligands such as ions or organic molecules. Evolution generates plenty of different proteins, with unique abilities, but also proteins with related functions h...
Osteoclasts are multinucleated cells that exclusively resorb bone matrix proteins and minerals on the bone surface. They differentiate from monocyte/macrophage-lineage cells in the presence of osteoclastogenic cytokines such as the receptor activator of nuclear factor-κB ligand (RANKL) and are stained positive for tartrate-resistant acid phosphatas...
Some cognitive research has discovered that humans accomplish event segmentation as a side effect of event anticipation. Inspired by this discovery, we propose a simple yet effective end-to-end self-supervised learning framework for event segmentation/boundary detection. Unlike the mainstream clustering-based methods, our framework exploits a trans...
We present the results for CAPRI Round 50, the 4th joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of 12 targets, including 6 dimers, 3 trimers, and 3 higher-order oligomers. Four of these were easy targets, for which good structural templates were available either for the full assembly, or for the main interface...
The mechanisms by which transcriptional activation domains (tADs) initiate eukaryotic gene expression have been an enigma for decades because most tADs lack specificity in sequence, structure, and interactions with targets. Machine learning analysis of datasets of transcriptional activation domain sequences generated in vivo elucidated several func...
Physical interactions of proteins play key functional roles in many important cellular processes. To understand molecular mechanisms of such functions, it is crucial to determine the structure of protein complexes. To complement experimental approaches, which usually take a considerable amount of time and resources, various computational methods ha...
An increasing number of density maps of macromolecular structures, including proteins and DNA/RNA complexes, have been determined by cryo-electron microscopy (cryo-EM). Although lately maps at a near-atomic resolution are routinely reported, there are still substantial fractions of maps determined at intermediate or low resolutions, where extractin...
Representation learning has developed significantly with the advance of contrastive learning methods. Most of those methods have benefited from various data augmentations that are carefully designed to maintain their identities so that the images transformed from the same instance can still be retrieved. However, those carefully designed tra...
Physical interactions of proteins play key roles in many important cellular processes. Therefore, it is crucial to determine the structure of protein complexes to understand molecular mechanisms of interactions. To complement experimental approaches, which usually take a considerable amount of time and resources, various computational methods have...
Deep neural networks have been successfully applied to many real-world applications. However, such successes rely heavily on large amounts of labeled data that is expensive to obtain. Recently, many methods for semi-supervised learning have been proposed and achieved excellent performance. In this study, we propose a new EnAET framework to further...
Contrastive learning relies on constructing a collection of negative examples that are sufficiently hard to discriminate against positive queries when their representations are self-trained. Existing contrastive learning methods either maintain a queue of negative samples over minibatches while only a small portion of them are updated in an iterati...
Transformation equivariant representations (TERs) aim to capture the intrinsic visual structures that equivary to various transformations by expanding the notion of translation equivariance underlying the success of convolutional neural networks (CNNs). For this purpose, we present both deterministic AutoEncoding Transformations (AET) and probabi...
An increasing number of density maps of macromolecular structures, including proteins and protein and DNA/RNA complexes, have been determined by cryo-electron microscopy (cryo-EM). Although lately maps at a near-atomic resolution are routinely reported, there are still substantial fractions of maps determined at intermediate or low resolutions, whe...
Cryo-electron tomography (cryo-ET) is an imaging technique that allows us to three-dimensionally visualize both the structural details of macro-molecular assemblies under near-native conditions and their cellular context. Electrons strongly interact with biological samples, limiting the electron dose. The latter limits the signal-to-noise ratio and hence...
Motivation: Many important cellular processes involve physical interactions of proteins. Therefore, determining protein quaternary structures provides critical insights for understanding molecular mechanisms of functions of the complexes. To complement experimental methods, many computational methods have been developed to predict structures of pr...
In this paper, we propose a Computerized Adaptive Testing (CAT) method based on deep learning, which is improved in several aspects. First, the training samples used for Model-GRU are generated by Monte Carlo simulation, as a data-driven method. Second, compared with time-consuming conventional methods, the proposed deep learning-based methods can greatly r...
The new IT (Intelligent Technology) era calls for a new generation of lifelong-learning talents with scientific literacy, humanistic literacy, and sound personality. The development of intelligent technology has changed the organization of knowledge, the interactions among the new generation of learners, and the ways of learning and living, and even social organization and s...
In this paper, we propose a Computerized Adaptive Testing (CAT) method based on deep learning, which is improved in several aspects. First, the training samples used for Model-GRU are generated by Monte Carlo simulation, as a data-driven method. Second, compared with time-consuming conventional methods, the proposed deep learning-based methods can greatly r...
In this paper, we propose a Computerized Adaptive Testing (CAT) method based on deep learning, which is improved in several aspects. First, the training samples used for Model-GRU are generated by Monte Carlo simulation, as a data-driven method. Second, compared with time-consuming conventional methods, the proposed deep learning-based methods can greatly r...
The new IT (Intelligent Technology) era calls for a new generation of lifelong-learning talents with scientific literacy, humanistic literacy, and sound personality. The development of intelligent technology has changed the organization of knowledge, the interactions among the new generation of learners, and the ways of learning and living, and even social organization and s...
In this paper, considering the limited number of effective methods in the E-learning area, we propose a recommendation framework for E-learning based on deep learning. Our model is based on deep learning, which has a strong capability to learn from the training set. It offers some improvements over previous methods. First, it is based on the conventional K-Nearest Neig...
In this paper, considering the limited number of effective methods in the E-learning area, we propose a recommendation framework for E-learning based on deep learning. Our model is based on deep learning, which has a strong capability to learn from the training set. It offers some improvements over previous methods. First, it is based on the conventional K-Nearest Neig...