Hao Lin

University of Electronic Science and Technology of China | UESTC · School of Life Science and Technology

PhD.

About

Publications: 221
Reads: 37,140
Citations: 13,259
Additional affiliations
July 2018 - March 2020
University of Electronic Science and Technology of China
Position
  • Professor
August 2016 - July 2018
University of Electronic Science and Technology of China
Position
  • Senior Researcher
August 2009 - July 2016
University of Electronic Science and Technology of China
Position
  • Professor (Associate)
Education
September 2002 - July 2007
Inner Mongolia University
Field of study
  • Biophysics
September 1998 - July 2002
Inner Mongolia University
Field of study
  • Physics

Publications (221)
Article
The spatial distribution pattern of long non-coding RNAs (lncRNAs) in the cell is closely related to their function. With the growth of publicly available subcellular location data, a number of computational methods have been developed to recognize the subcellular localization of lncRNAs. Unfortunately, these computational methods suffer from...
Article
Ice plant (Mesembryanthemum crystallinum), a member of the Aizoaceae family, is a typical halophyte crop and a model plant for studying the mechanism of transition from C3 photosynthesis to crassulacean acid metabolism (CAM). Here, we report a high‐quality chromosome‐level ice plant genome sequence. This 98.05% genome sequence is anchored to nine c...
Article
The Brassicaceae is an important plant family. We built a user-friendly, web-based, comparative, and functional genomic database, The Brassicaceae Genome Resource (TBGR, http://www.tbgr.org.cn), based on 82 released genomes from 27 Brassicaceae species. The TBGR database contains a large number of important functional genes, including 4,096 glucosi...
Preprint
Flowering plants (angiosperms) dominate our planet and sustain all life on Earth. However, evolutionary relationships among the angiosperm lineages that diverged early – Amborellales, Nymphaeales, Austrobaileyales and Mesangiospermae, which further comprises monocots and other four clades – have remained highly disputed likely because of their rapi...
Article
Full-text available
As a newly discovered protein posttranslational modification, lysine lactylation (Kla) plays a pivotal role in various cellular processes. High throughput mass spectrometry is the primary approach for the detection of Kla sites. However, experimental approaches for identifying Kla sites are often time‐consuming and labor‐intensive when compared to...
Article
Diabetes is a metabolic disorder caused by insufficient insulin secretion and insulin secretion disorders. From health to diabetes, there are generally three stages: health, pre-diabetes and type 2 diabetes. Early diagnosis of diabetes is the most effective way to prevent and control diabetes and its complications. In this work, we collected the ph...
Article
Full-text available
4mC is a type of DNA modification that can coordinate multiple biological processes, for example DNA replication, gene expression, and transcriptional regulation. Accurate prediction of 4mC sites can provide precise information about their genetic functions. The purpose of this study was to establish a robust deep learning model to r...
Article
Full-text available
Post-translational modification (PTM) refers to the covalent and enzymatic modification of proteins after protein biosynthesis, which orchestrates a variety of biological processes. Detecting PTM sites at proteome scale is one of the key steps toward an in-depth understanding of their regulatory mechanisms. In this study, we presented an integrated method ba...
Article
Gene expression is directly controlled by transcription factors (TFs) in a complex combination manner. It remains a challenging task to systematically infer how the cooperative binding of TFs drives gene activity. Here, we quantitatively analyzed the correlation between TFs and surveyed the TF interaction networks associated with gene expression in...
Article
Full-text available
Long noncoding RNAs (lncRNAs) are widely present in different species and play critical roles in response to abiotic stresses. However, the functions of lncRNAs in Chinese cabbage under heat stress remain unknown. Here, we first conducted a global comparative analysis of 247,242 lncRNAs among 37 species. The results indicated that lncRNAs were poor...
Article
Full-text available
Background: The curse of dimensionality ("dimension disaster") is often associated with feature extraction. The extracted features may contain redundant information, which limits computing ability and leads to overfitting. Objective: Feature selection is an important strategy for overcoming the problems caused by the curse of dimensionality. In most machine learnin...
Article
Protein-ligand interactions are necessary for the majority of protein functions. Adenosine-5’-triphosphate (ATP) is one such ligand that plays a vital role as a coenzyme in providing energy for cellular activities, catalyzing biological reactions and signaling. Knowing the ATP-binding residues of proteins is helpful for annotation of protein function and drug de...
Article
Full-text available
The pairwise interaction between transcription factors (TFs) plays an important role in enhancer-promoter loop formation. Although thousands of TFs in the human genome have been found, only a few TF pairs have been demonstrated to be related to loop formation. It is still a challenge to determine which TF pairs could be involved in the enhancer-pro...
Article
N4-methylcytosine (4mC) is a type of DNA modification that can regulate several biological processes, such as transcription regulation, replication and gene expression. Precisely recognizing 4mC sites in genomic sequences can provide specific knowledge about their genetic roles. This study aimed to develop a deep learning-based model to predi...
Article
Full-text available
DNA modification plays a pivotal role in regulating gene expression in cell development. As prevalent markers on DNA, 5-methylcytosine (5mC), N6-methyladenine (6mA), and N4-methylcytosine (4mC) can be recognized by specific methyltransferases, facilitating cellular defense and the versatile regulation of gene expression in eukaryotes and prokaryote...
Article
Full-text available
Background: SARS-CoV-2 is a newly emerged coronavirus that causes a severe type of pneumonia in the host organism. There is an urgent need to find inhibitors of SARS-CoV-2. Drug repurposing is therefore an effective strategy for finding inhibitors of SARS-CoV-2 proteins to treat this pneumonia. Method: For this purpose, a library of 25...
Article
Full-text available
Cyclin proteins regulate the cell cycle by forming complexes with cyclin-dependent kinases, which activates the cell cycle. Correct recognition of cyclin proteins could provide key clues for studying their functions. However, their sequences share low similarity, which results in poor prediction by sequence similarity-based methods. Thus, it...
Article
Full-text available
The rapid spread of SARS-CoV-2 infection around the globe has caused a massive health and socioeconomic crisis. Identification of phosphorylation sites is an important step for understanding the molecular mechanisms of SARS-CoV-2 infection and the changes within host cell pathways. In this study, we present DeepIPs, a first specific deep-learn...
Article
A DNase I hypersensitive site (DHS) is a region of chromatin that is hypersensitive to the DNase I enzyme. It is an important part of the noncoding region and contains a variety of regulatory elements, such as promoters, enhancers, and transcription factor-binding sites. Moreover, disease- (or trait-) related loci are usually enriched in th...
Article
Diabetes is a global epidemic. Long-term exposure to hyperglycemia can cause chronic damage to various tissues. Thus, early diagnosis of diabetes is crucial. In this study, we designed a computational system to predict diabetes risk by fusing multifarious types of physical examination data. We collected 1,507,563 physical examination data of health...
Article
Full-text available
Three-dimensional (3D) architecture of the chromosomes is of crucial importance for transcription regulation and DNA replication. Various high-throughput chromosome conformation capture-based methods have revealed that CTCF-mediated chromatin loops are a major component of 3D architecture. However, CTCF-mediated chromatin loops are cell type specif...
Article
The promoter is a key region in transcription regulation. A eukaryotic promoter database called EPD has been constructed to store eukaryotic POL II promoters. Although there are some promoter databases for specific prokaryotic species or specific promoter types, such as RegulonDB for Escherichia coli K-12, DBTBS for Bacillus subtilis a...
Article
Full-text available
Allergens have the ability to enter the body and cause illness. Leukotriene is the widespread allergen which could stimulate mast cells to discharge histamine which causes allergy symptoms. An effective strategy for treating leukotriene-induced allergy is to find the inhibitors of leukotriene or histamine activity from phytochemicals. For this purp...
Article
Full-text available
Chronic myelogenous leukemia (CML) is a type of cancer with a series of characteristics that make it particularly suitable for observations on leukemogenesis. Research has shown that the occurrence and progression of CML are associated with dynamic alterations of histone modification (HM) patterns. In this study, we analyze the distributio...
Article
Full-text available
Simple sequence repeats (SSRs) are popular and important molecular markers that exist widely in plants. Here, we conducted a comprehensive identification and comparative analysis of SSRs in 14 tree species. A total of 16,298 SSRs were identified from 429,449 genes, and primers were successfully designed for 99.44% of the identified SSRs. Our anal...
Article
Full-text available
Apiaceae is one of the most important families in Apiales and includes many economically important vegetables and medicinal plants. The TEOSINTE BRANCHED 1/CYCLOIDEA/PROLIFERATING CELL FACTOR 1/2 (TCP) gene family plays an important role in regulating plant growth and development, but it has not been widely studied in Apiaceae. In the present study...
Article
Full-text available
The protein Yin Yang 1 (YY1) could form dimers that facilitate the interaction between active enhancers and promoter-proximal elements. YY1-mediated enhancer–promoter interaction is the general feature of mammalian gene control. Recently, some computational methods have been developed to characterize the interactions between DNA elements by elucida...
Article
Full-text available
As a newly discovered protein posttranslational modification, histone lysine crotonylation (Kcr) is involved in cellular regulation and human diseases. Various proteomics technologies have been developed to detect Kcr sites. However, experimental approaches for identifying Kcr sites are often time-consuming and labor-intensive, which is difficult to w...
Article
Full-text available
Celery (Apium graveolens L. 2n = 2x = 22), a member of the Apiaceae family, is among the most important and globally grown vegetables. Here, we report a high‐quality genome sequence assembly, anchored to 11 chromosomes, with total length of 3.33 Gb and N50 scaffold length of 289.78 Mb. Most (92.91%) of the genome is composed of repetitive sequences...
Article
Motivation: Protein carbonylation is one of the most important oxidative stress-induced post-translational modifications (PTMs), and is generally characterized by stability, irreversibility and relatively early formation. It plays a significant role in orchestrating various biological processes and has already been demonstrated to be related to many...
Article
Full-text available
Transcription factors play key roles in cell-fate decisions by regulating 3D genome conformation and gene expression. The traditional view is that methylation of DNA hinders transcription factors binding to them, but recent research has shown that many transcription factors prefer to bind to methylated DNA. Therefore, identifying such transcription...
Article
Full-text available
N6-methyladenosine (m6A) is the methylation of adenosine at the nitrogen-6 position. It is the most abundant RNA methylation modification and is involved in a series of important biological processes. Accurate genome-wide identification of m6A sites is invaluable for better understanding their biological functions. In this work, an ensemble pre...
Article
Full-text available
5hmC, 6mA and 4mC are three common DNA modifications and are involved in various biological processes. Accurate genome-wide identification of these sites is invaluable for better understanding their biological functions. Due to the labor-intensive and expensive nature of experimental methods, it is urgent to develop computational methods for the geno...
Article
Full-text available
Hepatocellular carcinoma (HCC) is a serious cancer that ranks fourth in cancer-related deaths worldwide. Hence, more accurate diagnostic models are urgently needed to aid early HCC diagnosis in clinical scenarios and thus improve HCC treatment and survival. Several conventional methods have been used for discriminating HCC from cirrhosi...
Article
Motivation: DNA N4-methylcytosine (4mC) is a crucial epigenetic modification. However, knowledge about its biological functions is limited. Effective and accurate identification of 4mC sites will help to reveal its biological functions and mechanisms. Since experimental methods are costly and ineffective, a number of machine learning based...
Article
Full-text available
The locations of the initiation of genomic DNA replication are defined as origins of replication sites (ORIs), which regulate the onset of DNA replication and play significant roles in the DNA replication process. The study of ORIs is essential for understanding the cell-division cycle and gene expression regulation. Accurate identification of ORIs...
Article
Knowledge of the sub-cellular localization of long non-coding RNAs (lncRNAs), the most diverse class of transcribed RNA, will help us to identify different types of cancers and other diseases, as lncRNAs play a key role in related cellular functions. In recent days with the exponential growth of known records, it becomes essential to establish new mach...
Article
Full-text available
Messenger RNAs (mRNAs) carry the special responsibility of transmitting the genetic code from DNA to discrete locations in the cytoplasm. The localization process of mRNA might provide spatial and temporal regulation of mRNA and protein functions. In situ hybridization and quantitative transcriptomics analysis could provide detailed information about mRNA su...
Article
Full-text available
As one of the most popular post-transcriptional modifications, pseudouridine (Ψ) participates in a series of biological processes. Therefore, the efficient detection of pseudouridine sites is very important in revealing its functions in biological processes. Although experimental techniques have been proposed for identifying Ψ sites at single-base...
Article
Full-text available
5hmC, 6mA, and 4mC are three common DNA modifications and are involved in various biological processes. Accurate genome-wide identification of these sites is invaluable for better understanding their biological functions. Owing to the labor-intensive and expensive nature of experimental methods, it is urgent to develop computational methods for...
Article
Full-text available
Bioluminescent proteins (BLPs) are widely distributed in many living organisms and play a key role in light emission during bioluminescence. Bioluminescence serves various functions for living creatures, such as finding food and protecting themselves. With the routine biotechnological application of bioluminescence, it is recognized to be essential...
Article
Full-text available
Meiotic recombination is one of the most important driving forces of biological evolution, which is initiated by double-strand DNA breaks. Recombination has important roles in genome diversity and evolution. This review firstly provides a comprehensive survey of the 15 computational methods developed for identifying recombination hotspots in Saccha...
Article
Mycobacterium tuberculosis (MTB) causes tuberculosis (TB), which is reported as one of the most dreadful epidemics. Although many biochemical molecular drugs have been developed to cope with this disease, drug resistance, especially multidrug-resistant (MDR) and extensively drug-resistant (XDR) forms, poses a huge threat to the trea...
Article
Many efforts have been made in developing bioinformatics algorithms to predict functional attributes of genes and proteins from their primary sequences. One challenge in this process is to intuitively analyze and to understand the statistical features that have been selected by heuristic or iterative methods. In this paper, we developed VisFeature,...
Article
Full-text available
Motivation: Numerous experimental and computational studies in the biomedical literature have provided considerable amounts of data on diverse RNA-RNA interactions (RRIs). However, few text mining systems for RRIs information extraction are available. Results: RNA Interactome Scoper (RIscoper) represents the first tool for full-scale RNA interac...
Article
Motivation: DNA N6-methyladenine (6mA) is associated with a wide range of biological processes. Since the distribution of 6mA sites in the genome is non-random, accurate identification of 6mA sites is crucial for understanding its biological functions. Although experimental methods have been proposed in this regard, they are still cost-ineffective...
Article
Full-text available
As an essential post-transcriptional modification, N7-methylguanosine (m7G) regulates nearly every step of the life cycle of mRNA. Accurate identification of the m7G site in the transcriptome will provide insights into its biological functions and mechanisms. Although the m7G-methylated RNA immunoprecipitation sequencing (MeRIP-seq) method has been...
Article
Full-text available
RNA N2-methylguanosine (m2G) is one kind of posttranscriptional modification and plays crucial roles in the control and stabilization of tRNA. However, our knowledge about the biological functions of m2G is still limited. The key step of revealing its new function is to recognize the m2G sites in the transcriptome. Since there is no effective metho...
Article
Full-text available
DNA N6-methyladenine (6mA) is a prevalent kind of DNA modification and is involved in various biological processes. Accurate genome-wide identification of 6mA sites is invaluable for better understanding its biological functions. Due to the labor-intensive and expensive nature of experimental methods for 6mA detection in eukaryotic genomes, it is ur...
Article
Full-text available
5-Methylcytosine (m5C) plays an extremely important role in basic biochemical processes. Despite the great increase in identified m5C sites in a wide variety of organisms, their epigenetic roles remain largely unknown. Hence, accurate identification of m5C sites is a key step in understanding their biological functions. Over the past several years, mor...
Article
Full-text available
A promoter is a fundamental DNA element located around the transcription start site (TSS) that can regulate gene transcription. Promoter recognition is of great significance in determining transcription units, studying gene structure, analyzing gene regulation mechanisms, and annotating gene functional information. Many models have already been prop...
Article
Motivation: Dihydrouridine (D) is a common RNA posttranscriptional modification found in eukaryotes, bacteria and a few archaea. The modification can promote the conformational flexibility of individual nucleotide bases, and its levels are increased in cancerous tissues. Therefore, it is necessary to detect D in RNA for further understanding its f...
Article
The soluble carrier hormone binding protein (HBP) plays an important role in the growth of humans and other animals. HBP can also selectively and non-covalently interact with hormones. Therefore, accurate identification of HBP is an important prerequisite for understanding its biological functions and molecular mechanisms. Since experimental methods...
Article
Full-text available
Histone modifications are associated with alternative splicing. It has been suggested that histone modifications act in combinatorial patterns in gene expression regulation. However, how they interact with each other and what their causal relationships are in the process of RNA splicing remain unclear. In this study, the combinatorial patterns of 38...