About
105
Publications
32,878
Reads
3,223
Citations
Introduction
Current institution
Additional affiliations
June 2017 - present
September 2010 - December 2016
Education
September 2010 - December 2016
September 2006 - June 2010
Publications (105)
Summary: Creating Circos plots is one of the most efficient approaches to visualizing genomic data. However, installing and using existing tools to make Circos plots is challenging for users lacking coding experience. To address this issue, we developed an R/Shiny application, shinyCircos, a graphical user interface for interactive creati...
Cold damage poses a significant challenge to the cultivation of soft-seeded pomegranate varieties, hindering the growth of the pomegranate industry. The genetic basis of cold tolerance in pomegranates has remained elusive, largely due to the lack of high-quality genome assemblies for cold-tolerant varieties and comprehensive population-scale genomi...
Non-canonical peptides (NCPs) are a class of peptides generated from regions previously thought of as non-coding, such as introns, 5' untranslated regions (UTRs), 3' UTRs, and intergenic regions. In recent years, the significance and diverse functions of NCPs have come to light, yet a systematic and comprehensive NCP database remains absent. Here,...
We previously developed shinyCircos, an interactive web application for creating Circos diagrams, which has been widely recognized for its graphical user interface and ease of use. Here, we introduce shinyCircos-V2.0, an upgraded version of shinyCircos that includes a new user interface with enhanced usability and many new features for creating adv...
Development of interactive web applications to deposit, visualize and analyze biological datasets is a major subject of bioinformatics. R is a programming language for data science, which is also one of the most popular languages used in biological data analysis and bioinformatics. However, building interactive web applications was a great challeng...
The cell wall is a crucial feature that allows ancestral streptophyte green algae to colonize land. Expansin, an extracellular protein that mediates cell wall loosening in a pH-dependent manner, could be a powerful tool for studying cell wall evolution. However, the evolutionary trajectory of the expansin family remains largely unknown. Here, we co...
Powdery mildew is a devastating disease that affects wheat yield and quality. Wheat wild relatives represent valuable sources of disease resistance genes. Cloning and characterization of these genes will facilitate their incorporation into wheat breeding programs. Here, we report the cloning of Pm57, a wheat powdery mildew resistance gene from Aegi...
Sugar Will Eventually be Exported Transporter (SWEET) proteins are highly conserved in various organisms and play crucial roles in sugar transport processes. However, SWEET proteins in peanuts, an essential leguminous crop worldwide, remain lacking in systematic characterization. Here, we identified 94 SWEET genes encoding the conservative MtN3/sal...
Inositolphosphorylceramide synthase (IPCS) catalyses ceramides and phosphatidylinositol (PI) into inositolphosphorylceramide (IPC), which is involved in the regulation of plant growth and development. A total of three OsIPCS family genes have been identified in rice. However, most of their functions remain unknown. Here, the functions of OsIPCSs we...
Fusarium verticillioides (F. verticillioides) is a widely distributed phytopathogen that incites multiple destructive diseases in maize, posing a grave threat to corn yields and quality worldwide. However, there are few reports of resistance genes to F. verticillioides. Here, we reveal that a combination of two single nucleotide polymorphisms (SNPs...
Soybean (Glycine max (L.) Merr.) is a globally significant crop, widely cultivated for oilseed production and animal feeds. In recent years, the rapid growth of multi-omics data from thousands of soybean accessions has provided unprecedented opportunities for researchers to explore genomes, genetic variations, and gene functions. To facilitate the...
Powdery mildew is a devastating disease that affects wheat yield and quality. Despite wheat wild relatives being a valuable source of resistance genes, their incorporation into wheat improvement is constrained by their adverse effects on agronomic traits, difficulty in isolating genes and poor understanding of the resistance mechanisms. Here, we re...
Chromosome evolution drives species evolution, speciation, and adaptive radiation. Accurate genome assembly is crucial to understanding chromosome evolution of species, such as dikaryotic fungi. Rust fungi (Pucciniales) in dikaryons represent the largest group of plant pathogens, but the evolutionary process of adaptive radiation in Pucciniales rem...
Hybrid maize displays superior heterosis and contributes over 30% of total worldwide cereal production. However, the molecular mechanisms of heterosis remain obscure. Here we show that structural variants (SVs) between the parental lines have a predominant role underpinning maize heterosis. De novo assembly and analyses of 12 maize founder inbred l...
As a major food crop and model organism, rice has been mostly studied with the largest number of functionally characterized genes among all crops. We previously built the funRiceGenes database including ~ 2800 functionally characterized rice genes and ~ 5000 members of different gene families. Since being published, the funRiceGenes database has be...
Resistant starch (RS) is a special group of starches which are slowly degraded and rarely digested in the gastrointestinal tract. It was recognized as a new type of dietary fiber that improved cardiovascular, cerebrovascular, and intestinal health. Breeding high-RS-content wheat is one of the most efficient and convenient approaches for providing a...
Dynamic alteration of the epitranscriptome exerts regulatory effects on the lifecycle of oncogenic viruses in vitro. However, little is known about these effects in vivo because of the general lack of suitable animal infection models of these viruses. Using a model of rapid‐onset Marek's disease lymphoma in chickens, we investigated changes in vira...
Background
Numerous studies have shown that gluten aggregation properties directly affect the processing quality of wheat; however, the genetic basis of gluten aggregation properties has rarely been reported.
Results
To explore the genetic basis of gluten aggregation properties in wheat, an association population consisting of 207 wheat genotypes wer...
Heterosis refers to the superior performance of hybrids over their parents, which is a general phenomenon occurring in diverse organisms. Many commercial hybrids produce high yield without delayed flowering, which we refer to as optimal heterosis that is desired in hybrid breeding. Here, we attempted to illustrate the genomic basis of optimal heter...
Gliadins are a group of grain storage proteins that confer extensibility/viscosity to the dough and are vital to end-use quality in wheat. Moreover, gliadins are one of the important components for nutritional quality because they contain the nutritionally unfavorable epitopes that cause chronic immune-mediated intestinal disorder in genetically susc...
As a major food crop and model organism, rice has been mostly studied with the largest number of functionally characterized genes among all crops. We previously built the funRiceGenes database including ∼2800 functionally characterized rice genes and ∼5000 members of different gene families. Since being published, the funRiceGenes database has been...
Small RNAs (sRNAs) constitute a large portion of functional elements in eukaryotic genomes. Long inverted repeats (LIRs) can be transcribed into long hairpin RNAs (hpRNAs), which can further be processed into small interfering RNAs (siRNAs) with vital biological roles. In this study, we systematically identified a total of 6 619 473 LIRs in 424 euk...
Normal microsporogenesis is determined by both nuclear and mitochondrial genes. In maize C-type cytoplasmic male sterility, it is unclear how the development of meiocytes and microspores is affected by the mitochondrial sterility gene and the nuclear restorer gene. In this study, we sequenced the transcriptomes of single meiocytes (tetrad stage) an...
Introduction
Gliadins are the major components of gluten proteins, with vital roles in the properties of end-use wheat products and the health-related quality of wheat. However, the function and regulatory mechanisms of γ-gliadin genes remain unclear.
Objectives
Dissect the effect of DNA methylation in the promoter of γ-gliadin gene on its expression level a...
In plants, the cell fates of a vegetative cell (VC) and generative cell (GC) are determined after the asymmetric division of the haploid microspore. The VC exits the cell cycle and grows a pollen tube, while the GC undergoes further mitosis to produce two sperm cells for double fertilization. However, our understanding of the mechanisms underlying...
Background
Whole‐exome sequencing (WES) can expedite research on genetic variation in non‐human primate (NHP) models of human diseases. However, NHP‐specific reagents for exome capture are not available. This study reports the use of human‐specific capture reagents in WES for olive baboons, marmosets, and vervet monkeys.
Methods
Exome capture was...
We have previously demonstrated that General Control Non-derepressible 1 (AtGCN1) is essential for translation inhibition under cold stress through interacting with GCN2 to phosphorylate eukaryotic translation initiation factor 2 (eIF2). Here, we report that the flowering time of the atgcn1 mutant is later than that of the wild type (WT), and some sil...
Bread wheat is one of the most important crops worldwide, supplying approximately one-fifth of the daily protein and the calories for human consumption. Gluten aggregation properties play important roles in determining the processing quality of wheat (Triticum aestivum L.) products. Nevertheless, the genetic basis of gluten aggregation properties h...
Xu Wang, Yajie Wang, Wen Yao, [...], Na Liu
In the present study, the complete mitogenome of Clavaria fumosa was sequenced, assembled, and compared. The complete mitogenome of C. fumosa is 256,807 bp in length and is the largest among all Basidiomycota mitogenomes reported. Comparative mitogenomic analysis indicated that the C. fumosa mitogenome contained the most introns and in...
We previously conducted a QTL analysis of small RNA (sRNA) abundance in flag leaves of an immortalized rice F2 (IMF2) population by aligning sRNA reads to the reference genome to quantify the expression levels of sRNAs. However, this approach missed about half of the sRNAs as only 50% of all sRNA reads could be uniquely aligned to the reference gen...
The grain size and shape of rice are limited by the growth of the spikelet hulls, and are important selective targets during domestication and breeding. In this study, we identified a glycine- and proline-rich protein (OsGPRP3), which belongs to a conserved family rarely studied. We found that OsGPRP3 was highly expressed in the seed at 10 days afte...
Fusarium ear rot (FER) caused by Fusarium verticillioides is one of the most common diseases affecting maize production worldwide. FER results in severe yield losses and grain contamination with health-threatening mycotoxins. Although most studies to date have focused on comprehensive analysis of gene regulation in maize during defense responses ag...
Background:
Next generation sequencing (NGS) has been widely used in biological research, due to its rapid decrease in cost and increasing ability to generate data. However, while the sequence generation step has seen many improvements over time, the library preparation step has not, resulting in low-efficiency library preparation methods, especia...
Identification of structural variations between individuals is very important for the understanding of phenotype variations and diseases. Despite the existence of dozens of programs for prediction of structural variations, none of them is the gold standard in this field and the results of multiple programs are usually integrated to get more reli...
Statement
The authors have withdrawn our manuscript whilst we perform additional experiments to test some of our conclusions further. Therefore, the authors do not wish this work to be cited as reference for the project. If you have any questions, please contact the corresponding author.
Non-circular plots of whole genomes are natural representations of genomic data aligned along all chromosomes. Currently, there is no specialized graphical user interface (GUI) designed to produce non-circular whole genome diagrams, and the use of existing tools requires considerable coding effort from users. Moreover, such tools also require impro...
Complete and highly accurate reference genomes and gene annotations are indispensable for basic biological research and trait improvement of woody tree species. In this study, we integrated single‐molecule sequencing and high‐throughput chromosome conformation capture techniques to produce a high‐quality and long‐range contiguity chromosome‐scale g...
With the rapid decrease in sequencing cost, a large volume of genotype data has been generated in many organisms based on high-throughput sequencing, and has been utilized in various fields of biological studies in the post-genome era. The raw sequencing data are usually deposited in the NCBI SRA database. Construction of the database to store and an...
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translate...
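We proposed a method for storing high-throughput genotype data as sparse matrices (Yao et al., 2019). Compared with traditional methods, sparse matrices consume fewer computing resources: they occupy less memory, need less disk storage space, and allow faster data access. Based on the sparse matrix method, we built the ECOGEMS database from the genotype data of 2058 rice accessions at 8,584,244 SNP sites, which occupies only 310 MB of disk space. In addition, we built graphical interfaces in the ECOGEMS database for the tools commonly needed in genotype data analysis. The sparse matrix method is suitable not only for data containing only homozygous genotypes but also for data containing heterozygous genotypes. Using the sparse matrix method, we built MaizeSNPDB, a genotype database of 1210 maize accessions at 35,370,939 SNP sites.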
Low temperature is an environmental stress factor that has always been applied in research on improving the growth, productivity, and quality of crops. Polyunsaturated fatty acids (PUFAs) play an important role in cold tolerance, so genetic manipulation of the PUFA contents in crops has led to the modification of cold sensitivity. In this study...
Precise mapping of quantitative trait loci (QTLs) is critical for assessing the genetic effects and identification of candidate genes for quantitative traits. Interval and composite interval mappings have been the methods of choice for several decades, which have provided tools for identifying genomic regions harboring causal genes for quantitative...
Summary
We proposed to store large-scale genotype data as integer sparse matrices, which consumed much fewer computing resources for storage and analysis than traditional approaches. In addition, the raw genotype data could be readily recovered from integer sparse matrices. Utilizing this approach, we stored the genotype data of 1612 Asian cultivat...
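The idea in the summary above (store only the genotype calls that deviate from the common case, and recover the raw calls on demand) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the integer codes, the dictionary-of-keys layout, and the names `encode` and `decode` are all hypothetical, and the toy data are made up.

```python
from typing import Dict, List, Tuple

# Hypothetical integer codes, chosen for illustration only:
# 0 = homozygous for the major allele (implicit, never stored)
# 1 = homozygous for the minor allele, 2 = heterozygous, 3 = missing call
MINOR_HOM, HET, MISSING = 1, 2, 3

# Sparse storage: (accession index, SNP index) -> nonzero integer code
SparseGenotypes = Dict[Tuple[int, int], int]


def encode(calls: List[List[str]], major: List[str], minor: List[str]) -> SparseGenotypes:
    """Keep only the cells that differ from the major-allele homozygote."""
    sparse: SparseGenotypes = {}
    for i, row in enumerate(calls):
        for j, call in enumerate(row):
            if call == major[j] * 2:
                continue  # the most common case costs no storage
            if call == minor[j] * 2:
                sparse[(i, j)] = MINOR_HOM
            elif set(call) == {major[j], minor[j]}:
                sparse[(i, j)] = HET
            else:
                sparse[(i, j)] = MISSING
    return sparse


def decode(sparse: SparseGenotypes, n_accessions: int,
           major: List[str], minor: List[str]) -> List[List[str]]:
    """Rebuild the dense genotype table from the sparse codes."""
    # Start from the implicit default: every accession carries the major allele.
    rows = [[allele * 2 for allele in major] for _ in range(n_accessions)]
    for (i, j), code in sparse.items():
        if code == MINOR_HOM:
            rows[i][j] = minor[j] * 2
        elif code == HET:
            # Heterozygotes are restored in major+minor order.
            rows[i][j] = major[j] + minor[j]
        else:
            rows[i][j] = "NN"  # missing calls are restored as "NN"
    return rows


# Toy data: 3 accessions x 3 SNP sites (made up for this sketch).
major = ["A", "C", "T"]
minor = ["G", "T", "G"]
calls = [
    ["AA", "CC", "TT"],
    ["AA", "CT", "TT"],
    ["GG", "CC", "NN"],
]

sparse = encode(calls, major, minor)
restored = decode(sparse, len(calls), major, minor)
```

In this toy table only three of the nine cells are stored, and `restored` equals `calls`. The saving grows with the share of accessions that carry the major allele at each site, which is the property the sparse approach relies on.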
Nuclear factor-Y (NF-Y) transcription factors are important regulators of several essential biological processes, including embryogenesis, drought resistance, meristem maintenance and photoperiod-dependent flowering in Arabidopsis. However, the regulatory mechanisms of NF-Ys in maize (Zea mays) are not well understood yet. In this study, we identif...
Background: As a main staple food, rice is also a model plant for functional genomic studies of monocots. Decoding of every DNA element of the rice genome is essential for genetic improvement to address increasing food demands. The past 15 years have witnessed extraordinary advances in rice functional genomics. Systematic characterization and prope...
Over the past 30 years we have performed many fundamental studies on two indica (one of two Asian cultivated rice subspecies) varieties Zhenshan 97 (ZS97) and Minghui 63 (MH63) which represent the two major varietal groups of the indica subspecies and are the parents of an elite Chinese hybrid. However, the lack of their high-quality reference genome...
Tiller angle is one of the most important components of the ideal plant architecture that can greatly enhance rice grain yield. Understanding the genetic basis of tiller angle and mining favorable alleles will be helpful for breeding new plant-type varieties. Here, we performed genome-wide association studies (GWAS) to identify genes controlling ti...
Genome-wide association study for tiller angle in the whole population, the indica and the japonica subpopulations by LR.
Manhattan plots and quantile-quantile plot for tiller angle in the full population (a), indica subpopulation (b) and japonica subpopulation (c). The horizontal dashed lines of the Manhattan plots indicate the significance thresh...
Significant signals for tiller angle detected only in Wuhan using the LMM and LR methods.
(DOC)
Co-localization of associated sites with the previously detected tiller angle-related QTLs in rice.
(DOC)
Genome-wide association study for tiller angle in the whole population, the indica and the japonica subpopulations by LMM.
Manhattan plots and quantile-quantile plots for tiller angle in the full population (a), indica subpopulation (b) and japonica subpopulation (c). The horizontal dashed lines of the Manhattan plots indicate the significance thre...
Expression profiles of Os03g51660 (TAC3) and Os03g51670 and the sequence of TAC3 cDNA.
(a) Genotyping of the 4A-02006 mutant. W, the wild type; H, heterozygote; M, homozygote. (b) qRT-PCR expression analysis of Os03g51670 in wild type DJ, heterozygote (4A-02006 H) and homozygote (4A-02006 M) mutant using the leaves of tillering stage; the number of...
Significant signals for tiller angle detected only in Hainan using the LMM and LR methods.
(DOC)
The tiller angle of 529 O. sativa in Hainan and Wuhan.
(XLS)
Awn is one of the most important domesticated traits in rice (Oryza sativa). Understanding the genetic basis of awn length is important for grain harvest and production because long awn length is disadvantageous for both grain harvest and milling. We investigated the awn length of 529 rice cultivars and performed genome-wide association studies (G...
Significance
Indica rice accounts for >70% of total rice production worldwide, is genetically highly diverse, and can be divided into two major varietal groups independently bred and widely cultivated in China and Southeast Asia. Here, we generated high-quality genome sequences for two elite rice varieties, Zhenshan 97 and Minghui 63, representing...
Heterotrimeric Heme Activator Protein (HAP) family genes are involved in the regulation of flowering in plants. It is not clear how many HAP genes regulate heading date in rice. In this study, we identified 35 HAP genes, including seven newly identified genes, and performed gene duplication and candidate gene-based association analyses. Analyses sh...
Mitogen-activated protein (MAP) kinase cascades, with each cascade consisting of a MAP kinase kinase kinase (MAPKKK), a MAP kinase kinase (MAPKK), and a MAP kinase (MAPK), play important roles in dicot plant responses to pathogen infection. However, no single MAP kinase cascade has been identified in rice, and the functions of MAP kinase cascades i...
Background:
The dispensable genome of a species, consisting of the dispensable sequences present only in a subset of individuals, is believed to play important roles in phenotypic variation and genome evolution. However, construction of the dispensable genome is costly and labor-intensive at present, and so the influence of the dispensable genome...
Significance
Intensive rice breeding over the past 50 y has produced many high-performing cultivars, but our knowledge of the genomic changes associated with such improvement remains limited. By analyzing sequences of 1,479 rice accessions, this study identified genomic changes associated with breeding efforts, referred to as breeding signatures, i...
Rice is a staple food crop and an ideal model for functional genome research of monocots. Utilization of heterosis has contributed tremendously to the increased productivity of rice. We report the next generation sequencing of Zhenshan 97 and Minghui 63, which are the parents of an elite rice hybrid Shanyou 63. Both Zhenshan 97 and Minghui 63 were s...
Jia Wang, Wen Yao, Dan Zhu, [...], Qifa Zhang
We performed a genetic analysis of sRNA abundance in flag leaf from an immortalized F(2) (IMF2) population in rice. We identified 53,613,739 unique sRNAs and 165,797 sRNA expression traits (s-traits). A total of 66,649 s-traits mapped 40,049 local-sQTLs and 30,809 distant-sQTLs. By defining 80,362 sRNA clusters, 22,263 sRNA cluster QTLs (scQTLs) we...
CCT domain-containing genes generally control flowering in plants. Currently, only six of the 41 CCT family genes have been confirmed to control flowering in rice. To efficiently identify more heading date-related genes from the CCT family, we compared the positions of heading date QTLs and CCT genes and found that 25 CCT family genes were located...
Rice Variation Map (RiceVarMap, http://ricevarmap.ncpgr.cn) is a database of rice genomic variations. The database provides comprehensive information of 6 551 358 single nucleotide polymorphisms (SNPs) and 1 214 627 insertions/deletions (INDELs) identified from sequencing data of 1479 rice accessions. The SNP genotypes of all accessions were imputed...
Grains from cereals contribute an important source of protein to human food, and grain protein content (GPC) is an important determinant of nutritional quality in cereals. Here we show that the quantitative trait locus (QTL) qPC1 in rice controls GPC by regulating the synthesis and accumulation of glutelins, prolamins, globulins, albumins and starc...