About
97 Publications
28,702 Reads
2,487 Citations
Introduction
Additional affiliations
September 2010 - December 2016
Education
September 2010 - December 2016
September 2006 - June 2010
Publications (97)
Background:
The dispensable genome of a species, consisting of the dispensable sequences present only in a subset of individuals, is believed to play important roles in phenotypic variation and genome evolution. However, construction of the dispensable genome is costly and labor-intensive at present, and so the influence of the dispensable genome...
Summary
We proposed storing large-scale genotype data as integer sparse matrices, which consume far fewer computing resources for storage and analysis than traditional approaches. In addition, the raw genotype data can be readily recovered from the integer sparse matrices. Utilizing this approach, we stored the genotype data of 1612 Asian cultivat...
Development of interactive web applications to deposit, visualize and analyze biological datasets is a major subject of bioinformatics. R is a programming language for data science, which is also one of the most popular languages used in biological data analysis and bioinformatics. However, building interactive web applications was a great challeng...
As a major food crop and model organism, rice has been studied extensively and has the largest number of functionally characterized genes among all crops. We previously built the funRiceGenes database including ~2800 functionally characterized rice genes and ~5000 members of different gene families. Since being published, the funRiceGenes database has be...
Summary: Creating Circos plots is one of the most efficient approaches to visualizing genomic data. However, the installation and use of existing tools for making Circos plots are challenging for users lacking coding experience. To address this issue, we developed an R/Shiny application shinyCircos, a graphical user interface for interactive creati...
Inositolphosphorylceramide synthase (IPCS) catalyses ceramides and phosphatidylinositol (PI) into inositolphosphorylceramide (IPC), which is involved in the regulation of plant growth and development. A total of three OsIPCS family genes have been identified in rice. However, most of their functions remain unknown. Here, the functions of OsIPCSs we...
Fusarium verticillioides (F. verticillioides) is a widely distributed phytopathogen that incites multiple destructive diseases in maize, posing a grave threat to corn yields and quality worldwide. However, there are few reports of resistance genes to F. verticillioides. Here, we reveal that a combination of two single nucleotide polymorphisms (SNPs...
Soybean (Glycine max (L.) Merr.) is a globally significant crop, widely cultivated for oilseed production and animal feeds. In recent years, the rapid growth of multi-omics data from thousands of soybean accessions has provided unprecedented opportunities for researchers to explore genomes, genetic variations, and gene functions. To facilitate the...
We previously developed shinyCircos, an interactive web application for creating Circos diagrams, which has been widely recognized for its graphical user interface and ease of use. Here, we introduce shinyCircos-V2.0, an upgraded version of shinyCircos that includes a new user interface with enhanced usability and many new features for creating adv...
Chromosome evolution drives species evolution, speciation, and adaptive radiation. Accurate genome assembly is crucial to understanding chromosome evolution of species, such as dikaryotic fungi. Rust fungi (Pucciniales) in dikaryons represent the largest group of plant pathogens, but the evolutionary process of adaptive radiation in Pucciniales rem...
Hybrid maize displays superior heterosis and contributes over 30% of total worldwide cereal production. However, the molecular mechanisms of heterosis remain obscure. Here we show that structural variants (SVs) between the parental lines have a predominant role underpinning maize heterosis. De novo assembly and analyses of 12 maize founder inbred l...
Resistant starch (RS) is a special group of starches which are slowly degraded and rarely digested in the gastrointestinal tract. It was recognized as a new type of dietary fiber that improved cardiovascular, cerebrovascular, and intestinal health. Breeding high-RS-content wheat is one of the most efficient and convenient approaches for providing a...
Dynamic alteration of the epitranscriptome exerts regulatory effects on the lifecycle of oncogenic viruses in vitro. However, little is known about these effects in vivo because of the general lack of suitable animal infection models of these viruses. Using a model of rapid‐onset Marek's disease lymphoma in chickens, we investigated changes in vira...
Background
Numerous studies have shown that gluten aggregation properties directly affect the processing quality of wheat; however, the genetic basis of gluten aggregation properties has rarely been reported.
Results
To explore the genetic basis of gluten aggregation properties in wheat, an association population consisting of 207 wheat genotypes wer...
Heterosis refers to the superior performance of hybrids over their parents, which is a general phenomenon occurring in diverse organisms. Many commercial hybrids produce high yield without delayed flowering, which we refer to as optimal heterosis that is desired in hybrid breeding. Here, we attempted to illustrate the genomic basis of optimal heter...
Gliadins are a group of grain storage proteins that confer extensibility/viscosity to the dough and are vital to end-use quality in wheat. Moreover, gliadins are one of the important components for nutritional quality because they contain the nutritionally unprofitable epitopes that cause chronic immune-mediated intestinal disorder in genetically susc...
As a major food crop and model organism, rice has been studied extensively and has the largest number of functionally characterized genes among all crops. We previously built the funRiceGenes database including ∼2800 functionally characterized rice genes and ∼5000 members of different gene families. Since being published, the funRiceGenes database has been...
Small RNAs (sRNAs) constitute a large portion of functional elements in eukaryotic genomes. Long inverted repeats (LIRs) can be transcribed into long hairpin RNAs (hpRNAs), which can further be processed into small interfering RNAs (siRNAs) with vital biological roles. In this study, we systematically identified a total of 6 619 473 LIRs in 424 euk...
Normal microsporogenesis is determined by both nuclear and mitochondrial genes. In maize C-type cytoplasmic male sterility, it is unclear how the development of meiocytes and microspores is affected by the mitochondrial sterility gene and the nuclear restorer gene. In this study, we sequenced the transcriptomes of single meiocytes (tetrad stage) an...
Introduction
Gliadins are the major components of gluten proteins with vital roles in the properties of end-use wheat products and the health-related quality of wheat. However, the function and regulation mechanisms of γ-gliadin genes remain unclear.
Objectives
Dissect the effect of DNA methylation in the promoter of γ-gliadin gene on its expression level a...
In plants, the cell fates of a vegetative cell (VC) and generative cell (GC) are determined after the asymmetric division of the haploid microspore. The VC exits the cell cycle and grows a pollen tube, while the GC undergoes further mitosis to produce two sperm cells for double fertilization. However, our understanding of the mechanisms underlying...
Background
Whole‐exome sequencing (WES) can expedite research on genetic variation in non‐human primate (NHP) models of human diseases. However, NHP‐specific reagents for exome capture are not available. This study reports the use of human‐specific capture reagents in WES for olive baboons, marmosets, and vervet monkeys.
Methods
Exome capture was...
We have previously demonstrated that General Control Non-derepressible 1 (AtGCN1) is essential for translation inhibition under cold stress through interacting with GCN2 to phosphorylate eukaryotic translation initiation factor 2 (eIF2). Here, we report that the flowering time of the atgcn1 mutant is later than that of the wild type (WT), and some sil...
Bread wheat is one of the most important crops worldwide, supplying approximately one-fifth of the daily protein and the calories for human consumption. Gluten aggregation properties play important roles in determining the processing quality of wheat (Triticum aestivum L.) products. Nevertheless, the genetic basis of gluten aggregation properties h...
Xu Wang, Yajie Wang, Wen Yao, [...], Na Liu
In the present study, the complete mitogenome of Clavaria fumosa was sequenced, assembled, and compared. The complete mitogenome of C. fumosa is 256,807 bp in length and is the largest among all Basidiomycota mitogenomes reported. Comparative mitogenomic analysis indicated that the C. fumosa mitogenome contained the most introns and in...
We previously conducted a QTL analysis of small RNA (sRNA) abundance in flag leaves of an immortalized rice F2 (IMF2) population by aligning sRNA reads to the reference genome to quantify the expression levels of sRNAs. However, this approach missed about half of the sRNAs as only 50% of all sRNA reads could be uniquely aligned to the reference gen...
The grain size and shape of rice are limited by the growth of the spikelet hulls and are important selective targets during domestication and breeding. In this study, we identified a glycine- and proline-rich protein (OsGPRP3), which belongs to a conserved family rarely studied. We found that OsGPRP3 was highly expressed in the seed at 10 days afte...
Fusarium ear rot (FER) caused by Fusarium verticillioides is one of the most common diseases affecting maize production worldwide. FER results in severe yield losses and grain contamination with health-threatening mycotoxins. Although most studies to date have focused on comprehensive analysis of gene regulation in maize during defense responses ag...
Background:
Next generation sequencing (NGS) has been widely used in biological research, due to its rapid decrease in cost and increasing ability to generate data. However, while the sequence generation step has seen many improvements over time, the library preparation step has not, resulting in low-efficiency library preparation methods, especia...
Identification of structural variations between individuals is very important for understanding phenotypic variation and disease. Despite the existence of dozens of programs for predicting structural variations, none of them is the gold standard in this field, and the results of multiple programs are usually integrated to get more reli...
Statement
The authors have withdrawn our manuscript whilst we perform additional experiments to test some of our conclusions further. Therefore, the authors do not wish this work to be cited as reference for the project. If you have any questions, please contact the corresponding author.
Non-circular plots of whole genomes are natural representations of genomic data aligned along all chromosomes. Currently, there is no specialized graphical user interface (GUI) designed to produce non-circular whole genome diagrams, and the use of existing tools requires considerable coding effort from users. Moreover, such tools also require impro...
With the rapid decrease in sequencing cost, large volumes of genotype data have been generated in many organisms by high-throughput sequencing and used in various fields of biological research in the post-genome era. The raw sequencing data are usually deposited in the NCBI SRA database. Construction of the database to store and an...
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translate...
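We proposed a method for storing high-throughput genotype data as sparse matrices (Yao et al., 2019). Compared with traditional methods, sparse matrices consume fewer computing resources: they occupy less memory, require less disk space, and allow faster data access. Using the sparse-matrix approach, we built the ECOGEMS database from the genotype data of 2058 rice accessions at 8,584,244 SNP sites, which occupies only 310 MB of disk space. In addition, ECOGEMS provides graphical interfaces for the tools commonly needed in genotype data analysis. The sparse-matrix approach applies not only to data containing only homozygous genotypes but also to data containing heterozygous genotypes. Using the same approach, we built MaizeSNPDB, a genotype database of 1210 maize accessions at 35,370,939 SNP sites.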
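The idea of storing genotypes as an integer sparse matrix can be illustrated with a minimal sketch. It is not the authors' implementation: the encoding used here (code 0 for the most common call at each SNP site, which is left unstored, and small positive integers for every other call) and the function names `encode_sparse` and `decode_sparse` are assumptions made for illustration only.

```python
from collections import Counter


def encode_sparse(genotypes):
    """Encode a SNP-by-sample genotype table as sparse integer entries.

    genotypes: list of rows, one per SNP site; each row holds one call
    (e.g. "A", "G", or "A/G" for a heterozygote) per sample.
    Assumed encoding (hypothetical): the most common call at a site gets
    code 0 and is not stored; other calls get codes 1, 2, ...
    """
    codebooks = []  # per-site list of calls, index = integer code
    entries = {}    # (site index, sample index) -> non-zero code
    for i, row in enumerate(genotypes):
        ranked = [call for call, _ in Counter(row).most_common()]
        codebooks.append(ranked)
        code = {call: k for k, call in enumerate(ranked)}
        for j, call in enumerate(row):
            if code[call] != 0:
                entries[(i, j)] = code[call]
    n_samples = len(genotypes[0]) if genotypes else 0
    return codebooks, entries, n_samples


def decode_sparse(codebooks, entries, n_samples):
    """Recover the raw genotype table from the sparse representation."""
    rows = [[book[0]] * n_samples for book in codebooks]
    for (i, j), k in entries.items():
        rows[i][j] = codebooks[i][k]
    return rows
```

Because most samples carry the most common call at most sites, only a small fraction of cells needs to be stored, and decoding restores the original table exactly. A production system would more plausibly use a compiled sparse-matrix library than a Python dictionary; the dictionary is used here only to keep the sketch self-contained.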
Complete and highly accurate reference genomes and gene annotations are indispensable for basic biological research and trait improvement of woody tree species. In this study, we integrated single‐molecule sequencing and high‐throughput chromosome conformation capture techniques to produce a high‐quality and long‐range contiguity chromosome‐scale g...
Low temperature is an environmental stress factor that has long been applied in research on improving crop growth, productivity, and quality. Polyunsaturated fatty acids (PUFAs) play an important role in cold tolerance, so genetic manipulation of PUFA contents in crops has led to the modification of cold sensitivity. In this study...
Precise mapping of quantitative trait loci (QTLs) is critical for assessing the genetic effects and identification of candidate genes for quantitative traits. Interval and composite interval mappings have been the methods of choice for several decades, which have provided tools for identifying genomic regions harboring causal genes for quantitative...
Nuclear factor-Y (NF-Y) transcription factors are important regulators of several essential biological processes, including embryogenesis, drought resistance, meristem maintenance and photoperiod-dependent flowering in Arabidopsis. However, the regulatory mechanisms of NF-Ys in maize (Zea mays) are not well understood yet. In this study, we identif...
Background: As a main staple food, rice is also a model plant for functional genomic studies of monocots. Decoding of every DNA element of the rice genome is essential for genetic improvement to address increasing food demands. The past 15 years have witnessed extraordinary advances in rice functional genomics. Systematic characterization and prope...
Over the past 30 years we have performed many fundamental studies on two indica (one of two Asian cultivated rice subspecies) varieties, Zhenshan 97 (ZS97) and Minghui 63 (MH63), which represent the two major varietal groups of the indica subspecies and are the parents of an elite Chinese hybrid. However, lacking of their high-quality reference genome...
Tiller angle is one of the most important components of the ideal plant architecture that can greatly enhance rice grain yield. Understanding the genetic basis of tiller angle and mining favorable alleles will be helpful for breeding new plant-type varieties. Here, we performed genome-wide association studies (GWAS) to identify genes controlling ti...
Genome-wide association study for tiller angle in the whole population, the indica and the japonica subpopulations by LR.
Manhattan plots and quantile-quantile plots for tiller angle in the full population (a), indica subpopulation (b) and japonica subpopulation (c). The horizontal dashed lines of the Manhattan plots indicate the significance thresh...
Significant signals for tiller angle detected only in Wuhan using the LMM and LR methods.
(DOC)
Co-localization of associated sites with the previously detected tiller angle-related QTLs in rice.
(DOC)
Genome-wide association study for tiller angle in the whole population, the indica and the japonica subpopulations by LMM.
Manhattan plots and quantile-quantile plots for tiller angle in the full population (a), indica subpopulation (b) and japonica subpopulation (c). The horizontal dashed lines of the Manhattan plots indicate the significance thre...
Expression profiles of Os03g51660 (TAC3) and Os03g51670 and the sequence of TAC3 cDNA.
(a) Genotyping of the 4A-02006 mutant. W, the wild type; H, heterozygote; M, homozygote. (b) qRT-PCR expression analysis of Os03g51670 in wild type DJ, heterozygote (4A-02006 H) and homozygote (4A-02006 M) mutant using the leaves of tillering stage; the number of...
Significant signals for tiller angle detected only in Hainan using the LMM and LR methods.
(DOC)
The tiller angle of 529 O. sativa in Hainan and Wuhan.
(XLS)
Awn is one of the most important domesticated traits in rice (Oryza sativa). Understanding the genetic basis of awn length is important for grain harvest and production because long awns are disadvantageous for both grain harvest and milling. We investigated the awn length of 529 rice cultivars and performed a genome-wide association study (G...
Significance
Indica rice accounts for >70% of total rice production worldwide, is genetically highly diverse, and can be divided into two major varietal groups independently bred and widely cultivated in China and Southeast Asia. Here, we generated high-quality genome sequences for two elite rice varieties, Zhenshan 97 and Minghui 63, representing...
Heterotrimeric Heme Activator Protein (HAP) family genes are involved in the regulation of flowering in plants. It is not clear how many HAP genes regulate heading date in rice. In this study, we identified 35 HAP genes, including seven newly identified genes, and performed gene duplication and candidate gene-based association analyses. Analyses sh...