
Sunil Kumar Sahu
Beijing Genomics Institute · BGI-Research
PhD, Biotechnology
Exploring the fascinating world of plant genomics and evolution
169
Publications
332,773
Reads
2,240
Citations
Introduction
I am currently working as a Research Scientist at BGI-Shenzhen. My research focuses primarily on Plant Molecular Biology, Evolution and Bioinformatics. My current project is 'Decoding the genomes of commercially, evolutionarily and agriculturally important plants'.
Note: Open for collaborations
Additional affiliations
September 2019 - February 2022
January 2018 - February 2022
January 2018 - February 2022
Education
November 2010 - February 2015
August 2008 - June 2010
August 2005 - May 2008
Publications (169)
Rice (Oryza sativa) is the leading source of nutrition for more than half of the world's population, and it is by far the most important commercial food crop. However, its growth and production are significantly hampered by the bacterial pathogen Xanthomonas oryzae pv. oryzae (Xoo), which causes leaf blight disease. Earlier studies have reported the ant...
Two of the most economically important plants in the Artocarpus genus are jackfruit (A. heterophyllus Lam.) and breadfruit (A. altilis (Parkinson) Fosberg). Both species are long-lived trees that have been cultivated for thousands of years in their native regions. Today they are grown throughout tropical to subtropical areas as an important source...
Dipterocarpaceae are typical tropical plants (dipterocarp forests) that are famous for their high economic value because of their production of fragrant oleoresins, top‐quality timber and usage in traditional Chinese medicine. Currently, the lack of Dipterocarpaceae genomes has been a limiting factor in deciphering the fragrant oleoresin biosynthesis an...
Chloranthales remain the last major mesangiosperm lineage without a nuclear genome assembly. We therefore assemble a high-quality chromosome-level genome of Chloranthus spicatus to resolve enigmatic evolutionary relationships, as well as explore patterns of genome evolution among the major lineages of mesangiosperms (eudicots, monocots, magnoliids,...
Wood is the most important natural and endlessly renewable source of energy. Despite the ecological and economic importance of wood, many aspects of its formation have not yet been investigated. We performed chromosome-scale genome assemblies of three timber trees (Ochroma pyramidale, Mesua ferrea, and Tectona grandis) which exhibit different wood...
Mahogany species (family Meliaceae) are highly valued for their aesthetic and durable wood. Despite their economic and ecological importance, genomic resources for mahogany species are limited, hindering genetic improvement and conservation efforts. Here we perform chromosome-scale genome assemblies of two commercially important mahogany species: S...
Rosa roxburghii and Rosa sterilis, two species belonging to the Rosaceae family, are widespread in the southwest of China. These species have gained recognition for their remarkable abundance of ascorbate in their fresh fruits, making them an ideal vitamin C resource. In this study, we generated two high‐quality chromosome‐scale genome assemblies...
Recent technological developments in spatial transcriptomics allow researchers to measure gene expression of cells and their spatial locations at the single-cell level, generating detailed biological insight into biological processes. A comprehensive database could facilitate the sharing of spatial transcriptomic data and streamline the data acquis...
The legume family (Leguminosae or Fabaceae) is one of the largest and most economically important families of flowering plants. Heartwood, the core of a tree trunk or branch, is a valuable and renewable resource employed for centuries in constructing sturdy and sustainable structures. Hongmu refers to a category of precious timber trees in China, encompassing 29 w...
Introduction: Fungus-derived secondary metabolites are fascinating with biomedical potential and chemical diversity. Mining endophytic fungi for drug candidates is an ongoing process in the field of drug discovery and medicinal chemistry. Endophytic fungal symbionts from terrestrial plants, marine flora, and fauna tend to produce interesting types o...
Parasitic plants have evolved to be subtly or severely dependent on host plants to complete their life cycle. To provide new insights into the biology of parasitic plants in general, we assembled genomes for members of the sandalwood order Santalales, including a stem hemiparasite (Scurrula) and two highly modified root holoparasites (Balanophora)...
The availability of multiple sequenced genomes from a single species made it possible to explore intra- and inter-specific genomic comparisons at higher resolution and build clade-specific pan-genomes of several crops. The pan-genomes of crops constructed from various cultivars, accessions, landraces, and wild ancestral species represent a compendi...
The availability of multiple sequenced genomes from a single species made it possible to explore intra- and inter-specific genomic comparisons at higher resolution and build clade-specific pangenomes of several crops. The pan-genomes of crops constructed from various cultivars/accessions, landraces, and wild ancestral species represent a compendium...
Vertebrate embryogenesis is a remarkable process, during which numerous cell types of different lineages arise within a short time frame. An overwhelming challenge to understand this process is the lack of dynamic chromatin accessibility information to correlate cis-regulatory elements (CREs) and gene expression within the hierarchy of cell fate de...
Acorales is the sister lineage to all the other extant monocot plants. Genomic resource enhancement of this genus can help to reveal early monocot genomic architecture and evolution. Here, we assemble the genome of Acorus gramineus and reveal that it has ~45% fewer genes than the majority of monocots, although they have similar genome size. Phyloge...
Background: Whole-genome bisulfite sequencing (WGBS) technology can provide comprehensive DNA methylation at a single-base resolution on a genome-wide scale, and is considered to be the gold standard for the detection of 5-methylcytosine (5mC). However, the International Human Epigenome Consortium proposes that a full DNA methylome should have at least...
Due to the rapid increase in population and the decreasing availability of arable land and freshwater resources per capita, global crop production will need to double by 2050 to meet human food demand. In early March, the World Bank and the Food and Agriculture Organization (FAO) of the United Nations released a report saying that 45 countries are...
Long-read sequencing (LRS) technology along with the recent development of computational tools, crowned with the “Method of the Year 2022”, have made it possible to sequence and assemble the genomes of practically every representative species. By considering its immense role and potential in advancing scientific research, the LRS market was valued...
Background: The medicinal material quality of Citrus reticulata 'Chachi' differs depending on the bioactive components influenced by the planting area. Environmental factors, such as soil nutrients, the plant-associated microbiome and climatic conditions, play important roles in the accumulation of bioactive components in citrus. However, how thes...
The ability to explore life kingdoms is largely driven by innovations and breakthroughs in technology, from the invention of the microscope 350 years ago to the recent emergence of single cell sequencing, by which the scientific community has been able to visualize life at an unprecedented resolution. Most recently, the Spatially Resolved Transcrip...
Background: Rheum tanguticum Maxim. ex Balf. is a traditional Chinese medicinal plant that is commonly used to treat many ailments. It belongs to the Polygonaceae family and grows in northwest and southwest China. At high elevations, the color of the plant’s young leaves is purple, which gradually changes to green during the growth cycle. Anthraquino...
Cremastra appendiculata (D. Don) Makino is a rare terrestrial orchid with a high market value as an ornamental and Chinese traditional medicinal herb with a wide range of pharmacological properties. The pseudobulbs of C. appendiculata are one of the primary sources of the famous traditional Chinese medicine “Shancigu”, which has been clinically use...
Recently, single-cell RNA sequencing (scRNA-seq) has provided unprecedented power for accurately understanding gene expression regulatory mechanisms. However, scRNA-seq studies have limitations in plants, due to difficulty in protoplast isolation that requires enzymatic digestion of the cell walls from various plant tissues. Therefore, to overcome this...
Genetic and environmental factors collectively determine plant growth and yield. In the past 20 years, genome-wide association studies (GWAS) have been conducted on crops to decipher genetic loci that contribute to growth and yield; however, plant genotype appears to be insufficient to explain the trait variations. Here, we unravel the associations...
Citrus grandis ‘Tomentosa’ (CGT) (Huajuhong, HJH) is a widely used medicinal plant, which is mainly produced in Guangdong and Guangxi provinces of South China. In particular, HJH from Huazhou (HZ) County of Guangdong province has been well-regarded as the best national product for geo-herbalism. However, the reasons for the geo-herbalism property of HJH from...
The interaction between selective nutrients and linked genes involving a specific organ reveals the genetic make-up of an individual in response to a particular nutrient. The interaction of genes with food opens opportunities for the addition of bioactive compounds for specific populations comprising identical genotypes. The slight difference in th...
Background: Marine sponges are sedentary invertebrates that are found in temperate, arctic, and tropical climates. They are well known for contributing significant bioactive substances with pharmacological values which are recovered from the marine environment. Sponge-associated symbiotic microbes like bacteria and fungi tend to produce secondary m...
The muskrat (Ondatra zibethicus) is a semi-aquatic rodent species with ecological, economic, and medicinal importance. Here we present an improved genome assembly, which is the first high-quality chromosome-level genome of the muskrat with high completeness and contiguity assembled using stLFR, BGISEQ, and Hi-C sequencing technologies. The genome s...
The raccoon dog (Nyctereutes procyonoides) is an invasive canid species native to East Asia with several distinct characteristics. Here, we report a chromosome-scale genome of the raccoon dog with high contiguity, completeness, and accuracy. The intact taste receptor genes, expanded gene families and positively selected genes related to digestion,...
Poaching and trafficking have a substantial negative impact on the population growth and range expansion of the Chinese pangolin (Manis pentadactyla). However, recently reported activities of Chinese pangolins in several sites of Guangdong province in China indicate a promising sign for the recovery of this threatened species. Here, we re-sequenc...
Critically endangered species are usually restricted to small and isolated populations. High inbreeding without gene flow among populations further aggravates their threatened condition and reduces the likelihood of their long‐term survival. Chinese alligator (Alligator sinensis) is one of the most endangered crocodiles in the world and has experie...
Background: The exact animal origin of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) remains obscure and understanding its host range is vital for preventing interspecies transmission.
Methods: Herein, we applied single-cell sequencing to multiple tissues of 20 species (30 data sets) and integrated them with public resources (45 d...
Background: The evolution of parasites is often directly affected by the host's environment. Studies on the evolution of the same parasites in different hosts are of great interest and are highly relevant to our understanding of divergence.
Methods: Here we performed whole-genome sequencing of Parascaris univalens from different Equus hosts (horses,...
Plants, being sessile, are more vulnerable to variable environmental constraints, particularly salinity and drought stress, which cause a decline in crop development and yield in arid and semi‐arid regions. In the current scenario of global warming, drought and salinity – either alone, co-occurring, or in sequence – are predicted to become mor...
The South China tiger (Panthera tigris amoyensis, SCT) is the most critically endangered subspecies of tigers due to functional extinction in the wild. Inbreeding depression is observed among the captive population descended from six wild ancestors, resulting in high juvenile mortality and low reproduction. We assembled and characterized the first...
The gut microbiota is essential for host health and survival. Here, using samples from animals living in the Qinghai-Tibetan Plateau, we recovered 119,568 metagenome-assembled genomes (MAGs) that were clustered into 19,251 species-level genome bins (SGBs) of which most represent novel species. We present a novel mechanism shaping mammalian gut micr...
A major challenge in understanding vertebrate embryogenesis is the lack of topographical transcriptomic information that can help correlate microenvironmental cues within the hierarchy of cell-fate decisions. Here, we employed Stereo-seq to profile 91 zebrafish embryo sections covering six critical time points during the first 24 h of development,...
The Pedinophyceae (Viridiplantae) comprise a class of small uniflagellate algae with a pivotal position in the phylogeny of the Chlorophyta as the sister group of the ‘core chlorophytes’. We present a chromosome‐level genome assembly of the freshwater type species of the class, Pedinomonas minor.
We sequenced the genome using PacBio, Illumina and...
MADS-box is an important transcription factor family that is involved in the regulation of various stages of plant growth and development, especially flowering regulation and flower development. Being a holoparasitic plant, the body structure of Balanophoraceae has changed dramatically over time, and its vegetative and reproductive organs have been...
The plastid organelle is essential for many vital cellular processes and the growth and development of plants. The availability of a large number of complete plastid genomes could be effectively utilized to understand the evolution of the plastid genomes and phylogenetic relationships among plants. We comprehensively analyzed the plastid genomes of...
Cycads represent one of the most ancient lineages of living seed plants. Identifying genomic features uniquely shared by cycads and other extant seed plants, but not non-seed-producing plants, may shed light on the origin of key innovations, as well as the early diversification of seed plants. Here, we report the 10.5-Gb reference genome of Cycas p...
Herbivores can drastically alter the morphology of macroalgae by directly consuming tissue and by inflicting structural wounds. Macroalgae host abundant and diverse epibiont communities, the dynamics of which tend to be mostly unknown in space and time. As the cultivation of macroalgae gains momentum worldwide, it is key to measure how epibionts c...
The green peafowl (Pavo muticus) is facing a high risk of extinction due to the long-term and widespread threats of poaching and habitat conversion. Here, we present a high-quality chromosome-level genome assembly of the green peafowl with high contiguity and accuracy assembled by PacBio sequencing, DNBSEQ short-read sequencing, and Hi-C sequencing...
Sea buckthorn (Hippophae rhamnoides), a horticulturally multipurpose species in the family Elaeagnaceae, can build associations with Frankia actinomycetes to enable symbiotic nitrogen‐fixing. Currently, no high‐quality reference genome is available for an actinorhizal plant, which greatly hinders the study of actinorhizal symbiotic nodulation.
He...
A global international initiative, such as the Earth BioGenome Project (EBP), requires both agreement and coordination on standards to ensure that the collective effort generates rapid progress toward its goals. To this end, the EBP initiated five technical standards committees comprising volunteer members from the global genomics scientific commun...
The masked palm civet (Paguma larvata) is a small carnivore with distinct biological characteristics that has an omnivorous diet and also serves as a vector of pathogens. Although this species is not an endangered animal, its population is reportedly declining. Since the severe acute respiratory syndrome (SARS) epidemic in 2003, the public has...
Clarifying the evolutionary processes underlying species diversification and adaptation is a key focus of evolutionary biology. Begonia (Begoniaceae) is one of the most species-rich angiosperm genera with ~2,000 species, most of which are shade-adapted.
Here, we present chromosome-scale genome assemblies for four species of Begonia (B. loranthoides...
Seaweeds are a phylogenetically diverse assemblage of macroscopic marine algae that grow in oceans, thriving among profoundly harsh environmental conditions. They are cultivated in many maritime nations primarily for edible applications; hydrocolloid extraction is the second largest domain of utility for this renewable biomass. The f...
The availability of viral entry factors is a prerequisite for the cross-species transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Large-scale single-cell screening of animal cells could reveal the expression patterns of viral entry genes in different hosts. However, such exploration for SARS-CoV-2 remains limited. Here,...
Background: The evolution of parasites is often directly affected by the host's environment. Studies on the evolution of the same parasites in different hosts are extremely attractive and highly relevant to our understanding of divergence and speciation.
Methods: Here we performed whole genome sequencing of Parascaris univalens from different Equus...
Vertebrate embryogenesis is a remarkably dynamic process during which numerous cell types of different lineages generate, change, or disappear within a short period of time. A major challenge in understanding this process is the lack of topographical transcriptomic information that can help correlate microenvironmental cues within the hierarchy of...
Clausena lansium (Lour.) Skeels (Rutaceae), known as wampee, is a widely distributed fruit tree that is used as a folk medicine for the treatment of several common diseases. However, genomic information about this medicinally important species is still lacking. Therefore, we assembled the first genome of the genus Clausena with a total length...
The evolution of parasites is often directly affected by the host's environment or behavior. Studies on the evolution of the same parasites in different hosts are extremely attractive and highly relevant to our understanding of divergence and speciation. Here we present the first molecular evidence of divergence of Equus roundworms in different h...
Soummam river sediments were used to isolate a biosurfactant-producing and petroleum-degrading bacterium. The strain was identified as Alcaligenes aquatilis YGD 2906 using phenotypic characterization and 16S ribosomal RNA sequencing. The culture supernatant of the isolated strain showed no haemolytic activity and had an oil displacement of 23.66 ± 0....
Small RNAs play a major role in the post-transcriptional regulation of gene expression in eukaryotes. Despite the evolutionary importance of streptophyte algae, knowledge on small RNAs in this group of green algae is almost non-existent. We used genome and transcriptome data of 34 algal and plant species, and performed genome-wide analyses of small...
Extant giant pandas are divided into Sichuan and Qinling subspecies. The giant panda has many species-specific characteristics, including comparatively small organs for body size, small genitalia of male individuals, and low reproduction. Here, we report the most contiguous, high-quality chromosome-level genomes of two extant giant panda subspecies...
Despite being the world’s third largest ocean, the Indian Ocean is one of the least studied and understood with respect to microbial diversity as well as biogeochemical and ecological functions. In this study, we investigated the microbial community and its metabolic potential for nitrogen (N) acquisition in the oligotrophic surface waters of the I...