Large-scale transcriptomic analysis in chickpea (Cicer arietinum L.), an orphan legume crop of the semi-arid tropics of Asia and Africa. Plant Biotechnol J 9:922-931

International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India.
Plant Biotechnology Journal (Impact Factor: 5.75). 05/2011; 9(8):922-31. DOI: 10.1111/j.1467-7652.2011.00625.x
Source: PubMed


Chickpea (Cicer arietinum L.) is an important legume crop in the semi-arid regions of Asia and Africa. Gains in crop productivity have been low, however, particularly because of biotic and abiotic stresses. To help enhance crop productivity using molecular breeding techniques, next generation sequencing technologies such as Roche/454 and Illumina/Solexa were used to determine the sequence of most gene transcripts and to identify drought-responsive genes and gene-based molecular markers. A total of 103,215 tentative unique sequences (TUSs) have been produced from 435,018 Roche/454 reads and 21,491 Sanger expressed sequence tags (ESTs). Putative functions were determined for 49,437 (47.8%) of the TUSs, and gene ontology assignments were determined for 20,634 (41.7%) of the TUSs. Comparison of the chickpea TUSs with the Medicago truncatula genome assembly (Mt 3.5.1 build) resulted in 42,141 aligned TUSs with putative gene structures (including 39,281 predicted intron/splice junctions). Alignment of ∼37 million Illumina/Solexa tags generated from drought-challenged root tissues of two chickpea genotypes against the TUSs identified 44,639 differentially expressed TUSs. The TUSs were also used to identify a diverse set of markers, including 728 simple sequence repeats (SSRs), 495 single nucleotide polymorphisms (SNPs), 387 conserved orthologous sequence (COS) markers, and 2088 intron-spanning region (ISR) markers. This resource will be useful for basic and applied research for genome analysis and crop improvement in chickpea.



    • "Earlier studies related to in depth analysis of seed development have been undertaken using various high throughput techniques in plants like Arabidopsis (Girke et al., 2000; Le et al., 2010), wheat (Gregersen et al., 2005), rice (Xue et al., 2012) and oats (Gutierrez-Gonzalez et al., 2013) and also in important legumes such as soybean (Severin et al., 2010; Jones and Vodkin, 2013) and Medicago (Gallardo et al., 2007). Although there have been studies reporting transcriptome analysis in chickpea (Hiremath et al., 2011; Garg et al., 2011a,b; Singh et al., 2013; Afonso-Grunz et al., 2014; Kudapa et al., 2014), none of these have focussed specifically on chickpea seed development. However, such studies have been undertaken in the related legume soybean where seed development was extensively studied with respect to different tissues (Severin et al., 2010) as well as in seeds taking developmental stages from a few days post fertilization to maturity where major landmarks in seed development, including accumulation of nutrients, synthesis of storage proteins and desiccation were analyzed using RNA-seq (Jones and Vodkin, 2013). "
    ABSTRACT: Understanding developmental processes, especially in non-model crop plants, is extremely important in order to unravel unique mechanisms regulating development. Chickpea (C. arietinum L.) seeds are especially valued for their high carbohydrate and protein content. Therefore, in order to elucidate the mechanisms underlying seed development in chickpea, deep sequencing of transcriptomes from four developmental stages was undertaken. In this study, a next generation sequencing platform was utilized to sequence the transcriptome of four distinct stages of seed development in chickpea. About 1.3 million reads were generated which were assembled into 51,099 unigenes by merging the de novo and reference assemblies. Functional annotation of the unigenes was carried out using the Uniprot, COG and KEGG databases. RPKM based digital expression analysis revealed specific gene activities at different stages of development which was validated using Real time PCR analysis. More than 90% of the unigenes were found to be expressed in at least one of the four seed tissues. DEGseq was used to determine differentially expressed genes which revealed that only 6.75% of the unigenes were differentially expressed at various stages. Homology based comparison revealed 17.5% of the unigenes to be putatively seed specific. Transcription factors were predicted based on HMM profiles built using TF sequences from five legume plants and analyzed for their differential expression during progression of seed development. Expression analysis of genes involved in biosynthesis of important secondary metabolites suggested that chickpea seeds can serve as a good source of antioxidants. Since transcriptomes are a valuable source of molecular markers like simple sequence repeats (SSRs), about 12,000 SSRs were mined in chickpea seed transcriptome and a few of them were validated. In conclusion, this study will serve as a valuable resource for improved chickpea breeding.
    Frontiers in Plant Science 12/2014; 5. DOI:10.3389/fpls.2014.00698 · 3.95 Impact Factor
    • "In terms of differential expression studies, Sanger ESTs generated from drought-challenged tissues of drought-tolerant (ICC 4958) and drought-sensitive (ICC 1882) genotypes were used for in silico expression studies (Varshney et al. 2009a). However, in another comprehensive study, after aligning 37 million Illumina short sequence tags generated from drought-challenged root tissues of the same genotypes as used in the transcriptome assembly, Hiremath et al. (2011) identified 2974 TUSs with significant expression changes, 2823 of which could be associated with gene ontology annotations. Furthermore, the expression patterns of many genes suggested their role in various pathways of secondary metabolism. "
    ABSTRACT: Terminal drought is one of the major constraints in chickpea (Cicer arietinum L.), causing more than 50% production losses. With the objective of accelerating genetic understanding and crop improvement through genomics-assisted breeding, a draft genome sequence has been assembled for the CDC Frontier variety. In this context, 544.73 Mb of sequence data were assembled, capturing 73.8% of the genome in scaffolds. In addition, large-scale genomic resources including several thousand simple sequence repeats and several million single nucleotide polymorphisms, high-density diversity array technology (15 360 clones) and Illumina GoldenGate assay genotyping platforms, high-density genetic maps and transcriptome assemblies have been developed. In parallel, by using a linkage mapping approach, one genomic region harbouring quantitative trait loci for several drought tolerance traits has been identified and successfully introgressed in three leading chickpea varieties (e.g. JG 11, Chefe, KAK 2) by using a marker-assisted backcrossing approach. A multilocation evaluation of these marker-assisted backcrossing lines provided several lines with 10–24% higher yield than the respective recurrent parents. Modern breeding approaches like marker-assisted recurrent selection and genomic selection are being deployed for enhancing drought tolerance in chickpea. Some novel mapping populations such as multiparent advanced generation intercross and nested association mapping populations are also being developed for trait mapping at higher resolution, as well as for enhancing the genetic base of chickpea. Such advances in genomics and genomics-assisted breeding will accelerate precision and efficiency in breeding for stress tolerance in chickpea. Additional keywords: backcrossing, Cicer arietinum, genome sequence, quantitative trait loci, yield.
    Functional Plant Biology 07/2014; 41(11):2-6. DOI:10.1071/FP13318 · 3.15 Impact Factor
    • "The advantage of this approach is that the identified SNPs are mostly located in single copy genes, which are a pre-requisite for SNP marker analysis. In various model or major crop species, transcriptome sequencing has been used for allele discovery and gene expression analysis [5]–[9]. Here, the limited number of SNPs detected in the genes was due to selection constraints in coding regions which result in finding only a few thousand useful markers. "
    ABSTRACT: Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of the command line interface, massive computational resources and expertise, which is a daunting task for biologists. Further, the SNP information generated may not be readily used for downstream processes such as genotyping. Hence, a comprehensive pipeline has been developed by integrating several open source next generation sequencing (NGS) tools along with a graphical user interface called Integrated SNP Mining and Utilization (ISMU) for SNP discovery and their utilization by developing genotyping assays. The pipeline features functionalities such as pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction (SAMtools/SOAPsnp/CNS2snp and CbCC) methods and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and errors, if any. The pipeline also provides a confidence score or polymorphism information content value with flanking sequences for identified SNPs in standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data at a fast speed. The pipeline is very useful for the plant genetics and breeding community with no computational expertise in order to discover SNPs and utilize them in genomics, genetics and breeding studies. The pipeline has been parallelized to process huge datasets of next generation sequencing. It has been developed in the Java language and is available as standalone free software.
    PLoS ONE 07/2014; 9(7):e101754. DOI:10.1371/journal.pone.0101754 · 3.23 Impact Factor