Article

Transcriptome analysis in Coffea eugenioides, an Arabica coffee ancestor, reveals differentially expressed genes in leaves and fruits

Abstract

Studies of the diploid parental species of polyploid plants are important for understanding their contributions to the formation and evolution of polyploid species. Coffea eugenioides is a diploid species that, together with Coffea canephora, is considered to be an ancestor of the allopolyploid Coffea arabica. Despite its importance in the evolutionary history of the main economic species of coffee, no study has focused on C. eugenioides molecular genetics. RNA-seq makes it possible to generate reference transcriptomes and identify coding genes and potential candidates related to important agronomic traits. Therefore, the main objectives were to obtain a global overview of transcriptionally active genes in this species using next-generation sequencing and to analyze specific genes that were highly expressed in leaves and fruits and that could be explored for breeding and for understanding the evolutionary biology of coffee. A de novo assembly generated 36,935 contigs that were annotated using eight databases. We observed a total of ~5000 differentially expressed genes between leaves and fruits. Several genes exclusively expressed in fruits did not exhibit similarities with sequences in any database. We selected ten differentially expressed unigenes in leaves and fruits to evaluate transcriptional profiles using qPCR. Our study provides the first gene catalog for C. eugenioides and enhances the knowledge concerning the mechanisms involved in the C. arabica homeologous genes. Furthermore, this work will open new avenues for studies into specific genes and pathways in this species, especially those related to fruit, and our data have potential value in assisted breeding applications.

... To compare with available coffee sequences, the full coffee-LRS isoforms were processed with BLASTn (1e-20) against the C. canephora CDS with UTR and the C. arabica EST database, respectively, and vice versa [23,24]. The C. eugenioides transcriptome (young leaves and mature fruits) from Illumina was also used in the comparison [29]. ...
... 2,969 sequences), encoding the purine metabolism and thiamine metabolism pathways. In comparison, only 802 sequences were associated with 142 pathways and 374 enzymes in the C. eugenioides transcriptome, and the starch and sucrose pathway, relating to 450 contigs, was the most represented pathway [29]. ...
... More than twice the number of isoforms were identified in the tetraploid Arabica LRS transcriptome (immature, intermediate and mature fruits) compared with the C. eugenioides contigs (36,935 de novo assembled contigs, average length: 701 bp, from immature leaves and mature fruits), the C. canephora CDS with UTR (25,570 sequences, from a variety of tissues, including fruits) and the C. arabica EST database (35,153 contigs, including fruits) (Table 2) [14,23,29]. The coffee-LRS isoforms show greater transcript length, greater diversity and a lower GC content. ...
Article
Background: Polyploidization contributes to the complexity of gene expression, resulting in numerous related but different transcripts. This study explored the transcriptome diversity and complexity of the tetraploid Arabica coffee (Coffea arabica) bean. Long-read sequencing (LRS) by PacBio Isoform sequencing (Iso-seq) was used to obtain full-length transcripts without the difficulty and uncertainty of the assembly required for reads from short-read technologies. The tetraploid transcriptome was annotated and compared with data from the sub-genome progenitors. Caffeine and sucrose genes were targeted for case analysis. Findings: An isoform-level tetraploid coffee bean reference transcriptome with 95,995 distinct transcripts (average 3,236 bp) was obtained. A total of 88,715 sequences (92.42%) were annotated with BLASTx against NCBI non-redundant plant proteins, including 34,719 high-quality annotations. Further BLASTn against NCBI non-redundant nucleotide sequences, C. canephora coding sequences with UTR, C. arabica ESTs and Rfam left 1,213 sequences without hits, which are potential novel genes in coffee. Longer UTRs were captured, especially 5' UTRs, facilitating the identification of upstream ORFs (uORFs). The LRS also revealed more and longer transcript variants in key caffeine and sucrose metabolism genes from this polyploid genome. Long sequences (>10 kb) were poorly annotated. Conclusions: LRS technology shows the limitations of previous studies. It provides an important tool to produce a reference transcriptome that includes more of the diversity of full-length transcripts, to help understand the biology and support the genetic improvement of polyploid species such as coffee.
... Currently, the use of large-scale genomic analyses such as RNAseq (RNA total sequencing) and GBS (Genotype-by-Sequencing) is common in the characterization of genotypes, biological processes, and genetic control of agronomic traits in several plant species (reviewed by Kang et al., 2016). In coffee, the conclusion of the first Coffee Genome Project (Vieira et al., 2006) prompted several studies on genomic aspects of major traits, such as defense response to pathogens (Florez et al., 2017), caffeine metabolism, fruit metabolism (Yuyama et al., 2016), and drought tolerance (Mofatto et al., 2016). Also, whole-genome sequences from C. arabica (Tran et al., 2018) and C. canephora (Denoeud et al., 2014) are available, which opened the possibility for novel genome-wide selection strategies, such as Genomic Wide Selection (GWS) (Carvalho et al., 2020) and Genomic Wide Association Selection (GWAS) (Sant'Ana et al., 2018). ...
... The number of DEGs identified here was low, considering the broad scope of the RNAseq technique. Other studies comparing large-scale expression in coffee identified a higher number of DEGs, such as ~2000 in beans (Cheng et al., 2018), 610 in seedlings (Haile and Kang, 2018), and ~5000 in leaves and fruits (Yuyama et al., 2016). Therefore, the reduced number of DEGs here represents strong evidence that low caffeine content in coffee plants does not result in a comprehensive reprogramming of coffee plant metabolism. ...
Article
Differential gene expression profiles and metabolic networks are valuable tools for the genetic characterization of agronomic traits. In this study, we used large-scale expression analyses to identify modified biological processes in caffeine-free coffee plants. The first step was the large-scale sequencing of RNA from young and developing tissues of caffeine-free plants (AC1) and plants with normal concentrations of the compound (MN). The resulting 65,000 sequences were analyzed in silico to identify 171 genes with differential expression between treatments and to establish metabolic networks associated with levels of caffeine. Few genes were mapped onto metabolic pathways, indicating that low caffeine has no major effects on physiological processes. The differential expression observed in silico was validated for 12 selected genes in field experiments using qPCR. The expression profile of 5 genes differed between the analyses, and the rest confirmed the in silico profile. Two of the validated genes, FIG and LSM-l, may control other agronomic traits associated with low caffeine content in coffee tissues. These genes are potential markers for use in association with other current markers for assisted selection of low-caffeine coffee. Therefore, they may improve the efficiency and effectiveness of coffee breeding programs.
... However, in practice, whether Arabica coffee fruits show the same stability as seedlings when exposed to the same environment is yet to be confirmed. A recent transcriptome analysis in C. eugenioides provides a global view of highly expressed genes with various functions in fruits and leaves: genes related to biological processes were significantly more highly expressed in fruits, while molecular function genes were expressed at lower levels than in leaves, indicating tissue-specific functions (Yuyama et al., 2015). Importantly, this study improves our understanding of the C. arabica background, and future studies can benefit from this C. eugenioides resource (Yuyama et al., 2015). ...
Article
Coffee is one of the most valuable commodities exported worldwide. Greater understanding of the molecular basis of coffee quality is required to meet the increasing demands of consumers. Genotype and environment (G and E) have been shown to influence coffee quality. Analysis of coffee metabolism, the genes governing the accumulation of key components and the influence of environment on their expression during seed development supports the identification of the molecular determinants of coffee quality.
... In this short-read sequencing reference transcriptome, around 24,548 unigenes were identified to be protein-coding sequences (Ivamoto et al., 2017). Another reference transcriptome studied earlier was from C. eugenioides, the maternal progenitor of C. arabica (Yuyama et al., 2016). This reference was built using Illumina short-read sequencing and de novo assembly. ...
... Tissue-specific expression revealed a different number of genes expressed in the leaves and fruits of C. eugenioides, including 2050 and 3299 uniquely expressed genes, respectively, and 31,583 commonly expressed genes (Yuyama et al., 2016). Functional annotation identified quite distinct roles of individual genes in leaves and fruits. ...
... These resources provide opportunities for mining new markers that can be used for coffee germplasm management and breeding. Several studies involving SNP discovery in coffee through mining of expressed sequence tags (ESTs) or transcriptome data have been published (de Kochko et al. 2010; Vidal et al. 2010; Combes et al. 2013; Yuyama et al. 2016). Recently, Genotyping by Sequencing (GBS) was applied to generate SNP markers for QTL mapping of agronomic traits in coffee (Moncada et al. 2016). ...
Article
Coffee is one of the most widely consumed beverages and represents a multibillion-dollar global industry. Accurate identification of coffee cultivars is essential for efficient management, exchange, and use of coffee genetic resources. To date, a universal platform that can allow data comparison across different laboratories and genotyping platforms has not been developed by the coffee research community. Using expressed sequence tags (EST) of Coffea arabica, C. canephora and C. racemosa from public databases, we developed 7538 single nucleotide polymorphism (SNP) markers and selected 180 for validation using 25 C. arabica and C. canephora accessions from Puerto Rico. Based on the validation result, we designated a panel of 55 SNP markers that are polymorphic across the two species. The average minor allele frequency and information index of this SNP panel are 0.281 and 0.690, respectively. This panel enabled the differentiation of all tested accessions of C. canephora, which accounts for 79.2 % of the total polymorphism in the samples. Only 21.8 % of the polymorphic SNPs were detected in the 12 C. arabica cultivars, which, nonetheless, were able to unambiguously differentiate the 12 Arabica cultivars into ten unique genotypes, including two synonymous groups. Several local Puerto Rican cultivars with partial Timor pedigree, including Limaní, Frontón, and TARS 18087, showed substantial genetic difference from the other common Arabica cultivars, such as Catuai, Borbón, and Mundo Nuevo. This coffee SNP panel provides robust and universally comparable DNA fingerprints, thus can serve as a genotyping tool to assist coffee germplasm management, propagation of planting material, and coffee cultivar authentication. © 2016 Springer Science+Business Media New York (outside the USA)
... Genetic control of these processes can be investigated by the study of changes in gene expression through bean ripening, including transcripts regulating bean filling as well as in response to stress [2,5,6]. Early studies applied RT-PCR (coffee beans) or microarrays to mainly coffee leaves or seedlings [5,7,8]. More recently, different tissues of Arabica coffee including flowers, leaves and fruit pericarp have been subjected to transcriptome analysis. ...
Article
The composition of the maturing coffee bean determines the processing performance and ultimate quality of the coffee produced from the bean. Analysis of differences in gene expression during bean maturation may explain the basis of genetic and environmental variation in coffee quality. The transcriptome of the coffee bean was analyzed at three stages of development, immature (green), intermediate (yellow) and mature (red). A total of more than 120 million 150 bp paired-end reads were collected by sequencing of transcripts of triplicate samples at each developmental stage. A greater number of transcripts were expressed at the yellow stage. As the beans matured the types of highly expressed transcripts changed from transcripts predominantly associated with galactomannan, triacylglycerol (TAG), TAG lipase, 11S and 7S-like storage protein and Fasciclin-like arabinogalactan protein 17 (FLA17) in green beans to transcripts related to FLA1 at the yellow stage and TAG storage lipase SDP1, and SDP1-like in red beans. This study provides a genomic resource that can be used to investigate the impact of environment and genotype on the bean transcriptome and develop coffee varieties and production systems that are better adapted to deliver quality coffee despite climate variations.
... Recently, other transcriptome studies of coffee, although not involving plant-pathogen interactions, showed a similar number of assembled contigs to the present study. Leaf and fruit transcriptome analysis of Coffea eugenioides produced 36,935 contigs using the Illumina HiSeq platform (Yuyama et al. 2016). Mofatto et al. (2016) also obtained a total of 41,512 contigs from the C. arabica transcriptome, comparing the molecular responses to drought in two commercial cultivars using 454-pyrosequencing and Sanger platforms. ...
Article
Key message: We provide a transcriptional profile of the coffee-rust interaction and identify putative up-regulated resistance genes. Coffee rust disease, caused by the fungus Hemileia vastatrix, is one of the major diseases in coffee throughout the world. The use of resistant cultivars is considered to be the most effective control strategy for this disease. To identify candidate genes related to different defense mechanisms in coffee, we present a time-course comparative gene expression profile of Caturra (susceptible) and Híbrido de Timor (HdT, resistant) in response to H. vastatrix race XXXIII infection. The main objectives were to obtain a global overview of the transcriptome in both interactions, compatible and incompatible, and, especially, to analyze up-regulated HdT-specific genes associated with inducible resistance and defense signaling pathways. Using both Coffea canephora as a reference genome and de novo assembly, we obtained 43,159 transcripts. At early infection events (12 and 24 h after infection), HdT responded to the attack of H. vastatrix with a larger number of up-regulated genes than Caturra, which was related to prehaustorial resistance. The genes found in HdT at early hours were involved in receptor-like kinases, response ion fluxes, production of reactive oxygen species, protein phosphorylation, ethylene biosynthesis and callose deposition. We selected 13 up-regulated HdT-exclusive genes for validation by real-time qPCR, most of which confirmed their higher expression in HdT than in Caturra at the early stage of infection. These genes have the potential to assist the development of new coffee rust control strategies. Collectively, our results provide an understanding of expression profiles in the coffee-H. vastatrix interaction over a time course in susceptible and resistant coffee plants.
... Understanding the molecular mechanisms underlying seed development is one of the major goals to improve grain weight and grain constituents. Transcriptome and proteome maps provide a powerful tool to investigate the maize seed development [10]. In addition, the rapid development of proteomic technologies provides an unprecedented opportunity for plant proteomic profiling [11]. ...
Article
Grain weight is one of the most important yield components, and the grain is a developmentally complex structure comprising two major compartments (endosperm and pericarp) in maize (Zea mays L.); however, very little is known concerning the coordinated accumulation of the numerous proteins involved. Herein, we used an isobaric tags for relative and absolute quantitation (iTRAQ)-based comparative proteomic method to analyze the dynamic proteomes of endosperm and pericarp during grain development. In total, 9539 proteins were identified for both components at four development stages, among which 1401 proteins were non-redundant, 232 proteins were specific to the pericarp and 153 proteins were specific to the endosperm. A functional annotation of the identified proteins revealed the importance of metabolic and cellular processes, and binding and catalytic activities for tissue development. Three and 76 proteins involved in 49 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were integrated for the specific endosperm and pericarp proteins, respectively, reflecting their complex metabolic interactions. In addition, four proteins with important functions and different expression levels were chosen for gene cloning and expression analysis. Different concordance between mRNA level and protein abundance was observed across different proteins, stages, and tissues, as in previous research. These results could provide useful information for understanding the developmental mechanisms of grain development in maize.
... To understand the diversity added to the transcriptome by polyploidy and this analysis, comparison was made between the Arabica EST database and the C. canephora cds. More than twice the number of isoforms were identified in the tetraploid Arabica LRS transcriptome compared with C. eugenioides contigs (36,935, from immature leaves and mature fruits) and C. canephora cds (25,574, from different tissues, including fruits) [13,20]. ...
... In parallel to these genetic/genomic approaches, coffee transcriptomics was also gaining momentum. The first Expressed Sequence Tags (ESTs) were established in the early 2000s using the traditional Sanger sequencing protocol (Poncet et al., 2006; Vieira et al., 2006), but quite soon the Next Generation Sequencing (NGS) technologies took over, first using the Roche pyrosequencing approach (also known as 454) and then the Illumina protocol (RNA-Seq) (Yuyama et al., 2016). The first set of ESTs established by Nestlé R&D (France) and IRD (France) allowed the construction of the first RNA chip (Privat et al., 2011). ...
... Coffee (Coffea spp.) is an economically important crop and one of the main export products of several countries in Latin America, Africa and Asia (Ivamoto et al. 2017). Recently, the genomes of C. canephora (Denoeud et al. 2014) and C. arabica (Van Deynze et al. 2017) as well as the transcriptomes of Arabica coffee (Ivamoto et al. 2017) and an Arabica coffee ancestor Coffea eugenioides (Yuyama et al. 2016) were reported. These advances open possibilities for the study of candidate genes related to agronomical traits and metabolic pathways and for assisted breeding applications through the introduction of new desirable traits by genetic engineering (Ribas et al. 2011). ...
Article
The establishment of a simple, rapid and efficient transient expression system is a necessary tool for the functional validation of candidate genes in coffee biotechnology. The effects of Agrobacterium strain, age of the donor plant, infiltration method, and infiltration medium on transgene expression in detached coffee leaves were evaluated. Regarding the effect of Agrobacterium strain, the expression of uidA was higher in GV3101-treated coffee disks than in LBA4404 and ATHV-treated samples. In addition, transient expression of uidA was significantly higher in leaf disks from young plants (6 weeks old) (13.1 ± 1.4%) than in mature tissue (12 weeks old) (1.6 ± 1.2%). Transient uidA expression was higher in detached coffee leaf disks from young plants infiltrated with one injection of 15 µL of Agrobacterium strain GV3101::1303 suspended in MS salts supplemented with 30 g/L sucrose, 1.9 g/L MES and 200 µM AS with subsequent sanding of the abaxial epidermis. Using the optimized protocol, expression of the uidA gene was observed 6, 24 and 48 h and 5 weeks after bacterial injection. DNA was extracted from coffee disks with positive GUS expression and specific mgfp5 and uidA fragments were amplified 5 weeks post-agroinfiltration. Also using the optimized protocol, a specific cry10Aa (500 bp) fragment was amplified in the agro-infiltrated coffee leaf disks 5 weeks post-agroinfiltration with the plasmid pB427-35S-cry10Aa. Moreover, the expression of the gene cry10Aa in two infiltrated coffee leaf disks was verified by RT-PCR and an expected 500 bp fragment was amplified.
... for an accurate RT-qPCR normalization should be performed according to each specific condition (Cruz et al., 2009; Goulao et al., 2012; Carvalho et al., 2013; Yuyama et al., 2016). ...
Article
World coffee production has faced increasing challenges associated with ongoing climatic changes. Several studies, which have been almost exclusively based on temperature increase, have predicted extensive reductions (higher than half by 2050) of actual coffee cropped areas. However, recent studies showed that elevated [CO2] can strongly mitigate the negative impacts of heat stress at the physiological and biochemical levels in coffee leaves. In addition, it has also been shown that coffee genotypes can successfully cope with temperatures above what has been traditionally accepted. Altogether, this information suggests that the real impact of climate changes on coffee growth and production could be significantly lower than previously estimated. Gene expression studies are an important tool to unravel crop acclimation ability, demanding the use of adequate reference genes. We have examined the transcript stability of 10 candidate reference genes to normalize RT-qPCR expression studies using a set of 24 cDNAs from leaves of three coffee genotypes (CL153, Icatu and IPR108), grown under 380 or 700 μL CO2 L-1, and submitted to increasing temperatures from 25/20°C (day/night) to 42/34°C. Samples were analyzed according to genotype, [CO2], temperature, multiple stress interaction ([CO2], temperature) and total stress interaction (genotype, [CO2] and temperature). The transcript stability of each gene was assessed through a multiple analytical approach combining the Coefficient of Variation method and three algorithms (geNorm, BestKeeper, NormFinder). The transcript stability varied according to the type of stress for most genes, but the consensus ranking obtained with RefFinder classified MDH as the gene with the highest mRNA stability for global use, followed by ACT and S15, whereas α-TUB and CYCL showed the least stable mRNA contents.
Using the coffee expression profiles of the gene encoding the large-subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (RLS), results from the in silico aggregation and experimental validation of the best number of reference genes showed that two reference genes are adequate to normalize RT-qPCR data. Altogether, this work highlights the importance of an adequate selection of reference genes for each single or combined experimental condition and constitutes the basis to accurately study molecular responses of Coffea spp. in a context of climate changes and global warming.
... This high number of modifications also explains why βPG was, to our knowledge, never before identified in proteome studies although it is regularly identified as being highly expressed [31]. When the expression level of, for instance, the gene P92990, one of the βPG homologues in Arabidopsis, in different tissues is visualized with Genevisible, it is found to be high in all tissues. ...
Article
The structure and the activity of proteins are often regulated by transient or stable post-translational modifications (PTM). Unlike well-known, abundant modifications such as phosphorylation and glycosylation, some modifications are limited to one or a few proteins across a broad range of related species. Although few examples of the latter type are known, the evolutionary conservation of these modifications and the enzymes responsible for their synthesis suggest an important physiological role. Here, the first observation of a new, fold-directing PTM is described. During the analysis of alfalfa cell wall proteins, a -2 Da mass shift was observed on phenylalanine residues in the repeated tetrapeptide FxxY of the beta-subunit of polygalacturonase. This modular protein is known to be involved in developmental and stress-responsive processes. The presence of this modification was confirmed using in-house and external datasets acquired by different commonly used techniques in proteome studies. Based on these analyses it was found that all identified phenylalanine residues in the sequence FxxY of this protein were modified to α,β-didehydro-Phe (ΔPhe). Besides showing the reproducible identification of ΔPhe in different species, arguments that substantiate the fold-determining role of ΔPhe are given.
... In addition to the microarrays, the integrated use of metabolomics and proteomics analysis, coupled with modern sequencing technologies, will be able to trace gene expression profiles and to reconstruct metabolic pathways in coffee (Joët et al., 2009). Recently, using the Illumina protocol (RNA-Seq), the expression profile of C. eugenioides was studied (Yuyama et al., 2016). ...
... Five genes were found to be similar to the corresponding genes in A. thaliana (Table 1). Next, C. canephora FRL sequences were used in the BlastP search against the C. arabica genome sequence (http://www.phvtozome.net) and the C. eugenioides EST databank [25]. Five sequences were found in C. eugenioides and 10 in C. arabica. ...
Article
Coffea arabica is an allotetraploid of high economic importance. C. arabica transcriptome is a combination of the transcripts of two parental genomes (C. eugenioides and C. canephora) that gave rise to the homeologous genes of the species. Previous studies have reported the transcriptional dynamics of C. arabica. In these reports, the ancestry of homeologous genes was identified and the overall regulation of homeologous differential expression (HDE) was explored. One of these genes is part of the FRIGIDA-like family (FRL), which includes the Arabidopsis thaliana flowering-time regulation protein, FRIGIDA (FRI). As nonfunctional FRI proteins give rise to rapid-cycling summer annual ecotypes instead of vernalization-responsive winter-annuals, allelic variation in FRI can modulate flowering time in A. thaliana. Using bioinformatics, genomic analysis, and the evaluation of gene expression of homeologs, we characterized the FRL gene family in C. arabica. Our findings indicate that C. arabica expresses 10 FRL homeologs, and that, throughout flower and fruit development, these genes are differentially transcribed. Strikingly, in addition to confirming the expression of FRL genes during zygotic embryogenesis, we detected FRL expression during direct somatic embryogenesis, a novel finding regarding the FRL gene family. The HDE profile of FRL genes suggests an intertwined homeologous gene regulation. Furthermore, we observed that FLC gene of C. arabica has an expression profile similar to that of CaFRL genes.
... Even though the two genotypes are closely related due to the allopolyploid origin of C. arabica involving C. canephora as a progenitor [71,72], significant transcriptomic differences were found between the two genotypes in response to eCO2, and even at aCO2 (Figure 3), with relevant fold change differences found between them (aCO2: -13.36 to 14.23; eCO2: -10.99 to 12.17). In fact, despite the close relationship, C. canephora and C. arabica evolution and selection determined different ecological requirements (namely regarding altitude and temperature, as well as rainfall amount), and distinct acclimation responses to environmental stresses (e.g., to drought, heat, cold) (for a review, see [73,74]), which would be associated with different transcriptional profiles, as is the case in the present work. ...
Article
As atmospheric [CO2] continues to rise to unprecedented levels, understanding its impact on plants is imperative to improve crop performance and sustainability under future climate conditions. In this context, transcriptional changes promoted by elevated CO2 (eCO2) were studied in genotypes from the two major traded coffee species: the allopolyploid Coffea arabica (Icatu) and its diploid parent, C. canephora (CL153). While Icatu expressed more genes than CL153, a higher number of differentially expressed genes were found in CL153 as a response to eCO2. Although many genes were found to be commonly expressed by the two genotypes under eCO2, unique genes and pathways differed between them, with CL153 showing more enriched GO terms and metabolic pathways than Icatu. Divergent functional categories and significantly enriched pathways were found in these genotypes, which altogether supports contrasting responses to eCO2. A considerable number of genes linked to coffee physiological and biochemical responses were found to be affected by eCO2, with the significant upregulation of photosynthetic, antioxidant, and lipidic genes. This supports the absence of photosynthesis down-regulation and, therefore, the maintenance of increased photosynthetic potential promoted by eCO2 in these coffee genotypes.
Article
Coffea arabica L. is an important crop in several developing countries. Despite its economic importance, minimal transcriptome data are available for fruit tissues, especially during fruit development where several compounds related to coffee quality are produced. To understand the molecular aspects related to coffee fruit and grain development, we report a large-scale transcriptome analysis of leaf, flower and perisperm fruit tissue development. Illumina sequencing yielded 41,881,572 high-quality filtered reads. De novo assembly generated 65,364 unigenes with an average length of 1,264 bp. A total of 24,548 unigenes were annotated as protein coding genes, including 12,560 full-length sequences. In the annotation process, we identified nine candidate genes related to the biosynthesis of raffinose family oligosaccharides (RFOs). These sugars confer osmoprotection and are accumulated during initial fruit development. Four genes from this pathway had their transcriptional pattern validated by quantitative reverse transcription polymerase chain reaction (qRT-PCR). Furthermore, we identified ~24,000 putative target sites for microRNAs (miRNAs) and 134 putative transcriptionally active transposable element (TE) sequences in our dataset. This C. arabica transcriptomic atlas provides an important step for identifying candidate genes related to several coffee metabolic pathways, especially those related to fruit chemical composition and therefore beverage quality. Our results are the starting point for enhancing our knowledge about the coffee genes that are transcribed during the flowering and initial fruit development stages.
Chapter
Coffee is a popular beverage with significant economic importance. The economies of many developing countries depend heavily on the earnings from this crop. Despite increasing demand, coffee productivity has not increased in many coffee-growing countries and has remained low or on a plateau for the last several years. Genetic improvement of coffee through conventional breeding approaches has several limitations, primarily due to the insufficient understanding of the genetic and molecular mechanisms associated with various agronomic traits. However, recent developments in omics approaches and technological advancements have accelerated research on understanding the basic mechanisms related to crop improvement. In coffee, significant progress has been made on the genomics front, especially in whole-genome sequencing, transcriptomics, marker development, and gene identification. In addition, other omics tools such as metabolomics are now being integrated with genomics to further expand our knowledge of the key mechanisms underlying various physiological and cellular processes. This review article highlights up-to-date information on the available omics resources (genomics, transcriptomics, proteomics, and metabolomics) and their applications in coffee for genetic improvement.
Article
The processability and ultimate quality of coffee (Coffea arabica) are determined by the composition of the matured fruits. The basis of genetic variation in coffee fruit quality could be explained by studying color formation during fruit maturation. Transcriptome profiling was conducted on matured fruits of four C. arabica varieties (orange colored fruits (ORF); purple colored fruits (PF); red colored fruits (RF) and yellow colored fruits (YF)) to identify key color-regulating genes, biosynthesis pathways and transcription factors implicated in fruit color formation. A total of 39,938 genes were identified in the transcriptomes of the four C. arabica varieties. In all, 2,745, 781 and 1,224 differentially expressed genes (DEGs) were detected in YF_vs_PF, YF_vs_RF and YF_vs_ORF, respectively, with 1,732 DEGs conserved among the three pairwise groups. Functional annotation of the DEGs led to the detection of 28 and 82 key genes involved in the biosynthesis of carotenoids and anthocyanins, respectively. Key transcription factors bHLH, MYB, NAC, MADS, and WRKY implicated in fruit color regulation were detected. The high expression levels of gene-LOC113688784 (PSY), gene-LOC113730013 (β-CHY), gene-LOC113728842 (CCD7), gene-LOC113689681 (NCED) and gene-LOC113729473 (ABA2) in YF may have accounted for the yellow coloration. The differential expression of several anthocyanin and carotenoid-specific genes in the fruits substantially account for the purple (PF), red (RF), and orange (ORF) colorations. This study provides important insights into fruit color formation and variations in C. arabica and will help to develop coffee varieties with specific color and quality traits.
Article
Lipids, including the diterpenes cafestol and kahweol, are key compounds that contribute to the quality of coffee beverages. We determined total lipid content and cafestol and kahweol concentrations in green beans and genotyped 107 Coffea arabica accessions, including wild genotypes from the historical FAO collection from Ethiopia. A genome-wide association study was performed to identify genomic regions associated with lipid, cafestol and kahweol contents and cafestol/kahweol ratio. Using the diploid Coffea canephora genome as a reference, we identified 6,696 SNPs. Population structure analyses suggested the presence of two to three groups (K = 2 and K = 3) corresponding to the east and west sides of the Great Rift Valley and an additional group formed by wild accessions collected in western forests. We identified 5 SNPs associated with lipid content, 4 with cafestol, 3 with kahweol and 9 with cafestol/kahweol ratio. Most of these SNPs are located inside or near candidate genes related to metabolic pathways of these chemical compounds in coffee beans. In addition, three trait-associated SNPs showed evidence of directional selection among cultivated and wild coffee accessions. Our results also confirm a great allelic richness in wild accessions from Ethiopia, especially in accessions originating from forests in the west side of the Great Rift Valley.
Article
Association analysis was performed at the whole genome level to identify loci affecting the caffeine and trigonelline content of Coffea arabica beans. DNA extracted from extreme phenotypes was bulked (high and low caffeine, and high and low trigonelline) based on biochemical analysis of the germplasm collection. Sequencing and mapping using the combined reference genomes of C. canephora and C. eugenioides (CC and CE) identified 1351 non-synonymous SNPs that distinguished the low- and high-caffeine bulks. Gene annotation analysis with Blast2GO revealed that these SNPs correspond to 908 genes with 56 unique KEGG pathways and 49 unique enzymes. Based on KEGG pathway-based analysis, 40 caffeine-associated SNPs were discovered, among which nine SNPs were tightly associated with genes encoding enzymes involved in the conversion of substrates (i.e. SAM, xanthine and IMP) which participate in the caffeine biosynthetic pathways. Likewise, 1060 non-synonymous SNPs were found to distinguish the low- and high-trigonelline bulks. They were associated with 719 genes involved in 61 unique KEGG pathways and 51 unique enzymes. The KEGG pathway-based analysis revealed 24 trigonelline-associated SNPs tightly linked to genes encoding enzymes involved in the conversion of substrates (i.e. SAM, L-tryptophan) which participate in the trigonelline biosynthesis pathways. These SNPs could be useful targets for further functional validation and subsequent application in arabica quality breeding.
Article
Coffea arabica L. is an important agricultural commodity, accounting for 60% of traded coffee worldwide. Nitrogen (N) is a macronutrient that is usually limiting to plant yield; however, molecular mechanisms of plant acclimation to N limitation remain largely unknown in tropical woody crops. In this study, we investigated the transcriptome of coffee roots under N starvation, analyzing poly-A+ libraries and small RNAs. We also evaluated the concentration of selected amino acids and N-source preferences in roots. Ammonium was preferentially taken up over nitrate, and asparagine and glutamate were the most abundant amino acids observed in coffee roots. We obtained 34,654 assembled contigs by mRNA sequencing, and validated the transcriptional profile of 12 genes by RT-qPCR. Illumina small RNA sequencing yielded 8,524,332 non-redundant reads, resulting in the identification of 86 microRNA families targeting 253 genes. The transcriptional pattern of eight miRNA families was also validated. To our knowledge, this is the first catalog of differentially regulated amino acids, N sources, mRNAs, and sRNAs in Arabica coffee roots.
Article
Background Somatic embryogenesis (SE) is a useful biotechnological tool to study the morpho-physiological, biochemical and molecular processes during the development of Coffea canephora. Plant growth regulators (PGR) play a key role during cell differentiation in SE. The Auxin-response-factor (ARF) and Auxin/Indole-3-acetic acid (Aux/IAA) are fundamental components involved in the signaling of the IAA. The IAA signaling pathway activates or represses the expression of genes responsive to auxins during the embryogenic transition of the somatic cells. The growing development of new generation sequencing technologies (NGS), as well as bioinformatics tools, has allowed us to broaden the landscape of SE study of various plant species and identify the genes directly involved. Methods Analysis of transcriptome expression profiles of the C. canephora genome and the identification of a particular set of differentially expressed genes (DEG) during SE are described in this study. Results A total of eight ARF and seven Aux/IAA differentially expressed genes were identified during the different stages of the SE induction process. The quantitative expression analysis showed that ARF18 and ARF5 genes are highly expressed after 21 days of the SE induction, while Aux/IAA7 and Aux/IAA12 genes are repressed. Discussion The results of this study allow a better understanding of the genes involved in the auxin signaling pathway as well as their expression profiles during the SE process.
Article
Coffea arabica L. faces serious problems of disease susceptibility, favored by the low genetic variability of its commercial cultivars; it is therefore important to study different sources of variation that may be useful in genetic improvement. The objective of this study was to irradiate seeds of the C. arabica varieties Geisha, Oro Azteca and Marsellesa with gamma rays to determine the median lethal dose (LD50) and to evaluate their physiological response in terms of germination, survival, plant height (PH), stem diameter (SD), height to the first pair of leaves (HFPL) and leaf area (LA). A Transelektro LGI-01 irradiator was used, with a dose rate of 752.76 Gy·h-1. The irradiation doses evaluated were 0, 100, 200, 300, 400 and 500 Gy. The experiment was established under a completely randomized factorial design with two factors (variety, with three levels, and irradiation dose, with six levels) and three replicates (225 seeds per replicate). Germination was recorded 20 days after sowing. The remaining variables were evaluated 120 days after sowing. The results showed that germination, survival, PH, HFPL, SD and LA were significantly and negatively affected by gamma irradiation from the 200 Gy dose onward in the three varieties. The LD50 was 70 Gy for the Geisha variety, 85 Gy for Marsellesa and 90 Gy for Oro Azteca. Doses below 100 Gy can be used in plant breeding programs for C. arabica.
Article
Coffee is one of the most economically important agricultural commodities in the world. Labeling accuracy and conservation efficiency are essential for coffee germplasm management and for the exchange and utilization in breeding new varieties. However, due to its homogenous genetic background, accurate identification of Coffea arabica germplasm has not been fully achieved. Specifically, data comparison across different laboratories and genotyping platforms has not been available. Here, we report the screening of 672 candidate SNPs using Nano-Fluidic Array genotyping. Based on call rate, Minor Allele Frequency and Linkage Disequilibrium, a set of 96 SNPs were selected for genotyping C. arabica. This validated panel is suitable for use in coffee germplasm conservation and crop improvement, including varietal identification, seeds and nursery accreditation, and coffee bean authentication.
Article
Motivation: High-throughput nucleotide sequencing provides quantitative readouts in assays for RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq) or cell counting (barcode sequencing). Statistical inference of differential signal in such data requires estimation of their variability throughout the dynamic range. When the number of replicates is small, error modelling is needed to achieve statistical power. Results: We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power. Availability: A free open-source R software package, DESeq, is available from the Bioconductor project and from http://www-huber.embl.de/users/anders/DESeq.
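To illustrate why a negative binomial error model is useful for count data, the following minimal Python sketch shows the mean-variance relation of a negative binomial distribution under one common parametrization. It is a simplification and not the DESeq implementation: the abstract states that variance and mean are linked by local regression, whereas this sketch assumes a single fixed dispersion value. The function names `nb_variance` and `nb_size_prob` and the parameter `dispersion` are hypothetical and chosen only for illustration.

```python
def nb_variance(mean: float, dispersion: float) -> float:
    """Variance of a negative binomial with the given mean and dispersion.

    Assumed parametrization: Var = mean + dispersion * mean**2.
    With dispersion = 0 this reduces to the Poisson case (Var = mean).
    """
    return mean + dispersion * mean ** 2


def nb_size_prob(mean: float, dispersion: float) -> tuple[float, float]:
    """Convert (mean, dispersion) to the classic (size, prob) parameters.

    size = 1 / dispersion and prob = size / (size + mean), so that the
    distribution has mean size * (1 - prob) / prob.
    Requires dispersion > 0.
    """
    if dispersion <= 0:
        raise ValueError("dispersion must be positive")
    size = 1.0 / dispersion
    prob = size / (size + mean)
    return size, prob


if __name__ == "__main__":
    # Illustrative values only: a gene with mean count 100.
    for alpha in (0.0, 0.01, 0.1):
        print(alpha, nb_variance(100.0, alpha))
```

With a mean count of 100, a dispersion of 0.1 gives a variance of 1100 rather than the Poisson value of 100, which is the kind of extra variability a Poisson model would miss.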
Article
UDP-glycosyltransferases (EC 2.4.1.x; UGTs) are enzymes coded by an important gene family of higher plants. They are involved in the modification of secondary metabolites, phytohormones, and xenobiotics by transfer of sugar moieties from an activated nucleotide molecule to a wide range of acceptors. This modification regulates various functions like detoxification of xenobiotics, hormone homeostasis, and biosynthesis of secondary metabolites. Here, we describe the identification of 96 UGT genes in Cicer arietinum (CaUGT) and report their tissue-specific differential expression based on publicly available RNA-seq and expressed sequence tag data. This analysis has established medium to high expression of 84 CaUGTs and low expression of 12 CaUGTs. We identified several closely related orthologs of CaUGTs in other genomes and compared their exon-intron arrangement. An attempt was made to assign functional specificity to chickpea UGTs by comparing substrate binding sites with experimentally determined specificity. These findings will assist in precise selection of candidate genes for various applications and understanding functional genomics of chickpea.
Article
Coffee is a valuable beverage crop due to its characteristic flavor, aroma, and the stimulating effects of caffeine. We generated a high-quality draft genome of the species Coffea canephora, which displays a conserved chromosomal gene order among asterid angiosperms. Although it shows no sign of the whole-genome triplication identified in Solanaceae species such as tomato, the genome includes several species-specific gene family expansions, among them N-methyltransferases (NMTs) involved in caffeine production, defense-related genes, and alkaloid and flavonoid enzymes involved in secondary compound synthesis. Comparative analyses of caffeine NMTs demonstrate that these genes expanded through sequential tandem duplications independently of genes from cacao and tea, suggesting that caffeine in eudicots is of polyphyletic origin.
Article
The authors Rabie Saidi and Tunca Dogan were omitted from the list of the UniProt consortium in the acknowledgements section of this paper. The corrected consortium list is provided below. The UniProt Consortium UniProt has been prepared by Rolf Apweiler, Alex Bateman, Maria Jesus Martin, Claire O'Donovan, Michele Magrane, Yasmin Alam–Faruque, Emanuele Alpi, Ricardo Antunes, Joanna Arganiska, Elisabet Barrera Casanova, Benoit Bely, Mark Bingley, Carlos Bonilla, Ramona Britto, Borisas Bursteinas, Wei Mun Chan, Gayatri Chavali, Elena Cibrian–Uhalte, Alan Da Silva, Maurizio De Giorgi, Tunca Dogan, Francesco Fazzini, Paul Gane, Leyla Garcia Castro, Penelope Garmiri, Emma Hatton–Ellis, Reija Hieta, Rachael Huntley, Duncan Legge, Wudong Liu, Jie Luo, Alistair MacDougall, Prudence Mutowo, Andrew Nightingale, Sandra Orchard, Klemens Pichler, Diego Poggioli, Sangya Pundir, Luis Pureza, Guoying Qi, Steven Rosanoff, Rabie Saidi, Tony Sawford, Aleksandra Shypitsyna, Edward Turner, Vladimir Volynkin, Tony Wardell, Xavier Watkins, Hermann Zellner, Matt Corbett, Mike Donnelly, Pieter van Rensburg, Mickael Goujon, Hamish McWilliam and Rodrigo Lopez at the European Bioinformatics Institute (EMBL–EBI); Ioannis Xenarios, Lydie Bougueleret, Alan Bridge, Sylvain Poux, Nicole Redaschi, Lucila Aimo, Andrea Auchincloss, Kristian Axelsen, Parit Bansal, Delphine Baratin, Pierre–Alain Binz, Marie–Claude Blatter, Brigitte Boeckmann, Jerven Bolleman, Emmanuel Boutet, Lionel Breuza, Cristina Casal–Casas, Edouard de Castro, Lorenzo Cerutti, Elisabeth Coudert, Beatrice Cuche, Mikael Doche, Dolnide Dornevil, Severine Duvaud, Anne Estreicher, Livia Famiglietti, Marc Feuermann, Elisabeth Gasteiger, Sebastien Gehant, Vivienne Gerritsen, Arnaud Gos, Nadine Gruaz–Gumowski, Ursula Hinz, Chantal Hulo, Janet James, Florence Jungo, Guillaume Keller, Vicente Lara, Philippe Lemercier, Jocelyne Lew, Damien Lieberherr, Thierry Lombardot, Xavier Martin, Patrick Masson, Anne Morgat, Teresa Neto, Salvo Paesano, 
Ivo Pedruzzi, Sandrine Pilbout, Monica Pozzato, Manuela Pruess, Catherine Rivoire, Bernd Roechert, Michel Schneider, Christian Sigrist, Karin Sonesson, Sylvie Staehli, Andre Stutz, Shyamala Sundaram, Michael Tognolli, Laure Verbregue and Anne–Lise Veuthey at the SIB Swiss Institute of Bioinformatics (SIB); Cathy H. Wu, Cecilia N. Arighi, Leslie Arminski, Chuming Chen, Yongxing Chen, John S. Garavelli, Hongzhan Huang, Kati Laiho, Peter McGarvey, Darren A. Natale, Baris E. Suzek, C. R. Vinayaka, Qinghua Wang, Yuqi Wang, Lai–Su Yeh, Meher Shruti Yerramalla and Jian Zhang at the Protein Information Resource (PIR).
Article
Polyploid plants can exhibit transcriptional modulation in homeologous genes in response to abiotic stresses. Coffea arabica, an allotetraploid, accounts for 75 % of the world's coffee production. Extreme temperatures, salinity and drought limit crop productivity, which includes coffee plants. Mannitol is known to be involved in abiotic stress tolerance in higher plants. This study aimed to investigate the transcriptional responses of genes involved in mannitol biosynthesis and catabolism in C. arabica leaves under water deficit, salt stress and high temperature. Mannitol concentration was significantly increased in leaves of plants under drought and salinity, but reduced by heat stress. Fructose content followed the level of mannitol only in heat-stressed plants, suggesting the partitioning of the former into other metabolites during drought and salt stress conditions. Transcripts of the key enzymes involved in mannitol biosynthesis, CaM6PR, CaPMI and CaMTD, were modulated in distinct ways depending on the abiotic stress. Our data suggest that changes in mannitol accumulation during drought and salt stress in leaves of C. arabica are due, at least in part, to the increased expression of the key genes involved in mannitol biosynthesis. In addition, the homeologs of the Coffea canephora subgenome did not present the same pattern of overall transcriptional response, indicating differential regulation of these genes by the same stimulus. In this way, this study adds new information on the differential expression of C. arabica homeologous genes under adverse environmental conditions showing that abiotic stresses can influence the homeologous gene regulation pattern, in this case, mainly on those involved in mannitol pathway.
Article
The mission of the Universal Protein Resource (UniProt) (http://www.uniprot.org) is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequences and functional annotation. It integrates, interprets and standardizes data from literature and numerous resources to achieve the most comprehensive catalog possible of protein information. The central activities are the biocuration of the UniProt Knowledgebase and the dissemination of these data through our Web site and web services. UniProt is produced by the UniProt Consortium, which consists of groups from the European Bioinformatics Institute (EBI), the SIB Swiss Institute of Bioinformatics (SIB) and the Protein Information Resource (PIR). UniProt is updated and distributed every 4 weeks and can be accessed online for searches or downloads.
Article
The aim of the present study was to perform a genomic analysis of non-specific lipid-transfer proteins (nsLTPs) in coffee. Several nsLTP-encoding cDNA and gene sequences were cloned from Coffea arabica and Coffea canephora species. In this work, their analyses revealed that coffee nsLTPs belong to Type II LTP, characterized in their mature forms by a molecular weight of around 7.3 kDa, a basic isoelectric point of 8.5 and the presence of a typical CXC pattern, with X being a hydrophobic residue facing towards the hydrophobic cavity. Even if several single nucleotide polymorphisms were identified in these nsLTP-coding sequences, 3D predictions showed that they do not have a significant impact on protein functions. Northern blot and RT-qPCR experiments revealed specific expression of Type II nsLTP-encoding genes in coffee fruits, mainly during the early development of endosperm of both C. arabica and C. canephora. As part of our search for tissue-specific promoters in coffee, an nsLTP promoter region of around 1.2 kb was isolated. It contained several DNA repeats including boxes identified as essential for grain specific expression in other plants. The whole fragment, and a series of 5' deletions, were fused to the reporter gene β-glucuronidase (uidA) and analyzed in transgenic Nicotiana tabacum plants. Histochemical and fluorimetric GUS assays showed that the shorter (345 bp) and medium (827 bp) fragments of nsLTP promoter function as grain-specific promoters in transgenic tobacco plants.
Article
We announce the release of an advanced version of the Molecular Evolutionary Genetics Analysis (MEGA) software, which currently contains facilities for building sequence alignments, inferring phylogenetic histories, and conducting molecular evolutionary analysis. In version 6.0, MEGA now enables the inference of timetrees, as it implements our RelTime method for estimating divergence times for all branching points in a phylogeny. A new Timetree Wizard in MEGA6 facilitates this timetree inference by providing a graphical user interface (GUI) to specify the phylogeny and calibration constraints step-by-step. This version also contains enhanced algorithms to search for the optimal trees under evolutionary criteria and implements a more advanced memory management that can double the size of sequence data sets to which MEGA can be applied. Both GUI and command-line versions of MEGA6 can be downloaded from www.megasoftware.net free of charge.
Article
A significant number of terpenoid compounds are glycosides with the sugars linked to the active groups. Sometimes, the glycosidic residue is crucial for their activity, but in other cases glycosylation only improves pharmacokinetic parameters. Enzymatic glycosylation of terpenoids is a useful tool due to the high selectivity and the mildness of the reaction conditions, in comparison with chemical methods. Several types of biocatalysts have been used in the enzymatic glycosylation of terpenoids. These include the use of glycosyltransferases, trans-glycosidases, and whole-cell biotransformation systems capable of regenerating the cofactor, such as fungi, bacteria, plant-cell cultures, etc. Many biosynthesized terpenoid glycosides display medicinal and pharmacological properties and can be used as pro-drug substances. These terpenoid glycosides have also been employed as food additives (e.g. low-caloric sweetener compounds) and cosmetics, and even have applications as controlled-release fragrances.
Article
Guinea grass (Panicum maximum Jacq.) is a tropical African grass often used to feed beef cattle, which is an important economic activity in Brazil. Brazil is the leader in global meat exportation because of its exclusively pasture-raised bovine herds. Guinea grass also has potential uses in bioenergy production due to its elevated biomass generation through the C4 photosynthesis pathway. We generated approximately 13 Gb of data from Illumina sequencing of P. maximum leaves. Four different genotypes were sequenced, and the combined reads were assembled de novo into 38,192 unigenes and annotated; approximately 63% of the unigenes had homology to other proteins in the NCBI non-redundant protein database. Functional classification through COG (Clusters of Orthologous Groups), GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) analyses showed that the unigenes from Guinea grass leaves are involved in a wide range of biological processes and metabolic pathways, including C4 photosynthesis and lignocellulose generation, which are important for cattle grazing and bioenergy production. The most abundant transcripts were involved in carbon fixation, photosynthesis, RNA translation and heavy metal cellular homeostasis. Finally, we identified a number of potential molecular markers, including 5,035 microsatellites (SSRs) and 346,456 single nucleotide polymorphisms (SNPs). To the best of our knowledge, this is the first study to characterize the complete leaf transcriptome of P. maximum using high-throughput sequencing. The biological information provided here will aid in gene expression studies and marker-assisted selection-based breeding research in tropical grasses.
Article
Nutrient response networks are likely to have been among the first response networks to evolve, as the ability to sense and respond to the levels of available nutrients is critical for all organisms. Although several forward genetic screens have been successful in identifying components of plant sugar-response networks, many components remain to be identified. Toward this end, a reverse genetic screen was conducted in Arabidopsis thaliana to identify additional components of sugar-response networks. This screen was based on the rationale that some of the genes involved in sugar-response networks are likely to be themselves sugar regulated at the steady-state mRNA level and to encode proteins with activities commonly associated with response networks. This rationale was validated by the identification of hac1 mutants that are defective in sugar response. HAC1 encodes a histone acetyltransferase. Histone acetyltransferases increase transcription of specific genes by acetylating histones associated with those genes. Mutations in HAC1 also cause reduced fertility, a moderate degree of resistance to paclobutrazol and altered transcript levels of specific genes. Previous research has shown that hac1 mutants exhibit delayed flowering. The sugar-response and fertility defects of hac1 mutants may be partially explained by decreased expression of AtPV42a and AtPV42b, which are putative components of plant SnRK1 complexes. SnRK1 complexes have been shown to function as central regulators of plant nutrient and energy status. Involvement of a histone acetyltransferase in sugar response provides a possible mechanism whereby nutritional status could exert long-term effects on plant development and metabolism.
Article
Bowtie 1 is a fast and memory-efficient program for aligning short reads to mammalian genomes. Burrows-Wheeler indexing allows Bowtie to align more than 25 million 35-bp reads per CPU hour to the human genome in a memory footprint of as little as 1.1 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a quality-aware search algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve greater alignment speed. Bowtie is free, open source software available for download from http://bowtie.cbcb.umd.edu. The Burrows-Wheeler Transformation of a text T, BWT(T), is constructed from the Burrows-Wheeler Matrix of T, the matrix whose rows are all distinct cyclic rotations of T$ sorted lexicographically ($ is "less than" all other characters). BWT(T) is the sequence of characters in the last column of this matrix.
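The construction described above can be sketched in a few lines of Python. This is a naive illustration that builds the full matrix of rotations, which is practical only for short strings; it is not how Bowtie builds its index. The function name `bwt` and the `sentinel` parameter are hypothetical, and the sketch assumes the input uses characters that sort after `$` (true for letters and digits under Python's default string ordering).

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Return the Burrows-Wheeler transform of `text`.

    Builds every cyclic rotation of text + sentinel, sorts the rotations
    lexicographically, and reads off the last column of the sorted matrix.
    """
    if sentinel in text:
        raise ValueError("text must not contain the sentinel character")
    if any(ch < sentinel for ch in text):
        # The sentinel must be "less than" all other characters.
        raise ValueError("text contains a character that sorts before the sentinel")
    t = text + sentinel
    # Rows of the Burrows-Wheeler Matrix: all cyclic rotations, sorted.
    rotations = sorted(t[i:] + t[:i] for i in range(len(t)))
    # BWT(T) is the last column of that matrix.
    return "".join(row[-1] for row in rotations)


if __name__ == "__main__":
    print(bwt("acaacg"))  # prints gc$aaac
    print(bwt("banana"))  # prints annb$aa
```

Sorting whole rotations costs quadratic memory in the text length, so genome-scale indexes rely on suffix-array-based construction instead; the sketch only makes the definition concrete.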
Article
Background Cymbidium sinense belongs to the Orchidaceae, which is one of the most abundant angiosperm families. C. sinense, a high-grade traditional potted flower, is most prevalent in China and some Southeast Asian countries. The control of flowering time is a major bottleneck in the industrialized development of C. sinense. Little is known about the mechanisms responsible for floral development in this orchid. Moreover, genome references for entire transcriptome sequences do not currently exist for C. sinense. Thus, transcriptome and expression profiling data for this species are needed as an important resource to identify genes and to better understand the biological mechanisms of floral development in C. sinense. Results In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. Transcriptome analysis assembles gene-related information related to vegetative and reproductive growth of C. sinense. Illumina sequencing generated 54,248,006 high quality reads that were assembled into 83,580 unigenes with an average sequence length of 612 base pairs, including 13,315 clusters and 70,265 singletons. A total of 41,687 (49.88%) unique sequences were annotated, 23,092 of which were assigned to specific metabolic pathways by the Kyoto Encyclopedia of Genes and Genomes (KEGG). Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Furthermore, 120 flowering-associated unigenes, 73 MADS-box unigenes and 28 CONSTANS-LIKE (COL) unigenes were identified from our collection. In addition, three digital gene expression (DGE) libraries were constructed for the vegetative phase (VP), floral differentiation phase (FDP) and reproductive phase (RP). The specific expression of many genes in the three development phases was also identified. 
32 genes among three sub-libraries with high differential expression were selected as candidates connected with flower development. Conclusion RNA-seq and DGE profiling data provided comprehensive gene expression information at the transcriptional level that could facilitate our understanding of the molecular mechanisms of floral development at three development phases of C. sinense. This data could be used as an important resource for investigating the genetics of the flowering pathway and various biological mechanisms in this orchid.
Article
Background Cocos nucifera (coconut), a member of the Arecaceae family, is an economically important woody palm grown in tropical regions. Despite its agronomic importance, previous germplasm assessment studies have relied solely on morphological and agronomical traits. Molecular biology techniques have been scarcely used in assessment of genetic resources and for improvement of important agronomic and quality traits in Cocos nucifera, mostly due to the absence of available sequence information. Methodology/Principal Findings To provide basic information for molecular breeding and further molecular biological analysis in Cocos nucifera, we applied RNA-seq technology and de novo assembly to gain a global overview of the Cocos nucifera transcriptome from mixed tissue samples. Using Illumina sequencing, we obtained 54.9 million short reads and conducted de novo assembly to obtain 57,304 unigenes with an average length of 752 base pairs. Sequence comparison between assembled unigenes and released cDNA sequences of Cocos nucifera and Elaeis guineensis indicated that the assembled sequences were of high quality. Approximately 99.9% of unigenes were novel compared to the released coconut EST sequences. Using BLASTX, 68.2% of unigenes were successfully annotated based on the Genbank non-redundant (Nr) protein database. The annotated unigenes were then further classified using the Gene Ontology (GO), Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Conclusions/Significance Our study provides a large quantity of novel genetic information for Cocos nucifera. This information will act as a valuable resource for further molecular genetic studies and breeding in coconut, as well as for isolation and characterization of functional genes involved in different biochemical pathways in this important tropical crop species.
Article
Coffee quality, in the present context of overproduction worldwide, has to be considered as a main selection criterion for coffee improvement. After a definition of quality, and an overview of the non genetic factors affecting its variation, this review focuses on the genetic factors involved in the control of coffee quality variation. Regarding the complexity of this trait, the different types of quality are first presented. Then, the great variation within and between coffee species is underlined, mainly for biochemical compounds related to quality (caffeine, sugars, chlorogenic acids, lipids). The ways for breeding quality traits for cultivated species, Coffea arabica and Coffea canephora are discussed, with specific challenges for each species. For C. arabica, maintaining a good quality in F1 intraspecific hybrids, introgressed lines from Timor hybrid, and grafted varieties are the main challenges. For C. canephora, the improvement is mainly based on intraspecific and interspecific hybrids, using the whole genetic variability available within this species. An improvement is obtained for bean size, with significant genetic gains in current breeding programmes. The content in biochemical compounds related to cup quality is another way to improve Robusta quality. Finally, ongoing programmes towards the understanding of the molecular determinism of coffee quality, particularly using coffee ESTs, are presented.
Article
The ESTHER database, which is freely available via a web server (http://bioweb.ensam.inra.fr/esther) and is widely used, is dedicated to proteins with an α/β-hydrolase fold, and it currently contains >30 000 manually curated proteins. Herein, we report the substantial improvements that we have made to ESTHER during the 8 years since our 2004 update. In particular, we generated 87 new families and increased the coverage of the UniProt Knowledgebase (UniProtKB). We also renewed the ESTHER website and added new visualization tools, such as the Overall Table and the Family Tree. We also address two topics of particular interest to the ESTHER users. First, we explain how the different enzyme classifications (bacterial lipases, peptidases, carboxylesterases) used by different communities of users are combined in ESTHER. Second, we discuss how variations of core architecture or in predicted active site residues result in a more precise clustering of families, and whether this strategy provides trustworthy hints to identify enzyme-like proteins with no catalytic activity.
Article
Plant growth requires cell wall extension. The cotton AtRD22-Like 1 gene GhRDL1, predominately expressed in elongating fiber cells, encodes a BURP domain-containing protein. Here, we show that GhRDL1 is localized in cell wall and interacts with GhEXPA1, an α-expansin functioning in wall loosening. Transgenic cotton over-expressing GhRDL1 showed an increase of fiber length and seed mass, and an enlargement of endopleura cells of ovules. Expression of either GhRDL1 or GhEXPA1 alone in Arabidopsis led to a substantial increase in seed size; interestingly, their co-expression resulted in the increased number of siliques, the nearly doubled seed mass, and the enhanced biomass production. Cotton plants over-expressing GhRDL1 and GhEXPA1 proteins produced strikingly more fruits (bolls), leading to up to 40% higher fiber yield per plant without adverse effects on fiber quality and vegetative growth. We demonstrate that engineering cell wall protein partners has a great potential in promoting plant growth and crop yield.
Article
High-throughput DNA sequencing is a powerful and versatile new technology for obtaining comprehensive and quantitative data about RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq), and genetic variations between individuals. It addresses essentially all of the use cases that microarrays were applied to in the past, but produces more detailed and more comprehensive results. One of the basic statistical tasks is inference (testing, regression) on discrete count values (e.g., representing the number of times a certain type of mRNA was sampled by the sequencing machine). Challenges are posed by a large dynamic range, heteroskedasticity and small numbers of replicates. Hence, model-based approaches are needed to achieve statistical power. I will present an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power. I will also discuss how to use the GLM framework to detect alternative transcript isoform usage. A free open-source R software package, DESeq, is available from the Bioconductor project.
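The mean-variance link mentioned in this abstract can be illustrated with a small sketch. It assumes the common negative binomial parameterization in which the variance is the mean plus a dispersion term times the squared mean, and it estimates the dispersion of a single gene by the method of moments. This is not the DESeq implementation (which, per the abstract, links variance and mean by local regression across genes); the function names and the replicate counts are hypothetical.

```python
from statistics import mean, variance


def nb_variance(mu, dispersion):
    """Variance of a negative binomial count with mean mu (assumed form: mu + dispersion * mu^2)."""
    return mu + dispersion * mu ** 2


def moment_dispersion(counts):
    """Method-of-moments dispersion for one gene: (s^2 - mean) / mean^2, floored at 0."""
    m = mean(counts)
    if m == 0:
        return 0.0
    s2 = variance(counts)  # sample variance, needs at least two replicates
    return max((s2 - m) / m ** 2, 0.0)


# Hypothetical read counts for one gene across four replicates.
replicate_counts = [90, 100, 110, 140]
alpha = moment_dispersion(replicate_counts)
print("estimated dispersion:", alpha)
print("implied variance at mean 110:", nb_variance(110, alpha))
```

With a dispersion of zero the variance equals the mean, which is the Poisson case; any positive dispersion makes the variance grow faster than the mean, which is the overdispersion that count-based RNA-seq tests are designed to absorb.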
Article
Accuracy in quantitative real-time polymerase chain reaction (qPCR) requires the use of stable endogenous controls. Normalization with multiple reference genes is the gold standard, but their identification is a laborious task, especially in species with limited sequence information. Coffee (Coffea spp.) is an important agricultural commodity and, due to its economic relevance, is the subject of increasing research in genetics and biotechnology, in which gene expression analysis is one of the most important fields. Nevertheless, relatively few works have focused on the analysis of gene expression in coffee. Moreover, most of these works have used less accurate techniques such as northern blot assays instead of more accurate techniques (e.g., qPCR) that have already been extensively used in other plant species. Aiming to boost the use of qPCR in studies of gene expression in coffee, we identified reference genes to be used in a number of different experimental conditions. Using two distinct algorithms implemented by geNorm and NormFinder, we evaluated a total of eight candidate reference genes (psaB, PP2A, AP47, S24, GAPDH, rpl39, UBQ10, and UBI9) in four different experimental sets (control versus drought-stressed leaves, control versus drought-stressed roots, leaves of three different coffee cultivars, and four different coffee organs). The most suitable combination of reference genes was indicated in each experimental set for use as internal control for reliable qPCR data normalization. This study also provides useful guidelines for reference gene selection for researchers working with coffee plant samples under conditions other than those tested here.
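As a rough illustration of how a geNorm-style ranking works, the sketch below computes a stability value M for each candidate gene: the average, over all other candidates, of the standard deviation of the log2 expression ratio across samples. Lower M means more stable. This is a simplified reading of the published measure, not the geNorm software itself, and the gene labels and relative quantities are invented for the example.

```python
from math import log2
from statistics import mean, stdev


def stability_m(expression):
    """Return {gene: M}, where M is the mean pairwise SD of log2 ratios to every other gene.

    expression maps a gene name to its relative quantities, one value per sample,
    with samples in the same order for every gene.
    """
    genes = list(expression)
    m_values = {}
    for gene in genes:
        pairwise_sd = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [log2(a / b) for a, b in zip(expression[gene], expression[other])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene] = mean(pairwise_sd)
    return m_values


# Hypothetical relative quantities for three candidate genes in three samples.
candidates = {
    "refA": [1.0, 2.0, 4.0],
    "refB": [2.0, 4.0, 8.0],  # tracks refA exactly, so their ratio is constant
    "refC": [1.0, 4.0, 1.0],  # varies independently of the other two
}
for gene, m in sorted(stability_m(candidates).items(), key=lambda item: item[1]):
    print(gene, round(m, 3))
```

The measure rewards genes whose ratio to the other candidates stays constant across samples, which is why co-regulated candidates can look deceptively stable and why several algorithms are usually compared.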
Article
Seeds from open-pollinated flowers collected from hybrids of several Coffea species were analysed for caffeine content. The caffeine content was not always intermediate between those of the parents; higher and lower values were found. Diploid F1 hybrids between accessions of C. eugenioides and C. salvatrix showed the lowest seed caffeine content. Seeds of the tetraploid hybrids C. arabica × C. salvatrix and C. arabica × C. eugenioides presented low caffeine content. The possibility of breeding coffee to reduce the caffeine content in the seeds by interspecific hybridization of C. arabica with other Coffea species is discussed.
Article
Two Coffea arabica — Hemileia vastatrix incompatible interactions (I1: coffee cv. Caturra — rust race VI and I2: coffee cv S4 Agaro — rust race II) and a compatible interaction (coffee cv. Caturra — rust race II) were compared in relation to the infection process and chitinase activity. In the two incompatible interactions the fungus ceased growth in the early infection stages, while in the compatible interaction no fungus growth inhibition was observed. A high constitutive level of chitinase activity was detected in the intercellular fluid of healthy leaves. Upon infection, chitinase isoforms were more abundant in incompatible interactions than in the compatible interaction. Immunodetection showed that class I chitinases are particularly relevant in the incompatible interactions and might participate in the defence response of the coffee plants.
Article
Plant chitinases, a class of glycosyl hydrolases, participate in various aspects of normal plant growth and development, including cell wall metabolism and disease resistance. The rice (Oryza sativa) genome encodes 37 putative chitinases and chitinase-like proteins. However, none of them has been characterized at the genetic level. In this study, we report the isolation of a brittle culm mutant, bc15, and the map-based cloning of the BC15/OsCTL1 (for chitinase-like1) gene affected in the mutant. The gene encodes the rice chitinase-like protein BC15/OsCTL1. Mutation of BC15/OsCTL1 causes reduced cellulose content and mechanical strength without obvious alterations in plant growth. Bioinformatic analyses indicated that BC15/OsCTL1 is a class II chitinase-like protein that is devoid of both an amino-terminal cysteine-rich domain and the chitinase activity motif H-E-T-T but possesses an amino-terminal transmembrane domain. Biochemical assays demonstrated that BC15/OsCTL1 is a Golgi-localized type II membrane protein that lacks classical chitinase activity. Quantitative real-time polymerase chain reaction and β-glucuronidase activity analyses indicated that BC15/OsCTL1 is ubiquitously expressed. Investigation of the global expression profile of wild-type and bc15 plants, using Illumina RNA sequencing, further suggested a possible mechanism by which BC15/OsCTL1 mediates cellulose biosynthesis and cell wall remodeling. Our findings provide genetic evidence of a role for plant chitinases in cellulose biosynthesis in rice, which appears to differ from their roles as revealed by analysis of Arabidopsis (Arabidopsis thaliana).
Article
Like many other plant defense compounds, glucosinolates are present constitutively in plant tissues, but are also induced to higher levels by herbivore attack. Of the major glucosinolate types, indolic glucosinolates are most frequently induced regardless of the type of herbivore involved. Over 90% of previous studies found that herbivore damage to glucosinolate-containing plants led to an increased accumulation of indolic glucosinolates at levels ranging up to 20-fold. Aliphatic and aromatic glucosinolates are also commonly induced by herbivores, though usually at much lower magnitudes than indolic glucosinolates, and aliphatic and aromatic glucosinolates may even undergo declines following herbivory. The glucosinolate defense system also requires another partner, the enzyme myrosinase, to hydrolyze the parent glucosinolates into biologically active derivatives. Much less is known about myrosinase induction after herbivory compared to glucosinolate induction, and no general trends are evident. However, it is clear that insect feeding stimulates the formation of various myrosinase associated proteins whose function is not yet understood. The biochemical mechanism of glucosinolate induction involves a jasmonate signaling cascade that leads eventually to increases in the transcript levels of glucosinolate biosynthetic genes. Several recently described transcription factors controlling glucosinolate biosynthesis are activated by herbivory or wounding. Herbivore induction of glucosinolates has sometimes been demonstrated to increase protection against subsequent herbivore attack, but more research is needed to evaluate the costs and benefits of this phenomenon.
Article
Coffee is one of the main agricultural products and is considered the second most important item in international commodity trade. The genus Coffea belongs to the family Rubiaceae, which also includes other important plants. The genus contains approximately 100 species, but commercial production is based on only two, Coffea arabica and Coffea canephora, which account for approximately 70% and 30% of the total coffee market, respectively. The Brazilian Coffee Genome Project was developed to make modern genomics resources available to the scientific community and to the different segments of the coffee production chain. To this end, 214,964 randomly chosen clones from 37 cDNA libraries of C. arabica, C. canephora and C. racemosa, representing specific developmental stages of coffee cells and tissues, were sequenced, yielding 130,792, 12,381 and 10,566 sequences from each species, respectively, after trimming. The ESTs were assembled into 17,982 contigs and 32,155 singletons. Comparison of these sequences using BLAST revealed that 22% had no significant similarity to sequences in the National Center for Biotechnology Information database (of known or unknown function). The coffee EST database resulted in the identification of about 33,000 different unigenes. The sequence annotation results were stored in an online database at http://www.lge.ibi.unicamp.br/cafe. The resources developed by this project provide genetic and genomic tools that may be decisive for the sustainability, competitiveness and future viability of the coffee agroindustry in domestic and foreign markets.
Article
Coffee trees (Rubiaceae) and tomato (Solanaceae) belong to the Asterid clade, while grapevine (Vitaceae) belongs to the Rosid clade. Coffee and tomato separated from grapevine 125 million years ago, while coffee and tomato diverged 83-89 million years ago. These long periods of divergent evolution should have permitted the genomes to reorganize significantly. So far, very few comparative mappings have been performed between very distantly related species belonging to different clades. We report the first multiple comparison between species from Asterid and Rosid clades, to examine both macro- and microsynteny relationships. Thanks to a set of 867 COSII markers, macrosynteny was detected between coffee, tomato and grapevine. While coffee and tomato genomes share 318 orthologous markers and 27 conserved syntenic segments (CSSs), coffee and grapevine also share a similar number of syntenic markers and CSSs: 299 and 29 respectively. Despite large genome macrostructure reorganization, several large chromosome segments showed outstanding macrosynteny shedding new insights into chromosome evolution between Asterids and Rosids. We also analyzed a sequence of 174 kb containing the ovate gene, conserved in a syntenic block between coffee, tomato and grapevine that showed a high-level of microstructure conservation. A higher level of conservation was observed between coffee and grapevine, both woody and long life-cycle plants, than between coffee and tomato. Out of 16 coffee genes of this syntenic segment, 7 and 14 showed complete synteny between coffee and tomato or grapevine, respectively. These results show that significant conservation is found between distantly related species from the Asterid (Coffea canephora and Solanum sp.) and Rosid (Vitis vinifera) clades, at the genome macrostructure and microstructure levels. At the ovate locus, conservation did not decline in relation to increasing phylogenetic distance, suggesting that the time factor alone does not explain divergences. Our results are considerably useful for syntenic studies between supposedly remote species for the isolation of important genes for agronomy.
Article
Plant chitinases (EC 3.2.1.14) belong to relatively large gene families subdivided in classes that suggest class-specific functions. They are commonly induced upon the attack of pathogens and by various sources of stress, which led to associating them with plant defense in general. However, it is becoming apparent that most of them display several functions during the plant life cycle, including taking part in developmental processes such as pollination and embryo development. The number of chitinases combined with their multiple functions has been an obstacle to a better understanding of their role in plants. It is therefore important to identify and inventory all chitinase genes of a plant species to be able to dissect their function and understand the relations between the different classes. Complete sequencing of the Arabidopsis genome has made this task feasible and we present here a survey of all putative chitinase-encoding genes accompanied by a detailed analysis of their sequence. Based on their characteristics and on studies on other plant chitinases, we propose an overview of their possible functions as well as modified annotations for some of them.
Article
Abiotic stresses are among the most important factors that affect food production. Transcriptional modulation is one important step in how plants face these environmental challenges. Quantitative real-time PCR is a rapid, sensitive, and reliable method for the detection of mRNAs and it has become a powerful tool for studying plant stress tolerance; however, suitable reference genes are required for data normalization. Reference genes for coffee plants during nitrogen starvation, salinity and heat stress have not yet been reported. We evaluated the expression stability of ten candidate reference genes using the geNorm PLUS, NormFinder, and BestKeeper software, in plants submitted to nitrogen starvation, salt and heat stress. EF1, EF1α, GAPDH, MDH, and UBQ10 were ranked as the most stable genes in all stresses and software analyses, while RPL39 and RPII were classified as the least reliable references. For reference gene validation, the transcriptional pattern of a Coffea non-symbiotic hemoglobin (CaHb1) was analyzed using the two newly recommended reference genes and the most unstable one for normalization. The most unstable gene may lead to incorrect interpretation of CaHb1 transcriptional analysis. Here, we recommend two new reference genes in Coffea for use in data normalization in abiotic stresses: MDH and EF1.
Article
With the arrival of low-cost, next-generation sequencing, a multitude of new plant genomes are being publicly released, providing unseen opportunities and challenges for comparative genomics studies. Here, we present PLAZA 2.5, a user-friendly online research environment to explore genomic information from different plants. This new release features updates to previous genome annotations and a substantial number of newly available plant genomes as well as various new interactive tools and visualizations. Currently, PLAZA hosts 25 organisms covering a broad taxonomic range, including 13 eudicots, five monocots, one lycopod, one moss, and five algae. The available data consist of structural and functional gene annotations, homologous gene families, multiple sequence alignments, phylogenetic trees, and colinear regions within and between species. A new Integrative Orthology Viewer, combining information from different orthology prediction methodologies, was developed to efficiently investigate complex orthology relationships. Cross-species expression analysis revealed that the integration of complementary data types extended the scope of complex orthology relationships, especially between more distantly related species. Finally, based on phylogenetic profiling, we propose a set of core gene families within the green plant lineage that will be instrumental to assess the gene space of draft or newly sequenced plant genomes during the assembly or annotation phase.
Article
Allopolyploidy is considered as a major factor contributing to speciation, diversification, and plant ecological adaptation. In particular, the expression of duplicate genes (homeologs) can be altered leading to functional plasticity and to phenotypic novelty. This study investigated the influence of growing temperatures on homeologous gene expression in Coffea arabica L., a recent allopolyploid involving 2 closely related diploid parental species. The relative expression of homeologs of 13 genes all located in the same genomic region was analyzed using an SNP ratio quantification method based on dideoxy-terminated sequences of cDNA amplicons. The relative expression of homeologous genes varied depending on the gene, the organ, and the growing condition. Nevertheless, expression of both homeologs was always detected (i.e., no silencing). Although the growing conditions were suitable for one or other of the parental species, neither subgenome appeared preferentially expressed. Furthermore, relative homeologous expression showed moderate variations across organs and conditions and appeared uncorrelated between adjacent genes. These results indicate the absence of signs of subfunctionalization suggesting C. arabica has not undergone noticeable diploidization. Furthermore, these results suggest that the expression of homeologous genes in C. arabica is regulated by a shared trans-regulation mechanism acting similarly on the 2 subgenomes and that the observed biases in the relative homeolog expression may result from cis fine-scale factors.
Article
The root phenotype of an Arabidopsis (Arabidopsis thaliana) mutant of CHITINASE-LIKE1 (CTL1), called arm (for anion-related root morphology), was previously shown to be conditional on growth on high nitrate, chloride, or sucrose. Mutants grown under restrictive conditions displayed inhibition of primary root growth, radial swelling, proliferation of lateral roots, and increased root hair density. We found here that the spatial pattern of CTL1 expression was mainly in the root and root tips during seedling development and that the protein localized to the cell wall. Fourier-transform infrared microspectroscopy of mutant root tissues indicated differences in spectra assigned to linkages in cellulose and pectin. Indeed, root cell wall polymer composition analysis revealed that the arm mutant contained less crystalline cellulose and reduced methylesterification of pectins. We also explored the implication of growth regulators on the phenotype of the mutant response to the nitrate supply. Exogenous abscisic acid application inhibited more drastically primary root growth in the arm mutant but failed to repress lateral branching compared with the wild type. Cytokinin levels were higher in the arm root, but there were no changes in mitotic activity, suggesting that cytokinin is not directly involved in the mutant phenotype. Ethylene production was higher in arm but inversely proportional to the nitrate concentration in the medium. Interestingly, eto2 and eto3 ethylene overproduction mutants mimicked some of the conditional root characteristics of the arm mutant on high nitrate. Our data suggest that ethylene may be involved in the arm mutant phenotype, albeit indirectly, rather than functioning as a primary signal.
Article
Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.
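The de Bruijn graph idea named in this abstract can be shown with a toy sketch: each read is cut into overlapping k-mers, every k-mer becomes an edge between its (k-1)-base prefix and suffix, and a contig is read off by following unambiguous edges. This is only the textbook construction under simplifying assumptions (error-free reads, a single strand, no coverage weighting); it is not Trinity's actual pipeline, and the reads and function names are made up for illustration.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it in some k-mer."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def extend_contig(graph, start):
    """Extend from a start node while exactly one successor exists; stop at a branch, dead end, or cycle."""
    contig = start
    node = start
    visited = {start}
    while len(graph.get(node, ())) == 1:
        (successor,) = graph[node]
        if successor in visited:
            break  # avoid looping forever on a cycle
        contig += successor[-1]
        visited.add(successor)
        node = successor
    return contig


# Two hypothetical overlapping reads from the same transcript.
reads = ["ATGGCG", "GGCGTA"]
graph = build_de_bruijn(reads, k=4)
print(extend_contig(graph, "ATG"))  # ATGGCGTA
```

Nodes with more than one successor are where real assemblers have to decide between sequencing errors, alternatively spliced isoforms, and recently duplicated genes, which is the part this sketch deliberately leaves out.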
Article
Whole-genome duplication (WGD), or polyploidy, followed by gene loss and diploidization has long been recognized as an important evolutionary force in animals, fungi and other organisms, especially plants. The success of angiosperms has been attributed, in part, to innovations associated with gene or whole-genome duplications, but evidence for proposed ancient genome duplications pre-dating the divergence of monocots and eudicots remains equivocal in analyses of conserved gene order. Here we use comprehensive phylogenomic analyses of sequenced plant genomes and more than 12.6 million new expressed-sequence-tag sequences from phylogenetically pivotal lineages to elucidate two groups of ancient gene duplications: one in the common ancestor of extant seed plants and the other in the common ancestor of extant angiosperms. Gene duplication events were intensely concentrated around 319 and 192 million years ago, implicating two WGDs in ancestral lineages shortly before the diversification of extant seed plants and extant angiosperms, respectively. Significantly, these ancestral WGDs resulted in the diversification of regulatory genes important to seed and flower development, suggesting that they were involved in major innovations that ultimately contributed to the rise and eventual dominance of seed plants and angiosperms.
Article
Arabica coffee (Coffea arabica L.) is a self-compatible perennial allotetraploid species (2n=4x=44), whereas Robusta coffee (C. canephora L.) is a self-incompatible perennial diploid species (2n=2x=22). C. arabica (C(a)C(a)E(a)E(a)) is derived from a spontaneous hybridization between two closely related diploid coffee species, C. canephora (CC) and C. eugenioides (EE). To investigate the patterns and degree of DNA sequence divergence between the Arabica and Robusta coffee genomes, we identified orthologous bacterial artificial chromosomes (BACs) from C. arabica and C. canephora, and compared their sequences to trace their evolutionary history. Although a high level of sequence similarity was found between BACs from C. arabica and C. canephora, numerous chromosomal rearrangements were detected, including inversions, deletions and insertions. DNA sequence identity between C. arabica and C. canephora orthologous BACs ranged from 93.4% (between E(a) and C(a)) to 94.6% (between C(a) and C). Analysis of eight orthologous gene pairs resulted in estimated ages of divergence between 0.046 and 0.665 million years, indicating a recent origin of the allotetraploid species C. arabica. Analysis of transposable elements revealed differential insertion events that contributed to the size increase in the C(a) sub-genome compared to its diploid relative. In particular, we showed that insertion of a Ty1-copia LTR retrotransposon occurred specifically in C. arabica, probably shortly after allopolyploid formation. The two sub-genomes of C. arabica, C(a) and E(a), showed sufficient sequence differences, and a whole-genome shotgun approach could be suitable for sequencing the allotetraploid genome of C. arabica.
Article
Genetic variation for seed dormancy in nature is a typical quantitative trait controlled by multiple loci on which environmental factors have a strong effect. Finding the genes underlying dormancy quantitative trait loci is a major scientific challenge, which also has relevance for agriculture and ecology. In this study we describe the identification of the DELAY OF GERMINATION 1 (DOG1) gene previously identified as a quantitative trait locus involved in the control of seed dormancy. This gene was isolated by a combination of positional cloning and mutant analysis and is absolutely required for the induction of seed dormancy. DOG1 is a member of a small gene family of unknown molecular function, with five members in Arabidopsis. The functional natural allelic variation present in Arabidopsis is caused by polymorphisms in the cis-regulatory region of the DOG1 gene and results in considerable expression differences between the DOG1 alleles of the accessions analyzed.
Article
Plastid engineering provides several advantages for the next generation of transgenic technology, including the convenient use of transgene stacking and the generation of high expression levels of foreign proteins. With the goal of generating transplastomic plants with multiresistance against both phytopathogens and insects, a construct containing a monocistronic patterned gene stack harbouring sweet potato sporamin, taro cystatin and chitinase from Paecilomyces javanicus was transformed into Nicotiana benthamiana plastids. Transplastomic lines were screened and characterized by Southern/Northern/Western blot analysis to confirm transgene integration and the respective expression levels. Immunogold localization analyses confirmed the high-level accumulation of proteins that were specifically expressed in leaf and root plastids. Subsequent functional bioassays confirmed that the gene stacks conferred a high level of resistance against both insects and phytopathogens. Specifically, larvae of Spodoptera litura and Spodoptera exigua either died or exhibited growth retardation after ingesting transplastomic plant leaves. In addition, marked inhibitory effects were observed on both leaf spot disease caused by Alternaria alternata and soft rot disease caused by Pectobacterium carotovorum subsp. carotovorum. Moreover, tolerance to abiotic stresses such as salt/osmotic stress was highly enhanced. The results confirmed that the simultaneous expression of sporamin, cystatin and chitinase conferred a broad spectrum of resistance. Conversely, the expression of single transgenes was not capable of conferring such resistance. To the best of our knowledge, this is the first study to demonstrate an efficacious stacked combination of plastid-expressed defence genes which resulted in an engineered tolerance to various abiotic and biotic stresses.
Article
One of the reasons for the worldwide growing coffee consumption is the pleasant flavor of the final coffee beverage. It is well known that the character impact compounds of coffee are not present in the green state but are mainly formed during roasting. However, an extensive literature review indicated that more than 200 green coffee volatiles have been identified so far. Furthermore, the isolation and chemical identification of a range of compounds that have not been described in green coffee yet is reported by the present work. Results of GC-olfactometry reveal that only a few have an aroma impact on the typical flavor of green coffee. Some compounds may survive roasting and may contribute to the final roasted coffee flavor.
Article
Polyploidy has occurred throughout the evolutionary history of plants and led to diversification and plant ecological adaptation. Functional plasticity of duplicate genes is believed to play a major role in the environmental adaptation of polyploids. In this context, we characterized genome-wide homoeologous gene expression in Coffea arabica, a recent allopolyploid combining two subgenomes that derive from two closely related diploid species, and investigated its variation in response to changing environment. The transcriptome of leaves of C. arabica cultivated at different growing temperatures suitable for one or the other parental species was examined using RNA-sequencing. The relative contribution of homoeologs to gene expression was estimated for 9959 and 10,628 genes in warm and cold conditions, respectively. Whatever the growing conditions, 65% of the genes showed equivalent levels of homoeologous gene expression. In 92% of the genes, relative homoeologous gene expression varied < 10% between growing temperatures. The subgenome contributions to the transcriptome appeared to be only marginally altered by the different conditions (involving intertwined regulations of homoeologs), suggesting that C. arabica's ability to tolerate a broader range of growing temperatures than its diploid parents does not result from differential use of homoeologs.
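As a rough illustration of the measure described in the abstract above, the relative contribution of a homoeolog can be expressed as its share of the reads assigned to either subgenome for a given gene, and the change in that share can be compared between growing conditions. The sketch below is not the authors' pipeline; the function names and read counts are hypothetical and chosen only to show the arithmetic.

```python
def homoeolog_share(reads_a, reads_b):
    """Fraction of subgenome-assigned reads for one gene that come from homoeolog A."""
    total = reads_a + reads_b
    if total == 0:
        raise ValueError("no reads assigned to either homoeolog")
    return reads_a / total


def share_shift(warm, cold):
    """Absolute change in the homoeolog-A share between two conditions.

    warm and cold are (reads_a, reads_b) tuples of hypothetical read counts.
    """
    return abs(homoeolog_share(*warm) - homoeolog_share(*cold))


# Hypothetical counts for a single gene in warm and cold conditions.
shift = share_shift(warm=(120, 80), cold=(110, 90))
print(f"shift in homoeolog-A share: {shift:.2f}")
```

Under this definition, a gene whose shift stays below 0.10 would fall in the group whose relative homoeologous expression varied by less than 10% between temperatures.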
Article
Two pathogen-induced uridine diphosphate glycosyltransferases (UGTs) identified previously via co-expression with induced proanthocyanidin (PA) synthesis in poplar were cloned and characterized. Phylogenetic analysis grouped both genes with other known flavonoid UGTs that act on flavonols and anthocyanins. Recombinant enzymes were produced in order to test whether they could glycosylate flavonoids. PtUGT78L1 accepted the flavonols quercetin and kaempferol as well as cyanidin, and used UDP-galactose as a sugar donor. PtUGT78M1 did not accept any of the flavonoids tested as a substrate, but did transfer glucose from UDP-glucose to the universal substrate 2,4,6-trichlorophenol. However, neither enzyme acted on the flavan-3-ols catechin or epicatechin, intermediates in the PA biosynthetic pathway.
Article
Nonhost resistance (NHR) of plants to fungal pathogens comprises different defense layers. Epidermal penetration resistance of Arabidopsis to Phakopsora pachyrhizi requires functional PEN1, PEN2 and PEN3 genes, whereas post-invasion resistance in the mesophyll depends on the combined functionality of PEN2, PAD4 and SAG101. Other genetic components of Arabidopsis post-invasion mesophyll resistance remain elusive. We performed comparative transcriptional profiling of wild-type, pen2 and pen2 pad4 sag101 mutants after inoculation with P. pachyrhizi to identify a novel trait for mesophyll NHR. Quantitative reverse transcription-polymerase chain reaction (RT-qPCR) analysis and microscopic analysis confirmed the essential role of the candidate gene in mesophyll NHR. UDP-glucosyltransferase UGT84A2/bright trichomes 1 (BRT1) is a novel component of Arabidopsis mesophyll NHR to P. pachyrhizi. BRT1 is a putative cytoplasmic enzyme in phenylpropanoid metabolism. BRT1 is specifically induced in pen2 with post-invasion resistance to P. pachyrhizi. Silencing or mutation of BRT1 increased haustoria formation in pen2 mesophyll. Yet, the brt1 mutation did not affect NHR to P. pachyrhizi in wild-type plants. We assign a novel function to BRT1, which is important for post-invasion NHR of Arabidopsis to P. pachyrhizi. BRT1 might serve to confer durable resistance against P. pachyrhizi to soybean.
Article
ABP19 and ABP20 (ABP19/20) have been isolated from the shoot apices of peach (Prunus persica L.) as proteins specifically bound to auxins. Sequence analysis has revealed that ABP19/20 are related to families of germin and germin-like proteins. Here, I showed that ABP19/20 were most abundant in young leaves and that their expression fluctuated during the light–dark cycle. These patterns of expression coincided with those of SaGLP and PnGLP, sequence homologues of ABP19/20. Both SaGLP and PnGLP have been isolated as transcripts that are specifically up-regulated in leaves during photoperiodic induction of flowering. In contrast, ABP19/20 expression in leaves was not correlated with the timing of flower bud initiation. In buds, ABP20 was expressed throughout development, whereas the expression of ABP19 mRNA was extremely low. ABP20 may have some role in the differentiation and development of both floral and vegetative buds. I also showed that the ABP19/20 protein has superoxide dismutase (SOD) activity. Several GLPs of different clades of the phylogenetic tree have SOD activity, suggesting that SOD may be widespread throughout proteins of the GLP family. Analysis of the 5′-flanking region of ABP19 revealed putative regulatory elements, including those that were auxin-responsive, light-regulated, and clock-responsive.
Article
A cDNA encoding CYP79B1 has been isolated from Sinapis alba. CYP79B1 from S. alba shows 54% sequence identity and 73% similarity to sorghum CYP79A1 and 95% sequence identity to the Arabidopsis T42902, assigned CYP79B2. The high identity and similarity to sorghum CYP79A1, which catalyses the conversion of tyrosine to p-hydroxyphenylacetaldoxime in the biosynthesis of the cyanogenic glucoside dhurrin, suggest that CYP79B1 similarly catalyses the conversion of amino acid(s) to aldoxime(s) in the biosynthesis of glucosinolates. Within the highly conserved PERF and heme-binding regions of A-type cytochromes, the CYP79 family has unique substitutions that define the family-specific consensus sequences of FXP(E/D)RH and SFSTG(K/R)RGC(A/I)A, respectively. Sequence analysis of PCR products generated with CYP79B subfamily-specific primers identified CYP79B homologues in Tropaeolum majus, Carica papaya, Arabidopsis, Brassica napus and S. alba. Comparison across these five glucosinolate-producing plants identified a CYP79B amino acid consensus sequence, KPERHLNECSEVTLTENDLRFISFSTGKRGC. The unique substitutions in the PERF and the heme-binding domain and the high sequence identity and similarity of CYP79B1, CYP79B2 and CYP79A1, together with the isolation of CYP79B homologues in the distantly related Tropaeolaceae, Caricaceae and Brassicaceae within the Capparales order, show that the initial part of the biosynthetic pathway of glucosinolates and cyanogenic glucosides is catalysed by evolutionarily conserved cytochromes P450. This confirms that the appearance of glucosinolates in Capparales is based on a cyanogen predisposition. Identification of CYP79 homologues in glucosinolate-producing plants provides an important tool for tissue-specific regulation of the level of glucosinolates to improve nutritional value and pest resistance.
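The two family-specific consensus sequences quoted in the abstract above translate directly into regular expressions, which gives a simple way to scan a protein sequence for candidate CYP79 motifs. The sketch below is only an illustration: the function name is hypothetical, and the test sequence is a toy built by adding assumed flanking residues (a leading "MAF" and a trailing "AAQ") to the published CYP79B consensus, not a real protein.

```python
import re

# Consensus patterns from the abstract, with X read as "any residue"
# and (E/D)-style alternatives written as character classes.
CYP79_MOTIFS = {
    "PERF region": re.compile(r"F.P[ED]RH"),
    "heme-binding region": re.compile(r"SFSTG[KR]RGC[AI]A"),
}


def find_motifs(protein):
    """Return {motif name: [(start index, matched text), ...]} for one sequence."""
    protein = protein.upper()
    return {
        name: [(m.start(), m.group()) for m in pattern.finditer(protein)]
        for name, pattern in CYP79_MOTIFS.items()
    }


# Toy sequence: published consensus plus hypothetical flanking residues.
toy = "MAF" + "KPERHLNECSEVTLTENDLRFISFSTGKRGC" + "AAQ"
for name, hits in find_motifs(toy).items():
    print(name, hits)
```

Character classes keep each pattern close to the notation used in the abstract, so a motif can be adjusted by editing a single string.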
Article
Sequence comparison of orthologous regions enables estimation of the divergence between genomes, analysis of their evolution and detection of particular features of the genomes, such as sequence rearrangements and transposable elements. Despite the economic importance of Coffea species, little genomic information is currently available. Coffea is a relatively young genus that includes more than one hundred diploid species and a single tetraploid species. Three Coffea orthologous regions of 470-900 kb were analyzed and compared: both subgenomes of allotetraploid Coffea arabica (contributed by the diploid species Coffea eugenioides and Coffea canephora) and the genome of diploid C. canephora. Sequence divergence was calculated on global alignments or on coding and non-coding sequences separately. A search for transposable elements detected 43 retrotransposons and 198 transposons in the sequences analyzed. Comparative insertion analysis made it possible to locate 165 TE insertions in the phylogenetic tree of the three genomes/subgenomes. In the tetraploid C. arabica, a homoeologous non-reciprocal transposition (HNRT) was detected and characterized: a 50 kb region of the C. eugenioides-derived subgenome replaced the C. canephora-derived counterpart. Comparative sequence analysis on three Coffea genomes/subgenomes revealed almost perfect gene synteny, low sequence divergence and a high number of shared transposable elements. Compared to the results of similar analyses in other genera (Aegilops/Triticum and Oryza), Coffea genomes/subgenomes appeared to be dramatically less diverged, which is consistent with the relatively recent radiation of the Coffea genus. Based on nucleotide substitution frequency, the HNRT was dated at 10,000-50,000 years BP, which is also the most recent estimate of the origin of C. arabica.
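One simple way to express sequence divergence on an alignment, as mentioned in the abstract above, is the proportion of aligned positions that differ (the p-distance). The abstract does not state which measure the authors used, so the sketch below is an assumption made purely for illustration; the function name and the example sequences are hypothetical.

```python
def p_distance(aligned_a, aligned_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # ignore gapped columns
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


# Hypothetical aligned fragments: 8 ungapped columns, 1 mismatch.
print(p_distance("ACGT-ACGT", "ACGTTACGA"))
```

Running the same function on coding and non-coding segments of an alignment separately would mirror the split described in the abstract.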
Article
For almost a decade, our knowledge on the organisation of the family 1 UDP-glycosyltransferases (UGTs) has been limited to the model plant A. thaliana. The availability of other plant genomes represents an opportunity to obtain a broader view of the family in terms of evolution and organisation. Family 1 UGTs are known to glycosylate several classes of plant secondary metabolites. A phylogeny reconstruction study was performed to get an insight into the evolution of this multigene family during the adaptation of plants to life on land. The organisation of the UGTs in the different organisms was also investigated. More than 1500 putative UGTs were identified in 12 fully sequenced and assembled plant genomes based on the highly conserved PSPG motif. Analyses by maximum likelihood (ML) method were performed to reconstruct the phylogenetic relationships existing between the sequences. The results of this study clearly show that the UGT family expanded during the transition from algae to vascular plants and that in higher plants the clustering of UGTs into phylogenetic groups appears to be conserved, although gene loss and gene gain events seem to have occurred in certain lineages. Interestingly, two new phylogenetic groups, named O and P, that are not present in A. thaliana were discovered.