eggNOG: automated construction and annotation of orthologous groups of genes

European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany.
Nucleic Acids Research (Impact Factor: 8.81). 02/2008; 36(Database issue):D250-4. DOI: 10.1093/nar/gkm796
Source: PubMed

ABSTRACT The identification of orthologous genes forms the basis for most comparative genomics studies. Existing approaches either lack functional annotation of the identified orthologous groups, hampering the interpretation of subsequent results, or are manually annotated and thus lag behind the rapid sequencing of new genomes. Here we present the eggNOG database ('evolutionary genealogy of genes: Non-supervised Orthologous Groups'), which contains orthologous groups constructed from Smith-Waterman alignments through identification of reciprocal best matches and triangular linkage clustering. Applying this procedure to 312 bacterial, 26 archaeal and 35 eukaryotic genomes yielded 43 582 coarse-grained orthologous groups of which 9724 are extended versions of those from the original COG/KOG database. We also constructed more fine-grained groups for selected subsets of organisms, such as the 19 914 mammalian orthologous groups. We automatically annotated our non-supervised orthologous groups with functional descriptions, which were derived by identifying common denominators for the genes based on their individual textual descriptions, annotated functional categories, and predicted protein domains. The orthologous groups in eggNOG contain 1 241 751 genes and provide at least a broad functional description for 77% of them. Users can query the resource for individual genes via a web interface or download the complete set of orthologous groups at
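
The two clustering steps named in the abstract, reciprocal best matches followed by triangular linkage, can be illustrated with a small sketch. This is not the eggNOG pipeline itself: the abstract does not give implementation details, so the function names (`reciprocal_best_hits`, `triangle_clusters`), the input format (a dictionary of directed alignment scores plus a gene-to-genome map) and the rule that triangles are merged when they share an edge are all assumptions made for illustration. Any further processing the real procedure may apply is left out.

```python
from collections import defaultdict
from itertools import combinations


def reciprocal_best_hits(scores, genome_of):
    """Return gene pairs that are each other's best hit across two genomes.

    scores    -- {(query, subject): score}, directed alignment scores (assumed input format)
    genome_of -- {gene: genome}, which genome each gene belongs to
    """
    best = {}  # (query, subject genome) -> (score, subject)
    for (query, subject), score in scores.items():
        if genome_of[query] == genome_of[subject]:
            continue  # only hits between different genomes are considered here
        key = (query, genome_of[subject])
        if key not in best or score > best[key][0]:
            best[key] = (score, subject)

    pairs = set()
    for (query, _genome), (_score, subject) in best.items():
        back = best.get((subject, genome_of[query]))
        if back is not None and back[1] == query:
            pairs.add(frozenset((query, subject)))
    return pairs


def triangle_clusters(pairs):
    """Group genes by merging reciprocal-best-hit triangles that share an edge."""
    neighbours = defaultdict(set)
    for pair in pairs:
        a, b = tuple(pair)
        neighbours[a].add(b)
        neighbours[b].add(a)

    # A triangle is three genes that are pairwise reciprocal best hits.
    triangles = set()
    for pair in pairs:
        a, b = tuple(pair)
        for c in neighbours[a] & neighbours[b]:
            triangles.add(frozenset((a, b, c)))
    triangles = list(triangles)

    # Union-find over triangles (not genes), so that only a shared edge merges them.
    parent = list(range(len(triangles)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_with_edge = {}
    for i, triangle in enumerate(triangles):
        for edge in combinations(sorted(triangle), 2):
            j = first_with_edge.setdefault(edge, i)
            parent[find(i)] = find(j)

    groups = defaultdict(set)
    for i, triangle in enumerate(triangles):
        groups[find(i)] |= triangle
    return sorted(sorted(group) for group in groups.values())


# Toy data: one gene each in genomes A-D, plus an extra A/B pair with no third partner.
genome_of = {"A1": "A", "A2": "A", "B1": "B", "B2": "B", "C1": "C", "D1": "D"}
symmetric = {("A1", "B1"): 100, ("A1", "C1"): 90, ("B1", "C1"): 95,
             ("B1", "D1"): 80, ("C1", "D1"): 85, ("A2", "B2"): 70, ("A2", "B1"): 20}
scores = {}
for (a, b), s in symmetric.items():
    scores[(a, b)] = s
    scores[(b, a)] = s

print(triangle_clusters(reciprocal_best_hits(scores, genome_of)))
# [['A1', 'B1', 'C1', 'D1']]
```

In the toy data, the triangles A1-B1-C1 and B1-C1-D1 share the edge B1-C1 and are merged into one group, while the pair A2-B2 is a reciprocal best match that never closes a triangle and is therefore left unclustered in this sketch.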


Available from: Jean Muller, Jun 03, 2015
  • Source
    ABSTRACT: The only animal cells known that can maintain functional plastids (kleptoplasts) in their cytosol occur in the digestive gland epithelia of sacoglossan slugs. Only a few species of the many hundred known can profit from kleptoplasty during starvation long-term, but why is not understood. The two sister taxa Elysia cornigera and Elysia timida sequester plastids from the same algal species, but with a very different outcome: while E. cornigera usually dies within the first two weeks when deprived of food, E. timida can survive for many months to come. Here we compare the responses of the two slugs to starvation, blocked photosynthesis and light-stress. The two species respond differently, but in both starvation is the main denominator that alters global gene-expression profiles. The kleptoplasts’ ability to fix CO2 decreases at a similar rate in both slugs during starvation, but only E. cornigera individuals die in the presence of functional kleptoplasts, concomitant with the accumulation of reactive oxygen species (ROS) in the digestive tract. We show that profiting from the acquisition of robust plastids, and key to E. timida's longer survival, is determined by an increased starvation tolerance that keeps ROS levels at bay.
    Proceedings of the Royal Society B: Biological Sciences 01/2015; DOI:10.1098/rspb.2014.2519 · 5.29 Impact Factor
  • Source
    ABSTRACT: Gene content differences in human gut microbes can lead to inter-individual phenotypic variations such as digestive capacity. It is unclear whether gene content variation is caused by differences in microbial species composition or by the presence of different strains of the same species; the extent of gene content variation in the latter is unknown. Unlike pan-genome studies of cultivable strains, the use of metagenomic data can provide an unbiased view of structural variation of gut bacterial strains by measuring them in their natural habitats, the gut of each individual in this case, representing native boundaries between gut bacterial populations. We analyzed publicly available metagenomics data from fecal samples to characterize inter-individual variation in gut bacterial species. A comparison of 11 abundant gut bacterial species showed that the gene content of strains from the same species differed, on average, by 13% between individuals. This number is based on gene deletions only and represents a lower limit, yet the variation is already in a similar range to that observed between completely sequenced strains of cultivable species. We show that accessory genes that differ considerably between individuals can encode important functions, such as polysaccharide utilization and capsular polysaccharide synthesis loci. Metagenomics can yield insights into gene content variation of strains in complex communities, which cannot be predicted by phylogenetic marker genes alone. The large degree of inter-individual variability in gene content implies that strain resolution must be considered in order to fully assess the functional potential of an individual's human gut microbiome.
    Genome Biology 04/2015; 16(1):82. DOI:10.1186/s13059-015-0646-9 · 10.47 Impact Factor
  • Source
    ABSTRACT: Celery of the family Apiaceae is a biennial herb that is cultivated and consumed worldwide. Lignin is essential for cell wall structural integrity, stem strength, water transport, mechanical support, and plant pathogen defense. This study discussed the mechanism of lignin formation at different stages of celery development. The transcriptome profile, lignin distribution, anatomical characteristics, and expression profile of leaves at three stages were analyzed. Regulating lignin synthesis in celery growth development has a significant economic value. Celery leaves at three stages were collected, and Illumina paired-end sequencing technology was used to analyze large-scale transcriptome sequences. From Stage 1 to 3, the collenchyma and vascular bundles in the petioles and leaf blades thickened and expanded, whereas the phloem and the xylem extensively developed. Spongy and palisade mesophyll tissues further developed and were tightly arranged. Lignin accumulation increased in the petioles and the mesophyll (palisade and spongy), and the xylem showed strong lignification. Lignin accumulation in different tissues and at different stages of celery development coincides with the anatomic characteristics and transcript levels of genes involved in lignin biosynthesis. Identifying the genes that encode lignin biosynthesis-related enzymes accompanied by lignin distribution may help elucidate the regulatory mechanisms of lignin biosynthesis in celery.
    Scientific Reports 02/2015; 5:8259. DOI:10.1038/srep08259 · 5.08 Impact Factor