ABSTRACT: GeneDB (http://www.genedb.org) is a genome database for prokaryotic and eukaryotic pathogens and closely related organisms. The resource provides a portal to genome sequence and annotation data, which is primarily generated by the Pathogen Genomics group at the Wellcome Trust Sanger Institute. It combines data from completed and ongoing genome projects with curated annotation, which is readily accessible from a web-based resource. The development of the database in recent years has focused on providing database-driven annotation tools and pipelines, as well as catering for increasingly frequent assembly updates. The website has been significantly redesigned to take advantage of current web technologies and to improve usability. The current release stores 41 data sets, of which 17 are manually curated and maintained by biologists, who review and incorporate data from the scientific literature, as well as other sources. GeneDB is primarily a production and annotation database for the genomes of predominantly pathogenic organisms.
Nucleic Acids Research 11/2012; 40(Database issue):D98-108. · 8.28 Impact Factor
ABSTRACT: The pir genes comprise the largest multi-gene family in Plasmodium, with members found in P. vivax, P. knowlesi and the rodent malaria species. Despite comprising up to 5% of the genome, little is known about the functions of the proteins encoded by pir genes. P. chabaudi causes chronic infection in mice, which may be due to antigenic variation. In this model, pir genes are called cirs and may be involved in this mechanism, allowing evasion of host immune responses. In order to fully understand the role(s) of CIR proteins during P. chabaudi infection, a detailed characterization of the cir gene family was required.
The cir repertoire was annotated and a detailed bioinformatic characterization of the encoded CIR proteins was performed. Two major sub-families were identified, which have been named A and B. Members of each sub-family displayed different amino acid motifs, and were thus predicted to have undergone functional divergence. In addition, the expression of the entire cir repertoire was analyzed via RNA sequencing and microarray. Up to 40% of the cir gene repertoire was expressed in the parasite population during infection, and dominant cir transcripts could be identified. In addition, some differences were observed in the pattern of expression between the cir subgroups at the peak of P. chabaudi infection. Finally, specific cir genes were expressed at different time points during asexual blood stages.
In conclusion, the large number of cir genes and their expression throughout the intraerythrocytic cycle of development indicates that CIR proteins are likely to be important for parasite survival. In particular, the detection of dominant cir transcripts at the peak of P. chabaudi infection supports the idea that CIR proteins are expressed, and could perform important functions in the biology of this parasite. Further application of the methodologies described here may allow the elucidation of CIR sub-family A and B protein functions, including their contribution to antigenic variation and immune evasion.
ABSTRACT: Although eukaryotic protein kinases (ePKs) contribute to many cellular processes, only three Plasmodium falciparum ePKs have thus far been identified as essential for parasite asexual blood stage development. To identify pathways essential for parasite transmission between their mammalian host and mosquito vector, we undertook a systematic functional analysis of ePKs in the genetically tractable rodent parasite Plasmodium berghei. Modeling domain signatures of conventional ePKs identified 66 putative Plasmodium ePKs. Kinomes are highly conserved between Plasmodium species. Using reverse genetics, we show that 23 ePKs are redundant for asexual erythrocytic parasite development in mice. Phenotyping mutants at four life cycle stages in Anopheles stephensi mosquitoes revealed functional clusters of kinases required for sexual development and sporogony. Roles for a putative SR protein kinase (SRPK) in microgamete formation, a conserved regulator of clathrin uncoating (GAK) in ookinete formation, and a likely regulator of energy metabolism (SNF1/KIN) in sporozoite development were identified.
ABSTRACT: SUMMARY: BamView is an interactive Java application for visualizing the large amounts of data stored for sequence reads that are aligned against a reference genome sequence. It supports the BAM (Binary Alignment/Map) format. It can be used in a number of contexts, including SNP calling and structural annotation. BamView has also been integrated into Artemis so that the reads can be viewed in the context of the nucleotide sequence and genomic features. AVAILABILITY: BamView and Artemis are freely available (under a GPL licence) for download (for MacOSX, UNIX and Windows) at: http://bamview.sourceforge.net/
ABSTRACT: Recent advances in high-throughput sequencing present a new opportunity to deeply probe an organism's transcriptome. In this study, we used Illumina-based massively parallel sequencing to gain new insight into the transcriptome (RNA-Seq) of the human malaria parasite, Plasmodium falciparum. Using data collected at seven time points during the intraerythrocytic developmental cycle, we (i) detect novel gene transcripts; (ii) correct hundreds of gene models; (iii) propose alternative splicing events; and (iv) predict 5' and 3' untranslated regions. Approximately 70% of the unique sequencing reads map to previously annotated protein-coding genes. The RNA-Seq results greatly improve existing annotation of the P. falciparum genome with over 10% of gene models modified. Our data confirm 75% of predicted splice sites and identify 202 new splice sites, including 84 previously uncharacterized alternative splicing events. We also discovered 107 novel transcripts and expression of 38 pseudogenes, with many demonstrating differential expression across the developmental time series. Our RNA-Seq results correlate well with DNA microarray analysis performed in parallel on the same samples, and provide improved resolution over the microarray-based method. These data reveal new features of the P. falciparum transcriptional landscape and significantly advance our understanding of the parasite's red blood cell-stage transcriptome.
ABSTRACT: Efforts to annotate the genomes of a wide variety of model organisms are currently carried out by sequencing centers, model organism databases and academic/institutional laboratories around the world. Different annotation methods and tools have been developed over time to meet the needs of biologists faced with the task of annotating biological data. While standardized methods are essential for consistent curation within each annotation group, methods and tools can differ between groups, especially when the groups are curating different organisms. Biocurators from several institutes met at the Third International Biocuration Conference in Berlin, Germany, April 2009 and hosted the 'Best Practices in Genome Annotation: Inference from Evidence' workshop to share their strategies, pipelines, standards and tools. This article documents the material presented in the workshop.
Database The Journal of Biological Databases and Curation 01/2010; 2010:baq001. · 4.20 Impact Factor
ABSTRACT: Artemis and Artemis Comparison Tool (ACT) have become mainstream tools for viewing and annotating sequence data, particularly for microbial genomes. Since its first release, Artemis has been continuously developed and supported with additional functionality for editing and analysing sequences based on feedback from an active user community of laboratory biologists and professional annotators. Nevertheless, its utility has been somewhat restricted by its limitation to reading and writing from flat files. Therefore, a new version of Artemis has been developed, which reads from and writes to a relational database schema, and allows users to annotate more complex, often large and fragmented, genome sequences.
Artemis and ACT have now been extended to read and write directly to the Generic Model Organism Database (GMOD, http://www.gmod.org) Chado relational database schema. In addition, a Gene Builder tool has been developed to provide structured forms and tables to edit coordinates of gene models and edit functional annotation, based on standard ontologies, controlled vocabularies and free text.
Artemis and ACT are freely available (under a GPL licence) for download (for MacOSX, UNIX and Windows) at the Wellcome Trust Sanger Institute web sites: http://www.sanger.ac.uk/Software/Artemis/ http://www.sanger.ac.uk/Software/ACT/
ABSTRACT: Plasmodium knowlesi is an intracellular malaria parasite whose natural vertebrate host is Macaca fascicularis (the 'kra' monkey); however, it is now increasingly recognized as a significant cause of human malaria, particularly in southeast Asia. Plasmodium knowlesi was the first malaria parasite species in which antigenic variation was demonstrated, and it has a close phylogenetic relationship to Plasmodium vivax, the second most important species of human malaria parasite (reviewed in ref. 4). Despite their relatedness, there are important phenotypic differences between them, such as host blood cell preference, absence of a dormant liver stage or 'hypnozoite' in P. knowlesi, and length of the asexual cycle (reviewed in ref. 4). Here we present an analysis of the P. knowlesi (H strain, Pk1(A+) clone) nuclear genome sequence. This is the first monkey malaria parasite genome to be described, and it provides an opportunity for comparison with the recently completed P. vivax genome and other sequenced Plasmodium genomes. In contrast to other Plasmodium genomes, putative variant antigen families are dispersed throughout the genome and are associated with intrachromosomal telomere repeats. One of these families, the KIRs, contains sequences that collectively match over one-half of the host CD99 extracellular domain, which may represent an unusual form of molecular mimicry.
ABSTRACT: Toxoplasma gondii is a globally distributed protozoan parasite that can infect virtually all warm-blooded animals and humans. Despite the existence of a sexual phase in the life cycle, T. gondii has an unusual population structure dominated by three clonal lineages that predominate in North America and Europe (Types I, II, and III). These lineages were founded by common ancestors approximately 10,000 yr ago. The recent origin and widespread distribution of the clonal lineages is attributed to the circumvention of the sexual cycle by a new mode of transmission: asexual transmission between intermediate hosts. Asexual transmission appears to be multigenic and although the specific genes mediating this trait are unknown, it is predicted that all members of the clonal lineages should share the same alleles. Genetic mapping studies suggested that chromosome Ia was unusually monomorphic compared with the rest of the genome. To investigate this further, we sequenced chromosome Ia and chromosome Ib in the Type I strain, RH, and the Type II strain, ME49. Comparative genome analyses of the two chromosomal sequences revealed that the same copy of chromosome Ia was inherited in each lineage, whereas chromosome Ib maintained the same high frequency of between-strain polymorphism as the rest of the genome. Sampling of chromosome Ia sequence in seven additional representative strains from the three clonal lineages supports a monomorphic inheritance, which is unique within the genome. Taken together, our observations implicate a specific combination of alleles on chromosome Ia in the recent origin and widespread success of the clonal lineages of T. gondii.
Genome Research 10/2006; 16(9):1119-25. · 14.40 Impact Factor
ABSTRACT: African trypanosomes evade humoral immunity through antigenic variation, whereby they switch expression of the gene encoding their VSG (variant surface glycoprotein) coat. Switching proceeds by duplication of silent VSG genes into a transcriptionally active locus. The genome project has revealed that most of the silent archive consists of hundreds of subtelomeric VSG tandem arrays, and that most of these are not functional genes. Precedent suggests that they can contribute combinatorially to the formation of expressed, functional genes through segmental gene conversion. These findings from the genome project have major implications for evolution of the VSG archive and for transmission of the parasite in the field.
Biochemical Society Transactions 12/2005; 33(Pt 5):986-9. · 2.59 Impact Factor
ABSTRACT: African trypanosomes cause human sleeping sickness and livestock trypanosomiasis in sub-Saharan Africa. We present the sequence and analysis of the 11 megabase-sized chromosomes of Trypanosoma brucei. The 26-megabase genome contains 9068 predicted genes, including approximately 900 pseudogenes and approximately 1700 T. brucei-specific genes. Large subtelomeric arrays contain an archive of 806 variant surface glycoprotein (VSG) genes used by the parasite to evade the mammalian immune system. Most VSG genes are pseudogenes, which may be used to generate expressed mosaic genes by ectopic recombination. Comparisons of the cytoskeleton and endocytic trafficking systems with those of humans and other eukaryotic organisms reveal major differences. A comparison of metabolic pathways encoded by the genomes of T. brucei, T. cruzi, and Leishmania major reveals the least overall metabolic capability in T. brucei and the greatest in L. major. Horizontal transfer of genes of bacterial origin has contributed to some of the metabolic differences in these parasites, and a number of novel potential drug targets have been identified.
ABSTRACT: The oomycete Phytophthora infestans causes late blight, the potato disease that precipitated the Irish famines in 1846 and 1847. It represents a reemerging threat to potato production and is one of >70 species that are arguably the most devastating pathogens of dicotyledonous plants. Nevertheless, little is known about the molecular bases of pathogenicity in these algae-like organisms or of avirulence molecules that are perceived by host defenses. Disease resistance alleles, products of which recognize corresponding avirulence molecules in the pathogen, have been introgressed into the cultivated potato from a wild species, Solanum demissum, and R1 and R3a have been identified. We used association genetics to identify Avr3a and show that it encodes a protein that is recognized in the host cytoplasm, where it triggers R3a-dependent cell death. Avr3a resides in a region of the P. infestans genome that is colinear with the locus containing avirulence gene ATR1(NdWsB) in Hyaloperonospora parasitica, an oomycete pathogen of Arabidopsis. Remarkably, distances between conserved genes in these avirulence loci were often similar, despite intervening genomic variation. We suggest that Avr3a has undergone gene duplication and that an allele evading recognition by R3a arose under positive selection.
Proceedings of the National Academy of Sciences 06/2005; 102(21):7766-71. · 9.74 Impact Factor