Avril Coghlan

Beijing Genomics Institute, Shenzhen, Guangdong Sheng, China

Publications (12) · 115.42 Total impact

  • Article: The genome of the blood fluke Schistosoma mansoni.
    ABSTRACT: Schistosoma mansoni is responsible for the neglected tropical disease schistosomiasis that affects 210 million people in 76 countries. Here we present analysis of the 363 megabase nuclear genome of the blood fluke. It encodes at least 11,809 genes, with an unusual intron size distribution, and new families of micro-exon genes that undergo frequent alternative splicing. As the first sequenced flatworm, and a representative of the Lophotrochozoa, it offers insights into early events in the evolution of the animals, including the development of a body pattern with bilateral symmetry, and the development of tissues into organs. Our analysis has been informed by the need to find new drug targets. The deficits in lipid metabolism that make schistosomes dependent on the host are revealed, and the identification of membrane receptors, ion channels and more than 300 proteases provide new insights into the biology of the life cycle and new targets. Bioinformatics approaches have identified metabolic chokepoints, and a chemogenomic screen has pinpointed schistosome proteins for which existing drugs may be active. The information generated provides an invaluable resource for the research community to develop much needed new control tools for the treatment and eradication of this important and neglected disease.
    Nature 08/2009; 460(7253):352-8. · 36.28 Impact Factor
  • Article: nGASP--the nematode genome annotation assessment project.
    ABSTRACT: While the C. elegans genome is extensively annotated, relatively little information is available for other Caenorhabditis species. The nematode genome annotation assessment project (nGASP) was launched to objectively assess the accuracy of protein-coding gene prediction software in C. elegans, and to apply this knowledge to the annotation of the genomes of four additional Caenorhabditis species and other nematodes. Seventeen groups worldwide participated in nGASP, and submitted 47 prediction sets across 10 Mb of the C. elegans genome. Predictions were compared to reference gene sets consisting of confirmed or manually curated gene models from WormBase. The most accurate gene-finders were 'combiner' algorithms, which made use of transcript- and protein-alignments and multi-genome alignments, as well as gene predictions from other gene-finders. Gene-finders that used alignments of ESTs, mRNAs and proteins came in second. There was a tie for third place between gene-finders that used multi-genome alignments and ab initio gene-finders. The median gene level sensitivity of combiners was 78% and their specificity was 42%, which is nearly the same accuracy reported for combiners in the human genome. C. elegans genes with exons of unusual hexamer content, as well as those with unusually many exons, short exons, long introns, a weak translation start signal, weak splice sites, or poorly conserved orthologs posed the greatest difficulty for gene-finders. This experiment establishes a baseline of gene prediction accuracy in Caenorhabditis genomes, and has guided the choice of gene-finders for the annotation of newly sequenced genomes of Caenorhabditis and other nematode species. We have created new gene sets for C. briggsae, C. remanei, C. brenneri, C. japonica, and Brugia malayi using some of the best-performing gene-finders.
    BMC Bioinformatics 01/2009; 9:549. · 2.75 Impact Factor
  • Article: TreeFam: 2008 Update.
    ABSTRACT: TreeFam (http://www.treefam.org) was developed to provide curated phylogenetic trees for all animal gene families, as well as orthologue and paralogue assignments. Release 4.0 of TreeFam contains curated trees for 1314 families and automatically generated trees for another 14,351 families. We have expanded TreeFam to include 25 fully sequenced animal genomes, as well as four genomes from plant and fungal outgroup species. We have also introduced more accurate approaches for automatically grouping genes into families, for building phylogenetic trees, and for inferring orthologues and paralogues. The user interface for viewing phylogenetic trees and family information has been improved. Furthermore, a new Perl API lets users easily extract data from the TreeFam MySQL database.
    Nucleic Acids Research 02/2008; 36(Database issue):D735-40. · 8.03 Impact Factor
  • Article: Genomix: a method for combining gene-finders' predictions, which uses evolutionary conservation of sequence and intron-exon structure.
    Avril Coghlan, Richard Durbin
    ABSTRACT: Correct gene predictions are crucial for most analyses of genomes. However, in the absence of transcript data, gene prediction is still challenging. One way to improve gene-finding accuracy in such genomes is to combine the exons predicted by several gene-finders, so that gene-finders that make uncorrelated errors can correct each other. We present a method for combining gene-finders called Genomix. Genomix selects the predicted exons that are best conserved within and/or between species in terms of sequence and intron-exon structure, and combines them into a gene structure. Genomix was used to combine predictions from four gene-finders for Caenorhabditis elegans, by selecting the predicted exons that are best conserved with C. briggsae and C. remanei. On a set of approximately 1500 confirmed C. elegans genes, Genomix increased the exon-level specificity by 10.1% and sensitivity by 2.7% compared to the best input gene-finder. Scripts and Supplementary Material can be found at http://www.sanger.ac.uk/Software/analysis/genomix
    Bioinformatics 07/2007; 23(12):1468-75. · 5.47 Impact Factor
  • Article: Caenorhabditis evolution: if they all look alike, you aren't looking hard enough.
    ABSTRACT: Caenorhabditis elegans is widely known as a model organism for cell, molecular, developmental and neural biology, but it is also being used for evolutionary studies. A recent meeting of researchers in Portugal covered topics as diverse as phylogenetics, genetic mapping of quantitative and qualitative intraspecific variation, evolutionary developmental biology and population genetics. Here, we summarize the main findings of the meeting, which marks the formal birth of a research community dedicated to Caenorhabditis species evolution.
    Trends in Genetics 04/2007; 23(3):101-4. · 10.06 Impact Factor
  • Article: Comparative genomics in C. elegans, C. briggsae, and other Caenorhabditis species.
    Avril Coghlan, Jason E Stajich, Todd W Harris
    ABSTRACT: The genome of the nematode Caenorhabditis elegans was the first animal genome sequenced. Subsequent sequencing of the Caenorhabditis briggsae genome enabled a comparison of the genomes of two nematode species. In this chapter, we describe the methods that we used to compare the C. elegans genome to that of C. briggsae. We discuss how these methods could be developed to compare the C. elegans and C. briggsae genomes to those of Caenorhabditis remanei, C. n. sp. represented by strains PB2801 and CB5161, among others (1), and Caenorhabditis japonica, which are currently being sequenced.
    Methods in molecular biology (Clifton, N.J.) 02/2006; 351:13-29.
  • Article: TreeFam: a curated database of phylogenetic trees of animal gene families.
    ABSTRACT: TreeFam is a database of phylogenetic trees of gene families found in animals. It aims to develop a curated resource that presents the accurate evolutionary history of all animal gene families, as well as reliable ortholog and paralog assignments. Curated families are being added progressively, based on seed alignments and trees in a similar fashion to Pfam. Release 1.1 of TreeFam contains curated trees for 690 families and automatically generated trees for another 11 646 families. These represent over 128 000 genes from nine fully sequenced animal genomes and over 45 000 other animal proteins from UniProt; approximately 40-85% of proteins encoded in the fully sequenced animal genomes are included in TreeFam. TreeFam is freely available at http://www.treefam.org and http://treefam.genomics.org.cn.
    Nucleic Acids Research 02/2006; 34(Database issue):D572-80. · 8.03 Impact Factor
  • Article: Chromosome evolution in eukaryotes: a multi-kingdom perspective.
    ABSTRACT: In eukaryotes, chromosomal rearrangements, such as inversions, translocations and duplications, are common and range from part of a gene to hundreds of genes. Lineage-specific patterns are also seen: translocations are rare in dipteran flies, and angiosperm genomes seem prone to polyploidization. In most eukaryotes, there is a strong association between rearrangement breakpoints and repeat sequences. Current data suggest that some repeats promoted rearrangements via non-allelic homologous recombination, for others the association might not be causal but reflects the instability of particular genomic regions. Rearrangement polymorphisms in eukaryotes are correlated with phenotypic differences, so are thought to confer varying fitness in different habitats. Some seem to be under positive selection because they either trap favorable allele combinations together or alter the expression of nearby genes. There is little evidence that chromosomal rearrangements cause speciation, but they probably intensify reproductive isolation between species that have formed by another route.
    Trends in Genetics 01/2006; 21(12):673-82. · 10.06 Impact Factor
  • Article: Nematode genome evolution.
    Avril Coghlan
    ABSTRACT: Nematodes are the most abundant type of animal on earth, and live in hot springs, polar ice, soil, fresh and salt water, and as parasites of plants, vertebrates, insects, and other nematodes. This extraordinary ability to adapt, which hints at an underlying genetic plasticity, has long fascinated biologists. The fully sequenced genomes of Caenorhabditis elegans and Caenorhabditis briggsae, and ongoing sequencing projects for eight other nematodes, provide an exciting opportunity to investigate the genomic changes that have enabled nematodes to invade many different habitats. Analyses of the C. elegans and C. briggsae genomes suggest that these include major changes in gene content; as well as in chromosome number, structure and size. Here I discuss how the data set of ten genomes will be ideal for tackling questions about nematode evolution, as well as questions relevant to all eukaryotes.
    WormBook 02/2005;
  • Article: Origins of recently gained introns in Caenorhabditis.
    Avril Coghlan, Kenneth H Wolfe
    ABSTRACT: The genomes of the nematodes Caenorhabditis elegans and Caenorhabditis briggsae both contain approximately 100,000 introns, of which >6,000 are unique to one or the other species. To study the origins of new introns, we used a conservative method involving phylogenetic comparisons to animal orthologs and nematode paralogs to identify cases where an intron content difference between C. elegans and C. briggsae was caused by intron insertion rather than deletion. We identified 81 recently gained introns in C. elegans and 41 in C. briggsae. Novel introns have a stronger exon splice site consensus sequence than the general population of introns and show the same preference for phase 0 sites in codons over phases 1 and 2. More of the novel introns are inserted in genes that are expressed in the C. elegans germ line than expected by chance. Thirteen of the 122 gained introns are in genes whose protein products function in pre-mRNA processing, including three gains in the gene for spliceosomal protein SF3B1 and two in the nonsense-mediated decay gene smg-2. Twenty-eight novel introns have significant DNA sequence identity to other introns, including three that are similar to other introns in the same gene. All of these similarities involve minisatellites or palindromes in the intron sequences. Our results suggest that at least some of the intron gains were caused by reverse splicing of a preexisting intron.
    Proceedings of the National Academy of Sciences 09/2004; 101(31):11362-7. · 9.68 Impact Factor
  • Article: The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics.
    ABSTRACT: The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C. briggsae, we found strong evidence for 1,300 new C. elegans genes. In addition, comparisons of the two genomes will help to understand the evolutionary forces that mold nematode genomes.
    PLoS Biology 12/2003; 1(2):E45. · 11.45 Impact Factor
  • Article: Fourfold faster rate of genome rearrangement in nematodes than in Drosophila.
    Avril Coghlan, Kenneth H Wolfe
    ABSTRACT: We compared the genome of the nematode Caenorhabditis elegans to 13% of that of Caenorhabditis briggsae, identifying 252 conserved segments along their chromosomes. We detected 517 chromosomal rearrangements, with the ratio of translocations to inversions to transpositions being approximately 1:1:2. We estimate that the species diverged 50-120 million years ago, and that since then there have been 4030 rearrangements between their whole genomes. Our estimate of the rearrangement rate, 0.4-1.0 chromosomal breakages/Mb per Myr, is at least four times that of Drosophila, which was previously reported to be the fastest rate among eukaryotes. The breakpoints of translocations are strongly associated with dispersed repeats and gene family members in the C. elegans genome.
    Genome Research 07/2002; 12(6):857-67. · 13.61 Impact Factor