Annabel E Todd

University College London, London, ENG, United Kingdom

Publications (10) · 41.67 Total impact

  • Article: Comprehensive reanalysis of transcription factor knockout expression data in Saccharomyces cerevisiae reveals many new targets.
    ABSTRACT: Transcription factor (TF) perturbation experiments give valuable insights into gene regulation. Genome-scale evidence from microarray measurements may be used to identify regulatory interactions between TFs and targets. Recently, Hu and colleagues published a comprehensive study covering 269 TF knockout mutants for the yeast Saccharomyces cerevisiae. However, the information that can be extracted from this valuable dataset is limited by the method employed to process the microarray data. Here, we present a reanalysis of the original data using improved statistical techniques freely available from the BioConductor project. We identify over 100,000 differentially expressed genes, nine times the total reported by Hu et al. We validate the biological significance of these genes by assessing their functions, the occurrence of upstream TF-binding sites, and the prevalence of protein-protein interactions. The reanalysed dataset outperforms the original across all measures, indicating that we have uncovered a vastly expanded list of relevant targets. In summary, this work presents a high-quality reanalysis that maximizes the information contained in the Hu et al. compendium. The dataset is available from ArrayExpress (accession: E-MTAB-109) and it will be invaluable to any scientist interested in the yeast transcriptional regulatory system.
    Nucleic Acids Research 04/2010; 38(14):4768-77. · 8.03 Impact Factor
  • Article: Progress of structural genomics initiatives: an analysis of solved target structures.
    ABSTRACT: The explosion in gene sequence data and technological breakthroughs in protein structure determination inspired the launch of structural genomics (SG) initiatives. An often stated goal of structural genomics is the high-throughput structural characterisation of all protein sequence families, with the long-term hope of significantly impacting on the life sciences, biotechnology and drug discovery. Here, we present a comprehensive analysis of solved SG targets to assess progress of these initiatives. Eleven consortia have contributed 316 non-redundant entries and 323 protein chains to the Protein Data Bank (PDB), and 459 and 393 domains to the CATH and SCOP structure classifications, respectively. The quality and size of these proteins are comparable to those solved in traditional structural biology and, despite huge scope for duplicated efforts, only 14% of targets have a close homologue (≥30% sequence identity) solved by another consortium. Analysis of CATH and SCOP revealed the significant contribution that structural genomics is making to the coverage of superfamilies and folds. A total of 67% of SG domains in CATH are unique, lacking an already characterised close homologue in the PDB, whereas only 21% of non-SG domains are unique. For 29% of domains, structure determination revealed a remote evolutionary relationship not apparent from sequence, and 19% and 11% contributed new superfamilies and folds. The secondary structure class, fold and superfamily distributions of this dataset reflect those of the genomes. The domains fall into 172 different folds and 259 superfamilies in CATH but the distribution is highly skewed. The most populous of these are those that recur most frequently in the genomes. Whilst 11% of superfamilies are bacteria-specific, most are common to all three superkingdoms of life and together the 316 PDB entries have provided new and reliable homology models for 9287 non-redundant gene sequences in 206 completely sequenced genomes. From the perspective of this analysis, it appears that structural genomics is on track to be a success, and it is hoped that this work will inform future directions of the field.
    Journal of Molecular Biology 06/2005; 348(5):1235-60. · 4.00 Impact Factor
  • Article: The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis.
    Nucleic Acids Research 01/2005; 33:247-251.
  • Article: Inferring protein function from structure.
    Methods of biochemical analysis 02/2003; 44:387-407.
  • Article: Sequence and structural differences between enzyme and nonenzyme homologs.
    ABSTRACT: To improve our understanding of the evolution of novel functions, we performed a sequence, structural, and functional analysis of homologous enzymes and nonenzymes of known three-dimensional structure. In most examples identified, the nonenzyme is derived from an ancestral catalytic precursor (as opposed to the reverse evolutionary scenario, nonenzyme to enzyme), and the active site pocket has been disrupted in some way, owing to the substitution of critical catalytic residues and/or steric interactions that impede substrate binding and catalysis. Pairwise sequence identity is typically insignificant, and almost one-half of the enzyme and nonenzyme pairs do not share any similarity in function. Heterooligomeric enzymes comprising homologous subunits in which one chain is catalytically inactive and enzyme polypeptides that contain internal catalytic and noncatalytic duplications of an ancient enzyme domain are also discussed.
    Structure 11/2002; 10(10):1435-51. · 6.35 Impact Factor
  • Article: Plasticity of enzyme active sites.
    ABSTRACT: The expectation is that any similarity in reaction chemistry shared by enzyme homologues is mediated by common functional groups conserved through evolution. However, detailed enzyme studies have revealed the flexibility of many active sites, in that different functional groups, unconserved with respect to position in the primary sequence, mediate the same mechanistic role. Nevertheless, the catalytic atoms might be spatially equivalent. More rarely, the active sites have completely different locations in the protein scaffold. This variability could result from: (1) the hopping of functional groups from one position to another to optimize catalysis; (2) the independent specialization of a low-activity primordial enzyme in different phylogenetic lineages; (3) functional convergence after evolutionary divergence; or (4) circular permutation events.
    Trends in Biochemical Sciences 09/2002; 27(8):419-26. · 10.85 Impact Factor
  • Article: The CATH protein family database: a resource for structural and functional annotation of genomes.
    ABSTRACT: Over the last decade, there have been huge increases in the numbers of protein sequences and structures determined. In parallel, many methods have been developed for recognising similarities between these proteins, arising from their common evolutionary background, and for clustering such relatives into protein families. Here we review some of the protein family resources available to the biologist and describe how these can be used to provide structural and functional annotations for newly determined sequences. In particular we describe recent developments to the CATH domain database of protein structural families which have facilitated genome annotation and which have also revealed important caveats that must be considered when transferring functional data between homologous proteins.
    Proteomics 02/2002; 2(1):11-21. · 4.51 Impact Factor
  • Article: The CATH protein family database: A resource for structural and functional annotation of genomes
    Proteomics 01/2002; 2(1):11-21. · 4.43 Impact Factor
  • Article: The CATH Database provides insights into protein structure/function relationships.
    Nucleic Acids Research 01/1999; 27:275-279.
  • Article: Target selection and determination of function in structural genomics.
    ABSTRACT: The first crucial step in any structural genomics project is the selection and prioritization of target proteins for structure determination. There may be a number of selection criteria to be satisfied, including that the proteins have novel folds, that they be representatives of large families for which no structure is known, and so on. The better the selection at this stage, the greater is the value of the structures obtained at the end of the experimental process. This value can be further enhanced once the protein structures have been solved if the functions of the given proteins can also be determined. Here we describe the methods used at either end of the experimental process: firstly, sensitive sequence comparison techniques for selecting a high-quality list of target proteins, and secondly the various computational methods that can be applied to the eventual 3D structures to determine the most likely biochemical function of the proteins in question.
    International Union of Biochemistry and Molecular Biology Life 55(4-5):249-55. · 3.51 Impact Factor