Publication History

  • ABSTRACT: Membrane proteins have become a major focus in structure prediction, due to their medical importance. There is, however, a lack of fast and reliable methods that specialise in the modelling of membrane protein loops. Often methods designed for soluble proteins are applied directly to membrane proteins. In this paper we investigate the validity of such an approach in the realm of fragment-based methods. We also examine the differences in membrane and soluble protein loops that might affect accuracy. We test our ability to predict soluble and membrane protein loops with the previously published method FREAD. We show that it is possible to predict accurately the structure of membrane protein loops using a database of membrane protein fragments (0.5-1Å median root mean square deviation). The presence of homologous proteins in the database helps prediction accuracy. However, even when homologues are removed, better results are still achieved using fragments of membrane proteins (0.8-1.6Å) rather than soluble proteins (1-4Å) to model membrane protein loops. We find that many fragments of soluble proteins have shapes similar to their membrane protein counterparts but have very different sequences; however, they do not appear to differ in their substitution patterns. Our findings may allow further improvements to fragment-based loop modelling algorithms for membrane proteins. The current version of our proof-of-concept loop modelling protocol produces high accuracy loop models for membrane proteins and is available as a web server at http://medeller.info/fread. © 2013 Wiley Periodicals, Inc.
    Proteins Structure Function and Bioinformatics 02/2014; 82(2). DOI:10.1002/prot.24299
  • ABSTRACT: The evolution of proteins is one of the fundamental processes that has delivered the diversity and complexity of life we see around ourselves today. While we tend to define protein evolution in terms of sequence level mutations, insertions and deletions, it is hard to translate these processes to a more complete picture incorporating a polypeptide's structure and function. By considering how protein structures change over time we can gain an entirely new appreciation of their long-term evolutionary dynamics. In this work we seek to identify how populations of proteins at different stages of evolution explore their possible structure space. We apply an annotation of superfamily age to this space and explore the relationship between these ages and a diverse set of properties pertaining to a superfamily's sequence, structure and function. We note several marked differences between the populations of newly evolved and ancient structures, such as in their length distributions, secondary structure content and tertiary packing arrangements. In particular, many of these differences suggest a less elaborate structure for newly evolved superfamilies when compared with their ancient counterparts. We show that the structural preferences we report are not a residual effect of a more fundamental relationship with function. Furthermore, we demonstrate the robustness of our results, using significant variation in the algorithm used to estimate the ages. We present these age estimates as a useful tool to analyse protein populations. In particular, we apply this in a comparison of domains containing Greek key or jelly roll motifs.
    PLoS Computational Biology 11/2013; 9(11):e1003325. DOI:10.1371/journal.pcbi.1003325
  • ABSTRACT: High-throughput sequencing technologies produce short sequence reads that can contain phase information if they span two or more heterozygote genotypes. This information is not routinely used by current methods that infer haplotypes from genotype data. We have extended the SHAPEIT2 method to use phase-informative sequencing reads to improve phasing accuracy. Our model incorporates the read information in a probabilistic model through base quality scores within each read. The method is primarily designed for high-coverage sequence data or data sets that already have genotypes called. One important application is phasing of single samples sequenced at high coverage for use in medical sequencing and studies of rare diseases. Our method can also use existing panels of reference haplotypes. We tested the method by using a mother-father-child trio sequenced at high coverage by Illumina together with the low-coverage sequence data from the 1000 Genomes Project (1000GP). We found that use of phase-informative reads increases the mean distance between switch errors by 22% from 274.4 kb to 328.6 kb. We also used male chromosome X haplotypes from the 1000GP samples to simulate sequencing reads with varying insert size, read length, and base error rate. When using short 100 bp paired-end reads, we found that using mixtures of insert sizes produced the best results. When using longer reads with high error rates (5-20 kb reads with 4%-15% error per base), phasing performance was substantially improved.
    The American Journal of Human Genetics 10/2013; 93(4):687-696. DOI:10.1016/j.ajhg.2013.09.002
  • ABSTRACT: Antibodies are a class of proteins indispensable for the vertebrate immune system. The general architecture of all antibodies is very similar, but they contain a hypervariable region which allows millions of antibody variants to exist, each of which can bind to different molecules. This binding malleability means that antibodies are an increasingly important category of biopharmaceuticals and biomarkers. We present Antibody i-Patch, a method that annotates the most likely antibody residues to be in contact with the antigen. We show that our predictions correlate with energetic importance and thus we argue that they may be useful in guiding mutations in the artificial affinity maturation process. Using our predictions as constraints for a rigid-body docking algorithm, we are able to obtain high-quality results in minutes. Our annotation method and re-scoring system for docking achieve their predictive power by using antibody-specific statistics. Antibody i-Patch is available from http://www.stats.ox.ac.uk/research/proteins/resources.
    Protein Engineering Design and Selection 09/2013; DOI:10.1093/protein/gzt043
  • ABSTRACT: Gene expression in multiple individual cells from a tissue or culture sample varies according to cell-cycle, genetic, epigenetic and stochastic differences between the cells. However, single-cell differences have been largely neglected in the analysis of the functional consequences of genetic variation. Here we measure the expression of 92 genes affected by Wnt signaling in 1,440 single cells from 15 individuals to associate single-nucleotide polymorphisms (SNPs) with gene-expression phenotypes, while accounting for stochastic and cell-cycle differences between cells. We provide evidence that many heritable variations in gene function (such as burst size, burst frequency, cell cycle-specific expression and expression correlation/noise between cells) are masked when expression is averaged over many cells. Our results demonstrate how single-cell analyses provide insights into the mechanistic and network effects of genetic variability, with improved statistical power to model these effects on gene expression.
    Nature Biotechnology 07/2013; DOI:10.1038/nbt.2642
  • ABSTRACT: The binding site of an antibody is formed between the two variable domains, VH and VL, of its antigen binding fragment (Fab). Understanding how VH and VL orientate with respect to one another is important both for studying the mechanisms of antigen specificity and affinity and improving antibody modelling, docking and engineering. Different VH-VL orientations are commonly described using relative measures such as root-mean-square deviation. Recently, the orientation has also been characterised using the absolute measure of a VH-VL packing angle. However, a single angle cannot fully describe all modes of orientation. Here, we present a method which fully characterises VH-VL orientation in a consistent and absolute sense using five angles (HL, HC1, LC1, HC2 and LC2) and a distance (dc). Additionally, we provide a computational tool, ABangle, to allow the VH-VL orientation for any antibody to be automatically calculated and compared with all other known structures. We compare previous studies and show how the modes of orientation being identified relate to movements of different angles. Thus, we are able to explain why different studies identify different structural clusters and different residues as important. Given this result, we then identify those positions and their residue identities which influence each of the angular measures of orientation. Finally, by analysing VH-VL orientation in bound and unbound forms, we find that antibodies specific for protein antigens are significantly more flexible in their unbound form than antibodies specific for hapten antigens. ABangle is freely available at http://opig.stats.ox.ac.uk/webapps/abangle.
    Protein Engineering Design and Selection 05/2013; DOI:10.1093/protein/gzt020
  • ABSTRACT: Membrane proteins are estimated to be the targets of 50% of drugs that are currently in development, yet we have few membrane protein crystal structures. As a result, for a membrane protein of interest, the much-needed structural information usually comes from a homology model. Current homology modelling software is optimized for globular proteins, and ignores the constraints that the membrane is known to place on protein structure. Our Memoir server produces homology models using alignment and coordinate generation software that has been designed specifically for transmembrane proteins. Memoir is easy to use, with the only inputs being a structural template and the sequence that is to be modelled. We provide a video tutorial and a guide to assessing model quality. Supporting data aid manual refinement of the models. These data include a set of alternative conformations for each modelled loop, and a multiple sequence alignment that incorporates the query and template. Memoir works with both α-helical and β-barrel types of membrane proteins and is freely available at http://opig.stats.ox.ac.uk/webapps/memoir.
    Nucleic Acids Research 05/2013; 41(Web Server issue). DOI:10.1093/nar/gkt331
  • ABSTRACT: Staphylococcus aureus is a major cause of healthcare-associated mortality, but like many important bacterial pathogens, it is a common constituent of the normal human body flora. Around a third of healthy adults are carriers. Recent evidence suggests that evolution of S. aureus during nasal carriage may be associated with progression to invasive disease. However, a more detailed understanding of within-host evolution under natural conditions is required to appreciate the evolutionary and mechanistic reasons why commensal bacteria such as S. aureus cause disease. Therefore, we examined in detail the evolutionary dynamics of normal, asymptomatic carriage. Sequencing a total of 131 genomes across 13 singly colonized hosts using the Illumina platform, we investigated diversity, selection, population dynamics and transmission during the short-term evolution of S. aureus. We characterized the processes by which the raw material for evolution is generated: micro-mutation (point mutation and small insertions/deletions), macro-mutation (large insertions/deletions) and the loss or acquisition of mobile elements (plasmids and bacteriophages). Through an analysis of synonymous, non-synonymous and intergenic mutations we discovered a fitness landscape dominated by purifying selection, with rare examples of adaptive change in genes encoding surface-anchored proteins and an enterotoxin. We found evidence for dramatic, hundred-fold fluctuations in the size of the within-host population over time, which we related to the cycle of colonization and clearance. Using a newly-developed population genetics approach to detect recent transmission among hosts, we revealed evidence for recent transmission between some of our subjects, including a husband and wife both carrying populations of methicillin-resistant S. aureus (MRSA). This investigation begins to paint a picture of the within-host evolution of an important bacterial pathogen during its prevailing natural state, asymptomatic carriage. These results also have wider significance as a benchmark for future systematic studies of evolution during invasive S. aureus disease.
    PLoS ONE 05/2013; 8(5):e61319. DOI:10.1371/journal.pone.0061319
  • ABSTRACT: MOTIVATION: Many computational methods for RNA secondary structure prediction, and, in particular, for the prediction of a consensus structure of an alignment of RNA sequences, have been developed. Most methods, however, ignore biophysical factors such as the kinetics of RNA folding; no current implementation considers both evolutionary information and folding kinetics, thus losing information which, when considered, might lead to better predictions. RESULTS: We present an iterative algorithm, Oxfold, in the framework of stochastic context-free grammars, that emulates the kinetics of RNA folding in a simplified way, in combination with a molecular evolution model. This method improves considerably upon existing grammatical models that do not consider folding kinetics. Additionally, the model compares favourably to non-kinetic thermodynamic models. AVAILABILITY: http://www.stats.ox.ac.uk/~anderson CONTACT: anderson@stats.ox.ac.uk.
    Bioinformatics 02/2013; 29(6). DOI:10.1093/bioinformatics/btt050
  • ABSTRACT: Author Summary: The human leukocyte antigen (HLA) proteins influence how pathogens and components of body cells are presented to immune cells. It has long been known that they are highly variable and that this variation is associated with differential risk for autoimmune and infectious diseases. Variant frequencies differ substantially between and even within continents. Determining HLA genotypes is thus an important part of many studies to understand the genetic basis of disease risk. However, conventional methods for HLA typing (e.g. targeted sequencing, hybridisation, amplification) are typically laborious and expensive. We have developed a method for inferring an individual's HLA genotype based on evaluating genetic information from nearby variable sites that are more easily assayed, which aims to integrate heterogeneous data. We introduce two key innovations: we allow for single HLA types to appear on heterogeneous backgrounds of genetic information and we take into account the possibility of genotyping error, which is common within the HLA region. We show that the method is well-suited to deal with multi-population datasets: it enables integrated HLA type inference for individuals of differing ancestry and ethnicity. It will therefore prove useful particularly in international collaborations to better understand disease risks, where samples are drawn from multiple countries.
    PLoS Computational Biology 02/2013; 9(2):e1002877. DOI:10.1371/journal.pcbi.1002877
