ABSTRACT: MOTIVATION: GSATools is a free software package to analyse conformational ensembles and to detect functional motions in proteins by means of a Structural Alphabet. The software integrates with the widely used GROMACS simulation package and can generate a range of graphical outputs. Three applications can be supported: (i) investigation of the conformational variability of local structures; (ii) detection of allosteric communication; (iii) identification of local regions that are critical for global functional motions. These analyses provide insights into the dynamics of proteins and allow for targeted design of functional mutants in theoretical and experimental studies. AVAILABILITY: The C source code of the GSATools along with a set of pre-compiled binaries is freely available under GNU General Public License from http://mathbio.nimr.mrc.ac.uk/wiki/GSATools. CONTACT: email@example.com and firstname.lastname@example.org. SUPPLEMENTARY INFORMATION: A step-by-step tutorial is available in the Supplementary data at Bioinformatics online.
ABSTRACT: Prion diseases are fatal neurodegenerative diseases characterized by the formation of β-rich oligomers and the accumulation of amyloid fibrillar deposits in the central nervous system. Understanding the conversion of the cellular prion protein into its β-rich polymeric conformers is fundamental to tackling the early stages of the development of prion diseases. In this paper, we have identified unfolding and refolding steps critical to the conversion into a β-rich conformer for different constructs of the ovine prion protein by molecular dynamics simulations. By combining our results with in vitro experiments, we show that the folded C-terminus of the ovine prion protein is able to recurrently undergo a drastic conformational change by displacement of the H1 helix, uncovering of the H2H3 domain, and formation of persistent β-sheets between H2 and H3 residues. The observed β-sheets refold toward the C-terminus exposing what we call a "bending region" comprising residues 204-214. This is strikingly coincident with the region harboring mutations determining the fate of the prion oligomerization process. The β-rich intermediate is used here for the construction of a putative model for the assembly into an oligomeric aggregate. The results presented here confirm the importance of the H2H3 domain for prion oligomer formation and therefore its potential use as molecular target in the design of novel prion inhibitors.
Journal of Chemical Theory and Computation 05/2013; 9(5):2455-2465. · 5.39 Impact Factor
ABSTRACT: To uncover the structural and dynamical determinants involved in the highly specific binding of Ras GTPase to its effectors, the conformational states of Ras in uncomplexed form and complexed to the downstream effectors Byr2, PI3Kγ, PLCε, and RalGDS were investigated using molecular dynamics and cross-comparison of the trajectories. The subtle changes in the dynamics and conformations of Ras upon effector binding require an analysis that targets local changes independent of global motions. Using a structural alphabet, a computational procedure is proposed to quantify local conformational changes. Positions detected by this approach were characterized as either specific for a particular effector, specific for an effector domain type, or as effector unspecific. A set of nine structurally connected residues (Ras residues 5-8, 32-35, 39-42, 55-59, 73-78, and 161-165), which link the effector binding site to the distant C-terminus, changed dynamics upon effector binding, indicating a potential effector-unspecific signaling route within the Ras structure. Additional conformational changes were detected along the N-terminus of the central β-sheet. Besides the Ras residues at the effector interface (e.g., D33, E37, D38, and Y40), which adopt effector-specific local conformations, the binding signal propagates from the interface to distant hot-spot residues, in particular to Y5 and D57. The results of this study reveal possible conformational mechanisms for the stabilization of the active state of Ras upon downstream effector binding and for the structural determinants responsible for effector specificity.
Journal of Chemical Theory and Computation 01/2013; 9(1):738-749. · 5.39 Impact Factor
ABSTRACT: Solvation forces are crucial determinants in the equilibrium between the folded and unfolded state of proteins. Particularly interesting are the solvent forces of denaturing solvent mixtures on folded and misfolded states of proteins involved in neurodegeneration. The C-terminal globular domain of the ovine prion protein (1UW3) and its analogue H2H3 in the α-rich and β-rich conformation were used as model structures to study the solvation forces in 4 M aqueous urea using molecular dynamics. The model structures display very different secondary structures and solvent exposures. Most protein atoms favor interactions with urea over interactions with water. The force difference between protein-urea and protein-water interactions correlates with hydrophobicity; i.e., urea interacts preferentially with hydrophobic atoms, in agreement with results from solvent transfer experiments. Solvent Shannon entropy maps illustrate the mobility gradient of the urea-water mixture from the first solvation shell to the bulk. Single urea molecules replace water in the first solvation shell preferably at locations of relatively high solvent entropy.
Journal of Chemical Theory and Computation 10/2012; 8(10):3977-3984. · 5.39 Impact Factor
ABSTRACT: Implicit solvation is a mean force approach to model solvent forces acting on a solute molecule. It is frequently used in molecular simulations to reduce the computational cost of solvent treatment. In the first instance, the free energy of solvation and the associated solvent-solute forces can be approximated by a function of the solvent-accessible surface area (SASA) of the solute and differentiated by an atom-specific solvation parameter σ_i^SASA. A procedure for the determination of values for the σ_i^SASA parameters through matching of explicit and implicit solvation forces is proposed. Using the results of Molecular Dynamics simulations of 188 topologically diverse protein structures in water and in implicit solvent, values for the σ_i^SASA parameters for atom types i of the standard amino acids in the GROMOS force field have been determined. A simplified representation based on groups of atom types σ_g^SASA was obtained via partitioning of the atom-type σ_i^SASA distributions by dynamic programming. Three groups of atom types with well separated parameter ranges were obtained, and their performance in implicit versus explicit simulations was assessed. The solvent forces are available at http://mathbio.nimr.mrc.ac.uk/wiki/Solvent_Forces.
Journal of Chemical Theory and Computation 07/2012; 8(7):2391-2403. · 5.39 Impact Factor
ABSTRACT: Evaluating alternative multiple protein sequence alignments is an important unsolved problem in biology. The most accurate way of doing this is to use structural information. Unfortunately, most methods require at least two structures to be embedded in the alignment, a condition rarely met when dealing with standard datasets.
We developed STRIKE, a method that determines the relative accuracy of two alternative alignments of the same sequences using a single structure. We validated our methodology on three commonly used reference datasets (BAliBASE, HOMSTRAD and PREFAB). Given two alignments, STRIKE identifies the more accurate one in 70% of the cases on average. This figure increases to 79% when considering very challenging datasets like the RV11 category of BAliBASE. This discrimination capacity is significantly higher than that reported for other metrics such as Contact Accepted mutatiOn (CAO) or BLOSUM. We show that this increased performance results both from a refined definition of the contacts and from the use of an improved contact substitution score.
STRIKE is open-source freeware available from www.tcoffee.org.
Supplementary data are available at Bioinformatics online.
ABSTRACT: Allostery offers a highly specific way to modulate protein function. Therefore, understanding this mechanism is of increasing interest for protein science and drug discovery. However, allosteric signal transmission is difficult to detect experimentally and to model because it is often mediated by local structural changes propagating along multiple pathways. To address this, we developed a method to identify communication pathways by an information-theoretical analysis of molecular dynamics simulations. Signal propagation was described as information exchange through a network of correlated local motions, modeled as transitions between canonical states of protein fragments. The method was used to describe allostery in two-component regulatory systems. In particular, the transmission from the allosteric site to the signaling surface of the receiver domain NtrC was shown to be mediated by a layer of hub residues. The location of hubs preferentially connected to the allosteric site was found in close agreement with key residues experimentally identified as involved in the signal transmission. The comparison with the networks of the homologues CheY and FixJ highlighted similarities in their dynamics. In particular, we showed that a preorganized network of fragment connections between the allosteric and functional sites exists already in the inactive state of all three proteins.
The FASEB Journal 11/2011; 26(2):868-81. · 5.70 Impact Factor
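The "information exchange through a network of correlated local motions" described in the abstract above can be illustrated with a minimal sketch. It assumes that each fragment position has already been encoded as one structural-alphabet letter per trajectory frame, and it computes plain mutual information between two such columns. The function name `mutual_information` and the toy letter strings are hypothetical; the published method may well apply normalisation or finite-size corrections that are not shown here.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equally long letter sequences.

    Each sequence holds the structural-alphabet letter observed at one
    fragment position in every frame of a trajectory (an assumption of
    this sketch, not a detail taken from the abstract).
    """
    if len(col_a) != len(col_b) or len(col_a) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    count_a = Counter(col_a)               # marginal counts, position A
    count_b = Counter(col_b)               # marginal counts, position B
    count_ab = Counter(zip(col_a, col_b))  # joint counts over frames
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_a[a] * count_b[b]))
    return mi


if __name__ == "__main__":
    # Toy data: two positions that switch state together, and one that does not.
    print(mutual_information("AABB", "CCDD"))
    print(mutual_information("AABB", "CDCD"))
```

Computing this quantity for every pair of fragment positions would give a matrix that can be read as a weighted network, in which strongly coupled positions are candidate members of a communication pathway.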
ABSTRACT: A novel form of acto-myosin regulation has been proposed in which polymerization of new actin filaments regulates motility of parasites of the apicomplexan class of protozoa. In vivo and in vitro parasite F-actin is very short and unstable, but the structural basis and details of filament dynamics remain unknown. Here, we show that long actin filaments can be obtained by polymerizing unlabeled rabbit skeletal actin (RS-actin) onto both ends of the short rhodamine-phalloidin-stabilized Plasmodium falciparum actin I (Pf-actin) filaments. Following annealing, hybrid filaments of micron length and "zebra-striped" appearance are observed by fluorescence microscopy that are stable enough to move over myosin class II motors in a gliding filament assay. Using negative stain electron microscopy we find that pure Pf-actin stabilized by jasplakinolide (JAS) also forms long filaments, indistinguishable in length from RS-actin filaments, and long enough to be characterized structurally. To compare structures in near physiological conditions in aqueous solution we imaged Pf-actin and RS-actin filaments by atomic force microscopy (AFM). We found the monomer stacking to be distinctly different for Pf-actin compared with RS-actin, such that the pitch of the double helix of Pf-actin filaments was 10% larger. Our results can be explained by a rotational angle between subunits that is larger in the parasite compared with RS-actin. Modeling of the AFM data using high-resolution actin filament models supports our interpretation of the data. The structural differences reported here may be a consequence of weaker inter- and intra-strand contacts, and may be critical for differences in filament dynamics and for regulation of parasite motility.
Journal of Biological Chemistry 11/2010; 285(47):36577-85. · 4.65 Impact Factor
ABSTRACT: The hierarchical and partially redundant nature of protein structures justifies the definition of frequently occurring conformations of short fragments as 'states'. Collections of selected representatives for these states define Structural Alphabets, describing the most typical local conformations within protein structures. These alphabets form a bridge between the string-oriented methods of sequence analysis and the coordinate-oriented methods of protein structure analysis.
A Structural Alphabet has been derived by clustering all four-residue fragments of a high-resolution subset of the protein data bank and extracting the high-density states as representative conformational states. Each fragment is uniquely defined by a set of three independent angles corresponding to its degrees of freedom, capturing in simple and intuitive terms the properties of the conformational space. The fragments of the Structural Alphabet are equivalent to the conformational attractors and therefore yield a most informative encoding of proteins. Proteins can be reconstructed within the experimental uncertainty in structure determination and ensembles of structures can be encoded with accuracy and robustness.
The density-based Structural Alphabet provides a novel tool to describe local conformations and it is specifically suitable for application in studies of protein dynamics.
ABSTRACT: We apply our recently developed information-theoretic measures for the characterisation and comparison of protein-protein interaction networks. These measures are used to quantify topological network features via macroscopic statistical properties. Network differences are assessed based on these macroscopic properties as opposed to microscopic overlap, homology information or motif occurrences. We present the results of a large-scale analysis of protein-protein interaction networks. Precise null models are used in our analyses, allowing for reliable interpretation of the results. By quantifying the methodological biases of the experimental data, we can define an information threshold above which networks may be deemed to comprise consistent macroscopic topological properties, despite their small microscopic overlaps. Based on this rationale, data from yeast-two-hybrid methods are sufficiently consistent to allow for intra-species comparisons (between different experiments) and inter-species comparisons, while data from affinity-purification mass-spectrometry methods show large differences even within intra-species comparisons.
PLoS ONE 01/2010; 5(8):e12083. · 3.73 Impact Factor
ABSTRACT: We study the tailoring of structured random graph ensembles to real networks, with the objective of generating precise and practical mathematical tools for quantifying and comparing network topologies macroscopically, beyond the level of degree statistics. Our family of ensembles can produce graphs with any prescribed degree distribution and any degree-degree correlation function, and its control parameters can be calculated fully analytically. As a result, we can calculate (asymptotically) formulae for entropies and complexities, and for information-theoretic distances between networks, expressed directly and explicitly in terms of their measured degree distribution and degree correlations.
Journal of Physics A General Physics 12/2009; 42(48).
ABSTRACT: The size of current protein databases is a challenge for many Bioinformatics applications, both in terms of processing speed and information redundancy. It may therefore be desirable to efficiently reduce the database of interest to a maximally representative subset.
The MinSet method employs a combination of a Suffix Tree and a Genetic Algorithm for the generation, selection and assessment of database subsets. The approach is generally applicable to any type of string-encoded data, allowing for a drastic reduction of the database size whilst retaining most of the information contained in the original set. We demonstrate the performance of the method on a database of protein domain structures encoded as strings. We used the SCOP40 domain database by translating protein structures into character strings by means of a structural alphabet and by extracting optimized subsets according to an entropy score that is based on a constant-length fragment dictionary. Therefore, optimized subsets are maximally representative for the distribution and range of local structures. Subsets containing only 10% of the SCOP structure classes show a coverage of >90% for fragments of length 1-4.
Supplementary data are available at Bioinformatics online.
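As a rough illustration of an "entropy score that is based on a constant-length fragment dictionary", the sketch below computes the Shannon entropy of the distribution of length-k fragments in a set of strings. This is only one plausible reading of that phrase: the abstract does not give the actual MinSet scoring formula, and the function name `fragment_entropy` and the example strings are hypothetical.

```python
from collections import Counter
from math import log2


def fragment_entropy(strings, k):
    """Shannon entropy (in bits) of the length-k fragment distribution.

    Every overlapping substring of length k in every input string is
    counted; strings shorter than k contribute nothing.
    """
    counts = Counter(
        s[i:i + k] for s in strings for i in range(len(s) - k + 1)
    )
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts.values())


if __name__ == "__main__":
    full_set = ["ABAB", "CCCC", "ABCC"]  # toy structure strings
    subset = ["ABCC"]                    # candidate representative subset
    for k in (1, 2):
        print(k, fragment_entropy(full_set, k), fragment_entropy(subset, k))
```

Under this reading, a candidate subset would be preferred when its fragment entropy stays close to that of the full database, so a search procedure such as a genetic algorithm could use the value as part of its fitness function.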
ABSTRACT: Merozoite surface protein 1 (MSP1) of the malaria parasite Plasmodium falciparum is an important vaccine candidate antigen. Antibodies specific for the C-terminal maturation product, MSP1(19), have been shown to inhibit erythrocyte invasion and parasite growth. Specific monoclonal antibodies react with conformational epitopes contained within the two EGF-like domains that constitute the antigen MSP1(19). To gain greater insight into the inhibitory process, the authors selected two strongly inhibitory antibodies (designated 12.8 and 12.10) and modeled their structures by homology. Computational docking was used to generate antigen-antibody complexes and a selection filter based on NMR data was applied to obtain plausible models. Molecular Dynamics simulations of the selected complexes were performed to evaluate the role of specific side chains in the binding. Favorable complexes were obtained that complement the NMR data in defining specific binding sites. These models can provide valuable guidelines for future experimental work that is devoted to the understanding of the action mechanism of invasion-inhibitory antibodies.
Proteins Structure Function and Bioinformatics 03/2007; 66(3):513-27. · 3.34 Impact Factor
ABSTRACT: Large-scale analysis of biomolecular complexes reveals the functional network within the cell. Computational methods are required to extract the essential information from the available data. The POPSCOMP server is designed to calculate the interaction surface between all components of a given complex structure consisting of proteins, DNA or RNA molecules. The server returns matrices and graphs of surface area burial that can be used to automatically annotate components and residues that are involved in complex formation, to pinpoint conformational changes and to estimate molecular interaction energies. The analysis can be performed on a per-atom level or alternatively on a per-residue level for low-resolution structures. Here, we present an analysis of ribosomal structures in complex with various antibiotics to exemplify the potential and limitations of automated complex analysis. The POPSCOMP server is accessible at http://ibivu.cs.vu.nl/programs/popscompwww/.
Nucleic Acids Research 08/2005; 33(Web Server issue):W342-6. · 8.28 Impact Factor
ABSTRACT: A detailed molecular dynamics study of the haemagglutinin fusion peptide (N-terminal 20 residues of the HA2 subunits) in a model bilayer has yielded useful information about the molecular interactions leading to insertion into the lipids. Simulations were performed on the native sequence, as well as a number of mutant sequences, which are either fusogenic or nonfusogenic. For the native sequence and fusogenic mutants, the N-terminal 11 residues of the fusion peptides are helical and insert with a tilt angle of approximately 30 degrees with respect to the membrane normal, in very good agreement with experimental data. The tilted insertion of the native sequence peptide leads to membrane bilayer thinning and the calculated order parameters show larger disorder of the alkyl chains. These results indicate that the lipid packing is perturbed by the fusion peptide and could be used to explain membrane fusion. For the nonfusogenic sequences investigated, it was found that most of them equilibrate parallel to the interface plane and do not adopt a tilted conformation. The presence of a charged residue at the beginning of the sequence (G1E mutant) resulted in a more difficult case, and the outcomes do not fall straightforwardly into the general picture. Sequence searches have revealed similarities of the fusion peptide of influenza haemagglutinin with peptide sequences such as segments of porin, amyloid αβ peptide, and a peptide from the prion sequence. These results confirm that the sequence can adopt different folds in different environments. The plasticity and the conformational dependence on the local environment could be used to better understand the function of fusion peptides.
ABSTRACT: We present a profile-profile multiple alignment strategy that uses database searching to collect homologues for each sequence in a given set, in order to enrich their available evolutionary information for the alignment. For each of the alignment sequences, the putative homologous sequences that score above a pre-defined threshold are incorporated into a position-specific pre-alignment profile. The enriched position-specific profile is used for standard progressive alignment, thereby more accurately describing the characteristic features of the given sequence set. We show that owing to the incorporation of the pre-alignment information into a standard progressive multiple alignment routine, the alignment quality between distant sequences increases significantly and outperforms state-of-the-art methods, such as T-COFFEE and MUSCLE. We also show that although entirely sequence-based, our novel strategy is better at aligning distant sequences when compared with a recent contact-based alignment method. Therefore, our pre-alignment profile strategy should be advantageous for applications that rely on high alignment accuracy such as local structure prediction, comparative modelling and threading.
Nucleic Acids Research 02/2005; 33(3):816-24. · 8.28 Impact Factor
ABSTRACT: This paper introduces the novel method of contact-based protein sequence alignment, where structural information in the form of contact mutation probabilities is incorporated into an alignment routine using contact-mutation matrices (CAO: Contact Accepted mutatiOn). The contact-based alignment routine optimizes the score of matched contacts, which involves four (two per contact) instead of two residues per match in pairwise alignments. The first contact refers to a real side-chain contact in a template sequence with known structure, and the second contact is the equivalent putative contact of a homologous query sequence with unknown structure. An algorithm has been devised to perform a pairwise sequence alignment based on contact information. The contact scores were combined with PAM-type (Point Accepted Mutation) substitution scores after parameterization of gap penalties and score weights by means of a genetic algorithm. We show that owing to the structural information contained in the CAO matrices, significantly improved alignments of distantly related sequences can be obtained. This has allowed us to annotate eight putative Drosophila IGF sequences. Contact-based sequence alignment should therefore prove useful in comparative modelling and fold recognition.
Nucleic Acids Research 02/2004; 32(8):2464-73. · 8.28 Impact Factor