ABSTRACT: An important aspect of studying the relationship between protein sequence, structure and function is the molecular characterization of the effect of protein mutations. To understand the functional impact of amino acid changes, the multiple biological properties of protein residues have to be considered together.
BMC proceedings 01/2014; 8(Suppl 2 Proceedings of the 3rd Annual Symposium on Biologica):S2.
ABSTRACT: The relationship of HIV tropism with disease progression and the recent development of CCR5-blocking drugs underscore the importance of monitoring virus coreceptor usage. As an alternative to costly phenotypic assays, computational methods aim at predicting virus tropism based on the sequence and structure of the V3 loop of the virus gp120 protein. Here we present a numerical descriptor of the V3 loop encoding its physicochemical and structural properties. The descriptor allows for structure-based prediction of HIV tropism and identification of properties of the V3 loop that are crucial for coreceptor usage. Use of the proposed descriptor for prediction results in a statistically significant improvement over prediction based solely on the V3 sequence, with an improvement of 3 percentage points in AUC and 7 percentage points in sensitivity at the specificity of the 11/25 rule (95%). We additionally assessed the predictive power of the new method on clinically derived 'bulk' sequence data and obtained a statistically significant improvement in AUC of 3 percentage points over sequence-based prediction. Furthermore, we demonstrated the capacity of our method to predict therapy outcome by applying it to 53 samples from patients undergoing Maraviroc therapy. The analysis of structural features of the loop that are informative of tropism indicates the importance of two loop regions and their physicochemical properties. The regions are located on opposite strands of the loop stem, and the respective features are predominantly charge-, hydrophobicity- and structure-related. These regions are in close proximity in the bound conformation of the loop, potentially forming a site determinant for coreceptor binding. The method is available via a server at http://structure.bioinf.mpi-inf.mpg.de/.
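The 11/25 rule used as the baseline in the abstract above is commonly stated as follows: a virus is predicted to use CXCR4 if position 11 or position 25 of the V3 loop carries a positively charged residue (lysine or arginine). A minimal Python sketch of that baseline is given here for orientation. It is not the descriptor-based method of the paper; the function name and the example sequence are illustrative assumptions, and the input is assumed to be an ungapped V3 sequence in the standard 35-residue numbering.

```python
# Baseline sketch of the 11/25 rule (not the structure-based descriptor method).
# Assumption: `v3` is an ungapped V3 loop sequence in standard numbering,
# so string index 10 is position 11 and string index 24 is position 25.
BASIC_RESIDUES = {"K", "R"}


def rule_11_25(v3: str) -> str:
    """Return 'X4' if position 11 or 25 is lysine or arginine, else 'R5'."""
    v3 = v3.upper()
    if len(v3) < 25:
        raise ValueError("V3 sequence must cover at least 25 positions")
    if v3[10] in BASIC_RESIDUES or v3[24] in BASIC_RESIDUES:
        return "X4"
    return "R5"


# Illustrative 35-residue sequence (hypothetical input, serine at 11, glutamate at 25).
example_v3 = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
# The same sequence with an arginine introduced at position 11.
example_v3_mutant = example_v3[:10] + "R" + example_v3[11:]
```

Sequences with insertions or deletions would first need to be aligned to the reference numbering, which this sketch does not attempt.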
ABSTRACT: Biological plausibility and other prior information could help select genome-wide association (GWA) findings for further follow-up, but there is no consensus on which types of knowledge should be considered or how to weight them. We used experts' opinions and empirical evidence to estimate the relative importance of 15 types of information at the single-nucleotide polymorphism (SNP) and gene levels. Opinions were elicited from 10 experts using a two-round Delphi survey. Empirical evidence was obtained by comparing the frequency of each type of characteristic in SNPs established as being associated with seven disease traits through GWA meta-analysis and independent replication, with the corresponding frequency in a randomly selected set of SNPs. SNP and gene characteristics were retrieved using a specially developed bioinformatics tool. Both the expert and the empirical evidence rated previous association in a meta-analysis or more than one study as conferring the highest relative probability of true association, whereas previous association in a single study ranked much lower. High relative probabilities were also observed for location in a functional protein domain, although location in a region evolutionarily conserved in vertebrates was ranked high by the data but not by the experts. Our empirical evidence did not support the importance attributed by the experts to whether the gene encodes a protein in a pathway or shows interactions relevant to the trait. Our findings provide insight into the selection and weighting of different types of knowledge in SNP or gene prioritization, and point to areas requiring further research.
ABSTRACT: Parkinson's disease (PD) is a progressive neurodegenerative disorder affecting approximately 1-2% of the general population over age 60. It is characterized by a rather selective loss of dopaminergic neurons in the substantia nigra and the presence of α-synuclein-enriched Lewy body inclusions. Mutations in the Parkin gene (PARK2) are the major cause of autosomal recessive early-onset parkinsonism. The Parkin protein is an E3 ubiquitin ligase with various cellular functions, including the induction of mitophagy upon mitochondrial depolarization, but the full repertoire of Parkin-binding proteins remains poorly defined. Here we employed tandem affinity purification interaction screens with subsequent mass spectrometry to profile binding partners of Parkin. Using this approach for two different cell types (HEK293T and SH-SY5Y neuronal cells), we identified a total of 203 candidate Parkin-binding proteins. For the candidate proteins and the proteins known to cause heritable forms of parkinsonism, protein-protein interaction data were derived from public databases, and the associated biological processes and pathways were analyzed and compared. Functional similarity between the candidates and the proteins involved in monogenic parkinsonism was investigated, and additional confirmatory evidence was obtained using published genetic interaction data from Drosophila melanogaster. Based on the results of the different analyses, a prioritization score was assigned to each candidate Parkin-binding protein. Two of the top-ranking candidates were tested by co-immunoprecipitation, and interaction with Parkin was confirmed for one of them. New candidates for involvement in cell death processes, protein folding, the fission/fusion machinery, and the mitophagy pathway were identified, which provide a resource for further elucidating Parkin function.
PLoS ONE 01/2013; 8(11):e78648. · 3.53 Impact Factor
ABSTRACT: Prioritization is the process whereby a set of possible candidate genes or SNPs is ranked so that the most promising can be taken forward into further studies. In a genome-wide association study, prioritization is usually based on the P-values alone, but researchers sometimes take account of external annotation information about the SNPs, such as whether the SNP lies close to a good candidate gene. Using external information in this way is inherently subjective and is often not formalized, making the analysis difficult to reproduce. Building on previous work that has identified 14 important types of external information, we present an approximate Bayesian analysis that produces an estimate of the probability of association. The calculation combines four sources of information: the genome-wide data, SNP information derived from bioinformatics databases, empirical SNP weights, and the researchers' subjective prior opinions. The calculation is fast enough that it can be applied to millions of SNPs, and although it does rely on subjective judgments, those judgments are made explicit so that the final SNP selection can be reproduced. We show that the resulting probability of association is intuitively more appealing than the P-value because it is easier to interpret and it makes allowance for the power of the study. We illustrate the use of the probability of association for SNP prioritization by applying it to a meta-analysis of kidney function genome-wide association studies and demonstrate that SNP selection performs better using the probability of association compared with P-values alone.
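The abstract above does not give the formula behind the probability of association, so the following Python sketch only illustrates the general idea of combining a prior probability with the evidence in the data. It assumes a Wakefield-style approximate Bayes factor for a normally distributed effect estimate; that choice, the function names, and the default prior standard deviation of 0.2 are assumptions made for illustration and may differ from the authors' calculation. In this sketch the prior probability is the place where annotation weights and subjective opinion would enter.

```python
import math


def log_bayes_factor(beta_hat: float, se: float, prior_sd: float) -> float:
    """Log approximate Bayes factor for association versus no association.

    Assumes beta_hat ~ N(beta, se^2) and, under association, beta ~ N(0, prior_sd^2).
    """
    v = se ** 2
    w = prior_sd ** 2
    z = beta_hat / se
    shrink = w / (v + w)
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z * z * shrink


def probability_of_association(beta_hat: float, se: float,
                               prior_prob: float, prior_sd: float = 0.2) -> float:
    """Posterior probability of association: posterior odds = Bayes factor x prior odds."""
    if not 0.0 < prior_prob < 1.0:
        raise ValueError("prior_prob must lie strictly between 0 and 1")
    log_odds = (log_bayes_factor(beta_hat, se, prior_sd)
                + math.log(prior_prob) - math.log1p(-prior_prob))
    # Numerically stable logistic transform of the posterior log odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)
```

Under these assumptions, a SNP with no observed effect ends up with a posterior below its prior, while the same strong signal yields a higher posterior for a SNP whose annotation justified a higher prior.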
ABSTRACT: Anaemia is a chief determinant of global ill health, contributing to cognitive impairment, growth retardation and impaired physical capacity. To understand further the genetic factors influencing red blood cells, we carried out a genome-wide association study of haemoglobin concentration and related parameters in up to 135,367 individuals. Here we identify 75 independent genetic loci associated with one or more red blood cell phenotypes at P < 10⁻⁸, which together explain 4-9% of the phenotypic variance per trait. Using expression quantitative trait loci and bioinformatic strategies, we identify 121 candidate genes enriched in functions relevant to red blood cell biology. The candidate genes are expressed preferentially in red blood cell precursors, and 43 have haematopoietic phenotypes in Mus musculus or Drosophila melanogaster. Through open-chromatin and coding-variant analyses we identify potential causal genetic variants at 41 loci. Our findings provide extensive new insights into genetic mechanisms and biological pathways controlling red blood cell formation and function.
ABSTRACT: Phospho- and sphingolipids are crucial cellular and intracellular compounds. These lipids are required for active transport, a number of enzymatic processes, membrane formation, and cell signalling. Disruption of their metabolism leads to several diseases, with diverse neurological, psychiatric, and metabolic consequences. A large number of phospholipid and sphingolipid species can be detected and measured in human plasma. We conducted a meta-analysis of five European family-based genome-wide association studies (N = 4034) on plasma levels of 24 sphingomyelins (SPM), 9 ceramides (CER), 57 phosphatidylcholines (PC), 20 lysophosphatidylcholines (LPC), 27 phosphatidylethanolamines (PE), and 16 PE-based plasmalogens (PLPE), as well as their proportions in each major class. This effort yielded 25 genome-wide significant loci for phospholipids (smallest P-value = 9.88×10⁻²⁰⁴) and 10 loci for sphingolipids (smallest P-value = 3.10×10⁻⁵⁷). After a correction for multiple comparisons (P-value
ABSTRACT: Drug-resistant viral variants are a major issue in the use of direct-acting antiviral agents in chronic hepatitis C. Ketoamides are potent inhibitors of the NS3 protease, with V55A identified as a mutation associated with resistance to boceprevir. The underlying molecular mechanisms are only partially understood. We applied a comprehensive sequence analysis to characterize the natural variability at Val55 within dominant worldwide patient strains. A residue-interaction network and molecular dynamics simulation were applied to identify mechanisms for ketoamide resistance and viral fitness in Val55 variants. An infectious H77S.3 cell culture system was used for variant phenotype characterization. We measured antiviral 50% effective concentration (EC₅₀) and fold changes, as well as RNA replication and infectious virus yields from viral RNAs containing variants. Val55 was found to be highly conserved throughout all hepatitis C virus (HCV) genotypes. The conservative V55A and V55I variants were identified in HCV genotype 1a strains, with no variants in genotype 1b. Topology measures from a residue-interaction network of the protease structure suggest a potential key role for Val55 in modulating molecular changes in the protease ligand-binding site. Molecular dynamics showed variants with constricted binding pockets and a loss of H-bonded interactions upon boceprevir binding to the variant proteases. These effects might explain the low-level boceprevir resistance of the V55A variant, as well as the reduced RNA replication capacity of the Val55 variants. Higher structural flexibility was found in the wild-type protease, whereas variants showed lower flexibility. Reduced structural flexibility could impact the Val55 variants' ability to adapt for NS3 domain-domain interaction and might explain the virus yield drop observed in variant strains.
Antimicrobial Agents and Chemotherapy 01/2012; 56(4):1907-15. · 4.57 Impact Factor
ABSTRACT: Computational analysis and interactive visualization of biological networks and protein structures are common tasks for gaining insight into biological processes. This protocol describes three workflows based on the NetworkAnalyzer and RINalyzer plug-ins for Cytoscape, a popular software platform for networks. NetworkAnalyzer has become a standard Cytoscape tool for comprehensive network topology analysis. In addition, RINalyzer provides methods for exploring residue interaction networks derived from protein structures. The first workflow uses NetworkAnalyzer to perform a topological analysis of biological networks. The second workflow applies RINalyzer to study protein structure and function and to compute network centrality measures. The third workflow combines NetworkAnalyzer and RINalyzer to compare residue networks. The full protocol can be completed in ∼2 h.
ABSTRACT: It is a challenge to develop direct-acting antiviral agents that target the nonstructural protein 3/4A protease of hepatitis C virus because resistant variants develop. Ketoamide compounds, designed to mimic the natural protease substrate, have been developed as inhibitors. However, clinical trials have revealed rapid selection of resistant mutants, most of which are considered to be pre-existing variants.
We identified residues near the ketoamide-binding site in x-ray structures of the genotype 1a protease, co-crystallized with boceprevir or a telaprevir-like ligand, and then identified variants at these positions in 219 genotype-1 sequences from a public database. We used side-chain modeling to assess the potential effects of these variants on the interaction between ketoamide and the protease, and compared these results with the phenotypic effects on ketoamide resistance, RNA replication capacity, and infectious virus yields in a cell culture model of infection.
Thirteen natural binding-site variants with potential for ketoamide resistance were identified at 10 residues in the protease, near the ketoamide binding site. Rotamer analysis of amino acid side-chain conformations indicated that 2 variants (R155K and D168G) could affect binding of telaprevir more than boceprevir. Measurements of antiviral susceptibility in cell-culture studies were consistent with this observation. Four variants (ie, Q41H, I132V, R155K, and D168G) caused low-to-moderate levels of ketoamide resistance; 3 of these were highly fit (Q41H, I132V, and R155K).
Using a comprehensive sequence and structure-based analysis, we showed how natural variation in the hepatitis C virus protease nonstructural protein 3/4A sequences might affect susceptibility to first-generation direct-acting antiviral agents. These findings increase our understanding of the molecular basis of ketoamide resistance among naturally existing viral variants.
ABSTRACT: The study of individual amino acid residues and their molecular interactions in protein structures is crucial for understanding structure-function relationships. Recent work has indicated that residue networks derived from 3D protein structures provide additional insights into the structural and functional roles of interacting residues. Here, we present the new software tools RINerator and RINalyzer for the automated generation, 2D visualization, and interactive analysis of residue interaction networks, and highlight their use in different application scenarios.
Trends in Biochemical Sciences 02/2011; 36(4):179-82. · 13.08 Impact Factor
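To make the notion of a residue interaction network concrete, the Python sketch below represents residues as nodes and interactions as undirected edges, then computes two standard centrality measures. It is a generic illustration and does not use or reproduce the RINerator or RINalyzer interfaces; the residue labels and the toy network are hypothetical.

```python
from collections import deque

# Hypothetical toy network: keys are residue labels, values are interacting residues.
toy_rin = {
    "VAL55": {"HIS57", "ALA156"},
    "HIS57": {"VAL55", "SER139"},
    "ALA156": {"VAL55"},
    "SER139": {"HIS57"},
}


def degree_centrality(graph: dict, node: str) -> int:
    """Number of residues that directly interact with `node`."""
    return len(graph[node])


def closeness_centrality(graph: dict, node: str) -> float:
    """Reachable residues divided by the sum of shortest-path lengths to them."""
    distances = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search over unweighted edges
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in distances:
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    total = sum(distances.values())
    return (len(distances) - 1) / total if total else 0.0


# Rank residues from most to least central by closeness.
ranking = sorted(toy_rin, key=lambda r: closeness_centrality(toy_rin, r), reverse=True)
```

In this toy network the two inner residues of the chain have the highest closeness, which is the kind of topological signal such analyses use to flag residues of potential structural or functional importance.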
ABSTRACT: Conserved interactions between T cell receptors (TCRs) and major histocompatibility complex (MHC) proteins with bound peptide antigens are not well understood. In order to gain a better understanding of the interaction modes of human TCR variable (V) regions, we have performed a structural analysis of human TCRs bound to their MHC-peptide ligands, using the available structural models determined by X-ray crystallography. We identified important differences from previous studies in which such interactions were evaluated. Based on the interactions found in the actual experimental structures, we developed the first rule-based approach for predicting the ability of TCR residues in the complementarity-determining region (CDR) 1, CDR2, and CDR3 loops to interact with the MHC-peptide antigen complex. Two relatively simple algorithms show good performance under cross-validation.
ABSTRACT: Structural alignments of proteins are important for identification of structural similarities, homology detection and functional annotation. The structural alignment problem is well studied and computationally difficult. Many different scoring schemes for structural similarity as well as many algorithms for finding high-scoring alignments have been proposed. Algorithms using contact map overlap (CMO) as scoring function are currently the only practical algorithms able to compute provably optimal alignments.
We propose a new mathematical model for the alignment of inter-residue distance matrices, building upon previous work on maximum CMO. Our model includes all elements needed to emulate various scoring schemes for the alignment of protein distance matrices. The algorithm that we use to compute alignments is practical only for sparse distance matrices. Therefore, we propose a more effective scoring function, which uses a distance threshold and only positive structural scores. We show that even under these restrictions our approach is competitive with state-of-the-art structural alignment algorithms in terms of alignment accuracy, while additionally either proving the optimality of an alignment or returning bounds on the optimal score. Our novel method is freely available and constitutes an important and promising step towards truly provably optimal structural alignments of proteins.
An executable of our program PAUL is available at http://planet-lisa.net/.
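For readers unfamiliar with contact map overlap, the Python sketch below shows how the score of a given alignment can be computed: two residues are in contact if their distance is below a threshold, and the overlap counts the contacts in one structure whose endpoints are aligned to a contact in the other. This only evaluates a supplied alignment. It does not search for the optimal alignment and is not the PAUL algorithm; the 7.5 Å default threshold, the function names, and the dictionary representation of an alignment are assumptions made for illustration.

```python
import math
from itertools import combinations


def contact_map(coords, threshold=7.5):
    """Return the set of residue index pairs (i, j), i < j, closer than `threshold`.

    `coords` is a list of (x, y, z) tuples, one representative atom per residue.
    """
    return {
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(coords), 2)
        if math.dist(a, b) < threshold
    }


def contact_map_overlap(contacts_a, contacts_b, alignment):
    """Count contacts of structure A that are mapped onto contacts of structure B.

    `alignment` maps residue indices of A to residue indices of B.
    """
    shared = 0
    for i, j in contacts_a:
        if i in alignment and j in alignment:
            k, l = sorted((alignment[i], alignment[j]))
            if (k, l) in contacts_b:
                shared += 1
    return shared


# Toy example: two straight chains with 4 Å spacing (hypothetical coordinates).
chain_a = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
chain_b = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
contacts_a = contact_map(chain_a, threshold=5.0)
contacts_b = contact_map(chain_b, threshold=5.0)
overlap = contact_map_overlap(contacts_a, contacts_b, {0: 0, 1: 1, 2: 2})
```

Because only pairs below the threshold are stored, the contact sets stay sparse, which is the property the abstract identifies as necessary for the alignment computation to remain practical.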
ABSTRACT: Boceprevir is a hepatitis C virus (HCV) nonstructural protein (NS) 3/4A protease inhibitor that is currently being evaluated in combination with peginterferon alfa-2b and ribavirin in phase 3 studies. The clinical resistance profile of boceprevir has not yet been characterized in detail. The NS3 protease domain of viral RNA was cloned from HCV genotype 1-infected patients (n = 22). A mean of 47 clones were sequenced before, at the end of, and after treatment with 400 mg boceprevir twice or three times daily for 14 days for genotypic, phenotypic, and viral fitness analysis. At the end of treatment, a wild-type NS3 protease sequence was observed with a mean frequency of 85.9%. In the remaining isolates, five previously observed resistance mutations (V36M/A, T54A/S, R155K/T, A156S, V170A) and one mutation (V55A) with unknown resistance to boceprevir were detected either alone or in combination. Phenotypic analysis in the HCV replicon assay showed low-level (V36G, T54S, R155L; 3.8- to 5.5-fold 50% inhibitory concentration [IC₅₀]), medium-level (V55A, R155K, V170A, T54A, A156S; 6.8- to 17.7-fold IC₅₀) and high-level (A156T; >120-fold IC₅₀) resistance to boceprevir. The overall frequency of resistance mutations and the level of resistance increased with greater declines in mean maximum HCV RNA levels. Two weeks after the end of treatment, the frequency of resistant variants declined and the number of wild-type isolates increased to 95.5%. With the exception of V36 and V170 variants, all resistance mutations declined by more than 50%. Mathematical modeling revealed impaired replicative fitness for all single mutations, whereas for combined mutations a relative increase of replication efficiency was suggested. CONCLUSION: During boceprevir monotherapy, resistance mutations at six positions within the NS3 protease were detected by way of clonal sequence analysis. All mutations are associated with reduced replicative fitness estimated by mathematical modeling and show cross-resistance to telaprevir.
ABSTRACT: Variable region 1 (V1) of the SPRY domain of TRIM5alpha is a major determinant for species-specific virus restriction in primates. We previously reported that a chimeric TRIM5alpha containing baboon V1 in the background of cynomolgus monkey TRIM5alpha showed potent anti-human immunodeficiency virus type 2 (HIV-2) activity. Since baboons are reportedly sensitive to HIV-2 infection, there was a discrepancy between the ability of baboon TRIM5alpha V1 to restrict HIV-2 and baboon sensitivity to HIV-2. In the study presented here, we examined the roles of V2 and V3 of the baboon TRIM5alpha SPRY domain in its anti-HIV-2 activity. A chimeric TRIM5alpha containing the entire baboon SPRY domain showed weak anti-HIV-2 activity. This attenuation of activity was caused by a single serine-to-proline substitution in baboon TRIM5alpha V2. These findings indicate that the combination of V1 with other variable regions of SPRY is important in anti-HIV-2 activity of primate TRIM5alpha.