ABSTRACT: ATP binding cassette (ABC) transporters play critical roles in maintaining sterol balance in higher eukaryotes. The ABCG5/ABCG8 heterodimer (G5G8) mediates excretion of neutral sterols in liver and intestines. Mutations disrupting G5G8 cause sitosterolaemia, a disorder characterized by sterol accumulation and premature atherosclerosis. Here we use crystallization in lipid bilayers to determine the X-ray structure of human G5G8 in a nucleotide-free state at 3.9 Å resolution, generating the first atomic model of an ABC sterol transporter. The structure reveals a new transmembrane fold that is present in a large and functionally diverse superfamily of ABC transporters. The transmembrane domains are coupled to the nucleotide-binding sites by networks of interactions that differ between the active and inactive ATPases, reflecting the catalytic asymmetry of the transporter. The G5G8 structure provides a mechanistic framework for understanding sterol transport and the disruptive effects of mutations causing sitosterolaemia.
ABSTRACT: Cholesterol homeostasis is mediated by Scap, a polytopic ER protein that transports SREBPs from ER to Golgi where SREBPs are
processed to forms that activate cholesterol synthesis. Scap has eight transmembrane helices and two large luminal loops,
designated Loop1 and Loop7. We earlier provided indirect evidence that Loop1 binds to Loop7, allowing Scap to bind COPII proteins
for transport in coated vesicles. When ER cholesterol rises, it binds to Loop1. We hypothesized that this causes dissociation
from Loop7, abrogating COPII binding. Here, we demonstrate direct binding of the two loops when expressed as isolated fragments
or as a fusion protein. Expressed alone, Loop1 remained intracellular and membrane-bound. When Loop7 was co-expressed, it
bound to Loop1 and the soluble complex was secreted. A Loop1-Loop7 fusion protein was also secreted, and the two loops remained
bound when the linker between them was cleaved by a protease. Point mutations that disrupt the Loop1-Loop7 interaction prevented
secretion of the L1-L7 fusion protein. These data provide direct documentation of intramolecular Loop1-Loop7 binding, a central
event in cholesterol homeostasis.
ABSTRACT: We present an overview of contact-assisted predictions in the eleventh round of Critical Assessment of Protein Structure Prediction (CASP11), which included four categories: predicted contacts (Tp), correct contacts (Tc), simulated sparse NMR contacts (Ts), and cross-linking contacts (Tx). Comparison of assisted to unassisted model quality highlighted a relatively poor overall performance in CASP11 using predicted Tp and crosslinked Tx contact information. However, average model quality significantly improved in the correct Tc and simulated NMR Ts categories for most targets, where maximum improvement of unassisted models reached an impressive 70 GDT_TS. Comparison of the performance in the correct Tc category to CASP10 suggested the improvement in CASP11 model quality originated from an increased number of provided contacts per target. Group rankings based on a combination of scores used in the CASP11 free modeling (FM) assessment for each category highlight four top-performing groups, with three from the Lee lab and one from the Baker lab. We used the overall performance of these groups in each category to develop hypotheses for their relative outperformance in the correct Tc and simulated NMR Ts categories, which stemmed from the fraction of correct contacts provided (correct Tc category) and a reduced fraction of correct contacts offset by an increased coverage of the correct contacts (simulated NMR Ts category). This article is protected by copyright. All rights reserved.
Article · Feb 2016 · Proteins Structure Function and Bioinformatics
ABSTRACT: Proteins and their domains evolve by a set of events commonly including the duplication and divergence of small motifs. The presence of short repetitive regions in domains has generally constituted a difficult case for structural domain classifications and their hierarchies. We developed the Evolutionary Classification Of protein Domains (ECOD) in part to implement a new schema for the classification of these types of proteins. Here we document the ways in which ECOD classifies proteins with small internal repeats, widespread functional motifs, and assemblies of small domain-like fragments in its evolutionary schema. We illustrate the ways in which the structural genomics project impacted the classification and characterization of new structural domains and sequence families over the decade.
ABSTRACT: Protein target structures for the Critical Assessment of Structure Prediction round 11 (CASP11) and CASP ROLL were split into domains and classified into categories suitable for assessment of template-based modeling (TBM) and free modeling (FM) based on their evolutionary relatedness to existing structures classified by the Evolutionary Classification of Protein Domains (ECOD) database. First, target structures were divided into domain-based evaluation units. Target splits were based on the domain organization of available templates as well as the performance of servers on whole targets compared to split target domains. Second, evaluation units were classified into TBM and FM categories using a combination of measures that evaluate prediction quality and template detectability. Generally, target domains with sequence-related templates and good server prediction performance were classified as TBM, whereas targets without sequence-identifiable templates and low server performance were classified as FM. As in previous CASP experiments, the boundaries for classification were blurred due to the presence of significant insertions and deteriorations in the targets with respect to homologous templates, as well as the presence of templates with partial coverage of new folds. The FM category included 45 target domains, which represents an unprecedented number of difficult CASP targets provided for modeling.
Article · Jan 2016 · Proteins Structure Function and Bioinformatics
ABSTRACT: We present an assessment of 'template-free modeling' (FM) in CASP11 and ROLL. Community-wide server performance suggested the use of automated scores similar to previous CASPs would provide a good system of evaluating performance, even in the absence of comprehensive manual assessment. The CASP11 FM category included several outstanding examples, including successful prediction by the Baker group of a 256-residue target (T0806-D1) that lacked sequence similarity to any existing template. The top server model prediction by Zhang's Quark, which was apparently selected and refined by several manual groups, encompassed the entire fold of target T0837-D1. Methods from the same two groups tended to dominate overall CASP11 FM and ROLL rankings. Comparison of top FM predictions with those from the previous CASP experiment revealed progress in the category, particularly reflected in high prediction accuracy for larger protein domains. FM prediction models for two cases were sufficient to provide functional insights that were otherwise not obtainable by traditional sequence analysis methods. Importantly, CASP11 abstracts revealed that alignment-based contact prediction methods brought about much of the CASP11 progress, producing both of the functionally relevant models as well as several of the other outstanding structure predictions. These methodological advances enabled de novo modeling of much larger domain structures than was previously possible and allowed prediction of functional sites.
Article · Dec 2015 · Proteins Structure Function and Bioinformatics
ABSTRACT: The Cα-RMSD and GDT-TS calculations are over the full-length sequence. The total GREMLIN score for the model is reported. The most accurate models have the best GREMLIN score.
ABSTRACT: Detailed table containing all 131 large protein families. The complete lists of protein-coding genes from E. coli (ECOLI), B. subtilis (BACSU), Halobacterium salinarum (HALSA), and Sulfolobus solfataricus (SULSO), along with the number of sequences, are also provided.
ABSTRACT: The prediction of the structures of proteins without detectable sequence similarity to any protein of known structure remains an outstanding scientific challenge. Here we report significant progress in this area. We first describe de novo blind structure predictions of unprecedented accuracy we made for two proteins in large families in the recent CASP11 blind test of protein structure prediction methods by incorporating residue–residue co-evolution information in the Rosetta structure prediction program. We then describe the use of this method to generate structure models for 58 of the 121 large protein families in prokaryotes for which three-dimensional structures are not available. These models, which are posted online for public access, provide structural information for the over 400,000 proteins belonging to the 58 families and suggest hypotheses about mechanism for the subset for which the function is known, and hypotheses about function for the remainder.
ABSTRACT: The type VI secretion system (T6SS) is a widespread protein secretion apparatus used by Gram-negative bacteria to deliver toxic effector proteins into adjacent bacterial or host cells. Here, we uncovered a role in interbacterial competition for the two T6SSs encoded by the marine pathogen Vibrio alginolyticus. Using comparative proteomics and genetics, we identified their effector repertoires. In addition to the previously described effector V12G01_02265, we identified three new effectors secreted by T6SS1, indicating that the T6SS1 secretes at least four antibacterial effectors, of which three are members of the MIX-effector class. We also showed that the T6SS2 secretes at least three antibacterial effectors. Our findings revealed that many MIX-effectors belonging to clan V are "orphan" effectors that neighbor mobile elements and are shared between marine bacteria via horizontal gene transfer. We demonstrated that a MIX V-effector from V. alginolyticus is a functional T6SS effector when ectopically expressed in another Vibrio species. We propose that mobile MIX V-effectors serve as an environmental reservoir of T6SS effectors that are shared and used to diversify antibacterial toxin repertoires in marine bacteria, resulting in enhanced competitive fitness.
ABSTRACT: Inference of homology from protein sequences provides an essential tool for analyzing protein structure, function, and evolution. Current sequence-based homology search methods are still unable to detect many similarities evident from protein spatial structures. In computer science a search engine can be improved by considering networks of known relationships within the search database. Here, we apply this idea to protein-sequence-based homology search and show that it dramatically enhances the search accuracy. Our new method, COMPADRE (COmparison of Multiple Protein sequence Alignments using Database RElationships) assesses the relationship between the query sequence and a hit in the database by considering the similarity between the query and hit's known homologs. This approach increases detection quality, boosting the precision rate from 18% to 83% at half-coverage of all database homologs. The increased precision rate allows detection of a large fraction of protein structural relationships, thus providing structure and function predictions for previously uncharacterized proteins. Our results suggest that this general approach is applicable to a wide variety of methods for detection of biological similarities. The web server is available at prodata.swmed.edu/compadre.
Article · Jun 2015 · Proceedings of the National Academy of Sciences
ABSTRACT: Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or "fold"). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.
Article · Dec 2014 · PLoS Computational Biology