ABSTRACT: We present a prototype of a new structural classification of proteins, SCOP2 (http://scop2.mrc-lmb.cam.ac.uk/), that we have developed recently. SCOP2 is a successor to the Structural Classification of Proteins (SCOP, http://scop.mrc-lmb.cam.ac.uk/scop/) database. Similarly to SCOP, the main focus of SCOP2 is to organize structurally characterized proteins according to their structural and evolutionary relationships. SCOP2 was designed to provide a more advanced framework for protein structure annotation and classification. It defines a new approach to the classification of proteins that is essentially different from SCOP, but retains its best features. The SCOP2 classification is described in terms of a directed acyclic graph in which nodes form a complex network of many-to-many relationships and are represented by a region of protein structure and sequence. The new classification project is expected to ensure new advances in the field and open new areas of research.
Nucleic Acids Research 11/2013; · 8.28 Impact Factor
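The directed acyclic graph described in the SCOP2 abstract above can be illustrated with a minimal Python sketch. This is not SCOP2's actual data model: the class name `ClassificationGraph`, the node names, and the region strings are all hypothetical. It only shows the general idea that a node stands for a region of structure and sequence, that a node may have several parents (many-to-many relationships), and that cycles are rejected.

```python
class ClassificationGraph:
    """Toy DAG: each node is a protein region, edges point from child to parent."""

    def __init__(self):
        self.parents = {}  # node name -> set of parent node names
        self.regions = {}  # node name -> free-text region description (hypothetical)

    def add_node(self, name, region):
        self.regions[name] = region
        self.parents.setdefault(name, set())

    def add_relationship(self, child, parent):
        if child not in self.parents or parent not in self.parents:
            raise KeyError("both nodes must be added first")
        # Adding child -> parent closes a cycle if child is already above parent.
        if child == parent or child in self.ancestors(parent):
            raise ValueError("edge would create a cycle")
        self.parents[child].add(parent)

    def ancestors(self, name):
        """Return every node reachable by following parent edges upwards."""
        seen = set()
        stack = list(self.parents[name])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents[current])
        return seen


# Hypothetical example: one region related to two families at once.
graph = ClassificationGraph()
graph.add_node("region_A", "chain A, residues 1-120")
graph.add_node("family_1", "representative region of family 1")
graph.add_node("family_2", "representative region of family 2")
graph.add_node("superfamily_X", "representative region of superfamily X")
graph.add_relationship("region_A", "family_1")
graph.add_relationship("region_A", "family_2")
graph.add_relationship("family_1", "superfamily_X")
graph.add_relationship("family_2", "superfamily_X")
```

In a strict tree, `region_A` could belong to only one family; allowing a set of parents is what makes the relationships many-to-many.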
ABSTRACT: Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).
Nucleic Acids Research 11/2012; · 8.28 Impact Factor
ABSTRACT: During the past decade, the Protein Structure Initiative (PSI) centres have become major contributors of new families, superfamilies and folds to the Structural Classification of Proteins (SCOP) database. The PSI results have increased the diversity of protein structural space and accelerated our understanding of it. This review article surveys a selection of protein structures determined by the Joint Center for Structural Genomics (JCSG). It presents previously undescribed β-sheet architectures such as the double barrel and spiral β-roll and discusses new examples of unusual topologies and peculiar structural features observed in proteins characterized by the JCSG and other Structural Genomics centres.
Acta Crystallographica Section F Structural Biology and Crystallization Communications 10/2010; 66(Pt 10):1190-7. · 0.55 Impact Factor
ABSTRACT: The Structural Classification of Proteins (SCOP) database is a comprehensive ordering of all proteins of known structure, according to their evolutionary and structural relationships. The SCOP hierarchy comprises the following levels: Species, Protein, Family, Superfamily, Fold and Class. While keeping the original classification scheme intact, we have changed the production of SCOP in order to cope with a rapid growth of new structural data and to facilitate the discovery of new protein relationships. We describe ongoing developments and new features implemented in SCOP. A new update protocol supports batch classification of new protein structures by their detected relationships at Family and Superfamily levels in contrast to our previous sequential handling of new structural data by release date. We introduce pre-SCOP, a preview of the SCOP developmental version that enables earlier access to the information on new relationships. We also discuss the impact of worldwide Structural Genomics initiatives, which are producing new protein structures at an increasing rate, on the rates of discovery and growth of protein families and superfamilies. SCOP can be accessed at http://scop.mrc-lmb.cam.ac.uk/scop.
Nucleic Acids Research 02/2008; 36(Database issue):D419-25. · 8.28 Impact Factor
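The six SCOP levels named in the abstract above (Species, Protein, Family, Superfamily, Fold and Class) can be sketched as an ordered lineage. The following Python fragment is only an illustration under stated assumptions: the level names come from the abstract, listed here from the top of the hierarchy down, while the function names and the entry names such as `fold-1` are hypothetical placeholders rather than real SCOP entries or identifiers.

```python
# Level names from the abstract, ordered from the top of the hierarchy down.
SCOP_LEVELS = ["Class", "Fold", "Superfamily", "Family", "Protein", "Species"]


def is_valid_lineage(lineage):
    """A lineage is a list of (level, name) pairs covering every level in order."""
    return [level for level, _ in lineage] == SCOP_LEVELS


def deepest_shared_level(lineage_a, lineage_b):
    """Return the lowest level at which two lineages still share the same node."""
    deepest = None
    for (level_a, name_a), (level_b, name_b) in zip(lineage_a, lineage_b):
        if level_a != level_b or name_a != name_b:
            break
        deepest = level_a
    return deepest


# Hypothetical entries: same superfamily, different families.
entry_1 = [("Class", "class-1"), ("Fold", "fold-1"), ("Superfamily", "sf-1"),
           ("Family", "fam-1"), ("Protein", "prot-1"), ("Species", "sp-1")]
entry_2 = [("Class", "class-1"), ("Fold", "fold-1"), ("Superfamily", "sf-1"),
           ("Family", "fam-2"), ("Protein", "prot-2"), ("Species", "sp-2")]
```

Grouping new structures by the deepest level they share with existing entries is one way to picture the batch classification at Family and Superfamily levels that the abstract mentions, although the actual SCOP update protocol is not specified here.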
ABSTRACT: With the increasing amount of structural data, the number of homologous protein structures bearing topological irregularities is steadily growing. These include proteins with circular permutations, segment-swapping, context-dependent folding or chameleon sequences that can adopt alternative secondary structures. Their non-trivial structural relationships are readily identified during expert analysis but their automatic identification using the existing computational tools still remains difficult or impossible. Such non-trivial cases of protein relationships are known to pose a problem to multiple alignment algorithms and to impede comparative modeling studies. They support a new emerging concept of evolutionarily changeable protein fold, which creates practical difficulties for the hierarchical classifications of protein structures. To facilitate the understanding of, and to provide a comprehensive annotation of, proteins with such non-trivial structural relationships, we have created SISYPHUS (Σίσυφος, in Greek "crafty"), a compendium to the SCOP database. The SISYPHUS database contains a collection of manually curated structural alignments and their inter-relationships. The multiple alignments are constructed for protein structural regions that range from oligomeric biological units, or individual domains, to fragments of different size. The SISYPHUS multiple alignments are displayed with SPICE, a browser that provides an integrated view of protein sequences, structures and their annotations. The database is available from http://sisyphus.mrc-cpe.cam.ac.uk.
Nucleic Acids Research 02/2007; 35(Database issue):D253-9. · 8.28 Impact Factor
ABSTRACT: Understanding the molecular mechanisms of transition state regulator proteins is critical, since they play a pivotal role in the ability of bacteria to cope with changing environments. Although much effort has focused on their genetic characterization, little is known about their structural and functional conservation. Here we present the high resolution NMR solution structure of the N-terminal domain of the Bacillus subtilis transition state regulator Abh (AbhN), only the second such structure to date. We then compare AbhN to the N-terminal DNA-binding domain of B. subtilis AbrB (AbrBN). This is the first such comparison between two AbrB-like transition state regulators. AbhN and AbrBN are very similar, suggesting a common structural basis for their DNA binding. However, we also note subtle variances between the AbhN and AbrBN structures, which may play important roles in DNA target specificity. The results of accompanying in vitro DNA-binding studies serve to highlight binding differences between the two proteins.
Journal of Biological Chemistry 08/2006; 281(30):21399-409. · 4.65 Impact Factor
ABSTRACT: The functional requirement to form and maintain the active site structure probably exerts a strong selective pressure on a protein to adopt just one stable and evolutionarily conserved fold. Nonetheless, new evidence suggests the likelihood of protein fold being neither physically nor biologically invariant. Alternative folds discovered in several proteins are composed of constant and variable parts. The latter display context-dependent conformations and a tendency to form new oligomeric interfaces. In turn, oligomerisation mediates fold evolution without loss of protein function. Gene duplication breaks down homo-oligomeric symmetry and relieves the pressure to maintain the local architecture of redundant active sites; this can lead to further structural changes.
Current Opinion in Structural Biology 07/2006; 16(3):399-408. · 8.74 Impact Factor
ABSTRACT: Cellular metabolism constantly generates by-products that are wasteful or even harmful. Such compounds are excreted from the cell or are removed through hydrolysis to normal cellular metabolites by various 'house-cleaning' enzymes. Some of the most important contaminants are non-canonical nucleoside triphosphates (NTPs) whose incorporation into the nascent DNA leads to increased mutagenesis and DNA damage. Enzymes intercepting abnormal NTPs from incorporation by DNA polymerases work in parallel with DNA repair enzymes that remove lesions produced by modified nucleotides. House-cleaning NTP pyrophosphatases targeting non-canonical NTPs belong to at least four structural superfamilies: MutT-related (Nudix) hydrolases, dUTPase, ITPase (Maf/HAM1) and all-alpha NTP pyrophosphatases (MazG). These enzymes have high affinity (Km values in the micromolar range) for their natural substrates (8-oxo-dGTP, dUTP, dITP, 2-oxo-dATP), which allows them to select these substrates from a mixture containing an approximately 1000-fold excess of canonical NTPs. To date, many house-cleaning NTPases have been identified only on the basis of their side activity towards canonical NTPs and NDP derivatives. Integration of growing structural and biochemical data on these superfamilies suggests that their new family members cleanse the nucleotide pool of the products of oxidative damage and inappropriate methylation. House-cleaning enzymes, such as 6-phosphogluconolactonase, are also part of normal intermediary metabolism. Genomic data suggest that house-cleaning systems are more abundant than previously thought and include numerous analogous enzymes with overlapping functions. We discuss the structural diversity of these enzymes, their phylogenetic distribution, substrate specificity and the problem of identifying their true substrates.
ABSTRACT: New relationships found in the process of updating the structural classification of proteins (SCOP) database resulted in the revision of the structure of the N-terminal, DNA-binding domain of the transition state regulator AbrB. The dimeric AbrB domain shares a common fold with the addiction antidote MazE and the subunit of uncharacterized protein MraZ implicated in cell division and cell envelope formation. It has a detectable sequence similarity to both MazE and MraZ thus providing an evolutionary link between the two proteins. The putative DNA-binding site of AbrB is found on the same face as the DNA-binding site of MazE and appears similar, both in structure and sequence, to the exposed conserved region of MraZ. This strongly suggests that MraZ also binds DNA and allows for a consensus model of DNA recognition by the members of this novel protein superfamily.
ABSTRACT: We report here the structure of the putative chromo domain from MOF, a member of the MYST family of histone acetyltransferases that acetylates histone H4 at Lys-16 and is part of the dosage compensation complex in Drosophila. We found that the structure of this domain is a beta-barrel that is distinct from the alpha + beta fold of the canonical chromo domain. Despite the differences, there are similarities that support an evolutionary relationship between the two domains, and we propose the name "chromo barrel." The chromo barrel domains may be divided into two groups, MSL3-like and MOF-like, on the basis of whether a group of conserved aromatic residues is present or not. The structure suggests that, although the MOF-like domains may have a role in RNA binding, the MSL3-like domains could instead bind methylated residues. The MOF chromo barrel shares a common fold with other chromatin-associated modules, including the MBT-like repeat, Tudor, and PWWP domains. This structural similarity suggests a probable evolutionary pathway from these other modules to the canonical chromo domains (or vice versa) with the chromo barrel domain representing an intermediate structure.
Journal of Biological Chemistry 10/2005; 280(37):32326-31. · 4.65 Impact Factor
ABSTRACT: Structure-guided analysis of the new dimeric dUTPase family revealed its sequence relationship to the phage T4 dCTPase, phosphoribosyl-ATP pyrophosphatase HisE, NTP pyrophosphatase MazG, and several uncharacterized protein families, including the human protein XTP3TPA (RS21-C6), which is overexpressed in embryonic and cancer cells. Comparison with the recently determined structure of a MazG-like protein from Sulfolobus solfataricus supported the unification of these enzymes in one superfamily of all-alpha NTP pyrophosphatases, suggesting that dimeric dUTPases evolved from a tetrameric MazG-like ancestor by gene duplication. Analysis of the structure of the Sulfolobus MazG points to 2-hydroxyadenosine (isoguanosine) triphosphate, a product of oxidative damage of ATP, as the most likely substrate. We predict that uncharacterized members of this superfamily perform "house-cleaning" functions by hydrolyzing abnormal NTPs and are functionally analogous to the structurally unrelated hydrolases of the Nudix superfamily. We outline probable tertiary and quaternary structures of the all-alpha NTP pyrophosphatase superfamily members.
Journal of Molecular Biology 04/2005; 347(2):243-55. · 3.91 Impact Factor
ABSTRACT: The Structural Classification of Proteins (SCOP) database is a comprehensive ordering of all proteins of known structure, according to their evolutionary and structural relationships. Protein domains in SCOP are hierarchically classified into families, superfamilies, folds and classes. The continual accumulation of sequence and structural data allows more rigorous analysis and provides important information for understanding the protein world and its evolutionary repertoire. SCOP participates in a project that aims to rationalize and integrate the data on proteins held in several sequence and structure databases. As part of this project, starting with release 1.63, we have initiated a refinement of the SCOP classification, which introduces a number of changes mostly at the levels below superfamily. The pending SCOP reclassification will be carried out gradually through a number of future releases. In addition to the expanded set of static links to external resources, available at the level of domain entries, we have started modernization of the interface capabilities of SCOP allowing more dynamic links with other databases. SCOP can be accessed at http://scop.mrc-lmb.cam.ac.uk/scop.
Nucleic Acids Research 02/2004; 32(Database issue):D226-9. · 8.28 Impact Factor
ABSTRACT: Sex Comb on Midleg (SCM) belongs to the Polycomb group of proteins, which are involved in transcriptional regulation in Drosophila. It is one of the components of Polycomb repressive complex 1, a multiprotein complex of Polycomb group proteins involved in the maintenance of repression and the blocking of chromatin remodeling. SCM contains two approximately 100-residue malignant brain tumor (MBT) repeats at the N terminus. These repeats are also found in other proteins involved in transcriptional repression. Here, we report the 1.78-Å crystal structure of the two MBT repeats of SCM-like 2 (SCML2), a human homologue of SCM. Each repeat consists of an extended arm and a beta-barrel core. There are significant structural similarities to the Tudor, PWWP, and chromo domains, suggesting probable evolutionary relationships and functional similarities between the MBT repeats and these domains.
Journal of Biological Chemistry 12/2003; 278(47):46968-73. · 4.65 Impact Factor
ABSTRACT: Specific modifications to histones are essential epigenetic markers (heritable changes in gene expression that do not affect the DNA sequence). Methylation of lysine 9 in histone H3 is recognized by heterochromatin protein 1 (HP1), which directs the binding of other proteins to control chromatin structure and gene expression. Here we show that HP1 uses an induced-fit mechanism for recognition of this modification, as revealed by the structure of its chromodomain bound to a histone H3 peptide dimethylated at Nζ of lysine 9. The binding pocket for the N-methyl groups is provided by three aromatic side chains, Tyr21, Trp42 and Phe45, which reside in two regions that become ordered on binding of the peptide. The side chain of Lys9 is almost fully extended and surrounded by residues that are conserved in many other chromodomains. The QTAR peptide sequence preceding Lys9 makes most of the additional interactions with the chromodomain, with HP1 residues Val23, Leu40, Trp42, Leu58 and Cys60 appearing to be a major determinant of specificity by binding the key buried Ala7. These findings predict which other chromodomains will bind methylated proteins and suggest a motif that they recognize.
ABSTRACT: The SCOP (Structural Classification of Proteins) database is a comprehensive ordering of all proteins of known structure, according to their evolutionary and structural relationships. Protein domains in SCOP are grouped into species and hierarchically classified into families, superfamilies, folds and classes. Recently, we introduced a new set of features with the aim of standardizing access to the database, and providing a solid basis to manage the increasing number of experimental structures expected from structural genomics projects. These features include: a new set of identifiers, which uniquely identify each entry in the hierarchy; a compact representation of protein domain classification; a new set of parseable files, which fully describe all domains in SCOP and the hierarchy itself. These new features are reflected in the ASTRAL compendium. The SCOP search engine has also been updated, and a set of links to external resources added at the level of domain entries. SCOP can be accessed at http://scop.mrc-lmb.cam.ac.uk/scop.
Nucleic Acids Research 02/2002; 30(1):264-7. · 8.28 Impact Factor
ABSTRACT: The genome sequencing projects and knowledge of the entire protein repertoires of many organisms have prompted new procedures and techniques for the large-scale determination of protein structure, function and interactions. Recently, new work has been carried out on the determination of the function and evolutionary relationships of proteins by experimental structural genomics, and the discovery of protein-protein interactions by computational structural genomics.
Current Opinion in Structural Biology 07/2001; 11(3):354-63. · 8.74 Impact Factor