Alexey G Murzin

EMBL-EBI, Cambridge, England, United Kingdom

Publications (74) · 497.93 Total impact

  • ABSTRACT: CA_C2195 from Clostridium acetobutylicum is a protein of unknown function. Sequence analysis predicted that part of the protein contained a metallopeptidase-related domain. There are over 200 homologs of similar size in large sequence databases such as UniProt, with pairwise sequence identities in the range of ~40-60%. CA_C2195 was chosen for crystal structure determination for structure-based function annotation of novel protein sequence space. The structure confirmed that CA_C2195 contained an N-terminal metallopeptidase-like domain. The structure revealed two extra domains: an alpha+beta domain inserted in the metallopeptidase-like domain and a C-terminal circularly permuted winged-helix-turn-helix domain. Based on our sequence and structural analyses using the crystal structure of CA_C2195 we provide a view into the possible functions of the protein. From contextual information from gene-neighborhood analysis, we propose that rather than being a peptidase, CA_C2195 and its homologs might play a role in biosynthesis of a modified cell-surface carbohydrate in conjunction with several sugar-modification enzymes. These results provide the groundwork for the experimental verification of the function.
    BMC Bioinformatics 03/2014; 15(1):75. · 3.02 Impact Factor
  • Source
    ABSTRACT: We present a prototype of a new structural classification of proteins, SCOP2 (http://scop2.mrc-lmb.cam.ac.uk/), that we have developed recently. SCOP2 is a successor to the Structural Classification of Proteins (SCOP, http://scop.mrc-lmb.cam.ac.uk/scop/) database. Similarly to SCOP, the main focus of SCOP2 is to organize structurally characterized proteins according to their structural and evolutionary relationships. SCOP2 was designed to provide a more advanced framework for protein structure annotation and classification. It defines a new approach to the classification of proteins that is essentially different from SCOP, but retains its best features. The SCOP2 classification is described in terms of a directed acyclic graph in which nodes form a complex network of many-to-many relationships and are represented by a region of protein structure and sequence. The new classification project is expected to ensure new advances in the field and open new areas of research.
    Nucleic Acids Research 11/2013; · 8.28 Impact Factor
  • Source
    ABSTRACT: The NTF2-like superfamily is a versatile group of protein domains sharing a common fold. The sequences of these domains are very diverse and they share no common sequence motif. These domains serve a range of different functions within the proteins in which they are found, including both catalytic and non-catalytic versions. Clues to the function of protein domains belonging to such a diverse superfamily can be gleaned from analysis of the proteins and organisms in which they are found. Here we describe three protein domains of unknown function found mainly in bacteria: DUF3828, DUF3887 and DUF4878. Structures of representatives of each of these domains: BT_3511 from Bacteroides thetaiotaomicron (strain VPI-5482) [PDB:3KZT], Cj0202c from Campylobacter jejuni subsp. jejuni serotype O:2 (strain NCTC 11168) [PDB:3K7C] and RUMGNA_01855 from Ruminococcus gnavus (strain ATCC 29149) [PDB:4HYZ] have been solved by X-ray crystallography. All three domains are similar in structure and all belong to the NTF2-like superfamily. Although the function of these domains remains unknown at present, our analysis enables us to present a hypothesis concerning their role. Our analysis of these three protein domains suggests a potential non-catalytic ligand-binding role. This may regulate the activities of domains with which they are combined in the same polypeptide or via operonic linkages, such as signaling domains (e.g. serine/threonine protein kinase), peptidoglycan-processing hydrolases (e.g. NlpC/P60 peptidases) or nucleic acid binding domains (e.g. Zn-ribbons).
    BMC Bioinformatics 11/2013; 14(1):327. · 3.02 Impact Factor
  • Source
    ABSTRACT: Maf (for multicopy associated filamentation) proteins represent a large family of conserved proteins implicated in cell division arrest but whose biochemical activity remains unknown. Here, we show that the prokaryotic and eukaryotic Maf proteins exhibit nucleotide pyrophosphatase activity against 5-methyl-UTP, pseudo-UTP, 5-methyl-CTP, and 7-methyl-GTP, which represent the most abundant modified bases in all organisms, as well as against canonical nucleotides dTTP, UTP, and CTP. Overexpression of the Maf protein YhdE in E. coli cells increased intracellular levels of dTMP and UMP, confirming that dTTP and UTP are the in vivo substrates of this protein. Crystal structures and site-directed mutagenesis of Maf proteins revealed the determinants of their activity and substrate specificity. Thus, pyrophosphatase activity of Maf proteins toward canonical and modified nucleotides might provide the molecular mechanism for a dual role of these proteins in cell division arrest and house cleaning.
    Chemistry & biology 10/2013; · 6.52 Impact Factor
  • Source
    ABSTRACT: Every genome contains a large number of uncharacterized proteins that may encode entirely novel biological systems. Many of these uncharacterized proteins fall into related sequence families. By applying sequence and structural analysis we hope to provide insight into novel biology. We analyze a previously uncharacterized Pfam protein family called DUF4424 [Pfam:PF14415]. The recently solved three-dimensional structure of the protein lpg2210 from Legionella pneumophila provides the first structural information pertaining to this family. This protein additionally includes the first representative structure of another Pfam family called the YARHG domain [Pfam:PF13308]. The Pfam family DUF4424 adopts a 19-stranded beta-sandwich fold that shows similarity to the N-terminal domain of leukotriene A-4 hydrolase. The YARHG domain forms an all-helical domain at the C-terminus. Structure analysis allows us to recognize distant similarities between the DUF4424 domain and individual domains of M1 aminopeptidases and tricorn proteases, which form massive proteasome-like capsids in both archaea and bacteria. Based on our analyses we hypothesize that the DUF4424 domain may have a role in forming large, multi-component enzyme complexes. We suggest that the YARHG domain may play a role in binding a moiety in proximity with peptidoglycan, such as a hydrophobic outer membrane lipid or lipopolysaccharide.
    BMC Bioinformatics 09/2013; 14(1):265. · 3.02 Impact Factor
  • Source
    ABSTRACT: Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).
    Nucleic Acids Research 11/2012; · 8.28 Impact Factor
  • Source
    ABSTRACT: C-1 carriers are essential cofactors in all domains of life, and in Archaea, these can be derivatives of tetrahydromethanopterin (H(4)-MPT) or tetrahydrofolate (H(4)-folate). Their synthesis requires 6-hydroxymethyl-7,8-dihydropterin diphosphate (6-HMDP) as the precursor, but the nature of the pathways that lead to its formation was unknown until the recent discovery of the GTP cyclohydrolase IB/MptA family that catalyzes the first step, the conversion of GTP to dihydroneopterin 2',3'-cyclic phosphate or 7,8-dihydroneopterin triphosphate [El Yacoubi, B.; et al. (2006) J. Biol. Chem., 281, 37586-37593 and Grochowski, L. L.; et al. (2007) Biochemistry, 46, 6658-6667]. Using a combination of comparative genomics analyses, heterologous complementation tests, and in vitro assays, we show that the archaeal protein families COG2098 and COG1634 specify two of the missing 6-HMDP synthesis enzymes. Members of the COG2098 family catalyze the formation of 6-hydroxymethyl-7,8-dihydropterin from 7,8-dihydroneopterin, while members of the COG1634 family catalyze the formation of 6-HMDP from 6-hydroxymethyl-7,8-dihydropterin. The discovery of these missing genes solves a long-standing mystery and provides novel examples of convergent evolution where proteins of dissimilar architectures perform the same biochemical function.
    ACS Chemical Biology 08/2012; · 5.44 Impact Factor
  • Source
    ABSTRACT: The YgjD/Kae1 family (COG0533) has been on the top-10 list of universally conserved proteins of unknown function for over 5 years. It has been linked to DNA maintenance in bacteria and mitochondria and transcription regulation and telomere homeostasis in eukaryotes, but its actual function has never been found. Based on a comparative genomic and structural analysis, we predicted this family was involved in the biosynthesis of N(6)-threonylcarbamoyl adenosine, a universal modification found at position 37 of tRNAs decoding ANN codons. This was confirmed as a yeast mutant lacking Kae1 is devoid of t(6)A. t(6)A(-) strains were also used to reveal that t(6)A has a critical role in initiation codon restriction to AUG and in restricting frameshifting at tandem ANN codons. We also showed that YaeZ, a YgjD paralog, is required for YgjD function in vivo in bacteria. This work lays the foundation for understanding the pleiotropic role of this universal protein family.
    The EMBO Journal 02/2011; 30(5):882-93. · 9.82 Impact Factor
  • Source
    Antonina Andreeva, Alexey G Murzin
    ABSTRACT: During the past decade, the Protein Structure Initiative (PSI) centres have become major contributors of new families, superfamilies and folds to the Structural Classification of Proteins (SCOP) database. The PSI results have increased the diversity of protein structural space and accelerated our understanding of it. This review article surveys a selection of protein structures determined by the Joint Center for Structural Genomics (JCSG). It presents previously undescribed β-sheet architectures such as the double barrel and spiral β-roll and discusses new examples of unusual topologies and peculiar structural features observed in proteins characterized by the JCSG and other Structural Genomics centres.
    Acta Crystallographica Section F Structural Biology and Crystallization Communications 10/2010; 66(Pt 10):1190-7. · 0.55 Impact Factor
  • Source
    ABSTRACT: The crystal structure of a putative NTPase, YP_001813558.1 from Exiguobacterium sibiricum 255-15 (PF09934, DUF2166) was determined to 1.78 Å resolution. YP_001813558.1 and its homologs (dimeric dUTPases, MazG proteins and HisE-encoded phosphoribosyl ATP pyrophosphohydrolases) form a superfamily of all-α-helical NTP pyrophosphatases. In dimeric dUTPase-like proteins, a central four-helix bundle forms the active site. However, in YP_001813558.1, an unexpected intertwined swapping of two of the helices that compose the conserved helix bundle results in a `linked dimer' that has not previously been observed for this family. Interestingly, despite this novel mode of dimerization, the metal-binding site for divalent cations, such as magnesium, that are essential for NTPase activity is still conserved. Furthermore, the active-site residues that are involved in sugar binding of the NTPs are also conserved when compared with other α-helical NTPases, but those that recognize the nucleotide bases are not conserved, suggesting a different substrate specificity.
    Acta Crystallographica Section F Structural Biology and Crystallization Communications 10/2010; 66(Pt 10):1237-44. · 0.55 Impact Factor
  • Source
    Antonina Andreeva, Alexey G Murzin
    Proceedings of the National Academy of Sciences 01/2009; 105(52):E128-9. · 9.74 Impact Factor
  • Source
    ABSTRACT: We have identified a novel family of proteins, in which the N-terminal cystathionine beta-synthase (CBS) domain is fused to the C-terminal Zn ribbon domain. Four proteins were overexpressed in Escherichia coli and purified: TA0289 from Thermoplasma acidophilum, TV1335 from Thermoplasma volcanium, PF1953 from Pyrococcus furiosus, and PH0267 from Pyrococcus horikoshii. The purified proteins had a red/purple color in solution and an absorption spectrum typical of rubredoxins (Rds). Metal analysis of purified proteins revealed the presence of several metals, with iron and zinc being the most abundant metals (2-67% of iron and 12-74% of zinc). Crystal structures of both mercury- and iron-bound TA0289 (1.5-2.0 Å resolution) revealed a dimeric protein whose intersubunit contacts are formed exclusively by the alpha-helices of two cystathionine beta-synthase subdomains, whereas the C-terminal domain has a classical Zn ribbon planar architecture. All proteins were reversibly reduced by chemical reductants (ascorbate or dithionite) or by the general Rd reductase NorW from E. coli in the presence of NADH. Reduced TA0289 was found to be capable of transferring electrons to cytochrome C from horse heart. Likewise, the purified Zn ribbon protein KTI11 from Saccharomyces cerevisiae had a purple color in solution and an Rd-like absorption spectrum, contained both iron and zinc, and was reduced by the Rd reductase NorW from E. coli. Thus, recombinant Zn ribbon domains from archaea and yeast demonstrate an Rd-like electron carrier activity in vitro. We suggest that, in vivo, some Zn ribbon domains might also bind iron and therefore possess an electron carrier activity, adding another physiological role to this large family of important proteins.
    Journal of Molecular Biology 02/2008; 375(1):301-15. · 3.91 Impact Factor
  • Source
    ABSTRACT: The Structural Classification of Proteins (SCOP) database is a comprehensive ordering of all proteins of known structure, according to their evolutionary and structural relationships. The SCOP hierarchy comprises the following levels: Species, Protein, Family, Superfamily, Fold and Class. While keeping the original classification scheme intact, we have changed the production of SCOP in order to cope with a rapid growth of new structural data and to facilitate the discovery of new protein relationships. We describe ongoing developments and new features implemented in SCOP. A new update protocol supports batch classification of new protein structures by their detected relationships at Family and Superfamily levels in contrast to our previous sequential handling of new structural data by release date. We introduce pre-SCOP, a preview of the SCOP developmental version that enables earlier access to the information on new relationships. We also discuss the impact of worldwide Structural Genomics initiatives, which are producing new protein structures at an increasing rate, on the rates of discovery and growth of protein families and superfamilies. SCOP can be accessed at http://scop.mrc-lmb.cam.ac.uk/scop.
    Nucleic Acids Research 02/2008; 36(Database issue):D419-25. · 8.28 Impact Factor
  • Source
    ABSTRACT: With the increasing amount of structural data, the number of homologous protein structures bearing topological irregularities is steadily growing. These include proteins with circular permutations, segment-swapping, context-dependent folding or chameleon sequences that can adopt alternative secondary structures. Their non-trivial structural relationships are readily identified during expert analysis but their automatic identification using the existing computational tools still remains difficult or impossible. Such non-trivial cases of protein relationships are known to pose a problem to multiple alignment algorithms and to impede comparative modeling studies. They support a new emerging concept of evolutionarily changeable protein fold, which creates practical difficulties for the hierarchical classifications of protein structures. To facilitate the understanding of, and to provide a comprehensive annotation of, proteins with such non-trivial structural relationships we have created SISYPHUS (Σίσυφος, Greek for "crafty"), a compendium to the SCOP database. The SISYPHUS database contains a collection of manually curated structural alignments and their inter-relationships. The multiple alignments are constructed for protein structural regions that range from oligomeric biological units, or individual domains to fragments of different size. The SISYPHUS multiple alignments are displayed with SPICE, a browser that provides an integrated view of protein sequences, structures and their annotations. The database is available from http://sisyphus.mrc-cpe.cam.ac.uk.
    Nucleic Acids Research 02/2007; 35(Database issue):D253-9. · 8.28 Impact Factor
  • Source
    ABSTRACT: Understanding the molecular mechanisms of transition state regulator proteins is critical, since they play a pivotal role in the ability of bacteria to cope with changing environments. Although much effort has focused on their genetic characterization, little is known about their structural and functional conservation. Here we present the high resolution NMR solution structure of the N-terminal domain of the Bacillus subtilis transition state regulator Abh (AbhN), only the second such structure to date. We then compare AbhN to the N-terminal DNA-binding domain of B. subtilis AbrB (AbrBN). This is the first such comparison between two AbrB-like transition state regulators. AbhN and AbrBN are very similar, suggesting a common structural basis for their DNA binding. However, we also note subtle variances between the AbhN and AbrBN structures, which may play important roles in DNA target specificity. The results of accompanying in vitro DNA-binding studies serve to highlight binding differences between the two proteins.
    Journal of Biological Chemistry 08/2006; 281(30):21399-409. · 4.65 Impact Factor
  • Source
    Antonina Andreeva, Alexey G Murzin
    ABSTRACT: The functional requirement to form and maintain the active site structure probably exerts a strong selective pressure on a protein to adopt just one stable and evolutionarily conserved fold. Nonetheless, new evidence suggests the likelihood of protein fold being neither physically nor biologically invariant. Alternative folds discovered in several proteins are composed of constant and variable parts. The latter display context-dependent conformations and a tendency to form new oligomeric interfaces. In turn, oligomerisation mediates fold evolution without loss of protein function. Gene duplication breaks down homo-oligomeric symmetry and relieves the pressure to maintain the local architecture of redundant active sites; this can lead to further structural changes.
    Current Opinion in Structural Biology 07/2006; 16(3):399-408. · 8.74 Impact Factor
  • Source
    ABSTRACT: The culturability of several actinobacteria is controlled by resuscitation-promoting factors (Rpfs). These are proteins containing a c. 70-residue domain that adopts a lysozyme-like fold. The invariant catalytic glutamate residue found in lysozyme and various bacterial lytic transglycosylases is also conserved in the Rpf proteins. Rpf from Micrococcus luteus, the founder member of this protein family, is indeed a muralytic enzyme, as revealed by its activity in zymograms containing M. luteus cell walls and its ability to (i) cause lysis of Escherichia coli when expressed and secreted into the periplasm; (ii) release fluorescent material from fluorescamine-labelled cell walls of M. luteus; and (iii) hydrolyse the artificial lysozyme substrate, 4-methylumbelliferyl-beta-D-N,N',N''-triacetylchitotrioside. Rpf activity was reduced but not completely abolished when the invariant glutamate residue was altered. Moreover, none of the other acidic residues in the Rpf domain was absolutely required for muralytic activity. Replacement of one or both of the cysteine residues that probably form a disulphide bridge within Rpf impaired but did not completely abolish muralytic activity. The muralytic activities of the Rpf mutants were correlated with their abilities to stimulate bacterial culturability and resuscitation, consistent with the view that the biological activity of Rpf results directly or indirectly from its ability to cleave bonds in bacterial peptidoglycan.
    Molecular Microbiology 02/2006; 59(1):84-98. · 4.96 Impact Factor
  • ABSTRACT: Cellular metabolism constantly generates by-products that are wasteful or even harmful. Such compounds are excreted from the cell or are removed through hydrolysis to normal cellular metabolites by various 'house-cleaning' enzymes. Some of the most important contaminants are non-canonical nucleoside triphosphates (NTPs) whose incorporation into the nascent DNA leads to increased mutagenesis and DNA damage. Enzymes intercepting abnormal NTPs from incorporation by DNA polymerases work in parallel with DNA repair enzymes that remove lesions produced by modified nucleotides. House-cleaning NTP pyrophosphatases targeting non-canonical NTPs belong to at least four structural superfamilies: MutT-related (Nudix) hydrolases, dUTPase, ITPase (Maf/HAM1) and all-alpha NTP pyrophosphatases (MazG). These enzymes have high affinity (Km's in the micromolar range) for their natural substrates (8-oxo-dGTP, dUTP, dITP, 2-oxo-dATP), which allows them to select these substrates from a mixture containing an approximately 1000-fold excess of canonical NTPs. To date, many house-cleaning NTPases have been identified only on the basis of their side activity towards canonical NTPs and NDP derivatives. Integration of growing structural and biochemical data on these superfamilies suggests that their new family members cleanse the nucleotide pool of the products of oxidative damage and inappropriate methylation. House-cleaning enzymes, such as 6-phosphogluconolactonase, are also part of normal intermediary metabolism. Genomic data suggest that house-cleaning systems are more abundant than previously thought and include numerous analogous enzymes with overlapping functions. We discuss the structural diversity of these enzymes, their phylogenetic distribution, substrate specificity and the problem of identifying their true substrates.
    Molecular Microbiology 02/2006; 59(1):5-19. · 4.96 Impact Factor
  • Source
    ABSTRACT: New relationships found in the process of updating the structural classification of proteins (SCOP) database resulted in the revision of the structure of the N-terminal, DNA-binding domain of the transition state regulator AbrB. The dimeric AbrB domain shares a common fold with the addiction antidote MazE and the subunit of uncharacterized protein MraZ implicated in cell division and cell envelope formation. It has a detectable sequence similarity to both MazE and MraZ thus providing an evolutionary link between the two proteins. The putative DNA-binding site of AbrB is found on the same face as the DNA-binding site of MazE and appears similar, both in structure and sequence, to the exposed conserved region of MraZ. This strongly suggests that MraZ also binds DNA and allows for a consensus model of DNA recognition by the members of this novel protein superfamily.
    FEBS Letters 11/2005; 579(25):5669-74. · 3.58 Impact Factor

Publication Stats

9k Citations
497.93 Total Impact Points

Institutions

  • 2013
    • EMBL-EBI
      Cambridge, England, United Kingdom
  • 1998–2007
    • University of Cambridge
      • Department of Biochemistry
      Cambridge, England, United Kingdom
    • Trust Sanger Institute Genome Research Ltd.
      Cambridge, England, United Kingdom
  • 2006
    • National Institutes of Health
      • National Center for Biotechnology Information
      Bethesda, MD, United States
  • 2005
    • North Carolina State University
      • Department of Molecular and Structural Biochemistry
      Raleigh, NC, United States
  • 2002–2004
    • University of Texas at Austin
      • Division of Medicinal Chemistry
      • Department of Chemistry and Biochemistry
      Austin, TX, United States
  • 2001
    • University College London
      • Department of Structural and Molecular Biology
      London, England, United Kingdom
  • 1996–2001
    • Medical Research Council (UK)
      London, England, United Kingdom
  • 1993
    • Russian Academy of Sciences
      • Institute of Protein Research
      Moscow, Moscow, Russia