Kira S Makarova

Oregon State University, Corvallis, Oregon, United States

Publications (127) · 824.51 Total impact

  •
    ABSTRACT: The CRISPR-Cas systems of archaeal and bacterial adaptive immunity are classified into three types that differ by the repertoires of CRISPR-associated (cas) genes, the organization of cas operons and the structure of repeats in the CRISPR arrays. The simplest among the CRISPR-Cas systems is type II in which the endonuclease activities required for the interference with foreign deoxyribonucleic acid (DNA) are concentrated in a single multidomain protein, Cas9, and are guided by a co-processed dual-tracrRNA:crRNA molecule. This compact enzymatic machinery and readily programmable site-specific DNA targeting make type II systems top candidates for a new generation of powerful tools for genomic engineering. Here we report an updated census of CRISPR-Cas systems in bacterial and archaeal genomes. Type II systems are the rarest, missing in archaea, and represented in ∼5% of bacterial genomes, with an over-representation among pathogens and commensals. Phylogenomic analysis suggests that at least three cas genes, cas1, cas2 and cas4, and the CRISPR repeats of the type II-B system were acquired via recombination with a type I CRISPR-Cas locus. Distant homologs of Cas9 were identified among proteins encoded by diverse transposons, suggesting that type II CRISPR-Cas evolved via recombination of mobile nuclease genes with type I loci.
    Nucleic Acids Research 04/2014; · 8.28 Impact Factor
  • Kira S Makarova, Yuri I Wolf, Eugene V Koonin
    ABSTRACT: CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-associated) is an adaptive immunity system in bacteria and archaea that functions via a distinct self/non-self recognition mechanism that involves unique spacers homologous with viral or plasmid DNA and integrated into the CRISPR loci. Most of the Cas proteins evolve under relaxed purifying selection and some underwent dramatic structural rearrangements during evolution. In many cases, CRISPR-Cas system components are replaced either by homologous or by analogous proteins or domains in some bacterial and archaeal lineages. However, recent advances in comparative sequence analysis, structural studies and experimental data suggest that, despite this remarkable evolutionary plasticity, all CRISPR-Cas systems employ the same architectural and functional principles, and given the conservation of the principal building blocks, share a common ancestry. We review recent advances in the understanding of the evolution and organization of CRISPR-Cas systems. Among other developments, we describe for the first time a group of archaeal cas1 gene homologues that are not associated with CRISPR-Cas loci and are predicted to be involved in functions other than adaptive immunity.
    Biochemical Society Transactions 12/2013; 41(6):1392-1400. · 2.59 Impact Factor
  • Source
    ABSTRACT: The CRISPR-Cas-derived RNA-guided Cas9 endonuclease is the key element of an emerging promising technology for genome engineering in a broad range of cells and organisms. The DNA-targeting mechanism of the type II CRISPR-Cas system involves maturation of tracrRNA:crRNA duplex (dual-RNA), which directs Cas9 to cleave invading DNA in a sequence-specific manner, dependent on the presence of a Protospacer Adjacent Motif (PAM) on the target. We show that evolution of dual-RNA and Cas9 in bacteria produced remarkable sequence diversity. We selected eight representatives of phylogenetically defined type II CRISPR-Cas groups to analyze possible coevolution of Cas9 and dual-RNA. We demonstrate that these two components are interchangeable only between closely related type II systems when the PAM sequence is adjusted to the investigated Cas9 protein. Comparison of the taxonomy of bacterial species that harbor type II CRISPR-Cas systems with the Cas9 phylogeny corroborates horizontal transfer of the CRISPR-Cas loci. The reported collection of dual-RNA:Cas9 with associated PAMs expands the possibilities for multiplex genome editing and could provide means to improve the specificity of the RNA-programmable Cas9 tool.
    Nucleic Acids Research 11/2013; · 8.28 Impact Factor
  • Source
    ABSTRACT: Clustering of functionally related genes in operons allows for coregulated gene expression in prokaryotes. This is advantageous when equal amounts of gene products are required. Production of protein complexes with an uneven stoichiometry, however, requires tuning mechanisms to generate subunits in appropriate relative quantities. Using comparative genomic analysis, we show that differential translation is a key determinant of modulated expression of genes clustered in operons and that codon bias generally is the best in silico indicator of unequal protein production. Variable ribosome density profiles of polycistronic transcripts correlate strongly with differential translation patterns. In addition, we provide experimental evidence that de novo initiation of translation can occur at intercistronic sites, allowing for differential translation of any gene irrespective of its position on a polycistronic messenger. Thus, modulation of translation efficiency appears to be a universal mode of control in bacteria and archaea that allows for differential production of operon-encoded proteins.
    Cell Reports 09/2013;
  •
    ABSTRACT: To characterize the mechanism through which myosin XI-K attaches to its principal endomembrane cargo, a yeast two-hybrid library of Arabidopsis thaliana cDNAs was screened using the myosin cargo binding domain as bait. This screen identified two previously uncharacterized transmembrane proteins (hereinafter myosin binding proteins or MyoB1/2) that share a myosin binding, conserved domain of unknown function 593 (DUF593). Additional screens revealed that MyoB1/2 also bind myosin XI-1, whereas myosin XI-I interacts with the distantly related MyoB7. The in vivo interactions of MyoB1/2 with myosin XI-K were confirmed by immunoprecipitation and colocalization analyses. In epidermal cells, the yellow fluorescent protein-tagged MyoB1/2 localize to vesicles that traffic in a myosin XI-dependent manner. Similar to myosin XI-K, MyoB1/2 accumulate in the tip-growing domain of elongating root hairs. Gene knockout analysis demonstrated that functional cooperation between myosin XI-K and MyoB proteins is required for proper plant development. Unexpectedly, the MyoB1-containing vesicles did not correspond to brefeldin A-sensitive Golgi and post-Golgi or prevacuolar compartments and did not colocalize with known exocytic or endosomal compartments. Phylogenomic analysis suggests that DUF593 emerged in primitive land plants and founded a multigene family that is conserved in all flowering plants. Collectively, these findings indicate that MyoB are membrane-anchored myosin receptors that define a distinct, plant-specific transport vesicle compartment.
    The Plant Cell 08/2013; · 9.25 Impact Factor
  • Kira S Makarova, Eugene V Koonin
    ABSTRACT: Recent advances in the characterization of the archaeal DNA replication system together with comparative genomic analysis have led to the identification of several previously uncharacterized archaeal proteins involved in replication and currently reveal a nearly complete correspondence between the components of the archaeal and eukaryotic replication machineries. It can be inferred that the archaeal ancestor of eukaryotes and even the last common ancestor of all extant archaea possessed replication machineries that were comparable in complexity to the eukaryotic replication system. The eukaryotic replication system encompasses multiple paralogs of ancestral components such that heteromeric complexes in eukaryotes replace archaeal homomeric complexes, apparently along with subfunctionalization of the eukaryotic complex subunits. In the archaea, parallel, lineage-specific duplications of many genes encoding replication machinery components are detectable as well; most of these archaeal paralogs remain to be functionally characterized. The archaeal replication system shows remarkable plasticity whereby even some essential components such as DNA polymerase and single-stranded DNA-binding protein are displaced by unrelated proteins with analogous activities in some lineages.
    Cold Spring Harbor Perspectives in Biology 07/2013; · 9.63 Impact Factor
  • Source
    ABSTRACT: BACKGROUND: The major role of enzymatic toxins that target nucleic acids in biological conflicts at all levels has become increasingly apparent thanks in large part to the advances of comparative genomics. Typically, toxins evolve rapidly, hampering the identification of these proteins by sequence analysis. Here we analyze an unexpectedly widespread superfamily of toxin domains most of which possess RNase activity. RESULTS: The HEPN superfamily is comprised of all alpha-helical domains that were first identified as being associated with DNA polymerase beta-type nucleotidyltransferases in prokaryotes and animal Sacsin proteins. Using sensitive sequence and structure comparison methods, we vastly extend the HEPN superfamily by identifying numerous novel families and by detecting diverged HEPN domains in several known protein families. The new HEPN families include the RNase LS and LsoA catalytic domains, KEN domains (e.g. RNaseL and Ire1) and the RNase domains of RloC and PrrC. The majority of HEPN domains contain conserved motifs that constitute a metal-independent endoRNase active site. Some HEPN domains lacking this motif probably function as non-catalytic RNA-binding domains, such as in the case of the mannitol repressor MtlR. Our analysis shows that HEPN domains function as toxins that are shared by numerous systems implicated in intra-genomic, inter-genomic and intra-organismal conflicts across the three domains of cellular life. In prokaryotes HEPN domains are essential components of numerous toxin-antitoxin (TA) and abortive infection (Abi) systems and in addition are tightly associated with many restriction-modification (R-M) and CRISPR-Cas systems, and occasionally with other defense systems such as Pgl and Ter. We present evidence of multiple modes of action of HEPN domains in these systems, which include direct attack on viral RNAs (e.g. LsoA and RNase LS) in conjunction with other RNase domains (e.g. a novel RNase H fold domain, NamA), suicidal or dormancy-inducing attack on self RNAs (RM systems and possibly CRISPR-Cas systems), and suicidal attack coupled with direct interaction with phage components (Abi systems). These findings are compatible with the hypothesis on coupling of pathogen-targeting (immunity) and self-directed (programmed cell death and dormancy induction) responses in the evolution of robust antiviral strategies. We propose that altruistic cell suicide mediated by HEPN domains and other functionally similar RNases was essential for the evolution of kin and group selection and cell cooperation. HEPN domains were repeatedly acquired by eukaryotes and incorporated into several core functions such as endonucleolytic processing of the 5.8S-25S/28S rRNA precursor (Las1), a novel ER membrane-associated RNA degradation system (C6orf70), and sensing of unprocessed transcripts at the nuclear periphery (Swt1). Multiple lines of evidence suggest that, similar to prokaryotes, HEPN proteins were recruited to antiviral, antitransposon, apoptotic systems or RNA-level response to unfolded proteins (Sacsin and KEN domains) in several groups of eukaryotes. CONCLUSIONS: Extensive sequence and structure comparisons reveal unexpectedly broad presence of the HEPN domain in an enormous variety of defense and stress response systems across the tree of life. In addition, HEPN domains have been recruited to perform essential functions, in particular in eukaryotic rRNA processing. These findings are expected to stimulate experiments that could shed light on diverse cellular processes across the three domains of life. Reviewers: This article was reviewed by Martijn Huynen, Igor Zhulin and Nick Grishin.
    Biology Direct 06/2013; 8(1):15. · 2.72 Impact Factor
  • Source
    ABSTRACT: BACKGROUND: A single cultured marine organism, Nanoarchaeum equitans, represents the Nanoarchaeota branch of symbiotic Archaea, with a highly reduced genome and unusual features such as multiple split genes. RESULTS: The first terrestrial hyperthermophilic member of the Nanoarchaeota was collected from Obsidian Pool, a thermal feature in Yellowstone National Park, separated by single cell isolation, and sequenced together with its putative host, a Sulfolobales archaeon. Both the new Nanoarchaeota (Nst1) and N. equitans lack most biosynthetic capabilities, and phylogenetic analysis of ribosomal RNA and protein sequences indicates that the two form a deep-branching archaeal lineage. However, the Nst1 genome is more than 20% larger, and encodes a complete gluconeogenesis pathway as well as the full complement of archaeal flagellum proteins. With a larger genome, a smaller repertoire of split protein encoding genes and no split non-contiguous tRNAs, Nst1 appears to have experienced less severe genome reduction than N. equitans. These findings imply that, rather than representing ancestral characters, the extremely compact genomes and multiple split genes of Nanoarchaeota are derived characters associated with their symbiotic or parasitic lifestyle. The inferred host of Nst1 is potentially autotrophic, with a streamlined genome and simplified central and energetic metabolism as compared to other Sulfolobales. CONCLUSIONS: Comparison of the N. equitans and Nst1 genomes suggests that the marine and terrestrial lineages of Nanoarchaeota share a common ancestor that was already a symbiont of another archaeon. The two distinct Nanoarchaeota-host genomic data sets offer novel insights into the evolution of archaeal symbiosis and parasitism, enabling further studies of the cellular and molecular mechanisms of these relationships. Reviewers: This article was reviewed by Patrick Forterre, Bettina Siebers (nominated by Michael Galperin) and Purificación López-García.
    Biology Direct 04/2013; 8(1):9. · 2.72 Impact Factor
  • Source
    Kira S Makarova, Yuri I Wolf, Eugene V Koonin
    ABSTRACT: Our knowledge of prokaryotic defense systems has vastly expanded as the result of comparative genomic analysis, followed by experimental validation. This expansion is both quantitative, including the discovery of diverse new examples of known types of defense systems, such as restriction-modification or toxin-antitoxin systems, and qualitative, including the discovery of fundamentally new defense mechanisms, such as the CRISPR-Cas immunity system. Large-scale statistical analysis reveals that the distribution of different defense systems in bacterial and archaeal taxa is non-uniform, with four groups of organisms distinguishable with respect to the overall abundance and the balance between specific types of defense systems. The genes encoding defense system components in bacteria and archaea typically cluster in defense islands. In addition to genes encoding known defense systems, these islands contain numerous uncharacterized genes, which are candidates for new types of defense systems. The tight association of the genes encoding immunity systems and dormancy- or cell death-inducing defense systems in prokaryotic genomes suggests that these two major types of defense are functionally coupled, providing for effective protection at the population level.
    Nucleic Acids Research 03/2013; · 8.28 Impact Factor
  • Eugene V Koonin, Kira S Makarova
    ABSTRACT: The CRISPR-Cas (clustered regularly interspaced short palindromic repeats, CRISPR-associated genes) is an adaptive immunity system in bacteria and archaea that functions via a distinct self-non-self recognition mechanism that is partially analogous to the mechanism of eukaryotic RNA interference (RNAi). The CRISPR-Cas system incorporates fragments of virus or plasmid DNA into the CRISPR repeat cassettes and employs the processed transcripts of these spacers as guide RNAs to cleave the cognate foreign DNA or RNA. The Cas proteins, however, are not homologous to the proteins involved in RNAi and comprise numerous, highly diverged families. The majority of the Cas proteins contain diverse variants of the RNA recognition motif (RRM), a widespread RNA-binding domain. Despite the fast evolution that is typical of the cas genes, the presence of diverse versions of the RRM in most Cas proteins provides for a simple scenario for the evolution of the three distinct types of CRISPR-cas systems. In addition to several proteins that are directly implicated in the immune response, the cas genes encode a variety of proteins that are homologous to prokaryotic toxins that typically possess nuclease activity. The predicted toxins associated with CRISPR-Cas systems include the essential Cas2 protein, proteins of COG1517 that, in addition to a ligand-binding domain and a helix-turn-helix domain, typically contain different nuclease domains and several other predicted nucleases. The tight association of the CRISPR-Cas immunity systems with predicted toxins that, upon activation, would induce dormancy or cell death suggests that adaptive immunity and dormancy/suicide response are functionally coupled. Such coupling could manifest in the persistence state being induced and potentially providing conditions for more effective action of the immune system or in cell death being triggered when immunity fails.
    RNA Biology 02/2013; 10(5). · 5.56 Impact Factor
  • Source
    ABSTRACT: BACKGROUND: Collections of Clusters of Orthologous Genes (COGs) provide indispensable tools for comparative genomic analysis, evolutionary reconstruction and functional annotation of new genomes. Initially, COGs were made for all complete genomes of cellular life forms that were available at the time. However, with the accumulation of thousands of complete genomes, construction of a comprehensive COG set has become extremely computationally demanding and prone to error propagation, necessitating the switch to taxon-specific COG collections. Previously, we reported the collection of COGs for 41 genomes of Archaea (arCOGs). Here we present a major update of the arCOGs and describe evolutionary reconstructions to reveal general trends in the evolution of Archaea. RESULTS: The updated version of the arCOG database incorporates 91% of the pangenome of 120 archaea (251,032 protein-coding genes altogether) into 10,335 arCOGs. Using this new set of arCOGs, we performed maximum likelihood reconstruction of the genome content of archaeal ancestral forms and gene gain and loss events in archaeal evolution. This reconstruction shows that the last Common Ancestor of the extant Archaea was an organism of greater complexity than most of the extant archaea, probably with over 2,500 protein-coding genes. The subsequent evolution of almost all archaeal lineages was apparently dominated by gene loss resulting in genome streamlining. Overall, in the evolution of Archaea as well as a representative set of bacteria that was similarly analyzed for comparison, gene losses are estimated to outnumber gene gains at least 4 to 1. Analysis of specific patterns of gene gain in Archaea shows that, although some groups, in particular Halobacteria, acquire substantially more genes than others, on the whole, gene exchange between major groups of Archaea appears to be largely random, with no major 'highways' of horizontal gene transfer. 
    CONCLUSIONS: The updated collection of arCOGs is expected to become a key resource for comparative genomics, evolutionary reconstruction and functional annotation of new archaeal genomes. Given that, in spite of the major increase in the number of genomes, the conserved core of archaeal genes appears to be stabilizing, the major evolutionary trends revealed here have a chance to stand the test of time. Reviewers: This article was reviewed by (for complete reviews see the Reviewers' Reports section): Dr. PLG, Prof. PF, Dr. PL (nominated by Prof. JPG).
    Biology Direct 12/2012; 7(1):46. · 2.72 Impact Factor
  • Source
    ABSTRACT: BACKGROUND: The virus-host arms race is a major theater for evolutionary innovation. Archaea and bacteria have evolved diverse, elaborate antivirus defense systems that function on two general principles: i) immune systems that discriminate self DNA from nonself DNA and specifically destroy the foreign, in particular viral, genomes, whereas the host genome is protected, or ii) programmed cell suicide or dormancy induced by infection. PRESENTATION OF THE HYPOTHESIS: Almost all genomic loci encoding immunity systems such as CRISPR-Cas, restriction-modification and DNA phosphorothioation also encompass suicide genes, in particular those encoding known and predicted toxin nucleases, which do not appear to be directly involved in immunity. In contrast, the immunity systems do not appear to encode antitoxins found in typical toxin-antitoxin systems. This raises the possibility that components of the immunity system themselves act as reversible inhibitors of the associated toxin proteins or domains as has been demonstrated for the Escherichia coli anticodon nuclease PrrC that interacts with the PrrI restriction-modification system. We hypothesize that coupling of diverse immunity and suicide/dormancy systems in prokaryotes evolved under selective pressure to provide robustness to the antivirus response. We further propose that the involvement of suicide/dormancy systems in the coupled antivirus response could take two distinct forms: 1) induction of a dormancy-like state in the infected cell to 'buy time' for activation of adaptive immunity; 2) suicide or dormancy as the final recourse to prevent viral spread triggered by the failure of immunity. TESTING THE HYPOTHESIS: This hypothesis entails many experimentally testable predictions. Specifically, we predict that Cas2 protein present in all cas operons is an mRNA-cleaving nuclease (interferase) that might be activated at an early stage of virus infection to enable incorporation of virus-specific spacers into the CRISPR locus or to trigger cell suicide when the immune function of CRISPR-Cas systems fails. Similarly, toxin-like activity is predicted for components of numerous other defense loci. IMPLICATIONS OF THE HYPOTHESIS: The hypothesis implies that antivirus response in prokaryotes involves key decision-making steps at which the cell chooses the path to follow by sensing the course of virus infection. Reviewers: This article was reviewed by Arcady Mushegian, Etienne Joly and Nick Grishin. For complete reviews, go to the Reviewers' reports section.
    Biology Direct 11/2012; 7(1):40. · 2.72 Impact Factor
  • Source
    ABSTRACT: Escherichia coli bacteriophage T7 is a founding member of a large clade of podoviruses encoding a single-subunit RNA polymerase (RNAP). Phages of the family rely on host RNAP for transcription of early viral genes; viral RNAP transcribes non-early viral genes. T7 and its close relatives encode an inhibitor of host RNAP, the gp2 protein. Gp2 is essential for phage development and ensures that host RNAP does not interfere with viral RNAP transcription at late stages of infection. Here, we identify host RNAP inhibitors encoded by a subset of T7 clade phages related to ϕKMV phage of Pseudomonas aeruginosa. We demonstrate that these proteins are functionally identical to T7 gp2 in vivo and in vitro. The ability of some Pseudomonas phage gp2-like proteins to inhibit RNAP is modulated by N-terminal domains, which are absent from the T7 phage homolog. This finding indicates that Pseudomonas phages may use external or internal cues to initiate inhibition of host RNAP transcription and that gp2-like proteins from these phages may be receptors of these cues.
    Virology 11/2012; · 3.35 Impact Factor
  • Source
    ABSTRACT: Microcin C (McC) is a heptapeptide adenylate antibiotic produced by Escherichia coli strains carrying the mccABCDEF gene cluster encoding enzymes, in addition to the heptapeptide structural gene mccA, necessary for McC biosynthesis and self-immunity of the producing cell. The heptapeptide facilitates McC transport into susceptible cells, where it is processed releasing a non-hydrolyzable aminoacyl adenylate that inhibits an essential aminoacyl-tRNA synthetase. The self-immunity gene mccF encodes a specialized serine peptidase that cleaves an amide bond connecting the peptidyl or aminoacyl moieties of, respectively, intact and processed McC with the nucleotidyl moiety. Most mccF orthologs from organisms other than E. coli are not linked to the McC biosynthesis gene cluster. Here, we show that a protein product of one such gene, MccF from Bacillus anthracis (BaMccF), is able to cleave intact and processed McC, and we present a series of structures of this protein. Structural analysis of apo-BaMccF and its adenosine monophosphate complex reveals specific features of MccF-like peptidases that allow them to interact with substrates containing nucleotidyl moieties. Sequence analyses and phylogenetic reconstructions suggest that several distinct subfamilies form the MccF clade of the large S66 family of bacterial serine peptidases. We show that various representatives of the MccF clade can specifically detoxify non-hydrolyzable aminoacyl adenylates differing in their aminoacyl moieties. We hypothesize that bacterial mccF genes serve as a source of bacterial antibiotic resistance.
    Journal of Molecular Biology 04/2012; 420(4-5):366-83. · 3.91 Impact Factor
  • Source
    ABSTRACT: Microcin C (McC) is a heptapeptide-adenylate antibiotic produced by Escherichia coli strains carrying the mccABCDEF gene cluster encoding, in addition to the heptapeptide structural gene mccA, enzymes necessary for McC biosynthesis and self-immunity of the producing cell. The heptapeptide facilitates McC transport into susceptible cells, where it is processed releasing a non-hydrolyzable aminoacyl adenylate that inhibits an essential aspartyl-tRNA synthetase. The self-immunity gene mccF encodes a specialized serine-peptidase that cleaves an amide bond connecting the peptidyl or aminoacyl moieties of, respectively, intact and processed McC with the nucleotidyl moiety. Unlike in E. coli, some mccF orthologs are not expressed as a part of the mcc operon, and exist as single genes. Here, we show that a protein product of one such gene, MccF from Bacillus anthracis (BaMccF), is able to cleave intact and processed McC. Structural analysis confirmed this observation. The structures of apo-BaMccF and its AMP-complex revealed a peptidase with specific features that allow MccF to interact with substrates containing nucleotidyl moieties. Sequence analysis and phylogenetic tree reconstruction for the MccF/LD-carboxypeptidase family of proteins show distinct subfamilies in the MccF clade. Several representatives of the MccF clade can restore E. coli resistance to McC and other non-hydrolyzable aminoacyl adenylates. Based on our data we propose that members of the MccF clade may have similar substrate specificity. Our results suggest that expression of widespread mccF-like genes may be linked to detoxification of aminoacyl adenylates (endogenous or exogenous) and may represent a "stealthy" source of antibiotic resistance.
  • Source
    Kira S Makarova, Eugene V Koonin, Zvi Kelman
    ABSTRACT: In eukaryotes, the CMG (CDC45, MCM, GINS) complex containing the replicative helicase MCM is a key player in DNA replication. Archaeal homologs of the eukaryotic MCM and GINS proteins have been identified but until recently no homolog of the CDC45 protein was known. Two recent developments, namely the discovery of archaeal GINS-associated nuclease (GAN) that belongs to the RecJ family of the DHH hydrolase superfamily and the demonstration of homology between the DHH domains of CDC45 and RecJ, show that at least some Archaea possess a full complement of homologs of the CMG complex subunits. Here we present the results of in-depth phylogenomic analysis of RecJ homologs in archaea. We confirm and extend the recent hypothesis that CDC45 is the eukaryotic ortholog of the bacterial and archaeal RecJ family nucleases. At least one RecJ homolog was identified in all sequenced archaeal genomes, with the single exception of Caldivirga maquilingensis. These proteins include previously unnoticed remote RecJ homologs with inactivated DHH domain in Thermoproteales. Combined with phylogenetic tree reconstruction of diverse eukaryotic, archaeal and bacterial DHH subfamilies, this analysis yields a complex scenario of RecJ family evolution in Archaea which includes independent inactivation of the nuclease domain in Crenarchaeota and Halobacteria, and loss of this domain in Methanococcales. The archaeal complex of a CDC45/RecJ homolog, MCM and GINS is homologous and most likely functionally analogous to the eukaryotic CMG complex, and appears to be a key component of the DNA replication machinery in all Archaea. It is inferred that the last common archaeo-eukaryotic ancestor encoded a CMG complex that contained an active nuclease of the RecJ family. The inactivated RecJ homologs in several archaeal lineages most likely are dedicated structural components of replication complexes.
    Biology Direct 02/2012; 7:7. · 2.72 Impact Factor
  • Source
    ABSTRACT: A novel bacteriophage infecting Staphylococcus pasteuri was isolated during a screen for phages in Antarctic soils. The phage named SpaA1 is morphologically similar to phages of the family Siphoviridae. The 42,784 bp genome of SpaA1 is a linear, double-stranded DNA molecule with 3' protruding cohesive ends. The SpaA1 genome encompasses 63 predicted protein-coding genes which cluster within three regions of the genome, each of apparently different origin, in a mosaic pattern. In two of these regions, the gene sets resemble those in prophages of Bacillus thuringiensis kurstaki str. T03a001 (genes involved in DNA replication/transcription, cell entry and exit) and B. cereus AH676 (additional regulatory and recombination genes), respectively. The third region represents an almost complete genome (except for the short terminal segments) of a distinct bacteriophage, MZTP02. Nearly the same gene module was identified in prophages of B. thuringiensis serovar monterrey BGSC 4AJ1 and B. cereus Rock4-2. These findings suggest that MZTP02 can be shuttled between genomes of other bacteriophages and prophages, leading to the formation of chimeric genomes. The presence of a complete phage genome in the genome of other phages apparently has not been described previously and might represent a 'fast track' route of virus evolution and horizontal gene transfer. Another phage (BceA1) nearly identical in sequence to SpaA1, and also including the almost complete MZTP02 genome within its own genome, was isolated from a bacterium of the B. cereus/B. thuringiensis group. Remarkably, both SpaA1 and BceA1 phages can infect B. cereus and B. thuringiensis, but only one of them, SpaA1, can infect S. pasteuri. This finding is best compatible with a scenario in which MZTP02 was originally contained in BceA1 infecting Bacillus spp, the common hosts for these two phages, followed by emergence of SpaA1 infecting S. pasteuri.
    PLoS ONE 01/2012; 7(7):e40683. · 3.73 Impact Factor
  • Source
    ABSTRACT: The recently discovered CRISPR-Cas adaptive immune system is present in almost all archaea and many bacteria. It consists of cassettes of CRISPR repeats that incorporate spacers homologous to fragments of viral or plasmid genomes that are employed as guide RNAs in the immune response, along with numerous CRISPR-associated (cas) genes that encode proteins possessing diverse, only partially characterized activities required for the action of the system. Here, we investigate the evolution of the cas genes and show that they evolve under purifying selection that is typically much weaker than the median strength of purifying selection affecting genes in the respective genomes. The exceptions are the cas1 and cas2 genes that typically evolve at levels of purifying selection close to the genomic median. Thus, although these genes are implicated in the acquisition of spacers from alien genomes, they do not appear to be directly involved in an arms race between bacterial and archaeal hosts and infectious agents. These genes might possess functions distinct from and additional to their role in the CRISPR-Cas-mediated immune response. Taken together with evidence of the frequent horizontal transfer of cas genes reported previously and with the widespread microscale recombination within these genes detected in this work, these findings reveal the highly dynamic evolution of cas genes. This conclusion is in line with the involvement of CRISPR-Cas in antiviral immunity that is likely to entail a coevolutionary arms race with rapidly evolving viruses. However, we failed to detect evidence of strong positive selection in any of the cas genes.
    Journal of bacteriology 12/2011; 194(5):1216-25. · 3.94 Impact Factor
  • Source
    ABSTRACT: All sequenced genomes of representatives of the Francisella genus contain two rpoA genes, which encode non-identical RNA polymerase (RNAP) subunits, α1 and α2. In all other bacteria studied to date, a dimer of identical α subunits initiates the assembly of the catalytically proficient RNAP core (subunit composition α₂ββ'). Based on an observation that both α1 and α2 are incorporated into Francisella RNAP, Charity et al. (2007) previously suggested that up to four different species of RNAP core enzyme might form in the same Francisella cell. By in vitro assembly from a fully denatured state, we determined that both Francisella α subunits are required for efficient dimerization; no homodimer formation was detected. Bacterial two-hybrid system analysis likewise indicated strong interactions between the α1 and α2 N-terminal domains (NTDs, responsible for dimerization). NTDs of α2 did not interact detectably, while weak interaction between α1 NTDs was observed. This weak homotypic interaction may explain low-level transcription activity observed in in vitro RNAP reconstitution reactions containing Francisella large subunits (β', β) and α1. No activity was observed with RNAP reconstitution reactions containing α2, while robust transcription activity was detected in reactions containing α1 and α2. Phylogenetic analysis based on RpoA resulted in a tree compatible with standard bacterial taxonomy with both Francisella RpoA branches positioned within γ-proteobacteria. The observed phylogeny and analysis of constrained trees are compatible with Francisella lineage-specific rpoA duplication followed by acceleration of evolutionary rate and subfunctionalization. The results strongly suggest that most Francisella RNAP contains an α heterodimer, with a minor subfraction possibly containing an α1 homodimer. Comparative sequence analysis suggests that this heterodimer is oriented, in the sense that only one monomer, α1, interacts with the β subunit during the formation of the α₂β RNAP subassembly. Most likely the two rpoA copies in Francisella have emerged through a lineage-specific duplication followed by subfunctionalization of interacting paralogs.
    BMC Molecular Biology 11/2011; 12:50. · 2.80 Impact Factor
  • Source
    ABSTRACT: ssDNA-binding proteins (SSBs) based on the oligonucleotide-binding fold are considered ubiquitous in nature and play a central role in many DNA transactions including replication, recombination, and repair. We demonstrate that the Thermoproteales, a clade of hyperthermophilic Crenarchaea, lack a canonical SSB. Instead, they encode a distinct ssDNA-binding protein that we term "ThermoDBP," exemplified by the protein Ttx1576 from Thermoproteus tenax. ThermoDBP binds specifically to ssDNA with low sequence specificity. The crystal structure of Ttx1576 reveals a unique fold and a mechanism for ssDNA binding, consisting of an extended cleft lined with hydrophobic phenylalanine residues and flanked by basic amino acids. Two ssDNA-binding domains are linked by a coiled-coil leucine zipper. ThermoDBP appears to have displaced the canonical SSB during the diversification of the Thermoproteales, a highly unusual example of the loss of a "ubiquitous" protein during evolution.
    Proceedings of the National Academy of Sciences 11/2011; 109(7):E398-405. · 9.74 Impact Factor

Publication Stats

8k Citations
994 Downloads
824.51 Total Impact Points

Institutions

  • 2008–2013
    • Oregon State University
      • Department of Botany and Plant Pathology
      Corvallis, Oregon, United States
    • University of Hawaiʻi System
      Honolulu, Hawaii, United States
    • University of Hawaiʻi at Hilo
      • Department of Natural Science
      Hilo, HI, United States
  • 2004–2013
    • Wageningen University
      • Laboratory of Microbiology
      Wageningen, Gelderland, Netherlands
  • 1999–2013
    • National Institutes of Health
      • National Center for Biotechnology Information
      Bethesda, MD, United States
  • 1999–2012
    • National Center for Biotechnology Information
      Maryland, United States
  • 2009
    • University of Nebraska at Omaha
      • Eppley Institute for Research in Cancer and Allied Diseases
      Omaha, NE, United States
  • 2006–2008
    • Universität Osnabrück
      • Biophysics
      Osnabrück, Lower Saxony, Germany
  • 2003–2008
    • Oak Ridge National Laboratory
      • Biosciences Division
      Oak Ridge, TN, United States
  • 2005
    • University of California, Davis
      • Department of Viticulture and Enology
      Davis, CA, United States
    • University of Hawaiʻi at Mānoa
      • Department of Microbiology
      Honolulu, HI, United States
    • The University of York
      • York Structural Biology Laboratory
      York, England, United Kingdom
  • 1998–2005
    • Uniformed Services University of the Health Sciences
      • Department of Pathology
      Bethesda, MD, United States
    • Texas A&M University
      • Department of Biology
      College Station, TX, United States
  • 2001
    • National Eye Institute
      Maryland, United States
  • 1992–1998
    • Institute of Cytology and Genetics
      Novosibirsk, Russia