Mobile group II introns. Annu Rev Genet

Article in Annual Review of Genetics 38(1):1-35 · February 2004
DOI: 10.1146/annurev.genet.38.072902.091600 · Source: PubMed
Abstract
Mobile group II introns, found in bacterial and organellar genomes, are both catalytic RNAs and retrotransposable elements. They use an extraordinary mobility mechanism in which the excised intron RNA reverse splices directly into a DNA target site and is then reverse transcribed by the intron-encoded protein. After DNA insertion, the introns remove themselves by protein-assisted, autocatalytic RNA splicing, thereby minimizing host damage. Here we discuss the experimental basis for our current understanding of group II intron mobility mechanisms, beginning with genetic observations in yeast mitochondria, and culminating with a detailed understanding of molecular mechanisms shared by organellar and bacterial group II introns. We also discuss recently discovered links between group II intron mobility and DNA replication, new insights into group II intron evolution arising from bacterial genome sequencing, and the evolutionary relationship between group II introns and both eukaryotic spliceosomal introns and non-LTR-retrotransposons. Finally, we describe the development of mobile group II introns into gene-targeting vectors, "targetrons," which have programmable target specificity.

15 Oct 2004 19:27 AR AR230-GE38-01.tex AR230-GE38-01.sgm LaTeX2e(2002/01/18)
P1: GCE
Annu. Rev. Genet. 2004. 38:1–35
doi: 10.1146/annurev.genet.38.072902.091600
Copyright © 2004 by Annual Reviews. All rights reserved
First published online as a Review in Advance on May 25, 2004
MOBILE GROUP II INTRONS
Alan M. Lambowitz¹ and Steven Zimmerly²
¹Institute for Cellular and Molecular Biology, Department of Chemistry and Biochemistry,
and Section of Molecular Genetics and Microbiology, School of Biological Sciences,
University of Texas at Austin, Austin, Texas 78712; email: lambowitz@mail.utexas.edu
²Department of Biological Sciences, University of Calgary, Calgary, Alberta, T2N 1N4
Canada; email: zimmerly@ucalgary.ca
Key Words: intron evolution, retrotransposon, reverse transcriptase, ribozyme,
RNA splicing
CONTENTS
INTRODUCTION
DISTRIBUTION OF GROUP II INTRONS
GROUP II INTRON SPLICING MECHANISM AND RNA STRUCTURE
PROTEIN-ASSISTED SPLICING OF GROUP II INTRONS
GROUP II INTRON-ENCODED PROTEINS: DOMAINS AND BIOCHEMICAL ACTIVITIES
GROUP II INTRON-ENCODED PROTEINS: LINEAGES AND VARIATIONS
DEGENERATE GROUP II INTRONS, TRANS-SPLICING, AND TWINTRONS
RETROHOMING OF YEAST mtDNA GROUP II INTRONS
RETROHOMING OF THE L. LACTIS Ll.LtrB INTRON
ALTERNATE MOBILITY MECHANISMS USED BY En GROUP II INTRONS
GROUP II INTRON RETROTRANSPOSITION TO ECTOPIC SITES
GROUP II INTRON RETROTRANSPOSITION SUPPORTED BY AN INTRON-ENCODED PROTEIN IN TRANS?
MECHANISM OF SECOND-STRAND SYNTHESIS AND CONNECTIONS TO DNA REPLICATION
SYNTHESIS OF THE INTRON-ENCODED PROTEIN
BINDING OF THE INTRON-ENCODED PROTEIN TO THE INTRON RNA
DNA TARGET SITE RECOGNITION BY GROUP II INTRON RNPs
REGULATION OF GROUP II INTRON MOBILITY
EVOLUTION OF MOBILE GROUP II INTRONS
EVOLUTION OF THE INTRON-ENCODED PROTEIN AND PROTEIN-ASSISTED SPLICING
EVOLUTIONARY RELATIONSHIP OF GROUP II INTRONS TO SPLICEOSOMAL INTRONS AND NON-LTR-RETROTRANSPOSONS
GROUP II INTRONS AS GENE-TARGETING VECTORS
0066-4197/04/1215-0001$14.00
Annu. Rev. Genet. 2004.38:1-35. Downloaded from arjournals.annualreviews.org
by UNIVERSITY OF CALGARY on 02/25/05. For personal use only.
INTRODUCTION
Mobile group II introns are retroelements consisting of a highly structured catalytic
RNA and a multifunctional, intron-encoded protein (IEP), which has reverse
transcriptase (RT) activity. The intron RNA, by virtue of its ribozyme activity,
intrinsically carries out RNA splicing and reverse splicing (integration) reactions,
while the IEP facilitates these reactions by stabilizing the catalytically active RNA
structure. Mobility occurs by a remarkable target DNA-primed reverse transcription
(TPRT) mechanism in which the excised intron RNA reverse splices directly
into a DNA target site and is then reverse transcribed by the IEP. In some cases,
the IEP has a DNA endonuclease (En) activity that cleaves the opposite strand to
generate the primer for TPRT, but recent studies show that nascent strands at DNA
replication forks can also serve as primers. By using these mechanisms, group II
introns “retrohome” at frequencies approaching 100% into specific DNA target
sites, typically the unoccupied site in an intronless allele, and “retrotranspose”
at low frequencies into ectopic sites that resemble the normal homing site. Mobile
group II introns are also of interest because they are the putative ancestors of
spliceosomal introns and non-LTR retrotransposons in higher organisms, elements
that together comprise more than 45% of the human genome.
DISTRIBUTION OF GROUP II INTRONS
Group II introns were discovered in the mitochondrial (mt) and chloroplast (cp)
genomes of lower eukaryotes and higher plants, where they interrupt conserved
genes. About a third of organellar group II introns encode an open reading frame
(ORF) with homology to RTs (68, 109). Among the first identified and studied were
the yeast mtDNA introns coxI-I1 and -I2 (also known as aI1 and aI2, respectively),
which are found at two sites in the cytochrome oxidase subunit I gene (6). These
and other examples of closely related group II introns found at different loci,
along with their RT-related ORFs, suggested that group II introns might be mobile
elements (50).
The first bacterial group II introns were discovered about ten years ago by PCR
screens (34). Only recently, however, has genome sequencing revealed that group II
introns are surprisingly common in both gram-negative and gram-positive bacteria
(23, 111). About a quarter of sequenced bacterial genomes contain a group II
intron, with up to two dozen present in a given organism (21, 111). By comparison,
group II introns are rare in archaea, being found in only 2 closely related species
out of 17 whose genomes have been sequenced (91). Unlike organellar group
II introns, almost all the bacterial introns encode an RT-related ORF or ORF
remnant. Additionally, most bacterial group II introns are inserted either between
genes or within mobile DNAs, such as IS elements or plasmids, which may aid
their dissemination (21, 43). The well-studied Lactococcus lactis Ll.LtrB intron,
for example, was discovered in a relaxase gene (ltrB) in a conjugative element,
where its splicing is required for conjugation (72, 100).
Group II introns have not been found in the streamlined mtDNAs of metazoan
animals, possibly because of intron loss, nor in eukaryotic nuclear genes,
where their presumed descendants, spliceosomal introns, predominate. However,
the nuclear genome of the higher plant Arabidopsis thaliana contains an integrated,
presumably nonfunctional mtDNA sequence that includes group II introns (57).
Nuclear integration of mtDNA fragments occurs frequently and is one process by
which group II introns may have been transferred to the nucleus before evolving
into spliceosomal introns (7).
GROUP II INTRON SPLICING MECHANISM AND RNA STRUCTURE
Like spliceosomal introns, group II introns splice via two sequential transesterification
reactions that yield ligated exons and an excised intron lariat with a 2′-5′
phosphodiester bond (84, 97, 113) (Figure 1A). In the case of group II introns,
however, the splicing reactions are catalyzed by the intron RNA itself. To accomplish
this, the RNA folds into conserved secondary and tertiary structures,
which form an active site containing catalytically essential Mg²⁺ ions (67, 89).
The conserved secondary structure consists of six double-helical domains (DI-DVI)
radiating from a central wheel, with three major subgroups (IIA, IIB, and
IIC) and further subdivisions defined by specific features (33, 69, 111) (Figure
1B). DV interacts with DI to form the minimal catalytic core; DVI contains the
branch-point nucleotide residue, usually a bulged A; and DII and DIII contribute
to RNA folding and catalytic efficiency (32). DIV, which encodes the intron ORF,
is dispensable for ribozyme activity. NMR and X-ray crystal structures have been
determined for segments DV and DV + DVI (102, 119), and over a dozen tertiary
interactions between different RNA domains have been identified (IBS1-3, α-λ),
enabling structural modeling of the active site (14, 82, 107). An evolutionary
relationship between group II and spliceosomal introns is supported by their similar
splicing mechanisms and by the finding that several group II intron domains,
including DI, DIII, and DV, and segments DI-III and DV-DVI, can function in trans
to promote splicing of mutant introns lacking the domains, as expected for the
progenitors of spliceosomal snRNAs (12, 24, 45, 70, 88).
Figure 1 Group II intron RNA splicing mechanism and secondary structure. A. Splicing
occurs via two sequential transesterification reactions. In the first, nucleophilic
attack at the 5′-splice site by the 2′ OH of a bulged A-residue in DVI results in
cleavage of the 5′-splice site coupled to formation of lariat intermediate. In the second,
nucleophilic attack at the 3′-splice site by the 3′ OH of the cleaved 5′ exon results in
exon ligation and release of the intron lariat. B. The conserved secondary structure
consists of six double-helical domains (DI-DVI) emanating from a central wheel, with
subdomains indicated by lower-case letters (e.g., DIVa). The ORF is encoded within
DIV (dotted loop), and DIVa is the high-affinity binding site for the IEP. Greek letters
indicate sequences involved in tertiary interactions. EBS and IBS refer to exon- and
intron-binding sites, respectively. Some key differences between subgroup IIA, IIB,
and IIC introns are indicated within dashed boxes, but additional smaller differences
are not shown (see References 69, 109 for detailed discussion of differences between
group II intron subclasses).
Key to the operation of group II introns are three short sequence elements
that base pair with flanking 5′- and 3′-exon sequences to help position the splice
junctions at the intron’s active site for both RNA splicing and reverse splicing
reactions (Figure 1B) (14, 67). The sequence elements EBS1 and EBS2 (exon-binding
sites 1 and 2) in DI each form 5 to 6 base pairs with the 5′-exon sequences
IBS1 and IBS2 (intron-binding sites 1 and 2). In group IIA introns, the sequence
δ adjacent to EBS1 base pairs with δ′, the first 1–3 nucleotides of the 3′ exon,
while in group IIB introns, the 3′ exon base pairs instead with EBS3, located in a
different part of DI (Figure 1B).
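The EBS-IBS pairing described above can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the function name `count_pairs` and the 6-nt sequences are invented for illustration, both strands are written 5′ to 3′, and G-U wobble pairs are counted alongside Watson-Crick pairs, which is a choice of this sketch rather than a rule taken from the text.

```python
# Illustrative sketch only: sequences and names are hypothetical, not from a real intron.

# Base pairs accepted by this sketch (Watson-Crick plus G-U wobble, an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
DNA_TO_RNA = str.maketrans("T", "U")  # lets an exon be given as DNA or RNA


def count_pairs(ebs: str, ibs: str) -> int:
    """Count base pairs formed when an EBS anneals antiparallel to an IBS.

    Both sequences are written 5'->3', so the first EBS base is compared
    with the last IBS base, the second with the second-to-last, and so on.
    """
    ebs = ebs.upper().translate(DNA_TO_RNA)
    ibs = ibs.upper().translate(DNA_TO_RNA)
    if len(ebs) != len(ibs):
        raise ValueError("EBS and IBS must have the same length")
    return sum((e, i) in PAIRS for e, i in zip(ebs, reversed(ibs)))


# Hypothetical 6-nt example in which the EBS is the reverse complement of the IBS.
print(count_pairs("UGACGU", "ACGUCA"))  # prints 6
```

A single substitution at the end of the hypothetical IBS (for example `ACGUCC`) lowers the count to 5, which is the kind of partial match one would expect to weaken positioning of the splice junction.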
PROTEIN-ASSISTED SPLICING OF GROUP II INTRONS
Although some group II introns self-splice in vitro, this reaction generally requires
nonphysiological conditions, and in vivo, proteins are required to help the intron
RNA fold into a catalytically active structure (reviewed in 51, 52). In the case of
mobile group II introns, a major protein required for splicing is the IEP, which
binds specifically to the intron RNA to stabilize the active structure (“maturase”
activity) (Figure 2) (9, 64, 77, 93). As discussed below, the IEP uses parts of
the RT domain and domain X, which likely corresponds to the RT thumb, to
bind different regions of the intron RNA (17, 74, 77). All characterized maturases
are intron-specific splicing factors, but closely related maturases may have some
cross-reactivity (9, 77, 93).
In plant chloroplasts, only one group II intron, trnK-I1 in the tRNA(Lys) gene,
encodes a maturase-related protein, and this protein, denoted MatK, may contribute
to the splicing of a number of ORF-less group II introns. This more generalized
function of MatK is supported by the nature of the splicing defects resulting from
inhibition of cp protein synthesis, and by the observation that a free-standing ORF
encoding MatK is retained in the pared down cp genomes of nonphotosynthetic
plants (reviewed in 114). MatK proteins contain a well-conserved domain X, but
have degenerate RT motifs and lack the En domain (Figure 2D) (74). In higher
plants, four other maturase-related proteins, denoted nMat-1a, -1b, -2a, and -2b,
are encoded by nuclear genes, but have mt targeting sequences and may function
in splicing one or more mt group II introns, most of which do not encode ORFs
(Figure 2E,F) (73). The nuclear-encoded nMat proteins likely originated from an
organelle group II intron. Like MatK, they have a conserved domain X, but some
have deviations in the RT domain and lack or have deviations in the En domain,
suggesting loss of mobility functions.
Figure 2 Group II intron-encoded and related proteins. Protein coding regions are
shown as rectangles, with different shadings indicating conserved regions. For proteins
encoded within introns, the 5′ and 3′ exons (E1 and E2, respectively) are shown as
gray boxes, and noncoding regions of the intron RNA are thick black lines. Protein
domains are: RT, with conserved sequence blocks RT-0 to -7; domain X, associated
with maturase activity; D, DNA-binding domain; and En, DNA endonuclease domain.
The locations of intron RNA domains I–VI are indicated above the Ll.LtrB intron,
with hatch marks demarcating domain boundaries. (A) L. lactis Ll.LtrB intron. (B) S.
cerevisiae mtDNA intron coxI-I1. (C) Sinorhizobium meliloti RmInt1, which encodes a
protein lacking the En domain and has only a short C-terminal extension in the position
corresponding to domain D. (D) Arabidopsis thaliana cp tRNA(Lys) intron encoding
MatK, with degenerate but recognizable features of the RT domain. (E) and (F) A.
thaliana nuclear-encoded maturase-like proteins nMat-1a and nMat-2a, respectively.
In addition to maturases, a number of host-encoded group II intron splicing
factors have been identified genetically in yeast, algae, and higher plants (reviewed
in 52, 55). Some of these proteins may function alone, whereas others may function
in conjunction with maturases or other proteins, either by stabilizing the active RNA
structure, or by acting as RNA chaperones to resolve non-native structures that
constitute “kinetic traps” in RNA folding. The host-encoded splicing factors differ
among organisms, but a common feature is that they have or had another cellular
function (e.g., peptidyl-tRNA hydrolase and CRM-domain proteins for higher
plant cp DNA introns; pseudouridine synthase for a trans-spliced Chlamydomonas
reinhardtii cp DNA intron) (46, 83, 85, 108). The idiosyncratic nature of these
proteins in different organisms suggests that they were recruited to function in
splicing relatively recently in evolution, after the dispersal of the introns as mobile
genetic elements, similar to the recruitment of aminoacyl-tRNA synthetases and
other host factors to function in splicing group I introns (52, 53).
GROUP II INTRON-ENCODED PROTEINS: DOMAINS
AND BIOCHEMICAL ACTIVITIES
Most of what is known about the biochemical activities of group II IEPs has come
from studies of the yeast mt introns coxI-I1 and -I2 and the L. lactis Ll.LtrB intron.
The proteins encoded by these introns are active RTs containing four “domains”
denoted RT, X, DNA binding (D), and DNA endonuclease (En) (Figure 2A,B)
(49, 65, 74, 93, 94). The N-terminal RT domain contains conserved sequence
blocks RT-1 to -7 found in the fingers and palm of retroviral and other RTs, as
well as an upstream region, denoted RT-0, which is characteristic of RTs of non-LTR
retrotransposons, such as human LINE elements (59, 126). Domain X was
identified as a site of mutations affecting maturase activity (17, 74). It is located
just downstream of the RT domain in the position corresponding to the thumb
and connection domains of retroviral RTs, and likely corresponds to a specialized
thumb in group II intron RTs. The RT and X domains function together to bind
the intron RNA both as a substrate for RNA splicing and as a template for reverse
transcription (17, 49, 65, 115).
The C-terminal D and En domains are not required for RNA splicing, but
function in intron mobility. Domain D is not highly conserved in sequence, but
contains two functionally important regions, one consisting of a cluster of basic
amino acid residues and the other containing a predicted α-helix (94). The En
domain, which carries out second-strand DNA cleavage during mobility, contains
motifs characteristic of the H-N-H DNA endonuclease family interspersed with two
pairs of conserved cysteine residues, similar to an arrangement found in phage T4
endonuclease VII (38, 94, 101). The H-N-H motif forms part of the En active site,
which in the case of the L. lactis IEP contains a single catalytically essential Mg²⁺
ion. The conserved cysteine motifs appear to stabilize the higher-order structure
of the domain, but unlike EndoVII, do not contain a coordinated Zn²⁺ ion, at least
in the purified protein (94).
The functions of the D and En domains in the yeast and lactococcal introns
were defined by analyzing C-terminal truncations (41, 103). Deletion of the En
domain abolished only second-strand DNA cleavage, while a longer truncation
deleting both D and En also abolished reverse splicing of the intron RNA into
double-stranded DNA. The truncated protein lacking D and En could still support
residual reverse splicing into single-stranded DNA substrates, implying that D is
required primarily to access the target site in double-stranded DNA. The RT and/or
X domains may also contribute to DNA binding, as evidenced by their ability to
support some reverse splicing into single-stranded DNA. The En domain has no
cognate in other retroelements, suggesting that it was appended to a pre-existing RT
to enhance mobility by TPRT (3, 63). Nuclear non-LTR-retrotransposons contain
two other types of En domains that have been appended to their RTs for analogous
TPRT reactions (29, 76).
GROUP II INTRON-ENCODED PROTEINS: LINEAGES
AND VARIATIONS
More than 200 ORF-containing group II introns have been sequenced, and with a
few exceptions noted below, all encode RT-related proteins. Based on phylogenetic
analysis, the IEPs can be divided into eight major lineages denoted mitochondrial,
chloroplast-like 1 and 2, and bacterial A-E (112, 126) (Figure 3). Importantly, each
lineage of IEP is associated with a distinct RNA structural subclass, implying that
the IEP was associated with the intron RNA prior to the divergence of different
group II intron lineages, with little if any subsequent exchange of IEPs (109). The
“mitochondrial” and “chloroplast” lineages include a number of bacterial group II
introns (e.g., the L. lactis Ll.LtrB intron belongs to the “mitochondrial” lineage), a
situation thought to reflect that mt and cp group II introns were derived from specific
bacterial lineages. Additionally, the heterogeneous phylogenetic distribution
of group II intron subclasses suggests that horizontal transfer of bacterial group
II introns is relatively common (21, 126), and indeed, cross-species transfer by
conjugation has been demonstrated (4).
Figure 3 Phylogeny of group II intron ORFs and correspondence with RNA structural
classes. Phylogenetic relationships of group II intron ORFs are summarized based on
neighbor-joining analyses (109, 112). Group II intron ORFs are divided into eight
clades, named mitochondrial, chloroplast-like 1 and 2, and bacterial A–E (112, 126).
Each ORF clade is associated with a distinct RNA structural class (IIA1, IIB1, IIB2,
IIC, two other distinct IIB-like, and two distinct IIA/B hybrid classes) (109). The
branching patterns for specific introns discussed in this review are indicated, along
with group II introns of E. coli and the archaeon Methanosarcina acetivorans (M.a.),
which illustrate probable horizontal transfers.
About a quarter of organellar group II introns and half of bacterial group II
introns (including all members of bacterial classes C, D, and E) encode proteins
lacking the En domain. As discussed below, some of these introns are mobile, and
their IEPs may be related to ancestral RTs that lacked the En domain. However,
phylogenetic analysis suggests that the En domain was also lost multiple times in
both bacterial and organellar lineages (126).
Finally, a small subset of fungal mtDNA group II introns is distinct in encoding
proteins of the LAGLIDADG family of group I intron homing endonucleases
(110). The LAGLIDADG proteins promote homing of group I introns by cleaving
recipient alleles to initiate double-strand break repair (DSBR) recombination, and
some have also adapted to function in RNA splicing by stabilizing the active RNA
structure (2, 51, 52). It will be of great interest to determine if these proteins have
similar functions when associated with group II introns.
DEGENERATE GROUP II INTRONS, TRANS-SPLICING,
AND TWINTRONS
Group II introns exhibit several types of structural variations that provide insight
into their evolutionary potential and constraints. First, many organellar group II
introns have degenerate RNA structures. Higher plant cp and mt group II introns, for
example, frequently have deviations from the conserved RNA secondary structure,
such as mispairings in DV and DVI, and no higher plant group II intron has been
shown to self-splice (67, 83). More extreme structural degeneration is seen among
Euglena cp group II introns, some of which (also called group III introns) are as
short as 91 nts and have only a degenerate DI and DVI (13). Also, many organellar
group II introns lack an ORF or encode proteins with degenerate RT domains (e.g.,
cp MatK proteins), suggesting loss of mobility functions (74).
Remarkably, a number of organelle group II introns are discontinuous, consisting
of two or more segments encoded in different parts of the genome. Transcripts
of these segments reassociate via tertiary interactions to form a functional intron,
resulting in “trans-splicing” of the associated exons (5). The nad1 gene in many
higher plants, for example, is split into four independently transcribed segments,
which are spliced in three trans-splicing reactions. In ancestral lower plants, the
nad1 gene is continuous and does not contain introns, indicating that the gene
was split after intron insertion. In Chlamydomonas reinhardtii chloroplasts, two
trans-splicing introns are found in the psaA gene, with intron 1 transcribed in
three segments and intron 2 in two segments. Mutations in about a dozen nuclear
genes affect the C. reinhardtii trans-splicing reactions (37, 85, 92). Notably,
trans-splicing appears to have evolved independently multiple times in organelles (90).
The repeated disintegration of group II introns into segments that can functionally
reassociate supports the view that a similar process was involved in the evolution
of spliceosomal introns (99).
Finally, “twintrons” are a type of structural variation in which one group II
intron has inserted into another, forming a nested set of up to four introns. In
Euglena chloroplasts, twintrons are found within housekeeping genes, and they
must be spliced sequentially starting with the innermost intron, which then yields
a continuous copy of the next intron, and so on (13). By contrast, most bacterial
twintrons are found in intergenic regions, and some bacterial clusters have asymmetric
organizations, with incomplete copies of some introns (23). The insertion
of one intron into another provides a safe haven for the invading intron.
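The inside-out splicing order of a twintron can be mimicked with a short Python sketch. It is a cartoon rather than a model of real transcripts: square brackets stand in for intron boundaries, and the function name `splice_twintron` and the example string are made up for illustration. An outer intron is interrupted by the inner one, so here it can only be removed once no bracket pair remains inside it.

```python
import re

# Matches an intron that contains no other intron, i.e. an innermost one.
INNERMOST = re.compile(r"\[[^\[\]]*\]")


def splice_twintron(transcript: str) -> list[str]:
    """Return the transcript before splicing and after each round of
    innermost-intron removal, ending with the fully spliced product."""
    stages = [transcript]
    while INNERMOST.search(transcript):
        # Excise every intron that is currently continuous (has no insert).
        transcript = INNERMOST.sub("", transcript)
        stages.append(transcript)
    if "[" in transcript or "]" in transcript:
        raise ValueError("unbalanced intron markers")
    return stages


# Hypothetical three-deep nested set between exons E1 and E2.
for stage in splice_twintron("E1[aa[bb[cc]bb]aa]E2"):
    print(stage)
```

Each printed line shows one round: removing the innermost intron rejoins the two halves of the intron that hosted it, which then becomes the next substrate.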
RETROHOMING OF YEAST mtDNA GROUP II INTRONS
The mobility of group II introns was first demonstrated by Meunier et al. (66) for
the S. cerevisiae mtDNA intron coxI-I1 and by Skelly et al. (105) for the related
Kluyveromyces lactis coxI-I1 intron. These investigators analyzed crosses between
haploid yeast strains containing different combinations of mtDNA introns. During
crosses, mitochondria fuse, enabling recombination between mtDNAs, which then
segregate to a homoplasmic state. Both group II introns homed to the unoccupied
site in intronless alleles at high frequency, occupying 90% of the progeny alleles,
and in S. cerevisiae, homing was shown to be blocked both by IEP mutations and by
intron RNA mutations that inhibit splicing. The latter finding implied a significant
difference from the DSBR mechanism used for group I intron mobility, where
splicing competence of the intron is not required.
In the first detailed study of group II intron mobility, Lazowska et al. (54)
showed that both coxI-I1 and coxI-I2 are mobile independently, but in some crosses,
coxI-I1 mobility was blocked by a small number of allele-specific sequence differences
in the DNA target site. By analyzing pooled progeny, they showed that
insertion of coxI-I1 into its target site is accompanied by asymmetric coconversion
of flanking exon sequences, which extended >50 bp into the 5′ exon but only
<25 bp into the 3′ exon. These findings and previous characterization of cDNAs
synthesized from unspliced precursor RNA in endogenous reverse transcription
reactions (49) suggested that the mobility intermediate was a reverse transcript of
unspliced precursor RNA. This cDNA was hypothesized to begin in the 3′ exon and
extend through the intron into the 5′ exon, enabling intron integration into the
recipient allele by recombination between homologous exon sequences (Figure 4A).
The inhibition of mobility by small DNA target site differences raised the possibility
of a DNA endonuclease activity that cleaves the recipient allele, and this
possibility was supported by the subsequent identification of H-N-H DNA endonuclease
motifs in the C-terminal domain of the IEPs (38, 101).
Extending these findings, Moran et al. (78) showed that coxI-I2 mobility is
inhibited by mutations in the IEP’s RT and En domains, as well as by the deletion
of intron DV, which abolished ribozyme activity without affecting the amino acid
sequence of the IEP. Unexpectedly, mutations in the RT active site that abolished
RT activity inhibited mobility only by 50%, whereas mutations in the En region
reduced mobility to undetectable levels. These findings suggested two different
mobility mechanisms, both of which require the En activity: an RT-dependent
mechanism involving a cDNA intermediate, and an RT-independent DSBR mechanism
initiated by En cleavage of the recipient allele, analogous to group I intron
mobility. Asymmetric coconversion of exon sequences was again observed, and
as above, the simplest interpretation was that the mobility intermediate was a
reverse transcript of unspliced precursor RNA. Only later did it become clear that
the coconversion pattern was actually a composite of those for different mobility
pathways operating simultaneously (see below).
Enlightenment came from biochemical studies with yeast mt RNP preparations,
in which the mobility reactions were reconstituted in vitro, enabling the dissec-
tion of individual steps (125). First, incubation of the RNPs with double-stranded
homing site DNA showed that mobility occurs by a TPRT mechanism in which the
RNPs cleave both strands of the DNA and then use the 3
end of the cleaved anti-
sense strand as the primer for reverse transcription of an intron RNA template. The
sense strand is cleaved precisely at the exon junction, whereas the antisense strand
is cleaved at position +10 in the 3
exon. Analysis of mutants showed that the En
activity requires not only the IEP but also splicing-competent intron RNA, which
stabilizes the IEP. Because of the previous coconversion data, the initial model
continued to assume that unspliced precursor RNA was the template for reverse
transcription (Figure 4B), analogous to the TPRT mechanism first demonstrated
by Eickbush and coworkers for the insect R2 element (58). In that mechanism,
the RT uses a site-specific endonuclease activity to cleave the DNA target site,
generating a nick that is then used as a primer for reverse transcription beginning
at the 3′ end of the element’s RNA.
For group II introns, the final critical revelation came from further dissection of
the DNA cleavage reaction using the newly developed biochemical methods (118,
124). These studies showed that while antisense-strand cleavage is catalyzed by
the En domain of the IEP, sense-strand cleavage is catalyzed by the intron RNA
via a reverse splicing reaction. The latter reaction inserts the intron RNA directly
into the DNA target site where it can serve as a template for reverse transcription
primed by the 3′ end of the cleaved antisense strand (Figure 4C,E). While coxI-I2
carried out mainly the first step of reverse splicing in vitro, resulting in intron lariat
RNA linked to the 3′ exon, coxI-I1 carried out substantial amounts of complete
reverse splicing in vitro, resulting in insertion of linear intron RNA between the
two DNA exons. These different levels of partial and complete reverse splicing
are now known to reflect that the reaction is reversible, with excision favored for
some introns, leading to a preponderance of partially reverse spliced product in
vitro (1). Reverse transcription across the 3′-splice site blocks excision and pulls
the equilibrium toward fully reverse spliced intron, which is presumably the pre-
ferred mobility intermediate in all cases.
The finding that the intron RNA integrates into double-stranded DNA by reverse
splicing suggested that DNA target site recognition involves the EBS-IBS and δ-δ′
pairings, which are required for reverse splicing into RNA and single-stranded
DNA substrates (see Figure 1B). As discussed below, this inference was confirmed
in subsequent studies, which showed that the IEP also contributes to DNA target
−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
Figure 4 Proposed and demonstrated group II intron mobility mechanisms. A. Early
proposed mechanism based on initial genetic studies of yeast mtDNA group II intron
homing. In this mechanism, the proposed mobility intermediate is a cDNA of unspliced
precursor RNA that begins just downstream of the intron in the 3′ exon and continues
through the intron into the 5′ exon. Homologous recombination between exon
sequences in the cDNA and an intronless allele results in insertion of the intron with
coconversion of flanking exon sequences (54, 78). B. Initially proposed TPRT mech-
anism in which the group II intron RNP makes a double-strand break in the recipient
allele and then uses unspliced precursor RNA as a template for reverse transcription
(125). C. Major retrohoming mechanism used by the yeast mtDNA coxI-I1 and coxI-I2
introns in which the intron RNA reverse splices into the DNA target site and is reverse
transcribed by the IEP, with integration completed by DSBR-like recombination initi-
ated by the intron cDNA invading an intron-containing allele (30, 31, 118, 124). This
mechanism leads to coconversion of 5′- but not 3′-exon sequences. D. RT-independent
homing mechanism used by yeast mtDNA group II introns, in which group II intron
RNPs cleave the recipient allele to initiate DSBR, analogous to group I intron homing
(30, 31). This mechanism results in coconversion of both 5′- and 3′-exon sequences.
E. Major retrohoming mechanism used by the L. lactis Ll.LtrB intron and for some
homing events by the yeast mtDNA introns. The intron RNA reverse splices into the
DNA target site and is used as a template to synthesize a full-length cDNA, which is
integrated by DNA replication and repair enzymes (16, 30). F. and G. Mechanisms
used by the L. lactis Ll.LtrB intron for retrohoming in the absence of second-strand
cleavage. The intron RNA reverse splices into double-stranded DNA and uses a nascent
leading or lagging strand at a DNA replication fork as the primer for reverse transcrip-
tion (122). Variations of these mechanisms may be used by naturally occurring group
II introns that encode proteins lacking the C-terminal En activity. H. Mechanism in
which the intron RNA reverse splices into single-stranded DNA at a replication fork
and uses a nascent lagging strand DNA as a primer for reverse transcription (44, 122).
This mechanism has been hypothesized for retrotransposition of the Ll.LtrB intron to
ectopic sites in L. lactis and also may be used by naturally occurring group II introns,
whose IEPs lack En activity. I. Alternate proposed mechanism for En⁻ mobility, in
which the intron RNA reverse splices into the DNA target site and then uses a nick
in the opposite strand as a primer for reverse transcription (95). Models A and B are
shown in brackets to indicate that they have been superseded by other mechanisms
shown in the figure.
recognition and helps promote local DNA unwinding, enabling the intron RNA to
base pair to the target DNA. By using the same base-pairing interactions with the
5′ and 3′ exons for both RNA splicing and DNA target site recognition, the intron
ensures that it will insert only at target sites from which it can subsequently splice,
thereby minimizing host damage.
Following the biochemical studies, genetic analysis of large numbers of indi-
vidual mobility events for wild-type coxI-I1 and -I2 indicated that after reverse
splicing of the intron RNA into the DNA target site, mobility can be completed
by at least three different mechanisms (30, 31). In the major pathway (60% of
the events), intron insertion is accompanied by coconversion of 5′- but not 3′-exon
sequences. These events are thought to occur by reverse splicing of the intron
RNA into the DNA target site, followed by synthesis of a complete or partial intron
cDNA, which then invades an intron-containing allele to initiate DSBR-like recom-
bination (Figure 4C). Completion of that process leads to variable coconversion of
5′- but not 3′-exon sequences. In the second pathway (40% of the events), intron
insertion is accompanied by coconversion of both 5′- and 3′-exon sequences, presumably
through DSBR, initiated by RNP cleavage of the target site (Figure 4D).
As expected, this pathway remains active in RT-deficient mutants. Finally, a small
proportion of mobility events for the wild-type introns occurs without any cocon-
version of exon sequences, presumably via synthesis of a full-length intron cDNA,
which is integrated by DNA repair, similar to the major pathway first demonstrated
for the L. lactis Ll.LtrB intron (see below; Figure 4E). Interestingly, the proportion
of mobility events occurring by the third pathway was increased to 43% by certain
DNA target site mutations (30). Although the use of a precursor RNA template
is not needed to explain the coconversion data, it remains a possibility for some
mobility events, particularly in light of biochemical studies showing that the RT is
bound to precursor RNA in a position to initiate cDNA synthesis in the 3′ exon (115,
127). The ability to complete mobility by using different cellular recombination
and repair activities makes group II introns adaptable to different host organisms.
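The way coconversion patterns were used to assign individual mobility events to pathways can be summarized as a simple decision rule. The Python sketch below is an illustration only: the function name, the labels, and the example event list are hypothetical and are not data from the cited studies (30, 31). The mapping of patterns to Figure 4C, D, and E follows the paragraph above.

```python
from collections import Counter


def infer_pathway(coconv_5p: bool, coconv_3p: bool) -> str:
    """Map an exon coconversion pattern to the pathway it suggests."""
    if coconv_5p and coconv_3p:
        return "4D: RT-independent DSBR after RNP cleavage"
    if coconv_5p:
        return "4C: intron cDNA, then DSBR-like recombination"
    if not coconv_3p:
        return "4E: full-length cDNA integrated by repair"
    # 3'-only coconversion is not assigned to a pathway in the text.
    return "unassigned"


# Hypothetical events: (5'-exon coconverted?, 3'-exon coconverted?)
events = [(True, False), (True, True), (True, False), (False, False)]
tally = Counter(infer_pathway(five, three) for five, three in events)
for pathway, count in tally.items():
    print(f"{pathway}: {count}")
```

A rule of this kind assigns each event to a single pathway, which is why the earlier population-level coconversion data, a composite of several pathways, were hard to interpret.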
RETROHOMING OF THE L. LACTIS Ll.LtrB INTRON
The discovery of group II introns in bacteria suggested that tractable bacterial
systems might be developed for detailed genetic and biochemical analysis (34).
However, most bacterial introns proved immobile and refractory to high-level expression
in Escherichia coli. An exception was the Lactococcus lactis Ll.LtrB
intron (71, 100). Matsuura et al. (65) developed an efficient E. coli expression
system for Ll.LtrB and showed that the IEP, denoted LtrA, has RT, maturase,
and En activities. Further, RNPs consisting of the IEP and lariat RNA could re-
verse splice into double-stranded DNA target sites and carry out TPRT of the
inserted intron RNA. A minor difference from the yeast mt introns was that the
IEP cleaved the bottom strand at position +9 rather than +10. It was also shown
that a drug-resistance marker could be inserted into the DIV loop without impeding
the mobility reactions, demonstrating how the introns could be used as vectors.
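The cleavage geometry described for the yeast and lactococcal introns can be made concrete with a short sketch. The Python below is illustrative and not taken from the review: the function name and the example sequence are hypothetical, and it assumes that position +1 is the first 3′-exon nucleotide and that the antisense-strand cut lies immediately after the stated position. Only the +10 and +9 offsets come from the text.

```python
# Antisense (bottom) strand cleavage positions in the 3' exon, as quoted
# in the text. The dictionary keys are labels chosen for this sketch.
ANTISENSE_CUT_POSITION = {
    "yeast mt coxI-I1/I2": 10,
    "L. lactis Ll.LtrB": 9,
}


def exon_between_cuts(three_exon: str, system: str) -> str:
    """Return the top-strand 3'-exon bases (+1..+n) lying between the
    intron-insertion site and the antisense-strand cut.

    Assumption: the cut falls immediately after position +n.
    """
    n = ANTISENSE_CUT_POSITION[system]
    if len(three_exon) < n:
        raise ValueError("3' exon shorter than the cleavage offset")
    return three_exon[:n]


# Hypothetical 3'-exon sequence, used only to show the offsets.
example_exon = "ACGTACGTACGT"
for name in ANTISENSE_CUT_POSITION:
    print(name, exon_between_cuts(example_exon, name))
```

Under these assumptions, the returned bases are the stretch of 3′ exon that the reverse transcriptase copies before it reaches the inserted intron RNA.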
Next, Cousineau et al. (16) used the genetically marked introns to develop
plasmid-based genetic assays for intron homing in E. coli and L. lactis. In a key
experiment, a group I intron introduced along with a kanᴿ marker in DIV was
absent after intron homing, proving that mobility occurs via an RNA intermediate,
from which the group I intron could splice. Retrohoming of Ll.LtrB in both E. coli
and L. lactis was found to be RecA-independent and to occur without coconversion
of flanking exon sequences (16, 71). Both properties differ from the major pathway
used by the yeast mt introns and suggest that retrohoming of Ll.LtrB occurs pre-
dominantly via the synthesis of a full-length cDNA of the reverse spliced intron,
which is then integrated by DNA repair independent of homologous recombination
(Figure 4E).
ALTERNATE MOBILITY MECHANISMS USED
BY En⁻ GROUP II INTRONS
As mentioned previously, many bacterial group II introns encode proteins lacking
the En domain. Thus far, the best studied is Sinorhizobium meliloti RmInt1, a
bacterial class D intron found originally in an IS element (Figure 2C). RmInt1
retrohomes efficiently, inserting into 20% to 48% of plasmid-borne target sites
introduced into intron-containing strains (60). The RmInt1 IEP has RT activity,
and its RNPs can reverse splice into double- or single-stranded DNA substrates, but
cannot carry out site-specific second-strand cleavage, thus requiring an alternate
mechanism to prime reverse transcription (80).
Insight into priming mechanisms for En⁻ introns was obtained by analyzing
mobility of the Ll.LtrB intron in E. coli, using conditions in which second-strand
cleavage was blocked by mutations in either the IEP or DNA target site (122). Such
mutations decreased but did not eliminate mobility, and the residual mobility then
showed a pronounced replication frequency dependence and strand bias, suggest-
ing that a nascent leading strand at a DNA replication fork was used to prime reverse
transcription (Figure 4F). Residual mobility also occurred at lower frequency in the
opposite orientation, with characteristics suggesting that nascent lagging strands
could be used as primers to some extent (Figure 4G). The preference for leading
strand primers may reflect that after reverse splicing into double-stranded DNA,
the intron RNP is positioned to use a leading strand primer directly, while the
use of a lagging strand primer requires the potentially disruptive passage of the
replication fork through the region containing the inserted intron RNP. Notably,
En⁻ retrohoming of the Ll.LtrB intron is dependent upon the RT activity of the
IEP, suggesting that the host DNA polymerase does not simply copy through the
inserted intron RNA and that the nascent DNA strand is used instead to prime
reverse transcription by the IEP. The situation may be analogous to the mode of
action of lesion bypass DNA polymerases, where the host replicative polymerase
disengages and then resumes after the lesion is repaired (87).
Similar mechanisms may be used by naturally occurring group II introns that
lack the En domain. Ichiyanagi et al. (44) noted that most En⁻ introns are found
on the lagging template strand, the orientation opposite that for retrohoming of
En⁻ mutants of the Ll.LtrB intron. This opposite strand bias may reflect that the En⁻
introns retrohome by reverse splicing into single-stranded DNA after the passage
of the replication fork, enabling facile use of lagging strand primers (Figure 4H).
En⁻ introns that retrohome efficiently may have specific adaptations for targeting
single-stranded regions or interactions with the host replication machinery that
facilitate use of nascent DNA strands as primers. Other priming mechanisms are
also possible. Schizosaccharomyces pombe cob-I1, for example, which encodes an
IEP with an inactive En domain, has been suggested to prime reverse transcription
of the reverse spliced intron RNA by using a nonspecific opposite-strand nick
(Figure 4I) (95).
GROUP II INTRON RETROTRANSPOSITION
TO ECTOPIC SITES
In addition to retrohoming, group II introns retrotranspose at low frequency (typically
10⁻⁴ to 10⁻⁵; 25, 44) to ectopic sites that resemble the normal homing site,
providing a means of intron dispersal. Retrotransposition was first demonstrated for
group II introns in yeast and Podospora anserina mtDNAs (79, 98). In these small
genomes, duplication of the intron could be detected by PCR, but homologous
recombination rapidly deleted one copy of the intron along with the intervening
mtDNA sequences. The RmInt1 and Ll.LtrB introns expressed from plasmids were
also shown to retrotranspose to ectopic sites in their natural hosts (15, 44, 62). In
all cases, the retrotransposition sites generally had good matches for IBS1, but
poorer matches for IBS2 or for 5′- and 3′-exon sequences recognized by the IEP.
As discussed below, IEP interactions with the 5′ exon are required for efficient
reverse splicing into double-stranded DNA, and interactions with the 3′ exon are
required for second-strand cleavage. As expected from the lack of appropriate 3′-exon
sequences, retrotransposition of the Ll.LtrB intron in L. lactis does not require
the En activity of the IEP (15, 44). The initial studies of group II intron retrotrans-
position were interpreted in terms of an RNA-based mechanism first suggested
in general terms by Cech (10), who noted that self-splicing introns could reverse
splice into an ectopic site in RNA, leading to a recombined RNA, which is then
reverse transcribed and integrated into the genome by homologous recombination.
The finding that group II intron RNPs reverse splice into DNA homing sites
suggested an alternate “DNA target” mechanism for retrotransposition, in which
the intron RNA reverse splices directly into an ectopic DNA site and is then reverse
transcribed by the IEP (118, 124). A DNA target mechanism was supported initially
by biochemical studies with yeast coxI-I1, which showed that RNPs could reverse
splice albeit inefficiently into known DNA transposition sites in vitro (117). It was
then proven by showing that “flipping” the target sequence to the opposite strand
to prevent its transcription did not inhibit retrotransposition (25). The generality
of the mechanism was suggested by the observation that many group II intron
integration sites in bacterial genomes are located in nontranscribed intergenic
regions, inconsistent with an RNA target mechanism (21, 44, 61). Early experi-
ments with the S. meliloti RmInt1 intron also suggested a DNA-target mechanism
by showing RecA-independent insertion into an ectopic site in the oxi1 gene, but
its relationship to other retrotransposition events is unclear because the oxi1 site
closely resembles the normal homing site and supports mobility at a relatively high
frequency (5% of the wild-type level) (62).
Ichiyanagi et al. (44) obtained further insight into group II intron retrotransposition
mechanisms by a systematic study of the L. lactis Ll.LtrB intron using a powerful
new genetic assay based on incorporation of a Retrotransposition Indicator
Gene (RIG) marker. This marker, analogous to those used to study retrotransposition
in a variety of other systems (19), consists of a kanᴿ gene inserted into group II
intron DIV in the reverse orientation, but interrupted by an efficiently self-splicing
group I intron in the forward orientation. During retrotransposition via an RNA
intermediate, the group I intron is spliced, reconstituting the kanᴿ marker gene,
which is then selected after the intron has integrated into a retrotransposition site.
Analysis of a large number of events showed that retrotransposition of the Ll.LtrB
intron in L. lactis occurs into both transcribed and nontranscribed strands and
established that retrotransposition is not dependent on RecA function, a feature
obscured by selection biases in earlier experiments. In addition, most of the retro-
transposition sites were found in the lagging template strand. This finding together
with the lack of 5′- and 3′-exon sequences recognized by the IEP strongly suggest
a mechanism in which the intron reverse splices into transiently single-stranded
DNA at a replication fork, followed by priming from a nascent lagging DNA
strand (43, 44, 122). In principle, retrotransposition could also occur by inaccurate
reverse splicing into double-stranded DNA, using either nascent DNA strands or
opposite-strand nicks to prime reverse transcription, with the proportion of events
occurring by different pathways varying for other introns or even for the same
intron under different conditions.
Although the DNA-target mechanism is predominant, one can ask if the RNA-
target mechanism is ever used and if not, why not? All the required steps appear to
be feasible. Group II introns reverse splice readily into RNA substrates in vitro, and
precise intron deletion from yeast mtDNA genes appears to occur by a mechanism
involving recombination between genomic DNA and a cDNA of spliced mRNA
synthesized by a group II IEP (56). A possible explanation is that for the introns
analyzed thus far, the IEP has a high affinity for DNA, thereby targeting the intron
to reverse splice preferentially into DNA sites. It remains important to look for
cases in which group II introns retrotranspose via the RNA target mechanism.
GROUP II INTRON RETROTRANSPOSITION SUPPORTED
BY AN INTRON-ENCODED PROTEIN IN TRANS?
Although most bacterial group II introns encode ORFs, four ORF-less introns were
identified recently in cyanobacteria and archaea (23, 81; L. Dai & S.Z., unpublished
data). These ORF-less introns appear to be mobile, as judged by their presence at
multiple sites, and in each case, a closely related intron encoding an RT ORF is
also found within the genome. The latter finding raises the possibility that mobility
is promoted by the IEP in trans. The intron-insertion sites for the ORF-less introns
lack shared features except for matches for the IBS1 and IBS2 sequences recog-
nized by base pairing of the intron RNA, suggesting reverse splicing into single-
stranded DNA or RNA target sites (see above). An analogous situation may exist in
Euglena cp DNA, which contains only two protein-encoding group II or III introns,
but has a large number of ORF-less group II introns that appear to have retrotrans-
posed by a mechanism dependent mainly on the IBS1/EBS1 pairing (13, 26, 120).
MECHANISM OF SECOND-STRAND SYNTHESIS
AND CONNECTIONS TO DNA REPLICATION
Following synthesis of the intron cDNA, both retrohoming and retrotransposition
require second-strand synthesis to complete intron integration. In the case of the
Ll.LtrB intron, the IEP has very low processive DNA-dependent DNA polymerase
activity, suggesting that second-strand synthesis may be carried out by host en-
zymes (J. Zhong & A.M.L., unpublished data). Consistent with this possibility, a
real-time PCR experiment using a recipient plasmid with a temperature-sensitive
replication origin showed that homing of the wild-type Ll.LtrB intron is blocked
when plasmid replication is inhibited, indicating a requirement for intron insertion
into actively replicating DNA (122). A connection between mobility and DNA
replication was also suggested by the finding that an Ll.LtrB intron with random-
ized EBS1/2 and δ sequences inserts preferentially at sites near the chromosome
replication origin (57% of the insertion sites found within 5% of the genome on
either side of oriC; 121). Although this clustering may reflect in part the higher
copy number of origin proximal genes, which in rapidly dividing E. coli may be
greater than 4:1, it is not observed for other transposons. It could also reflect a di-
rect interaction between group II intron RNPs and the DNA replication machinery
that leads to intron insertion soon after initiation of DNA synthesis at oriC.
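The copy-number argument can be made quantitative with the standard Cooper-Helmstetter relation, in which the ratio of origin to terminus copies in an exponentially growing culture is 2^(C/τ), where C is the time needed to replicate the chromosome and τ is the doubling time. Neither this relation nor the numerical values used below (a C period of 40 min and doubling times of 20 to 60 min) are taken from the review. They are illustrative assumptions, as are the function and parameter names.

```python
def relative_copy_number(position: float, c_period_min: float,
                         doubling_time_min: float) -> float:
    """Copy number of a locus relative to the terminus.

    position: fractional distance from origin (0.0) to terminus (1.0).
    Uses the Cooper-Helmstetter relation 2 ** (C * (1 - position) / tau).
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0 and 1")
    return 2.0 ** (c_period_min * (1.0 - position) / doubling_time_min)


# Assumed C period of 40 min; doubling times chosen for illustration.
for tau in (20.0, 30.0, 60.0):
    ratio = relative_copy_number(0.0, 40.0, tau)
    print(f"tau = {tau:.0f} min: origin/terminus ratio = {ratio:.2f}")
```

In this simple model the ratio reaches 4 when the doubling time is half the C period, so ratios above 4:1 require faster growth or slower replication.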
SYNTHESIS OF THE INTRON-ENCODED PROTEIN
Studies with the yeast mtDNA and lactococcal Ll.LtrB introns have provided
insight into individual steps in intron mobility. The IEP, which is required for RNA
splicing, must be translated initially from unspliced precursor RNA. In Ll.LtrB and
other bacterial group II introns, the ribosome-binding site and initiation codon of
the intron ORF are located in or near the DIVa stem-loop structure, which is a high-
affinity binding site for the IEP (115). Experiments with Ll.LtrB showed that the
IEP functions most efficiently when expressed in cis from the same plasmid as the
intron RNA (17, 123), and that its binding to DIVa down-regulates translation by
sequestering the ribosome-binding site (104). The latter prevents the accumulation
of excess IEP and also halts ribosome entry into the intron, which might otherwise
impede RNA splicing. In L. lactis, the Ll.LtrB intron may also use an internal
promoter in DI to independently express LtrA (123), but there is no evidence
that such an internal promoter is used in E. coli where Ll.LtrB still functions
efficiently. Many organellar group II introns, including yeast coxI-I1 and -I2, differ
from bacterial introns in that the ORF extends upstream from DIV and is translated
in frame with the upstream exon, yielding a precursor protein that is cleaved to
generate the active IEP (67). In these cases, the splicing of the intron prevents
further translation, again feedback regulating the IEP (9). The upstream extension
of the ORF in organellar group II introns is thought to be an evolutionary adaptation
that permits more efficient translation initiation or regulation (51).
BINDING OF THE INTRON-ENCODED PROTEIN
TO THE INTRON RNA
The interaction between the IEP and intron RNA is critical for all steps of intron
splicing and mobility. The RNA splicing activity of the IEP was first demonstrated
genetically for the yeast mtDNA coxI-I1 and -I2 introns (9, 77) and subsequently
both genetically and biochemically for the L. lactis Ll.LtrB intron (65). The bio-
chemical studies with the Ll.LtrB intron showed that the purified LtrA protein
binds tightly and specifically to the Ll.LtrB intron but not to noncognate group
II introns, and is by itself sufficient to promote RNA splicing at physiological
Mg²⁺ concentrations in vitro (93). The LtrA protein binds to the intron RNA as a
dimer, the same quaternary structure found for other RTs (93).
Further studies with Ll.LtrB suggested a model in which the IEP binds first
to the high-affinity binding site in DIVa, where it also autoregulates translation
(see above), and then makes secondary contacts with conserved catalytic core
regions, potentially including DI, DII, and DVI, to fold the intron RNA into the
active structure (64, 115). Analysis of mutants suggests that the N terminus of the
RT domain interacts with DIVa, whereas other regions of the RT and X domains
interact with the catalytic core (17). There is some indication that after the initial
binding to DIVa, the interaction of the IEP with the catalytic core occurs by tertiary-
structure capture rather than by tertiary-structure induction (82). The yeast coxI-I2
intron uses a similar mechanism in which DIVa is required for stable binding of
the IEP and additional contacts with the catalytic core promote RNA splicing (42).
The high-affinity binding to DIVa accounts at least in part for the intron specificity
of maturases and may also serve to anchor or guide the interaction of the IEP with
catalytic core regions.
Although DIVa makes a strong contribution to binding, residual splicing of
Ll.LtrB and yeast coxI-I2 can occur in the absence of DIVa by direct binding of
the IEP to the catalytic core both in vitro (64, 115) and in vivo (17, 42). The level of
residual splicing in vivo in the absence of DIVa is considerably different for Ll.LtrB
and coxI-I2 (10% and 70%, respectively), possibly reflecting different relative
affinities of the IEP for different binding sites. However, for both introns, deletion
of DIVa has a much stronger effect on intron mobility (>10⁵-fold inhibition; 20,
42). The latter could reflect the fact that binding to DIVa is particularly critical
for positioning the RT to initiate reverse transcription (115). The finding that the
IEP can interact directly with the conserved regions of the intron, independent of
DIVa, suggests how maturases might evolve from intron-specific to general group
II intron splicing factors, as may have occurred for the plant MatK and nMat
proteins (64).
The use of a high-affinity binding site outside the catalytic core differs from
the mode of action of well-studied group I intron splicing factors, which bind
directly to the catalytic core to stabilize the active RNA structure (reviewed in 52).
A possible rationale is that high-affinity binding to DIVa anchors one region of the
protein to the intron RNA, while leaving other regions relatively flexible to engage
in different interactions during the multiple steps in RNA splicing and intron
mobility. In the latter process, the IEP sequentially recognizes the DNA target site,
promotes reverse splicing, cleaves the second strand, and initiates TPRT, all while
remaining associated in some way with the intron RNA (64).
DNA TARGET SITE RECOGNITION BY GROUP II
INTRON RNPs
Group II intron RNPs recognize DNA target sequences by using both the IEP and
base pairing of the intron RNA. Figure 5A–D compare the DNA target sites for four
group II introns analyzed thus far. In all four cases, the intron RNA base pairs to the
5′- and 3′-exon sequences via the previously discussed EBS2/IBS2, EBS1/IBS1
and δ-δ′ (group IIA), or EBS3/IBS3 (group IIB) interactions, while the IEP recognizes
additional upstream and downstream sequences (41, 47, 75, 103, 117).
In each case, the importance of the base-pairing interactions was demonstrated
by showing that mutations in the DNA target site could be rescued by compensa-
tory mutations in the intron RNA. Notably, analysis of reverse splicing into RNA
−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
Figure 5 DNA target site recognition by mobile group II introns. Target sites are
shown for the following group II introns: A. L. lactis Ll.LtrB (40, 75, 86); B. S. cerevisiae
coxI-I1 (117); C. S. cerevisiae coxI-I2 (41); D. S. meliloti RmInt1 (47); and E. bacterial
class C introns (21, 39). Group II intron RNPs use both the IEP and base pairing
of the intron RNA to recognize specific sequences in the DNA target site. The most
critical positions recognized by the IEP in the 5′ and 3′ exons are indicated by shaded
squares and ovals, respectively, based on original references, where they are defined
by somewhat different criteria. The IBS and δ′ sequence regions recognized by base
pairing of the intron RNA’s EBS and δ sequences are also shaded. Bacterial class
C introns appear to recognize the inverted repeat of a rho-independent transcription
terminator followed by an IBS1 sequence that base pairs with the intron RNA.
substrates showed that single-base mismatches affect the k_cat for reverse splicing
as well as K_d, and thus have a much stronger inhibitory effect than is expected from
decreased binding affinity alone (116). This feature likely contributes to the very
high target specificity for intron insertion. The DNA target sequences recognized
by the IEP differ even for closely related introns, such as coxI-I1 and -I2, suggesting
that the IEP can evolve readily to recognize different target sites, a feature that
may help group II introns to establish themselves at new locations. Further, for all
of the introns, relatively few positions are recognized by the IEP, implying that
most of the specificity for DNA target site recognition comes from base pairing of
the intron RNA.
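The logic of the compensatory-mutation test can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions: the sequences are hypothetical and are not the real EBS or IBS sequences of any intron, the function name is invented for this example, and only Watson-Crick pairs are scored (G-U wobble pairs and the contribution of the IEP are ignored).

```python
# Allowed (RNA base, DNA base) Watson-Crick pairs.
WATSON_CRICK = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def mismatch_positions(ebs_rna: str, ibs_dna: str) -> list[int]:
    """Return 0-based positions in the DNA target (read 5'->3') that fail
    to pair with the intron RNA sequence (also written 5'->3').

    The two strands pair antiparallel, so the RNA is read in reverse.
    """
    if len(ebs_rna) != len(ibs_dna):
        raise ValueError("sequences must have equal length")
    pairs = zip(reversed(ebs_rna.upper()), ibs_dna.upper())
    return [i for i, pair in enumerate(pairs) if pair not in WATSON_CRICK]


# Hypothetical sequences for illustration only.
wild_type_ebs = "UGAAUC"
wild_type_ibs = "GATTCA"
mutant_ibs = "GATACA"        # single-base change in the DNA target
compensated_ebs = "UGUAUC"   # matching change in the intron RNA

print(mismatch_positions(wild_type_ebs, wild_type_ibs))
print(mismatch_positions(wild_type_ebs, mutant_ibs))
print(mismatch_positions(compensated_ebs, mutant_ibs))
```

A target-site mutation that introduces a mismatch, and is then rescued by the complementary change in the intron RNA, is the signature of recognition by base pairing rather than by the protein.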
The yeast mtDNA and lactococcal introns, whose IEPs contain C-terminal D
and En domains, recognize relatively long DNA target sites (30–35 bp). For all
three introns, mutations in key nucleotide residues recognized by the IEP in the
distal 5′-exon region inhibit both reverse splicing and second-strand cleavage,
whereas 3′-exon mutations inhibit only second-strand cleavage. In the case of
RmInt1, whose IEP lacks the En domain as well as all or part of domain D, DNA
target site recognition by the IEP appears more limited, with mutations at only two
positions (−15 and +4) strongly inhibiting mobility (47). Since RmInt1 does not
carry out second-strand cleavage, IEP recognition of the 3′ exon must be required
either for the initial DNA target site recognition or for reverse splicing, a difference
from the yeast and lactococcal introns.
Thus far, detailed biochemical analysis of DNA target site recognition has been
done only for the Ll.LtrB intron, where an efficient E. coli expression system
makes it possible to obtain large amounts of reconstituted RNPs for biochemical
studies (93). Kinetic analysis showed that the RNPs first bind DNA nonspecifically
and then search for their target site sequence, presumably by facilitated-diffusion
mechanisms analogous to those used by site-specific DNA-binding proteins (1).
The initial recognition event appears to involve major groove interactions between
the IEP and key bases in the distal 5′-exon region, including T-23, G-21,
and A-20 on the same strand into which the intron reverse splices (Figure 5A)
(103). These base interactions, bolstered by phosphate-backbone contacts, trigger
local DNA unwinding, enabling the intron RNA to base pair to the IBS and δ′
sequences for reverse splicing. Mutagenesis experiments indicate some preferences
for specific bases from positions −17 to −13, where the IEP crosses the minor
groove, but it is not clear if this is due to direct interactions with bases or indirect
readout of DNA structure (86). Importantly, mutations at all critical bases in the
distal 5′-exon region strongly inhibit reverse splicing into double-stranded DNA
but have little if any effect on reverse splicing into otherwise identical single-stranded
DNA targets, implying that their recognition is required mainly for DNA
unwinding (122). Second-strand cleavage occurs after a lag and is dependent on
the same protein and base-pairing interactions that position the RNP for reverse
splicing (although reverse splicing per se is not required; 1), as well as a small
number of additional IEP interactions with the 3′ exon, the most critical being
recognition of T+5 (Figure 5A) (75, 103). The latter lies in the region of the
DNA target site that becomes single-stranded after DNA unwinding, suggest-
ing a mechanism for temporally separating reverse splicing and second-strand
cleavage.
The small number of bases recognized by the IEP in the distal 5′-exon
region raises the question of how RNPs recognize sites that can base pair with
the intron RNA without inefficiently unwinding a large number of noncognate
sites. The answer seems to be that DNA unwinding requires concerted interac-
tion of the IEP and base-pairing of the intron RNA, which may drive the pro-
cess to completion. Thus, RNPs reconstituted with wild-type IEP and an intron
RNA, whose EBS and δ sequences were modified to prevent base pairing with the
DNA target site, did not induce DNA unwinding assayed by KMnO4 modification
(103). The concerted RNA interaction could involve progressive sampling of
base pairs starting from one end, or possibly triplex formation followed by base
pairing.
Bacterial class C introns, whose IEPs lack an En domain, have a highly distinct
target specificity. These introns insert downstream of palindromic rho-independent
transcription terminators at sites having potential IBS1 but not IBS2 sequences
(Figure 5E). Moreover, bacterial class C introns are often found inserted at multiple
target sites that have little sequence similarity, but share the palindromic terminator
motif, suggesting IEP recognition of higher-order DNA structure (21, 39). Indeed,
the formation of a DNA hairpin structure would be favored in single-stranded
regions and may be a key factor in targeting these En− introns to the lagging
template strand at DNA replication forks, where they can use a nascent lagging
strand to prime reverse transcription (see above).
REGULATION OF GROUP II INTRON MOBILITY
Mobile elements and their hosts use diverse mechanisms to regulate transposition
and thus minimize host damage (8). For group II introns, the primary mecha-
nisms are site-specific insertion and removal by RNA splicing. In addition, many
bacterial group II introns insert into benign sites, such as downstream of transcrip-
tion terminators or within mobile elements, pathogenicity islands, or other group II
introns (21, 23). In a number of cases, these target sites correspond to conserved se-
quences, e.g., conserved regions of transposases or the RT domain of other group II
introns (23, 106), maximizing the chances that unoccupied sites will be available.
More direct regulation is also possible. Some E. coli strains, for example, contain
structurally intact group II introns but also have unfilled homing sites, suggesting
that mobility is rare, and attempts to demonstrate mobility of E. coli introns have
failed (22). The mobility of group II introns in E. coli is impeded by endoge-
nous ribonucleases, which nick the intron RNA (40). However, this factor alone
seems insufficient to account for the immobility of E. coli introns, implying ad-
ditional mechanisms that suppress group II intron mobility or limit it to specific
environmental conditions.
EVOLUTION OF MOBILE GROUP II INTRONS
The phylogenetic distribution of mobile group II introns suggests that they evolved
in bacteria and were then transferred to eukaryotes, possibly via bacterial endosym-
bionts that gave rise to organelles (34, 111, 115, 126). An origin in archaea seems
less likely since the few group II introns found in archaea can be accounted for
by horizontal transfer from bacteria (91). The inference from phylogenetic anal-
ysis that the IEP was associated with the intron RNA prior to the divergence of
different group II intron lineages (see above) led to the “retroelement ancestor
hypothesis.” According to this hypothesis, all extant group II introns descended
from RT-encoding group II introns in bacteria, with two lineages becoming as-
sociated with organelles, and ORF loss occurring in many organellar and a few
bacterial group II introns (109). This hypothesis is supported by the finding of
ORF remnants in many, but not all, ORF-less introns (23, 109).
The “retroelement ancestor hypothesis” does not directly address how mobile
group II introns originated. One possibility is that they arose from a retroelement
that developed self-splicing activity to minimize deleterious effects of its trans-
position on the host (18). However, this scenario does not provide a compelling
rationale for the evolution of a complex ribozyme structure, since the same RNA
splicing reactions are readily carried out by protein enzymes. An alternate hypoth-
esis is that mobile group II introns were created by the insertion of a retroelement
or RT into a pre-existing group II ribozyme, analogous to the evolution of mobile
group I introns by the invasion of ORFs encoding DNA homing endonucleases
(51). In this scenario, the group II ribozyme originated before its association with
the RT and may have functioned as a mobile element by reverse splicing into
RNA or DNA sites. The insertion of an RT into the intron would then have re-
sulted in a more efficient genomic parasite [see Discussion in (115)]. An interest-
ing possibility is that group II intron ribozymes evolved initially in thermophiles
or halophiles, providing conditions in which the intron RNA might be catalyt-
ically efficient. Acquisition of the ORF might then have enabled the introns to
break out into other bacterial species. This hypothesis predicts that self-sufficient
ORF-less group II introns may still exist in organisms that live under extreme
conditions.
EVOLUTION OF THE INTRON-ENCODED PROTEIN
AND PROTEIN-ASSISTED SPLICING
In either scenario for the origin of mobile introns, the RT functioned initially in
intron mobility and then adapted to function in RNA splicing by virtue of its
specific interaction with the intron RNA. The ancestral RT likely lacked the D and
En domains and may have promoted mobility by one of the mechanisms discussed
previously for En− group II introns, e.g., reverse splicing into transiently single-stranded
DNA followed by use of a nascent DNA strand at a replication fork to
prime reverse transcription (Figure 4H). Acquisition of the D and En domains
would have allowed efficient mobility by reverse splicing into double-stranded
DNA without the need for DNA replication to generate a primer (Figure 4C–E).
Group II introns that insert within genes are under selective pressure to retain
splicing activity and to limit mobility in order to minimize host damage. Both
requirements are satisfied by the replacement of the IEP by cellular splicing factors,
as observed for organellar group II introns. Further, loss of mobility functions is
expected for elements that have saturated available target sites, because there is
less opportunity to select functional variants (36). The saturation of available target
sites and fewer opportunities for horizontal transfer may explain why degenerate
group II introns are much more prevalent in small organelle genomes than in
bacteria (21).
The different proteins recruited to promote splicing in different hosts may
in turn have dictated distinct patterns of RNA structural degeneration. Thus,
fungal mtDNA group II introns, which retain the canonical structure but self-
splice inefficiently, may rely on splicing factors that compensate only for limited
RNA structural defects, whereas higher plant cp and mt group II introns, which
have greater structural deviations, may rely on a larger number of more glob-
ally acting proteins, perhaps acting together in a complex (83, 92, 108). Simi-
larly, the wholesale structural degeneration of Euglena cp group II introns may
reflect the recruitment of proteins that supply the missing functions or a stochas-
tic event that enabled some group II intron domains to function efficiently in
trans.
EVOLUTIONARY RELATIONSHIP OF GROUP II
INTRONS TO SPLICEOSOMAL INTRONS AND
NON-LTR-RETROTRANSPOSONS
Group II introns are the proposed ancestors of both nuclear spliceosomal introns
and nuclear non-LTR-retrotransposons (11, 27, 28, 99, 125). The key steps pos-
tulated to have occurred in the conversion of group II introns into spliceosomal
introns—degeneration of internal RNA structure, dependence on a common splic-
ing apparatus, and the use of trans-acting RNAs—are all exemplified by the prop-
erties of group II introns in different organisms. The involvement of snRNAs in
the splicing of nuclear introns seems a particularly compelling argument for such
a scenario, since RNA splicing reactions per se can be catalyzed simply by protein
enzymes, as for eukaryotic tRNA introns.
In eukaryotes, the nuclear envelope, which separates transcription from at least
the bulk of translation, would constitute a barrier for both splicing and mobility
of group II introns, since the IEP can no longer bind the intron RNA immediately
after transcription. This separation may favor the substitution of host-encoded
proteins that could function more efficiently in trans, leading to the evolution of a
common splicing apparatus. It is possible that early “spliceosomal” introns retained
mobility by interacting with the RTs in trans, as may be the case for some present-
day ORF-less introns (see above). However, as the number of introns grew, mobility
would be increasingly detrimental to the host, favoring the replacement of the
RT with other cellular splicing factors. Group II introns that had not inserted
within genes would be under no selective pressure to retain splicing, enabling them
to evolve into non-LTR-retrotransposons. Although it is uncertain whether any
evolutionary scenario can be proven, additional clues may be obtained by searching
the genomes of primitive eukaryotes for remnants of group II introns or primitive
snRNAs.
GROUP II INTRONS AS GENE-TARGETING VECTORS
Because group II introns recognize their DNA target sites mainly by base pairing
of the intron RNA, they can be targeted to insert into different DNA sites simply
by modifying the intron RNA (31, 40, 41). This feature, combined with their very
high insertion frequency and specificity, has made it possible to use mobile group II
introns as programmable gene-targeting vectors, dubbed “targetrons.” A targetron
derived from the L. lactis Ll.LtrB intron has been used for efficient targeted gene
disruption in both gram-negative and gram-positive bacteria (35, 48, 86, 121). Ad-
ditionally, group II introns can be used for the site-specific chromosomal insertion
of cargo genes cloned in DIV (e.g., 35) and to introduce targeted double-strand
breaks, which stimulate homologous recombination with a cotransformed DNA
fragment, enabling the introduction of point mutations (48).
In bacteria, retargeted group II introns are generally expressed from a donor
plasmid. The donor plasmid pACD3, shown in Figure 6, expresses an Ll.LtrB-ΔORF
intron and short flanking exons, with the IEP synthesized from a position
just downstream of the 3′ exon (121). The IEP expressed from this position still
promotes efficient splicing and mobility, but after insertion at a new location, the
ΔORF intron is unable to splice significantly in the absence of the IEP, yielding
a gene disruption. Since the ΔORF intron contains multiple stop codons in all
reading frames, suitably placed disruptions are likely to ablate gene function. An
intron targeted to the antisense strand inserts in the orientation opposite target
gene transcription and cannot be spliced, yielding an unconditional disruption.
By contrast, an intron targeted to the sense strand inserts in the same orientation
as target gene transcription, and a ΔORF intron inserted in this orientation may
give a conditional disruption, if its splicing is linked to the expression of the
IEP from a separate construct (Figure 6) (35, 48). Group II intron retargeting is
now done routinely by using a computer program that scans the desired target
sequence for the best matches to the positions recognized by the IEP and then
designs primers for modifying the intron’s EBS and δ sequences to insert into
those sites (86). The positions recognized by the IEP are sufficiently few and
flexible that the program readily identifies multiple rank-ordered target sites in any
gene.
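The kind of scan described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the published program (86): the scoring rule (one point per matching IEP-recognized base) and the coordinates used for IBS2, IBS1, and δ′ are assumptions made for the example, and the names `rank_sites`, `Site`, and `pairing_rna` are hypothetical. Only the four IEP-recognized bases (T-23, G-21, A-20, T+5) are taken from the text. The sketch scans one strand of an A/C/G/T-only sequence, ranks candidate insertion sites, and reports the intron RNA sequences that would base pair with each site.

```python
from dataclasses import dataclass

# Bases reported in the text as recognized by the Ll.LtrB IEP, keyed by
# position relative to the insertion site (there is no position 0).
IEP_BASES = {-23: "T", -21: "G", -20: "A", 5: "T"}

# ASSUMPTION: illustrative coordinates of the DNA segments that pair with the
# intron RNA; verify against the primary literature before any real use.
IBS2 = (-12, -8)
IBS1 = (-6, -1)
DELTA_PRIME = (1, 3)

COMPLEMENT_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}


def to_index(site: int, pos: int) -> int:
    """Map a target-site position (-1 = last 5'-exon base, +1 = first
    3'-exon base) to a string index; `site` is the index of the +1 base."""
    return site + pos if pos < 0 else site + pos - 1


def segment(seq: str, site: int, span: tuple) -> str:
    """Return the DNA between two target-site positions, inclusive."""
    start, end = span
    return seq[to_index(site, start): to_index(site, end) + 1]


def pairing_rna(dna: str) -> str:
    """RNA that pairs antiparallel with `dna`, written 5' to 3'."""
    return "".join(COMPLEMENT_RNA[base] for base in reversed(dna))


@dataclass
class Site:
    index: int   # string index of the +1 base
    score: int   # number of IEP-recognized bases that match
    ebs2: str    # intron sequence that would pair with IBS2
    ebs1: str    # intron sequence that would pair with IBS1
    delta: str   # intron sequence that would pair with delta-prime


def rank_sites(seq: str) -> list:
    """Score every position where the full -23..+5 window fits, best first."""
    seq = seq.upper()
    first = -min(IEP_BASES)                    # room for position -23
    last = len(seq) - (max(IEP_BASES) - 1)     # room for position +5
    sites = []
    for site in range(first, last):
        score = sum(seq[to_index(site, pos)] == base
                    for pos, base in IEP_BASES.items())
        sites.append(Site(site, score,
                          pairing_rna(segment(seq, site, IBS2)),
                          pairing_rna(segment(seq, site, IBS1)),
                          pairing_rna(segment(seq, site, DELTA_PRIME))))
    return sorted(sites, key=lambda s: (-s.score, s.index))
```

Calling `rank_sites(gene_sequence)[:5]` would list the five best-scoring candidate sites on the given strand. A real design tool would also scan the complementary strand, weight positions by measured preferences rather than counting matches, and turn the EBS and δ sequences into mutagenic primers.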
Figure 6 Use of targeted group II introns (targetrons) for gene disruption. The donor
plasmid pACD3 expresses a 0.9-kb L. lactis Ll.LtrB-ΔORF intron and short flanking
exons, with the IEP (LtrA protein) expressed from a position just downstream of the 3′
exon (121). T1 and T2 are transcription terminators. The intron is retargeted by modifying
the EBS2, EBS1, and δ sequences to base pair to the IBS2, IBS1, and δ′ sequences
in the DNA target site. The IBS1 and 2 sequences in the donor plasmid’s 5′ exon are
also modified to base pair to the intron’s retargeted EBS1 and 2 sequences for efficient
RNA splicing. Sequences modified for retargeting are boxed. An intron targeted to the
antisense strand inserts in the orientation opposite target gene transcription and thus
cannot be spliced, yielding an unconditional disruption (right). By contrast, an intron
targeted to the sense strand inserts in the correct orientation to be spliced; a ΔORF intron
inserted in this orientation can potentially yield a conditional disruption by linking its
splicing to expression of the LtrA protein in trans (35) (left).
In E. coli, retargeted group II introns commonly insert specifically into the
desired chromosomal target site at frequencies >1% without selection (86). This
frequency is high enough to detect insertions by colony PCR. Alternatively, insertions
can be detected by using a conventional or Retrotransposition-Activated
selectable Marker (RAM), patterned after previously developed RIG markers
(see above), inserted in DIV (121). One RAM marker used for gene targeting
is a small trimethoprim-resistance (TpR) gene inserted in DIV in the reverse
orientation, but interrupted by an efficiently self-splicing group I intron in the
forward orientation. The latter is excised during retrotransposition, enabling selection
of the marker after integration into a DNA target site. Even for inefficient
introns, nearly 100% of the TpR colonies had the desired single disruption
(121).
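To see why an insertion frequency above 1% is enough for screening by colony PCR without selection, a rough calculation helps. The sketch below assumes that each colony independently carries the insertion with probability f, which is a simplification and not a claim from the cited work; the function names are hypothetical. Under that assumption, the chance of finding at least one disruptant among n colonies is 1 − (1 − f)^n.

```python
import math


def prob_at_least_one(freq: float, n_colonies: int) -> float:
    """Probability that at least one of n independently picked colonies
    carries the insertion, given a per-colony insertion frequency."""
    return 1.0 - (1.0 - freq) ** n_colonies


def colonies_needed(freq: float, confidence: float = 0.95) -> int:
    """Smallest number of colonies to screen so that the chance of finding
    at least one insertion reaches the requested confidence."""
    if not 0.0 < freq < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("freq and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - freq))


print(colonies_needed(0.01, 0.95))  # 299
```

At a frequency of exactly 1%, a few hundred colonies would be needed for 95% confidence, whereas at 10% about 29 would do. Pooled colony PCR, in which several colonies share one reaction, would reduce the number of reactions accordingly.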
In addition to targeted gene disruption, an Ll.LtrB intron containing a RAM
marker and randomized target site recognition (EBS and δ) sequences was used
to obtain disruptions at sites distributed throughout the E. coli genome, analogous
to global transposon mutagenesis. Despite clustering of insertion sites near the
chromosome replication origin, the resulting library was sufficiently complex to
contain disruptants of most if not all nonessential E. coli genes (121). Advantages
of targetrons are that each gene can be targeted individually, with multiple introns
if necessary, and that the high insertion frequencies in the absence of selection
facilitate the construction of strains having multiple disruptions or other desirable
combinations of traits.
In initial work toward developing group II intron-based gene targeting methods
in higher organisms, group II introns were designed and selected to insert into
the HIV1 provirus and the human gene encoding CCR5, an important target site
in anti-HIV therapy. The retargeted intron RNPs retained activity in human cells,
inserting into plasmid-borne HIV1 and CCR5 target sites after liposome-mediated
transfection (40). If group II introns can be adapted to function as efficiently in
gene targeting in eukaryotes as they now do in bacteria, they would have potentially
widespread applications, including the production of genetically stable disruptants
for functional genomics, and the site-specific insertion or repair of genes for gene
therapy.
ACKNOWLEDGMENTS
We thank Marlene Belfort, Georg Mohr, Philip Perlman, and Roland Saldanha for
comments on the manuscript. Work in the authors’ laboratories was supported by
NIH grants GM37949 and GM37951 to A.M.L. and CIHR, NSERC, and AHFMR
grants to S.Z. Both authors are inventors on patents and patent applications for
group II intron-based gene targeting methods assigned to the Ohio State University
and the University of Texas at Austin. They may receive a percentage of royalties
paid to the universities from licensing of those patents. In addition, A.M.L. holds
a minority equity interest in a company (InGex LLC) licensed by the universities
to commercialize group II intron gene targeting technology.
The Annual Review of Genetics is online at http://genet.annualreviews.org
LITERATURE CITED
1. Aizawa Y, Xiang Q, Lambowitz AM, Pyle
AM. 2003. The pathway for DNA recog-
nition and RNA integration by a group II
intron retrotransposon. Mol. Cell 11:795–
805
2. Belfort M. 2003. Two for the price of
one: a bifunctional intron-encoded DNA
endonuclease-RNA maturase. Genes Dev.
17:2860–63
3. Belfort M, Derbyshire V, Parker MM,
Cousineau B, Lambowitz AM. 2002. Mo-
bile introns: pathways and proteins. See
Ref. 16a, pp. 761–83
4. Belhocine K, Plante I, Cousineau B.
2004. Conjugation mediates transfer of
the Ll.LtrB group II intron between differ-
ent bacterial species. Mol. Microbiol. 51:
1459–69
5. Bonen L. 1993. Trans-splicing of pre-
mRNA in plants, animals, and protists.
FASEB J. 7:40–46
6. Bonitz SG, Coruzzi G, Thalenfeld BE,
Tzagoloff A, Macino G. 1980. Assem-
bly of the mitochondrial membrane sys-
tem. Structure and nucleotide sequence
of the gene coding for subunit 1 of
yeast cytochrome oxidase. J. Biol. Chem.
255:11927–41
7. Burger G, Lang BF. 2003. Parallels in
genome evolution in mitochondria and
bacterial symbionts. IUBMB Life 55:205–
12
8. Bushman FD. 2003. Targeting survival:
integration site selection by retroviruses
and LTR-retrotransposons. Cell 115:135–
38
9. Carignani G, Groudinsky O, Frezza D,
Schiavon E, Bergantino E, Slonimski PP.
1983. An mRNA maturase is encoded
by the first intron of the mitochondrial
gene for the subunit I of cytochrome
oxidase in S. cerevisiae. Cell 35:733–
42
10. Cech TR. 1985. Self-splicing RNA: impli-
cations for evolution. Int. Rev. Cytol. 93:
3–22
11. Cech TR. 1986. The generality of self-
splicing RNA: relationship to nuclear
mRNA splicing. Cell 44:207–10
12. Chin K, Pyle AM. 1995. Branch-point at-
tack in group II introns is a highly re-
versible transesterification, providing a
potential proofreading mechanism for
5′-splice site selection. RNA 1:391–406
13. Copertino DW, Hallick RB. 1993. Group
II and group III introns of twintrons:
potential relationships with nuclear pre-
mRNA introns. Trends Biochem. Sci. 18:
467–71
14. Costa M, Michel F, Westhof E. 2000.
A three-dimensional perspective on exon
binding by a group II self-splicing intron.
EMBO J. 19:5007–18
15. Cousineau B, Lawrence S, Smith D,
Belfort M. 2000. Retrotransposition of
a bacterial group II intron. Nature 404:
1018–21. Erratum. 2001. Nature 414:
84
16. Cousineau B, Smith D, Lawrence-
Cavanagh S, Mueller JE, Yang J, et al.
1998. Retrohoming of a bacterial group
II intron: mobility via complete re-
verse splicing, independent of homolo-
gous DNA recombination. Cell 94:451–
62
16a. Craig NL, Craigie R, Gellert M, Lam-
bowitz AM, eds. 2002. Mobile DNA II.
Washington, DC: ASM Press
17. Cui X, Matsuura M, Wang Q, Ma H,
Lambowitz AM. 2004. A group II intron-
encoded maturase functions preferentially
in cis and requires both the reverse tran-
scriptase and X domains to promote RNA
splicing. J. Mol. Biol. In press
18. Curcio MJ, Belfort M. 1996. Retrohom-
ing: cDNA-mediated mobility of group
II introns requires a catalytic RNA. Cell
84:9–12
19. Curcio MJ, Garfinkel DJ. 1991. Single-
step selection for Ty1 element retrotrans-
position. Proc. Natl. Acad. Sci. USA 88:
936–40
20. D’Souza LM, Zhong J. 2002. Mutations
in the Lactococcus lactis Ll.LtrB group II
intron that retain mobility in vivo. BMC
Mol. Biol. 3:17
21. Dai L, Zimmerly S. 2002. Compilation
and analysis of group II intron inser-
tions in bacterial genomes: evidence for
retroelement behavior. Nucleic Acids Res.
30:1091–102
22. Dai L, Zimmerly S. 2002. The dispersal of
five group II introns among natural popu-
lations of Escherichia coli. RNA 8:1294–
307
23. Dai L, Zimmerly S. 2003. ORF-less and
reverse-transcriptase-encoding group II
introns in archaebacteria, with a pattern of
homing into related group II intron ORFs.
RNA 9:14–19
24. Dib-Hajj SD, Boulanger SC, Hebbar SK,
Peebles CL, Franzen JS, Perlman PS.
1993. Domain 5 interacts with domain
6 and influences the second transesteri-
fication reaction of group II intron self-
splicing. Nucleic Acids Res. 21:1797–
804
25. Dickson L, Huang HR, Liu L, Matsuura
M, Lambowitz AM, Perlman PS. 2001.
Retrotransposition of a yeast group II in-
tron occurs by reverse splicing directly
into ectopic DNA sites. Proc. Natl. Acad.
Sci. USA 98:13207–12
26. Doetsch NA, Thompson MD, Hallick
RB. 1998. A maturase-encoding group III
twintron is conserved in deeply rooted
euglenoid species: Are group III introns
the chicken or the egg? Mol. Biol. Evol.
15:76–86
27. Eickbush TH. 1994. Origin and evolution-
ary relationships of retroelements. In The
Evolutionary Biology of Viruses, ed. SS
Morse, pp. 121–57. New York: Raven
28. Eickbush TH. 1999. Mobile introns: retro-
homing by complete reverse splicing.
Curr. Biol. 9:R11–14
29. Eickbush TH. 2002. R2 and related site-
specific non-long terminal repeat retro-
transposons. See Ref. 16a, pp. 813–
35
30. Eskes R, Liu L, Ma H, Chao MY, Dickson
L, et al. 2000. Multiple homing pathways
used by yeast mitochondrial group II in-
trons. Mol. Cell. Biol. 20:8432–46
31. Eskes R, Yang J, Lambowitz AM, Perl-
man PS. 1997. Mobility of yeast mito-
chondrial group II introns: engineering a
new site specificity and retrohoming via
full reverse splicing. Cell 88:865–74
32. Fedorova O, Mitros T, Pyle AM. 2003.
Domains 2 and 3 interact to form critical
elements of the group II intron active site.
J. Mol. Biol. 330:197–209
33. Ferat JL, Le Gouar M, Michel F. 2003. A
group II intron has invaded the genus Azo-
tobacter and is inserted within the termi-
nation codon of the essential groEL gene.
Mol. Microbiol. 49:1407–23
34. Ferat JL, Michel F. 1993. Group II
self-splicing introns in bacteria. Nature
364:358–61
35. Frazier CL, San Filippo J, Lambowitz
AM, Mills DA. 2003. Genetic manip-
ulation of Lactococcus lactis by using
targeted group II introns: generation of
stable insertions without selection. Appl.
Environ. Microbiol. 69:1121–28
36. Goddard MR, Burt A. 1999. Recurrent in-
vasion and extinction of a selfish gene.
Proc. Natl. Acad. Sci. USA 96:13880–85
37. Goldschmidt-Clermont M, Choquet Y,
Girard-Bascou J, Michel F, Schirmer-
Rahire M, Rochaix JD. 1991. A small
chloroplast RNA may be required for
trans-splicing in Chlamydomonas rein-
hardtii. Cell 65:135–43
38. Gorbalenya AE. 1994. Self-splicing
group I and group II introns encode ho-
mologous (putative) DNA endonucleases
of a new family. Protein Sci. 3:1117–
20
39. Granlund M, Michel F, Norgren M. 2001.
Mutually exclusive distribution of IS1548
and GBSi1, an active group II intron
identified in human isolates of group B
streptococci. J. Bacteriol. 183:2560–69
40. Guo H, Karberg M, Long M, Jones JP,
3rd, Sullenger B, Lambowitz AM. 2000.
Group II introns designed to insert into
therapeutically relevant DNA target sites
in human cells. Science 289:452–57
41. Guo H, Zimmerly S, Perlman PS, Lam-
bowitz AM. 1997. Group II intron en-
donucleases use both RNA and protein
subunits for recognition of specific se-
quences in double-stranded DNA. EMBO
J. 16:6835–48
42. Huang HR, Chao MY, Armstrong B,
Wang Y, Lambowitz AM, Perlman PS.
2003. The DIVa maturase binding site in
the yeast group II intron aI2 is essential for
intron homing but not for in vivo splicing.
Mol. Cell. Biol. 23:8809–19
43. Ichiyanagi K, Beauregard A, Belfort M.
2003. A bacterial group II intron favors
retrotransposition into plasmid targets.
Proc. Natl. Acad. Sci. USA 100:15742–47
44. Ichiyanagi K, Beauregard A, Lawrence S,
Smith D, Cousineau B, Belfort M. 2002.
Retrotransposition of the Ll.LtrB group
II intron proceeds predominantly via re-
verse splicing into DNA targets. Mol. Mi-
crobiol. 46:1259–72
45. Jarrell KA, Dietrich RC, Perlman PS.
1988. Group II intron domain 5 facilitates
a trans-splicing reaction. Mol. Cell. Biol.
8:2361–66
46. Jenkins BD, Barkan A. 2001. Recruitment
of a peptidyl-tRNA hydrolase as a facili-
tator of group II intron splicing in chloro-
plasts. EMBO J. 20:872–79
47. Jiménez-Zurdo JI, García-Rodríguez FM,
Barrientos-Durán A, Toro N. 2003. DNA
target site requirements for homing in vivo
of a bacterial group II intron encoding
a protein lacking the DNA endonuclease
domain. J. Mol. Biol. 326:413–23
48. Karberg M, Guo H, Zhong J, Coon R, Pe-
rutka J, Lambowitz AM. 2001. Group II
introns as controllable gene targeting vec-
tors for genetic manipulation of bacteria.
Nat. Biotechnol. 19:1162–67
49. Kennell JC, Moran JV, Perlman PS, Bu-
tow RA, Lambowitz AM. 1993. Re-
verse transcriptase activity associated
with maturase-encoding group II introns
in yeast mitochondria. Cell 73:133–46
50. Lambowitz AM. 1989. Infectious introns.
Cell 56:323–26
51. Lambowitz AM, Belfort M. 1993. Introns
as mobile genetic elements. Annu. Rev.
Biochem. 62:587–622
52. Lambowitz AM, Caprara MG, Zimmerly
S, Perlman PS. 1999. Group I and group II
ribozymes as RNPs: clues to the past and
guides to the future. In The RNA World, ed.
RF Gesteland, TR Cech, JF Atkins, pp.
451–85. Cold Spring Harbor, NY: Cold
Spring Harbor Lab. Press. 2nd ed.
53. Lambowitz AM, Perlman PS. 1990.
Involvement of aminoacyl-tRNA syn-
thetases and other proteins in group I and
group II intron splicing. Trends Biochem.
Sci. 15:440–44
54. Lazowska J, Meunier B, Macadre C. 1994.
Homing of a group II intron in yeast
mitochondrial DNA is accompanied by
unidirectional co-conversion of upstream-
located markers. EMBO J. 13:4963–72
55. Lehmann K, Schmidt U. 2003. Group II
introns: structural and catalytic versatil-
ity of large natural ribozymes. Crit. Rev.
Biochem. Mol. Biol. 38:249–303
56. Levra-Juillet E, Boulet A, Séraphin B, Simon
M, Faye G. 1989. Mitochondrial introns
aI1 and/or aI2 are needed for the
in vivo deletion of intervening sequences.
Mol. Gen. Genet. 217:168–71
57. Lin X, Kaul S, Rounsley S, Shea TP, Ben-
ito MI, et al. 1999. Sequence and analysis
of chromosome 2 of the plant Arabidopsis
thaliana. Nature 402:761–68
58. Luan DD, Korman MH, Jakubczak JL,
Eickbush TH. 1993. Reverse transcription
of R2Bm RNA is primed by a nick at the
chromosomal target site: a mechanism for
non-LTR retrotransposition. Cell 72:595–
605
59. Malik HS, Burke WD, Eickbush TH.
1999. The age and evolution of non-LTR
retrotransposable elements. Mol. Biol.
Evol. 16:793–805
60. Martínez-Abarca F, García-Rodríguez
FM, Toro N. 2000. Homing of a bacte-
rial group II intron with an intron-encoded
protein lacking a recognizable endonu-
clease domain. Mol. Microbiol. 35:1405–
12
61. Martínez-Abarca F, Toro N. 2000. Group
II introns in the bacterial world. Mol. Mi-
crobiol. 38:917–26
62. Martínez-Abarca F, Toro N. 2000. RecA-
independent ectopic transposition in vivo
of a bacterial group II intron. Nucleic
Acids Res. 28:4397–402
63. Martínez-Abarca F, Zekri S, Toro N. 1998.
Characterization and splicing in vivo of
a Sinorhizobium meliloti group II intron
associated with particular insertion sequences
of the IS630-Tc1/IS3 retroposon
superfamily. Mol. Microbiol. 28:1295–
306
64. Matsuura M, Noah JW, Lambowitz AM.
2001. Mechanism of maturase-promoted
group II intron splicing. EMBO J. 20:
7259–70
65. Matsuura M, Saldanha R, Ma H, Wank
H, Yang J, et al. 1997. A bacterial group
II intron encoding reverse transcriptase,
maturase, and DNA endonuclease activ-
ities: biochemical demonstration of mat-
urase activity and insertion of new genetic
information within the intron. Genes Dev.
11:2910–24
66. Meunier B, Tian G-L, Macadre C,
Slonimski PP, Lazowska J. 1990. Group
II introns transpose in yeast mitochon-
dria. In Structure Function and Biogen-
esis of Energy Transfer Systems, ed. E
Quagliariello, S Papa, F Palmieri, C Sac-
cone, pp. 169–74. Amsterdam: Elsevier
67. Michel F, Ferat JL. 1995. Structure and
activities of group II introns. Annu. Rev.
Biochem. 64:435–61
68. Michel F, Lang BF. 1985. Mitochondrial
class II introns encode proteins related to
the reverse transcriptases of retroviruses.
Nature 316:641–43
69. Michel F, Umesono K, Ozeki H. 1989.
Comparative and functional anatomy of
group II catalytic introns–a review. Gene
82:5–30
70. Michels WJ Jr, Pyle AM. 1995.
Conversion of a group II intron into a new
multiple-turnover ribozyme that
selectively cleaves oligonucleotides:
elucidation of reaction mechanism and
structure/function relationships.
Biochemistry 34:2965–77
71. Mills DA, Manias DA, McKay LL, Dunny
GM. 1997. Homing of a group II intron
from Lactococcus lactis subsp. lactis
ML3. J. Bacteriol. 179:6107–11
72. Mills DA, McKay LL, Dunny GM. 1996.
Splicing of a group II intron involved in
the conjugative transfer of pRS01 in
lactococci. J. Bacteriol. 178:3531–38
73. Mohr G, Lambowitz AM. 2003.
Putative proteins related to group II intron
reverse transcriptase/maturases are encoded
by nuclear genes in higher plants. Nucleic
Acids Res. 31:647–52
74. Mohr G, Perlman PS, Lambowitz AM.
1993. Evolutionary relationships among
group II intron-encoded proteins and
identification of a conserved domain that may
be related to maturase function. Nucleic
Acids Res. 21:4991–97
75. Mohr G, Smith D, Belfort M, Lambowitz
AM. 2000. Rules for DNA target-site
recognition by a lactococcal group II
intron enable retargeting of the intron to
specific DNA sequences. Genes Dev.
14:559–73
76. Moran JV, Gilbert N. 2002. Mammalian
LINE-1 retrotransposons and related
elements. See Ref. 16a, pp. 836–69
77. Moran JV, Mecklenburg KL, Sass P,
Belcher SM, Mahnke D, et al. 1994.
Splicing defective mutants of the COXI gene of
yeast mitochondrial DNA: initial
definition of the maturase domain of the group
II intron AI2. Nucleic Acids Res.
22:2057–64
78. Moran JV, Zimmerly S, Eskes R,
Kennell JC, Lambowitz AM, et al. 1995.
Mobile group II introns of yeast
mitochondrial DNA are novel site-specific
retroelements. Mol. Cell. Biol.
15:2828–38
79. Mueller MW, Allmaier M, Eskes R,
Schweyen RJ. 1993. Transposition of
group II intron aI1 in yeast and invasion
of mitochondrial genes at new locations.
Nature 366:174–76
80. Muñoz-Adelantado E, San Filippo J,
Martínez-Abarca F, García-Rodríguez
FM, Lambowitz AM, Toro N. 2003.
Mobility of the Sinorhizobium meliloti group
II intron RmInt1 occurs by reverse
splicing into DNA, but requires an unknown
reverse transcriptase priming mechanism.
J. Mol. Biol. 327:931–43
81. Nakamura Y, Kaneko T, Sato S, Ikeuchi
M, Katoh H, et al. 2002. Complete
genome structure of the thermophilic
cyanobacterium Thermosynechococcus
elongatus BP-1. DNA Res. 9:123–30
82. Noah JW, Lambowitz AM. 2003. Effects
of maturase binding and Mg
2