A Simple Cipher Governs DNA
Recognition by TAL Effectors
Matthew J. Moscou and Adam J. Bogdanove*
TAL (transcription activator–like) effectors
of plant pathogenic bacteria in the genus
Xanthomonas contribute to disease or trigger
specific host genes (1–5). Specificity depends on a
variable number of imperfect, typically 34, amino
diresidue (RVD). We show that the RVDs of TAL
effectors correspond directly to the nucleotides in
their target sites, one RVD to one nucleotide, with
some degeneracy and no apparent context depen-
use of these proteins in research and biotechnology.
Several considerations suggested that RVDs
place residues 12 and 13 on a solvent-exposed sur-
face (6). Binding of TAL effector AvrBs3 to the
the AvrBs3 target Bs3 is activated also by TAL ef-
target gene promoter pairs, we scanned for RVD-
nucleotide alignments with minimal entropy.
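The entropy criterion can be illustrated with a minimal sketch. It assumes, which the text does not spell out, that an alignment is scored by the Shannon entropy of the nucleotides paired with each RVD, weighted by how often that RVD occurs. It also handles a single effector-promoter pair rather than pooling several pairs. The function names `alignment_entropy` and `best_offset` are hypothetical, not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def alignment_entropy(rvds, site):
    """Count-weighted Shannon entropy (bits) of the nucleotides paired with each RVD."""
    if len(rvds) != len(site):
        raise ValueError("one nucleotide per RVD is required")
    # Tally which nucleotides each RVD is aligned with.
    counts = defaultdict(Counter)
    for rvd, nt in zip(rvds, site):
        counts[rvd][nt] += 1
    total = 0.0
    for nts in counts.values():
        n = sum(nts.values())
        h = -sum((c / n) * math.log2(c / n) for c in nts.values())
        total += n * h  # weight each RVD's entropy by its number of occurrences
    return total


def best_offset(rvds, promoter):
    """Slide the RVD sequence along the promoter and return (entropy, offset)
    for the lowest-entropy window; ties go to the smaller offset."""
    k = len(rvds)
    if len(promoter) < k:
        raise ValueError("promoter is shorter than the RVD sequence")
    return min(
        (alignment_entropy(rvds, promoter[i:i + k]), i)
        for i in range(len(promoter) - k + 1)
    )
```

An alignment in which every RVD always pairs with the same nucleotide scores zero, so the lowest-scoring window is the one with the most consistent RVD-nucleotide associations.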
Low entropy sites were present in each pro-
the 54–base pair (bp) UPA20 promoter fragment
that is sufficient and necessary for activation, and it
coincided with the UPA box common to genes di-
morphism between the activated and nonactivated
alleles of their respective targets, Os8N3 and Xa27.
Across the alignments at these three sites, RVD-
alignments were selected on the basis of those as-
sociations, resulting in exactly one site per TAL
effector-target pair (Fig. 1). A T precedes each site.
To assess the specificity conferred by the RVD-
ogen X. oryzae. For four, the experimentally identi-
fied target gene was the best or nearly best match.
Better matches were not preceded by a T, were not
represented on the microarray used to identify the
target, or lacked introns and expressed sequence tag
the forward sites for the known targets. The known
target of the fifth effector, AvrXa27, is the disease
resistance gene Xa27 (1). The poorer rank for this
match (5368) may reflect a suboptimal or calibrated
host adaptation. Better scoring sites likely com-
prise genes targeted by AvrXa27 for pathogenesis.
all rice promoters with 40 additional X. oryzae
TAL effectors. We retained the best alignments
for which the downstream gene was activated
during infection based on public microarray data
(http://PLEXdb.org, accession OS3). Here too a
T precedes each site, and no reverse-strand sites
stitute a strikingly simple cipher (Fig. 1C).
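As a rough illustration of how such a cipher could be applied, the sketch below looks up each RVD in a table of accepted nucleotides and scans a promoter for matching sites, optionally requiring a T immediately 5′ of the site. The `CIPHER` table is an assumption of this sketch: it lists commonly cited RVD preferences, whereas the actual association frequencies are those in Fig. 1C and are not reproduced here. The function names and the example sequence are hypothetical.

```python
# Illustrative RVD -> accepted nucleotides (an assumption of this sketch;
# see Fig. 1C for the observed association frequencies).
CIPHER = {
    "HD": "C",
    "NG": "T",
    "NI": "A",
    "NN": "GA",
    "NS": "ACGT",
    "N*": "CT",  # asterisk: deletion at residue 13
    "HG": "T",
}


def matches(rvds, site):
    """True if every nucleotide is accepted by the RVD at the same position."""
    return len(rvds) == len(site) and all(
        nt in CIPHER[rvd] for rvd, nt in zip(rvds, site)
    )


def find_sites(rvds, promoter, require_t=True):
    """Return 0-based start offsets of promoter windows matching the RVDs.

    With require_t=True, a window counts only if the nucleotide
    immediately 5' of it is a T.
    """
    k = len(rvds)
    hits = []
    for i in range(len(promoter) - k + 1):
        if require_t and (i == 0 or promoter[i - 1] != "T"):
            continue
        if matches(rvds, promoter[i:i + k]):
            hits.append(i)
    return hits


if __name__ == "__main__":
    # Toy promoter, not a real sequence.
    print(find_sites(["HD", "NG", "NN", "NI"], "GGTCTGAACCTAA"))
```

This is an all-or-nothing match; it ignores the degeneracy weights, so a scoring scheme based on association frequencies would rank candidate sites rather than simply accept or reject them. An RVD missing from the table raises a `KeyError`.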
There is some degeneracy in the cipher. Strong
associations may represent binding anchors. Weak
sites tend to be C-rich after the site
and G-poor throughout (Fig. 1D).
upstream of the annotated transcrip-
tional start. None are closer than 87
bp to the translational start.
Annotation of TAL effector tar-
gets, now feasible, will aid identifi-
cation of host genes important in
disease. Adding TAL sites may
enhance efficacy and durability of
resistance genes like Xa27. TAL
effectors may also be useful for
targeted gene activation.
Whether TAL effectors function in
nonplant cells or are amenable to
protein fusion is unknown. Elucidating
their interaction with host
transcriptional machinery and their
tant next steps in defining the func-
References and Notes
1. K. Gu et al., Nature 435, 1122 (2005).
2. B. Yang, A. Sugio, F. F. White, Proc. Natl. Acad. Sci. U.S.A.
103, 10503 (2006).
3. S. Kay, S. Hahn, E. Marois, G. Hause, U. Bonas, Science
318, 648 (2007).
4. A. Sugio, B. Yang, T. Zhu, F. F. White, Proc. Natl. Acad.
Sci. U.S.A. 104, 10720 (2007).
5. P. Römer et al., Science 318, 645 (2007).
6. S. Schornack, A. Meyer, P. Römer, T. Jordan, T. Lahaye,
J. Plant Physiol. 163, 256 (2006).
7. S. Schornack, G. V. Minsavage, R. E. Stall, J. B. Jones,
T. Lahaye, New Phytol. 179, 546 (2008).
and K. Dorman, N. Lauter, A. Miller, and S. Whitham
for suggestions. This work was funded by the NSF.
Supporting Online Material
Figs. S1 and S2
Tables S1 and S2
8 July 2009; accepted 23 September 2009
Published online 29 October 2009;
Include this information when citing this paper.
Department of Plant Pathology and Bioinformatics and
Computational Biology Program, Iowa State University,
Ames, IA 50011, USA.
*To whom correspondence should be addressed. E-mail:
Fig. 1. The TAL effector–DNA recognition cipher. (A) A
and a representative repeat sequence with the RVD under-
several TAL effector RVD and target gene promoter se-
quences. An asterisk indicates a deletion at residue 13. (C)
more alignments obtained by scanning all rice promoters
effector the best alignment for which the downstream gene
was activated during infection. (D) Flanking nucleotide
relative to the 5′ end of the target site; N, length of target
site. Logos were generated using WebLogo (http://weblogo.
VOL 326 11 DECEMBER 2009