Design of active small interfering RNAs
Andrew S Peek & Mark A Behlke*
Integrated DNA Technologies Inc
1710 Commercial Park
*To whom correspondence should be addressed
Current Opinion in Molecular Therapeutics 2007 9(2):110-118
The Thomson Corporation ISSN 1464-8431
Small interfering RNAs (siRNAs) have become the experimental
tool of choice to suppress gene expression in a wide variety of
organisms. Site selection and optimization do not appear to be as
difficult for siRNAs as would be expected from historical
experience with antisense oligonucleotides. Nevertheless, not all
sites within a target gene perform equally. Significant progress has
been made in defining sequence features that contribute to siRNA
potency, and a variety of computational tools are available from
academic and commercial sources to assist with siRNA design.
Potential siRNA sequences should be screened for homology to
other genes within the target organism's transcriptome to
minimize cross-hybridization and inadvertent knockdown of
unrelated genes via off-target effects. In addition to rational design
criteria, chemical modification of the RNA can improve function
by improving stability, reducing the potential for off-target effects
and avoiding stimulation of the innate immune system.
Keywords Algorithm, innate immunity, oligonucleotides,
RNA interference, small interfering RNA
RNA interference (RNAi) is an evolutionarily conserved
process in which double-stranded RNAs (dsRNAs) suppress
expression of genes with complementary sequences [1-3].
RNAi was first described in plants and nematodes, where
exogenous long dsRNAs (usually several hundred base
pairs or more) can be used to trigger suppression.
Unfortunately, long dsRNAs trigger type I interferon (IFN) responses in
higher organisms and therefore are not useful for RNAi
studies in mammals. Nevertheless, RNAi can still be
exploited in mammalian cells. dsRNAs are processed by the
endoribonuclease Dicer into 21- to 23-base pair (bp)
products with two-base 3'-overhangs and 5'-phosphates.
These small interfering RNAs (siRNAs) are the actual
molecular triggers of RNAi. Except for certain sequence-
specific effects, dsRNAs of this length do not generally
activate innate immunity in most cell types, and siRNAs can
be directly used to manipulate gene expression in
Human Dicer forms a heterodimer with a second RNA
binding protein called TAR RNA binding protein (TRBP).
Whether endogenously processed by Dicer/TRBP or
supplied as a synthetic oligonucleotide introduced into the
cell by transfection, 21mer siRNAs are incorporated into a
multiprotein complex known as the RNA-induced silencing
complex (RISC). During RISC formation, the siRNA strands
are separated and only one strand of the duplex is retained
(the 'guide strand') while the other strand (the 'passenger
strand') is degraded. In mammalian cells, one of the protein
components of RISC, Ago-2, functions as the 'slicer' activity
and cleaves the target mRNA. The precise site of slicer
cleavage is determined by sequence complementarity to the
'guide strand' of the siRNA. The steps involved in
degradative RNAi are outlined in Figure 1. These pathways
have been well summarized in several reviews [4,5].
Figure 1. Biochemical pathways involved in degradative RNAi.
[Figure labels: Long dsRNA (Dicer substrate); Dicer cleaves long RNAs into siRNAs; Dicer and TRBP associate with siRNA and participate in RISC formation; RISC assembly and loading involves Ago-2 and other proteins.]
Long double-stranded RNAs (dsRNAs) are cleaved by Dicer into short 21- to 23-base pair duplexes with two-base 3'-overhangs and 5'-phosphates
(small interfering RNAs [siRNAs]). siRNAs enter the RNA-induced silencing complex (RISC), where one strand directs cleavage of a target RNA.
Synthetic siRNAs can be transfected into cells and directly enter the pathway at the stage of RISC formation. TRBP, TAR RNA binding protein.
Most researchers currently use chemically synthesized siRNAs
that are made as single-stranded 21mer RNA oligonucleotides
(ssRNAs) and annealed into duplex form. These mimic the
natural siRNAs that result from Dicer processing of long
dsRNAs. Not all siRNAs perform equally well and site
selection is an important aspect of siRNA design.
Site selection, siRNA design and structure
Most synthetic siRNAs have a 19-bp core duplex domain
with two-base 3'-overhangs. Heuristic rules for identifying
sites to place RNAi duplexes are based on the sequence of
this 19-base 'core' region, but the overhangs influence
activity and should not be ignored. Several investigators
have published studies that examined small to very large
sets of siRNAs and correlated sequence with functional
activity. From these efforts, a variety of rule sets and
complex algorithms have been devised that improve the
likelihood of selecting a 'good' siRNA site within a gene
sequence. Table 1 lists 25 reports of this kind from the past
five years. Some of the features of these rule sets relate to the
underlying biochemistry of RNAi and will be considered
below in greater detail.
Table 1. Papers that report methods for predicting effective
Method(s) | Number of siRNAs in data set
ANN, artificial neural network; DRM, disjunctive rule merging; DT,
decision tree; GPBoost, genetic programming and boosting; GSK,
general string kernel; SVM, support vector machine.
Natural siRNAs have 5'-phosphate groups at their ends and
this phosphate group is essential for activity. It is important
not to block the 5'-end of the antisense (AS, guide) strand
with a chemical modifying group. However, it is not
necessary to phosphorylate synthetic siRNAs, since
unmodified 5'-OH ends are rapidly phosphorylated by the
action of cellular kinase(s) after transfection. DNA 'TT'
dinucleotide overhangs are frequently used at the 3'-ends
regardless of the natural target sequence; however, there is
some evidence to suggest that use of RNA bases may confer
slightly higher potency [6-8].
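The duplex layout described above (a 19-base core with two-base 3'-overhangs on each strand) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `reverse_complement` and `build_duplex` are hypothetical, the target site is assumed to be given as the 19-nt mRNA sequence written 5' to 3', and the default overhang is the DNA 'TT' dinucleotide mentioned in the text (written here as the letters `TT` appended to an RNA string).

```python
# Watson-Crick complements for the RNA alphabet.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def build_duplex(target_site: str, overhang: str = "TT") -> tuple[str, str]:
    """Build sense and antisense strands for a 19-nt target site.

    target_site is the mRNA sequence of the site, 5'->3'. DNA-style input
    (T instead of U) is accepted and converted. Both returned strands are
    written 5'->3' and carry the same two-base 3'-overhang (an assumption;
    the text notes that RNA overhangs may perform slightly better).
    """
    core = target_site.upper().replace("T", "U")
    if len(core) != 19:
        raise ValueError("expected a 19-nt core sequence")
    if any(base not in COMPLEMENT for base in core):
        raise ValueError("sequence contains non-ACGU characters")
    sense = core + overhang                          # passenger strand
    antisense = reverse_complement(core) + overhang  # guide strand
    return sense, antisense


if __name__ == "__main__":
    sense, antisense = build_duplex("A" * 10 + "G" * 9)
    print("sense     5'-" + sense + "-3'")
    print("antisense 5'-" + antisense + "-3'")
```

Neither strand is given a 5'-modification in this sketch, in keeping with the point that the 5'-end of the guide strand must not be blocked.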
From studies carried out using large datasets, no significant
difference in overall efficacy has been reported between sites
within the coding region versus the 3'-untranslated region
(UTR). It therefore does not appear that the general location of
an siRNA within the target gene correlates with potency.
However, the localization of an siRNA within the gene exon
structure can be significant. It is important to identify the
existence of splice variants for the gene of interest when
selecting siRNA sites. Most users will want reagents that target
common exons, which will reduce expression of all isoforms. If
desired, siRNAs can be directed to exons that are splice-form
specific and used to selectively reduce expression of those
isoforms to study their biological function.
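One simple way to act on the splice-variant advice is to compare candidate sites across isoform sequences. The sketch below is a minimal illustration, not a published method: the function names are hypothetical, isoforms are assumed to be supplied as a dictionary of plain transcript sequences, and a site is treated as 'shared' only if the identical 19-mer occurs in every transcript. It does not check whether a site spans an exon junction.

```python
def kmers(seq: str, k: int = 19) -> set[str]:
    """Return the set of all k-length subsequences of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_sites(isoforms: dict[str, str], k: int = 19) -> set[str]:
    """Candidate sites present in every isoform (targets all splice forms)."""
    site_sets = [kmers(seq, k) for seq in isoforms.values()]
    return set.intersection(*site_sets) if site_sets else set()


def isoform_specific_sites(isoforms: dict[str, str], name: str,
                           k: int = 19) -> set[str]:
    """Candidate sites found only in the named isoform."""
    others = set().union(
        *(kmers(seq, k) for n, seq in isoforms.items() if n != name)
    )
    return kmers(isoforms[name], k) - others


if __name__ == "__main__":
    # Toy sequences and a short k, purely for illustration.
    toy = {"isoform_a": "AAACCCGGG", "isoform_b": "TTTCCCGGG"}
    print(sorted(shared_sites(toy, k=4)))
    print(sorted(isoform_specific_sites(toy, "isoform_a", k=4)))
```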
Thermodynamic features of functional siRNAs
Thermodynamic properties of siRNAs influence their
potency, predominantly by factors relating to RISC loading.
Effective siRNAs have a relatively lower duplex stability
(Tm; less stable, more A/U rich) toward the 5'-end of the
strand that remains in RISC (the 'guide strand') and a
relatively higher Tm (more stable, more G/C rich) toward
the 5'-end of the degraded or ejected strand (the 'passenger
strand') [9•]. Thermodynamic asymmetry is a fundamental
property present in functional siRNAs and microRNAs
(miRNAs) and directly influences strand loading during
RISC formation [10-12]. Most design rules exploit
thermodynamic asymmetry by selecting naturally occurring
sites within a gene sequence that conform to the desired
pattern. However, it may be necessary occasionally to target
a specific location even though the natural base sequence is
thermodynamically unfavorable. It is possible to artificially
manipulate thermodynamic asymmetry by introducing
mismatches (to lower Tm) or by placing chemically modified
bases (to increase Tm) in the siRNA duplex. Non-
complementary bases should be introduced at the 3'-end of
the sense strand (passenger strand) to lower stability of the
5'-end of the AS strand (guide strand), without impairing
ability of the AS strand to anneal to the native target [13-15].
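The asymmetry idea can be illustrated with a crude proxy. The sketch below does not compute Tm or nearest-neighbor free energies; it simply counts A/U bases in a terminal window at each end of the 19-base core, on the assumption that A/U-rich ends are less stable. The function names and the four-base window are assumptions made for illustration. Because the 5'-end of the guide strand pairs with the 3'-end of the sense (passenger) strand, the guide's 5'-end corresponds to the last bases of the sense core.

```python
def au_count(seq: str) -> int:
    """Number of A or U bases in seq."""
    return sum(base in "AU" for base in seq)


def asymmetry_score(sense_core: str, window: int = 4) -> int:
    """Crude strand-asymmetry proxy for a 19-nt sense-strand core (5'->3').

    Returns (A/U count at the duplex end holding the guide 5'-end) minus
    (A/U count at the duplex end holding the passenger 5'-end).
    Positive values suggest the guide 5'-end is the less stable end,
    which is the pattern described as favorable in the text.
    """
    core = sense_core.upper().replace("T", "U")
    if len(core) < 2 * window:
        raise ValueError("core is too short for the chosen window")
    guide_5p_end = core[-window:]      # pairs with the 5'-end of the guide
    passenger_5p_end = core[:window]   # 5'-end of the passenger strand
    return au_count(guide_5p_end) - au_count(passenger_5p_end)


if __name__ == "__main__":
    favorable = "GCGC" + "A" * 11 + "UUAU"
    unfavorable = "UUAU" + "A" * 11 + "GCGC"
    print(asymmetry_score(favorable), asymmetry_score(unfavorable))
```

A score like this could be used to rank candidate sites, but a real design tool would normally replace the base count with a thermodynamic calculation.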
In general, functional siRNAs tend to have a moderate to low
GC content (30 to 52%). Regions of high local GC content
may be prone to problems with secondary structure
formation (see below) and may also have reduced turnover
rates in RISC. Therefore, areas with high GC content
should be avoided when possible.
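The GC-content guideline is straightforward to apply as a filter. The following sketch assumes the 30 to 52% range quoted above is applied to the 19-base core and treats both limits as inclusive; the function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 to 1.0)."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def in_gc_window(seq: str, low: float = 0.30, high: float = 0.52) -> bool:
    """True if the GC fraction lies within the range quoted in the text."""
    return low <= gc_fraction(seq) <= high


if __name__ == "__main__":
    candidates = [
        "G" * 8 + "A" * 11,   # 8/19 GC
        "G" * 15 + "A" * 4,   # 15/19 GC
        "G" * 4 + "A" * 15,   # 4/19 GC
    ]
    for site in candidates:
        print(site, round(gc_fraction(site), 2), in_gc_window(site))
```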
Site-specific base composition and motifs
Nucleotide biases have been identified at several positions
within the core region of the guide strand, some of which
may simply reflect requirements for thermodynamic
80. Robbins MA, Li M, Leung I, Li H, Boyer DV, Song Y, Behlke MA, Rossi
JJ: Stable expression of shRNAs in human CD34+ progenitor cells
can avoid induction of interferon responses to siRNAs in vitro. Nat
Biotechnol (2006) 24(5):566-571.
81. Heidel JD, Hu S, Liu XF, Triche TJ, Davis ME: Lack of interferon
response in animals to naked siRNAs. Nat Biotechnol (2004)
82. Kariko K, Buckstein M, Ni H, Weissman D: Suppression of RNA
recognition by toll-like receptors: The impact of nucleoside
modification and the evolutionary origin of RNA. Immunity (2005)
• This study first identified the spectrum of base modifications that enable
RNA sequences to evade detection by receptors of the innate immune system.
83. Judge AD, Bola G, Lee AC, MacLachlan I: Design of noninflammatory
synthetic siRNA mediating potent gene silencing in vivo. Mol Ther
• This study first identified the minimal modification patterns necessary for
2'-O-methyl modified RNAs to evade detection by receptors of the innate
immune system.
84. Elbashir SM, Harborth J, Weber K, Tuschl T: Analysis of gene
function in somatic mammalian cells using small interfering RNAs.
Methods (2002) 26(2):199-213.
85. Holen T, Amarzguioui M, Wiiger MT, Babaie E, Prydz H: Positional
effects of short interfering RNAs targeting the human coagulation
trigger tissue factor. Nucleic Acids Res (2002) 30(8):1757-1766.
86. Reynolds A, Leake D, Boese Q, Scaringe S, Marshall WS, Khvorova A:
Rational siRNA design for RNA interference. Nat Biotechnol (2004)
87. Ui-Tei K, Naito Y, Takahashi F, Haraguchi T, Ohki-Hamazaki H, Juni A,
Ueda R, Saigo K: Guidelines for the selection of highly effective
siRNA sequences for mammalian and chick RNA interference.
Nucleic Acids Res (2004) 32(3):936-948.
88. Amarzguioui M, Prydz H: An algorithm for selection of functional
siRNA sequences. Biochem Biophys Res Commun (2004)
89. Hsieh AC, Bo R, Manola J, Vazquez F, Bare O, Khvorova A, Scaringe
S, Sellers WR: A library of siRNA duplexes targeting the
phosphoinositide 3-kinase pathway: Determinants of gene
silencing for use in cell-based screens. Nucleic Acids Res (2004)
90. Takasaki S, Kotani S, Konagaya A: An effective method for selecting
siRNA target sequences in mammalian cells. Cell Cycle (2004)
91. Poliseno L, Evangelista M, Mercatanti A, Mariani L, Citti L, Rainaldi G:
The energy profiling of short interfering RNAs is highly predictive
of their activity. Oligonucleotides (2004) 14(3):227-232.
92. Saetrom P, Snove O Jr: A comparison of siRNA efficacy predictors.
Biochem Biophys Res Commun (2004) 321(1):247-253.
93. Chalk AM, Wahlestedt C, Sonnhammer EL: Improved and automated
prediction of effective siRNA. Biochem Biophys Res Commun (2004)
94. Huesken D, Lange J, Mickanin C, Weiler J, Asselbergs F, Warner J,
Meloon B, Engel S, Rosenberg A, Cohen D, Labow M et al: Design of a
genome-wide siRNA library using an artificial neural network.
Nat Biotechnol (2005) 23(8):995-1001.
• This paper presents the largest data set (to date) of functional siRNAs for
structure-function analysis and utilizes an artificial neural network method of
machine learning to predict active siRNA sites in new target sequences.
95. Huesken D, Lange J, Mickanin C, Weiler J, Asselbergs F, Warner J,
Meloon B, Engel S, Rosenberg A, Cohen D, Labow M et al:
Corrigendum: Design of a genome-wide siRNA library using an
artificial neural network. Nat Biotechnol (2005) 23(10):1315.
96. Ge G, Wong GW, Luo B: Prediction of siRNA knockdown efficiency
using artificial neural network models. Biochem Biophys Res
Commun (2005) 336(2):723-728.
97. Jagla B, Aulner N, Kelly PD, Song D, Volchuk A, Zatorski A, Shum D,
Mayer T, De Angelis DA, Ouerfelli O, Rutishauser U, Rothman JE:
Sequence characteristics of functional siRNAs. RNA (2005)
98. Teramoto R, Aoki M, Kimura T, Kanaoka M: Prediction of siRNA
functionality using generalized string kernel and support vector
machine. FEBS Lett (2005) 579(13):2878-2882.
99. Jia P, Shi T, Cai Y, Li Y: Demonstration of two novel methods for
predicting functional siRNA efficiency. BMC Bioinformatics (2006)
100. Shabalina SA, Spiridonov AN, Ogurtsov AY: Computational models
with thermodynamic and composition features improve siRNA
design. BMC Bioinformatics (2006) 7:65.
101. Holen T: Efficient prediction of siRNAs with siRNA rules 1.0: An
open-source JAVA approach to siRNA algorithms. RNA (2006)
102. Vert JP, Foveau N, Lajaunie C, Vandenbrouck Y: An accurate and
interpretable model for siRNA efficacy prediction. BMC
Bioinformatics (2006) 7:520.
103. Gong W, Ren Y, Xu Q, Wang Y, Lin D, Zhou H, Li T: Integrated siRNA
design based on surveying of features associated with high RNAi
effectiveness. BMC Bioinformatics (2006) 7:516.
104. Ladunga I: More complete gene silencing by fewer siRNAs:
Transparent optimized design and biophysical signature. Nucleic
Acids Res (2007) 35(2):433-440.