
Computational Evidence for a Viral Encoded miRNA in the Intergenic, Untranslated Region of the Zaire Ebolavirus Nucleoprotein Gene May Explain Its Differential Virulence: a Potential Pandora Element

Authors:
  • Hale O'mana'o Biomedical Research

Abstract and Figures

Ebolavirus is one of the most virulent of human viral pathogens, killing 25-90% of patients within 6-16 days (EBOV and SUDV) resulting from complete disruption of the host immune response. Despite the relatively minor genetic differences between the species, the case fatality rate (CFR) is significantly different. The Zaire Ebolavirus (EBOV) historically is the most lethal, with its sporadic outbreaks resulting in a CFR ranging from 60% (Minkébé, Gabon 1994, Gueckedou 2014) to 90% (Kéllé, Congo 2003). This is in sharp contrast to Reston ebolavirus (RESTV) which, despite its translational similarity to EBOV and serologic evidence of human transmission, has yet to result in even minor symptoms of the disease. The viral protein, VP35, has been shown to be a multifunctional virulence factor by antagonizing anti-viral signalling pathways via its interferon inhibitory domain (IID). However, it is unlikely the VP35 pathways could fully account for the observed differential virulence between the species. Viral encoded miRNAs have already been identified in many species and shown to modulate the expression and antiviral function of interferons (IFNs). This is the first paper to summarize the computational identification of a likely miRNA, termed the Pandora Element, within the intergenic, untranslated region of the Zaire ebolavirus nucleoprotein gene that appears to target multiple human transcripts. A structural characteristic of the RESTV Pandora domain will also be introduced that may also account for its differential virulence.
Robert Ricketson, Hale O’mana’o Biomedical Research, Wichita, Kansas, USA; SM
Christensen, University of Texas-Arlington, Arlington, Texas, USA
Citation. Robert A Ricketson; SM Christensen. Computational Evidence for a Viral
Encoded miRNA in the Intergenic, Untranslated Region of the Zaire Ebolavirus
Nucleoprotein Gene May Explain Its Differential Virulence: a Potential Pandora Element
Introduction
Inarguably, the genus Ebolavirus is among the most virulent of human viral pathogens,
killing 25-90% of patients within 6-16 days (EBOV and SUDV) following an
incubation period of 2-21 days.
According to the World Health Organization's International Statistical Classification
of Diseases and Related Health Problems, Ebola Virus Disease (EVD) refers to the
human disease known to be caused by 4 of the 5 species of the viral genus Ebolavirus.
They include Zaire ebolavirus (EBOV), Sudan ebolavirus (SUDV), Bundibugyo ebolavirus
(BDBV), and Taï Forest ebolavirus (TAFV). Reston ebolavirus (RESTV) has not yet
been found to cause human disease. Lloviu cuevavirus (LLOV), a recent addition to the
Filoviridae family, will also be included in the discussion later for structural reasons. It is
not yet considered to be in the Ebola taxon.
Ebola Virus Disease (EVD) is rare, marked by fulminant and sporadic outbreaks in
equatorial Africa, and is transmitted from person to person via infected bodily fluids
such as feces, saliva, vomit, blood, and sperm. EVD is characterized by acute onset of
fever, malaise, myalgia, headache, and pharyngitis followed by vomiting, diarrhea,
maculopapular rash, renal and hepatic involvement and hemorrhagic diathesis. The
incubation period ranges between 2 and 21 days.
The initial reported outbreak involved the Zaire ebolavirus (EBOV), strain Mayinga,
and occurred during September of 1976 in the northern region of Zaire in equatorial
Africa. Most of the patients were treated for malaria, and standard universal precautions
at the Yambuku Mission Hospital were virtually non-existent. By the end of October,
there were 318 cases and 280 deaths (38 serologically confirmed survivors), for a
final case fatality rate (CFR) of 88%. The CFR is defined as the ratio of deaths within a
designated population of people with a particular condition over a certain period of time.
During that same year, another strain emerged in Nzara, Sudan, classified as
Sudan ebolavirus. Also extremely lethal, it had a case fatality rate of 68% (see tables
1 and 2). Over the past 35 years, two additional African strains of Ebolavirus have
emerged, Taï Forest (TAFV) and Bundibugyo (BDBV), and one from the Western Pacific
(Philippines), the Reston ebolavirus (RESTV), but so far these latter strains have not
approached the human lethality of EBOV or SUDV.
The Ebolavirus outbreak of 2014 in West Africa is unprecedented. As of March 18, 2015,
there were 24,666 reported cases with 10,179 deaths in this outbreak. With the calculated
CFR of 61% (based on the best-reported confirmed cases, from Guinea), the true number of
deaths may be closer to 15,046, meaning almost 5,000 unaccounted-for bodies. The
post-mortem transmissions recently reported in Guinea (see below) are consistent with
that estimate. Of the 49 total reported EVD deaths in the week of March 15, 2015, almost
half (23) were identified post-mortem in the community. 28% of confirmed cases arose from
registered contacts, and there were 18 reported unsafe burials. Taken together, these
indicators suggest that the outbreak in Guinea is still being driven by unknown chains of
transmission.
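The CFR figures quoted above follow from simple arithmetic, which can be checked with a short sketch. The function names are illustrative only, and applying the 61% Guinea CFR to all reported cases is the assumption made in the text, not an independent estimate.

```python
def case_fatality_rate(deaths: int, cases: int) -> float:
    """Return the CFR as a fraction: deaths divided by cases."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return deaths / cases


def expected_deaths(cases: int, cfr: float) -> int:
    """Deaths implied by applying a CFR to a reported case count."""
    return round(cases * cfr)


# Yambuku 1976: 280 deaths among 318 cases.
yambuku_cfr = case_fatality_rate(280, 318)

# West Africa as of March 18, 2015: 24,666 cases and 10,179 reported deaths,
# with the 61% CFR quoted in the text applied to all reported cases.
implied = expected_deaths(24_666, 0.61)
unaccounted = implied - 10_179

print(f"Yambuku CFR: {yambuku_cfr:.0%}")
print(f"Implied deaths: {implied}; unaccounted for: {unaccounted}")
```

Under these assumptions the sketch reproduces the 88% Yambuku CFR and the estimate of roughly 15,046 deaths, about 4,900 more than were reported.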
However, it must be noted that the virulence of the Ebolavirus species has been shown
to change through only a few passages. It has been demonstrated that Ebola is capable
of infecting an extensive list of hosts [Van der Groen et al.,
1978]. The opportunity for the Reston species to become as virulent as the Zaire or
Sudan species by rapid adaptation is, therefore, quite plausible. The minor genetic
differences in sequence, as alluded to earlier, have resulted in significant differences in
the CFR during outbreaks, suggesting there may be some other factor not yet identified
to account for these discrepancies.
Review of Current Model of Pathogenesis of Ebolavirus
Over the past several years, there have been major advances in our knowledge of
how the virus is able to disrupt the human immune system. In order for the virus to
maintain its life cycle, it must be able to modify the host's pathogen pattern recognition
receptors (PRRs), such as the membrane-bound Toll-like receptors (TLRs) and
cytoplasmic RIG-I, that activate a signal transduction cascade. This
cascade results in the production of type I IFNs through host IFN regulatory factors
such as IRF-3, IRF-7 and NFκB, in addition to the induction of antiviral gene promoters.
In general, early in infection with a pathogen, components of the innate immune
system sense pathogen-associated molecular patterns (PAMPs). PAMPs are small
molecular motifs conserved within a species, such as bacterial flagellin, LPS, and
nucleic acid variants such as viral dsRNA. When PAMPs are detected, the adaptive arms of
the immune system are signaled to respond. For example, viral dsRNA is a unique PAMP
and identifiable by the host PRRs during the course of infection. This viral dsRNA is
detected in the cytoplasm of an infected cell by viral dsRNA receptors such as retinoic
acid-inducible gene I (RIG-I) [ref here] and melanoma differentiation-associated gene
5 (MDA-5) [ref here]. Host RIG-I and MDA-5 then initiate signaling cascades that
activate interferon regulatory factor 3 (IRF-3), which then leads to the production of
interferon α/β (IFN α/β).
The RIG-I motif identifies blunt-end dsRNAs that carry a 5’ ppp and are longer than 23
bp. Greater affinity for 5’ overhangs was noted as compared to 3’ overhangs [The
Chase for the RIG-I Ligand-Recent Advances; Molecular Therapy (2010) 18 7, 1274-1262,
M Schlee]. RIG-I is activated by blunt-ended double-stranded (ds)RNA with or without a
5'-triphosphate (ppp), by single-stranded RNA marked by a 5'-ppp and by polyuridine
sequences. Upon binding to such PAMP motifs, RIG-I initiates a signaling cascade that
induces innate immune defenses and inflammatory cytokines to establish an antiviral
state [Jiang, Structural basis of RNA recognition and activation by innate immune
receptor RIG-I.].
VP35 has been identified as an innate immune antagonist of IFN [13,14],
an essential cofactor of the viral RNA polymerase complex [15], a protein required for
viral assembly [16], and an RNAi silencing suppressor.
IFN antagonism by VP35 is accomplished by binding dsRNA in a manner similar
to the influenza virus NS1 protein dsRNA binding domain. The influenza virus is capable
of inhibiting IFN through a dsRNA binding domain within the viral NS1 protein. In
ebolavirus, the VP35 COOH terminus was reported to contain a conserved amino acid
sequence similar to a short stretch of amino acids conserved at the NH2 terminal
dsRNA binding domain of the influenza A virus NS1 protein. This led to the
identification of an IFN inhibitory domain (IID) in VP35.
The VP35 dsRNA binding domain (RBD) forms an asymmetric dimer in the
dsRNA-complex structure, with one dimer bound to each end of an 18-bp dsRNA.
Within each dimer, one VP35 RBD binds the phosphate backbone of the dsRNA (the
“backbone-binding” VP35), whereas the other VP35 RBD of the pair binds the terminal
nucleotides of the dsRNA, thereby “end-capping” the dsRNA (Figure 1). The binding of
dsRNA within the central basic patch of VP35 is independent of sequence.
Figure 1. Conserved basic residues in VP35 IID recognize the dsRNA backbone and the dsRNA blunt ends are "end-capped" by
a pocket of hydrophobic residues that mimic RIG-I-like receptors of blunt-end dsRNA. [Leung, Structural Basis for dsRNA
recognition and interferon antagonism by Ebola VP35, Nature Structural & Molecular Biology; 2009]
However, a comparative amino acid alignment of VP35 from all species of ebolavirus
does not appear to answer the question of differential virulence between the species
(Figure 2). The interacting residues reported in previously published studies
are actually highly conserved.
Figure 2. Comparative amino acid alignment of VP35. The COOH-terminus is highly conserved across all species of
Ebolavirus.
Viral Encoded miRNA
MicroRNAs (miRNAs) are small RNA molecules encoded in the genomes of
plants, animals, and some viruses. These 19-25 mer mature RNAs are conserved in
many species but not necessarily in viruses. Mature miRNAs regulate the expression of
genes by binding to the 3'-untranslated regions (3'-UTR) of specific target host mRNAs
(Ambros et al 2003). By base pairing with these target mRNAs, they lead
to translational repression when base pairing is incomplete or to cleavage when base
pairing is complete. There is evidence that mature miRNAs may act as
regulators of processes such as organ development (Reinhart 2000), cell proliferation
and cell death (Brennecke 2003), apoptosis, fat metabolism (Xu 2003), cell
differentiation (Dostie 2003, Chen 2003), and viral infection (Pfeffer 2004).
Mature miRNAs are initially expressed as primary miRNAs, termed pri-miRNAs
(Lee 2002) (Figures 3-4). During the formation of the initial pri-miRNA transcript, the
miRNA forms a hairpin-like structure with 5’ triphosphate caps and a 3’ end that is
polyadenylated (Smalheiser 2003). Pri-miRNAs are generated by RNA polymerase II (pol
II) in all eukaryotes or by RNA polymerase III (pol III) in some viruses [7].
In order for the miRNA to be released from the pri-miRNA transcript, the
canonical pathway suggests the hairpin must be recognized by the ribonuclease
Drosha/DGCR8 complex in the nucleus (Lee 2003). Recent research shows that siRNAs
transfected into cells are primarily intercepted by Dicer, which also recognizes 2-nt 3’
overhangs, but the efficiency of capture of different siRNAs is highly variable and
dependent on the nature of the 3’ overhang sequence [1,2]. This results in the formation
of an approximately 70-nt pre-miRNA with 1-4 nt 3’ overhangs, 25-30 bp stems, and a
10 nt terminal loop (Lee 2003, Yi 2003).
Pre-miRNA is then exported from the nucleus to the cytoplasm by binding to
Exportin-5 (Exp5) (Lund 2003, Yi 2003). Precursor miRNAs (pre-miRNAs) may also be
generated independently of Drosha/DGCR8 following debranching of lariat
structures known as mirtrons, through alternative folding of transfer RNAs
(tRNAs) or small nucleolar RNAs (snoRNAs), or by tRNase Z cleavage of pri-miRNAs
containing tRNA-like structures linked to pre-miRNA stem-loops. These latter
mechanisms, particularly those involving Dicer and leading to the cytoplasmic formation
of pre-miRNA, may be of more interest. Ebolavirus infection is cytoplasmic, and
pre-miRNA development must therefore occur independently of nuclear Drosha. The RdRp L
gene of Zaire ebolavirus contains conserved residues of S-adenosylmethionine-dependent
methyltransferases (SAM or AdoMet-MTase). Methyltransferases appear to be involved
in viral RNA capping (Chan 2009, 6) and RNAi.
Early examples of RNAi were triggered by exogenous dsRNA. In these cases, long,
exogenous dsRNA is cleaved into double-stranded siRNAs by Dicer (Dcr), a
dsRNA-specific RNase III family ribonuclease [7]. The ribonuclease Dicer cleaves
double-stranded RNA (dsRNA) into small interfering RNA (siRNA) and microRNA precursors
(pre-miRNA) into microRNA (miRNA). Human Dicer belongs to the RNase III class and
contains an N-terminal DExH-box RNA helicase-like domain, a domain originally termed
the domain of unknown function (DUF283), a PAZ domain, two RNase III domains, and a
double-stranded RNA-binding domain (dsRBD) [3]. The dsRNA processing center of
Dicer is formed through intramolecular dimerization of two RNase III domains
functioning together to cleave phosphodiester bonds on opposite strands of a dsRNA
substrate [4]. The RNase IIIa domain cleaves the 3’ arm of pre-miRNA and the RNase IIIb
domain cleaves its 5’ arm, and both domains must exhibit their activity to generate a
miRNA-miRNA duplex.
Once processed by Dicer in the cytoplasm, the mature miRNA with a 2 nt 3’
overhang is incorporated, along with Dicer, into the RNA Induced Silencing Complex
(RISC), which includes Dicer, Argonaute, TRBP, and TNRC6B. Along with the miRNA, the
complex can then target the host or viral mRNA to cause translational repression.
Figure 3. Theoretical model of Ebola miRNA
A mechanism that triggers RNAi is related to the presence of viral genomes or
transposable elements and the generation of siRNA molecules (~19-21 nt). The processing
of secondary RNA structures in the viral RNA genome or dsRNA replication
intermediates by the RNase III enzyme Dicer generates these siRNAs. After Dicer
processing, siRNA molecules follow the same pathway as miRNAs in the RISC complex. Once
in the RISC complex, siRNAs can result in translational repression of the target mRNA if
partially complementary to the target mRNA or promote cleavage and degradation if
completely complementary.
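The rule described above (complete complementarity leads to cleavage, partial complementarity to translational repression) can be illustrated with a toy classifier. This is a minimal sketch, not a target prediction method: the function names are hypothetical, only Watson-Crick pairs are counted, and the `min_partial` cut-off for calling an interaction "partial" is an arbitrary assumption.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def paired_fraction(small_rna: str, target_site: str) -> float:
    """Fraction of positions that form Watson-Crick pairs.

    Both sequences are written 5'->3'. The small RNA pairs antiparallel to
    the target, so the target site is reversed before comparison.
    """
    small_rna, target_site = small_rna.upper(), target_site.upper()
    if not small_rna or len(small_rna) != len(target_site):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = zip(small_rna, reversed(target_site))
    return sum(pair in WATSON_CRICK for pair in pairs) / len(small_rna)


def predicted_outcome(small_rna: str, target_site: str,
                      min_partial: float = 0.5) -> str:
    """Map complementarity to the outcome described in the text.

    min_partial is a hypothetical threshold chosen only for illustration.
    """
    fraction = paired_fraction(small_rna, target_site)
    if fraction == 1.0:
        return "cleavage"
    if fraction >= min_partial:
        return "translational repression"
    return "no predicted interaction"


# A fully complementary site and a site with one mismatch.
print(predicted_outcome("AAGU", "ACUU"))
print(predicted_outcome("AAGU", "ACUA"))
```

Real target prediction tools weigh seed pairing, site context and free energy, none of which is modeled here.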
Figure 4. microRNA (miRNA) biogenesis pathways. Precursor miRNAs (pre-miRNAs) can be
generated (i) by Drosha/DGCR8 cleavage of long primary miRNAs (pri-miRNAs) or,
independently of Drosha/DGCR8, (ii) following debranching of lariat structures known as
mirtrons, (iii) through alternative folding of transfer RNAs (tRNAs) or small nucleolar
RNAs (snoRNAs), (iv) by tRNase Z cleavage of pri-miRNAs containing tRNA-like structures
linked to pre-miRNA stem-loops, or (v) by cleavage in the cytoplasm by Dicer. Once
exported to the cytoplasm, pre-miRNAs from the nuclear Drosha or alternate pathways are
identified and cleaved by Dicer to generate a miRNA duplex, one strand of which is
incorporated into the RNA-induced silencing complex (RISC) to target cellular messenger
RNAs (mRNAs) (from Viruses, microRNAs, and Host Interactions, Rebecca L. Skalsky and Bryan R. Cullen)
Once the processed viral miRNA is in the cytoplasm, Dicer cleaves the pre-miRNA
approximately 19 bp from the Drosha cut site (Lee 2003, Yi 2003). The resulting
double-stranded RNA has 1-4 nt 3' overhangs at either end (Lund 2003). Only one of the
two strands is the mature miRNA; some mature miRNAs derive from the leading strand of
the pri-miRNA transcript, while for other miRNAs the lagging strand is the mature
miRNA. The double-stranded mature miRNA produced by Dicer must separate to
associate with the RISC (Hutvagner 2002). The active strand selection is based primarily
upon the stability of the termini of the two ends of the dsRNA (Schwarz 2003, Khvorova
2003). The strand with lower stability base pairing at the 5' end (2-4 nt) associates with
RISC and thus becomes the active miRNA (Schwarz 2003).
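The strand selection rule described above can be sketched in a few lines. This is an illustration only: it scores the terminal base pairs by hydrogen-bond count as a crude stand-in for thermodynamic stability, which is an assumption of this sketch and not the measure used in the cited studies. The function names and the example duplex are hypothetical, and both strands are assumed to be given 5'->3' with overhangs already trimmed.

```python
# Hydrogen bonds per base pair, used as a rough proxy for pairing stability.
# (Assumption: a real analysis would use nearest-neighbor free energies.)
H_BONDS = {
    frozenset("GC"): 3,
    frozenset("AU"): 2,
    frozenset("GU"): 2,  # wobble pair
}


def end_stability(strand: str, partner: str, n: int = 4) -> int:
    """Score the n base pairs at the 5' end of `strand`.

    Both strands are 5'->3' and of equal length, so position i of `strand`
    faces position i of the reversed partner.
    """
    if len(strand) != len(partner):
        raise ValueError("strands must have equal length")
    facing = partner[::-1]
    return sum(H_BONDS.get(frozenset((a, b)), 0)
               for a, b in zip(strand[:n].upper(), facing[:n].upper()))


def select_active_strand(strand_a: str, strand_b: str, n: int = 4):
    """Return the strand whose 5' end is less stably paired, or None on a tie."""
    score_a = end_stability(strand_a, strand_b, n)
    score_b = end_stability(strand_b, strand_a, n)
    if score_a == score_b:
        return None
    return strand_a if score_a < score_b else strand_b


# Toy duplex: strand_a has an A/U-rich 5' end, strand_b a G/C-rich 5' end.
strand_a = "AUAUCGCG"
strand_b = "CGCGAUAU"
print(select_active_strand(strand_a, strand_b))
```

In this toy duplex the A/U-rich 5' end scores lower, so `strand_a` would be the one expected to associate with RISC under the stated rule.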
Results
Prediction of viral miRNAs has been reported, and several virus families have been
shown to encode more than 225 miRNAs in total. These include herpesviruses,
polyomaviruses, and adenoviruses. To date, no viral miRNAs have been identified in human
papilloma virus (HPV), hepatitis C virus (HCV), yellow fever virus, or human T-cell
leukemia virus I infected cells using standard sequencing (18, 56, 72) or in HPV-, HCV-,
cowpox virus, polio virus, vesicular stomatitis virus (VSV), West Nile virus, dengue
virus, or influenza virus-infected cells using deep sequencing techniques (60, 70a; R.L.
Skalsky, J.L.). Additionally, since Ebolavirus replicates in the cytoplasm, formation of
pre-miRNA would of necessity occur through a mechanism independent of nuclear Drosha.
The Vir-Mir database (http://140.109.42.4/cgi-bin/miRNA/miRNA.cgi), constructed
and maintained by the Transcriptome Discovery Lab (TDL), Institute of BioMedical
Science, Academia Sinica, Taipei, Taiwan, is a database containing predicted viral miRNA
candidate hairpins. For the ssRNA negative-strand viruses, the group that contains the
Filoviridae family, 167 candidates are listed (Table 1).
Table 1. Vir-Mir Database of Viral miRNA Candidates

Virus group                                    Candidates
Deltavirus                                     0
dsDNA viruses, no RNA stage                    26339
dsRNA viruses                                  379
Environmental samples                          0
Retro-transcribing viruses                     386
Satellites                                     59
ssDNA viruses                                  410
ssRNA negative-strand viruses                  167
ssRNA positive-strand viruses, no DNA stage    2325
Unclassified viruses                           374

(Table 1: current list of candidate viral miRNAs maintained at the Vir-Mir database)
All known full length nucleotide sequences of EBOV (accession AY142960, AF086833,
AF272001, EY224440, AY354458, and EF490229), SUDV (EU338380, FJ968794, and
AY729654), BDBV (FJ217161), TAFV (FJ217162), RESTV (AF522874, AB050936,
FJ21583, FJ21584, and FJ21585), and LLOV (JF828358) were downloaded from the NCBI
database and aligned with representative CDS sequences of nucleoprotein, VP35, VP30,
GLY, VP40, VP24 and RdRp in Jalview. The 5’ UTR from the genomic sequence was
identified following translation of the region 1-4000 in six frames using ExPASy. The
length of the noncoding region between the stop codon (UGA/UAA) of the nucleoprotein CDS
and the predicted CDS start for VP35 was approximately 340 bases. The nucleotide
sequences from position 1 to the AUG start codon of the VP35 CDS from EBOV, SUDV, RESTV,
TAFV, BDBV, and LLOV were submitted for BLASTn search using discontinuous BLAST
parameters. All sequences were aligned in CLUSTALW format and a phylogenetic tree
was then obtained for each of the above sequences in Newick format.
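The six-frame translation step can be sketched as follows. This is a stand-in written for illustration, not the ExPASy tool itself: it assumes the standard genetic code, treats U as T, and marks any codon it cannot resolve as "X". The function names and the short example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq: str) -> dict:
    """Return translations for frames +1..+3 and -1..-3."""
    seq = seq.upper().replace("U", "T")
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


# Short made-up sequence, used only to show the output layout.
for frame, protein in six_frame_translation("AUGGCCUAA").items():
    print(frame, protein)
```

Scanning each frame for the stop codon that ends one CDS and the start codon of the next gives the boundaries of the intervening noncoding region.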
Alignment
Representative sequences from EBOV, SUDV, BDBV, TAFV, and RESTV were aligned and
visualized with Clustal in Jalview (FIGURE 5). The portion of the sequence
corresponding to the hairpin full length sequence was isolated for comparative analysis
using MEGA6.
Figure 5. Representative alignment of the intergenic sequence within all species of Ebolavirus. The specific hairpin sequence
is isolated within the bold blue box. Alignment performed using CLUSTAL and visualized with Jalview.
The unaligned nucleotide composition demonstrated no significant differences among
EBOV outbreaks (Table 2), with CG% of 30% and adenine 37%. SUDV CG% ranged from
30-32%, adenine 35-37%, and uracil 32-33%. BDBV and TAFV had the highest CG% of 50%
with the majority being guanine substitutions. RESTV CG% was similar to EBOV at
32-34%, with adenine 35% and uracil 31-32%.
Upon inspection of the alignment, however, clusters of guanine
and cytosine substitutions were noted at the 5’ and 3’ ends of the sequences in RESTV,
BDBV, and TAFV that required further study as to the potential for pseudoknot formation
(FIGURE 6).
TABLE 2. Nucleotide composition (unaligned)
Identity = species, strain, outbreak year / sequence length

Identity                               A     U     G     C     CG%   %A    %U    %G    %C
Zaire Mayinga 1976/1-342               125   113   50    52    30%   37%   33%   15%   15%
Zaire Gabon 1994/1-342                 127   112   48    53    30%   37%   33%   14%   15%
Zaire Kikwit 1995/1-342                126   111   49    54    30%   37%   32%   14%   16%
Zaire Gabon 1996/1-342                 127   112   48    53    30%   37%   33%   14%   15%
Zaire DRC 2014/1-343                   127   111   48    54    30%   37%   32%   14%   16%
Zaire Gueckedou 2014/1-342             125   114   50    51    30%   37%   33%   15%   15%
Zaire Kissidougou 2014/1-342           125   114   50    51    30%   37%   33%   15%   15%
Zaire Mali DPR-4 2014/1-342            125   114   50    51    30%   37%   33%   15%   15%
Zaire Mali DPR-3 2014/1-342            125   114   50    51    30%   37%   33%   15%   15%
Zaire Liberia 2014/1-342               125   114   50    51    30%   37%   33%   15%   15%
Zaire Sierra Leone NM042.3 2014/1-342  125   114   50    51    30%   37%   33%   15%   15%
Zaire Sierra Leone 2014/1-343          125   114   50    51    30%   37%   33%   15%   15%
Sudan Boniface 1977/1-339              120   109   56    52    32%   35%   32%   17%   15%
Sudan Maleo 1979/1-339                 120   110   56    51    32%   35%   32%   17%   15%
Sudan Gulu 2000/1-339                  124   112   56    45    30%   37%   33%   17%   13%
Sudan Nakisamata 2011/1-339            123   113   56    45    30%   36%   33%   17%   13%
Sudan EboSud-609 2012/1-339            123   113   56    45    30%   36%   33%   17%   13%
Bundibugyo Uganda 2008/1-330           108   53    41    125   50%   33%   16%   12%   38%
Bundibugyo EboBund-14 2012/1-330       106   57    42    122   50%   32%   17%   13%   37%
Tai Forest 1994/1-343                  124   48    42    128   50%   36%   14%   12%   37%
Reston Pennsylvania 1989/1-336         116   106   50    62    33%   35%   32%   15%   18%
Reston 08-A 2008/1-336                 118   109   49    58    32%   35%   32%   15%   17%
Reston 08-C 2008/1-336                 117   104   48    65    34%   35%   31%   14%   19%
Reston 08-E 2008/1-336                 119   103   47    65    33%   35%   31%   14%   19%
Reston 09A Farm A 2009/1-336           117   109   50    58    32%   35%   32%   15%   17%
Figure 6. The clusters of guanine and cytosine substitutions that were later identified to be involved in pseudoknot
formation are enclosed in yellow and joined with their associated bases.
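The per-sequence values in the nucleotide composition table can be recomputed from any RNA sequence with a short sketch such as the one below. The function name is illustrative, and the example input is a synthetic string built from the Zaire Mayinga 1976 base counts in the table, not the actual viral sequence.

```python
from collections import Counter


def nucleotide_composition(seq: str) -> dict:
    """Return base counts, CG% and per-base percentages for an RNA sequence.

    T is treated as U; characters other than A, U, G and C are ignored.
    """
    seq = seq.upper().replace("T", "U")
    counts = Counter(base for base in seq if base in "AUGC")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, U, G or C")
    result = {base: counts[base] for base in "AUGC"}
    result["CG%"] = 100 * (counts["G"] + counts["C"]) / total
    for base in "AUGC":
        result[f"%{base}"] = 100 * counts[base] / total
    return result


# Synthetic sequence with the Zaire Mayinga 1976 counts: A=125, U=113, G=50, C=52.
mayinga_like = "A" * 125 + "U" * 113 + "G" * 50 + "C" * 52
composition = nucleotide_composition(mayinga_like)
print({key: round(value) for key, value in composition.items()})
```

Rounded to whole percentages, these counts give a CG% of 30%, adenine 37% and uracil 33%, in agreement with the tabulated row.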
Secondary Structure
With the identification of a regulatory element within the nucleoprotein gene, as
inferred from the tree network analysis of the nucleoprotein 5’ UTR of the genomic
strand, each sequence was submitted to Mfold for secondary structure prediction
(Figure 7). The default parameters were modified only in that the window size was set to
3, the ends were set as flat, and single strand was selected as the drawing mode. Using
the structure with the lowest minimum free energy (MFE), there were clear differences
between the species. The most virulent species, EBOV, demonstrated a well-defined stem
loop structure at the 5’ end of the minus strand and the 5’ end of the plus strand
following the stop codon of the nucleoprotein CDS.
The readily apparent observations were that (1) there appeared to be a conserved structure
in the region of the 5’ UTR in the genomic mRNA of the nucleoprotein gene, and (2) the
overhang contained a polyuridine section that was characteristic of U bodies and
RIG-I/VP35 binding regions. Additionally, with increasing virulence, (1) the stem loop
was more prevalent and longer, (2) its location was closer to the 5’ end, and (3) its
availability to function as a miRNA was increased.
Figure 7. Representative sequences from each species of Ebolavirus were submitted to Mfold for
secondary structure analysis. The hairpin structure of interest was readily visualized in EBOV but not in the less virulent
species.
Pseudoknot within 5’ UTR
Mfold is not capable of visualizing pseudoknots. To evaluate further for the
presence of a pseudoknot, each sequence from both the + and - strands of the UTR was
submitted to pknotsRG-mfe and RNAStructure. pknotsRG is a tool for folding RNA
secondary structures, including the class of simple recursive pseudoknots. The program
runs in O(n^4) time and O(n^2) space; its application on the BiBiServ is therefore limited
to sequences of up to 800 bases. The energy parameters for structures
containing no pseudoknots are the same as in Mfold 3.1. The energy for
pseudoknots is computed with a model similar to that used by Rivas & Eddy in pknots.
The folding temperature is fixed at 37 °C [Jens Reeder and Robert
Giegerich; Design, implementation and evaluation of a practical pseudoknot folding
algorithm based on thermodynamics, BMC Bioinformatics, 5:104, 2004; and Jens Reeder,
Peter Steffen and Robert Giegerich, pknotsRG: RNA pseudoknot folding including
near-optimal structures and sliding windows, Nucl. Acids Res., 35(suppl_2): W320-324, 2007].
RNAStructure utilizes free energy minimization for secondary structure prediction
[Mathews, D.H., Disney, M.D., Childs, J.L., Schroeder, S.J., Zuker, M. and Turner, D.H.
(2004) Incorporating chemical modification constraints into a dynamic programming
algorithm for prediction of RNA secondary structure. Proc. Natl. Acad. Sci. USA,
101:7287-7292]. After creating a partition function file [Mathews, D.H. (2004) Using an
RNA secondary structure partition function to determine confidence in base pairs
predicted by free energy minimization], the partition function file can be analyzed with
ProbKnot, a program within RNAStructure, for the presence of a pseudoknot [Bellaousov,
S., and Mathews, D. H. (2010) ProbKnot: fast prediction of RNA secondary structure
including pseudoknots. RNA. 16:1870-1880].
The images obtained from PseudoViewer failed to demonstrate the presence of a
definitive pseudoknot in EBOV (FIGURE 8) or SUDV. In contrast, the presence of an RNA
pseudoknot within the UTR of BDBV, TAFV, and RESTV was computationally visualized.
The more virulent strains EBOV and SUDV did not demonstrate a pseudoknot regardless
of the software utilized. Pseudoknots can be located in the 5' UTR, the 3' UTR, or the
coding region, and their localization influences their effect on translation; for example,
initiation, frameshifting and termination. The pseudoknots located at the 3' UTR have a
role in the switch between translation and replication of viral RNAs [Roberts (2009) RNA
structure: new messages in translation, replication and disease].
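Whichever program produces the predicted structure, the presence of a pseudoknot can be checked directly from the list of base pairs: a structure is pseudoknotted when two pairs (i, j) and (k, l) cross, that is, when i < k < j < l. The sketch below illustrates that test. It is not the algorithm used by pknotsRG or ProbKnot; the function names are hypothetical, and the dot-bracket input with square brackets for the second helix is an assumed notation.

```python
def pairs_from_dot_bracket(structure: str) -> list:
    """Extract 1-based base pairs from dot-bracket notation.

    Round and square brackets are tracked on separate stacks, so a second
    helix written with [ ] may cross a helix written with ( ).
    """
    closing_to_opening = {")": "(", "]": "["}
    stacks = {"(": [], "[": []}
    pairs = []
    for position, symbol in enumerate(structure, start=1):
        if symbol in stacks:
            stacks[symbol].append(position)
        elif symbol in closing_to_opening:
            opening = stacks[closing_to_opening[symbol]].pop()
            pairs.append((opening, position))
    return pairs


def has_pseudoknot(pairs) -> bool:
    """Return True if any two base pairs cross (i < k < j < l)."""
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for index, (i, j) in enumerate(ordered):
        for k, l in ordered[index + 1:]:
            if i < k < j < l:
                return True
    return False


# A plain hairpin versus an H-type pseudoknot in dot-bracket notation.
hairpin = "((((....))))"
h_type = "((..[[..))..]]"
print(has_pseudoknot(pairs_from_dot_bracket(hairpin)))
print(has_pseudoknot(pairs_from_dot_bracket(h_type)))
```

The pairwise comparison is quadratic in the number of base pairs, which is negligible for UTR sequences of a few hundred bases.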
Figure 8. Pseudoknot structure analysis in representative species of Ebolavirus. Box layout diagram of the pseudoknots seen
in BDBV, TAFV and RESTV. EBOV and SUDV, the more virulent strains, could not be demonstrated to form a pseudoknot in
RNAStructure or pknotsRG.
Target Analysis
The full length sequence of the hairpin structure from EBOV, strain Mayinga 1976
(accession number AF086833.2) was submitted online as a query to miRBase and
searched for both hairpin and mature sequence homologues and filtered for human
hsa-miR representative sequences and their targets (Table 2). The miRBase database is
a searchable database of published miRNA sequences and annotation. Each entry in the
miRBase Sequence database represents a predicted hairpin portion of a miRNA
transcript (termed mir in the database), with information on the location and sequence
of the mature miRNA sequence (termed miR) (8-13). Homologue alignments for the
hairpin sequence are shown in Table 3. The mature miRNA database results are shown
in Table 4.
Table 2. MiRBase Results-Hairpin sequence

Accession   ID               Query start  Query end  Subject start  Subject end  Strand  Score  Evalue
MI0022210   hsa-mir-548ay    82           110        28             56           +       100    0.12
MI0003593   hsa-mir-548a-1   82           115        25             58           +       89     1
MI0003666   hsa-mir-651      63           117        1              55           -       86     1.8
MI0016770   hsa-mir-548ad    82           108        16             42           +       81     4.6

Table 3. Alignment of EBOV with hsa-miR human homologues

Query: 82-110 (8 mer)
hsa-mir-548ay: 28-56
score: 100
evalue: 0.12
UserSeq           82 aaaagugauucuuauuuuugaauuuaaag 110
                     |||||| ||| |  |||||| ||||||||
hsa-mir-548ay     28 aaaaguaauugugguuuuugcauuuaaag 56

Query: 82-115 (7 mer)
hsa-mir-548a-1: 25-58
score: 89
evalue: 1.0
UserSeq           82 aaaagugauucuuauuuuugaauuuaaagcuagc 115
                     |||||| ||| | |||||||   |||||  || |
hsa-mir-548a-1    25 aaaaguaauugugauuuuugccauuaaaaguaac 58

Query: 63-117 (6 mer)
hsa-mir-651: 1-55
score: 86
evalue: 1.8
UserSeq          117 ugugaauuauuaucacaauaaaagugauucuuauuuuugaauuuaaagcuagcuu 63
                     | || ||| ||||   || |||||| |  |||||  |  ||   |  | ||| ||
hsa-mir-651        1 uuugcauuuuuauuugaacaaaagucaagcuuauccuaaaaagcagugauagauu 55

Query: 82-108 (8 mer)
hsa-mir-548ad: 16-42
score: 81
evalue: 4.6
UserSeq           82 aaaagugauucuuauuuuugaauuuaa 108
                     |||||| ||| |  ||||||||  |||
hsa-mir-548ad     16 aaaaguaauugugguuuuugaaaguaa 42
Table 4. Mature miRNA Database results

Accession      ID                 Query start  Query end  Subject start  Subject end  Strand  Score  Evalue
MIMAT0025456   hsa-miR-548az-5p   82           101        2              21           +       73     1.1
MIMAT0003285   hsa-miR-548c-3p    82           101        1              20           -       64     6.3
MIMAT0004812   hsa-miR-548d-5p    82           101        1              20           +       64     6.3
MIMAT0015009   hsa-miR-548t-5p    82           101        2              21           +       64     6.3
MIMAT0025452   hsa-miR-548ay-5p   82           101        1              20           +       64     6.3
MIMAT0032114   hsa-miR-548ad-5p   82           101        1              20           +       64     6.3
MIMAT0032115   hsa-miR-548ae-5p   82           101        1              20           +       64     6.3
MIMAT0025471   hsa-miR-6507-3p    83           100        2              19           +       63     7.6
MIMAT0027461   hsa-miR-6780a-3p   25           40         6              21           -       62     9.3
Table 5. Alignment of Query to mature miRNAs
Query: 82-101
hsa-miR-548az-5p: 2-21
10 mer
score: 73
evalue: 1.1
UserSeq            82  aaaagugauucuuauuuuug  101
                       |||||||||| |  ||||||
hsa-miR-548az-5p    2  aaaagugauugugguuuuug  21

Query: 82-101
hsa-miR-548c-3p: 1-20
7 mer
score: 64
evalue: 6.3
UserSeq           101  aaaagugauucuuauuuuug  82
                       |||||| |||   |||||||
hsa-miR-548c-3p     1  aaaaguaauugagauuuuug  20

Query: 82-101
hsa-miR-548d-5p: 1-20
6 mer
score: 64
evalue: 6.3
UserSeq            82  aaaagugauucuuauuuuug  101
                       |||||| ||| |  ||||||
hsa-miR-548d-5p     1  aaaaguaauugugguuuuug  20

Query: 82-101
hsa-miR-548t-5p: 2-21
9 mer
score: 64
evalue: 6.3
UserSeq            82  aaaagugauucuuauuuuug  101
                       |||||||||  |  ||||||
hsa-miR-548t-5p     2  aaaagugaucgugguuuuug  21

Query: 82-101
hsa-miR-548ay-5p: 1-20
score: 64
evalue: 6.3
UserSeq            82  aaaagugauucuuauuuuug  101
                       |||||| ||| |  ||||||
hsa-miR-548ay-5p    1  aaaaguaauugugguuuuug  20

Query: 82-101
hsa-miR-548ad-5p: 1-20
score: 64
evalue: 6.3
UserSeq            82  aaaagugauucuuauuuuug  101
                       |||||| ||| |  ||||||
hsa-miR-548ad-5p    1  aaaaguaauugugguuuuug  20

Query: 82-101
hsa-miR-548ae-5p: 1-20
score: 64
evalue: 6.3
UserSeq            82  aaaagugauucuuauuuuug  101
                       |||||| ||| |  ||||||
hsa-miR-548ae-5p    1  aaaaguaauugugguuuuug  20

Query: 83-100
hsa-miR-6507-3p: 2-19
score: 63
evalue: 7.6
UserSeq            83  aaagugauucuuauuuuu  100
                       |||||  ||| |||||||
hsa-miR-6507-3p     2  aaaguccuuccuauuuuu  19

Query: 25-40
hsa-miR-6780a-3p: 6-21
score: 62
evalue: 9.3
UserSeq            40  caaggaacgaaaacag  25
                       | ||||| ||||||||
hsa-miR-6780a-3p    6  cuaggaaagaaaacag  21
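The "N mer" labels in the alignments appear to denote the longest run of consecutive identical bases between the query and the human miRNA. That reading is an interpretation of the reported values, not something the text states. A minimal Python sketch under that assumption is given below; the function names match_line and longest_run are illustrative only, and the two sequences are copied from the hsa-miR-548az-5p alignment in Table 5.

```python
from itertools import groupby


def match_line(query: str, subject: str) -> str:
    """Mark identical positions of an ungapped alignment with '|', others with ' '."""
    if len(query) != len(subject):
        raise ValueError("an ungapped alignment needs equal-length sequences")
    return "".join("|" if q == s else " " for q, s in zip(query, subject))


def longest_run(query: str, subject: str) -> int:
    """Length of the longest stretch of consecutive identical positions."""
    runs = [len(list(group))
            for mark, group in groupby(match_line(query, subject))
            if mark == "|"]
    return max(runs, default=0)


# Query positions 82-101 against hsa-miR-548az-5p positions 2-21 (Table 5).
query = "aaaagugauucuuauuuuug"
subject = "aaaagugauugugguuuuug"

print(query)
print(match_line(query, subject))
print(subject)
print(longest_run(query, subject), "mer")
```

For this pair the longest run is 10 bases, which agrees with the "10 mer" label reported for hsa-miR-548az-5p. The same function can be applied to the other pairs to check their labels.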
The human targets of the hairpin sequence and of the mature 3p and 5p sequences were
then obtained for the hsa-miR homologues using microRNA.org, miRDB, RNA22,
TargetMiner, and TargetScan-Vert (see supplementary files).
A total of 10,742 hairpin targets and 26,753 mature sequence targets were identified
within the hsa-miR-548 group. Only experimental hits were identified for
hsa-miR-548ay (hairpin), hsa-miR-651 (hairpin), hsa-miR-548az (mature), hsa-miR-548ay
(mature), and hsa-miR-6780.
Table 6. Overall Human Target Results
Hairpin Sequence   Targets
Total              10,742
hsa-miR-548ay      experimental
hsa-miR-548a       10,670
hsa-miR-651        experimental
hsa-miR-548ad      72

Mature Sequence    Targets
Total              26,753
hsa-miR-548az      experimental
hsa-miR-548c       9,969
hsa-miR-548d       9,931
hsa-miR-548t       4,297
hsa-miR-548ay      experimental
hsa-miR-548ad      147
hsa-miR-548ae      2,409
hsa-miR-6780       experimental
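The totals in Table 6 can be checked against the per-miRNA counts. The short Python sketch below sums the numeric entries of the table; rows marked "experimental" carry no count and are left out. The dictionary names are illustrative only.

```python
# Numeric target counts copied from Table 6 ("experimental" rows have no count).
hairpin_counts = {
    "hsa-miR-548a": 10670,
    "hsa-miR-548ad": 72,
}
mature_counts = {
    "hsa-miR-548c": 9969,
    "hsa-miR-548d": 9931,
    "hsa-miR-548t": 4297,
    "hsa-miR-548ad": 147,
    "hsa-miR-548ae": 2409,
}

hairpin_total = sum(hairpin_counts.values())
mature_total = sum(mature_counts.values())

print("hairpin total:", hairpin_total)
print("mature total:", mature_total)
```

Both sums reproduce the reported totals (10,742 and 26,753). Because the totals are plain sums of per-miRNA counts, a transcript predicted for more than one miRNA may be counted more than once; the text does not say whether duplicates were removed.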
The identified targets potentially include human transcripts involved in
ubiquitination, multiple transcription factors, gene sequences involved in cell
differentiation, hematologic pathways, and immune transcripts (interleukin-1,
interleukin-8, TNF, and others; see Supplemental Files). Figure 9 shows only a few
of the targets of the hsa-miR-548ae mature sequence.
Figure 9. The corresponding human targets from hsa-miR-548ae. The targets predicted to interact with the hsa-miR-548ae
mature sequence are shown. Only representative targets are shown in the image above. (See supplemental files for the
complete list).
Conclusion
This is the first report of a possible miRNA in the Zaire ebolavirus, and it may help
explain the differential virulence between the species, which has not been fully
accounted for to date. In the present 2014-2015 outbreak in West Africa, where nearly
25,000 people have become infected with this dreaded disease and over 10,000 have died,
it is imperative that we evaluate all potential causes if we are to arrive at an
effective treatment. Trials with RNAi have begun, but they are not specifically
directed towards this segment of the Zaire ebolavirus genome.
The limitation of a computational analysis is clear: to date there is no laboratory
confirmation of the findings included in this report. However, given the significant
8-10 mer homology with the human hsa-miR sequences identified in this report, further
study of this element as a potentially significant virulence factor is warranted.
References
1. Sakurai K, Amarzguioui M, Kim DH, Alluin J, Heale B, et al. (2011) A role for human Dicer in
pre-RISC loading of siRNAs. Nucleic Acids Res 39: 1510-1525.
2. Koscianska E, Starega-Roslan J, Krzyzosiak WJ (2011) The Role of Dicer Protein Partners in the
Processing of MicroRNA Precursors. PLoS ONE 6(12).
3. MacRae IJ, Doudna JA (2007) Ribonuclease revisited: structural insights into ribonuclease III
family enzymes. Curr Opin Struct Biol 17: 138-145.
4. Zhang H, Kolb FA, Jaskiewicz L, Westhof E, Filipowicz W (2004) Single processing center models
for human Dicer and bacterial RNase III. Cell 118: 57-68.
5. Ghildiyal M, Zamore PD (2009) Small silencing RNAs: an expanding universe. Nat Rev Genet
10(2): 94-108.
6. Zhan S, Lukens L (2010) Identification of Novel miRNAs and miRNA Dependent Developmental
Shifts of Gene Expression in Arabidopsis thaliana. PLoS ONE 5(4): e10157.
7. Aparicio O, et al. (2006) Adenovirus virus-associated RNA is processed to functional interfering
RNAs involved in virus production. J Virol 80(3): 1376-84.
8. Ambros V, Bartel B, Bartel DP, Burge CB, Carrington JC, Chen X, Dreyfuss G, Eddy SR,
Griffiths-Jones S, Marshall M, Matzke M, Ruvkun G, Tuschl T (2003) A uniform system for microRNA
annotation. RNA 9(3): 277-279.
9. Kozomara A, Griffiths-Jones S (2014) miRBase: annotating high confidence microRNAs using deep
sequencing data. Nucleic Acids Res 42: D68-D73.
10. Kozomara A, Griffiths-Jones S (2011) miRBase: integrating microRNA annotation and
deep-sequencing data. Nucleic Acids Res 39: D152-D157.
11. Griffiths-Jones S, Saini HK, van Dongen S, Enright AJ (2008) miRBase: tools for microRNA
genomics. Nucleic Acids Res 36: D154-D158.
12. Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A, Enright AJ (2006) miRBase: microRNA
sequences, targets and gene nomenclature. Nucleic Acids Res 34: D140-D144.
13. Griffiths-Jones S (2004) The microRNA Registry. Nucleic Acids Res 32: D109-D111.