All content in this area was uploaded by Michael T. Wolfinger on Aug 17, 2022
RNA structuredness of viral genomes
• Many examples of structured, functional RNAs in untranslated regions
• Some known examples of (conserved) RNA structures in coding regions
• Different evolutionary pressures on RNA structure in coding/non-coding regions
[Figure: RNA secondary structure plots for four TBEV strains: Neudoerfl (A), TBEV-2871 (B), 886-84 (C), Senzhang (D). Annotated elements: CSL4, CSL3, xrRNA1, CSL2, xrRNA2, CSL1, Y1, 3'SL and, in one panel, poly-A.]
[Figure: per-strain element architecture, split into a variable region (~440nt) and a conserved region (~320nt), grouped into TBEV-Eur, TBEV-Sib and TBEV-FE; some entries are marked with an asterisk. Strains shown: Baikalean|886-84, Himalaya-1, C-I|Oshima_5-10, C-I|Primorye-82, C-I|Tomsk-PT14, C-II|Sofjin-HO, C-II|Primorye-949, C-III|Senzhang, C-III|DXAL-T83, 178-79, Obs|TBEV-2871, Bal|EK-328, Bal|Kuutsalo_2, Zau|Zausaev, Zau|TBEV-2836, Vas|1827-18, Vas|Tomsk-PT122, Absettarov, KEM-127, 118-71, Salem, Neudoerfl, Hypr, Sipoo-8, W-Eur|NL, N5-17, Bos|Buzuuchuk, Vas|Vasilchenko.]
Kutschera and Wolfinger; Virus Evol (2022)
Newburn and White; Virology (2015)
Fernandes et al.; RNA Biol (2012)
Omar et al.; PLoS Comput Biol (2021)
How to assess global RNA structuredness?
• Opening energy
• GC content
• MFE Z scores as a proxy for RNA structuredness
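A minimal sketch of two of these measures, GC content and an MFE Z score, may help make them concrete. The Z score compares the energy of the native sequence with the energies of shuffled versions of it; a strongly negative value suggests more structure than expected from composition alone. Everything below is an assumption for illustration: `toy_energy` is a hypothetical stand-in that only counts nested base pairs (a Nussinov-style recursion) and is not a thermodynamic MFE predictor, and mononucleotide shuffling with 100 samples is an arbitrary choice. The slides do not state which folding tool or shuffling scheme was used.

```python
import random
import statistics
from typing import Callable

# Canonical and wobble pairs allowed by the toy model (assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def gc_content(seq: str) -> float:
    """Fraction of G and C nucleotides in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def toy_energy(seq: str, min_loop: int = 3) -> float:
    """Stand-in for an MFE predictor: minus the maximum number of nested pairs."""
    n = len(seq)
    if n < 2:
        return 0.0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):  # case: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return -float(best[0][n - 1])


def mfe_z_score(seq: str, energy: Callable[[str], float],
                n_shuffles: int = 100, seed: int = 0) -> float:
    """Z = (E_native - mean(E_shuffled)) / stdev(E_shuffled)."""
    rng = random.Random(seed)
    native = energy(seq)
    background = []
    for _ in range(n_shuffles):
        letters = list(seq)
        rng.shuffle(letters)  # mononucleotide shuffle keeps composition
        background.append(energy("".join(letters)))
    sigma = statistics.stdev(background)
    if sigma == 0:
        return 0.0  # no spread in the background, Z score is undefined
    return (native - statistics.mean(background)) / sigma


if __name__ == "__main__":
    hairpin = "GGGGGGGGAAAACCCCCCCC"
    print("GC content:", gc_content(hairpin))
    print("Z score:", mfe_z_score(hairpin, toy_energy))
```

In practice `energy` would be replaced by a call to a real folding program; passing it in as a function keeps the Z score logic independent of that choice.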
Data Set

Baltimore classification   Unsegmented   Segmented   Total
ssRNA(+)                          1333         373    1706
ssRNA(-)                           355        1118    1473
dsRNA                               73         890     963
dsDNA                              714           0     714
Total                             2475        2381    4856
Structuredness coding/non-coding regions
[Figure: MFE Z scores of CDS versus non-CDS regions, shown as five CDS/non-CDS pairs. Labelled data points: Blechmonas luni narnavirus 1; Anhembi virus, segment L (Bunyavirus); Uukuvirus (Bunyavirus).]
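Comparing coding and non-coding regions requires splitting each genome by its CDS annotation first. The following is a rough sketch of that step, assuming 0-based half-open intervals; the function names and the interval convention are hypothetical and not taken from the slides.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based, half-open: [start, end)


def non_cds_intervals(genome_length: int, cds: List[Interval]) -> List[Interval]:
    """Return the stretches of the genome not covered by any CDS."""
    # Merge overlapping or touching CDS annotations.
    merged: List[Interval] = []
    for start, end in sorted(cds):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    # Collect the gaps before, between and after the merged CDS blocks.
    gaps: List[Interval] = []
    cursor = 0
    for start, end in merged:
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < genome_length:
        gaps.append((cursor, genome_length))
    return gaps


def extract(genome: str, intervals: List[Interval]) -> List[str]:
    """Cut the given intervals out of the genome sequence."""
    return [genome[start:end] for start, end in intervals]
```

Each extracted CDS or non-CDS segment can then be scored separately, for example with an MFE Z score.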
Mean Z score of ssRNA(+) families
Ebolavirus example [ssRNA(-)]
https://viralzone.expasy.org/207

Pairwise nucleotide identity (lower triangle; colour scale 0.6 to 1.0):

        BDBV   BOMV   EBOV   RESTV  SUDV   TAFV
BDBV    1
BOMV    0.602  1
EBOV    0.633  0.605  1
RESTV   0.597  0.582  0.602  1
SUDV    0.597  0.578  0.6    0.594  1
TAFV    0.678  0.603  0.634  0.598  0.596  1

[Figure: tree with scale bar 0.2 and the labels Zaire ebolavirus, Sudan ebolavirus, Tai Forest ebolavirus, Bundibugyo ebolavirus, Bombali ebolavirus, Reston ebolavirus.]
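A lower-triangular identity matrix of this kind can be derived from a multiple sequence alignment. The sketch below shows one way to do it; skipping columns where either sequence has a gap is an assumption, since the slides do not say how gaps were treated, and the function names are hypothetical.

```python
from typing import Dict, List


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gap columns are ignored
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def identity_matrix(names: List[str], aligned: Dict[str, str]) -> List[List[float]]:
    """Lower-triangular matrix (diagonal included), rows in the order of `names`."""
    rows = []
    for i, name in enumerate(names):
        row = [round(pairwise_identity(aligned[name], aligned[names[j]]), 3)
               for j in range(i + 1)]
        rows.append(row)
    return rows
```

Called with the six genome abbreviations as `names` and their aligned sequences as `aligned`, this returns rows in the same layout as the matrix above.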
Ebolavirus example [ssRNA(-)]
mean Z score: -0.21
Structuredness of gene start regions

Are they accessible?

[Schematic: 5' UTR, AUG start codon and CDS (5' to 3'), with 30nt windows marked around the AUG.]

Correlations:
• Z score vs. opening energy around gene start, ssRNA(+): -0.366
• Z score vs. opening energy CDS +30nt, ssRNA(+): -0.487
• ssRNA(-) 5'-3': 0.00
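The opening energy of a region is commonly defined as the free energy needed to make it single-stranded, E_open = -RT ln(P_unpaired), where P_unpaired is the probability that the whole window is unpaired. The sketch below applies that definition and computes a correlation coefficient between two lists of values. It assumes the unpaired probabilities are already available from a partition function calculation, and it uses Pearson correlation; the slides do not state which correlation measure was used.

```python
import math
from typing import Sequence

GAS_CONSTANT = 1.98720425864083e-3  # kcal/(mol*K)


def opening_energy(p_unpaired: float, temperature_c: float = 37.0) -> float:
    """Energy (kcal/mol) to open a window that is unpaired with probability p."""
    if not 0.0 < p_unpaired <= 1.0:
        raise ValueError("p_unpaired must be in (0, 1]")
    rt = GAS_CONSTANT * (temperature_c + 273.15)
    return -rt * math.log(p_unpaired)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)
```

A window that is almost always unpaired has an opening energy near zero, while a window buried in stable structure has a large positive one. A negative correlation with the Z score, as reported for ssRNA(+), means that more structured genomes (more negative Z scores) tend to have gene starts that cost more energy to open.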
Where we are
• Viruses differ in their RNA structuredness; many viruses are more structured than expected
• GC content is not always a proxy for RNA structuredness
• Some viruses achieve high structuredness despite low GC content

The next steps
• Analyse structuredness of human mRNAs
• Assess the impact of codon usage bias on RNA structuredness
• Synbio: Study the impact of alternative genetic codes on RNA structuredness