RNA structuredness of viral genomes

Abstract

Presentation given at the 7th 'Computational Approaches to RNA Structure and Function' meeting, Benasque, Spain on 12 August 2022
Michael T. Wolfinger
Research Group Bioinformatics and Computational Biology
University of Vienna
Austria
Computational Approaches to RNA Structure and Function
Benasque
12 August 2022
Many examples of structured, functional RNAs in untranslated regions
Some known examples of (conserved) RNA structures in coding regions
Different evolutionary pressures on RNA structure in coding/non-coding regions
[Figure: RNA secondary structure plots for four TBEV strains, Neudoerfl (A), TBEV-2871 (B), 886-84 (C) and Senzhang (D), annotated with the structural elements 3'SL, Y1, CSL1, CSL2, CSL3, CSL4, xrRNA1, xrRNA2 and poly-A.]
[Figure: per-strain arrangement of these elements across a variable region (~440nt) and a conserved region (~320nt), with strains grouped as TBEV-Eur, TBEV-Sib and TBEV-FE. The most complete arrangement shown is CSL4 CSL4 CSL4 CSL3 xrRNA1 CSL2 xrRNA2 CSL2 CSL1 Y1 3'SL; other strains lack some of these elements. Strains shown: Baikalean|886-84, Himalaya-1, C-I|Oshima_5-10, C-I|Primorye-82, C-I|Tomsk-PT14, C-II|Sofjin-HO, C-II|Primorye-949, C-III|Senzhang, C-III|DXAL-T83, 178-79, Obs|TBEV-2871, Bal|EK-328, Bal|Kuutsalo_2, Zau|Zausaev, Zau|TBEV-2836, Vas|1827-18, Vas|Tomsk-PT122, Absettarov, KEM-127, 118-71, Salem, Neudoerfl, Hypr, Sipoo-8, W-Eur|NL, N5-17, Bos|Buzuuchuk, Vas|Vasilchenko.]
Kutschera and Wolfinger; Virus Evol (2022)
Newburn and White; Virology (2015)
Fernandes et al.; RNA Biol (2012)
Omar et al.; PLoS Comput Biol (2021)
How to assess global RNA structuredness?
MFE Z scores as a proxy for RNA structuredness
Opening energy
GC content
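A minimal sketch of how an MFE Z score and GC content could be computed for one sequence. Several things here are assumptions and not taken from the slides: `toy_energy` is a stand-in that counts nested base pairs (a Nussinov-style recursion) instead of computing a thermodynamic MFE, the background is a plain mononucleotide shuffle (dinucleotide-preserving shuffles are also commonly used), and all function names are hypothetical. In practice `energy` would be replaced by a call to a real folding program, for example RNAfold from the ViennaRNA package.

```python
import random
import statistics

# Canonical and wobble pairs allowed by the toy model.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_energy(seq, min_loop=3):
    """Stand-in for an MFE: minus the maximum number of nested base pairs."""
    n = len(seq)
    if n == 0:
        return 0.0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):  # case: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return -float(best[0][n - 1])


def mfe_z_score(seq, energy=toy_energy, n_shuffles=100, seed=0):
    """Z = (E_native - mean(E_shuffled)) / sd(E_shuffled); negative means more structured."""
    rng = random.Random(seed)
    native = energy(seq)
    background = []
    for _ in range(n_shuffles):
        letters = list(seq)
        rng.shuffle(letters)  # mononucleotide shuffle (assumption)
        background.append(energy("".join(letters)))
    sd = statistics.pstdev(background)
    if sd == 0:
        return 0.0  # no variation in the background, Z score is undefined
    return (native - statistics.mean(background)) / sd


def gc_content(seq):
    """Fraction of G and C nucleotides in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


hairpin = "GGGGGGGGAAAACCCCCCCC"
print(mfe_z_score(hairpin), gc_content(hairpin))
```

A negative Z score means the native sequence folds into a more stable structure than shuffled sequences of the same composition, which is why it can serve as a composition-corrected measure of structuredness where GC content alone cannot.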
Data Set

Baltimore classification   Unsegmented   Segmented   Total
ssRNA(+)                          1333         373    1706
ssRNA(-)                           355        1118    1473
dsRNA                               73         890     963
dsDNA                              714           0     714
Total                             2475        2381    4856
Structuredness coding/non-coding regions
[Figure: MFE Z scores of CDS and non-CDS regions, shown as five CDS/non-CDS pairs. Labelled viruses: Blechmonas luni narnavirus 1; Anhembi virus, segment L (Bunyavirus); Uukuvirus (Bunyavirus).]
Mean Z score of ssRNA(+) families
Phage example [ssRNA(+)]
Viruses|Orthornavirae|Lenarviricota|Leviviricetes|Norzivirales|Fiersviridae|Enterobacteria phage M
https://viralzone.expasy.org/163
mean Z score: -1.79
Ebolavirus example [ssRNA(-)]
https://viralzone.expasy.org/207
[Figure: pairwise nucleotide identity (colour scale 0.6 to 1.0; scale bar 0.2) among six ebolaviruses: Zaire ebolavirus (EBOV), Sudan ebolavirus (SUDV), Tai Forest ebolavirus (TAFV), Bundibugyo ebolavirus (BDBV), Bombali ebolavirus (BOMV) and Reston ebolavirus (RESTV).]
mean Z score: -0.21
Structuredness of gene start regions
Are they accessible?
[Schematic: 5' UTR, AUG start codon and CDS drawn 5' to 3', with 30nt windows marked around the gene start.]
ssRNA(+), Z score vs. opening energy around gene start: correlation -0.366
ssRNA(+), Z score vs. opening energy, CDS +30nt: correlation -0.487
ssRNA(-) 5'-3': correlation 0.00
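The comparison of Z scores with opening energies around the gene start could be set up along these lines. This is only a sketch under stated assumptions: the window is taken as 30 nt on each side of the first nucleotide of the start codon, the correlation is assumed to be Pearson's r (the slides do not say which coefficient was used), and the function names and the numbers in the usage lines are made up for illustration. Opening energies themselves are not computed here; they would come from an accessibility calculation with an external tool.

```python
import math


def start_window(genome, cds_start, flank=30):
    """Return `flank` nt upstream of the start codon plus the first `flank` nt of the CDS.

    `cds_start` is the 0-based index of the A in AUG (assumption).
    """
    lo = max(0, cds_start - flank)
    hi = min(len(genome), cds_start + flank)
    return genome[lo:hi]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


# Illustrative values only, not data from the presentation.
z_scores = [-2.1, -1.4, -0.3, 0.2, -0.9]
opening_energies = [6.5, 5.1, 3.0, 2.2, 4.4]
print(pearson(z_scores, opening_energies))

genome = "A" * 40 + "AUG" + "C" * 40
print(start_window(genome, cds_start=40))
```

A negative coefficient, as reported for ssRNA(+) viruses, would mean that more structured genomes (lower Z scores) tend to need more energy to open the region around the start codon.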
Where we are
Viruses differ in their RNA structuredness; many viruses are more structured than expected
GC content is not always a proxy for RNA structuredness
Some viruses achieve high structuredness despite low GC content

The next steps
Analyse structuredness of human mRNAs
Assess the impact of codon usage bias on RNA structuredness
Synbio: Study the impact of alternative genetic codes on RNA structuredness
Acknowledgements
TBI Vienna: Teodora Bucaciuc Mracica, Ivo Hofacker, Ronny Lorenz
Boston University: Elke Mühlberger
University of Lethbridge: Trushar Patel
Auburn University: Joanna Sztuba-Solinska
University of Leipzig: Mario Mörl
@mtwolfinger
michaelwolfinger.com