Abstract

A tool for a thorough investigation of RNA energy landscapes is presented. The topological details of these landscapes are represented by so-called 'barrier trees', which give a compact impression of the landscape and its overall shape and characteristics.
Landscapes and Energy Barriers
Michael Wolfinger, Peter F. Stadler, Ivo L. Hofacker, Christoph Flamm
Institute for Theoretical Chemistry and Molecular Structural Biology, University of Vienna
Tel: +43 1 4277 52731 Fax: +43 1 4277 52793 Email: {mtw,studla,ivo,xtof}@tbi.univie.ac.at Web: http://www.tbi.univie.ac.at/
RNA Secondary Structure
An interesting aspect of biomolecules is structure prediction. As RNA folding is thought to be hierarchical in nature [2], secondary structures can be seen as a coarse-grained approximation to the three-dimensional structures.
RNA secondary structure: (left) conventional representation. (right) circle representation. (below) bracket-dot representation.
RNA secondary structure is defined as a pattern of base pairs, which is determined by hydrogen bonds between the four bases Adenine (A), Guanine (G), Cytosine (C) and Uracil (U). Efficient algorithms for the calculation and evaluation of RNA secondary structures have been suggested [6, 4]. An important contribution to the understanding of the behavior of RNA molecules was made in [5], which introduced a tool for calculating all suboptimally folded RNA structures within a certain energy range above the ground state.
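The bracket-dot representation mentioned in the figure caption encodes exactly this pattern of base pairs: matching parentheses mark paired positions and dots mark unpaired ones. The following is a minimal sketch of how such a string can be turned into a base-pair table. The function name `parse_dot_bracket` and the 0-based indexing are choices made for this illustration, not part of the tool presented here.

```python
def parse_dot_bracket(structure):
    """Map every paired position to its partner (0-based, both directions)."""
    stack = []   # positions of '(' that are still waiting for a partner
    pairs = {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()
            pairs[i] = j
            pairs[j] = i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


print(parse_dot_bracket("((..))"))
```

Because a closing bracket always pairs with the most recent open one, the resulting pairs are nested and never cross, which is the defining property of a secondary structure.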
The energy landscape of an RNA molecule is a complex surface of the free energy versus the conformational degrees of freedom. Here, the allowed conformations are the (suboptimal) secondary structures which are compatible with a particular sequence. The figure below illustrates the 'conformation space' C of a short RNA molecule.
One-move neighborhood of the conformation space (l.h.s.) and its embedding in the graph representing the conformation space (r.h.s.) for a small RNA molecule which can exhibit 3 base pairs.
As C is a multidimensional space, it is not clear a priori how to move in such a complex space. It is therefore necessary to define certain rules, the so-called 'move set' (a collection of operations which, applied to an element of C, transform this element into another element of C).
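The concrete move set is not spelled out in the text. The sketch below assumes the simplest common choice, namely insertion or deletion of a single base pair, and shows how the neighbors of a structure could be enumerated under that assumption. The allowed pairs (AU, GC, GU), the minimum hairpin loop size of three unpaired bases, and the names `neighbors`, `CAN_PAIR` and `MIN_LOOP` are likewise assumptions made for this illustration.

```python
# Assumed pairing rules: Watson-Crick pairs plus the GU wobble pair.
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases enclosed by a pair


def neighbors(seq, pairs):
    """Return all structures one move away from `pairs`.

    `pairs` is a frozenset of (i, j) tuples with i < j (0-based positions).
    A move either deletes one existing pair or inserts one new pair.
    """
    result = set()
    # Deletion moves: remove any single base pair.
    for p in pairs:
        result.add(pairs - {p})
    # Insertion moves: add a pair between two unpaired, compatible positions.
    paired = {k for p in pairs for k in p}
    n = len(seq)
    for i in range(n):
        if i in paired:
            continue
        for j in range(i + MIN_LOOP + 1, n):
            if j in paired or (seq[i], seq[j]) not in CAN_PAIR:
                continue
            # (k, l) crosses (i, j) if exactly one of its ends lies inside (i, j).
            if any((i < k < j) != (i < l < j) for k, l in pairs):
                continue
            result.add(pairs | {(i, j)})
    return result


open_chain = frozenset()
print(neighbors("GGAAACC", open_chain))
```

Every move in this set is reversible, so the resulting graph on C is undirected, which matches the picture of a neighborhood embedded in a graph.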
Energy Barriers
With both features at hand (all suboptimal structures and a metric, the move set), a more detailed investigation of the energy landscape of RNA is possible. This requires some definitions:
A structure is a local minimum if its energy is lower than the energies of all neighboring structures.
A structure is a local maximum if its energy is higher than the energies of all legal neighboring structures.
A structure is a saddle point if there are at least two local minima that can be reached by a downhill walk starting from this structure.
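The first two definitions translate directly into a comparison of each structure with its neighbors. The sketch below applies them to a toy landscape of five states on a line, where each state is adjacent to the next. The energies are hypothetical values chosen for illustration, and the names `classify` and `line_neighbors` are not part of the tool presented here.

```python
def classify(energy, neighbors):
    """Return (local minima, local maxima) of a landscape.

    `energy` maps each state to its energy and `neighbors(state)` yields
    the states reachable by one move.
    """
    minima, maxima = [], []
    for s in energy:
        nbrs = list(neighbors(s))
        if all(energy[s] < energy[n] for n in nbrs):
            minima.append(s)
        elif all(energy[s] > energy[n] for n in nbrs):
            maxima.append(s)
    return minima, maxima


# Hypothetical toy landscape: states 0..4 on a line.
energy = {0: -3.0, 1: -1.0, 2: -2.0, 3: 0.0, 4: -2.5}


def line_neighbors(s):
    return [t for t in (s - 1, s + 1) if t in energy]


print(classify(energy, line_neighbors))
```

The comparisons are strict, following the definitions above, so structures with the same energy as a neighbor are counted as neither minima nor maxima in this sketch.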
Evidently, the saddle point with lowest energy separating two local minima is of particular importance. Such saddle points can be found by applying the flooding algorithm presented below. The outcome of the procedure is a barrier tree as shown in the following figure.
Barrier tree of a random RNA sequence with length 42: Leaves 1-10 denote the 10 lowest local minima of the energy landscape, the mfe structure 1 is marked with an asterisk. Saddle points are labeled A to G. The energy barrier of 3 is B(3) = E(B) - E(3).
Leaves correspond to the valleys of the landscape,
while saddle points are displayed by internal nodes.
Saddle points can be read off easily from these barrier
trees.
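Reading a barrier off the tree amounts to subtracting the energy of a leaf from the energy of the saddle point that connects it to a deeper valley, as in the formula given in the caption above. The sketch below shows that arithmetic. The energies are hypothetical placeholders, not values from the figure, and the names `barrier_height` and `lowest_saddle` are chosen for this illustration.

```python
def barrier_height(minimum, energy, lowest_saddle):
    """Barrier of a local minimum: E(saddle) - E(minimum)."""
    return energy[lowest_saddle[minimum]] - energy[minimum]


# Hypothetical energies in kcal/mol for leaf "3" and saddle point "B".
energy = {"3": -5.0, "B": -2.5}
lowest_saddle = {"3": "B"}  # leaf 3 joins the rest of the tree at saddle B

print(barrier_height("3", energy, lowest_saddle))
```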
The flooding algorithm
The flooding algorithm can be explained schematically with the following figure.
The flooding algorithm (panels A to D): Gedanken-experiment where water rises in a landscape from bottom to top.
Imagine a landscape with only two valleys A and B (where A is energetically lower than B) and a saddle point X separating those local minima. Water rises from bottom to top. In the first step (A), only the deeper valley is slightly filled with water. For the algorithm, this means that all structures below or exactly at the surface of the water belong to valley A. (All other structures are not yet accessible, as we go through an energetically sorted list of secondary structures in ascending order.) In the second step (B), not only the deeper valley A is filled with water, but also valley B (at least to a small extent). From now on a structure can belong to either valley A or valley B, depending on which valley contains structures that are neighbors of the current one. Imagine the water rises further. Step (C) displays a different situation: saddle point X has been found, which means there exists a structure which has legal neighbors in both valley A and valley B. At this point the two valleys merge: B is merged with its 'father' A, which means that from now on all structures from B are treated as if they belonged to A. However, the algorithm does not stop here. As illustrated in (D), the water rises further and only valley A is still accessible. The algorithm ends as soon as (i) all secondary structures have been processed or (ii) a predefined number of local minima has been found.
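The procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation behind the poster: the landscape is given as a dictionary of energies plus a neighbor function, valleys are tracked with a small union-find structure, degenerate energies are processed in arbitrary order, and only stopping criterion (i) is implemented. The names `flood`, `basin` and `merges` are hypothetical.

```python
def flood(energy, neighbors):
    """Flood a landscape and report its local minima and merging saddles.

    Returns (minima, merges). `minima` lists the local minima in order of
    increasing energy. Each entry of `merges` is a tuple
    (child minimum, father minimum, saddle state, saddle energy).
    """
    basin = {}    # state -> index of the valley it was assigned to
    parent = {}   # union-find forest over valley indices
    minima = []
    merges = []

    def find(b):
        # Follow parent links to the valley that currently represents b.
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    # Process states in ascending energy, i.e. as the water level rises.
    for state in sorted(energy, key=energy.get):
        touched = {find(basin[n]) for n in neighbors(state) if n in basin}
        if not touched:
            # No flooded neighbor: the state starts a new valley.
            b = len(minima)
            minima.append(state)
            parent[b] = b
            basin[state] = b
            continue
        # Valleys are numbered by depth, so the smallest index is the deepest.
        father = min(touched)
        basin[state] = father
        # Touching several valleys makes the state a saddle point:
        # every shallower valley is merged into the father.
        for child in sorted(touched - {father}):
            parent[child] = father
            merges.append((minima[child], minima[father], state, energy[state]))
    return minima, merges


# Hypothetical toy landscape: states 0..4 on a line.
energy = {0: -3.0, 1: -1.0, 2: -2.0, 3: 0.0, 4: -2.5}


def line_neighbors(s):
    return [t for t in (s - 1, s + 1) if t in energy]


minima, merges = flood(energy, line_neighbors)
print(minima)
print(merges)
```

Each merge record corresponds to one internal node of a barrier tree, and the barrier of the child valley is the saddle energy minus the energy of the child minimum.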
Applications
The figure below shows the frequency of degenerate
saddles from a tRNAphe.
Saddle point energy versus saddle point multiplicity from a tRNAphe
Another useful application is shown below: the barrier tree of a bi-stable RNA molecule, which can fold into two or more thermodynamically stable structures separated by a large energy barrier. Recently, artificial RNA switches have been designed.
Barrier tree of an RNA switch showing the 100 deepest local minima of the energy landscape: The deep local minimum 8 (-6.7 kcal/mol) on the left-hand side is separated from the right subtree by an energy barrier of 10.10 kcal/mol. The energy of the mfe structure is -8.20 kcal/mol. The open chain conformation is represented by local minimum 81 in the leftmost part of the tree.
Besides the applications shown, barrier trees have been considered recently for various models of disordered systems, including spin glasses and combinatorial optimization problems [1, 3].
References
[1] O. Bastert, D. Rockmore, P. Stadler, and G. Tinhofer. Landscapes on spaces of trees. Appl. Math. Comput., 2001, in press.
[2] P. Brion and E. Westhof. Hierarchy and dynamics of RNA folding. Annu. Rev. Biophys. Biomol. Struct., 26:113–137, 1997.
[3] F. Ferreira, J. Fontanari, and P. Stadler. Landscape statistics of the low autocorrelated binary string problem. J. Phys. A: Math. Gen., 33:8635–8647, 2000.
[4] I. L. Hofacker, W. Fontana, P. F. Stadler, L. S. Bonhoeffer, M. Tacker, and P. Schuster. Fast folding and comparison of RNA secondary structures. Monatsh. Chem., 125:167–188, 1994.
[5] S. Wuchty, W. Fontana, I. L. Hofacker, and P. Schuster. Complete suboptimal folding of RNA and the stability of secondary structure. Biopolymers, 49:145–165, 1998.
[6] M. Zuker and D. Sankoff. RNA secondary structures and their prediction. Bull. Math. Biol., 46:591–621, 1984.