PreprintPDF Available

Eterna is Solved

Authors:
Preprints and early-stage research may not have been peer reviewed yet.

Abstract and Figures

RNA design consists of discovering a nucleotide sequence that folds into a target secondary structure. It is useful for synthetic biology, medicine, and nanotechnology. We propose Montparnasse, a Multi Objective Generalized Nested Rollout Policy Adaptation with Limited Repetition (MOGNRPALR) RNA design algorithm. It solves the Eterna benchmark.
Content may be subject to copyright.
Eterna is Solved
Tristan Cazenave
LAMSADE, Universit´
e Paris Dauphine - PSL, CNRS, Paris, France
Abstract. RNA design consists of discovering a nucleotide sequence that folds
into a target secondary structure. It is useful for synthetic biology, medicine,
and nanotechnology. We propose Montparnasse, a Multi Objective Generalized
Nested Rollout Policy Adaptation with Limited Repetition (MOGNRPALR) RNA
design algorithm. It solves the Eterna benchmark.
1 Introduction
The design of molecules with specific properties is an important topic for research re-
lated to health. The RNA design problem, also named the Inverse RNA Folding prob-
lem, is a difficult combinatorial problem. This problem is important for scientific fields
such as bioengineering, pharmaceutical research, biochemistry, synthetic biology, and
RNA nanostructures [20].
RNA is involved in many biological functions. Synthetic RNA can be easily pro-
duced [21] and has many applications in synthetic biology, as well as in drug design
with the building of riboswitches and ribozymes.
RNA design consists of finding a nucleotide sequence that folds into a desired target
structure. Eterna is a standard benchmark for RNA design algorithms. Many algorithms
have been applied to this problem over the years. However, none have successfully
solved all the Eterna problems. This paper presents a simple algorithm that solves the
Eterna benchmark.
RNA molecules are long molecules composed of four possible nucleotides. Molecules
can be represented as strings composed of the four characters A (Adenine), C (Cyto-
sine), G (Guanine), and U (Uracil). For RNA molecules of length N, the size of the state
space of possible strings is exponential in N. It can be very large for long molecules. The
sequence of nucleotides folds back on itself to form what is called its secondary struc-
ture. It is possible to find in polynomial time the folded structure of a given sequence.
However, the opposite, which is the Inverse RNA Folding problem, is hard [2].
RNA functions are determined by its tertiary structure. The secondary structure is
used to determine the tertiary structure according to the base pairing interactions. The
bonds between two nucleotides are given by the six possible base pairs (CG, GC, AU,
UA, UG, GU). The dot-bracket notation is used to represent the secondary structure,
the opening and closing brackets represent the base pairs, and the dots represent the
unbounded sites.
The paper is organized as follows: the second section is about previous attempts at
designing RNA. The third section presents the algorithms used in Montparnasse. The
fourth section details the experimental results.
arXiv:2505.02110v1 [cs.AI] 4 May 2025
2 Tristan Cazenave
2 Previous RNA Design Work
We start with an overview of previous attempts at RNA design. We then describe in
more detail the GREED-RNA program and previous Monte Carlo Search approaches.
2.1 Previous Attempts at Solving Inverse RNA Folding
The algorithms used in these methods include adaptive random walk in RNAinverse
[14], stochastic local search for RNA-SSD [1] and INFO-RNA [3], genetic and evolu-
tionary algorithms such as MODENA [23] and aRNAque [16].
The recent RNA design book [12] contains many papers on RNA design for both
the secondary structure and the tertiary structure.
2.2 GREED-RNA
We focus here on GREED-RNA [15] as it is a very recent and state-of-the-art program
for Eterna. We will compare to GREED-RNA in the experimental results section.
GREED-RNA uses greedy initialization: all base pairs are initialized as GC or CG,
and unpaired positions are all initialized as A. In the early stage of search, it also uses
greedy mutation that randomly chooses between GC and CG at random positions. Later,
it uses random mutation that randomly replaces base pairs with any other base pair and
unpaired nucleotides with any nucleotide.
Sequences are sorted using Multi Objective evaluations. The first objective is their
value in base pair distance (BPD), then the Hamming Distance, the probability over
ensemble, the partition function, the ensemble defect, and the GC-content distance.
Restarts are performed when the stagnation counter reaches a threshold. It also takes
sequences from a pool of sorted sequences.
2.3 Monte Carlo Search
Early attempts used UCT combined with local search to address RNA design [24]. The
program was not tested on the Eterna benchmark.
The NEMO program used Nested Monte Carlo Search [4] with a handmade playout
policy to generate RNA that were further optimized with local search. It could solve 95
of the 100 problems of Eterna.
MCTS and learning from self-play were also used for RNA design [17,18,19].
Generalized Nested Rollout Policy Adaptation (GNRPA) [6] was applied to Inverse
RNA Folding [9] also to solve 95 problems. It was further refined with the learning of
a prior for the policy using either transformers [11] or statistics on solved problems [8].
GNRPA with Limited Repetitions (GNRPALR) [7] was also applied to Inverse RNA
Folding with success. The principle is to avoid too deterministic policies by stopping
iterations when the same sequence is found a second time at a given level of GNRPA.
Eterna is Solved 3
3 Montparnasse
In this section, we describe the algorithms we have developed for RNA design in the
Montparnasse framework. We start with Multi Objective Greedy Randomized Local
Search (MOGRLS) which is a simplification of GREED-RNA. We then explain a mod-
ification of this algorithm we call Progressive Narrowing (PN) that makes use of restarts
and selects the most promising sequences among a set of partially optimized sequences.
The Progressive Narrowing algorithm is tuned using a search for the best parameters.
We end this section with Multi Objective Generalized Nested Rollout Policy Adaptation
with Limited Repetitions (MOGNRPALR), the MCTS algorithm we propose for RNA
design.
3.1 Multi Objective Greedy Randomized Local Search
MOGRLS is a simplification of GREED-RNA that gives better results on difficult prob-
lems. It is a simple algorithm described in Algorithm 1.
Algorithm 1 MOGRLS
1: MOGRLS (targetStructure)
2: bestSequence GenerateInitialSequence(targetStructure)
3: while True do
4: if nevals < 500 then
5: sgreedyMutation(bestSequence, tar getStructure)
6: else
7: srandomM utation(bestSeq uence, targetStr ucture)
8: end if
9: Update bestSequence with susing Multi Objective comparison
10: end while
3.2 Progressive Narrowing
PN is an improvement on MOGRLS that starts searching multiple sequences before
focusing the search on the best one. It is described in Algorithm 2.
3.3 Search for Parameter Tuning
The search for the parameters of PN is done using Algorithm 3 for generating the pos-
sible combinations of parameters that sum to a predefined number of evaluations, and
Algorithm 4 for testing on a dataset of previous recorded runs of MOGRLS the score
of each combination of parameters.
4 Tristan Cazenave
Algorithm 2 Progressive Narrowing
1: PN (targetStructure)
2: for number of restarts do
3: sgreedyMutation(bestSequences[n], tar getStructure)
4: nevals [0, ..., 0]
5: while True do
6: for nrange(len(bestSequences)) do
7: if nevals[n]<500 then
8: sgreedyMutation(bestSequences[n], tar getStructure)
9: else
10: srandomM utation(bestSequences[n], targetS tructure)
11: end if
12: Update bestSequences[n]with susing Multi Objective comparison
13: nevals[n]nevals[n] + 1
14: end for
15: for number of best sequences do
16: if prof ile[n] == nevals[n]then
17: remove worst sequence from best sequences
18: remove profile[n]from profile
19: end if
20: end for
21: break if one sequence left and prof ile[0] == nevals[0]
22: end while
23: end for
Algorithm 3 Parameter generation.
1: search (n, k, s, current, l, possible)
2: if k== 0 then
3: if s== nthen
4: l.append(current)
5: return
6: end if
7: end if
8: if snthen
9: return
10: end if
11: start 0
12: if len(current)>0then
13: start current[1]
14: end if
15: for ipossible do
16: if istart then
17: si s+k×(istart)
18: cur copy(current)
19: cur.append(i)
20: search (n, k 1, si, cur, l, possible)
21: end if
22: end for
Eterna is Solved 5
Algorithm 4 Parameter tuning.
1: ParameterTuning ()
2: for restart [1350,2700] do
3: for n[1..maxprofile + 1] do
4: targetSum restart
5: result []
6: search(targ etSum, n, 0,[], result, possible)
7: for strategy result do
8: nbSolved 0
9: for j[0..nsamples]do
10: solved F alse
11: for r[0..2700//restart]do
12: quad sample(range(nprocess), len(strategy))
13: for i[0..len(strategy)1] do
14: index max(0, strategy [i]1)
15: remove element with worst score after index evaluations in quad
16: end for
17: end for
18: if solved then
19: nbSolved nbSolved + 1
20: end if
21: end for
22: if nbSolved > best then
23: best nbSolved
24: memorize strategy
25: end if
26: end for
27: end for
28: end for
6 Tristan Cazenave
3.4 Multi Objective Generalized Nested Rollout Policy Adaptation with Limited
Repetitions
This section presents the MOGNRPALR algorithms which is a combinations of GN-
RPA [6], GNRPALR [7] and the Multi Objective evaluations of GREED-RNA [15].
GNRPA is a generalization of the NRPA algorithm to the use of a prior. GNRPALR
is an improvement of GNRPA that avoids too deterministic policies. It stops the iter-
ations at a level when the best sequence of this level is found a second time. All of
these algorithms are improvements of the Nested Rollout Policy Adaptation (NRPA)
[22] algorithm whihc is an effective combination of NMCS and the online learning of
a playout policy.
In MOGNRPALR each move is associated to a weight stored in an array called the
policy. For each level there is a best sequence and a policy. The principle is to reinforce
the weights of the best sequence of moves found during the iterations at each level.
At the lowest level, the weights are used in the softmax function to produce a playout
policy that generates good sequences of moves.
MOGNRPALR use nested search [4]. At each level it takes a policy as input and
returns a sequence and its associated scores. At any level >0, the algorithm makes
numerous recursive calls to the lower level, adapting the policy each time with the best
sequence of moves to date. The changes made to the policy do not affect the policy in
higher levels. At level 0, MOGNRPALR return the sequence obtained by the playout
function as well as its associated scores.
The playout function sequentially constructs a random solution biased by the weights
of the moves until it reaches a terminal state. At each step, the function performs Boltz-
mann sampling, choosing the actions with a probability given by the softmax function.
Let wmbe the weight associated to a move min the policy. In NRPA, the probability
of choosing move mis defined by:
pm=ewm
Pkewk
where kgoes through the set of possible moves, including m.
GNRPA [6] generalizes the way probability is calculated using bias βm. The prob-
ability of choosing the move mbecomes:
pm=ewm+βm
Pkewk+βk
Taking βm=βk= 0, we find the formula for NRPA again which corresponds to
sampling without prior.
The algorithm for performing playouts in MOGNRPALR is given in algorithm 5.
The main MOGNRPALR algorithm is given in the algorithm 4. MOGNRPALR calls
the adapt algorithm to modify the policy weights so as to reinforce the best sequence
of the current level. The policy is passed by reference to the adapt algorithm which is
given in the algorithm 6.
The principle of the adapt function is to increase the weights of the moves of the best
sequence of the level and to decrease the weights of all possible moves by an amount
Eterna is Solved 7
proportional to their probabilities of being played. δbm = 0 when b=mand δbm = 1
when b=m.
At line 16 of Algorithm 4 there is a condition that corresponds to Stabilized NRPA
[10] and to starting to adapt only after a few iterations [5,9,13].
Algorithm 5 The Multi Objective playout algorithm
1: playout (policy)
2: state root
3: sequence []
4: while true do
5: if terminal(state)then
6: return scores (state), sequence
7: end if
8: z0
9: for mpossible moves for state do
10: o[m]epolicy[code(m)]+βm
11: zz+o[m]
12: end for
13: choose a move with probability o[move]
z
14: play (state,move)
15: sequence.append (move)
16: end while
4 Experimental Results
The same default parameters are always used for both GREED-RNA and MOGNR-
PALR. The biases used in MOGNRPALR are 5.0 for GC, CG and A and 0.0 for AU,
UA, UG, GU, C, G and U. The Turner 1999 parameters are used for RNA folding. The
same Multi Objective evaluations are used for all algorithms. The problems we solve
are the original Eterna100 v1 problems. Running 200 processes in parallel MOGNR-
PALR solves all the problems of Eterna100 v1 in less than one day. In the following,
we will focus on three of the most difficult problems: Problems 90, 99 and 100. Figure
1 gives the target secondary structures for problems 90, 99 and 100.
4.1 MOGRLS
Figure 2 gives the evolution of the BPD for problem 99 both for GREED-RNA and
MOGRLS. We can see that MOGRLS gets better results.
4.2 PN
The results for MOGRLS of the previous section have been stored in order to be used
as a dataset. For all of the 200 processes, the best BPD of each process has been stored
8 Tristan Cazenave
Algorithm 6 The adapt algorithm
1: adapt (policy,sequence)
2: polp policy
3: state root
4: for bsequence do
5: z0
6: for mpossible moves for state do
7: o[m]epolicy[code(m)]+βm
8: zz+o[m]
9: end for
10: for mpossible moves for state do
11: pmo[m]
z
12: polp[code(m)] polp[code(m)] α(pmδbm )
13: end for
14: play (state,b)
15: end for
16: return polp
Algorithm 7 The MOGNRPALR algorithm.
1: MOGNRPALR (level,policy)
2: if level == 0 then
3: return playout (policy)
4: else
5: bestSequences []
6: i0
7: while True do
8: scores,new MOGNRPALR(level 1,policy)
9: if bestSequences =[] then
10: if new == bestSequences[0][1] then
11: return scores,new
12: end if
13: end if
14: bestSequences.append ([scores,new])
15: bestSequences.sort ()
16: if level > 2or level < 3and i > 3or level == 1 and i > 3and i%4 == 0
then
17: policy adapt (policy,bestS equences[0][1])
18: end if
19: ii+ 1
20: end while
21: end if
Eterna is Solved 9
(a) Gladius: problem 90 (b) Shooting Star: problem 99 (c) Teslagon: problem 100
Fig. 1: Problems 90, 99 and 100 from Eterna100 v1.
every 100 evaluations. The search for narrowing profiles was done for restarts of 135
000 and 270 000 evaluations. The restart of 270 000 evaluations corresponds to no
restart. For each profile tested, 100 000 combinations were tested and averaged. The
best profile found is [10 000, 10 000, 10 000, 10 000, 230 000] with no restart. It solves
16.77 % of the combinations. Figure 3 gives the results for PN on problem 99 and
compares them with MOGRLS. PN gives slightly better results than MOGRLS.
4.3 MOGNRPALR
Figure 4 gives the evolution of the BPD for problem 99 both for MOGNRPALR. It
also compares MOGNRPALR with GREED-RNA, PN, and MOGRLS. We can see that
MOGNRPALR gets much better results.
Table 1 gives the distributions of BPD after 270 000 evaluations for problem 99. The
algorithms are sorted by increasing performances. GREED-RNA solves the problem 6
times out of 200 while MOGNRPALR solves it 120 times out of 200.
Table 1: Distributions of the BPD of the various algorithms after 270 000 evaluations
for problem 99.
BPD 0 1 2 3 4 5 6 7 8
GREED-RNA 6 22 49 66 38 17 2 0 0
MOGRLS 19 46 63 39 22 7 2 2 0
PN 28 72 64 28 8 0 0 0 0
MOGNRPALR 120 78 2 0 0 0 0 0 0
4.4 Problem 90
Problem 90 is a difficult problem where the behavior of the search algorithms is a slow
descent toward the lower BPD.
10 Tristan Cazenave
The number of evaluations performed by GREED-RNA in one hour for problem 90
is 9 200. So we compare GREED-RNA to MOGNRPALR after 220 000 evaluations,
which corresponds to one day of computation. MOGNRPALR reaches 10 400 in one
hour for problem 90. The comparison using the same number of evaluations for the two
algorithms is slightly favorable for GREED-RNA.
Figure 5 gives the distributions of BPD for GREED-RNA and MOGNRPALR.
None of the 200 GREED-RNA processes finds a solution, and the best BPD found
is 2. MOGNRPALR finds multiple solutions.
4.5 Problem 100
Problem 100 is also a difficult problem. The number of evaluations performed by
GREED-RNA in one hour for problem 100 is 37 000. So we compare GREED-RNA to
MOGNRPALR after 530 000 evaluations, which corresponds to one day of computa-
tion.
Figure 6 gives the distributions of BPD for GREED-RNA and MOGNRPALR.
MOGNRPALR finds many more solutions than GREED-RNA.
5 Conclusion
Montparnasse is a framework for RNA design. It contains algorithms such as MOGRLS
that simplify and improve the GREED-RNA local search approach to RNA design. It
improves on MOGRLS with PN and automatic parameter tuning. The main result of this
paper is the design and application of the MOGNRPALR algorithm to RNA design.
For difficult problems of Eterna100 v1 it gives much better results than greedy local
search. It solves the Eterna100 v1 benchmark since all problems are solved within one
day using 200 runs in parallel, each run using the same number of evaluations as the
number of evaluations of a single run in one day.
Eterna is Solved 11
References
1. Andronescu, M., Fejes, A.P., Hutter, F., Hoos, H.H., Condon, A.: A new algorithm for RNA
secondary structure design. Journal of molecular biology 336(3), 607–624 (2004)
2. Bonnet, ´
E., Rza˙
zewski, P., Sikora, F.: Designing RNA secondary structures is hard. Journal
of Computational Biology 27(3), 302–316 (2020)
3. Busch, A., Backofen, R.: INFO-RNA—a fast approach to inverse RNA folding. Bioinfor-
matics 22(15), 1823–1831 (2006)
4. Cazenave, T.: Nested Monte-Carlo Search. In: Boutilier, C. (ed.) IJCAI. pp. 456–461 (2009)
5. Cazenave, T.: Nested rollout policy adaptation with selective policies. In: CGW at IJCAI
2016 (2016)
6. Cazenave, T.: Generalized nested rollout policy adaptation. In: Monte Search at IJCAI (2020)
7. Cazenave, T.: Generalized nested rollout policy adaptation with limited repetitions. European
Workshop on Reinforcement Learning (2024)
8. Cazenave, T.: Learning a prior for monte carlo search by replaying solutions to combinatorial
problems. In: International Conference on Parallel Problem Solving from Nature. pp. 85–99.
Springer (2024)
9. Cazenave, T., Fournier, T.: Monte Carlo inverse folding. In: Monte Search at IJCAI (2020)
10. Cazenave, T., Sevestre, J.B., Toulemont, M.: Stabilized nested rollout policy adaptation. In:
Monte Search at IJCAI (2020)
11. Cazenave, T., Touzani, H.: Monte Carlo inverse RNA folding. In: RNA Design: Methods and
Protocols, pp. 205–215. Springer (2024)
12. Churkin, A., Barash, D.: RNA Design. Springer (2024)
13. Dang, C., Bazgan, C., Cazenave, T., Chopin, M., Wuillemin, P.H.: Warm-starting nested
rollout policy adaptation with optimal stopping. In: Proceedings of the AAAI Conference on
Artificial Intelligence. pp. 12381–12389 (2023)
14. Hofacker, I.L., Fontana, W., Stadler, P.F., Bonhoeffer, L.S., Tacker, M., Schuster, P., et al.:
Fast folding and comparison of RNA secondary structures. Monatshefte fur chemie 125,
167–167 (1994)
15. Lozano-Garc´
ıa, N., Rubio-Largo, ´
A., Granado-Criado, J.M.: A simple yet effective greedy
evolutionary strategy for rna design. IEEE Transactions on Evolutionary Computation (2024)
16. Merleau, N.S., Smerlak, M.: aRNAque: an evolutionary algorithm for inverse pseudoknotted
rna folding inspired by l´
evy flights. BMC bioinformatics 23(1), 335 (2022)
17. Obonyo, S., Jouandeau, N., Owuor, D.: Designing RNA sequences by self-play. In: IJCCI.
pp. 305–312 (2022)
18. Obonyo, S., Jouandeau, N., Owuor, D.: RNA generative modeling with tree search. In: 2024
IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biol-
ogy (CIBCB). pp. 1–9. IEEE (2024)
19. Obonyo, S., Jouandeau, N., Owuor, D.: Self-playing RNA inverse folding. SN Computer
Science 5(4), 403 (2024)
20. Portela, F.: An unexpectedly effective Monte Carlo technique for the RNA inverse folding
problem. BioRxiv p. 345587 (2018)
21. Reese, C.B.: Oligo- and poly-nucleotides: 50 years of chemical synthesis. Org. Biomol.
Chem. 3, 3851–3868 (2005). https://doi.org/10.1039/B510458K
22. Rosin, C.D.: Nested rollout policy adaptation for Monte Carlo Tree Search. In: IJCAI. pp.
649–654 (2011)
23. Taneda, A.: Multi-objective optimization for rna design with multiple target secondary struc-
tures. BMC bioinformatics 16, 1–20 (2015)
24. Yang, X., Yoshizoe, K., Taneda, A., Tsuda, K.: Rna inverse folding using monte carlo tree
search. BMC bioinformatics 18, 1–12 (2017)
12 Tristan Cazenave
(a) Distribution of the BPD for problem 99
after 135 000 evaluations by GREED-RNA.
(b) Distribution of the BPD for problem 99
after 270 000 evaluations by GREED-RNA.
(c) Distribution of the BPD for problem 99
after 135 000 evaluations by MOGRLS.
(d) Distribution of the BPD for problem 99
after 270 000 evaluations by MOGRLS.
(e) Comparison after 270 000 evaluations
between GREED-RNA and MOGRLS.
(f) Evolution of the average BPD of
GREED-RNA and MOGRLS.
Fig. 2: Comparison of the distributions of BPD on problem 99 for GREED-RNA and
MOGRLS for increasing numbers of evaluations. After 270 000 evaluations which cor-
responds to one day of computation, MOGRLS has solved the problem 19 times out
of 200 runs while GREED-RNA has solved the problem 2 times out of 200 runs (see
subfigure (c)). Subfigure (d) gives the evolution of the average BPD with the number of
evaluations for problem 99. The averages are calculated using 200 runs. The rightmost
value corresponds to one day of computation for one run.
Eterna is Solved 13
(a) Distribution of the BPD for problem 99
after 135 000 evaluations by PN.
(b) Distribution of the BPD for problem 99
after 270 000 evaluations by PN.
(c) Comparison after 270 000 evaluations
between PN and MOGRLS.
(d) Evolution of the average BPD of PN and
MOGRLS for problem 99.
Fig. 3: Comparison of the distributions of BPD on problem 99 for PN and MOGRLS for
increasing numbers of evaluations. After 270 000 evaluations which corresponds to one
day of computation, PN has solved the problem 28 times out of 200 runs while MO-
GRLS has solved the problem 19 times out of 200 runs (see subfigure (c)). Subfigure
(d) gives the evolution of the average BPD with the number of evaluations for problem
99. The averages are calculated using 200 runs. The rightmost value corresponds to one
day of computation for one run.
14 Tristan Cazenave
(a) Distribution of the BPD for problem
99 after 135 000 evaluations by MOGNR-
PALR.
(b) Distribution of the BPD for problem
99 after 270 000 evaluations by MOGNR-
PALR.
(c) Comparison after 270 000 evaluations
between MOGNRPALR and PN.
(d) Evolution of the average BPD of all al-
gorithms for problem 99.
Fig. 4: Comparison of the distributions of BPD on problem 99 for PN and MOGNR-
PALR for increasing numbers of evaluations. After 270 000 evaluations which corre-
sponds to one day of computation, MOGNRPALR has solved the problem 120 times
out of 200 runs while PN has solved the problem 28 times out of 200 runs (see sub-
figure (c)). Subfigure (d) gives the evolution of the average BPD with the number of
evaluations for problem 99 for all algorithms. The averages are calculated using 200
runs. The rightmost value corresponds to one day of computation for one run.
Eterna is Solved 15
(a) Distribution of the BPD for problem 90
after 110 000 evaluations by GREED-RNA.
(b) Distribution of the BPD for problem 90
after 220 000 evaluations by GREED-RNA.
(c) Distribution of the BPD for problem
90 after 110 000 evaluations by MOGNR-
PALR.
(d) Distribution of the BPD for problem
90 after 220 000 evaluations by MOGNR-
PALR.
(e) Comparison of the distributions after
220 000 evaluations between GREED-RNA
and MOGNRPALR.
(f) Evolution of the average BPD of
GREED-RNA and MOGNRPALR for
problem 90.
Fig. 5: Comparison of the BPD on problem 90 for GREED-RNA and MOGNRPALR
for increasing numbers of evaluations. 220 000 evaluations by one process takes one
day. GREED-RNA is stuck and does not solve the problem while MOGNRPALR pro-
gresses and solves the problem 6 times out of 200 runs.
16 Tristan Cazenave
(a) Distribution of the BPD for problem 100
after 265 000 evaluations by GREED-RNA.
(b) Distribution of the BPD for problem 100
after 530 000 evaluations by GREED-RNA.
(c) Distribution of the BPD for problem
100 after 265 000 evaluations by MOGN-
RPALR.
(d) Distribution of the BPD for problem
100 after 530 000 evaluations by MOGN-
RPALR.
(e) Comparison of the distributions after
530 000 evaluations between GREED-RNA
and MOGNRPALR.
(f) Evolution of the average BPD of
GREED-RNA and MOGNRPALR for
problem 100.
Fig. 6: Comparison of the BPD on problem 100 for GREED-RNA and MOGNRPALR
for increasing numbers of evaluations. 530 000 evaluations by one process takes one
day. GREED-RNA solves problem 100 less frequently than MOGNRPALR.
ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
RNA design, also known as the RNA inverse folding problem, involves discovering a nucleotide sequence that folds into a target structure. This problem has been addressed from a wide number of approaches, improving the ability to solve it in a reasonable time over time. Despite all these efforts, today no method has completely solved the problem. We present GREED-RNA, a new RNA design algorithm, based on a simple greedy evolutionary strategy. The main feature is the use of several objective functions (Base-pair distance, Hamming distance, probability over ensemble, partition function, ensemble defect and GC-content) to select the best solution in each iteration, changing their weight according to the problem-solving conditions. The performance of GREED-RNA was tested using the Eterna100 benchmark, widely used in this area and never fully solved by any method. In addition, a comparative analysis against several published RNA design methods considering three metrics (solved structures, success rate and execution time), allowed us to verify that GREED-RNA performs better than previously developed methods, thus successfully improving the current ability to solve this problem. This tool also allows users to select a range within which the GC-content of the solution sequences must fall. Source code and results are available at https://github.com/iARN-unex/GREED-RNA.
Article
Full-text available
Ribonucleic acid (RNA) sequence controls many biological functions in the body of living organisms including protein synthesis and gene regulations. The sequence folds according to the low Minimum Free Energy (MFE) which enables it to achieve various roles. Given a target RNA structure, an RNA sequence can be designed to fold accordingly in a process called RNA Inverse Folding. In this paper, we present an RNA Inverse Folding model that performs a single-step look-ahead to select optimal RNA base and base pairs to design RNA sequences. Our proposed model (SPRNA) learns to accurately design RNA sequences based on self-improving policy function. The model recorded state-of-the-art results on two test datasets and very competitive results on others.
Article
Full-text available
Background We study in this work the inverse folding problem for RNA, which is the discovery of sequences that fold into given target secondary structures. Results We implement a Lévy mutation scheme in an updated version of an evolutionary inverse folding algorithm and apply it to the design of RNAs with and without pseudoknots. We find that the Lévy mutation scheme increases the diversity of designed RNA sequences and reduces the average number of evaluations of the evolutionary algorithm. Compared to , CPU time is higher but more successful in finding designed sequences that fold correctly into the target structures. Conclusion We propose that a Lévy flight offers a better standard mutation scheme for optimizing RNA design. Our new version of is available on GitHub as a python script and the benchmark results show improved performance on both and the datasets, compared to existing inverse folding tools.
Article
Full-text available
Background Artificially synthesized RNA molecules provide important ways for creating a variety of novel functional molecules. State-of-the-art RNA inverse folding algorithms can design simple and short RNA sequences of specific GC content, that fold into the target RNA structure. However, their performance is not satisfactory in complicated cases. ResultWe present a new inverse folding algorithm called MCTS-RNA, which uses Monte Carlo tree search (MCTS), a technique that has shown exceptional performance in Computer Go recently, to represent and discover the essential part of the sequence space. To obtain high accuracy, initial sequences generated by MCTS are further improved by a series of local updates. Our algorithm has an ability to control the GC content precisely and can deal with pseudoknot structures. Using common benchmark datasets for evaluation, MCTS-RNA showed a lot of promise as a standard method of RNA inverse folding. ConclusionMCTS-RNA is available at https://github.com/tsudalab/MCTS-RNA.
Chapter
The inverse RNA folding problem deals with designing a sequence of nucleotides that will fold into a desired target structure. Generalized Nested Rollout Policy Adaptation (GNRPA) is a Monte Carlo search algorithm for optimizing a sequence of choices. It learns a playout policy to intensify the search of the state space near the current best sequence. The algorithm uses a prior on the possible actions so as to perform non uniform playouts when learning the instance of problem at hand. We trained a transformer neural network on the inverse RNA folding problem using the Rfam database. This network is used to generate a prior for every Eterna100 puzzle. GNRPA is used with this prior to solve some of the instances of the Eterna100 dataset. The transformer prior gives better result than handcrafted heuristics.
Chapter
The RNA Inverse Folding problem comes from computational biology. The goal is to find a molecule that has a given folding. It is important for scientific fields such as bioengineering, pharmaceutical research, biochemistry, synthetic biology and RNA nanostructures. Nested Monte Carlo Search has given excellent results for this problem. We propose to adapt and evaluate different Monte Carlo Search algorithms for the RNA Inverse Folding problem.
Chapter
Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search algorithm for single player games. In this paper we propose to generalize NRPA with a temperature and a bias and to analyze theoretically the algorithms. The generalized algorithm is named GNRPA. Experiments show it improves on NRPA for different application domains: SameGame and the Traveling Salesman Problem with Time Windows.
Chapter
Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search algorithm for single player games. In this paper we propose to modify NRPA in order to improve the stability of the algorithm. Experiments show it improves the algorithm for different application domains: SameGame, Traveling Salesman with Time Windows and Expression Discovery.