Master Thesis September 2011

Spatial Averaging : a new Monte Carlo approach for
sampling rare-event problems.
Florent Hedin
Master Chemoinformatic M2 Research, Université de Strasbourg
March 2011 – August 2011
Laboratory of Physical Chemistry, Team of Prof. M. Meuwly
University of Basel, Switzerland
Spatial Averaging is a Monte Carlo method that introduces a new family of probability densities, improving the sampling efficiency for rare-event problems while conserving the statistical properties of the original distribution. After a theoretical overview of Monte Carlo methods, the principles of Spatial Averaging are introduced. An application to the search for the lowest minima of Lennard-Jones clusters is then detailed. Finally, an implementation in CHARMM is presented and illustrated with a conformational study of the alanine dipeptide in vacuum.
A Theoretical Overview
1 Generalities on Monte Carlo Simulations
1.1 Concepts of MC Simulations
1.1.1 History
1.1.2 Definition
1.1.3 Markov Chain MC (MCMC)
1.1.4 The Metropolis-Hastings algorithm
1.1.5 Limitations
2 Spatial Averaging Method
2.1 Goal
2.2 Generalities
2.3 Theory
2.4 Adaptation to Molecular systems
2.5 About the specific parameters of Spatial Averaging
B Implementations and results
3 Lennard-Jones clusters
3.1 Goal
3.2 Interest of LJ clusters
3.3 Details on the implementation
3.4 Results
3.4.1 LJ7
3.4.2 LJ13, LJ19, LJ55
3.4.3 LJ31, LJ38
4 Implementation in CHARMM
4.1 Generalities on CHARMM and the MC module
4.2 Specificities of the implementation in CHARMM
4.3 Conformational study of the Alanine Dipeptide
One of the main aims of computational studies is to explore the behaviour of a system, but even for small systems it is not always possible to enumerate explicitly all the possible configurations. It is then desirable to use a method that samples rare events efficiently, i.e. events occurring with a very low probability.
Using classical Molecular Dynamics (MD) is one possibility: the classical equations of motion of the system are numerically integrated over short discrete time intervals (of the order of a femtosecond). Unfortunately, the average time between two occurrences of a rare event is usually considerably larger than the integration time step: the time one has to wait before observing such an event during an MD simulation may be considerable, so an extremely long simulation is required.
It may then be necessary to consider a Monte Carlo (MC) method, based on stochastic numerical experiments. In its simplest form, successive configurations are randomly generated to explore the space of possibilities, without discrimination: as the number of trials tends to infinity, the probability of observing the rare event tends to one. Nevertheless, only very few of those states are significant, because most of them have a very low Boltzmann weight: a better sampling is possible with Markov chain methods, such as the Metropolis-Hastings algorithm.
One of the advantages of MC is that it does not need to follow a realistic energetic path when sampling configurations, so it can make extreme changes to the configuration rapidly, a useful aspect when considering events with a long occurrence time.
As a drawback, the sampling efficiency of MC depends strongly on the choice of the moving atoms and of the types of moves allowed: for example, for biomolecules of substantial size, if only random translations of atoms are used, MC may be 10 times slower than classical MD; but when combined with dihedral moves of the backbone, MC may be 3 to 4 times faster than MD.
The goal of this internship was to implement the Spatial Averaging Monte Carlo method in CHARMM, a method that modifies the probability density in order to make the sampling of rare events easier.
After a chapter dedicated to classical MC methods, the theory and principles of Spatial Averaging are discussed. Then a test on Lennard-Jones clusters of noble gases is presented, with the goal of sampling the lowest-energy states. Finally, the implementation in CHARMM is discussed and applied to a conformational study of the alanine dipeptide.
Part A
Theoretical Overview
Chapter 1
Generalities on Monte Carlo Simulations
1.1 Concepts of MC Simulations
1.1.1 History
In 1733, Georges-Louis Leclerc de Buffon posed “Buffon’s needle problem”, in which π is estimated by dropping n needles of length l onto a floor made of parallel strips of wood of width t (with l ≤ t). If h is the number of needles crossing a line between two strips, Buffon showed that π is approximated by Equation 1.1:
$$\pi \approx \frac{2\,l\,n}{t\,h} \qquad (1.1)$$
In 1946, Stanislaw Ulam, a scientist working on the Manhattan Project at the Los Alamos laboratory, suggested using stochastic methods to evaluate complicated mathematical integrals. He studied this idea with John von Neumann and Nicholas Metropolis, and their work was codenamed “Monte Carlo”, in reference to the games of chance of the casino of Monaco. In 1949 an article entitled “The Monte Carlo Method” [1] was published, defining the concept of MC experiments.
1.1.2 Definition
The term Monte Carlo method describes any technique that approximates solutions to quantitative problems by statistical sampling: it relies on repeated random sampling to compute its results, and is therefore a stochastic method. The following pattern describes the steps of a basic simulation:
1. Define a domain of application (i.e. select the “items” sampled by the MC method).
2. Generate random values following a probability distribution over this domain.
3. Perform a classical (deterministic) computation on the sampled items.
4. Repeat the previous steps as long as needed.
As an example, it is possible to imagine an MC extension of Buffon’s experiment for estimating π. The steps of this algorithm are as follows:
1. Consider a circle of radius 1 centred on the origin and the square $0 \le x \le 1$, $0 \le y \le 1$: the part of the disc contained in the square has an area of $\pi/4$, while the square has an area of 1. By generating many random points in the square, it is therefore possible to estimate this area.
2. Randomly generate a point P of coordinates (x, y) with $0 \le x \le 1$ and $0 \le y \le 1$.
3. Check whether this point is in the disc, i.e. whether $x^2 + y^2 \le 1$, and increment a counter i if it is the case.
4. Repeat this experiment n times.
At the end of this loop, π is estimated via Equation 1.2:
$$\pi \approx \frac{4\,i}{n} \qquad (1.2)$$
For an accurate approximation of π, two common requirements of MC methods have to be satisfied. Firstly, the generated coordinates should be truly random, i.e. the random numbers have to be uniformly distributed over the whole allowed space (here the whole square). Secondly, there should be a large number of inputs, as the quality of the approximation increases with the number of trials.
Table I shows the difference in percent between π and its estimated value as a function of the number of trials, obtained with a C++ implementation of the previous algorithm. Figure A is a graphical representation of the results for 1000 trials: the condition $x^2 + y^2 \le 1$ is satisfied for 784 points, so Equation 1.2 gives π ≈ 4 × 784 / 1000 = 3.136.
Trials       10     100    1000    10000    10^6     10^9
Diff in %    1.86   1.96   0.178   0.128    0.102    4.93 × 10^-3
Table I: Difference between π and its estimated value depending on the number of trials, MC method.
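The estimation procedure of this section can be sketched in a few lines. The thesis used a C++ program that is not reproduced here; the version below is only a minimal Python sketch, and the function name `estimate_pi` and the seeded generator are illustrative assumptions.

```python
import random


def estimate_pi(n_trials, seed=0):
    """Estimate pi by sampling points uniformly in the unit square."""
    rng = random.Random(seed)  # seeded generator (assumption, for reproducibility)
    inside = 0                 # the counter i of step 3
    for _ in range(n_trials):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:   # the point falls inside the quarter disc
            inside += 1
    return 4.0 * inside / n_trials  # Equation 1.2


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, estimate_pi(n))
```

The values obtained depend on the random sequence, so they will not reproduce Table I digit for digit; only the general decrease of the error with the number of trials is expected.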
Figure A: Graphical representation of the results shown in Table I for 1000 trials (π ≈ 3.136): the 784 points inside the red arc of circle satisfy the condition $x^2 + y^2 \le 1$.
1.1.3 Markov Chain MC (MCMC)
In the previous example, all the pairs (x, y) are generated independently, i.e. 1) there is no relation between the pair at step n and the pair at step n + 1, and 2) the algorithm does not keep track of the state of the system at the previous step. This is not a problem for a simple area calculation with a well-defined criterion (here the size of the circle), but if we want to use an MC algorithm to compare two states of a system we have to be able to quantify its evolution.
This is the principle of Markov Chain Monte Carlo methods, where the next state depends only on the current state, and not on the entire set of previous states:
$$P(X_{n+1} = x \mid X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = P(X_{n+1} = x \mid X_n = x_n) \qquad (1.3)$$
Equation 1.3 means that the probability of reaching a state $X_{n+1} = x$ knowing all the previous states $X_i$, i = 1..n, is the same as when knowing only the state $X_n = x_n$. The process is stochastic, and if the random values follow a probability distribution, the ensemble of all the generated states will follow this distribution.
A Markov chain is a type of random walk: the system moves around the equilibrium distribution with no tendency for the steps to proceed in a particular direction.
Some examples of MCMC algorithms are:
• Metropolis-Hastings algorithm.
• Gibbs sampling.
• Slice sampling.
1.1.4 The Metropolis-Hastings algorithm
This algorithm was proposed in 1953 by Nicholas Metropolis, Arianna & Marshall Rosenbluth and Augusta & Edward Teller, for the case of the Boltzmann distribution [2]; W. Keith Hastings extended it to the general case in 1970 [3]. The algorithm uses a Markov chain to generate states obeying a given distribution (after a sufficiently long time).
Considering the generalized form of Hastings:
1. $X_n$ is the current state, of probability $P(X_n)$; the candidate for the next state is $X_p$, of probability $P(X_p)$.
2. $Q(X_p; X_n)$ is a proposal density depending on $X_n$ for generating $X_p$; reciprocally, $Q(X_n; X_p)$ is defined.
3. α is a random number uniformly distributed in [0; 1].
4. $X_p$ will be accepted as the state $X_{n+1}$ if and only if:
$$\alpha < \frac{P(X_p)\,Q(X_n; X_p)}{P(X_n)\,Q(X_p; X_n)} \qquad (1.4)$$
With this general form the proposal density does not need to be symmetric, i.e. $Q(X_p; X_n)$ and $Q(X_n; X_p)$ may differ (for example two Gaussian distributions with different parameters). In the classical Metropolis algorithm this was not considered: the candidate states were generated symmetrically around the state $X_n$, so $Q(X_p; X_n) = Q(X_n; X_p)$. Furthermore, for the Boltzmann distribution $P(X) \propto e^{-E_X / k_B T}$, where $E_X$ is the energy of state X, T the temperature and $k_B$ the Boltzmann constant, all in consistent units. Equation 1.4 thus becomes:
$$\alpha < e^{-(E_{X_p} - E_{X_n}) / k_B T} \qquad (1.5)$$
And the general algorithm becomes:
1. Given a configuration A of energy $E_A$, generate a new configuration B via some MC moves and evaluate $E_B$.
2. If $E_B < E_A$, the state B is accepted.
3. Else, apply Equation 1.5: if α is smaller than the right-hand side, the state B is accepted.
4. Else, state B is rejected.
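The four steps above can be sketched for a single coordinate as follows. This is only an illustration under stated assumptions: the harmonic test potential, the uniform symmetric move of amplitude `step`, and the names `metropolis` and `harmonic` are hypothetical and are not taken from the thesis code.

```python
import math
import random


def metropolis(energy, x0, n_steps, kT=1.0, step=0.5, seed=0):
    """Sample exp(-energy(x) / kT) with the classical Metropolis criterion."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        # Symmetric proposal centred on the current state (assumed move type).
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)
        # Step 2: downhill moves are always accepted.
        # Step 3: uphill moves are accepted if alpha < exp(-(E_B - E_A) / kT).
        if e_new < e or rng.random() < math.exp(-(e_new - e) / kT):
            x, e = x_new, e_new
        # Step 4: on rejection the current state is kept and counted again.
        samples.append(x)
    return samples


def harmonic(x):
    """Hypothetical test potential E(x) = x^2 / 2."""
    return 0.5 * x * x


if __name__ == "__main__":
    chain = metropolis(harmonic, 0.0, 20000, kT=1.0, step=1.0)
    mean = sum(chain) / len(chain)
    print("mean position:", mean)
```

For this harmonic potential the sampled positions should follow a Gaussian of variance kT, which gives a simple way to check the sampler.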
1.1.5 Limitations
Even if the Metropolis-Hastings criterion allows the sampling of states with a higher energy, it cannot efficiently sample states separated by barriers of height well above $k_B T$; some special algorithms were developed to solve this problem, at least partially.
One of them is parallel tempering [4, 5], also known as replica exchange MCMC:
1. N copies of the system, randomly initialised, are run at different temperatures.
2. Then, using the Metropolis criterion, configurations are exchanged between the N copies, so that some high-temperature configurations become available to low-temperature ones, and vice versa.
So in the case of parallel tempering, for an exchange between two copies of energies $E_n$ and $E_p$ run at temperatures $T_n$ and $T_p$, Equation 1.5 becomes:
$$\alpha < e^{(E_p - E_n)\left(\frac{1}{k_B T_p} - \frac{1}{k_B T_n}\right)} \qquad (1.6)$$
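The exchange test between two copies can be sketched as below. This assumes the standard replica-exchange form of the criterion, in which the exponent is the product of the energy difference and the difference of inverse temperatures; the helper name `swap_accepted` and its arguments are hypothetical.

```python
import math
import random


def swap_accepted(e_n, e_p, kT_n, kT_p, rng=random):
    """Decide whether two copies exchange their configurations.

    e_n, e_p   : energies of the copies run at temperatures T_n and T_p
    kT_n, kT_p : the corresponding k_B * T values (same units as the energies)
    rng        : any object with a random() method returning a float in [0, 1)
    """
    exponent = (e_p - e_n) * (1.0 / kT_p - 1.0 / kT_n)
    if exponent >= 0.0:
        return True  # the right-hand side is >= 1: always accepted
    return rng.random() < math.exp(exponent)


if __name__ == "__main__":
    # A hot copy (kT = 2) holding a lower energy than the cold copy (kT = 1):
    print(swap_accepted(e_n=0.0, e_p=-1.0, kT_n=1.0, kT_p=2.0))
```

With this sign convention, an exchange that hands the lower-energy configuration to the colder copy is always accepted, while the reverse exchange is accepted only with a Boltzmann-like probability.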
The Spatial Averaging algorithm, whose goal is also to sample rare events, works with an ensemble of copies of the system too; but, as we will see, the sampling is not improved by tuning a control parameter (the temperature in the case of parallel tempering), but directly by modifying the probability distribution.
Chapter 2
Spatial Averaging Method
2.1 Goal
As said previously, the Spatial Averaging concept is applied to MC simulations in order to improve the sampling of rare-event problems: its originality lies in the fact that this is done by modifying the probability density function (pdf) itself, whereas other methods try to use one given pdf differently.
The key feature of this approach is therefore the construction of a modified pdf related to the original one: this is detailed in the next sections.
2.2 Generalities
Most of the material in the next two sections comes from the publication “A spatial averaging approach to rare-event sampling” [6], by N. Plattner, J. D. Doll and M. Meuwly, which lays the foundations of the method.
The modified pdf must have two specific properties:
1. The integral of the modified pdf over the whole space is identical to that of the parent distribution: this is needed if we want thermodynamic properties close to the original ones.
2. The modified pdf is easier to sample than the original one: otherwise, there is no benefit in using it.
This technique does not require any a priori knowledge of the specificities of the rare events: for example, we do not need to know a reaction path in order to sample its different states.
2.3 Theory
For simplicity a one-dimensional system is used, but the results generalise directly to multi-dimensional systems.
We consider a one-dimensional particle in a potential V: the (unnormalized) probability density for this particle to be at a point x, of potential V(x), is:
$$\rho(x, 0) = \exp(-\beta V(x)) \qquad (2.1)$$
with $\beta = \frac{1}{k_B T}$. The modified pdf used for this algorithm is then defined as follows:
$$\rho(x, \epsilon) = \int P_\epsilon(y) \exp(-\beta V(x + y))\, dy \qquad (2.2)$$
where $P_\epsilon(y)$ is a normalized probability distribution of length scale $\epsilon$: we take $P_\epsilon(y)$ to be a Gaussian distribution of standard deviation $\epsilon$. Adjusting this parameter gives the distribution a good flexibility, and with a sufficiently large value it becomes easier to sample states far from the centre. Furthermore, as this Gaussian is normalized and the averaging is centred on x, we have:
$$\int \rho(x, 0)\, dx = \int \rho(x, \epsilon)\, dx \qquad (2.3)$$
Equation 2.3 satisfies condition 1 considered previously: “The integral of the modified pdf over the whole space is identical to that of the parent distribution”. As the choice of a Gaussian distribution is purely arbitrary, it might be possible to use other types of probability distribution, as long as the two previous properties are respected.
Integrating Equation 2.2 over all space shows that it is possible to invert the order of the integrations, so this method is not affected by the random walk of Markov chains.
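Equations 2.2 and 2.3 can be checked numerically on a toy system. The sketch below is only an illustration: the double-well potential `V`, the values β = 5 and ε = 0.5, the integration grids and all function names are assumptions chosen for the example, not quantities from the thesis.

```python
import math


def V(x):
    """Hypothetical double-well potential with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def rho0(x, beta):
    """Original (unnormalized) density, Equation 2.1."""
    return math.exp(-beta * V(x))


def rho_eps(x, beta, eps, n_y=201):
    """Spatially averaged density, Equation 2.2, by a simple Riemann sum
    over y in [-6 eps, +6 eps] with a Gaussian P_eps of std eps."""
    dy = 12.0 * eps / (n_y - 1)
    norm = 1.0 / (eps * math.sqrt(2.0 * math.pi))
    total = 0.0
    for k in range(n_y):
        y = -6.0 * eps + k * dy
        p = norm * math.exp(-0.5 * (y / eps) ** 2)
        total += p * rho0(x + y, beta) * dy
    return total


def integrate(f, a=-6.0, b=6.0, n=801):
    """Riemann sum of f over [a, b]; the densities vanish at the bounds."""
    dx = (b - a) / (n - 1)
    return sum(f(a + i * dx) for i in range(n)) * dx


if __name__ == "__main__":
    beta, eps = 5.0, 0.5
    print("integral of rho(x, 0)  :", integrate(lambda x: rho0(x, beta)))
    print("integral of rho(x, eps):", integrate(lambda x: rho_eps(x, beta, eps)))
    print("barrier/well ratio, original:", rho0(0.0, beta) / rho0(1.0, beta))
    print("barrier/well ratio, averaged:",
          rho_eps(0.0, beta, eps) / rho_eps(1.0, beta, eps))
```

The two integrals should agree to within the quadrature error, while the density at the barrier top (x = 0) relative to the well (x = 1) is much larger for the averaged pdf, which is what makes barrier crossing easier to sample.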
2.4 Adaptation to Molecular systems
The previous algorithm now has to be adapted to a three-dimensional system of multiple atoms. In a second publication [7], N. Plattner, J. D. Doll and M. Meuwly proposed the following procedure:
A variable number $N_\epsilon$ of configurations is generated for each coordinate of the atoms selected for moving; their distribution is Gaussian, proportional to $e^{-(x - x_0)^2 / 2 W_\epsilon^2}$, centred on the original coordinates $x_0$ and with a width of $W_\epsilon$. Then the MC move (for example a translation of some atoms) is applied to all the $N_\epsilon$ configurations and the energies are evaluated. The principle is then the same as for classical Metropolis-Hastings, except that a specific criterion is used.
In practice, it means:
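1. Consider a trial configuration of the system, of coordinates $\vec{x}_0$, as in any conventional MC, and select a type of move.
2. Around this $\vec{x}_0$, generate $M_\epsilon$ sets of $N_\epsilon$ configurations following a Gaussian distribution of standard deviation $W_\epsilon$ centred on $\vec{x}_0$.
3. Apply the chosen MC move to all of the $M_\epsilon \cdot N_\epsilon$ configurations.
4. Compute the $M_\epsilon \cdot N_\epsilon$ energies corresponding to those configurations: the old energies (before the MC move) are $E^{(m,n)}_{old}$, the new ones are $E^{(m,n)}_{new}$. Then we define the Boltzmann weights as:
$$E^{(m,n)}_{old,Boltz} = e^{-\beta E^{(m,n)}_{old}} \quad \text{and} \quad E^{(m,n)}_{new,Boltz} = e^{-\beta E^{(m,n)}_{new}}$$
5. For each of the $M_\epsilon$ sets, evaluate:
$$S^m_{old} = \sum_{n=1}^{N_\epsilon} E^{(m,n)}_{old,Boltz} \quad \text{and} \quad S^m_{new} = \sum_{n=1}^{N_\epsilon} E^{(m,n)}_{new,Boltz}$$
And then:
$$\delta_m = -\ln\left(\frac{S^m_{new}}{S^m_{old}}\right)$$
6. Then we define δ as the average of the $\delta_m$ over the $M_\epsilon$ sets, and $\sigma^2$ as their variance.
7. Then $\delta + \frac{\sigma^2}{2}$ replaces the ∆E of the Metropolis criterion, and Equation 1.5 becomes:
$$\alpha < \exp\left(-\beta\left(\delta + \frac{\sigma^2}{2}\right)\right) \qquad (2.4)$$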
This criterion is homogeneous to an energy. With this approach the criterion is an average over the different sets, each set containing a certain number of different configurations: so if after step 3 (MC moves) none of the new configurations has an energy $E^{(m,n)}_{new}$ lower than that of the reference configuration of step 1, the averaged value is used to decide whether a state higher in energy can be accepted.
It is thus possible to accept a state with a ∆E significantly higher than with classical Metropolis, because its energy is averaged with those of the other $M_\epsilon \cdot N_\epsilon$ configurations, with the implicit guarantee that this average is not too large because of the Gaussian distribution. The probability of “jumping” over barriers higher than $k_B T$ is therefore increased.
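Steps 4 to 7 of the procedure can be sketched as a small function working on the two tables of energies. Several points are assumptions made for this illustration only: the energies are passed in reduced form (already multiplied by β, so the prefactor of the criterion is 1), δ is taken as the mean of the per-set values and σ² as their population variance, and the names `sa_criterion` and `sa_accept` are hypothetical.

```python
import math
import random


def sa_criterion(old, new):
    """Return delta + sigma^2 / 2 for M sets of N reduced energies.

    old[m][n], new[m][n] : reduced energies (beta * E) of configuration n of
    set m, before and after the MC move.
    """
    deltas = []
    for e_old, e_new in zip(old, new):
        s_old = sum(math.exp(-e) for e in e_old)   # S^m_old
        s_new = sum(math.exp(-e) for e in e_new)   # S^m_new
        deltas.append(-math.log(s_new / s_old))    # delta_m
    m = len(deltas)
    delta = sum(deltas) / m
    # Population variance over the M sets (normalisation is an assumption).
    sigma2 = sum((d - delta) ** 2 for d in deltas) / m
    return delta + 0.5 * sigma2


def sa_accept(old, new, rng=random):
    """Metropolis-like test in which the criterion replaces beta * Delta E."""
    criterion = sa_criterion(old, new)
    return criterion <= 0.0 or rng.random() < math.exp(-criterion)


if __name__ == "__main__":
    # One set of one configuration: the criterion is just beta * (E_new - E_old).
    print(sa_criterion([[1.0]], [[3.0]]))
```

With a single set containing a single configuration the function returns the plain reduced energy difference, i.e. the classical Metropolis case. For very large energies the exponentials may under- or overflow; a production code would shift the energies first.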
2.5 About the specific parameters of Spatial Averaging
A Spatial Averaging MC simulation is thus characterized by a triplet $[W_\epsilon, M_\epsilon, N_\epsilon]$. With this nomenclature, a classical Metropolis-Hastings simulation has the values [0.0,1,1]: the Gaussian distribution becomes a Dirac δ function, consistent with the fact that there is only one configuration, $\vec{x}_0$, and steps 3 to 6 simplify to give the classical acceptance criterion of Equation 1.5. Spatial Averaging can therefore be considered as an extension of the Metropolis-Hastings algorithm.
In one of their papers [7], N. Plattner, J. D. Doll and M. Meuwly showed that triplets such as [1.0,30,30] or [2.0,40,40] may allow rare events to be sampled on some systems with only 1000 MC moves.
As with other MC simulations, a maximal range $x^{max}_t$ for the coordinate moves is defined, in order to avoid incoherent molecular geometries.
Part B
Implementations and results of
Spatial Averaging MC simulations
Chapter 3
Lennard-Jones clusters
3.1 Goal
The objective is to test Spatial Averaging MC on a well-studied system with many possible configurations, and to see whether our method allows us to locate the global energy minima.
Lennard-Jones clusters are defined as ensembles of non-reactive atoms in vacuum (for example noble gases), interacting only through the Lennard-Jones (LJ) potential [8]. The energy of this type of system is, for n particles:
$$V_{LJ} = 4\varepsilon \sum_{i<j}^{n} \left[ \left(\frac{r_0}{r_{ij}}\right)^{12} - \left(\frac{r_0}{r_{ij}}\right)^{6} \right] \qquad (3.1)$$
where $r_{ij}$ is the distance between atoms i and j, ε is the depth of the potential well, and $r_0$ the distance at which $V_{LJ} = 0$ for a pair of atoms. The $r_{ij}^{-12}$ term describes the short-range repulsion due to the overlap of orbitals, and the $r_{ij}^{-6}$ term describes the long-range attraction. Figure B illustrates this variation.
To simplify the study, reduced units are employed, i.e. ε = r0 = 1, and the energy is expressed as a multiple of ε.
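As an illustration of Equation 3.1 in reduced units, the total energy of a cluster can be computed with a double loop over pairs. This is a minimal Python sketch and not the Fortran routine used in this work; the function name `lj_energy` and the list-of-tuples layout of the coordinates are assumptions.

```python
import math


def lj_energy(coords):
    """Total Lennard-Jones energy in reduced units (epsilon = r0 = 1).

    coords : list of (x, y, z) tuples, one per atom.
    """
    energy = 0.0
    n = len(coords)
    for i in range(n - 1):
        for j in range(i + 1, n):            # each pair i < j counted once
            r2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            inv6 = 1.0 / r2 ** 3             # (r0 / r_ij)^6 with r0 = 1
            energy += 4.0 * (inv6 * inv6 - inv6)
    return energy


if __name__ == "__main__":
    r_min = 2.0 ** (1.0 / 6.0)               # pair distance of minimal energy
    dimer = [(0.0, 0.0, 0.0), (r_min, 0.0, 0.0)]
    print("dimer energy:", lj_energy(dimer))
```

Two quick checks follow from the formula: a pair at distance 1 has zero energy, and a pair at distance 2^(1/6) sits at the bottom of the well, with an energy of one well depth below zero.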
3.2 Interest of LJ clusters
Many publications on these clusters have been available since the 1970s [9, 10, 11, 12, 13]. A website centralises the known structures, lowest minima and symmetry groups for clusters of 2 to 1610 atoms: classical MD, quantum calculations, classical MC methods and genetic algorithms were used to obtain those results, and the $LJ_N$ clusters (with N the number of atoms) became reference systems for methods dedicated to the localisation of rare events.
Table II shows the known (or estimated) number of minima for different LJ clusters: we can see that it grows dramatically with N, as for 33 atoms the value is 4 × 10^14!
Number of atoms     2   4   7   10   13    15      19         33
Number of minima    1   1   4   57   366   10700   2 × 10^6   4 × 10^14
Table II: Number of minima for several LJ clusters. Source: [10]
Figure B: LJ potential $V_{LJ}$ (in units of ε) for 2 particles as a function of the distance $r_{ij}$. Modified version of the original from Olaf Lenz, licence CC BY-SA 3.0.
To prove the efficiency of Spatial Averaging MC, we decided to apply it to several $LJ_N$, with N ∈ {7, 13, 19, 31, 38, 55}. LJ7 was the structure studied during the development of the algorithm: as it has only 4 minima, we expect to sample them very quickly and easily. Once the results on this case were good, we considered the structures LJ13, LJ19 and LJ55: the number of minima increases exponentially, but the best one of each has an icosahedral geometry and is therefore much more stable than the others. Finally, the cases LJ31 and LJ38 were considered, which present a non-icosahedral best minimum, really close in energy to the others and so difficult to sample. Furthermore, several publications [9, 12, 11] have studied the clusters with N = 13, 19, 31, 38 and 55 in detail, which allows us to compare our results with them.
3.3 Details on the implementation
A Fortran program was created for this purpose, computing the Lennard-Jones potential according to Equation 3.1, with reduced units as explained before.
The book “Stochastic simulations of clusters” [14] by E. Curotto proposes some simple implementations of stochastic algorithms, one of which is the search for the 4 minima of LJ7 via the Basin Hopping algorithm (a variant of Metropolis). This program was the starting point: by modifying and improving it, it was transformed into an implementation of the Spatial Averaging MC method. The part computing the LJ energy has not been modified or rewritten, as no change in the way of evaluating the potential was required.
Here is an overview of the implementation:
1. The program is launched with the following parameters, interactively or via an input file: the number N of atoms, the triplet $[W_\epsilon, M_\epsilon, N_\epsilon]$, the number of steps $N_{steps}$ of the simulation, a value $k_{MC}$ for weighting the MC moves, the desired number of simulations $N_{runs}$, and an integer seed for initialising the random number generator.
2. If $N_{runs} > 1$, several runs are launched in parallel (thanks to the OpenMP library) if the number of CPUs is ≥ 2: they act as different runs of the program, as no data is shared.
3. Then 3N random numbers are generated and become the coordinates of the N atoms: those numbers are uniformly distributed in [0; 1[ and multiplied by a value defined internally. This does not exclude the possibility of two atoms being really close, but the probability is reduced. Furthermore, when it is the case, the system quickly evolves to a state where this problem disappears, as the moves decreasing the $r^{-12}$ part of the potential are automatically accepted. After this step we have an initial configuration, which will be the $\vec{x}_0$ for the first iteration of the loop.
4. Then we enter the main loop, repeated $N_{steps}$ times:
• One atom is selected for moving at each step, so first we have to create different configurations of our $LJ_N$ cluster: for this the Box-Muller algorithm [15] is used to generate $M_\epsilon \cdot N_\epsilon$ states Gaussian-distributed around the reference configuration $\vec{x}_0$ with a width of $W_\epsilon$, so we know that statistically 95% of the generated coordinates lie in the interval $[x_0 - 1.96\,W_\epsilon;\ x_0 + 1.96\,W_\epsilon]$. All those states are stored in an array.
• The energies of those configurations are computed, and stored in a second array.
• Secondly, for the selected atom, defined by its 3 coordinates (x, y, z), 3 random numbers (δx, δy, δz) in [0; 1[ are generated and added after having been weighted, so the new coordinates are:
$$(x + k_{MC}\,\delta x,\ y + k_{MC}\,\delta y,\ z + k_{MC}\,\delta z)$$
• The energies of those after-move configurations are evaluated too and stored.
• Then comes the acceptance/rejection phase, implemented exactly as explained in Section 2.4.
• During the last 10% of the $N_{steps}$ steps, the value of $W_\epsilon$ is divided by 100 to stabilise the energy: the system remains more centred on the current minimum for those last steps, and the precision on the energy increases.
The coordinates of the best configuration found after the main loop are stored in an XYZ file, which makes it possible to visualise the system with appropriate software, such as VMD. It is also possible to write the coordinates of the configurations accepted by the acceptance criterion, which gives a trajectory file showing the evolution of the system.
Due to the importance of the quality of the random numbers, a dedicated generator freely available on the Internet was used: dSFMT (see m-mat/MT/SFMT/#dSFMT), written by Mutsuo Saito and Makoto Matsumoto [16, 17]. Using the SSE2 instructions of modern CPUs, this generator is extremely fast. Furthermore, its period is $2^{19937} - 1$, which means that more than $10^{6000}$ numbers can theoretically be generated before the sequence repeats itself!
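The Gaussian generation step of the main loop can be sketched with the Box-Muller transform, which turns two uniform random numbers into two independent Gaussian ones. The sketch below uses Python's standard `random` module as the uniform source instead of dSFMT, and the names `box_muller` and `gaussian_copies` are hypothetical; it only illustrates the principle, not the Fortran implementation.

```python
import math
import random


def box_muller(rng):
    """Return two independent standard normal deviates from two uniforms."""
    u1 = 1.0 - rng.random()   # in (0, 1], which avoids log(0)
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def gaussian_copies(x0, width, n_copies, rng):
    """Generate n_copies of the coordinate list x0, each coordinate being
    Gaussian-distributed around its reference value with standard deviation width."""
    copies = []
    for _ in range(n_copies):
        copy = []
        for c in x0:
            z, _unused = box_muller(rng)   # second deviate discarded for simplicity
            copy.append(c + width * z)
        copies.append(copy)
    return copies


if __name__ == "__main__":
    rng = random.Random(42)
    copies = gaussian_copies([0.0, 0.0, 0.0], 0.5, 10000, rng)
    xs = [c[0] for c in copies]
    frac = sum(1 for v in xs if abs(v) <= 1.96 * 0.5) / len(xs)
    print("fraction within 1.96 widths:", frac)
```

The printed fraction should be close to 95%, in line with the interval quoted in the overview. A real implementation would keep the second deviate instead of discarding it.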
3.4 Results
For each cluster, a figure of the best minimum is shown: the atoms are coloured in different shades of red, blue and white to enhance the perspective and make the visualisation easier.
3.4.1 LJ7
As said earlier, LJ7 acts as a reference model, and its 4 minima were easily sampled. The parameters were the triplet [0.5;10;10] for Spatial Averaging MC (SP-AV MC), applied over 5000 steps (but with [0.005;10;10] for the last 500 steps, as explained previously); 100 runs were considered. Table III summarises the results: we can see that the computed energies are exactly the same as the ones available in the literature [14], and that all the 100 runs finished in one of the minima (the sum of the Frequency line is 100%), so our SP-AV MC method seems to be efficient in this case.
LJ7                 First minimum   Second minimum   Third minimum   Fourth minimum
E theoretical/ε     -16.505         -15.935          -15.593         -15.533
E SP-AV MC/ε        -16.505         -15.935          -15.593         -15.533
Frequency (%)       32              7                15              46
Table III: LJ7: energy and frequency of observation of the 4 minima. The theoretical energy is the one from the book Stochastic simulations of clusters [14]. The parameters are the triplet [0.5;10;10], 5000 steps, 100 runs.
For comparison, 100 runs were carried out with the parameters [0.0;1;1] (classical MC) and 5000 steps: the frequencies of appearance of the minima were respectively 4%, 4%, 2% and 2%, far lower than with our method.
Figure C shows the theoretical structure of LJ7, and Figure D shows the best minimum we found: we can see that the 2 structures are identical, with a $D_{5h}$ symmetry. This also agrees with the literature [9].
3.4.2 LJ13, LJ19, LJ55
The best minima of those 3 clusters are regular icosahedral structures, so even if the number of minima grows very quickly, the best one is much more stable than the others, as it is associated with a very favourable geometry: Figure E shows the best icosahedral minima for those clusters.
Calculations were launched with [1.0;10;10] and 10000 steps for LJ13 and LJ19, and with [2.0;30;30] and 10000 steps for LJ55; 100 runs of each were considered. Table IV shows our results: the best minimum of LJ13 is easy to sample (27%), that of LJ19 was found 4 times and that of LJ55 twice. We can see that the energies are quasi-identical for LJ13 and LJ19, and that we found a difference of 0.116ε for LJ55: is this the best minimum?
Figure F and Figure G are the disconnectivity graphs for those 3 LJ clusters, found in the literature [12]. The goal of those graphs is to represent all, or a certain number, of the minima in one figure, to make the comparison of the energies easy. The vertical axis represents the energy of the minima, which are represented by lines, and the horizontal distance between the lines is proportional to the size of the energy barriers. We can see in each case one long line going to the bottom of the graph: it is the best minimum and, as said previously, it is really distinct from the others thanks to the regular icosahedral geometry. In Figure G the best minimum of LJ55 is more than 3ε below the others, so even if our estimated energy is not exactly the same, we are confident that our method sampled the best minimum.
Figures H, I and J are snapshots of our best minima: they are the same as in Figure E, and the symmetry groups ($I_h$, $D_{5h}$, $I_h$) agree with the literature.
Classical MC simulations were tried with [0.0;1;1]: the best minimum of LJ13 was sampled once in 100 runs of 10000 steps; for LJ19 and LJ55 it was impossible to sample the best minima with 100 runs of 50000 or 100000 steps. Once more, our method proves its efficiency for sampling rare structures.
Figure C: Theoretical representation of LJ7 best minimum. Source: [14]
Figure D: Representation of LJ7 best minimum: symmetry D5h, energy -16.505ε
LJN                 N=13       N=19       N=55
E Th/ε              -44.327    -72.660    -279.248
E SP-AV MC/ε        -44.326    -72.659    -279.132
Frequency (%)       27         4          2
Table IV: Energy and frequency of the best minimum for LJ13, LJ19 and LJ55. The source of E Th is [9].
Figure E: Theoretical representation of best minima for LJ13, LJ19, LJ55. Source: [12]
3.4.3 LJ31, LJ38
Those 2 structures represent a challenge: their non-icosahedral structure (see Figure K) has for consequence several minima close in energy. This is confirmed by the disconnectivity graphs (Figure L), where we can see that for LJ31 the best minimum is less than 1.2ε away from 5 others; for LJ38, 4 minima are less than 2ε away from the best one, which is on the right, far from the others, meaning that it is surrounded by high potential barriers.
The calculations were launched with 250 runs of [2.0;40;40] with 10000 steps for both systems, and the results are available in Table V: the energies are in agreement with the literature [9]. But each of those structures appeared only once in 250 runs, confirming the difficulty of sampling them.
Figures M and N are the geometries obtained with our algorithm: the shapes and symmetry groups ($C_s$, $O_h$) are consistent with Figure K.
In those cases too, classical MC simulations were not able to find those minima, whereas our method once again proves its efficiency.
LJN                 N=31        N=38
E Th/ε              -133.586    -173.928
E SP-AV MC/ε        -133.581    -173.915
Frequency (%)       0.4         0.4
Table V: Energy and frequency of the best minimum for LJ31 and LJ38. The source of E Th is [9].
Figure F: Disconnectivity graphs for LJ13 (left) and LJ19 (right): all the minima are represented for LJ13 and the 250 best for LJ19. The left bar is the energy in ε. Source: [12]
Figure G: Disconnectivity graphs of the 900 best minima for LJ55: energy in ε. Source: [12]
Figure H: Representation of LJ13 best minimum: symmetry Ih, energy -44.326ε.
Figure I: Representation of LJ19 best minimum: symmetry D5h, energy -72.659ε.
Figure J: Representation of LJ55 best minimum: symmetry Ih, energy -279.132ε.
Figure K: Theoretical representation of the best minima for LJ31 (left) and LJ38 (right).
Source : [12]
Figure L: Disconnectivity graphs of the 200 and 150 best minima for LJ31 and LJ38:
energy in ε. Source : [12]
Figure M: Representation of LJ31 best minimum: symmetry Cs, energy -133.581ε.
Figure N: Representation of LJ38 best minimum: symmetry Oh, energy -173.915ε.
Chapter 4
Implementation in CHARMM
4.1 Generalities on CHARMM and the MC module
CHARMM (Chemistry at HARvard Macromolecular Mechanics) [18, 19, 20] is a molecular
simulation program, developed with a primary focus on the study of molecules of biological
interest (peptides, proteins, etc.): it provides many tools dedicated to dynamics, path
sampling methods, free energy estimates, molecular minimization, and more.
For those purposes, CHARMM can use classical force fields with explicit or implicit
solvation models, mixed quantum mechanical-molecular mechanical force fields, Monte
Carlo simulations, etc. We will focus on the latter.
The MC module [21], mainly written by A. Dinner, J. Hu and A. Ma, allows the user
to define an arbitrary set of moves on a given molecular system, and then to launch the
MC simulation via the corresponding commands in the input file. The main types of
moves are:
• Rigid Translations of one or more atoms (RTRN)
• Rigid Rotations of one or more atoms around a centre of rotation: this centre may
be another set of one or more atoms, or the centre of mass of the rotating atoms (RROT)
• Dihedral angle torsions, particularly important when considering biomolecules [22,
23] (TORS)
• Concerted rotations [24] of dihedral angles: 6 or 7 dihedrals move together, useful
for deforming a backbone (CROT)
• And others such as Hybrid Monte Carlo (HMC) ...
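To make the first two move types concrete, here is a minimal geometric sketch of a rigid translation and a rigid rotation. It is not the CHARMM code: the function names, the uniform sampling of the displacement and of the angle, and the use of the geometric centre (equal masses) as rotation centre are all assumptions made for illustration.

```python
import math
import random

def rotate_about_axis(point, centre, axis, angle_deg):
    """Rotate `point` by `angle_deg` around `axis` through `centre` (Rodrigues' formula)."""
    theta = math.radians(angle_deg)
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]                      # unit rotation axis
    v = [p - c for p, c in zip(point, centre)]        # position relative to the centre
    k_cross_v = [k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0]]
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = [v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
               for i in range(3)]
    return [r + c for r, c in zip(rotated, centre)]

def rigid_translation(coords, max_disp, rng):
    """RTRN-like move: shift all selected atoms by one random vector (assumed uniform)."""
    shift = [rng.uniform(-max_disp, max_disp) for _ in range(3)]
    return [[x + s for x, s in zip(atom, shift)] for atom in coords]

def rigid_rotation(coords, max_angle_deg, rng):
    """RROT-like move: rotate the selected atoms about their geometric centre."""
    n = len(coords)
    centre = [sum(atom[i] for atom in coords) / n for i in range(3)]
    axis = [rng.gauss(0.0, 1.0) for _ in range(3)]    # random axis direction
    angle = rng.uniform(-max_angle_deg, max_angle_deg)
    return [rotate_about_axis(atom, centre, axis, angle) for atom in coords]
```

Both moves are rigid in the sense that all interatomic distances within the moved group are preserved; only the position or orientation of the group changes.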
The objective was to implement the Spatial Averaging method in this MC
module. N. Plattner made a first implementation limited to RTRN moves, and published
interesting and promising results [7]. The main goal of this work was to implement the
method for RROT and TORS moves as well. For this, everything was restarted from scratch,
i.e. starting from the original code of the MC module and implementing Spatial Averaging
in a proper way.
4.2 About the specificities of implementing Spatial
Averaging in CHARMM
The main difference with the previous case of LJ clusters is that 3 different types of move
are allowed, so the Mε × Nε configurations around the original one are created
differently: previously, Nε sets of coordinates were distributed around the initial ones, but
if, for example, the considered move is a TORS, then we have to generate Nε dihedral angles
around the original one.
Fortunately, writing dedicated code for this task is not needed, as the MC module already
has routines whose role is to modify a dihedral angle, and likewise for the other types
of move. So we had to create an “interface” that calls the needed routines Mε × Nε times
according to the type of the current move: this interface takes care of generating
Gaussian-distributed random numbers around the current state.
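The role of such an interface can be sketched as a dispatch table. This is only an illustration under stated assumptions: the state is reduced to a small dictionary, the three `apply_*` routines are hypothetical stand-ins for the existing MC-module routines, Wε is used directly as the standard deviation of the Gaussian, and the Mε × Nε configurations are organised as Mε sets of Nε.

```python
import random

def apply_rtrn(state, rng, width):
    """Stand-in for a translation routine: Gaussian shift of the Cartesian coordinates."""
    new = dict(state)
    new["xyz"] = [x + rng.gauss(0.0, width) for x in state["xyz"]]
    return new

def apply_rrot(state, rng, width):
    """Stand-in for a rotation routine: Gaussian change of a rotation angle (degrees)."""
    new = dict(state)
    new["angle"] = state["angle"] + rng.gauss(0.0, width)
    return new

def apply_tors(state, rng, width):
    """Stand-in for a torsion routine: Gaussian change of a dihedral, wrapped to [-180, 180)."""
    new = dict(state)
    new["dihedral"] = (state["dihedral"] + rng.gauss(0.0, width) + 180.0) % 360.0 - 180.0
    return new

# The "interface": one entry per supported move type.
MOVE_ROUTINES = {"RTRN": apply_rtrn, "RROT": apply_rrot, "TORS": apply_tors}

def generate_configurations(state, move_type, w_eps, m_eps, n_eps, rng):
    """Return m_eps sets of n_eps configurations distributed around `state`."""
    routine = MOVE_ROUTINES[move_type]
    return [[routine(state, rng, w_eps) for _ in range(n_eps)] for _ in range(m_eps)]

if __name__ == "__main__":
    start = {"xyz": [0.0, 0.0, 0.0], "angle": 0.0, "dihedral": 180.0}
    sets = generate_configurations(start, "TORS", 1.0, 10, 10, random.Random(42))
    print(len(sets), "sets of", len(sets[0]), "configurations")
```

The point of the dispatch table is that the Spatial Averaging logic never needs to know how a given move modifies the system; it only asks for Mε × Nε perturbed copies of the current state.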
Some limitations have to be highlighted: due to the complexity of the MC module,
which represents more than 18000 lines of code, several options of the module
are disabled when working with Spatial Averaging. It is not possible to use some rare
and specific force fields, as the implementation is limited to the default one of CHARMM;
some features such as automatic optimization of the move sets or Hybrid MC are disabled
too. When the implementation is publicly distributed in a future version of
CHARMM, those problems will have to be solved or explicitly described in the documentation.
4.3 Conformational study of the Alanine Dipeptide
The Alanine dipeptide (Figure O) has been used as a test system for theoretical studies
[25, 26, 27] of backbone conformational equilibria: indeed, this dipeptide contains many
of the structural features of proteins, such as the two dihedral angles (φ, ψ), NH and
CO groups capable of being involved in hydrogen bonds, and a methyl group attached
to the Cα. Furthermore, thanks to its small size, it has been successfully studied via quantum
chemistry, MD and MC, both in vacuum and in water.
Figure O: Representation of the alanine dipeptide, with the dihedral angles φ and ψ.
Source: [26]
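Since the whole study is expressed in terms of (φ, ψ), it may help to recall how a dihedral angle is obtained from four atomic positions (C–N–Cα–C for φ and N–Cα–C–N for ψ). The following is a generic sketch using the usual atan2 formula; the function name is hypothetical and this is not taken from the CHARMM source.

```python
import math

def _sub(a, b):
    return [x - y for x, y in zip(a, b)]

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points, around the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)            # normal to the plane (p0, p1, p2)
    n2 = _cross(b2, b3)            # normal to the plane (p1, p2, p3)
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

if __name__ == "__main__":
    # cis arrangement: the two outer atoms are on the same side of the central bond
    print(dihedral_deg([1, 0, 0], [0, 0, 0], [0, 0, 1], [1, 0, 1]))
```

Using atan2 rather than acos keeps the sign of the angle, which is what distinguishes, for example, the left and right halves of a Ramachandran plot.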
We propose to apply our Spatial Averaging implementation in CHARMM to the
alanine dipeptide, to see whether we can easily sample the different configurations and localise
the transition paths between them.
First of all we have to detail the possible conformations:
• β, also called C5, for (φ, ψ) ≈ (-140°, 150°)
• C7eq for (φ, ψ) ≈ (-90°, 80°)
• αR (Right-handed α helix) for (φ, ψ) ≈ (-80°, -60°)
• αL (Left-handed α helix) for (φ, ψ) ≈ (60°, 60°)
• C7ax for (φ, ψ) ≈ (60°, -60°)
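A sampled point can be attributed to one of these conformations by looking for the nearest reference centre, taking the 360° periodicity of the angles into account. The sketch below assumes signed centres in degrees, with negative φ for β, C7eq and αR and negative ψ for αR and C7ax, as usually quoted for these regions; the names and the nearest-centre criterion are illustrative choices, not the method used in this work.

```python
# Assumed signed centres (degrees) of the conformational regions.
BASIN_CENTRES = {
    "beta (C5)": (-140.0, 150.0),
    "C7eq": (-90.0, 80.0),
    "alphaR": (-80.0, -60.0),
    "alphaL": (60.0, 60.0),
    "C7ax": (60.0, -60.0),
}

def angular_difference(a, b):
    """Smallest signed difference a - b between two angles, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def nearest_basin(phi, psi):
    """Name of the reference conformation closest to (phi, psi) on the periodic map."""
    def squared_distance(name):
        c_phi, c_psi = BASIN_CENTRES[name]
        return angular_difference(phi, c_phi) ** 2 + angular_difference(psi, c_psi) ** 2
    return min(BASIN_CENTRES, key=squared_distance)

if __name__ == "__main__":
    print(nearest_basin(180.0, 180.0))   # a fully extended geometry
```

With this criterion, a fully extended geometry at (180°, 180°) is attributed to the β region, because 180° and -140° are only 40° apart once periodicity is taken into account.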
Figure P shows a Ramachandran plot with the energy of the alanine dipeptide in
water: each point is characterised by a triplet (φ, ψ, E), which makes it possible to locate
the best conformations. Dark-blue zones are the most favourable ones, corresponding to
stabilizing electrostatic interactions; on the contrary, red zones are forbidden, mainly due
to steric clashes. The numbers 6 to 13 correspond to saddle points, which are the possible
paths for transitions between the different forms.
Figure P: Ramachandran plot for Alanine dipeptide in water: in colour the energy, blue
zones are the most favourable ones. Source: [26]
In the case of a vacuum study, some states are not allowed: only the states β, C7eq and
C7ax are observed, as the α ones seem to be favoured by water. The resulting Ramachandran
plot is shown in Figure Q.
Figure Q: Ramachandran plot for Alanine dipeptide in vacuum: dashed zones are the
most favourable ones. Source: [25]
Now that we have some reference plots for comparison, we can discuss our results in
vacuum. The methodology was to run simulations of 10000 steps, with three possible
types of move:
1. RTRN with a maximal displacement x_t^max = 0.15 Å, restricted to heavy atoms, i.e. not
hydrogens: if an atom bonded to an H is chosen, the latter is moved too.
2. RROT with x_t^max = 25°, restricted to heavy atoms, with the same remark.
3. TORS with x_t^max = 35° for the two dihedral angles (φ, ψ).
All those moves had the same weight of 1, so at each step each of them has a probability
of 1/3 of being chosen. The starting point is always (180°, 180°).
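The relation between the weights and the selection probabilities can be illustrated as follows. This is a generic sketch of weighted random selection, with hypothetical names; it does not reproduce the way the MC module stores or draws its move sets.

```python
import random
from collections import Counter

# Equal weights, as in the simulations described here.
MOVE_WEIGHTS = {"RTRN": 1.0, "RROT": 1.0, "TORS": 1.0}

def selection_probabilities(weights):
    """Probability of each move type: its weight divided by the sum of all weights."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

def pick_move(weights, rng):
    """Draw one move type at random, proportionally to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

if __name__ == "__main__":
    print(selection_probabilities(MOVE_WEIGHTS))
    rng = random.Random(2011)
    counts = Counter(pick_move(MOVE_WEIGHTS, rng) for _ in range(10000))
    print(counts)   # over 10000 steps, each type is drawn roughly a third of the time
```

Changing a single weight, for example giving TORS a weight of 2, would change its probability to 2/4 while the two other moves drop to 1/4 each.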
Firstly, we applied a classical MC simulation and obtained Figure R: the system
does not leave the zones of β and C7eq, the most stable ones.
Then we applied Spatial Averaging: we started with [1.0;10;10] and obtained Figure S:
many more configurations are sampled in the zones of β and C7eq, and the zone of αR is
sampled by a few states. Nevertheless, as said previously, this state is unstable in vacuum,
so the system quickly returns to the previous zone.
Figure T shows the results for [1.0;25;25]: the C7ax zone is sampled by a great number of
points, and we can clearly see a path C7eq → C7ax, corresponding to saddle point 8 of
Figure P, which is also present in Figure Q. Equivalent results were obtained for [1.0;50;50],
so further increasing the number of configurations in our algorithm does not improve the
quality of the results.
Figure R: Ramachandran plot for Alanine dipeptide in vacuum: classical MC of 10000
steps.
Figure S: Ramachandran plot for Alanine dipeptide in vacuum: SP-AV MC [1.0;10;10] ;
10000 steps.
After those calculations, we decided to double the parameter Wε to see what happens:
as the width of the Gaussian distribution is then doubled, we can expect a faster and
better sampling. Figure U shows the results for [2.0;10;10]: they are similar to those
of [1.0;25;25], confirming a faster and better sampling. Furthermore, a second
“pseudo-path” through saddle point 10 started to be sampled, but after a while the system
turned back and never reached the C7ax zone. This is logical, as the literature has shown
that saddle points 10 and 12 are used by the path C7eq → αL → C7ax only in water.
Simulations with [2.0;25;25] and [2.0;50;50] were also tested but did not bring more
information.
Spatial Averaging thus once again seems able to sample configurations that are
not accessible via a classical MC simulation.
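One simple way to put a number on such comparisons between Ramachandran plots is to bin the sampled (φ, ψ) points on a grid and to count the visited bins. The sketch below is an illustrative post-processing idea, not the procedure used to produce the figures; the bin width of 10° and the function names are assumptions.

```python
def ramachandran_histogram(samples, bin_width=10.0):
    """2D histogram of (phi, psi) samples in degrees on a periodic [-180, 180) grid."""
    n_bins = int(round(360.0 / bin_width))
    hist = [[0] * n_bins for _ in range(n_bins)]
    for phi, psi in samples:
        # Wrap each angle to [0, 360) before binning; clamp guards against rounding.
        i = min(int(((phi + 180.0) % 360.0) // bin_width), n_bins - 1)
        j = min(int(((psi + 180.0) % 360.0) // bin_width), n_bins - 1)
        hist[i][j] += 1
    return hist

def visited_fraction(hist):
    """Fraction of the grid bins that contain at least one sample."""
    total = sum(len(row) for row in hist)
    visited = sum(1 for row in hist for count in row if count > 0)
    return visited / total

if __name__ == "__main__":
    trajectory = [(180.0, 180.0), (-150.0, 155.0), (-85.0, 78.0), (62.0, -58.0)]
    hist = ramachandran_histogram(trajectory)
    print(f"visited fraction: {visited_fraction(hist):.4f}")
```

Comparing the visited fraction of a classical MC trajectory with that of a Spatial Averaging trajectory of the same length would give a rough, single-number summary of the difference in coverage.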
Figure T: Ramachandran plot for Alanine dipeptide in vacuum: SP-AV MC [1.0;25;25] ;
10000 steps.
Figure U: Ramachandran plot for Alanine dipeptide in vacuum: SP-AV MC [2.0;10;10] ;
10000 steps.
Conclusion and outlook
In the publication [7], N. Plattner showed that Spatial Averaging applied to RTRN moves
is efficient for sampling configurations of small molecules such as H2 or CO in bigger
systems: the main advantage is that, as said previously in this report, no a priori
knowledge of the system is needed. Some techniques such as umbrella
sampling, used in classical MD, may also be able to sample rare paths between configurations,
but they require modifying the underlying potential function.
Our implementation, which adds the possibility of generating modified probability
densities for RROT and TORS moves as well, seems to confirm the efficiency of the method.
Of course the Alanine Dipeptide is a small system, but this first implementation showed
good results. Applications to the cyclic-di-GMP [28] complex and
the insulin dimer [29, 30] in water are in progress, but their results are not discussed here,
because there are still some problems with the periodic boundary conditions applied to water.
Once this problem is solved, we may expect good results from our method, and
it might become a useful extension of the MC module, especially for sampling rare events
involving large biomolecules.
I want to thank Professor Markus Meuwly, who welcomed me into his team for these six
months of work, for guiding me in my understanding of Monte Carlo methods, and for giving
me advice and examples of applications which really helped me. Most of all,
I thank him for the freedom he gave me in my work, which allowed me to really understand
the underlying theories and to discover how scientific programming in molecular
simulations is achieved.
I thank all the members of the team, who treated me as a full member, for their
knowledge of CHARMM and of MD simulations, which solved some of my problems:
Lixian Zhang, Dr. Pierre-André Cazade, Franziska Hofmann, Maksym Soloviov, Juvenal
Yosa Reyes, Dr. Stephan Lutz, Prashant Gupta, Dr. Jing Huang, Dr. Myung Won
Lee, Dr. Yonggang Yang, Dr. Jaroslaw Szymmcak, Manuella Utzinger, and Andi Meier.
Furthermore, I want to congratulate Stephan Lutz and Jing Huang, who successfully
defended their PhDs during my stay.
Finally, I acknowledge all the doctors, professors and PhD students of the Université
de Strasbourg who are involved in teaching the Master in Chemoinformatics.
Florent HEDIN,
August 2011, at Universität Basel.
[1] Nicholas Metropolis and S. Ulam. The Monte Carlo method. Journal of the American
Statistical Association, 44(247):335–341, 1949.
[2] Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H.
Teller, and Edward Teller. Equation of state calculations by fast computing machines.
The Journal of Chemical Physics, 21(6):1087, 1953.
[3] W. K. Hastings. Monte Carlo sampling methods using Markov chains and their
applications. Biometrika, 57(1):97–109, April 1970.
[4] Robert H. Swendsen and Jian-Sheng Wang. Replica monte carlo simulation of Spin-
Glasses. Physical Review Letters, 57(21):2607, November 1986.
[5] David J. Earl and Michael W. Deem. Parallel tempering: Theory, applications, and
new perspectives. Physical Chemistry Chemical Physics, 7(23):3910, 2005.
[6] J. D. Doll, J. E. Gubernatis, N. Plattner, M. Meuwly, P. Dupuis, and H. Wang. A spatial
averaging approach to rare-event sampling. The Journal of Chemical Physics,
131(10), September 2009.
[7] N. Plattner, J. D. Doll, and M. Meuwly. Spatial averaging for small molecule diffusion
in condensed phase environments. The Journal of Chemical Physics, 133(4),
July 2010.
[8] J. E. Jones. On the determination of molecular fields. II. from the equation of state
of a gas. Proceedings of the Royal Society of London. Series A, 106(738):463 –477,
October 1924.
[9] David J. Wales and Jonathan P. K. Doye. Global optimization by Basin-Hopping and
the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms.
The Journal of Physical Chemistry A, 101(28):5111–5116, July 1997.
[10] Sigurd Schelstraete and Henri Verschelde. Finding Minimum-Energy configurations
of Lennard-Jones clusters using an effective potential. The Journal of Physical Chem-
istry A, 101(3):310–315, January 1997.
[11] J. P. K. Doye, M. A. Miller, and D. J. Wales. The double-funnel energy landscape of the
38-atom Lennard-Jones cluster. The Journal of Chemical Physics, 110(14):6896–6906,
April 1999.
[12] J. P. K. Doye, M. A. Miller, and D. J. Wales. Evolution of the potential energy
surface with size for Lennard-Jones clusters. The Journal of Chemical Physics,
111(18):8417–8428, November 1999.
[13] Xiang, Cheng, Cai, and Shao. Structural distribution of Lennard-Jones clusters
containing 562 to 1000 atoms. The Journal of Physical Chemistry A, 108(44):9516–
9520, November 2004.
[14] Emanuele Curotto. Stochastic Simulations of Clusters. CRC Press, September 2009.
[15] G. E. P. Box and Mervin E. Muller. A note on the generation of random normal
deviates. The Annals of Mathematical Statistics, 29(2):610–611, June 1958.
[16] Makoto Matsumoto and Takuji Nishimura. Mersenne twister: a 623-dimensionally
equidistributed uniform pseudo-random number generator. ACM Trans. Model. Com-
put. Simul., 8:3–30, January 1998.
[17] Mutsuo Saito and Makoto Matsumoto. SIMD-oriented fast Mersenne twister: a 128-bit
pseudorandom number generator. In Alexander Keller, Stefan Heinrich, and Harald
Niederreiter, editors, Monte Carlo and Quasi-Monte Carlo Methods 2006, pages 607–
622. Springer Berlin Heidelberg, 2008. 10.1007/978-3-540-74496-2_36.
[18] A.D. MacKerell Jr., C.L. Brooks III, L. Nilsson, B. Roux, Y. Won, and M. Karplus.
CHARMM: The Energy Function and Its Parameterization with an Overview of the
Program, volume 1 of The Encyclopedia of Computational Chemistry, pages 271–277.
John Wiley & Sons: Chichester, 1998.
[19] B.R. Brooks, R.E. Bruccoleri, B.D. Olafson, D.J. States, S. Swaminathan, and
M. Karplus. CHARMM: A program for macromolecular energy, minimization, and
dynamics calculations. Journal of Computational Chemistry, 4:187–217, 1983.
[20] B. R. Brooks, C. L. Brooks, III, A. D. Mackerell, Jr., L. Nilsson, R. J. Petrella,
B. Roux, Y. Won, G. Archontis, C. Bartels, S. Boresch, A. Caflisch, L. Caves,
Q. Cui, A. R. Dinner, M. Feig, S. Fischer, J. Gao, M. Hodoscek, W. Im, K. Kuczera,
T. Lazaridis, J. Ma, V. Ovchinnikov, E. Paci, R. W. Pastor, C. B. Post, J. Z. Pu,
M. Schaefer, B. Tidor, R. M. Venable, H. L. Woodcock, X. Wu, W. Yang, D. M.
York, and M. Karplus. CHARMM: The Biomolecular Simulation Program. Journal
of Computational Chemistry, 30(10, Sp. Iss. SI):1545–1614, July 30
[21] Jie Hu, Ao Ma, and Aaron R Dinner. Monte carlo simulations of biomolecules: The
MC module in CHARMM. Journal of Computational Chemistry, 27(2):203–216,
January 2006.
[22] Jakob P. Ulmschneider and William L. Jorgensen. Monte carlo backbone sampling for
polypeptides with variable bond angles and dihedral angles using concerted rotations
and a gaussian bias. The Journal of Chemical Physics, 118(9):4261, 2003.
[23] Jakob P. Ulmschneider and William L. Jorgensen. Monte carlo backbone sampling for
nucleic acids using concerted rotations including variable bond angles. The Journal
of Physical Chemistry B, 108(43):16883–16892, October 2004.
[24] Aaron R Dinner. Local deformations of polymers with nonplanar rigid main chain in-
ternal coordinates. Journal of Computational Chemistry, 21(13):1132–1144, October
[25] Douglas J. Tobias and Charles L. Brooks. Conformational equilibrium in the alanine
dipeptide in the gas phase and aqueous solution: a comparison of theoretical results.
The Journal of Physical Chemistry, 96(9):3864–3870, April 1992.
[26] Dmitriy S. Chekmarev, Tateki Ishida, and Ronald M. Levy. Long-Time conforma-
tional transitions of alanine dipeptide in aqueous solution: continuous and Discrete-
State kinetic models. The Journal of Physical Chemistry B, 108(50):19487–19495,
December 2004.
[27] Ao Ma and Aaron R. Dinner. Automatic method for identifying reaction coordinates
in complex systems. The Journal of Physical Chemistry B, 109(14):6769–6779, April
[28] Lixian Zhang and Markus Meuwly. Stability and dynamics of cyclic diguanylic acid
in solution. ChemPhysChem, 12(2):295–302, February 2011.
[29] Vincent Zoete, Markus Meuwly, and Martin Karplus. A comparison of the dynamic
behavior of monomeric and dimeric insulin shows structural rearrangements in the
active monomer. Journal of Molecular Biology, 342(3):913–929, September 2004.
[30] Vincent Zoete, Markus Meuwly, and Martin Karplus. Study of the insulin dimer-
ization: Binding free energy calculations and per-residue free energy decomposition.
Proteins: Structure, Function, and Bioinformatics, 61(1):79–93, August 2005.