Figure 1 - available via license: Creative Commons Attribution 4.0 International


# Optimization process of the RDF- and EDF-based derivations of force-field parameters. See the main text for the definitions of individual, fitness (eq. (1) or eq. (2)), threshold, precision, and scale. The best and average fitness values are the lowest and the mean of the fitness values in a generation, respectively. When the RDF-based derivation is adopted, the target distribution function and the threshold defined for the RDF are employed; the same applies to the EDF-based method. The CMA-ES module is represented by the double-lined boxes, that is, steps 1 and 4. This module generates individuals and receives the corresponding fitness values. In step 1, the number of parameters to be fitted was five when fitting σ, ε, q, d, and a, and three when fitting σ, ε, and q. Figure S1 of the Supporting Information illustrates this optimization process.

Source publication

We propose a novel force‐field‐parametrization procedure that fits the parameters of potential functions in a manner that the pair distribution function (DF) of molecules derived from candidate parameters can reproduce the given target DF. Conventionally, approaches to minimize the difference between the candidate and target DFs employ radial DFs (...

## Contexts in source publication

**Context 1**

... optimization was conducted as explained in Figure 1 by using an evolution strategy, wherein the candidate parameters were generated and evaluated by comparing the corresponding DF to the target DF. The generation and evaluation iterations were continued until a convergence criterion was met. ...
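The generate-and-evaluate loop described in this context can be sketched as follows. This is a minimal illustration only: it uses a plain evolution strategy with isotropic Gaussian sampling rather than CMA-ES, and `toy_fitness` is a hypothetical stand-in (squared distance to made-up target parameters) for the DF comparison, which in the publication requires a molecular simulation per individual. The step-size decay factor, the threshold value, and all function names are assumptions.

```python
import random


def toy_fitness(individual, target):
    # Hypothetical stand-in for the DF comparison: squared distance
    # between candidate parameters and made-up target parameters.
    return sum((x - t) ** 2 for x, t in zip(individual, target))


def optimize(target, start, sigma=0.3, threshold=1e-4,
             max_generations=500, seed=0):
    rng = random.Random(seed)
    n = len(start)
    lam = 12 * n      # individuals per generation (12 x number of parameters)
    mu = lam // 2     # number of best individuals kept as parents
    mean = list(start)
    history = []      # (best, average) fitness of each generation

    for _ in range(max_generations):
        # Generation step: sample candidates around the current mean.
        population = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                      for _ in range(lam)]
        # Evaluation step: one fitness value per individual (lower is better).
        fitnesses = [toy_fitness(ind, target) for ind in population]
        best = min(fitnesses)
        average = sum(fitnesses) / lam
        history.append((best, average))

        # Convergence criterion: stop once the best fitness reaches the threshold.
        if best <= threshold:
            break

        # Selection uses the ranking only; recombine the best mu individuals.
        ranked = sorted(zip(fitnesses, population), key=lambda pair: pair[0])
        parents = [ind for _, ind in ranked[:mu]]
        mean = [sum(p[i] for p in parents) / mu for i in range(n)]
        sigma *= 0.97  # crude step-size decay (assumption, not CMA-ES adaptation)

    return mean, history
```

Calling `optimize([1.0, 0.5, -0.8], [0.0, 0.0, 0.0])` returns the final mean and a per-generation list of (best, average) fitness pairs, mirroring the best and average fitness values tracked in Figure 1.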

**Context 2**

... vdW parameters (σ [nm], ε [kJ/mol]), fixed charge (q [e]), and geometric parameters (d [nm], a/500 [degree/500]) were optimized as shown in Figure 1. Here, a was scaled by a factor of 500 because the initial and target parameters of a are rather larger than those of other parameters. ...
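The scaling of a can be illustrated with a small sketch that maps physical parameters to the vector seen by the optimizer and back. The dictionary layout, function names, and the numerical values in the usage note are hypothetical; only the factor of 500 for a comes from the text.

```python
# Scale factors applied before optimization; only a is rescaled (by 500).
SCALE = {"sigma": 1.0, "epsilon": 1.0, "q": 1.0, "d": 1.0, "a": 500.0}
ORDER = ("sigma", "epsilon", "q", "d", "a")


def to_search_vector(params):
    # Physical parameters -> values handled by the optimizer.
    return [params[name] / SCALE[name] for name in ORDER]


def from_search_vector(vector):
    # Optimizer values -> physical parameters (undo the scaling).
    return {name: value * SCALE[name] for name, value in zip(ORDER, vector)}
```

For example, with made-up values `{"sigma": 0.32, "epsilon": 0.65, "q": 0.42, "d": 0.1, "a": 109.5}`, the angle enters the search vector as 109.5/500, which places it on the same order of magnitude as the other parameters.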

**Context 3**

... (rank-one update). Here, only the ranking is relevant; the magnitude and units of the fitness are disregarded. Because the search space of each population is updated according to fitness, less prior information is required on the parameter domain and on how the fitness responds to slight perturbations of individual parameters (see Fig. S1 of the Supporting Information, in which the search space is adapted). When conducting the CMA-ES procedure, we used the default settings of all parameters in DEAP 1.2.2, except for the number of individuals, which was set to 12 × the number of parameters to be fitted, and the initial covariance matrix for generating the first ...
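A short sketch of why only the ranking matters: the order of individuals is unchanged by any monotonically increasing transformation of the fitness, so rank-based recombination weights are unaffected by its units. The helper names are hypothetical, the code does not call DEAP, and the log-rank weights are of the form commonly used in CMA-ES implementations; whether they match the DEAP 1.2.2 defaults exactly is an assumption.

```python
import math


def population_size(n_params, factor=12):
    # Number of individuals: 12 x the number of parameters to be fitted.
    return factor * n_params


def rank_order(fitnesses):
    # Indices of individuals sorted from best (lowest fitness) to worst.
    return sorted(range(len(fitnesses)), key=lambda i: fitnesses[i])


def log_rank_weights(mu):
    # Recombination weights that depend only on rank i = 1..mu,
    # normalized so that they sum to one.
    raw = [math.log(mu + 0.5) - math.log(i) for i in range(1, mu + 1)]
    total = sum(raw)
    return [w / total for w in raw]
```

With this definition, fitting five parameters uses 60 individuals per generation and fitting three uses 36. Rescaling every fitness value, for instance by converting units, leaves `rank_order` and therefore the weighted update unchanged.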

**Context 4**

... module generates individuals and receives the corresponding fitness values. The number of parameters to be fitted was five for fitting of σ, ε, q, d, and a and three for the fitting of σ, ε, and q in step 1. Figure S1 of the supporting information illustrates this optimization process. ...

**Context 5**

... to Opt-Run3 employed target DFs derived from Target-Run1 to Target-Run3, respectively. After iterating generations and evaluations of optimization processes several times, as described in Figure 1, the fitness value for each fitness definition reached the corresponding threshold. The comparison of the RDF- and EDF-based fitnesses showed that the EDF provided more accurate parameters for σ, ε, q, and d, but not for a, for which the more accurate value was obtained using the RDF-based definition, as shown in Figure 2. We found that the convergence of the RDF-based fitness did not lead to the convergence of the EDF-based fitness, as shown in Figure 3a and its inset. ...

**Context 6**

... optimization requiring the longest cumulative simulation time of 963 ns per individual over the generations (Fig. 3c, Opt-Run3) resulted from the repeated 100-ns samplings for the last several generations. This suggests that there is room for improvement in the optimization procedure, especially at the step where the simulation time was doubled, as shown in Figure 1. Likewise, pursuing a more appropriate optimizer and a judicious minimization algorithm other than black-box optimization would be desirable. ...

## Citations

... Earlier definitions of CG particles are rather ad hoc [20]. More formulations with improved statistical mechanical rigor appeared later on [22], with radial distribution function based inversion [78, 84, 85, 86], entropy divergence [19], and force matching algorithms [87, 88, 89] being outstanding examples of systematic development. Present CG is essentially to realize the following mapping as disclosed by Equation (4) in ref. [22]: ...

Molecular modeling is widely utilized in subjects including but not limited to physics, chemistry, biology, materials science, and engineering. Impressive progress has been made in the development of theories, algorithms, and software packages. Dividing and conquering and caching intermediate results have been long-standing principles in the development of algorithms. Not surprisingly, the most important methodological advancements in more than half a century of molecular modeling are various implementations of these two fundamental principles. In mainstream classical computational molecular science, tremendous effort has been invested in two lines of algorithm development. The first is coarse graining, which represents multiple basic particles in higher-resolution modeling as a single larger and softer particle in the lower-resolution counterpart, yielding force fields of partial transferability at the expense of some information loss. The second is enhanced sampling, which realizes “dividing and conquering” and/or “caching” in configurational space, with a focus either on reaction coordinates and collective variables, as in metadynamics and related algorithms, or on the transition matrix and state discretization, as in Markov state models. For this line of algorithms, spatial resolution is maintained but results are not transferable. Deep learning has been utilized to realize more efficient and accurate ways of “dividing and conquering” and “caching” along these two lines of algorithmic research. We proposed and demonstrated the local free energy landscape approach, a new framework for classical computational molecular science. This framework is based on a third class of algorithm that facilitates molecular modeling through partially transferable in resolution “caching” of distributions for local clusters of molecular degrees of freedom. Differences, connections, and potential interactions among these three algorithmic directions are discussed, with the hope of stimulating the development of more elegant, efficient, and reliable formulations and algorithms for “dividing and conquering” and “caching” in complex molecular systems.