Combustion and Flame 229 (2021) 111366
Contents lists available at ScienceDirect
Combustion and Flame
journal homepage: www.elsevier.com/locate/combustflame
An evolutionary, data-driven approach for mechanism optimization:
theory and application to ammonia combustion
A. Bertolino^(a,b,c), M. Fürst^(a,b,c), A. Stagni^(c), A. Frassoldati^(c), M. Pelucchi^(c), C. Cavallotti^(c), T. Faravelli^(c), A. Parente^(a,b,∗)

^(a) Université Libre de Bruxelles, Ecole polytechnique de Bruxelles, Aero-Thermo-Mechanics Laboratory, Bruxelles, Belgium
^(b) Université Libre de Bruxelles and Vrije Universiteit Brussel, Combustion and Robust Optimization Group (BURN), Bruxelles, Belgium
^(c) Department of Chemistry, Materials, and Chemical Engineering "G. Natta", Politecnico di Milano, Milano 20133, Italy
Article history:
Received 18 October 2020
Revised 8 February 2021
Accepted 9 February 2021

Keywords:
Optimization
Detailed kinetics
Ammonia
Uncertainty quantification

Abstract
In this work, we propose a novel data-driven approach for detailed kinetic mechanism optimization. The approach is founded on a curve matching-based objective function and includes a methodology for the optimization of pressure-dependent reactions via logarithmic interpolation (PLOG format). In order to highlight the advantages of the new formulation of the objective function, a comparison with the L1 and L2 norms is performed. The selection of impactful reactions is carried out by introducing a Cumulative Impact Function (CIF), while an Evolutionary Algorithm (EA) is adopted for the optimization. The capabilities of the proposed methodology were demonstrated using a database of ~635 experimental datapoints on ammonia combustion, covering standard targets like ignition delay times, speciation, and laminar flame speed. The optimization was carried out starting from a recently published mechanism describing ammonia pyrolysis and oxidation, largely developed using first-principles calculation of rate constants. After the selection of the 24 most impactful reactions, the related 101 normalized Arrhenius parameters were simultaneously varied within their uncertainty bounds. These bounds were taken from the literature, when available, or estimated according to the level of theory adopted for the determination of the rate constant. Hence, we also provide guidelines to estimate the uncertainty of reaction rate constants derived from first-principles calculations, using well-consolidated computational protocols as a reference. The optimized mechanism was found to improve on the nominal one, showing a satisfactory agreement over the entire range of operating conditions. Moreover, the use of 'curve matching' indices was found to outperform the adoption of the L1 and L2 norms. The comparison between the nominal mechanism and the one optimized via curve matching allowed a clear identification of different critical reaction pathways for different experimental targets. From this perspective, the methodology proposed herein can find further application as a useful design-of-experiments tool for an accurate evaluation of crucial kinetic constants, thus driving further mechanism improvement.
© 2021 The Authors. Published by Elsevier Inc. on behalf of The Combustion Institute.
This is an open access article under the CC BY-NC-ND license
( http://creativecommons.org/licenses/by-nc-nd/4.0/ )
1. Introduction
The development of detailed kinetic mechanisms for fuel combustion supports and facilitates the implementation of cleaner fuels and more efficient combustion technologies, in the perspective of a reduced environmental impact, a differentiation of energy sources, and their wiser utilization [1]. From a chemical kinetics perspective, a combustion process involves a considerable number of species connected by a complex network of reactions. The increase in computing capabilities and in the accuracy and availability of experimental data [2], [3] pushes the development of kinetic models of increasing complexity in terms of number of species (~10³) and reactions (~10⁴) [1]. The rate constants of these reactions constitute the parameters of such models, together with thermodynamic and transport properties. These can be determined experimentally, theoretically, or based on analogy with similar compounds for which kinetic subsets already exist [4]. The last decade was characterized by a more frequent adoption of theoretical methods (e.g. ab initio transition state theory-based master

∗ Corresponding author at: Université Libre de Bruxelles, Ecole polytechnique de Bruxelles, Aero-Thermo-Mechanics Laboratory, Bruxelles, Belgium. E-mail address: alessandro.parente@ulb.be (A. Parente).
https://doi.org/10.1016/j.combustflame.2021.02.012
0010-2180/© 2021 The Authors. Published by Elsevier Inc. on behalf of The Combustion Institute. This is an open access article under the CC BY-NC-ND license
( http://creativecommons.org/licenses/by-nc-nd/4.0/ )
Nomenclature

Roman symbols
A        pre-exponential factor [s, cm³, mol]
E_a      activation energy [cal/mol]
f_r      uncertainty factor for reaction r
R        universal gas constant [cal/mol/K]
d⁰_j     zero-order derivative dissimilarity index for the jth dataset
d¹_j     first-order derivative dissimilarity index for the jth dataset
g        experimental data spline
m        model evaluations spline
p_c      cross-over rate
p_m      mutation rate
X        uniformly distributed random variable
Y        optimisation target
I_r,s    impact coefficient of the rth reaction in the sth test case
s_r,s    sensitivity coefficient of the rth reaction in the sth test case

Greek symbols
κ        kinetic rate constant [s, cm³, mol]
α        ln(A) [-]
β        temperature exponent [-]
ε        activation temperature (E_a/R) [K]

Acronyms
GRI      Gas Research Institute
B2B-DC   Bound-to-Bound Data Collaboration
EA       Evolutionary Algorithm
GA       Genetic Algorithm
MUM-PCE  Method of Uncertainty Minimization using Polynomial Chaos Expansion
PLOG     Pressure LOGarithmic interpolation
CM       Curve Matching
RCM      Rapid Compression Machine
PFR      Plug Flow Reactor
JSR      Jet Stirred Reactor
FPF      Freely Propagating Flames
TC       Test Case
CSF      Cumulative Sensitivity Function
CIF      Cumulative Impact Function
PES      Potential Energy Surface

Subscripts
0        nominal
r        rth reaction
max      maximum value of the related variable
min      minimum value of the related variable
L2       dissimilarity index based on L2-norm evaluation of given polynomials
p        Pearson-based dissimilarity measure of given polynomials
m        mutation
c        cross-over
equation, AI-TST-ME) [5], [6], for the determination of kinetic parameters and thermodynamic properties. Beyond the intrinsic advantages derived from the massive use of AI-TST-ME methods in terms of model predictive capabilities, the increasing popularity of such methods is justified by the improved theoretical methods and algorithms currently available, and by the capability of measuring rate constants for elementary steps more accurately, thus providing an immediate validation target for the theoretical results. In addition, automated computational protocols implementing state-of-the-art AI-TST-ME methods [7–11] are reaching a much wider audience, thus paving the way to a more standardized approach to theoretical calculations within the combustion chemistry community. Nonetheless, adopting the best rate parameters does not necessarily lead to improved model performance when looking at a wide range of experimental targets [4], [12]. This is due to multiple reasons: i) reference kinetic mechanisms within the combustion science and engineering community have a long and consolidated history or, in machine learning terms, are "well-trained" models, iteratively validated over a wide range of experimental targets over decades of research activity [12], [13]; ii) models that have been historically developed largely relying on analogy rules and on semi-empirical, or at least less complex, thermochemical kinetics principles [14] are typically self-consistent, even at the cost of the very likely possibility of hiding error-compensation phenomena; iii) every rate constant, including those from theoretical methods, is affected by an uncertainty [12], [15], [16].
The implementation of theory-based development strategies is
an iterative process that shows its payback only in the mid-to-
long-term perspective. In fact, due to the hierarchical nature of de-
tailed mechanisms and their development [17] , implementing one
single accurate rate parameter, or new rate parameters for an en-
tire reaction class, might strongly perturb the critical equilibrium
between the different modules of a kinetic model. This is partic-
ularly important if, while gradually introducing new parameters
from theoretical calculations, a well performing model is needed
for applications of interest to the end-user.
Regarding theoretical determinations, the uncertainty can intuitively be considered to decrease with increasing detail in the level of theory [18]. In the past, uncertainty propagation methods were used to quantify the level of uncertainty of phenomenological rate coefficients obtained from theory, for n-propyl radical oxidation [19]. In recent times, quantum chemistry calculations are said to have reached a level of accuracy comparable to that of experimental measurements [5], promoting their applicability in combustion mechanism development [20]. A multi-scale modelling approach was proposed by Burke et al. [21], [22], who optimized a set of uncertain theoretical kinetics parameters by directly relating their uncertainties to the combustion behaviour in terms of macroscopic targets (ignition delay time, laminar flame speed, etc.). Shannon et al. [23] proposed the use of experimental data and uncertainty quantification to constrain and optimize input parameters in the master equation using MESMER [24].
Essentially, each parameter of a kinetic model, expressed in any
form, can be considered as a randomly distributed variable within
its uncertainty range [16] . This feature can be exploited in mathe-
matical optimization if the model is required to accurately perform
on a small, as well as large, set of experimental targets.
Optimization is a powerful tool for data-driven mechanism de-
velopment, which can be used in combination with the solution of
the so called “inverse problem”, consisting in obtaining a new set
of constrained kinetic parameters, by minimizing or maximizing a
chosen objective function using experimental data as targets.
In the context of chemical mechanisms, Solution Mapping
[25] was the first method applied to a large, complex system.
This method faces the multi-modality of the problem through
polynomial response surfaces, and it was applied for the devel-
opment of the GRI-MECH [26] . This mechanism was trained on
77 well-documented and heterogeneous experimental targets de-
scribing the combustion of natural gas. Frenklach et al. [27] introduced the concept of collaboration of data, and demonstrated
that a joint analysis on the entire data sample can increase the
amount of extracted information and improve the results. Feeley
et al. [28] showed that the techniques of data collaboration can
be used to rigorously assess the mutual consistency of experimen-
tal results and identify potential outliers, using a chemical kinetic
model. The methodology, called Bound-to-Bound Data Collaboration (B2B-DC), has been successfully applied and refined in several
other works [29–34] . The applicability of Evolutionary/Genetic Al-
gorithms (EA/GA) to optimization problems involving detailed ki-
netics was broadly investigated by Elliott et al. [35] . EA/GA were
found particularly suitable for searching objective-function spaces
characterized by high dimensionality. Turányi et al. [36] proposed a sum-of-squared-error-based methodology, accounting for both direct and indirect measurements, and successfully applied it to H2/O2 [37], H2/O2/NOx [38], H2/CO mixtures [39], CH2O and CH3OH [40], and ethanol [41]. Najm et al. [42] applied forward Uncertainty Quantification (UQ) and Polynomial Chaos Expansion (PCE) to chemical kinetics. Sheen and Wang introduced the Method of Uncertainty Minimization using Polynomial Chaos Expansion (MUM-PCE) [43], [44]. Cai and Pitsch [45] minimized the uncertainty in an n-pentane combustion mechanism by applying the MUM-PCE method to the optimization of rate rules. They also proposed a strategy to optimize pressure-dependent reactions formulated via logarithmic interpolation, i.e. the PLOG standard [46]. PLOG expressions are indeed gradually substituting the previous formulations of pressure-dependent rate constants, as they yield a better fit to experiments or calculations [5]. As this formalism uses accurate rates at discrete pressures, the parameters at each pressure value were considered independent from each other in [45].
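For reference, a PLOG declaration lists one modified Arrhenius fit per discrete pressure; a schematic entry in CHEMKIN format is shown below (the reaction is taken from Fig. 3 of this work, but the rate parameters are purely illustrative, not from any published mechanism):

```
! PLOG format: the first line is a placeholder Arrhenius fit; each
! PLOG line gives (pressure [atm], A, beta, Ea). Rates at intermediate
! pressures are obtained by interpolating ln(k) linearly in ln(P).
HNO=H+NO            1.00E+14  0.0  48000.0
  PLOG /  0.1      5.00E+13  0.0  47500.0 /
  PLOG /  1.0      1.00E+14  0.0  48000.0 /
  PLOG / 10.0      2.00E+14  0.0  48500.0 /
```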
The optimization of relatively compact kinetic mechanisms, such as those for methane, hydrogen, and ammonia, is particularly attractive because of i) the large availability of high-fidelity data [4], ii) the current interest in e-fuels produced from renewable energy [47], and iii) their compact size, which allows benchmarking the suitability of different optimization algorithms before their application to more complex networks. Among them, the combustion kinetics of ammonia (NH3) is one of the most active research fields, due to the high potential of ammonia as a fuel from both an economic and a technical perspective [48]. Indeed, ammonia is a carbon-free energy vector with high hydrogen content, which can be liquefied at pressures higher than 9.9 bar at ambient temperature. Historically, ammonia has been used as a NO reducing agent in both selective and non-selective catalytic reduction. The importance of ammonia is also related to other renewable energy sources: for example, it is a by-product of the anaerobic digestion of municipal wastewater sludges [49], and it is found in trace amounts in biogas [50]. The combined use of NH3 with conventional fuels like H2 or CH4 has also been studied in order to mitigate the shortcomings related to its low reactivity [51], [52]. Also, optimal operating conditions were found to minimize NOx emissions [53]. Therefore, several mechanisms describing the oxidation of NH3 and NH3/H2 fuel blends were developed [54–56]. Glarborg et al. [57] recently proposed a comprehensive nitrogen chemistry model, including ammonia itself. Nevertheless, uncertainties still persist in the characterization of ammonia chemistry for an accurate prediction of ignition, speciation, and laminar flame speed [57].

So far, optimization studies in chemical kinetics have relied on objective functions based on the L1 and L2 norms of the difference between model predictions and the corresponding experimental targets [35], [58], [59]. Recently, You et al. [31] minimized the L1-norm of the difference between the active variable values and the nominal ones, constrained on the feasible set of combinations identified with B2B-DC [27]. The formulation in [31] not only improves the model performance, but also minimizes the deviation of parameter values from the literature recommendations. Bernardi et al. [60] presented an innovative framework based on Curve Matching (CM), consisting in a multi-faceted functional analysis of the profiles obtained from both models and experiments. In this approach, they introduced a proper metric to quantify the similarity between the curves representing experiments and simulations, rather than a point-wise measure of the distance between them. Pelucchi et al. [61] revised and proposed such a framework as a further step towards an automatic model validation protocol.
In this work, a novel methodology for the optimization of kinetic mechanisms is proposed, which includes, for the first time, the possibility to optimize PLOG reactions by accounting for interdependencies between rates at different pressures, and the use of the CM index [61] as the objective function. The effectiveness of this approach was verified by adopting a kinetic mechanism for ammonia combustion as a case study [20]. This model was recently proposed and largely relies on theoretical calculations of key reaction rate constants. As an added value, this work also presents guidelines for attributing reasonable uncertainty factors to theoretical determinations performed with different levels of theory. On these bases, the optimization was carried out using a non-gradient-based, mono-objective Evolutionary Algorithm (EA) in OptiSMOKE++ [62], capable of simultaneously handling all the parameters as uniformly distributed random variables within their estimated bounds.
The manuscript is organized as follows. Section 2 presents the proposed methodology. Section 3 describes the results for pure ammonia over a wide range of experimental conditions, covering ignition delay times in Shock Tubes (ST) and Rapid Compression Machines (RCM), speciation measurements in Plug Flow (PFR) and Jet-Stirred Reactors (JSR), and laminar burning speed in Freely Propagating Flames (FPF). Finally, conclusions are presented in Section 4.
2. Methodology
2.1. Optimization procedure
As in [18], [20], all the parameters of the selected rate constants, expressed according to the modified Arrhenius expression (k = A T^β exp(−E_a/RT)), undergo optimization, i.e. pre-exponential factors (A), temperature exponents (β), and activation energies (E_a). The logarithmic expression of the rate constant adopted in this work yields:

$$\kappa = \ln\left(k(T)\right) = \ln(A^{*}) + \beta \ln\left(\frac{T}{T_{ref}}\right) - \frac{E_a}{R}\left(\frac{1}{T} - \frac{1}{T_{ref}}\right) = \alpha + \beta \ln\left(\frac{T}{T_{ref}}\right) - \varepsilon\left(\frac{1}{T} - \frac{1}{T_{ref}}\right) \quad (1)$$
where α, β, and ε are continuous random variables representing the Arrhenius parameters, usually assumed to be uniformly [59] or normally [63] distributed. In Eq. (1), A* is a re-parametrized form of the pre-exponential factor at the reference temperature T_ref:

$$A^{*} = A\, T_{ref}^{\beta}\, \exp\left(-\frac{\varepsilon}{T_{ref}}\right), \quad (2)$$

The re-parametrization in Eq. (2) minimizes the high correlation between the parameters of the Arrhenius equation, and makes parameter estimation easier [64], [65]. In this work, a reference temperature of 1000 K was adopted for all reactions.
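The mapping of Eqs. (1)-(2) can be made concrete with a short sketch (Python with numpy; the rate parameters below are illustrative values for a generic reaction, not taken from the mechanism):

```python
import numpy as np

T_REF = 1000.0  # K, reference temperature adopted in this work


def normalized_params(A, beta, Ea, R=1.987):
    """Map (A, beta, Ea) to the normalized set (alpha, beta, eps) of Eq. (1).

    eps = Ea/R is the activation temperature [K]; alpha = ln(A*), with A*
    the re-parametrized pre-exponential factor of Eq. (2).
    """
    eps = Ea / R
    A_star = A * T_REF**beta * np.exp(-eps / T_REF)
    return np.log(A_star), beta, eps


def ln_k(T, alpha, beta, eps):
    """Eq. (1): kappa = ln k(T) in the normalized parametrization."""
    return alpha + beta * np.log(T / T_REF) - eps * (1.0 / T - 1.0 / T_REF)


# Consistency check against the standard modified Arrhenius form
A, beta, Ea = 1.0e13, 0.5, 20000.0  # illustrative values only
alpha, b, eps = normalized_params(A, beta, Ea)
T = 1500.0
k_direct = A * T**beta * np.exp(-Ea / (1.987 * T))
assert np.isclose(ln_k(T, alpha, b, eps), np.log(k_direct))
```

Working with (α, β, ε) instead of (ln A, β, E_a) decorrelates the parameters around T_ref, which is what makes the independent bound estimation of Section 2.1 practical.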
The uncertainty of the rate coefficients is usually assumed to be symmetric, and is reported in the literature in terms of the f_r factor [15], defined as follows:

$$f_r = \frac{\kappa_{max} - \kappa_0}{\ln(10)} = \frac{\kappa_0 - \kappa_{min}}{\ln(10)}, \quad (3)$$
The problem of defining the constraints for the active parameters was dealt with in several studies. In the deterministic framework of B2B-DC [27], [30], the feasible set is obtained by combining the initial bounds of both active variables and experimental data. In MUM-PCE [43], a statistical approach is adopted, which assumes "a priori" distributions for both the model parameters and the measurements, and produces "a posteriori" distributions for both model parameters and predictions. These two approaches were recently compared, and were found to give consistent results [32]. Nagy and Turányi [66], [67] considered the dependence of f_r on temperature, and proposed a method to determine the covariance matrix and the multivariate normal distribution of the transformed Arrhenius parameters from prior information on the rate constant. In a later study, Nagy et al. [68] recommended the adoption of temperature-independent uncertainty and uniform distributions for Arrhenius parameters in case of little prior information. As we discuss later (see Section 2.5), the nominal mechanism in this study largely relies on ab initio calculations. For this reason, the temperature dependence of f_r is not accounted for, and uniform distributions for all the active variables are employed.
As reported in Eq. (1), κ is a weighted sum of three random variables with joint uniform distribution, which results in a higher probability near κ_0 [68]. For the sake of simplicity, in the following we assume that for all temperatures the kinetic constant κ is a normally distributed random variable with mean value κ_0, corresponding to κ(p_0), and standard deviation σ_κ, with p_0 = [α_0, β_0, ε_0]. As in [43], [59], we assume that f_r corresponds to 2σ_κ of the distribution of κ, and we constrain it at 3σ. From Eq. (3), κ_max and κ_min can be obtained, i.e. the maximum and minimum linear constraints of κ in T ∈ [T_min, T_max]. As an element κ_i of κ can be retrieved by sampling from the distributions of the normalized Arrhenius parameters, f_r can also be propagated from κ to α, β, and ε to estimate their bounds. In the following, the hypothesis of mutual independence between parameters is used exclusively to achieve this goal. Given the equation:
$$10^{f_r} = \frac{k_{max}(T)}{k_0(T)} = \frac{k_0(T)}{k_{min}(T)} = \exp\left(\Delta\alpha + \Delta\beta\,\ln(T) - \Delta\varepsilon\, T^{-1}\right), \quad (4)$$

and assuming that the maximum variation Δp_i of one parameter is determined by projecting the uncertainty of κ on the parameter itself (i.e. keeping the other two constant at their nominal values, so that their variations are zero), the following constraints can be retrieved:
$$\alpha_0 - \ln\left(10^{f_r}\right) \leq \alpha \leq \alpha_0 + \ln\left(10^{f_r}\right), \quad (5)$$

$$\beta_0 - \frac{f_r}{\log_{10}(T)} \leq \beta \leq \beta_0 + \frac{f_r}{\log_{10}(T)}, \quad (6)$$

$$\varepsilon_0 - T\, f_r \ln(10) \leq \varepsilon \leq \varepsilon_0 + T\, f_r \ln(10), \quad (7)$$
This operation results in two non-linear constraints for β and ε in T ∈ [T_min, T_max]. However, it can be shown that:

$$\lim_{T \to \infty} k(T) = \exp(\alpha)\, T^{\beta} \quad (8)$$

$$\lim_{T \to 0} k(T) = \exp\left(-\varepsilon\, T^{-1}\right) \quad (9)$$

The limits (8) and (9) indicate that at high temperature the term T^β controls the value of κ, while the contribution of −ε T^{−1} is progressively smaller; the opposite is true at low temperatures. Thus, the sensitivity of κ to β is maximum at T_max, while the sensitivity of κ to ε is maximum at T_min. By bounding β and ε in Eqs. (6) and (7) at T_max and T_min, respectively, we ensure that κ(α_0, β_max, ε_0, T), κ(α_0, β_min, ε_0, T), κ(α_0, β_0, ε_max, T), and κ(α_0, β_0, ε_min, T) never violate the linear constraints on κ(T) when T ∈ [T_min, T_max]. In this work, the minimum and maximum temperatures are 300 and 3000 K, respectively. Indeed, from the definition of f_r in Eq. (3), κ(α_max, β_0, ε_0, T) and κ(α_min, β_0, ε_0, T) also do not violate the mentioned constraints. The adoption of this methodology for the estimation of parameter boundaries has two main advantages. First, it reduces the probability of sampling a kinetic rate constant κ(T) which violates the
Fig. 1. Reaction rate constant k as a function of parameter bounds for NH2 + NO2 = H2NO + NO [98].
above-mentioned linear constraints, with respect to previously proposed methods [69]. Further details about this feature are provided in the Supplementary Material (SM). Secondly, it enables the optimization of PLOG-based reactions. As an example, Fig. 1 shows the projections of the parameter bounds on the kinetic constant of the reaction NH2 + NO2 = H2NO + NO. The bounds resulting from varying only α overlap with the 2σ of the distribution of κ. On the other hand, those resulting from the variation of β and ε overlap with the 2σ of k only at T_max and T_min, respectively, while not exceeding it along T ∈ [T_min, T_max]. The limit values of the corresponding κ distribution, i.e. κ(α_max, β_max, ε_min) and κ(α_min, β_min, ε_max), are also displayed. They include the entire space of κ and exceed it: in fact, since the parameters are correlated [66], not all combinations of the three are valid.

All the combinations which result in values of κ lying between the limit values and the 3σ bounds of the distribution of κ are excluded from the set of eligible parameter combinations. This is achieved by introducing a penalty function during the optimization: the associated objective function is set to a Curve Matching index of 0 (i.e. the maximum error in the CM methodology). Conversely, for each valid combination suggested by the adopted optimization algorithm, a corresponding set of simulation responses is obtained by performing model evaluations over the entire database. Subsequently, the CM indices:
$$CM_j = \frac{d^0_{j,L_2} + d^1_{j,L_2} + d^0_{j,p} + d^1_{j,p}}{4} \in [0, 1], \quad (10)$$
are calculated as a weighted sum of four dissimilarity indices for each dataset j. A unitary CM value indicates a perfect matching between model evaluations and experiments. In particular, functional estimations of both experimental data, g(x), and model evaluations, m(x) (and their derivatives g'(x) and m'(x)), are obtained by interpolating smoothed splines, which result in satisfactory approximations of both data points and first derivatives [60], [61]. Based on these estimations, the dissimilarity indices are computed as follows:
$$d^0_{j,L_2} = \frac{1}{1 + \frac{\left\| m - g \right\|}{|D|}} \in [0, 1], \quad (11)$$

$$d^1_{j,L_2} = \frac{1}{1 + \frac{\left\| m' - g' \right\|}{|D|}} \in [0, 1], \quad (12)$$

$$d^0_{j,p} = 1 - \frac{1}{2}\left\| \frac{m}{\left\| m \right\|} - \frac{g}{\left\| g \right\|} \right\| \in [0, 1], \quad (13)$$
Fig. 2. Example of bootstrap procedure with 10 variations. Experimental data from
[83] .
$$d^1_{j,p} = 1 - \frac{1}{2}\left\| \frac{m'}{\left\| m' \right\|} - \frac{g'}{\left\| g' \right\|} \right\| \in [0, 1], \quad (14)$$
where |D| is the intersection of the domains of g and m. For instance, if the abscissa values of g belong to [500, 1500] and those of m belong to [400, 1800], the value of |D| would be 1000 (i.e. |D| = 1500 − 500). ‖g‖ is the L2-norm of the function g. All the dissimilarity indices are intrinsically constrained between 0 and 1, where 1 indicates maximum similarity and 0 maximum dissimilarity. Individually, d⁰_{j,L2} depends on the area enclosed by g and m, while d¹_{j,L2} evaluates the same quantity between their respective derivatives. Hence, the first generalizes a classical L2-norm, while the second extends it. On the other side, the Pearson dissimilarity measures d⁰_{j,p} and d¹_{j,p} indicate perfect matching if g and m, and their derivatives, differ only by a vertical translation. Further mathematical details and examples are given in [60], [61]. In order to account for the uncertainty in the evaluation of Eq. (10), a bootstrapping procedure [70] on the experimental data is carried out. This procedure relies on the assumption that each data point is normally distributed within its experimental uncertainty. A sufficiently large set of possible experimental trends is generated by taking random samples from the above-mentioned distributions. Fig. 2 displays an example of the application of the bootstrap procedure to laminar flame speed data, where 7 Gaussian distributions (i.e. one for each data point) were sampled 10 times to generate as many bootstrap variations.
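As a rough illustration, Eqs. (10)-(14) can be sketched with standard smoothing splines (Python with numpy/scipy). This is a simplified stand-in for the functional estimation of [60], [61]: the spline settings and the test profile below are assumptions for illustration, not the settings used in the paper.

```python
import numpy as np
from scipy.integrate import simpson
from scipy.interpolate import UnivariateSpline


def cm_index(x_exp, y_exp, x_sim, y_sim, n_pts=200):
    """Curve Matching index of Eq. (10) for a single dataset, built from
    the four dissimilarity indices of Eqs. (11)-(14) on the overlap D of
    the experimental and model domains."""
    g = UnivariateSpline(x_exp, y_exp)  # experimental spline g(x)
    m = UnivariateSpline(x_sim, y_sim)  # model spline m(x)
    lo = max(x_exp.min(), x_sim.min())
    hi = min(x_exp.max(), x_sim.max())
    D = hi - lo                         # |D|: intersection of the domains
    x = np.linspace(lo, hi, n_pts)

    def l2(f):                          # L2-norm of a sampled function on D
        return np.sqrt(simpson(f**2, x=x))

    gv, mv = g(x), m(x)
    g1, m1 = g.derivative()(x), m.derivative()(x)

    d0_l2 = 1.0 / (1.0 + l2(mv - gv) / D)             # Eq. (11)
    d1_l2 = 1.0 / (1.0 + l2(m1 - g1) / D)             # Eq. (12)
    d0_p = 1.0 - 0.5 * l2(mv / l2(mv) - gv / l2(gv))  # Eq. (13)
    d1_p = 1.0 - 0.5 * l2(m1 / l2(m1) - g1 / l2(g1))  # Eq. (14)
    return 0.25 * (d0_l2 + d1_l2 + d0_p + d1_p)       # Eq. (10)


# Identical curves give a CM index of 1 (perfect matching)
x = np.linspace(600.0, 1800.0, 25)   # e.g. temperatures [K], illustrative
y = 10.0 + 30.0 * np.sin(x / 400.0)  # illustrative profile
```

Note how a pure vertical offset between model and experiment degrades the L2-based indices while leaving the derivative-based ones untouched, which is precisely the multi-faceted behaviour the CM framework exploits.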
A set of 50 bootstrap variations (N_b = 50) for each data point was adopted, after verifying the substantial independence of the final output on a further broadening of the set. Thus, the objective function in this work is defined as:

$$M = \frac{1}{DS} \sum_{i=1}^{DS} \left( 1 - \frac{1}{N_b} \sum_{j=1}^{N_b} CM_j^i \right) \quad (15)$$
where DS is the number of target datasets and N_b is the number of bootstrap variations. In order to discuss its advantages, the final index (Eq. (15)) is compared to modified versions of the L1-norm and L2-norm:

$$L_1 = \frac{1}{DS} \sum_{i=1}^{DS} \frac{1}{E_i} \sum_{j=1}^{E_i} \frac{\left| Y^{exp}_{i,j} - Y^{sim}_{i,j} \right|}{\sigma_{Y^{exp}_{i,j}}}, \quad (16a)$$

$$L_2 = \frac{1}{DS} \sum_{i=1}^{DS} \frac{1}{E_i} \sum_{j=1}^{E_i} \frac{\left( Y^{exp}_{i,j} - Y^{sim}_{i,j} \right)^2}{\sigma^2_{Y^{exp}_{i,j}}}, \quad (16b)$$
where E_i is the number of discrete experiments belonging to the ith dataset. Y^exp_{i,j}, Y^sim_{i,j}, and σ are the values of the jth measurement, simulation, and experimental uncertainty in the ith dataset, respectively.
As in Olm et al. [71], when the error on ignition delay times is evaluated, the transformation Y^exp_{i,j} = ln(y^exp_{i,j}) and Y^sim_{i,j} = ln(y^sim_{i,j}) is used, where y^exp_{i,j} and y^sim_{i,j} refer to the absolute experimental and simulated values, respectively. For other experimental targets, such as species concentrations and laminar flame speeds, the logarithmic transformation is not used. The same transformation is applied in the estimation of Curve Matching indices for ignition delay times.
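A minimal sketch of the three objective functions (Eqs. (15), (16a), (16b)), including the logarithmic transformation used for ignition delay times and the bootstrap resampling, follows (Python with numpy; all numerical values are illustrative, not taken from the paper's database):

```python
import numpy as np


def objective_cm(cm):
    """Eq. (15): cm[i, j] is the CM index of bootstrap variation j of
    dataset i; M tends to 0 for perfect matching on every dataset."""
    return float(np.mean(1.0 - cm.mean(axis=1)))


def objective_l1(y_exp, y_sim, sigma):
    """Eq. (16a), single-dataset contribution (outer average over the
    DS datasets omitted for brevity)."""
    return float(np.mean(np.abs(y_exp - y_sim) / sigma))


def objective_l2(y_exp, y_sim, sigma):
    """Eq. (16b), single-dataset contribution."""
    return float(np.mean((y_exp - y_sim) ** 2 / sigma**2))


# Ignition delay times are compared in log space (Olm et al. [71])
tau_exp = np.array([1.2e-3, 3.5e-4, 9.0e-5])  # measured IDTs [s], illustrative
tau_sim = np.array([1.0e-3, 4.0e-4, 8.0e-5])  # simulated IDTs [s]
sigma_ln = np.full_like(tau_exp, 0.2)         # assumed uncertainty of ln(tau)
err_l1 = objective_l1(np.log(tau_exp), np.log(tau_sim), sigma_ln)

# Bootstrap entering Eq. (15): N_b = 50 resamples of each data point,
# normally distributed within an assumed experimental uncertainty
rng = np.random.default_rng(42)
boot = rng.normal(tau_exp, 0.05 * tau_exp, size=(50, tau_exp.size))
```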
In this work, the objective function minimization is performed by means of an Evolutionary Algorithm (EA) [72], whose solution is less dependent on the initial guess compared to other algorithms [35]. Indeed, in an EA the initial guess is a set of sampled combinations of active parameters, i.e. the 'population'. Initially, a population of 100 different combinations of parameters is sampled, evaluated and labelled with objective function values. Then, the algorithm starts the first iteration (a 'generation'), where the elements of the current population (the 'parents') are ranked by applying a linear scaling of probability based on the corresponding objective function values. In general, the best performing parents undergo uniform crossover, where a pair of parents (each a 'chromosome') is selected, and each parameter value can be swapped between the two with a probability equal to the crossover rate p_c. This operation produces a new pair of elements (the 'offspring'), resulting from the crossover of as many parents. Subsequently, mutation is introduced. In particular, for each new offspring, every variable has the same probability to mutate, according to the mutation rate p_m. A non-uniform mutation operator was adopted to assign a new parameter value by sampling from its distribution. When mutation and crossover are complete, a population of 200 elements is obtained, i.e. twice the size of the initial one. In this work, a replacement strategy was adopted which selects the 50 best individuals among the 200 elements and randomly selects another 50 from the remaining 150. This ensures a balance between global and local search. The adopted probabilities of crossover (p_c = 0.65) and mutation (p_m = 0.5) were suggested by Elliott et al. [35]. The new parent population undergoes the same procedure iteratively until satisfactory accuracy is achieved. In the present work, Dakota [72] and OpenSMOKE++ [73,74] are coupled in OptiSMOKE++ [62] to perform the optimization. The first toolbox is specifically conceived to address engineering problems such as optimization, calibration and uncertainty quantification. On the other hand, OpenSMOKE++ enables the simulation of multiple experimental combustion facilities typically considered for kinetic model development and validation.
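A single EA generation as described above (rank-scaled selection, uniform crossover, per-gene mutation, and the 50-best-plus-50-random replacement) can be sketched as below. This is a minimal illustration under simplifying assumptions (uniform parameter bounds, a generic objective), not Dakota's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def one_generation(pop, objective, p_c=0.65, p_m=0.5, lo=-1.0, hi=1.0):
    """One EA generation: linearly scaled rank selection, uniform
    crossover with rate p_c, per-parameter mutation with rate p_m,
    then replacement keeping the 50 best of the 200 candidates plus
    50 drawn at random from the remaining 150."""
    n, d = pop.shape  # n = 100 parents, d = number of active parameters
    # Rank parents and assign linearly scaled selection probabilities
    order = np.argsort([objective(x) for x in pop])
    prob = np.zeros(n)
    prob[order] = np.linspace(2.0, 0.0, n)  # best parent -> largest weight
    prob /= prob.sum()
    children = []
    while len(children) < n:
        i, j = rng.choice(n, size=2, replace=False, p=prob)
        a, b = pop[i].copy(), pop[j].copy()
        # Uniform crossover: each parameter swapped with probability p_c
        swap = rng.random(d) < p_c
        a[swap], b[swap] = b[swap], a[swap]
        # Mutation: each parameter resampled from its distribution
        for child in (a, b):
            mut = rng.random(d) < p_m
            child[mut] = rng.uniform(lo, hi, size=mut.sum())
        children += [a, b]
    combined = np.vstack([pop, np.array(children[:n])])  # 200 candidates
    scores = np.array([objective(x) for x in combined])
    best = np.argsort(scores)[:50]                        # elitist half
    rest = np.setdiff1d(np.arange(2 * n), best)
    keep = np.concatenate([best, rng.choice(rest, size=50, replace=False)])
    return combined[keep]

pop0 = rng.uniform(-1.0, 1.0, size=(100, 5))
pop1 = one_generation(pop0, objective=lambda x: np.sum(x**2))
```

Keeping the 50 best individuals preserves promising regions (local search), while the 50 random survivors maintain diversity (global search), mirroring the balance discussed above.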
2.2. Optimization of reactions in PLOG formalism
For those reactions exhibiting a “fall-off” behaviour, the rate
k(T,P) is usually determined from the low and high-pressure limit
constants, together with a blending function that smoothly con-
nects the limiting rates across the fall-off regime, using different
possible formulations. Among these, the Troe formulation [75,76] is the most widely used. An alternative formulation based on logarithmic interpolation, expressed in the so-called PLOG format, has recently been proposed [46] and is rapidly growing in popularity because of its potentially superior accuracy, thus becoming the new standard formalism. PLOG reactions are typically introduced
A. Bertolino, M. Fürst, A. Stagni et al. Combustion and Flame 229 (2021) 111366
Fig. 3. 3D behaviour of a PLOG reaction. R143: HNO = H + NO before (dashed line) and after optimization (continuous line).
in a kinetic mechanism using multiple Arrhenius rate constants ac-
counting for temperature dependence at constant pressures cov-
ering the entire range of conditions from the low to the high-
pressure limits. Then, a proper (i.e. logarithmic) interpolation is
adopted for the intermediate pressures. In this way, the combined effects of pressure (P) and temperature (T) on the rate constant k are properly accounted for. As a result, the three Arrhenius parameters at each pressure cannot be optimized independently from each other, even within their own uncertainty ranges, if physical consistency over the whole pressure domain is to be preserved. On the contrary, the same optimization performed using all the nominal pre-exponential factors, temperature exponents and activation energies, i.e. treating reactions at different pressures as independent from each other, would result in a non-monotonic behaviour with arguable physical meaning. Additionally, since the number of rate expressions within the same PLOG is the result of a fitting needed to describe a complex k(T, P) with an acceptably small error, the number of parameters to be handled scales accordingly. This may result in an abrupt increase in the number of parameters for a single reaction. For the first time in the literature, we propose an approach to optimize the parameters at all pressures simultaneously, building on the parameter bounds introduced in the previous section, using only three uniformly distributed random variables with an average value of 0, constrained to the following ranges:
X_1 ∈ [−ln(10) f_r, +ln(10) f_r], (17)

X_2 ∈ [−f_r / log10(T_max), +f_r / log10(T_max)], (18)

X_3 ∈ [−f_r T_min ln(10), +f_r T_min ln(10)], (19)
These variables are associated with α, β, and ε at all pressures, respectively. The value of X_1 is sampled from its distribution and added to all the α at the different pressures, i.e. each rate constant is changed by the same factor, and the same is done for the nominal β_0 and ε_0 values, using X_2 and X_3. As an example, Fig. 3 displays results from Section 3 comparing the nominal and optimized rate for the decomposition reaction HNO = H + NO, to which we attributed an uncertainty factor f_r of 0.3. The reported pressure values for this reaction are 0.1, 1, 10, 100 and 1000 bar. Fig. 3 highlights the preserved consistency in the pressure-dependent behaviour of the reaction rates at different temperatures.
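The consistent perturbation of all pressure levels can be sketched as follows. This is an illustrative sketch, not OptiSMOKE++ code; the function name, the dictionary layout, and the HNO = H + NO parameter values are hypothetical placeholders:

```python
import numpy as np

def perturb_plog(plog, f_r, T_min, T_max, rng):
    """Perturb a PLOG reaction consistently across pressures: one
    shared shift for ln(A) (alpha), beta, and E/R (epsilon), sampled
    within the bounds of Eqs. (17)-(19).
    'plog' maps pressure [bar] -> (A, beta, E_over_R)."""
    x1 = rng.uniform(-np.log(10) * f_r, np.log(10) * f_r)                  # Eq. (17)
    x2 = rng.uniform(-f_r / np.log10(T_max), f_r / np.log10(T_max))        # Eq. (18)
    x3 = rng.uniform(-f_r * T_min * np.log(10), f_r * T_min * np.log(10))  # Eq. (19)
    # Apply the SAME shifts at every pressure to keep the k(T,P)
    # surface monotonic in pressure.
    return {P: (A * np.exp(x1), beta + x2, E_over_R + x3)
            for P, (A, beta, E_over_R) in plog.items()}

# Hypothetical two-pressure entry (units arbitrary for illustration)
rng = np.random.default_rng(1)
plog = {0.1: (1.0e12, 0.5, 25000.0), 1.0: (5.0e12, 0.4, 25500.0)}
plog_opt = perturb_plog(plog, f_r=0.3, T_min=300.0, T_max=2500.0, rng=rng)
```

Because X_1 is added to ln(A) at every pressure, all rate constants scale by the same multiplicative factor, which is what preserves the pressure-dependent consistency shown in Fig. 3.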
Fig. 4. Collected experimental data on ammonia combustion in terms of operating conditions (temperature, pressure, and composition).
2.3. Database
Fig. 4 summarizes the features of each test case (TC) in the tem-
perature, pressure and composition space. The experimental data
considered in this work cover the entire space of operating condi-
tions.
The database, consisting of 60 different datasets (635 experimental points) from different test cases, was divided into optimization and validation targets (75% and 25%, respectively).
For high-temperature conditions, the shock tube experiments from
Mathieu and Petersen [55] , and Shu et al. [77] cover ignition delay
time in a wide range of composition ( φ= 0.5–2.0) and pressures
(10–40 bar). Stagni et al. [20] reported data for ammonia oxidation
close to the atmospheric pressure for lean mixtures in two differ-
ent systems, namely jet stirred and flow reactors. At low temper-
atures, He et al. [78] and Pochet et al. [79] provided auto-ignition
data at higher pressures, for lean, stoichiometric and rich mixtures
in rapid compression machines. Wargadalam et al. [80] and Song et al. [81] published speciation data for very lean conditions, at pressures of 1, 30 and 100 bar, in flow reactors. Davidson et al.
[82] investigated ammonia pyrolysis in a shock tube at extremely
high temperatures ( T > 2500 K). The laminar burning speed ex-
periments by Lhuillier et al. [51] were only considered for vali-
dation. However, flame speed targets were included by using the
data from Ronney [83] . The TCs from Rota [84] and Dagaut [85] in
jet stirred reactors were excluded from the optimisation set, yet
used for validation, as they cover a part of the operating condi-
tions space which is already densely populated (see Fig. 4 ).
2.3.1. Numerical simulations
In this work, the ignition delay time was calculated using the
definition reported in the corresponding experimental paper. The
constant volume assumption was used to simulate the shock tube
data of Mathieu and Petersen [55] , and Davidson [82] . In repro-
ducing the data from Shu et al. [77] , gas dynamic effects were
accounted for using the methodology described in [86] . The RCM
data were reproduced under the hypothesis of adiabatic core [87] ,
and detailed volume profiles from He [78] and Pochet [79] were
used to properly account for the compression stroke and heat ex-
change effects in each experiment. For the flow reactor experi-