# A mixed integer genetic algorithm used in biological and chemical defense applications.

**ABSTRACT** There are many problems in security and defense that require a robust optimization technique, including those that involve the release of a chemical or biological contaminant. Our problem, in particular, is computing the parameters to be used in modeling atmospheric transport and dispersion given field sensor measurements of contaminant concentration. This paper discusses using a genetic algorithm for addressing this problem. An example shows how a mixed integer genetic algorithm can be used in conjunction with field sensor data to invert a forward model to obtain the meteorological data and source information necessary for prediction of the subsequent concentration field. A new mixed integer genetic algorithm is described that is a state-of-the-art tool capable of optimizing a wide range of objective functions. Such an algorithm is used here for optimizing atmospheric stability, wind speed, wind direction, rainout, and source location. We demonstrate that the algorithm is successful at reconstructing these meteorological and source parameters despite moderate correlations between their effects on the sensor data.



Sue Ellen Haupt · Randy L. Haupt · George S. Young

© Springer-Verlag 2009


Keywords: Source characterization · Atmospheric dispersion · Genetic algorithm · Mixed integer

1 Motivation

In the case of an accidental or intentional release of a toxic

chemical, biological, radiological, or nuclear (CBRN)

contaminant, responsible agencies must decide which areas

to evacuate, how to mitigate the release, and how to exe-

cute emergency response. Potentially life-or-death deci-

sions should be based on forecasts of transport and

dispersion of the contaminant. In a real situation, however,

it is unlikely that the exact information regarding the

source parameters (location, time, strength of the release)

or the meteorological data (wind speed and direction,

atmospheric stability) would be available. One way to

mitigate this problem is to assimilate data from field sen-

sors into the atmospheric transport and dispersion model,

for example, meteorological data, concentration data, or

both. In assimilating these data into the model, however,

one must consider difficulties including: (1) monitored

concentration data contains errors; (2) inherent uncertain-

ties apply to modeling chaotic processes such as turbulent

dispersion; and (3) transport and dispersion models com-

pute the ensemble average of many realizations of an event

while the goal is to reproduce a specific single realization

of the event in real time. The process of assimilating the

data into the modeling framework should thus be formu-

lated as one in optimization, specifically configured to

address these issues as well as the fit of model to data.

Dispersion modeling is therefore augmented by assimilating ground truth from a network of field sensors to back-calculate the source and meteorological parameters required by the model to predict the subsequent transport and dispersion of the contaminant.

S. E. Haupt (✉) · R. L. Haupt
Applied Research Laboratory, The Pennsylvania State University, P.O. Box 30, State College, PA 16804, USA
e-mail: haupts2@asme.org

S. E. Haupt · G. S. Young
Meteorology Department, The Pennsylvania State University, Walker Building, University Park, PA 16804, USA

Soft Comput
DOI 10.1007/s00500-009-0516-z

Our previous work demonstrated that coupling inverse models with transport and dispersion models using a genetic algorithm (GA) is an effective approach for attributing concentration measured at a receptor to each of

a specified number of sources. A GA is an optimization

technique that integrates genetic recombination with nat-

ural selection to evolve better solutions to an optimization

problem (Goldberg 1989; Haupt and Haupt 2004; Holland

1975). This technique was tested using a basic Gaussian

plume dispersion model on synthetic data for both con-

trived source configurations and also with the actual source

configuration for Logan, Utah (Haupt 2005). The meth-

odology was then statistically analyzed using Monte Carlo

techniques to determine the confidence intervals, including

in the presence of both additive and multiplicative white

noise (Haupt et al. 2006). We found that even when the

noise was the same magnitude as the signal, the GA-cou-

pled model could apportion the pollutant to the correct

source. The next step replaced the Gaussian plume dis-

persion model with an operational Second order Closure

Integrated PUFF model, SCIPUFF (Allen et al. 2007a).

The GA-coupled model performed as well with SCIPUFF

computing the dispersion as with the Gaussian plume

model. That enhanced coupled model was then tested on

field test data (Allen et al. 2007a). Within the limitations of

the data, the coupled model still performed admirably. The

cases where performance was disappointing were traced to

difficulties during the field test that would be expected to

impact data quality. The problem was then reformulated

to additionally compute the wind speed and direction

(Allen et al. 2007b). A subsequent study with that refor-

mulated model additionally included the wind speed, time

of release, and effective plume height as parameters to

optimize (Long et al. 2009). That work also analyzed the

amount of data required to back-calculate six parameters in

the presence of noise and formulated a measure of how

much information is necessary to compute a sufficiently

good solution.

The inverse problems in these prior studies were all

solved using a genetic algorithm (GA). The parameters to

be optimized by the GA are the input values for the dis-

persion model. Thus, for each potential solution, the results

of the dispersion model with those estimated parameters

are compared to the monitored concentration pattern. That

series of efforts progressed from simply tuning the source

strength through identifying up to seven relevant parame-

ters: two dimensional location (x, y), effective plume

height, time of release, strength of release, wind direction,

and wind speed. The general process is depicted in Fig. 1.

The concentrations are assumed to be measured at a set of

sensor locations. The concentrations predicted by the dis-

persion model using ‘‘guesses’’ at the modeling variables

are compared to those measured. The GA incorporates the

resulting trial solution performance information to con-

struct better guesses to the model variables using the

operators. After enough generations, the modeling vari-

ables converge to an accuracy sufficient for predicting the

future concentration field.

The GA used in our prior efforts was a continuous GA,

i.e. all of the variables being optimized were real-valued.

For this current effort, however, we need to additionally

invert for both an integer and a binary variable: atmo-

spheric stability class and a rainout switch. The stability of

the atmosphere determines the dispersion coefficients that

govern the plume spread with distance and is typically

divided into six discrete categories. We also account for the

fact that rain causes a portion of the contaminant to ‘‘wash

out’’ from the atmosphere, lowering surface concentration.

This expanded effort requires a mixed integer genetic algorithm (MIGA), which allows the simultaneous optimization of real, integer, and binary variables.

Fig. 1 Schematic of source and meteorological data optimization for security

The formulation of the problem is described in Sect. 2.

Section 3 details the MIGA. Results for three numerical

experiments are described in Sect. 4. Section 5 summarizes

the work and discusses the utility of the algorithm in the

context of similar problems.

2 Problem formulation

2.1 Model formulation

The goal of the current study is to develop and test an

algorithm for combining an atmospheric transport and

dispersion model with field sensor data on contaminant

concentration so as to back-calculate (i.e. invert for) the

meteorological data and source characteristics necessary

for subsequent transport and dispersion modeling. The predicted concentration values are compared to those measured using the cost function:

$$
\mathrm{cost} = \frac{\sqrt{\sum_{r=1}^{TR} \left[ \ln(aC_r + \varepsilon) - \ln(aR_r + \varepsilon) \right]^2}}{\sqrt{\sum_{r=1}^{TR} \left[ \ln(aR_r + \varepsilon) \right]^2}} \tag{1}
$$

where $C_r$ is the forecast concentration as predicted by the transport and dispersion model at receptor $r$, $R_r$ the observed concentration retrieved from receptor $r$, $TR$ the total number of receptors, and $a$ and $\varepsilon$ are constants used to avoid taking the logarithm of zero ($a = 1$, $\varepsilon = 1 \times 10^{-13}$ here).

The model agreement with sensor data is tested in the context of an identical twin experiment; that is, the transport and dispersion model used to optimize agreement with the sensor data is the same model that produces the synthetic sensor data. The identical twin approach is convenient for formulating and testing problems because it removes several of the sources of error from consideration: we no longer expect inherent fluctuations due to turbulence and we are assured that the sensor data is not contaminated with noise. Since it allows us to compare to data that we know are exact, it allows evaluation of the inversion algorithm alone rather than the combination of model and data. The disadvantage, of course, is that a level of realism is lost in this approach. Thus, the approach is best suited for algorithm analysis rather than estimating absolute performance of an algorithm in the real-world setting.
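The cost function (1) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function name `log_cost` and its argument names are assumptions introduced here, while the defaults a = 1 and ε = 1 × 10⁻¹³ are the values quoted in the text.

```python
import numpy as np

def log_cost(predicted, observed, a=1.0, eps=1e-13):
    """Normalized log-space misfit between predicted and observed concentrations, eq. (1).

    `predicted` and `observed` hold one concentration per receptor.
    The names and the array interface are illustrative assumptions.
    """
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    log_pred = np.log(a * predicted + eps)   # eps keeps log() finite at zero concentration
    log_obs = np.log(a * observed + eps)
    numerator = np.sqrt(np.sum((log_pred - log_obs) ** 2))
    denominator = np.sqrt(np.sum(log_obs ** 2))
    return numerator / denominator
```

In an identical twin setting the cost is exactly zero when the trial parameters reproduce the synthetic data, so any positive value measures the remaining mismatch.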

2.2 Concentration prediction

A Gaussian plume transport and dispersion model is used to forecast the contaminant concentration. This model is used because it is an exact solution of the ensemble averaged diffusion equation. The problem is formulated in a Cartesian domain. Wind is assumed to blow in the positive x-direction (i.e. the domain is rotated so that the x axis aligns with the direction of the wind). This model is formulated as:

$$
C_r = \frac{Q}{u \sigma_z \sigma_y 2\pi} \exp\left( \frac{-y^2}{2\sigma_y^2} \right) \left\{ \exp\left[ \frac{-(z - H_e)^2}{2\sigma_z^2} \right] + \exp\left[ \frac{-(z + H_e)^2}{2\sigma_z^2} \right] \right\} \tag{2}
$$

where $C_r$ is the concentration of emissions from source $n$ over time period $m$ at a receptor location, $(x, y, z)$ the Cartesian coordinates of the receptor in the downwind direction from the source, $Q$ the emission rate from source $n$ over time period $m$, $u$ the wind speed, $H_e$ the effective height of the plume centerline above ground, and $\sigma_y$, $\sigma_z$ are the dispersion coefficients in the y- and z-directions, respectively.
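As an illustration of (2), the plume concentration at a single receptor might be evaluated as follows. The function and parameter names are assumptions made for this sketch, and the dispersion coefficients are passed in directly rather than computed, so that the block stays independent of any particular coefficient table.

```python
import numpy as np

def gaussian_plume(y, z, Q, u, H_e, sigma_y, sigma_z):
    """Gaussian plume concentration of eq. (2) in wind-aligned coordinates.

    y, z      crosswind and vertical receptor coordinates
    Q, u      emission rate and wind speed
    H_e       effective plume centerline height
    sigma_y,
    sigma_z   dispersion coefficients evaluated at the receptor's downwind distance
    """
    lateral = np.exp(-y**2 / (2.0 * sigma_y**2))
    # The second exponential is the image source that reflects the plume at the ground.
    vertical = (np.exp(-(z - H_e)**2 / (2.0 * sigma_z**2))
                + np.exp(-(z + H_e)**2 / (2.0 * sigma_z**2)))
    return Q / (2.0 * np.pi * u * sigma_y * sigma_z) * lateral * vertical
```

For a ground-level source and a ground-level receptor on the centerline (y = z = H_e = 0) the two vertical terms coincide, and the expression reduces to Q / (π u σ_y σ_z).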

The transport is in the x-direction at wind speed $u$, and the contaminant is dispersed in the y- and z-directions with standard deviation of the spread given by the dispersion coefficients. The dispersion coefficients are computed from (Beychok 1994):

$$
\sigma = \exp\left\{ I + J \ln(x) + K \left[ \ln(x) \right]^2 \right\} \tag{3}
$$

where $x$ is the downwind distance (in km) and $I$, $J$, and $K$ are empirical coefficients dependent on the Pasquill Stability Class (Pasquill 1961), which characterizes the atmospheric turbulence scales. In a highly unstable atmosphere (stability class A), large eddies mix the contaminant over a relatively large physical extent; thus, the dispersion coefficients are large for this case. In contrast, for stable conditions such as form on a calm night, few eddies exist and mixing is slight (stability classes E and F), resulting in smaller dispersion coefficients. For the moderately and slightly unstable (classes B and C) and neutral (class D) cases, the coefficient values fall between those extremes. The coefficients $I$, $J$, and $K$ can be looked up in tables to produce $\sigma_y$ and $\sigma_z$ (Beychok 1994).

The predicted concentrations $C_r$ of (2) and the monitored data $R_r$ are the concentration values that are compared in the cost function (1).
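Equation (3) is simple to evaluate once the coefficients for a stability class are known. The sketch below takes I, J, and K as arguments and deliberately does not reproduce the tabulated values from Beychok (1994); the function name and the coefficient values used in any call are placeholders, not data from the paper.

```python
import math

def dispersion_coefficient(x_km, I, J, K):
    """Dispersion coefficient of eq. (3) at downwind distance x_km (kilometres).

    I, J, K are the empirical coefficients for one stability class and one
    direction (y or z); they must be supplied by the caller from a table.
    """
    if x_km <= 0.0:
        raise ValueError("downwind distance must be positive")
    log_x = math.log(x_km)
    return math.exp(I + J * log_x + K * log_x**2)
```

At x = 1 km the logarithm vanishes, so the coefficient reduces to exp(I), which gives a quick sanity check on any table that is plugged in.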

2.3 Impact of atmospheric stability

In our previous work (Allen et al. 2007a; Haupt 2005,

2007; Haupt et al. 2006; Long et al. 2009), we assumed a

neutrally stable atmosphere and used values of I, J, and K

appropriate for that assumption. The shape of a plume

varies greatly, however, with differing stability classes.

This sensitivity occurs because, in the atmosphere, turbulent dispersion is much larger than molecular dispersion; therefore, the turbulence scales of the atmosphere determine the plume spread. Pasquill (1961) defined six atmospheric stability classes, each with its own characteristic scales. The most unstable class (stability A or 1) results in large-scale turbulent motion so the contaminant can be carried to higher elevations. In contrast, the most stable classes (such as stability F or 6) are characterized by smaller scale atmospheric eddies so the contaminant does not spread as much in either the crosswind or vertical directions. Figure 2 demonstrates that point by comparing concentration isosurfaces for stable, neutral, and unstable atmospheric conditions.

The six stability classes are discrete and thus are most readily coded as integer values. Thus, the GA must be able to optimize both real and integer parameters.

2.4 Impact of rainout

Additionally, a binary variable is added to the cost function to indicate whether or not it is raining: $B = 0$ indicates a "no rain" condition and $B = 1$ means "rain." Steady rain decreases the concentration exponentially with distance because it eliminates a fixed fraction of the contaminant in each time interval, so the concentration equation in the case of rain becomes:

$$
C_r = e^{-ax} \frac{Q}{u \sigma_z \sigma_y 2\pi} \exp\left( \frac{-y^2}{2\sigma_y^2} \right) \left\{ \exp\left[ \frac{-(z - H_e)^2}{2\sigma_z^2} \right] + \exp\left[ \frac{-(z + H_e)^2}{2\sigma_z^2} \right] \right\} \tag{4}
$$

where $a$ is a constant that determines the rainout rate (distinct from the constant $a$ in (1)). Here $x$ is the downwind distance in km, $a = 0$ for "no rain," and $a = 0.75\ \mathrm{km}^{-1}$ for "rain" situations. Figure 3 compares a plume for stability 2 with and without rainout.
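The only difference between (2) and (4) is the exponential prefactor, so the rainout switch can be sketched as a multiplier applied to the dry concentration. The function name and signature are assumptions for illustration; the rate 0.75 km⁻¹ is the value quoted above.

```python
import math

def rainout_factor(x_km, raining, a=0.75):
    """Multiplier that turns the dry concentration of eq. (2) into eq. (4).

    x_km     downwind distance in kilometres
    raining  the binary rainout switch B (truthy means rain)
    a        rainout rate in 1/km; 0.75 is the value quoted in the text
    """
    return math.exp(-a * x_km) if raining else 1.0
```

With rain, the concentration 2 km downwind is reduced by a factor of exp(-1.5), which is roughly 0.22; without rain the factor is exactly 1, recovering (2).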

2.5 Solution domain and variable settings

The transport and dispersion model calculations presented here use a 32 × 32 grid of receptors ($TR$ = 32 × 32 = 1,024) with the source located in the center of the grid as seen in Fig. 3. In prior work (Long et al. 2009) we showed that a grid of this size is quite robust for such back-calculations of source and meteorological parameters, even in the presence of noise. The target solution uses $u = 5\ \mathrm{m\,s^{-1}}$, $\theta = 180^\circ$ (wind from the south in the meteorological convention of wind direction), and a stability class of 4.¹

We conduct three sets of numerical experiments to test

the algorithm. First, we back-calculate the basic meteoro-

logical variables: wind direction, wind speed, and stability

classification. Then we compute the location (x, y) and

strength of the source in addition to wind direction and

stability class. The third experiment includes the meteo-

rological variables from the first experiment plus a binary

variable to indicate whether or not it is raining.

Fig. 2 Comparison of concentration isosurfaces for stabilities 1 (a), 3 (b), 4 (c), and 6 (d). Note the different vertical coordinate

¹ The algorithm was tested for sensitivity to changes in these parameters and found to be insensitive. Therefore, although this single case is shown here, we expect that the same results are attainable for other values of the variables.


2.6 Solution technique

The algorithm used to optimize the agreement between

monitored data and predicted concentrations is a MIGA.

Figure 4 is a flow diagram of a genetic algorithm applied to

the source characterization problem solved here. The

algorithm begins with a population of chromosomes, which

comprise first guesses of the variables to be optimized. The

GA works with many such guesses at once, so a matrix of

trial solutions is formed with chromosomes as the rows.

Initially, all of the chromosomes in the population matrix

contain random values, in this case, between 0 and 1. This

matrix is passed to the cost function and a column vector of

costs is returned. In a process known as natural selection,

the best chromosomes are retained in the population while

the ‘‘unfit’’ ones are discarded. The remaining population

then undergoes two operators, mating and mutation, that

generate new potential solutions. The operation of mating

combines the variable values from the best trial solutions

(the mating pool or parents) to produce a new population of

improved variable estimates, the offspring. The mutation

operator then modifies a set of those chromosomes of

parents and offspring that now form the population in order

to maintain an adequate sampling of the variable space,

thus preventing premature convergence to a suboptimal set

of variable values. The process repeats until an adequate

set of variables is identified. At that point, the source and meteorology variables have been tuned to produce the best agreement between predicted concentrations and those observed. The GA is quite robust at solving difficult nonlinear coupled optimization problems with a multitude of

local minima that are difficult for traditional techniques

(Goldberg 1989; Haupt and Haupt 2004; Holland 1975).

More details of the GA technique are described in Haupt

and Haupt (2004).

The MIGA used here minimizes cost functions that are

comprised of real number continuous variables, integer

variables, and binary variables. We configure the MIGA to

minimize the cost defined in (1). Including the integer and

binary variables in the search space necessitates a method

to simultaneously optimize integer, binary, and real values.

3 The mixed integer genetic algorithm

We describe several unique features of the MIGA used here and first introduced in Haupt (2007), including:

• all variables are represented with values between zero and one,

• the uniform crossover mating operation is used,

• mutations occur on an entire chromosome rather than on an individual variable, and

• all scaling and mapping of the variables occurs in the cost function.

This MIGA is versatile because the same algorithm can

be used for optimizing combinations of binary, integer, and

real variables.

3.1 Variable coding and chromosomes

In order to make the MIGA as flexible as possible, all variables are mapped to continuous values between 0.0 and 1.0. The term continuous, as used here, specifies continuous over the finite range of 0.0–1.0. A simple transformation in the cost function maps each value to the appropriate range for its variable. If a variable has an integer or binary value, then the cost function will map it to a discrete value within that range. The benefit of this approach is that all the scaling, quantizing, and rounding happen in the cost function, so that the MIGA operates independent of the variable type. There is no need for a binary GA or a real GA, because the operators work with any combination of variable types. A chromosome can have any mix of real, integer, and binary variables.

Fig. 3 Surface concentration of plume at stability 2 with (a) and without (b) rainout with a = 0.75 in (4)

Because of this mapping the initial population matrix is an $N_{pop} \times N_{var}$ matrix of uniform random numbers in the range of 0.0–1.0, where $N_{var}$ is the number of variables to be optimized and $N_{pop}$ is the population size:

$$
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1N_{var}} \\
a_{21} & a_{22} & \cdots & a_{2N_{var}} \\
\vdots & \vdots & \ddots & \vdots \\
a_{N_{pop}1} & a_{N_{pop}2} & \cdots & a_{N_{pop}N_{var}}
\end{bmatrix} \tag{5}
$$

Each row is a chromosome that serves as an input to the cost function (1). Inside the cost function, the domain-limited real variables of each chromosome are converted to the problem's true variable types. Real variables are mapped by

$$
x_n = (x_{max} - x_{min})\, a_{mn} + x_{min} \tag{6}
$$

The integer value mapping is

$$
x_n = \mathrm{roundup}\{ (x_{max} - x_{min})\, a_{mn} \} + x_{min} \tag{7}
$$

where 'roundup' rounds to the next highest integer and $x_n$ values are integers. Where necessary, the variable is converted to binary by either rounding its value:

$$
x_n = \mathrm{round}\{ a_{mn} \} \tag{8}
$$

or by quantizing its value:

$$
x_n = \mathrm{quantize}\{ a_{mn} \} \tag{9}
$$

Let us consider a case where we wish to optimize wind speed, wind direction, stability, and rainout. For wind speed, we choose $x_{WS,min} = 0$ and $x_{WS,max} = 20$. Wind direction uses $x_{WD,min} = 0$ and $x_{WD,max} = 360$. There are six discrete stability classes that are decoded by assigning each an integer value of 1 through 6 so that $x_{stab,min} = 1$ and $x_{stab,max} = 6$. The binary value that determines whether or not it is raining is determined via (8). If the rainout value is 1, the concentration is determined by (4). If it is 0, then (2) is used instead.
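The decoding step for this four-variable case can be sketched as below. The helper names are hypothetical, and two details are assumptions of this sketch rather than statements from the paper: 'roundup' is taken to be the ceiling function, and rounding in (8) is taken to map values of 0.5 and above to 1.

```python
import math

def decode_real(a, x_min, x_max):
    """Eq. (6): scale a value in [0, 1] onto the real range [x_min, x_max]."""
    return (x_max - x_min) * a + x_min

def decode_integer(a, x_min, x_max):
    """Eq. (7), with 'roundup' assumed to be the ceiling function.

    As written, x_min itself is returned only when a is exactly 0.
    """
    return math.ceil((x_max - x_min) * a) + x_min

def decode_binary(a):
    """Eq. (8), assuming values of 0.5 and above round to 1."""
    return 1 if a >= 0.5 else 0

def decode_chromosome(chrom):
    """Map one chromosome [wind speed, wind direction, stability, rainout] to model inputs."""
    a_ws, a_wd, a_stab, a_rain = chrom
    return {
        "wind_speed": decode_real(a_ws, 0.0, 20.0),
        "wind_direction": decode_real(a_wd, 0.0, 360.0),
        "stability": decode_integer(a_stab, 1, 6),
        "rain": decode_binary(a_rain),
    }
```

For example, the chromosome [0.25, 0.5, 0.5, 0.9] decodes to a wind speed of 5, a wind direction of 180, stability class 4, and rain switched on. Because the GA operators only ever see the values in [0, 1], adding another variable type only requires another decoder of this kind.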

The next step in the algorithm is natural selection. This process occurs via the cost function: the variables in each chromosome are used to compute $C$ using (2) or (4), and $C$ is then fed into the cost function (1). Chromosomes

with low costs survive, while chromosomes with high costs

are discarded. This step either keeps a certain percentage of

the population or discards members with costs that exceed

a certain level. Surviving chromosomes become the mating

pool. Discarded chromosomes from the population are

replaced by new offspring chromosomes. In order to create

the offspring, parents must be selected. This current

application replaces 50% of the population at each gener-

ation. We use tournament selection (Haupt and Haupt

2004). In general, two parents produce two offspring that

replace two discarded chromosomes.
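A minimal sketch of this selection step follows, assuming the population is stored as a NumPy array with one chromosome per row. The function names, the tournament size of two, and the use of NumPy's `Generator` for randomness are choices made for this illustration; the paper specifies only the 50% replacement and the use of tournament selection.

```python
import numpy as np

def natural_selection(population, costs, keep_fraction=0.5):
    """Keep the lowest-cost chromosomes; the rest will be replaced by offspring."""
    costs = np.asarray(costs, dtype=float)
    order = np.argsort(costs)                       # ascending: best chromosome first
    n_keep = max(2, int(round(keep_fraction * len(costs))))
    keep = order[:n_keep]
    return population[keep], costs[keep]

def tournament_select(costs, rng, tournament_size=2):
    """Return the index of the lowest-cost member of a random subset of the mating pool."""
    costs = np.asarray(costs, dtype=float)
    contenders = rng.choice(len(costs), size=tournament_size, replace=False)
    return int(contenders[np.argmin(costs[contenders])])
```

Calling `tournament_select` twice on the surviving costs yields the indices of two parents, whose offspring then fill two of the discarded slots.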

Fig. 4 Diagram indicating flow of logic for a mixed integer genetic algorithm applied to the source characterization problem

3.2 Mating

Mating between two selected chromosomes uses uniform crossover, which is preferable for a MIGA because uniform crossover provides a larger exploration of the cost surface than other approaches to crossover (Haupt and Haupt 2004). First, a random binary mask is created consisting of ones and zeros with the same length as the chromosome. A one in the mask column means that the offspring receives the value of that variable from parent#1; a zero means that it receives it from parent#2. As an example:

$$
\begin{aligned}
\text{parent\#1} &= [\, a_{m1} \quad a_{m2} \quad a_{m3} \quad a_{m4} \,] \\
\text{parent\#2} &= [\, a_{n1} \quad a_{n2} \quad a_{n3} \quad a_{n4} \,] \\
\text{mask} &= [\, 1 \quad 0 \quad 1 \quad 1 \,] \\
\text{offspring} &= [\, a_{m1} \quad a_{n2} \quad a_{m3} \quad a_{m4} \,]
\end{aligned} \tag{10}
$$

If the matrix elements represent binary variables, then this type of crossover results in a diversity of values. If the elements represent continuous or integer variables then this operation merely interchanges the values between chromosomes. Consequently, mutation is the primary incubator of diversity within the population for continuous and integer values in this algorithm.
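The masking operation in (10) can be sketched as follows. The first offspring follows the rule stated above; building the second offspring from the complementary mask is an assumption of this sketch, since the text says only that two parents produce two offspring. The function name is illustrative.

```python
import numpy as np

def uniform_crossover(parent1, parent2, rng):
    """Uniform crossover as in eq. (10).

    Where the mask is 1 the first offspring takes parent1's value, otherwise
    parent2's. The second offspring uses the complementary choice (an assumption).
    """
    parent1 = np.asarray(parent1, dtype=float)
    parent2 = np.asarray(parent2, dtype=float)
    mask = rng.integers(0, 2, size=parent1.shape)   # random 0/1 entry per variable
    child1 = np.where(mask == 1, parent1, parent2)
    child2 = np.where(mask == 1, parent2, parent1)
    return child1, child2, mask
```

Because values are only swapped between chromosomes, every offspring value already exists in one of the parents, which is why the mutation operator carries the burden of introducing new real and integer values.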

3.3 Mutation

The simplest approach to mutation is to randomly select variables in the population and replace them with uniform random values. Indeed, this is the first step of the mutation operation. If the third element of a chromosome is selected for mutation, the mutated chromosome (chrom′) is derived from the selected chromosome (chrom) by

$$
\mathrm{chrom}' = [\, a_{r1} \quad a_{r2} \quad a'_{r3} \quad a_{r4} \,] \tag{11}
$$

where the primed values are new uniform random numbers.

The second step of the mutation operator used here adds a random adjustment factor to the chromosome selected for mutation. The correction factor comes from multiplying each element within the chromosome by a random number ($-1 \le \beta_{rm} \le 1$) and multiplying the resulting chromosome by a mutation factor ($0 \le \alpha_r \le 1$) so that

$$
\mathrm{chrom}_c = \alpha_r [\, \beta_{r1} a_{r1} \quad \beta_{r2} a_{r2} \quad \beta_{r3} a_{r3} \quad \beta_{r4} a_{r4} \,] \tag{12}
$$

Finally, the mutated chromosome is given by

$$
\mathrm{chrom}' = \mathrm{rem}\{ \mathrm{chrom} + \mathrm{chrom}_c \} \tag{13}
$$

where rem is the remainder function (digits to the left of the decimal point are ignored).
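The two-step mutation of (11) through (13) might be sketched for a single chromosome as shown below. Several details are assumptions of this sketch: the correction is computed from the chromosome after its element has been replaced, the mutation factor defaults to a hypothetical value of 0.1 (the text gives only its allowed range), and the function name and `index` argument are introduced for illustration.

```python
import numpy as np

def mutate(chrom, rng, alpha=0.1, index=None):
    """Two-step mutation of one chromosome with values in [0, 1].

    Step 1 (eq. 11): replace one element with a new uniform random number.
    Step 2 (eqs. 12-13): add alpha * beta * chrom with beta drawn from [-1, 1),
    then keep only the fractional part so every value stays in [0, 1).
    """
    chrom = np.array(chrom, dtype=float)            # copy so the caller's data is untouched
    if index is None:
        index = rng.integers(chrom.size)            # element selected for replacement
    chrom[index] = rng.random()
    beta = rng.uniform(-1.0, 1.0, size=chrom.size)
    correction = alpha * beta * chrom
    return np.mod(chrom + correction, 1.0)
```

Since each element a is adjusted to a(1 + αβ) with |αβ| ≤ 1, the sum before the remainder lies between 0 and 2, so the remainder simply wraps any value of 1 or more back into the unit interval.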

4 Results

4.1 Three-variable problem

The first numerical experiment optimizes the base meteo-

rological parameters used for computing dispersion: wind

speed, wind direction, and Pasquill–Gifford stability class

(integer). Note that the stability class determines the dis-

persion coefficients and thus the spread of the plume. The

MIGA was run with $N_{pop} = 12$ and a mutation rate for the

first step of mutation set to 0.2.

First a single run was accomplished using 5,000 gener-

ations to assess convergence properties. A plot of the con-

vergence appears in Fig. 5. For this case, the best solution

converges in about 1,200 generations. Based on this result, a

margin of error is added and the remaining three-variable

runs use 2,000 generations. Note that the high mutation rate

forces the algorithm to continually try new solutions; thus,

the mean solution does not change much.

The results of the first series of optimizations appear in

Table 1. That table reports the statistics of ten independent

runs of 2,000 generations each for optimizing the three

meteorological parameters. It is apparent that the GA is

quite reliable for this back-calculation. Not only is the

mean value of each variable quite close to the known exact

solution, but the standard deviations are quite small. In

fact, the integer denoting the stability category is consis-

tently diagnosed correctly. In additional runs (not shown),

we were assured that this calculation is not sensitive to

stability category.

Fig. 5 Convergence of first numerical experiment on a 32 × 32 grid for stability 4

Table 1 Results of ten MIGA optimizations of meteorological parameters

| | Wind speed (m/s) | Wind direction (°) | Stability class |
|---|---|---|---|
| Actual | 5.000 | 180.000 | 6 |
| Mean | 4.990 | 180.026 | 6 |
| Median | 4.987 | 180.028 | 6 |
| SD | 0.226 | 0.014 | 0 |

4.2 Five-variable problem

Now, the problem is reconfigured to include the source location variables in the inversion. Each of these has x_min = -8,000 m and x_max = 8,000 m. The actual source is located at (0, 0). We back-calculate the source strength using x_Q,min = 0 and x_Q,max = 10 (these are scaling factors on a non-dimensional emission rate, Q). Note that source strength, Q, and wind speed, u, appear as a ratio in (2), so their effects on the concentration values, C, and thus on the cost function value have a high (albeit negative) correlation; trying to distinguish between the two therefore produces an ill-posed problem. So here we choose to assume that wind speed is known for this calculation (see footnote 2).
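The coupling between Q and u can be illustrated with a minimal sketch. It assumes the standard Gaussian plume form with ground reflection, which may differ in detail from Eq. (2); the function name, the fixed dispersion coefficients `sigma_y` and `sigma_z`, and all numeric values are illustrative assumptions rather than values from this study.

```python
import math

def plume_concentration(Q, u, y, z, sigma_y, sigma_z, H=0.0):
    """Steady-state Gaussian plume with ground reflection (assumed standard form).

    Q: emission rate, u: wind speed, (y, z): crosswind and vertical receptor
    coordinates, H: effective source height. sigma_y and sigma_z are passed in
    directly so that they stay fixed while Q and u vary.
    """
    lateral = math.exp(-y**2 / (2.0 * sigma_y**2))
    vertical = (math.exp(-(z - H)**2 / (2.0 * sigma_z**2))
                + math.exp(-(z + H)**2 / (2.0 * sigma_z**2)))
    return Q / (2.0 * math.pi * u * sigma_y * sigma_z) * lateral * vertical

# Hypothetical receptor and dispersion values, for illustration only.
receptor = dict(y=50.0, z=2.0, sigma_y=80.0, sigma_z=40.0)

base = plume_concentration(Q=1.0, u=5.0, **receptor)
scaled = plume_concentration(Q=2.0, u=10.0, **receptor)   # same Q/u ratio
changed = plume_concentration(Q=2.0, u=5.0, **receptor)   # different Q/u ratio

# Only the ratio Q/u matters: doubling both leaves the concentration unchanged.
print(math.isclose(base, scaled), math.isclose(base, changed))
```

Because any pair (Q, u) with the same ratio produces the same sensor readings in this form, a cost function built on those readings cannot separate the two, which is why one of them is held fixed here.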

Table 2 reports statistics of ten independent runs, each of 10,000 generations (see footnote 3), when optimizing wind direction, stability, and the (x, y) location and strength of the source. Generally, the MIGA is successful at finding the correct values of the five parameters. The value of source strength (a factor that multiplies the emission rate) is not quite as close to the actual value, but still has a reasonable percent error (taken as the difference from the actual divided by the range). Although the magnitude of the source location error appears large, it is small on the scale of the search domain, -8,000 to 8,000 m. The final row of Table 2 lists the difference between the mean and the actual as a percentage skill score (based on the search range). Based on these scores, the source location has been pinpointed rather accurately.
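The range-normalized percent error described above can be written out as a short sketch. The function name is hypothetical, and taking the absolute value of the difference is an assumption; the inputs are the mean source strength and mean y location reported in Table 2 together with the search ranges stated in the text.

```python
def percent_error(mean, actual, lower, upper):
    """Difference between mean and actual, as a percentage of the search range."""
    return abs(mean - actual) / (upper - lower) * 100.0

# Source strength: mean 1.38, actual 1.00, search range 0 to 10.
source_strength_err = percent_error(1.38, 1.00, 0.0, 10.0)

# Source y location: mean 24.6 m, actual 0.0 m, search range -8,000 to 8,000 m.
y_location_err = percent_error(24.6, 0.0, -8000.0, 8000.0)

print(f"source strength: {source_strength_err:.1f} %")
print(f"y location:      {y_location_err:.2f} %")
```

Normalizing by the search range rather than by the actual value keeps the score defined when the actual value is zero, as it is for the source coordinates.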

4.3 Four-variable problem

The third numerical experiment seeks to optimize four variables: wind speed, wind direction, stability class, and rainout switch (a binary number). This experiment thus tests the MIGA's ability to optimize real, integer, and binary variables simultaneously. Each of the ten independent runs used 300 generations. The results appear in Table 3. The real-valued meteorological variables, wind speed and wind direction, are consistently retrieved to a high degree of accuracy. Both the integer and binary variables were computed correctly in each of the ten runs, with standard deviations of 0. Thus we conclude that the MIGA is an effective and reliable tool for such problems.
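One common way to handle real, integer, and binary variables in a single chromosome is to keep every gene as a real number in [0, 1] and decode each gene according to its type. The sketch below illustrates that idea only; it is not claimed to be the MIGA's actual encoding, and the function names, the variable bounds, and the 1 to 6 stability range are assumptions made for the example.

```python
import random

def decode_real(gene, lower, upper):
    """Map a gene in [0, 1] linearly onto the interval [lower, upper]."""
    return lower + gene * (upper - lower)

def decode_integer(gene, lower, upper):
    """Map a gene in [0, 1] onto the integers lower..upper with equal-width bins."""
    n_values = upper - lower + 1
    # min() keeps gene == 1.0 inside the last bin.
    return lower + min(int(gene * n_values), n_values - 1)

def decode_binary(gene):
    """Map a gene in [0, 1] onto a 0/1 switch."""
    return 1 if gene >= 0.5 else 0

def decode(chromosome):
    """Decode four normalized genes into the four-variable problem's parameters."""
    g_speed, g_direction, g_stability, g_rainout = chromosome
    return {
        "wind_speed": decode_real(g_speed, 0.1, 20.0),          # m/s, assumed bounds
        "wind_direction": decode_real(g_direction, 0.0, 360.0),  # degrees
        "stability_class": decode_integer(g_stability, 1, 6),    # assumed range
        "rainout": decode_binary(g_rainout),
    }

rng = random.Random(0)
chromosome = [rng.random() for _ in range(4)]
print(decode(chromosome))
```

With this kind of decoding, crossover and mutation can act on the normalized genes uniformly, and only the cost function evaluation needs to know which variables are discrete.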

5 Discussion

The MIGA is a useful advance of computing technology that allows joint optimization of integer, binary, and continuous variables. Its utility is demonstrated in a CBRN security application: back-calculating source and meteorological parameters for subsequent dispersion modeling. Specifically, it allows adding the computation of stability category and rainout switch in a way that would be difficult for more traditional techniques. This work could aid decision-makers by giving better estimates of contaminant dispersion.

Note that several limitations of the current work affect its applicability. First, the demonstration application has been done in the context of an identical twin environment. Although this environment is quite useful for algorithm testing because it allows verification against a known solution, it does not permit assessment of the algorithm's ability to cope with the unknown vagaries of actual field-monitored data. The second limitation is that this work has been done in a noiseless environment. In a real situation, there would be noise due to error in the measurements, the inability of the ensemble average model to match a specific realization, and the inherently chaotic fluctuations due to atmospheric turbulence. Incorporating synthetic noise, as was done in prior work (Haupt et al. 2006; Long et al. 2009), could help simulate these effects and thus provide a first estimate of the algorithm's sensitivity to them. In addition, sensors have detection and saturation limits. Rodriguez et al. (2009) showed that it is important to build such limits into the models. Finally, the rainout simulation is not physically realistic. Although some meteorological forethought went into formulating (4), the basic form is ad hoc and the rainout rate (a) is not based on any observations and lacks a

dependence on rain rate. That part of the calculation was solely for demonstrating that a binary variable could be encoded and used in the inversion.

Table 2 Results of ten MIGA optimizations of meteorological parameters plus source location and strength

| | Source strength | Wind direction (°) | Stability class | x (m) | y (m) |
|---|---|---|---|---|---|
| Actual | 1.00 | 180.00 | 4 | 0.0 | 0.0 |
| Mean | 1.38 | 180.41 | 4 | -9.9 | 24.6 |
| Median | 1.20 | 180.30 | 4 | -22.0 | 46.7 |
| SD | 0.67 | 0.32 | 0 | 25.6 | 72.3 |
| % Error | 3.8 | 0.01 | 0.00 | 0.00 | 0.15 |

Table 3 Results of ten MIGA optimizations of meteorological parameters including the binary rainout switch

| | Wind speed (m/s) | Wind direction (°) | Stability class | Rainout switch |
|---|---|---|---|---|
| Actual | 5.000 | 180.000 | 2 | 1 |
| Mean | 5.015 | 180.055 | 2 | 1 |
| Median | 4.998 | 180.055 | 2 | 1 |
| SD | 0.055 | 0.040 | 0 | 0 |

Footnote 2. One way to address the coupling of Q and u is to employ a Gaussian puff model that provides a time-varying concentration field. In that case the additional information allows computing both parameters (Long et al. 2009). This approach, however, is not appropriate for the continuous release considered here.

Footnote 3. More generations are required to ensure convergence when including these additional variables.
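The noise and sensor-limit effects discussed in this section could be imposed on synthetic identical-twin data along the following lines. This is a sketch under stated assumptions, not the procedure of the cited studies: the multiplicative Gaussian noise model, the function name, and every numeric value are hypothetical.

```python
import random

def apply_sensor_effects(concentrations, noise_fraction,
                         detection_limit, saturation_limit, seed=0):
    """Perturb clean model concentrations and impose sensor limits.

    Noise is multiplicative Gaussian with standard deviation noise_fraction
    (an assumed noise model). Readings below detection_limit are reported as
    zero; readings above saturation_limit are clipped to saturation_limit.
    """
    rng = random.Random(seed)
    observed = []
    for c in concentrations:
        noisy = c * (1.0 + noise_fraction * rng.gauss(0.0, 1.0))
        if noisy < detection_limit:
            noisy = 0.0
        elif noisy > saturation_limit:
            noisy = saturation_limit
        observed.append(noisy)
    return observed

# Hypothetical clean sensor values spanning several orders of magnitude.
clean = [1e-6, 0.02, 0.5, 3.0, 40.0]
observed = apply_sensor_effects(clean, noise_fraction=0.1,
                                detection_limit=1e-3, saturation_limit=10.0)
print(observed)
```

Running the inversion on `observed` instead of `clean` for increasing `noise_fraction` would give a first estimate of how quickly the retrieved parameters degrade.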

6 Conclusions

This work has introduced a new tool, the MIGA, that is capable of optimizing integer, binary, and continuous variables simultaneously. This algorithm was demonstrated to be effective at back-calculating the source and meteorology information necessary to model dispersion of a CBRN release if field sensor data are available. The biggest advantage of this algorithm is its ability to find the global optimum of a highly nonlinear problem whose complex cost surface has many local optima. The biggest disadvantage is that it is not fast: a run of the four-variable problem takes about 17 min on a desktop PC. Although this sounds slow compared with standard steepest descent algorithms, those algorithms are not successful in solving this problem. In fact, when a Nelder-Mead downhill simplex algorithm (Nelder and Mead 1965) is applied to this problem, it cannot find the solution. Therefore, robustness is the important factor that makes the MIGA a clear winner for this application, and the application in turn demonstrates that robustness.

Future work will proceed on two fronts. First, we will concentrate on further tuning and testing the MIGA. There are myriad ways to tweak the GA operators that will be explored. Such algorithm modifications should be tested on test functions to examine algorithm behavior in a carefully constructed environment. The second front will be applying the MIGA to broader problems, including extensions of the current demonstration problem. For example, we will examine the robustness of the results in the face of noise in the data. We will also optimize additional variables, such as simultaneously optimizing source strength and wind direction as well as including effective source height and the height of the atmospheric boundary layer. For these additional variables, we will use an instantaneous source model and a puff dispersion model. We could study these variables for various receptor configurations and compute the amount of information necessary to complete an inversion, both without noise and in the presence of noise. Note that even further complexity could be added by incorporating a more refined dispersion model and using a more refined model or real data to relax the identical twin assumption. Finally, it would be interesting to use the MIGA to optimize the number of receptors and their locations.

In summary, the MIGA described here has proven to be a powerful optimization tool that could be applied to many problems, including those in the CBRN defense arena.

Acknowledgments

The authors would like to thank Kerrie Long for making Fig. 2. Many helpful discussions with Christopher Allen, Kerrie Long, Anke Beyer-Lout, Andrew Annunzio, Yuki Kuroki, Lili Lei, and Luna Rodriguez helped inspire this work. The third author also expresses his eternal gratitude to Francis de Sales and John Bosco for support in manuscript preparation.

References

Allen CT, Haupt SE, Young GS (2007a) Source characterization with a genetic algorithm-coupled receptor/dispersion model incorporating SCIPUFF. J Appl Meteorol 46(3):273–287

Allen CT, Young GS, Haupt SE (2007b) Improving pollutant source characterization by optimizing meteorological data with a genetic algorithm. Atmos Environ 41:2283–2289

Beychok MR (1994) Fundamentals of stack gas dispersion, 3rd edn. Milton Beychok, Pub., Irvine, CA, 193 pp

Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, New York

Haupt SE (2005) A demonstration of coupled receptor/dispersion modeling with a genetic algorithm. Atmos Environ 39:7181–7189

Haupt RL (2007) Antenna design with a mixed integer genetic algorithm. IEEE AP-S Trans. 55(3):577–582

Haupt RL, Haupt SE (2004) Practical genetic algorithms, 2nd edn with CD. John Wiley & Sons, New York, NY

Haupt SE, Young GS, Allen CT (2006) Validation of a receptor/dispersion model coupled with a genetic algorithm using synthetic data. J Appl Meteorol 45:476–490

Holland JH (1975) Adaptation in natural and artificial systems. The University of Michigan Press, Ann Arbor

Long KJ, Haupt SE, Young GS (2009) Assessing sensitivity of source term estimation. Atmos Environ (submitted)

Nelder JA, Mead R (1965) A simplex method for function minimization. Comput J 7:308–313

Pasquill F (1961) The estimation of the dispersion of windborne material. Meteorol Mag 90:33–49

Rodriguez LM, Haupt SE, Young GS (2009) Impact of sensor characteristics on source characterization for dispersion modeling. Measurement (in revision)
