Using Genetic Algorithms in Chem-Bio Defense Applications

2007 ECSIS Symposium on Bio-inspired, Learning, and Intelligent Systems for Security
Sue Ellen Haupt*
Randy L. Haupt
George S. Young
Meteorology Department
The Pennsylvania State University
State College, PA 16804-0030
*haupts2@asme.org
uncertainties in modeling turbulent dispersion;
transport and dispersion models compute the ensemble
average of many realizations of an event while the
goal is to reproduce a specific single realization of the
event in real time; and the wind field evolves in time
and space. Although these considerations imply a
rather formidable problem, carefully formulating it as
an optimization problem that combines the
physical dispersion modeling capability with the
ground truth from a network of field sensors enables
back-calculation of the parameters necessary to predict
the downwind transport and dispersion of the
contaminant.
Our previous work demonstrated that coupling
inverse models with transport and dispersion models
using a genetic algorithm (GA) is an effective
approach for attributing concentration contribution at a
receptor to each of a specified number of sources [1].
This methodology was tested using a basic Gaussian
plume dispersion model on synthetic data for circular
source configurations and with actual source
configuration for Logan, Utah. The methodology was
then validated using Monte Carlo techniques to
determine the confidence intervals [2]. We also
studied the robustness of the methodology by
considering the effects of both additive and
multiplicative white noise [2]. We found that even
when the noise was of the same magnitude as the signal,
the GA-coupled model could still apportion the
pollutant to the correct source. The next step was to
replace the Gaussian plume dispersion model with an
operational Second-order Closure Integrated Puff
model, SCIPUFF [3]. The GA-coupled model
performed as well with SCIPUFF computing the
dispersion as with the Gaussian plume model. That
enhanced coupled model was then tested on field test
Abstract
There are many problems in Security and Defense
that require a robust optimization technique, including
those that involve the release of a chemical or
biological contaminant. This paper discusses using a
genetic algorithm for addressing such problems. An
example shows how a mixed integer genetic
algorithm can be used in conjunction with field sensor
data to invert for source information and all necessary
meteorological data. A new mixed integer genetic
algorithm is described that is a state-of-the-art tool
capable of optimizing a wide range of objective
functions. Such an algorithm is useful for optimizing
atmospheric stability, wind speed, wind direction, and
source location. We demonstrate that the algorithm is
successful at reconstructing these source and
meteorological parameters.
1. Introduction
In the case of an accidental or intentional release
of a toxic contaminant, responsible agencies must
decide which areas to evacuate, how to mitigate the
release, and how to plan the emergency response. That
process is likely to be based on forecasts of transport
and dispersion of the contaminant. In a real situation,
however, it is unlikely that the exact information
regarding the source parameters (location, time,
strength of the release) or the meteorological data
(wind speed and direction, atmospheric stability) that
are necessary to predict contaminant concentrations
downwind would be available. In addition, monitored
concentration data contain errors; there are inherent
0-7695-2919-4/07 $25.00 © 2007 IEEE
DOI 10.1109/BLISS.2007.29
data [3]. Within the limitations of the data, the
coupled model still performed admirably. The cases
where performance was disappointing were traced to
difficult situations during the field test that would be
expected to impact data quality. For those cases, prior
comparisons of model results to the measured
concentrations were also quite poor. The reformulation
of the problem to additionally compute the wind speed
and direction appears in Allen, et al. [4].
The inverse problems in these prior studies were
all solved using a genetic algorithm (GA). The
parameters to be optimized by the GA are the input
values for the dispersion model. Thus, for each
potential solution, the results of the dispersion model
with those estimated parameters are compared to the
monitored concentration pattern. That series of efforts
progressed from identifying the source strength
through identifying all relevant parameters. This
process is depicted in Figure 1.
data and predicted concentrations. The cost function to
be minimized by the GA is:

$$\mathrm{cost} = \frac{\sum_{r=1}^{TR} \left[ \ln(aC_r + \varepsilon) - \ln(aR_r + \varepsilon) \right]^2}{\sum_{r=1}^{TR} \left[ \ln(aR_r + \varepsilon) \right]^2} \qquad (1)$$

where:
$C_r$ = forecast concentration as predicted by the
Gaussian puff equation at receptor $r$,
$R_r$ = observed concentration retrieved from receptor $r$,
and $a$ and $\varepsilon$ are constants used to avoid taking the
logarithm of zero ($a = 1$, $\varepsilon = 1 \times 10^{-13}$ here). $TR$ is
the total number of receptors.
The dispersion model used is a Gaussian plume
model. The runs presented here use a 32 × 32 grid of
receptors (TR = 32 × 32 = 1024) with the source
located in the center of the grid as seen in Figure 2.
Note that many of the receptors receive negligible
impact. We demonstrate two sets of calculations. First,
we back-calculate the meteorological parameters:
wind direction, wind speed, and stability classification.
Then we add to that the location (x,y) of the source.
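As a minimal sketch, the cost in (1) can be written in a few lines of Python. The function name `log_cost` and its argument names are illustrative and do not come from the paper; the sketch follows (1) as a ratio of sums of squared log terms and assumes a = 1 and ε = 1e-13 as the default constants.

```python
import math


def log_cost(predicted, observed, a=1.0, eps=1e-13):
    """Normalized squared log-difference between predicted and observed concentrations.

    predicted: concentrations C_r from the dispersion model, one per receptor
    observed:  concentrations R_r measured at the same receptors
    a, eps:    constants that keep the logarithm finite at zero concentration
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    # Numerator: squared mismatch of the log-concentrations, summed over receptors.
    num = sum((math.log(a * c + eps) - math.log(a * r + eps)) ** 2
              for c, r in zip(predicted, observed))
    # Denominator: squared log of the observations, used for normalization.
    den = sum(math.log(a * r + eps) ** 2 for r in observed)
    if den == 0.0:
        raise ValueError("normalization term is zero for these observations")
    return num / den


# Receptors with zero concentration are handled without a math domain error.
observed = [0.0, 1e-6, 2e-3]
perfect = log_cost(observed, observed)
worse = log_cost([0.0, 2e-6, 2e-3], observed)
```

A perfect match gives a cost of zero, and any mismatch at a receptor raises the cost, which is what the GA then tries to drive back down.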
Figure 1. Schematic of source and meteorological
data optimization for Security.
The current paper describes our first efforts to
compute atmospheric stability parameters in addition
to wind direction and speed. The stability of the
atmosphere determines the dispersion coefficients that
govern the extent of the plume spread with distance
and time. This requires a Mixed Integer Genetic
Algorithm (MIGA), described in more detail below.
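To make the role of the integer stability class concrete, the sketch below shows one way a class number could select the dispersion coefficients in a Gaussian plume formula. The paper does not state which dispersion curves it uses; the coefficients here are the commonly quoted Briggs open-country fits and serve purely as an assumed stand-in. The mapping of classes 1 to 6 onto Pasquill-Gifford A to F, and the names `sigmas` and `plume_concentration`, are likewise assumptions.

```python
import math

# Assumed stand-in curves (Briggs open-country fits), NOT taken from the paper.
# Entry: (sigma_y factor, sigma_z factor, sigma_z distance term, sigma_z exponent)
BRIGGS_RURAL = {
    1: (0.22, 0.20, 0.0, 0.0),       # class A, very unstable
    2: (0.16, 0.12, 0.0, 0.0),       # class B
    3: (0.11, 0.08, 0.0002, -0.5),   # class C
    4: (0.08, 0.06, 0.0015, -0.5),   # class D, neutral
    5: (0.06, 0.03, 0.0003, -1.0),   # class E
    6: (0.04, 0.016, 0.0003, -1.0),  # class F, very stable
}


def sigmas(x, stability):
    """Lateral and vertical plume spread (m) at downwind distance x (m)."""
    ay, az, bz, pz = BRIGGS_RURAL[stability]
    sigma_y = ay * x / math.sqrt(1.0 + 0.0001 * x)
    sigma_z = az * x * (1.0 + bz * x) ** pz
    return sigma_y, sigma_z


def plume_concentration(x, y, z, stability, wind_speed, q=1.0, height=0.0):
    """Steady-state Gaussian plume with ground reflection.

    x, y, z: downwind, crosswind, and vertical receptor coordinates (m)
    q:       emission rate; height: effective source height (m)
    """
    if x <= 0.0:
        return 0.0  # receptors upwind of the source see nothing
    sy, sz = sigmas(x, stability)
    lateral = math.exp(-y ** 2 / (2.0 * sy ** 2))
    vertical = (math.exp(-(z - height) ** 2 / (2.0 * sz ** 2))
                + math.exp(-(z + height) ** 2 / (2.0 * sz ** 2)))
    return q * lateral * vertical / (2.0 * math.pi * wind_speed * sy * sz)


# Same receptor, different stability class: the unstable plume is much wider.
wide = plume_concentration(1000.0, 300.0, 0.0, stability=1, wind_speed=5.0)
narrow = plume_concentration(1000.0, 300.0, 0.0, stability=6, wind_speed=5.0)
```

Because the class only enters through a table lookup, it is a discrete variable, which is why a purely continuous optimizer is awkward here.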
Figure 2. Concentration of dispersed plume on
32 × 32 grid for stability 4.
2.2 The Mixed Integer Genetic Algorithm –
A GA is an optimization technique that integrates
genetic recombination with natural selection to evolve
better solutions to an optimization problem. Figure 3 is
a flow chart of a typical GA. A single guess of the
optimum input to the cost function is placed in a row
vector called a chromosome. The GA works with
many guesses at once, so a matrix is formed with
chromosomes as the rows. Initially, all the
chromosomes in the population are random. This
matrix is passed to the cost function and a column
vector of costs is created. The operation of mating
2. Methodology
2.1 Model Formulation –
Given field sensor data, the algorithm must back-calculate source characteristics and meteorological
data for subsequent transport and dispersion modeling.
The technique used to solve that problem is a MIGA,
which optimizes the agreement between monitored
combines the information from the best prior
parameter values to produce a new population of
improved estimates. The mutation operator generates
new solutions to maintain an adequate sampling of the
parameter space, preventing premature convergence to
a suboptimal set of parameter values [5]. The GA is
quite robust at solving nonlinear, coupled
optimization problems that are difficult for traditional
techniques.
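The population matrix and the column of costs described for a typical GA can be sketched as follows. The function names and the toy cost are illustrative assumptions; in the application the cost function would run the dispersion model and evaluate (1).

```python
import random


def init_population(n_chromosomes, n_variables, rng):
    """Random population: one chromosome (row) per guess, values in [0, 1)."""
    return [[rng.random() for _ in range(n_variables)]
            for _ in range(n_chromosomes)]


def evaluate(population, cost_function):
    """Column of costs, one entry per chromosome, in population order."""
    return [cost_function(chromosome) for chromosome in population]


def toy_cost(chromosome):
    """Hypothetical stand-in cost: squared distance from the point (0.5, ..., 0.5)."""
    return sum((v - 0.5) ** 2 for v in chromosome)


rng = random.Random(1)
population = init_population(8, 3, rng)   # 8 guesses of 3 variables each
costs = evaluate(population, toy_cost)    # 8 costs, aligned with the rows
```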
a real GA, because the operators work with any
combination of variable types. A chromosome can
have any mix of real, integer, and binary variables.
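A minimal sketch of how the cost function might decode normalized values into mixed variable types is given below. The helper names, the wind speed range of 0 to 20 m/s, and the numbering of stability classes from 1 to 6 are assumptions made for illustration, not values stated in the paper.

```python
def to_real(u, low, high):
    """Scale u in [0, 1] linearly onto the interval [low, high]."""
    return low + u * (high - low)


def to_integer(u, low, high):
    """Quantize u in [0, 1] to an integer in [low, high] using equal-width bins."""
    n_values = high - low + 1
    # min() keeps u == 1.0 inside the top bin instead of overflowing it.
    return low + min(int(u * n_values), n_values - 1)


def to_binary(u):
    """Round u in [0, 1] to 0 or 1."""
    return 1 if u >= 0.5 else 0


def decode(chromosome):
    """Turn a normalized chromosome into physical model inputs (assumed ranges)."""
    u_speed, u_direction, u_stability = chromosome
    return {
        "wind_speed": to_real(u_speed, 0.0, 20.0),         # m/s, assumed range
        "wind_direction": to_real(u_direction, 0.0, 360.0),  # degrees
        "stability": to_integer(u_stability, 1, 6),        # assumed class numbering
    }


params = decode([0.25, 0.5, 0.99])
```

The GA operators never see the physical units or the integer grid; only `decode`, called at the top of the cost function, does.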
The next step in the algorithm is natural selection.
Chromosomes with low costs survive, while
chromosomes with high costs are discarded. This step
either keeps a certain percentage of the population or
discards members with costs that exceed a certain
level. Surviving chromosomes are known as the
mating pool. Discarded chromosomes from the
population are replaced by new chromosomes called
offspring. In order to create the offspring, parents must
be chosen. Here we use tournament selection. In
general, two parents produce two offspring that
replace two discarded chromosomes.
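Natural selection followed by tournament selection can be sketched as below. The keep fraction of one half and the tournament size of two are assumptions; the paper states only that a percentage of the population (or a cost threshold) is used and that parents are chosen by tournament.

```python
import random


def natural_selection(population, costs, keep_fraction=0.5):
    """Keep the lowest-cost chromosomes; they form the mating pool."""
    order = sorted(range(len(population)), key=lambda i: costs[i])
    n_keep = max(2, int(len(population) * keep_fraction))
    keep = order[:n_keep]
    return [population[i] for i in keep], [costs[i] for i in keep]


def tournament_select(pool, pool_costs, rng, size=2):
    """Draw `size` distinct pool members at random; the cheapest becomes a parent."""
    contenders = rng.sample(range(len(pool)), size)
    winner = min(contenders, key=lambda i: pool_costs[i])
    return pool[winner]


population = [[0.1], [0.2], [0.3], [0.4]]
costs = [4.0, 1.0, 3.0, 2.0]
pool, pool_costs = natural_selection(population, costs)
parent = tournament_select(pool, pool_costs, random.Random(0), size=2)
```

With a pool of two and a tournament of two, both members always compete, so the parent here is the lowest-cost chromosome.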
Mating between two selected chromosomes uses
uniform crossover, which is preferable for a MIGA
since uniform crossover provides a larger exploration
of the cost surface than other approaches to crossover.
First, a random binary mask is created consisting of
ones and zeros to the length of the chromosome. A one
in the mask column means the offspring receives the
variable value in parent#1. If it has a zero, then the
offspring receives the variable value in parent#2.
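The mask-based mating step can be sketched as follows. The first offspring follows the rule in the text; building the second offspring from the complementary mask is an assumption, since the text does not say how the second child is formed.

```python
import random


def uniform_crossover(parent1, parent2, rng):
    """Uniform crossover with a random binary mask.

    Where the mask is 1 the first child copies parent1, where it is 0 it
    copies parent2. The second child uses the complementary choice (assumed).
    """
    mask = [rng.randint(0, 1) for _ in parent1]
    child1 = [a if m == 1 else b for m, a, b in zip(mask, parent1, parent2)]
    child2 = [b if m == 1 else a for m, a, b in zip(mask, parent1, parent2)]
    return child1, child2, mask


p1 = [0.1, 0.2, 0.3, 0.4]
p2 = [0.9, 0.8, 0.7, 0.6]
c1, c2, mask = uniform_crossover(p1, p2, random.Random(42))
```

Because each variable is swapped or kept independently, the operator works the same way whether a slot will later be decoded as a real, integer, or binary quantity.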
Mutation is performed by randomly selecting
variables in the population and replacing them with
uniform random values. The mutation rate determines
the total number of variables that receive a mutation.
This type of mutation modifies the entire chromosome
rather than a single variable. It is attractive because it
is not confined to exploring one variable at a time.
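The mutation step admits more than one reading, so the sketch below offers two. With `whole_chromosome=False`, individual variables anywhere in the population are redrawn; with `whole_chromosome=True`, every variable of each selected chromosome is redrawn. In both cases the mutation rate fixes the total number of variables changed. Sparing the first row (the best chromosome in a sorted population) is an added assumption, as is the function name.

```python
import random


def mutate(population, rate, rng, whole_chromosome=False, protect_best=True):
    """Replace randomly chosen variables with uniform random values, in place.

    Returns the number of variables that were redrawn.
    """
    n_rows, n_cols = len(population), len(population[0])
    first = 1 if protect_best else 0           # row 0 is assumed to be the best
    n_mut = round(rate * n_rows * n_cols)      # total variables to mutate
    if whole_chromosome:
        rows = list(range(first, n_rows))
        n_pick = min(len(rows), n_mut // n_cols)
        for i in rng.sample(rows, n_pick):
            population[i] = [rng.random() for _ in range(n_cols)]
        return n_pick * n_cols
    cells = [(i, j) for i in range(first, n_rows) for j in range(n_cols)]
    n_pick = min(len(cells), n_mut)
    for i, j in rng.sample(cells, n_pick):
        population[i][j] = rng.random()
    return n_pick


# The sentinel value 2.0 lies outside [0, 1), so mutated entries are easy to spot.
pop_a = [[2.0] * 4 for _ in range(5)]
changed_a = mutate(pop_a, 0.2, random.Random(0))
pop_b = [[2.0] * 4 for _ in range(5)]
changed_b = mutate(pop_b, 0.4, random.Random(0), whole_chromosome=True)
```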
Figure 3. Flow chart of a genetic algorithm.
We develop a new MIGA approach. The new
MIGA used here has several unique features,
including
• All variables are represented with values between
zero and one,
• The uniform crossover mating operation is used,
• Mutations occur on an entire chromosome rather
than an individual variable, and
• All scaling and mapping of the variables occurs in
the cost function.
This MIGA is versatile because the same algorithm
can be used for any type of variable.
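The four features listed above can be combined into one short loop, sketched here under several assumptions: the toy cost stands in for (1), the variable ranges and class numbering are hypothetical, half the population survives each generation, and whole-chromosome mutation is read as redrawing every variable of the selected rows while sparing the best one.

```python
import random


def decode(chrom):
    """Map [0, 1) values to wind speed (m/s), direction (deg), stability (1..6)."""
    return 20.0 * chrom[0], 360.0 * chrom[1], 1 + min(int(6 * chrom[2]), 5)


def toy_cost(chrom, target=(5.0, 180.0, 6)):
    """Hypothetical stand-in for (1): scaled squared distance to known inputs."""
    speed, direction, stability = decode(chrom)   # all mapping happens here
    return (((speed - target[0]) / 20.0) ** 2
            + ((direction - target[1]) / 360.0) ** 2
            + ((stability - target[2]) / 5.0) ** 2)


def run_miga(cost, n_vars=3, pop_size=20, generations=200, mut_rate=0.2, seed=0):
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_vars)] for _ in range(pop_size)]
    n_keep = pop_size // 2
    history = []
    for _ in range(generations):
        pop.sort(key=cost)                 # lowest cost first
        history.append(cost(pop[0]))
        pool = pop[:n_keep]                # natural selection
        offspring = []
        while len(offspring) < pop_size - n_keep:
            # Tournament of two: the pool is sorted, so the lower index wins.
            p1 = pool[min(rng.sample(range(n_keep), 2))]
            p2 = pool[min(rng.sample(range(n_keep), 2))]
            mask = [rng.randint(0, 1) for _ in range(n_vars)]   # uniform crossover
            offspring.append([a if m else b for m, a, b in zip(mask, p1, p2)])
            offspring.append([b if m else a for m, a, b in zip(mask, p1, p2)])
        pop = pool + offspring[:pop_size - n_keep]
        # Whole-chromosome mutation; row 0 (the current best) is never touched.
        n_rows = min(pop_size - 1, round(mut_rate * pop_size))
        for i in rng.sample(range(1, pop_size), n_rows):
            pop[i] = [rng.random() for _ in range(n_vars)]
    pop.sort(key=cost)
    history.append(cost(pop[0]))
    return pop[0], history


best, history = run_miga(toy_cost, generations=50)
estimate = decode(best)   # (wind speed, wind direction, stability class)
```

Since the best chromosome is never mutated, the best cost cannot increase from one generation to the next, while the mutated rows keep injecting fresh guesses into the rest of the population.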
The MIGA used here minimizes cost functions
that depend on both real-valued continuous variables
and integer variables. We
configure the MIGA to minimize the cost in (1). The
integer variable, atmospheric stability class, is
included in the search space.
4. Results
The MIGA was run for two different types of
inversion. The first configuration optimized all
meteorological parameters used for computing
dispersion: wind speed, wind direction, and Pasquill-
Gifford stability class (integer). Note that the stability
class determines the dispersion coefficients, i.e., the
spread of the plume. First, a single run of 5000
generations was performed. A plot of the
convergence properties appears in Figure 4. For this
case, the best solution converges in about 1200
generations. Note that the high mutation rate forces the
algorithm to continually try new solutions; thus, the
mean solution does not change much.
The results of the first series of optimizations
appear in Table 1. That table reports the statistics of 10
runs of 2000 generations each for optimizing the three
meteorological parameters. These statistics indicate that
the GA is quite reliable for this back-calculation.
3. Algorithm Details
In order to make the MIGA as flexible as possible, all
variables are mapped to continuous values between 0
and 1 [6]. The term continuous, as used here, refers to
values between 0 and 1. If a variable takes an integer or
binary value, then the cost function converts its
continuous value to that type. The benefit of this approach is that
all the scaling, quantizing, and rounding happen in the
cost function, so the MIGA operates independent of
the variable type. There is no need for a binary GA or
5. Discussion
The MIGA is a useful technological advance that
allows joint optimization of integer, binary, and
continuous parameters. It is particularly applicable for
extending this work in security – back-calculating
source and meteorological parameters for subsequent
dispersion modeling. Specifically, it allows adding the
computation of stability category in a way that would
be difficult for more traditional techniques. This work
will aid decision-makers by giving better estimates of
contaminant dispersion.
Future work will concentrate on examining the
robustness of the results in the face of noise in the
data. In addition, we will look at more variables, such
as source strength and effective source height in the
inversion process. For these additional variables, we
will use an instantaneous source model, a puff
dispersion model. We will study these variables for
various receptor configurations and compute the
amount of information necessary to complete an
inversion, both without noise and in the presence of
noise. Finally, we plan to examine using the MIGA to
optimize the number of receptors and their location.
Figure 4. Convergence of the MIGA for
optimizing three meteorological parameters.
Table 1. Results of 10 GA optimizations of
meteorological parameters.

             Wind Spd (m/s)   Wind Dir (°)   Stability Class
Actual       5.000            180            6
Mean         4.990            180.026        6
Median       4.987            180.028        6
Stand Dev    0.226            0.014          0
Table 2 reports statistics of 10 separate runs, each
of 10,000 generations, when optimizing wind
direction, stability, the (x,y) location, and the strength of
the source. Generally, the MIGA is successful in
identifying the relevant parameters. The value of
source strength (a factor that multiplies the emission
rate) is not quite as close to the actual value, but its
percent error (taken as the difference from
the exact value divided by the range) is still reasonable.
Although the error in the source location appears large
in magnitude, it is on
a scale ranging from -8000 to 8000 m. The final row of
Table 2 lists the difference between the mean and the
actual value as a percentage. Viewed as a percentage
error, the source location has been pinpointed rather
accurately.
References
[1] S.E. Haupt, “A Demonstration of Coupled
Receptor/Dispersion Modeling with a Genetic
Algorithm,” Atmospheric Environment, vol. 39, pp.
7181-7189, 2005.
[2] S. E. Haupt, G. S. Young, and C. T. Allen, “Validation
of a Receptor/Dispersion Model Coupled with a Genetic
Algorithm Using Synthetic Data,” J. Appl. Meteor., 45,
476–490, 2006.
[3] C. T. Allen, S.E. Haupt, and G. S. Young, “Source
Characterization With a Genetic Algorithm-Coupled
Receptor/Dispersion Model Incorporating SCIPUFF”, J.
Appl. Meteor, 46, No. 3, 273–287, 2007.
[4] C.T. Allen, G.S. Young, and S.E. Haupt, “Improving
Pollutant Source Characterization by Optimizing
Meteorological Data with a Genetic Algorithm”,
Atmospheric Environment, 41, 2283-2289, 2007.
Table 2. Results of 10 GA optimizations of
meteorological parameters plus source siting.

             Source Strength   Wind Dir (°)   Stability Class   x (m)    y (m)
Actual       1.00              180.00         4                 0.0      0.0
Mean         1.38              180.41         4                 -9.9     24.6
Median       9.20              180.30         4                 -22.0    46.7
StdDev       0.67              0.32           0                 25.6     72.3
% Error      3.8               0.01           0.00              0.00     0.15
[5] R. L. Haupt and S. E. Haupt, Practical Genetic
Algorithms, 2nd edition with CD. John Wiley & Sons,
New York, NY, 2004.
[6] R. L. Haupt, “Antenna Design with a Mixed Integer
Genetic Algorithm,” IEEE AP-S Trans., 55, No. 3, 577-582, 2007.