An Overview of Stochastic Spatial Simulation
R. Mohan Srivastava¹
¹Geostatistician, FSS International, Vancouver, B.C., Canada
Abstract. - Though tools for stochastic spatial simulation are
becoming increasingly available and accessible, the variety of different algorithms also forces users to make conscious choices about
the methodologies they select. This overview paper presents a brief
summary of two of the most successful and popular approaches, sequential simulation and simulated annealing, and uses these as the
basis for discussing some of the practical and theoretical issues surrounding the use of stochastic spatial simulation.
INTRODUCTION
In a broad range of earth science applications, stochastic spatial simulation is
rapidly becoming a common tool for engineering and decision-making in the
face of uncertainty. In environmental applications, it is being used as the basis
for remediation planning and confirmation sampling, not only at large Superfund sites but also for small backyard contamination problems. In fisheries
studies, stochastic spatial simulation is being used by regulatory agencies to
improve their ability to manage fish stocks and to guarantee the stability of
the fisheries industry. In agricultural studies, the same tools are being used to
determine whether or not policies designed to improve soil properties are having their intended effect. As the interest in stochastic spatial simulation grows,
there is an explosion in the variety of simulation algorithms. Even a comprehensive toolkit like the public domain GSLIB software (Deutsch and Journel,
1992), with its 11 routines for simulation, does not cover the breadth of what
is currently available. With simulation research thriving at several academic
institutions, broad choice is starting to become bewildering confusion.
WHAT DO WE WANT FROM A STOCHASTIC METHOD?
Stochastic models as art
Stochastic models often are not used quantitatively and serve only as colorful
wallpaper for the halls of research centers. Though such artwork is often dismissed as lacking in substance, it can play a useful role in catalyzing better
technical work. In industries where there is an endemic belief that all spatial
models should look like contour maps, it is difficult to get people to appreciate
that there is much more complexity between the available sample data than
traditional contouring techniques can portray. Stochastic models often have
their greatest influence when they are used to challenge a complacent belief in
the simplicity of spatial processes. When engineers and scientists who are familiar with traditional contouring techniques first see the results of a stochastic
approach, their initial reaction is often one of surprise and skepticism. Their
surprise is due to the fact that stochastic models are much more visually complex than the models they are familiar with; their skepticism is due to the fact
that there are many different renditions, rather than a unique "best" model.
This initial surprise and skepticism often gives way to curiosity, however, as
they realize that these alternate renditions all share a similar visual character
and all somehow manage to honor the available data.
Once a stochastic model is rendered on maps and cross-sections and hung on
the walls of a meeting room, it often becomes a lightning rod for criticism as
people identify specific features that cannot possibly exist. Their reasons vary
from geologic arguments based on depositional environment to geophysical arguments based on surface seismic studies to engineering arguments based on
pressure tests to biologic arguments based on simple principles of food supply.
In all of these situations, the benefit of the stochastic model may simply be
to focus the views of a wide variety of experts and, in so doing, to point the
direction to an improved model. Whether this new and clearer understanding
is used to improve the stochastic model or is used to make intelligent adaptations to more traditional models, the initial stochastic model has still played
an important role in the evolution of an appropriate spatial model.
Stochastic models for assessing the impact of uncertainty
Many of the engineers and scientists who are responsible for forecasting the
future behavior of a spatial process, such as fluid flow in a reservoir or contaminant transport in an aquifer, have a solid understanding of the fact that there
is always uncertainty in any spatial model. They know that their forecasts
are based on a specific model that is designated the "best" model, but they
also want a couple of other models, a "pessimistic" one and an "optimistic"
one, so that they can assess whether the decisions and policies that have been
developed on the "best" model are flexible enough to handle the uncertainty.
When used for this kind of study, a stochastic approach offers a variety of spatial models, each one of which is consistent with the available data. We can sift
through these many possible renditions and select one that looks pessimistic
and another that looks optimistic.
Stochastic models for Monte Carlo risk analysis
Stochastic modelling offers the hope of supporting a Monte Carlo type of risk
analysis since the various renditions it produces are not only plausible, in the
sense that they honor all the available data, but they are also intended to be
equally probable in the sense that any one of them is as likely a representation
of the true underlying spatial phenomenon as any other one. In such studies,
hundreds of alternate models, if not thousands, are generated and processed to
produce a distribution of possible values for some critical engineering parameter - breakthrough time of a contaminant, for example, or an economically
optimal ore/waste classification; these distributions are then used to optimize
some decision by minimizing an objective function.
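The sketch below illustrates this Monte Carlo workflow in its barest form. It is only an outline under stated assumptions: generate_realization and transfer_function stand in for a stochastic simulation routine and a flow-and-transport (or other) response calculation, neither of which is specified here.

```python
# A sketch of the Monte Carlo workflow described above; generate_realization
# and transfer_function are hypothetical placeholders for a stochastic
# simulation routine and the "transfer function" that turns a spatial model
# into a critical engineering parameter (e.g. breakthrough time).
import numpy as np

def monte_carlo_distribution(n_realizations, generate_realization, transfer_function):
    """Return the distribution of a critical parameter over many realizations."""
    outcomes = []
    for seed in range(n_realizations):
        model = generate_realization(seed)          # one equally probable rendition
        outcomes.append(transfer_function(model))   # its engineering consequence
    return np.array(outcomes)
```

The resulting array of outcomes can then be summarized as percentiles or a histogram and fed into whatever decision rule or objective function the study calls for.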
A critical aspect of this use of stochastic modelling is the belief that there is a "space
of uncertainty" and that the stochastic modelling technique can produce outcomes that sample this space fairly. We know that we cannot possibly look at
all possible outcomes, but we believe that we can get a fair representation of
the whole spectrum of possibilities. When used for this purpose, we hope that
our stochastic modelling technique does not have any systematic tendency to
show us optimistic or pessimistic scenarios since we are going to use all of the
outcomes as equally probable representations of reality.
While these types of studies overlap somewhat with those described in the
previous section, the important difference between the two is that Monte Carlo
risk analysis involves the notion of a probability distribution while the type
of study described in the previous section does not. The previous section
referred to studies where models are chosen by sifting through a large set
and extracting two that seem to be plausible, but extreme, scenarios. No-one
particularly cares whether the "optimistic" case is the 95th percentile or the
99th percentile; it's just an example of a case that produces a very optimistic
result while still honoring the same basic data. In Monte Carlo risk analysis,
however, we depend on the notion of a complete probability distribution of
possible outcomes and on the hope that our stochastic approach is producing
outcomes that fairly represent the entire space.
Stochastic models for honoring spatial variation
Though stochastic techniques can produce many outcomes, stochastic modelling studies often use a single outcome as the basis for forecasting and planning. When used in this way, spatial simulation techniques are attractive not
so much for their ability to generate many outcomes but rather for their ability
to produce outcomes that have realistic spatial variation.
Over the past decade it has become increasingly clear that forecasts of behavior
of spatial processes are more accurate when based on models that reflect the
actual spatial heterogeneity of the phenomenon under study. In the petroleum
industry, there are countless examples of reservoir performance predictions
whose failure is due to the use of overly simplistic models; similar disasters
can be found in environmental remediation plans that were based on smooth
contour maps that could not possibly be used to assess the effect of unexpected
"hot spots". Most traditional methods for modelling spatial processes end up
with a model that is too smooth and continuous, gently undulating from one
sample location to the next rather than showing the spatial variation that
is known to exist between samples. Such smoothness often leads to biased
predictions and poor decisions - contaminants escape from their repositories
much quicker than expected, for example, or the actual volumes of remediated
soil end up being much larger than originally forecast.
Though using a stochastic method to produce a single outcome is often viewed
with disdain by those who prefer to pump hundreds of possible outcomes
through Monte Carlo risk analysis, it can be argued that even a single outcome
from a stochastic approach is a better basis for prediction and decision-making
than a single outcome from a technique that does not honor spatial variation.
TWO COMMON STOCHASTIC METHODS
Sequential simulation
The family of "sequential" procedures all make use of the same basic algorithm:
1. Choose at random a node where no simulated value exists.
2. Estimate the local conditional probability distribution (lcpd).
3. Draw at random a single value from the lcpd.
4. Include the newly simulated value in the set of conditioning data.
5. Repeat steps 1 through 4 until all grid nodes have a simulated value.
[Figure 1: Sequential simulation of porosities in an aquifer. Panels: (a) conditioning data, (b) randomly chosen location, (c) lcpd, (d) simulated value.]
The only significant difference between the various sequential procedures is
estimation of the lcpd; any technique that estimates the lcpd can be used to
drive sequential simulation. Multigaussian kriging, for example, produces an
estimate of the lcpd by assuming that it follows the normal distribution and
estimating its mean and standard deviation; this was the approach used in
the diagram in Figure 1. Indicator kriging is an example of another technique
that could be used to estimate the lcpd. With this procedure, no assumption
is made about the shape of the distribution and the lcpd is constructed by
directly estimating the probability of being below a series of thresholds or the
probability of being within a set of discrete classes.
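To make the five-step recipe concrete, the sketch below strings the steps together for the multigaussian case, with the lcpd estimated by simple kriging. It is a minimal illustration only, not the GSLIB sgsim routine; the function names, the zero-mean assumption and the exponential covariance model with its parameters are all choices made for this sketch.

```python
# A minimal sketch of sequential Gaussian simulation: the lcpd at each node is
# assumed normal, with mean and variance taken from simple kriging of the
# nearby (original + previously simulated) data. Assumes at least one
# conditioning datum and data already transformed to normal scores.
import numpy as np

def cov(h, sill=1.0, range_m=10.0):
    """Exponential covariance model (an assumption of this sketch)."""
    return sill * np.exp(-3.0 * h / range_m)

def sequential_gaussian_sim(grid_xy, data_xy, data_val, seed=0, nmax=12):
    rand = np.random.default_rng(seed)
    known_xy = [tuple(p) for p in data_xy]      # conditioning locations
    known_val = list(data_val)                  # conditioning values
    path = rand.permutation(len(grid_xy))       # step 1: visit nodes in random order
    simulated = {}
    for idx in path:
        x0 = np.asarray(grid_xy[idx], dtype=float)
        pts = np.asarray(known_xy, dtype=float)
        vals = np.asarray(known_val, dtype=float)
        d = np.linalg.norm(pts - x0, axis=1)
        near = np.argsort(d)[:nmax]             # use only the nmax closest data
        pts, vals, d = pts[near], vals[near], d[near]
        # step 2: estimate the lcpd -- here by simple kriging under a
        # multigaussian assumption, so the lcpd is normal with the kriging
        # mean and variance
        C = cov(np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2))
        c0 = cov(d)
        w = np.linalg.solve(C + 1e-10 * np.eye(len(d)), c0)
        mean = float(w @ vals)                  # simple kriging with a zero mean
        var = max(float(cov(0.0) - w @ c0), 0.0)
        # step 3: draw a single value at random from the lcpd
        value = rand.normal(mean, np.sqrt(var))
        # step 4: add the newly simulated value to the conditioning data
        known_xy.append(tuple(grid_xy[idx]))
        known_val.append(value)
        simulated[idx] = value
    # step 5: the loop repeats until every grid node has a simulated value
    return np.array([simulated[i] for i in range(len(grid_xy))])
```

Because the lcpd estimation is isolated in one place, an indicator-kriging estimate of the distribution could be substituted at step 2 without changing the rest of the loop.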
[Figure 2: Simulated annealing for a simple sand/shale simulation. Panels: (a) initial grid, with sands and shales randomly assigned in the correct global proportion (each pixel is 20 m horizontally by 3 m vertically); (b) iteration no. 1: statistics improve, keep this grid; (c) iteration no. 2: statistics deteriorate, go back to last grid; (d) iteration no. 463: statistics improve, keep this grid; (e) iteration no. 464: statistics deteriorate, go back to last grid; (f) iteration no. 1987: statistics improve, keep this grid. Each panel tabulates actual versus target statistics: net-to-gross (target 70%), average shale length (target 60 m) and average shale thickness (target 10 m).]
Simulated annealing
In simulated annealing the spatial model is constructed by iterative trial-and-error. Figure 2 shows a simplified annealing procedure for the problem of
producing a sand/shale model that has 70% sand, an average shale length
of 60 meters and an average thickness of 10 meters. In Figure 2a the pixels
in the model are randomly initialized with sands and shales in the correct
global proportion. The random assignment of the sands and shales causes the
average shale length and thickness to be too short. Figure 2b shows what
happens after we perturb the initial model by swapping a sand pixel with a
shale pixel. Such a swap will not affect the global proportions of sand and
shale; it does, however, make a tiny improvement in both the average shale
length and thickness. Figure 2c shows the grid after a second swap. We're not
quite as lucky with this second swap since it makes the average shale length
and thickness a little worse, so we reject this second swap, go back to the
previous grid shown in Figure 2b and try again.
This swapping is repeated many more times and, after each swap, the average
shale length and thickness are checked to see if we are any closer to the target
values. If a particular swap does happen to get us closer to the length and
thickness statistics we want then we keep the swap; on the other hand, if
the swap causes the image to deteriorate (in the sense that the length and
thickness statistics are further from our target) then we undo the swap, go
back to the previous grid and try again.
Figure 2d shows the grid after the 463rd swap, which happens to be a good
one. At this point, 66 of the attempted 463 swaps were good and were kept,
while the remaining 397 were bad and were not kept. Figure 2e shows the
effect of the 464th swap, a bad one that is not kept because it causes the
statistics to deteriorate. Figure 2f shows what the grid looks like after the
1987th swap. At this point there were 108 good swaps and 1879 bad ones, and
the shale length and thickness statistics are very close to our target values.
The key to simulated annealing is minimizing the deviation between the grid
statistics and target values; the function that describes this deviation is usually called the "energy" or "objective" function. In the example in Figure 2,
where we were trying to match the average length and thickness of the shales,
the energy function was simply the sum of the absolute differences between
the actual average dimensions and our target values: E = |actual average
length - 60| + |actual average thickness - 10|. The annealing procedure tries to
drive the value of E to zero; decreases in energy are good, increases are bad.
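As a concrete illustration of this energy function and of the accept-or-undo rule in Figure 2, the sketch below computes E for a 0/1 grid (1 = shale) whose pixels are 20 m by 3 m, as in the figure, and keeps only those swaps that do not raise E. The grid convention and the helper names are assumptions made for this sketch, not published code.

```python
# A sketch of the Figure 2 energy function (sum of absolute deviations of the
# average shale length and thickness from their 60 m and 10 m targets) and of
# the greedy swap loop that drives it toward zero.
import numpy as np

PIXEL_LENGTH_M = 20.0    # horizontal pixel size, as in Figure 2
PIXEL_THICKNESS_M = 3.0  # vertical pixel size, as in Figure 2

def average_run(grid, axis, pixel_size):
    """Average size (in metres) of contiguous runs of shale along one axis."""
    runs = []
    lines = grid if axis == 1 else grid.T
    for line in lines:
        count = 0
        for v in list(line) + [0]:       # trailing 0 closes any open run
            if v == 1:
                count += 1
            elif count:
                runs.append(count)
                count = 0
    return pixel_size * float(np.mean(runs)) if runs else 0.0

def energy(grid, target_length=60.0, target_thickness=10.0):
    avg_len = average_run(grid, axis=1, pixel_size=PIXEL_LENGTH_M)
    avg_thk = average_run(grid, axis=0, pixel_size=PIXEL_THICKNESS_M)
    return abs(avg_len - target_length) + abs(avg_thk - target_thickness)

def greedy_anneal(grid, n_swaps, seed=0):
    """Greedy rule used in Figure 2: any swap that raises the energy is undone."""
    rand = np.random.default_rng(seed)
    e = energy(grid)
    for _ in range(n_swaps):
        sand = list(zip(*np.where(grid == 0)))
        shale = list(zip(*np.where(grid == 1)))
        (r1, c1) = sand[rand.integers(len(sand))]
        (r2, c2) = shale[rand.integers(len(shale))]
        grid[r1, c1], grid[r2, c2] = 1, 0       # swap a sand pixel with a shale pixel
        # energy is recomputed from scratch here for clarity; practical codes
        # update it incrementally or recalculate only every several swaps
        e_new = energy(grid)
        if e_new <= e:                          # statistics improve: keep this grid
            e = e_new
        else:                                   # statistics deteriorate: undo the swap
            grid[r1, c1], grid[r2, c2] = 0, 1
    return grid
```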
Though Figure 2 displays some of the key concepts in simulated annealing, it
is a very simplified version and does not show some of the important aspects of
practical implementations of simulated annealing. Most simulated annealing
procedures do not recalculate the energy after each swap. What they do
instead is perform several swaps before recalculating the energy. Initially,
when the energy is high (the statistics of the grid are far away from the target
values) many swaps are performed before the energy is recalculated; as the
energy decreases and the grid statistics converge to their desired target values,
the number of swaps between recalculations is decreased.
The second important difference between most practical implementations of
annealing and the simplified example in Figure 2 is tolerance of lack of progress.
In the example shown in Figure 2, any swap that increased the energy (moved
us further away from our target statistics) was immediately undone. In the
common practical implementations of simulated annealing, some increases in
energy are tolerated. The chance of allowing the energy to increase is related
to two factors: the magnitude of the increase and the number of iterations. Increases in energy are more tolerated early on in the procedure; small increases
are always more tolerated than large ones.
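One common way to implement this tolerance, sketched below, is a Boltzmann-style acceptance rule with a decreasing "temperature"; the rule and the geometric cooling schedule are assumptions of this sketch, not the only possibilities.

```python
# Accept an energy increase dE with probability exp(-dE / T), where the
# "temperature" T is lowered as the iterations proceed, so that large
# increases and late increases are rarely kept. Improvements are always kept.
import math
import random

def accept_swap(delta_e, iteration, t0=1.0, decay=0.999):
    if delta_e <= 0.0:                                   # improvements are always kept
        return True
    temperature = max(t0 * (decay ** iteration), 1e-12)  # simple geometric cooling
    return random.random() < math.exp(-delta_e / temperature)
```

Small increases and early iterations give acceptance probabilities close to one, matching the two factors named above.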
A third difference between the example given here and most practical implementations is that the example in Figure 2 does not have any conditioning
information to honor. Conditioning information is quite easily honored in
practice by setting the appropriate pixels to the values observed and then ensuring that these conditioning pixels are never swapped later in the procedure.
The other important difference between most practical implementations of
simulated annealing and the simplified example in Figure 2 is that the energy
function is usually much more complex than the simple one used here. Simulated annealing is a very flexible approach that can produce stochastic models
that honor many different kinds of information; the paper by Kelkar and Shibli
in this volume provides an example of the use of annealing for honoring the
fractal dimension. For any desired features or properties that we can summarize numerically, we can use simulated annealing to honor these features
by including appropriate terms in the energy function. In Figure 2, for example, we wanted to honor both the average shale thickness and the average
shale length so our energy function had two components. In more interesting
practical examples, the energy function might have several more components.
For example, we could have one component that describes how close the grid's
variogram comes to some target variogram, a second component that describes
how similar the synthetic seismic response of the grid comes to some actual
seismic data, and a third component that describes how close a simulated
tracer test on the grid comes to observed tracer test results.
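A multi-component energy function of this kind can be assembled as a weighted sum of separate mismatch terms, as in the sketch below; the three callables in the usage example are dummy placeholders standing in for real variogram, seismic and tracer-test comparisons, and the weights are arbitrary.

```python
# A sketch of a multi-component energy function: each component measures how
# far one grid statistic is from its target, and the total energy is their
# weighted sum. The lambdas below are stand-ins, not real calculations.
def total_energy(grid, components, weights):
    """components: mapping of name -> callable(grid) returning a mismatch >= 0."""
    return sum(weights[name] * fn(grid) for name, fn in components.items())

example = total_energy(
    grid=None,
    components={"variogram": lambda g: 0.0,   # grid variogram vs. target model
                "seismic":   lambda g: 0.0,   # synthetic vs. recorded seismic
                "tracer":    lambda g: 0.0},  # simulated vs. observed tracer test
    weights={"variogram": 1.0, "seismic": 1.0, "tracer": 1.0})
```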
ADVANTAGES AND DISADVANTAGES
Stochastic models as art
If a stochastic model is being used to visually catalyze more critical thinking,
then one of the primary criteria in choosing a method has to be the visual appeal of the final result. For stochastic modelling of geological objects, like sand
channels, pixel-based techniques, like the sequential procedures discussed earlier, often tend to produce geological objects that are too broken up around the
edges. Simulated annealing is able to better preserve the sharp edged features
that many geologists expect to see. There are also object-based techniques,
also often referred to as "boolean" methods, whose visual appeal is hard to
beat because they deal with entire objects rather than elementary pixels.
For stochastic modelling of continuous variables, such as porosities or contaminant concentrations, a boolean approach is not going to completely do
the trick. Such an approach is usually useful as a first step, but the artwork is unconvincing if continuous properties within each population are assigned randomly. The images look more plausible when the properties within
a given population have a prescribed pattern of spatial continuity. With most
of the techniques for modelling continuous variables, the spatial continuity is
described through a variogram model. Though experimental variograms typically show a nugget effect, experience has shown that stochastic models of
continuous properties have more visual appeal if the variogram model used in
the stochastic simulation procedure has no nugget effect.
For stochastic modelling of a very continuous property, such as the thickness of an aquifer, sequential techniques tend not to work well. The same
variogram models that describe very continuous surfaces - the gaussian variogram model, for example, or a power model with an exponent close to 2
- also produce chaotic weighting schemes when some of the nearby samples
fall directly behind closer samples. Since sequential procedures include each
simulated value as hard data that must be taken into account when future
nodes are simulated, and since they are typically performed on a regular grid,
chaotic weighting schemes are virtually guaranteed whenever we are trying to
simulate a very continuous surface.
Most of the stochastic modelling techniques for continuous variables give us
the ability to honor a variogram, or a set of indicator variograms. If we are
trying to produce stochastic artwork that shows specific sedimentary features,
such as a succession of fining-upwards sequences, or a fabric of cross-cutting
features, neither of which is well-described by a variogram, then annealing is
often the most useful approach. As long as we can find a way of numerically
summarizing the sedimentary features that we would like to see, then we can
include appropriate terms in our energy function for annealing.
Stochastic models for assessing the impact of uncertainty
When the goal of a study is to produce a handful of models that capture
a couple of extreme scenarios as well as some intermediate ones, one of the
practical issues that comes to the forefront is computational speed. This is
especially true because it is usually not possible to steer a stochastic model to
an optimistic realization or to a pessimistic one; the only way of finding such
extreme scenarios is to produce many realizations and sift through them all.
It is difficult to get an objective ranking of the speed of stochastic modelling
algorithms since their authors are tempted to brag that their own method
is lightning fast. Though all methods are workable in practice, some require
several days of run-time on fast computers to produce a single realization
despite their authors' enthusiastic claims of speed. Such procedures cannot be
a practical basis for producing many realizations.
As a general rule, sequential procedures are slow because they have to search
through an ever-increasing set of hard data to find the nearby samples that
are relevant for the estimation of the local conditional probability distribution.
The notable exceptions to this are the sequential simulation procedures based
on Markov random fields. These sequential procedures considerably accelerate
the search for previously simulated data by using only the closest ring of data;
some of them accomplish even greater speed by precalculating the kriging
weights that will eventually be needed during the sequential simulation.
Annealing is a highly variable procedure in terms of its run-time. For certain
problems, annealing is very rapid; for others it is abysmally slow. The key
to rapid annealing is the ability to update rather than recalculate the energy
function. If a global statistic has to be recalculated from scratch every time two
pixels are swapped, then annealing is not likely to be speedy; if there is some
convenient trick to updating the previous calculation rather than recalculating
from scratch, then annealing can be very efficient.
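A toy illustration of the update-versus-recalculate distinction is sketched below, using the grid mean as the statistic; real energy terms such as variograms are more involved, but the principle is the same.

```python
# When one pixel changes, a statistic like the mean can be updated in O(1)
# instead of being recomputed over the whole grid, which is O(number of pixels).
import numpy as np

def recalculate_mean(grid):
    return float(grid.mean())                               # full pass, every time

def update_mean(old_mean, old_value, new_value, n_pixels):
    return old_mean + (new_value - old_value) / n_pixels    # O(1) per change
```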
Stochastic models for Monte Carlo risk analysis
For studies that involve Monte Carlo risk analysis, there are two criteria that
need to be considered when selecting a stochastic modelling technique. The
first is computational speed; the second is the issue of whether the stochastic
method is fairly sampling the space of uncertainty. The issue of the space of
uncertainty is one of the thorniest theoretical problems for stochastic modelling. There are some who take the view that the space of uncertainty must
be theoretically defined outside the context of the algorithm itself so that the
user does have some hope of checking that a family of realizations represents a
fair sampling of the space of uncertainty. Others, however, take the view that
the space of uncertainty can only be defined through the algorithm and that it
is, by definition, the family of realizations that is produced when all possible
random number seeds are fed to the algorithm.
This issue of the equiprobability of the realizations and the fair sampling of
the space of uncertainty comes to the forefront when annealing is used as
a post-processor of the results from some other technique. Though there is
some theory that backs up the claim that annealing fairly samples the space of
uncertainty, this theory is entirely based on the assumption that the starting
grid is random (or that the initial energy is very high). Though annealing offers
tremendous flexibility in incorporating and honoring widely different types of
information, there remains an awkward ambiguity about its implicit space of
uncertainty and whether this space is being fairly sampled.
Stochastic models for honoring spatial variation
If the reason for producing a stochastic model is to respect critical heterogeneities, then the choice of a stochastic modelling method should take into
account the type of data that exists on heterogeneities. Annealing is a powerful technique for accommodating information on heterogeneities that comes
through complex, indirect or time-varying measurements, such as pressure
transient tests in aquifers. Annealing has been used, for example, to directly
honor tracer test results; though the procedure is very slow (because the tracer
test response cannot be readily updated when pixels are swapped), the resulting scenarios are tremendously realistic since they capture the connectivity of
the extreme values that is implicit in the actual tracer test observations.
CONCLUSIONS
Though this paper has tended to discuss stochastic simulation methods in
isolation, many practical problems call for a combination of appropriate simulation algorithms. It is unfortunate that much of the technical literature and
many of the commercial software packages for stochastic spatial modelling
promote a single method as better than all others. There is no stochastic
modelling method that is universally best for all possible spatial phenomena.
As stochastic modelling becomes more accepted in the earth sciences and as
new stochastic modelling techniques are developed, it is certain that the most
successful applications will be those that view the growing assortment of methods as a complementary toolkit rather than as a set of competing methods.
REFERENCES
Deutsch, C.V., and Journel, A.G., 1992, GSLIB: Geostatistical Software Library and User's Guide, Oxford University Press, 340 p.
Yarus, J., and Chambers, R., (eds.), 1994, Stochastic Modeling and Geostatistics: Practical Applications and Case Histories, American Association of Petroleum Geologists.
BIOGRAPHICAL SKETCH
R. Mohan Srivastava is a geostatistical consultant with fifteen years of experience in the practice of geostatistics. He is an author of An Introduction
to Applied Geostatistics, the major introductory textbook on the practice of
geostatistics, and of more than thirty technical articles and reports on the
theory and practice of geostatistics. In addition to using geostatistics for both
exploration and development projects with the oil and mining industries, he
has also applied geostatistics to a wide variety of environmental and agricultural problems. He is one of the founding partners of FSS International, a
consulting group that specializes in geostatistics, and currently manages their
Canadian operations from his office in Vancouver.