CHAPTER 18  Simulation
CHAPTER CONTENTS
18.1 What Is Simulation?
18.2 Monte Carlo Simulation
18.3 Random Number Generation
18.4 Excel Add-Ins
18.5 Dynamic Simulation
CHAPTER LEARNING OBJECTIVES

When you finish this chapter, you should be able to:
LO 18-1 List characteristics of situations where simulation is appropriate.
LO 18-2 Distinguish between stochastic and deterministic variables.
LO 18-3 Explain how Monte Carlo simulation is used and why it is called static.
LO 18-4 Explain how to generate random data by using a discrete or continuous CDF.
LO 18-5 Use Excel to generate random data for several common distributions.
LO 18-6 Describe functions and features of commercial modeling tools for Excel.
LO 18-7 Explain the main reasons for using dynamic simulation and queuing models.
18.1 WHAT IS SIMULATION?
A simulation is a computer model that attempts to imitate the behavior of a real system or activity. Models are simplifications that try to include the essentials while omitting unimportant
details. We use simulation to help quantify relationships among variables that are too complex
to analyze mathematically. We can test our understanding of the world by seeing whether our
model leads to realistic predictions. If the simulation’s predictions differ from what really happens, we can refine the model in a systematic way until its predictions are in close enough
agreement with reality.
Simulation Defined
Simulation, from the Latin simulare, means to "fake" or to "replicate." Today, computer simulation is used as a powerful tool to assess the impact of policies, to evaluate performance, to train professionals and more . . . without actually having to experiment with or perturb the real system.
Source: From http://www.iro.umontreal.ca/~vazquez/SimSpiders/GenerRV/index.html
System simulation is the mimicking of the operation of a real system in a computer, such as
the day-to-day operation of a bank, or the value of a stock portfolio over a time period, or the
running of an assembly line in a factory, or the staff assignment of a hospital or a security company. Instead of building extensive mathematical models by experts, simulation software has
made it possible to model and analyze the operation of a real system by non-experts, who are
managers but not programmers.
Source: From http://home.ubalt.edu/ntsbarsh/simulation/sim.htm#rintroduction
A Versatile Tool
Simulation is a rehearsal. We rehearse a play or a speech. We take practice SAT exams. We go to football practice. We do so because we want to make our mistakes before the "real thing," when a major flub would be costly. Business and not-for-profit enterprises know that a walk-through is essential before a change is implemented.
Simulation is planning. Super Bowl planning begins years in advance, picking the site, analyzing hotel capacity, envisioning transportation facilities and entertainment, and so on. Super
Bowl site evaluation involves what-if analysis. Where will traffic bottlenecks develop? How
long, on average, will it take for people to get from the hotels to the stadium? How long will
people have to wait for restaurant seating at peak times? When does planning become simulation, and vice versa? The boundary is not always clear.
Simulation is a behavioral tool that helps decision makers focus on important aspects of a
problem, instead of bickering about details, preferences, or personalities. In creating a simulation model, people are obliged to state their assumptions, name the variables that are important, and suggest hypothesized relationships among the variables. Simulation is not just a
quantitative tool for operations research specialists, but rather a general device to help people
think clearly.
Applications
Simulation models can be quite simple or very complex, depending on the purpose. A queuing
model of customers at a single bank ATM requires only a simple Poisson model of arrivals
and empirical estimates of the mean arrival rate by time of day. A queuing model of a grocery
store with multiple checkout lanes is more complex. A model of Disney World queues during
the busy season is very complex. We sometimes simulate events by using people, as in disaster
simulations to test emergency personnel preparedness for terror attacks or disease outbreaks in
major cities. Simulation studies have improved
• Passenger flows at Vancouver International Airport.
• Hospital surgery scheduling at Henry Ford Hospital.
• Traffic flows in metropolitan Oakland County.
• Waiting lines at Disney World.
• Just-in-time scheduling in Toyota auto assembly plants.
Besides these real activities, you are probably familiar with computer games that simulate car
chases, Kung-Fu, and WWI aerial dogfights. Flight simulators can be as close to real flying as
the budget will allow, ranging from a PC Cessna 172 up to the Boeing 787 used by the airlines
to certify (yes, actually certify) their pilots.
When Do We Simulate?
There are many reasons, but simulation is especially attractive when real experiments are dangerous, costly, or impossible. Training a novice pilot in a flight simulator is safer and cheaper
than using a real airplane. Of course, the simulation must adequately describe reality, or the
simulation is worse than useless. In general, we might consider simulation when
• The system is complex.
• Uncertainty exists in the variables.
• Real experiments are impossible or costly.
• The processes are repetitive.
• Stakeholders can't agree on policy.
Conversely, we are less inclined to simulate when the system is simple, variables are stable or
nonstochastic, real experiments are cheap and nondisruptive, the event will only happen once,
or stakeholders agree on policy.
Sometimes simulation is followed up by “dry runs” with a real system. For example, the
Denver International Airport was designed from the ground up, so nobody knew how its automated baggage handling system would perform. Engineering design showed that it would be
successful. But during a rehearsal with actual bags prior to opening the new airport, bags were
crushed and were routed incorrectly. The airport opening was delayed until the problems could
be resolved. There are limits to any simulation’s ability to mimic the “real thing.” But a priori
analysis through simulation modeling can reveal potential problem areas, sometimes without
actual physical testing of the system.
Advantages of Simulation
Unlike a deterministic model (in which the variables are fixed), simulation lets key variables
change in random but specified ways so that we can see what happens to the bottom-line decision variable(s) of interest. It helps us understand the range of possible outcomes and their
probabilities and allows a sensitivity analysis showing which factors have the most influence on
the outcome. Simulation is useful in business, government, and health care because it
• Is less disruptive than real experiments.
• Forces us to state our assumptions clearly.
• Helps us visualize the implications of our assumptions.
• Reveals system interdependencies.
• Quantifies risk by showing probabilities of events.
• Helps us see a range of possible outcomes.
• Promotes constructive dialogue among stakeholders.
A simulation project has several phases. In Phase I (design) we identify the problem, set
objectives, design the model, and collect data. In Phase II (execution) we do empirical modeling, specify the variables, validate the model, execute the simulation, and prepare reports.
In Phase III (communication) we explain the findings to decision makers. Thinking of it
this way, you can see that simulation can help bring people together in a common way of
thinking.
Risk Assessment
Risk assessment means thinking about a range of outcomes and their probabilities. People don’t
always think this way. Automobile executives often want their marketing staff to provide a single, most likely sales volume forecast for a new vehicle. Accountants for an electric utility are
asked to provide a cash flow forecast with a single, most likely prediction. You want to know
what grade you will get in your statistics class.
But variation is inevitable. A point estimate for a random variable is almost certain to be
wrong. If you are a “B” student, you can’t be sure of a “B” on every exam. The “B” is only the
average. In general, if X is normally distributed and you predict that the next item sampled
will be equal to the mean, you are ignoring the distribution around the mean. Remember the
Empirical Rule (68 percent within μ ± 1σ, 95 percent within μ ± 2σ, etc.)? It would assist a
decision maker to know the 95 percent range of possible values for the decision variable, as
well as the “most likely” value μ. That is the point of risk assessment. It is especially useful
when the model is so complex that it is difficult to study mathematically.
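To make the point concrete, here is a minimal sketch in Python (the sales figures are illustrative assumptions, not data from this chapter) that contrasts a single point forecast with the 95 percent range of simulated outcomes.

import numpy as np

rng = np.random.default_rng()

# Hypothetical forecast: monthly sales with mean 500 units and sigma 60 (illustrative values)
mu, sigma = 500, 60
sales = rng.normal(mu, sigma, 10_000)            # many possible outcomes, not just the mean

low, high = np.percentile(sales, [2.5, 97.5])    # middle 95 percent of simulated outcomes
print(f"Point forecast: {mu}")
print(f"95 percent range: {low:.0f} to {high:.0f}")   # roughly mu +/- 2 sigma, per the Empirical Rule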
Components of a Simulation Model
Simulation models can be classified in various ways, but they have some things in common.
Table 18.1 summarizes the components of a simulation model in general terms. Simulation variables can be either deterministic (nonrandom or fixed) or stochastic (random). If a variable is stochastic, we must hypothesize its distribution (normal, exponential, etc.). By allowing the stochastic variables to vary, we can study the behavior of the
output variables that interest us to establish their ranges and likelihood of occurrence. We
are also interested in the sensitivity of our output variables to variation in the stochastic
input variables.
There are two broad types of simulation models: static simulation (time isn’t explicit) and
dynamic simulation (events occurring sequentially over time). Dynamic simulation requires specialized software, while simple static simulation can be done in Excel spreadsheets. Therefore, we
will begin by discussing static simulation, using Excel functions. Then we’ll discuss commercial
software that can facilitate static simulation, and finally take a brief look at dynamic simulation.
TABLE 18.1 Components of a Simulation Model

Component: List of deterministic factors F1, F2, . . . , Fm
Explanation: These are quantities that are known or fixed, or whose behavior we choose not to model (i.e., exogenous).

Component: List of stochastic input variables V1, V2, . . . , Vk
Explanation: These are quantities whose value cannot be known with certainty and are assumed to vary randomly.

Component: List of output variables O1, O2, . . . , Op
Explanation: These are stochastic quantities that are important to a decision problem, but whose value depends on things in the model and whose distribution is not easily found.

Component: Assumed distribution for each stochastic input variable
Explanation: These are chosen from known statistical distributions, such as normal, Poisson, triangular, and so on.

Component: A model that specifies the rules or formulas that define the relationships among the Fs and Vs
Explanation: Formulas may be accounting identities such as Profit = Revenue − Cost or behavioral hypotheses such as Car Sales = b0 + b1(Income after Taxes) + b2(Net Worth) + b3(3-month T-Bill Rate).

Component: A simulation method that produces random data from the specified distributions and captures the results
Explanation: This is a programming language (such as VBA), although it may be embedded invisibly in a spreadsheet with built-in functions like Excel's =RAND() or other add-ins.

Component: An interface that summarizes the model's inputs, outputs, and simulation results
Explanation: Typically, spreadsheet tables or graphs to summarize the outcomes of the simulation.
18.2 MONTE CARLO SIMULATION
Static simulation, in which time is not considered, uses the Monte Carlo method. The computer creates the values of the stochastic input variables. However, "random" does not
mean “chaotic” because we specify the distribution (e.g., normal) and its parameters (e.g., μ
and σ). Then we draw repeated samples from each distribution—often hundreds or thousands
of iterations. Each sample yields one possible outcome for each stochastic variable. By studying the results, we can see the range of possibilities and how frequently each outcome occurs.
For each output variable of interest, we usually look at percentiles (e.g., quartiles) as well
as the mean, based on many samples. We usually make a histogram or a similar visual display
of the results. We also do this for each stochastic input variable, to verify that the sampling is
being done correctly (i.e., to make sure the desired distribution is being sampled).
Which Distribution?
You can use any distribution for a stochastic input variable. But in a static simulation (e.g., for
financial modeling), some are used more than others. Table 18.2 shows four probability distributions that are of interest because they correspond to the way managers often think and can
easily be simulated in Excel.
Suppose that the price of aluminum is a stochastic input in your monthly cash flow forecasts for the next 12 months. You want to choose a model to represent the price of aluminum. The uniform model lets the price vary anywhere within the range a to b, with no
central tendency. The normal model allows symmetric variation about a historical mean, if
you know the historical standard deviation or use some form of the Empirical Rule such as
σ = range/6. The triangular model allows you to state the range a to b but also allows a best
guess c without forcing you to assume symmetry around the mean. The exponential model
describes a variable that usually is very near zero but could have very high values.
TABLE 18.2 Some Useful Distributions

Normal N(μ, σ)
  The familiar bell-shaped curve. Symmetric, with a peak in the middle and gradually tapering tails (roughly μ − 3σ to μ + 3σ).
  Pro: Familiar, well-known. Con: Extreme outcomes possible.

Triangular T(a, b, c)
  Has a central peak (mode c) and clear end points (minimum a, maximum b).
  Pro: Easy to understand. Con: Harder to simulate.

Uniform U(a, b)
  Specify upper and lower limits; every value between a and b is equally likely (density 1/(b − a)).
  Pro: Easy to understand. Con: Range may be too broad.

Exponential Expon(λ)
  Ideal for waiting time. Mode is zero; large values are rare.
  Pro: For highly skewed data. Con: Extreme values possible.
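For readers who prefer to prototype outside Excel, all four distributions in Table 18.2 can be sampled in a few lines of Python with NumPy. This is only a sketch; the aluminum-price parameters are hypothetical, chosen to illustrate the range a to b, the best guess c, and the σ = range/6 rule described above.

import numpy as np

rng = np.random.default_rng(seed=1)    # seed fixed only so the illustration is reproducible
n = 1_000                              # number of Monte Carlo draws

# Hypothetical aluminum-price inputs (not data from the chapter)
a, b = 1600.0, 2000.0                  # assumed low and high price per ton
c = 1750.0                             # assumed "best guess" (mode) for the triangular model
mu, sigma = (a + b) / 2, (b - a) / 6   # normal model using the sigma = range/6 rule

price_uniform    = rng.uniform(a, b, n)         # U(a, b): no central tendency
price_normal     = rng.normal(mu, sigma, n)     # N(mu, sigma): symmetric about the mean
price_triangular = rng.triangular(a, c, b, n)   # NumPy arguments are (min, mode, max)
wait_exponential = rng.exponential(2.0, n)      # exponential with mean 2.0: mode near zero, long right tail

for name, x in [("uniform", price_uniform), ("normal", price_normal),
                ("triangular", price_triangular), ("exponential", wait_exponential)]:
    lo, hi = np.percentile(x, [5, 95])
    print(f"{name:11s}  mean = {x.mean():8.2f}   5th-95th percentile = {lo:8.2f} to {hi:8.2f}")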
EXAMPLE 18.1 Three-Product Revenue Forecast

The Axolotl Corporation sells three products (A, B, and C). Prices are set competitively and are assumed constant. The quantity demanded, however, varies widely from month to month. To prepare a revenue forecast, the firm sets up a simple simulation model of its input variables, as shown in Table 18.3. The output variable of interest is total revenue PAQA + PBQB + PCQC.

TABLE 18.3 Simulation Setup for Revenue Calculation

Price (deterministic): PA = 80 (Product A); PB = 150 (Product B); PC = 400 (Product C)

Quantity (stochastic input):
  Product A: normal, QA ~ N(50, 10), with μ = 50, σ = 10
  Product B: triangular, QB ~ T(0, 5, 40), with Min = 0, Mode = 5, Max = 40
  Product C: exponential, QC ~ Expon(λ), with λ = 2.5

Revenue (stochastic output): PAQA (Product A); PBQB (Product B); PCQC (Product C)
You can see that variation in the quantity demanded would make it difficult to predict total revenue. You could predict its mean, based on the mean of each distribution,
but what about its range? Simulation reveals things that are not obvious. The results of
a static simulation using 100 Monte Carlo iterations are shown in Table 18.4 and summarized in Figure 18.1.
TABLE 18.4 Results of 100 Iterations of Revenue Simulation

Percentile       Product A   Product B   Product C   Total Revenue
Min                  26           1           0           4,180
5%                   34           4           0           4,745
25%                  44           8           1           5,943
50%                  50          14           2           7,000
75%                  56          20           3           8,340
95%                  64          32           6          10,022
Max                  69          35          11          10,780
Sample Mean       49.85       15.11        2.30           7,335
Expected Mean        50          15         2.5           7,250

Note: For product A (normal), the mean demand is μA = 50. For product C (exponential), the mean demand is μC = 2.5. For product B (triangular), the mean demand is μB = (a + b + c)/3 = (0 + 40 + 5)/3 = 15. Assuming independence, the mean total revenue is PAμA + PBμB + PCμC = (80)(50) + (150)(15) + (400)(2.5) = 4,000 + 2,250 + 1,000 = 7,250, which compares well with the simulation mean. Although product A contributes the most to total revenue at the mean, this may not be the case in a particular simulation because demand can fluctuate.
FIGURE 18.1 Histograms for 100 Iterations of Revenue Simulation (four panels: demand for Product A, demand for Product B, demand for Product C, and total revenue)
These results suggest that Axolotl’s revenue could be as low as $4,180 or as high as
$10,780. There is a 50 percent chance (between the 25th and 75th percentiles) of seeing
revenue between $5,943 and $8,340. The median revenue seems to be below the mean,
suggesting that total revenue is right-skewed. That is to be expected because both the
triangular (product B) and exponential (product C) are right-skewed distributions (you
can also see the skewed distribution of demand for products B and C by looking at their
simulation results). The normal distribution (product A) is reflected in the simulation
results, which are symmetric and lie well within the μ ± 3σ limits.
This simulation could be repeated another 100 times by clicking a button. Simulation
results will vary, but as long as the number of iterations is reasonably large, you will see
considerable stability. The results shown here are typical of this static model. If you wish
to play with this model further, it is available in the LearningStats downloads (see end-of-chapter McGraw-Hill Connect® resources). Color-coding is used in the spreadsheet
and graphs to distinguish inputs from outputs.
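For readers who want to replicate Example 18.1 outside the spreadsheet, the sketch below re-creates the 100-iteration Axolotl simulation in Python with NumPy (it is not the LearningStats workbook itself). Because the draws are random, its percentiles will differ somewhat from Table 18.4 on any given run; the exponential demand is rounded to whole units, as in the chapter.

import numpy as np

rng = np.random.default_rng()
iterations = 100

# Deterministic prices and stochastic demands from Table 18.3
pa, pb, pc = 80, 150, 400
qa = rng.normal(50, 10, iterations)               # Product A: N(50, 10)
qb = rng.triangular(0, 5, 40, iterations)         # Product B: triangular, min 0, mode 5, max 40
qc = np.round(rng.exponential(2.5, iterations))   # Product C: exponential, mean 2.5, rounded to integers

revenue = pa * qa + pb * qb + pc * qc             # output variable: total revenue

print("Percentiles of total revenue:")
for p in (0, 5, 25, 50, 75, 95, 100):
    print(f"  {p:3d}%  {np.percentile(revenue, p):10,.0f}")
print(f"Sample mean = {revenue.mean():,.0f}; expected mean = {pa*50 + pb*15 + pc*2.5:,.0f}")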
18.3 RANDOM NUMBER GENERATION
Basic Concept: Inverse CDF
Random numbers are at the heart of any simulation. So how do we generate random data? In
general, if you know F(x), the cumulative distribution function (CDF) of your distribution, you
generate a uniform U(0, 1) random number R and then find F⁻¹(R), where F⁻¹ is the inverse CDF. Essentially, what you have to do is set F(x) = R and then solve for x, as illustrated in Figures 18.2 and 18.3. However, this is sometimes easier said than done because finding F⁻¹ may be tricky, especially for discrete distributions as shown in Figure 18.3. If you use a programming language, it is not difficult, and there are plenty of commercial packages that do it.
But it is harder if you are a do-it-yourself person who wants to use only the functions available
within Excel.
FIGURE 18.2 Random x from Continuous CDF (generate R on the vertical axis, set F(x) = R, and read x from the horizontal axis)

FIGURE 18.3 Random x from Discrete CDF (the CDF is a step function, so x is the smallest value whose cumulative probability reaches R)
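The inverse-CDF idea is easy to code directly. Below is a minimal Python sketch using only the standard library: for a continuous distribution (the exponential) we solve F(x) = R algebraically, and for a discrete distribution we walk up the cumulative probabilities until they first reach R. The discrete example reuses the values and probabilities from Exercise 18.6.

import math
import random

def exponential_inverse_cdf(mean_time):
    """Continuous case: F(x) = 1 - exp(-x/mean), so x = -mean * ln(1 - R)."""
    r = random.random()                  # R ~ U(0, 1)
    return -mean_time * math.log(1.0 - r)

def discrete_inverse_cdf(values, probabilities):
    """Discrete case: return the smallest value whose cumulative probability reaches R."""
    r = random.random()
    cumulative = 0.0
    for x, p in zip(values, probabilities):
        cumulative += p
        if r <= cumulative:
            return x
    return values[-1]                    # guard against floating-point round-off

# Five exponential waits with mean 2.5, and five draws from the discrete demand of Exercise 18.6
print([round(exponential_inverse_cdf(2.5), 2) for _ in range(5)])
print([discrete_inverse_cdf([0, 50, 100, 200, 500, 1000],
                            [.40, .25, .15, .10, .05, .05]) for _ in range(5)])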
Random Data in Excel
Table 18.5 shows some Excel functions to create random data from a few of the more common distributions. After you use them in your own spreadsheet, you will begin to see how they
work. Some of the end-of-chapter exercises ask you to use these functions, so look them over
carefully.
TABLE 18.5 Creating Random Data in Excel

Uniform U(0, 1): =RAND()
  Built-in Excel function.

Uniform U(a, b): =RANDBETWEEN($A$1, $A$2)
  $A$1 is the minimum and $A$2 is the maximum (or use cell names like Xmin and Xmax). Note that RANDBETWEEN returns integers; for a continuous uniform, use =$A$1+($A$2-$A$1)*RAND().

Normal N(0, 1): =NORM.S.INV(RAND())
  Excel's inverse standard normal function.

Normal N(μ, σ): =NORM.INV(RAND(), $A$1, $A$2)
  $A$1 is the mean and $A$2 is the standard deviation (or use cell names like Mu and Sigma).

Exponential Expon(λ): =-LN(RAND())*$A$1
  $A$1 is the desired mean of the exponential distribution, that is, the mean time between arrivals, 1/λ, if λ is the Poisson arrival rate (or use a cell name like Lambda).

Triangular T(a, b, c): No single-cell formula, but it can be done in Excel with two cells (see LearningStats).
  Better to use @Risk, XLSim, or LearningStats.

Binomial B(n, π): =BINOM.INV($A$1, $A$2, RAND())
  $A$1 is the number of trials and $A$2 is the probability of success; RAND() supplies the cumulative probability to be inverted.
Other Ways to Get Random Data
Excel’s Data Analysis > Random Number Generation will create random data for uniform, normal,
Bernoulli, binomial, and Poisson distributions (see Figure 18.4). MegaStat (MegaStat > Random
Numbers) makes uniform, normal, and exponential random data (see Figure 18.5). Minitab
(Calc > Random Data) offers a very broad menu of distributions (see Figure 18.6). There are
even websites that will give you guaranteed random numbers! For spreadsheet Monte Carlo
simulation, it is best to use a specialized package such as @Risk, XLSim, or YASAI that offers
many built-in functions to create random data and keep track of your simulation results (see
Useful Websites and Related Reading at the end of this chapter).
FIGURE 18.4 Generating Random Data in Excel
Bootstrap Method
In recent years, much attention has been paid to resampling to estimate unknown parameters, most notably the bootstrap method. It can be applied to just about any parameter.
FIGURE 18.5 Generating Random Data in MegaStat

FIGURE 18.6 Generating Random Data in Minitab
Although it requires specialized software, the bootstrap method is easy to explain. It rests
on the principle that the sample reflects everything we know about the population. From a
sample of n observations, we use Monte Carlo random integers to take repeated samples of
n items with replacement from the sample and to calculate the statistic of interest for each
sample. The average of these statistics is the bootstrap estimator. The standard deviation from
these repeated estimates is the bootstrap standard error. The distribution of these repeated
estimates is the bootstrap distribution (which generally is not normal). The bootstrap method
of estimation (see LearningStats downloads for Unit 08 for more details and a spreadsheet
simulation) avoids having to assume normality when constructing a confidence interval or
finding percentiles.
The accuracy of the bootstrap estimator increases with the number of resamples (e.g., we
might resample the sample 10,000 times to get many possible variations on the sample information). The percentiles of the resulting distribution of sample estimates provide the bootstrap
confidence interval. For example, a 90 percent confidence interval would be formed by the 5th
and 95th percentiles. No assumption of normality is required. Before the advent of powerful
computers, such an approach was unthinkable. When data are badly skewed, the bootstrap is
an excellent choice. Resampling is not just for means. There are bootstrap estimators for most
common statistics, as well as for those that are hard to study mathematically. Some statistical
packages now offer bootstrap estimators. Resampling is in the mainstream of statistics, even
though it is less familiar to most people in business (check the web for further information).
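The resampling procedure just described takes only a few lines of code. This sketch (Python with NumPy) bootstraps the mean of a small, made-up sample; the same loop works for a median, a trimmed mean, or any other statistic by changing one line.

import numpy as np

rng = np.random.default_rng()

# A small, made-up, right-skewed sample (illustrative only)
sample = np.array([2.1, 2.4, 2.6, 3.0, 3.3, 3.9, 4.4, 5.8, 7.2, 12.5])

B = 10_000                                        # number of bootstrap resamples
boot_means = np.empty(B)
for i in range(B):
    resample = rng.choice(sample, size=sample.size, replace=True)
    boot_means[i] = resample.mean()               # statistic of interest (here, the mean)

estimate = boot_means.mean()                      # bootstrap estimator
std_error = boot_means.std(ddof=1)                # bootstrap standard error
lo, hi = np.percentile(boot_means, [5, 95])       # 90 percent percentile confidence interval
print(f"Bootstrap mean = {estimate:.2f}, SE = {std_error:.2f}, 90% CI = ({lo:.2f}, {hi:.2f})")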
18.4 EXCEL ADD-INS
We can generate our own random data within Excel. However, Excel isn't optimized for statistics and doesn't keep track of your results. Other vendors (e.g., @Risk) have created Excel Add-Ins offering more features. They not only calculate probabilities but also permit Monte Carlo
simulation to draw repeated samples from a distribution.
Illustration: Using @Risk

BobsNetWorth

Table 18.6 shows some examples of @Risk input functions that can be pasted directly into cells in an Excel spreadsheet. These functions are intuitive and easy to use. The input cell becomes active and will change each time you update the spreadsheet by pressing F9. We illustrate @Risk simulation with Bob's net worth. The spreadsheet is shown in Figure 18.7 (it is also in LearningStats downloads for Chapter 18). For those who do not have access to @Risk software (probably a majority), the downloads also contain "pure Excel" versions with reduced capabilities. Where relevant, cell range names are used (e.g., Net Worth) instead of cell references (e.g., H9). Comments have been added to cells that specify stochastic inputs (purple highlight) or stochastic outputs (orange highlight). Output cells are bottom-line variables of interest, while input cells are the drivers of the output(s).

TABLE 18.6 Examples of @Risk Distributions

Normal: =RiskNormal(47,2)
  Normal with mean μ = 47 and standard deviation σ = 2.

Truncated normal: =RiskTnormal(47,2,43,51)
  Normal with mean μ = 47 and standard deviation σ = 2. The lowest allowable value is 43 and the highest allowable value is 51 (set at μ ± 2σ).

Triangular: =RiskTriang(3,8,14)
  Lowest value is 3, most likely value is 8, highest value is 14.

FIGURE 18.7 Bob's Stochastic Balance Sheet
On a given day, Bob views his actual net worth as dependent on the market value of his
assets. Some of his asset and liability values are deterministic (e.g., checking account, savings
account, student loans) while the values of his car, beer can collection, and stocks (and hence
his net worth) are stochastic. Table 18.7 shows the @Risk functions that describe each stochastic input (distribution, skewness, coefficient of variation).
TABLE 18.7 Distributions Used in Bob's Stochastic Balance Sheet

Mustang: =RiskTriang(8000,10000,15000)
  Triangular, right-skewed.

Beer can collection: =RiskTriang(0,50,1000)
  Triangular, very right-skewed.

Garland stock: =RiskNormal(15.38,3.15)
  Normal, symmetric, large CV.

Oxnard stock: =RiskNormal(26.87,2.02)
  Normal, symmetric, small CV.

ScamCo stock: =RiskNormal(3.56,0.26)
  Normal, symmetric, small CV.
These functions tell us about Bob’s reasoning. For example, Bob thinks that if he finds the
right buyer, his Mustang could be worth up to $15,000. He is quite sure he won’t get less than
$8,000, and he figures that $10,000 is the most likely value. When you paste an @Risk input
function for the desired statistical distribution in a cell, its contents become stochastic, so that
every time the spreadsheet is updated a new value will appear. For example, the input cell for
Oxnard containing the @Risk function =RiskNormal(26.87,2.02) is a random variable with μ =
26.87 and σ = 2.02. All @Risk distributions are available from Excel’s Insert > Function menu.
An output cell is calculated as usual except that =RiskOutput()+ is added in front of the cell’s
contents; for example, =RiskOutput()+TotalAssets-TotalDebt, where TotalAssets and TotalDebt
are defined elsewhere in the spreadsheet (of course, you can also use cell references like C12
and H7 instead of cell names). The @Risk toolbar appears on the regular Excel toolbar, as
illustrated in Figure 18.8.
FIGURE 18.8 Simulation Ribbon in @Risk
The @Risk setup screens and typical settings are shown in Figure 18.9. You can get up to
10,000 Monte Carlo replications. @Risk keeps track of all simulated values of the input and
output cells, and will let you see various displays of the simulation results. For each stochastic
input cell, choose a distribution (see Figure 18.10). Then click the Start Simulation icon on the
top menu bar. Various reports can be generated and placed either in a new workbook or in the
active workbook. You will get a menu of graphs that are available. Distributions of simulated
input and output variables can be displayed in tables (statistics, percentiles) or charts (histograms, cumulative distributions). You can reveal any desired percentile on the cumulative
distribution, or use a tornado chart to reveal sensitivities of output variables to all the input
variables.
By default, the middle 90 percent of the outcomes are shown. In Figure 18.11, we see the
distribution of net worth. The shape is symmetric but platykurtic, with mean $9,498. In the
simulation, Bob’s net worth exceeded $10,000 about 40 percent of the time. You can drag the
FIGURE 18.9 Typical Simulation Settings

FIGURE 18.10 Distributions to Choose
vertical sliders to show different percentiles. This is easy, but inexact. To select integer percentiles, use the arrows at the bottom of the histogram. The ascending cumulative distribution
(Figure 18.12) reveals additional detail.
For sensitivity analysis, click on the Tornado graph icon to see a list of factors that explain
variation in net worth, listed in order of importance, as illustrated in Figure 18.13. Bars that
face right are positively affecting the output variable, while bars that face left (if any) are affecting the output variable negatively. Sensitivities can range from −1.0 to +1.0, with values near
zero indicating lack of importance. Here, the Mustang value and Oxnard and Garland stock
prices are the most important input variables, while beer cans are less important and the
ScamCo stock contributes little variation to net worth.
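If you do not have @Risk, a rough stand-in for the tornado chart is to correlate each column of simulated input values with the simulated output; correlations near +1 or −1 flag influential inputs, while values near zero flag unimportant ones. The Python sketch below only approximates what @Risk computes, and the toy net-worth model (including the 100-share holding) is hypothetical, although the Mustang, beer can, and Oxnard parameters come from Table 18.7.

import numpy as np

def tornado_sensitivities(inputs, output):
    """Return (input name, correlation with output), sorted by absolute correlation."""
    pairs = [(name, float(np.corrcoef(values, output)[0, 1]))
             for name, values in inputs.items()]
    return sorted(pairs, key=lambda pair: abs(pair[1]), reverse=True)

rng = np.random.default_rng()
n = 500
inputs = {
    "Mustang":   rng.triangular(8000, 10000, 15000, n),    # parameters from Table 18.7
    "Beer cans": rng.triangular(0, 50, 1000, n),            # parameters from Table 18.7
    "Oxnard":    rng.normal(26.87, 2.02, n),                # price per share, from Table 18.7
}
# Toy output: the 100-share holding is a hypothetical illustration, not Bob's actual position
net_worth = inputs["Mustang"] + inputs["Beer cans"] + 100 * inputs["Oxnard"]

for name, r in tornado_sensitivities(inputs, net_worth):
    print(f"{name:10s} correlation with net worth: {r:+.2f}")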
Pros and Cons
While these Excel Add-In packages are powerful, their cost may strain academic budgets.
Fortunately, some textbooks offer a student version at modest extra cost. Nonetheless, student lab setup is likely to require a skilled site administrator, and some training is required
to use these packages effectively. LearningStats downloads (see end-of-chapter McGraw-Hill
Connect® resources) include exercises using @Risk, along with instructions, but there also are
Excel-only versions for those who don’t have access to @Risk.
FIGURE 18.11 Histogram of 500 Iterations of Net Worth

FIGURE 18.12 Cumulative Distribution of 500 Iterations of Net Worth

FIGURE 18.13 Tornado Graph for Sensitivity Analysis of Five Inputs
18.5 DYNAMIC SIMULATION
Discrete Event Simulation
In a dynamic simulation, input variables are defined at discrete points in time (such as every
minute) or continuously (changing smoothly over time). The most common form of dynamic
simulation is discrete event simulation, in which a simulation clock advances from one event to the next and the system state is assessed at those distinct points in time. If you already knew something about simulation before reading this chapter, you were probably thinking of dynamic simulation, which involves computer modeling of
flows (e.g., airport passenger arrivals, automobile assembly line flow, hospital surgical suite
scheduling).
In discrete event simulation, we observe a “snapshot” of the system state at any given
moment. The system activity may be represented visually, even using animation to help us
visualize flows, queues, and bottlenecks. The emphasis in discrete event simulation is on measurements such as
• Arrival rates
• Service rates
• Length of queues
• Waiting time
• Capacity utilization
• System throughput
Although it is fairly easy to understand and extremely powerful, this kind of simulation
requires specialized software, and therefore will not be discussed in detail here. But we can
make a few general comments. Most universities offer courses on simulation, if you want to
know more.
Queuing Theory
If customer arrivals per unit of time follow a Poisson distribution and service times follow
an exponential distribution, some rather interesting theorems have been proven regarding
the length of customer queues, mean waiting times, facility utilization, and so on. This
is known as queuing theory. Queuing theory is a topic covered in courses such as operations management, simulation, or decision modeling. It flows from what you have learned
already in statistics. The simplest situation is a single-server facility (such as one ticket window) whose customers form a single, well-disciplined queue (first-come, first-served), whose arrivals from an infinite source are Poisson distributed with mean λ (customer arrivals per unit of time), and whose service times are exponentially distributed with mean 1/μ (units of time per customer), or equivalently rate μ (customers served per unit of time). If
we assume that λ < μ to prevent the buildup of an infinite queue, then the following may be
demonstrated:
Expected wait time: λ / [μ(μ − λ)] units of time    (18.1)

Expected length of waiting line: λ² / [μ(μ − λ)] customers    (18.2)

Expected facility utilization: (λ/μ) × 100%    (18.3)
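As a quick check of formulas 18.1-18.3, the short sketch below plugs in an arrival rate and a service rate (the numbers are hypothetical; the only requirement is that λ < μ).

lam = 0.8   # lambda: customer arrivals per minute (hypothetical)
mu = 1.0    # mu: customers served per minute (hypothetical)
assert lam < mu, "an infinite queue builds up unless lambda < mu"

expected_wait = lam / (mu * (mu - lam))            # formula 18.1, in units of time
expected_line = lam ** 2 / (mu * (mu - lam))       # formula 18.2, in customers
utilization = lam / mu                             # formula 18.3

print(f"Expected wait in line: {expected_wait:.2f} minutes")
print(f"Expected number waiting: {expected_line:.2f} customers")
print(f"Expected facility utilization: {utilization:.0%}")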
Simulating Queuing Models
Theorems such as these are quite useful in facility planning. Unfortunately, the situation can
quickly become more complex, as shown in Figure 18.14. We could have multiple servers with
a single, well-disciplined queue (as in most banks and post offices) or multiple servers with
FIGURE 18.14 Various Queuing Situations (four patterns of customer queues and service facilities: single queue with a single server; single queue with multiple servers; multiple queues with multiple servers; and a single queue with serial servers)
multiple queues (as in a grocery store checkout) or one queue with multiple serial servers (each
with its own service rate) where you must complete one step in the process before going to the
next (as in some hospital admissions or manufacturing processes).
Further, many applications of queuing models do not meet the assumption of Poisson arrivals or exponential service times. In that case, the results shown in formulas 18.1–18.3 are no
longer valid. This is where simulation modeling is most valuable because we can evaluate queuing systems with many different structures in a computer model that allows for the complexity
needed.
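When the Poisson/exponential assumptions fail, a small simulation can stand in for formulas 18.1-18.3. The sketch below is a bare-bones, single-server, first-come-first-served queue simulation in Python; it accepts whatever interarrival and service-time samplers you pass in (here, exponential interarrivals with a hypothetical triangular service time). It is meant only to illustrate the idea, not to replace a package such as Arena.

import numpy as np

def simulate_single_server(n_customers, draw_interarrival, draw_service):
    """Single queue, single server, first-come-first-served.
    Returns the mean wait in line and the fraction of time the server is busy."""
    arrival_times = np.cumsum([draw_interarrival() for _ in range(n_customers)])
    server_free_at = 0.0
    total_wait = 0.0
    total_service = 0.0
    for arrival in arrival_times:
        start = max(arrival, server_free_at)   # wait in line if the server is still busy
        service = draw_service()
        total_wait += start - arrival
        total_service += service
        server_free_at = start + service
    return total_wait / n_customers, total_service / server_free_at

rng = np.random.default_rng()
mean_wait, utilization = simulate_single_server(
    100_000,
    draw_interarrival=lambda: rng.exponential(1 / 0.8),   # Poisson arrivals, 0.8 per minute
    draw_service=lambda: rng.triangular(0.3, 0.8, 1.5),   # hypothetical non-exponential service times
)
print(f"Simulated mean wait = {mean_wait:.2f} minutes; server utilization = {utilization:.0%}")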
One popular simulation package is called Arena. With Arena, we can model both small and
large business processes that contain interrelated steps and process feedback loops. Arena
allows businesses involved in process improvement projects to test process changes in a realistic
simulation before making costly, and perhaps permanent, changes to a real process. Arena can
be integrated with Microsoft Visio, a process diagramming software tool, and is therefore frequently used in business courses on process design and improvement techniques.
CHAPTER SUMMARY

Simulation is used to study processes that involve stochastic (random) variables as well as deterministic (nonrandom) variables. Simulation is useful when real experiments are impossible or costly.
Simulation is useful for planning, risk assessment, and what-if analysis. Simulation helps decision makers assess the likelihood of various possible outcomes and the effects of their decisions. Monte Carlo
models are static because time is not explicit. They use computer-created values of the stochastic input
variable(s) from specified distributions (e.g., normal) and their parameters (e.g., μ and σ). From many
such samples, an empirical distribution is created for each output variable of interest. To generate a random data value for a given input variable, we generate a uniform random deviate U(0, 1) and then apply the inverse CDF of the assumed distribution. In a dynamic simulation, time is explicit. Its applications include models of arrivals, service times, and queues (e.g., in a grocery
checkout lane). For simple queuing models, there are formulas for mean waiting times, queue length,
and facility utilization. However, specialized (and costly) software is needed for detailed simulation
of flows over time and the resulting empirical distributions, so dynamic simulation is not ordinarily
studied in introductory statistics.
KEY TERMS
Arena
@Risk
bootstrap method
deterministic variables
dynamic simulation
input variable
inverse CDF
models
Monte Carlo method
output variable
queuing theory
risk assessment
simulation
static simulation
stochastic variables
Visio
what-if analysis
XLSim
YASAI
CHAPTER REVIEW

1. Define (a) simulation, (b) deterministic variable, (c) stochastic variable, and (d) risk assessment.
2. Explain how simulation is (a) a planning tool and (b) a behavioral tool.
3. Name three applications of simulation.
4. When is simulation appropriate? When is it not appropriate?
5. (a) List five advantages of simulation. (b) What are the three stages of simulation modeling?
6. What are the two types of simulation? How are they different?
7. Explain the meaning of these components of a simulation model: (a) deterministic factors, (b) stochastic input variable, (c) output variable, (d) model, and (e) interface.
8. (a) Why does this chapter focus mainly on static simulation? (b) What is Monte Carlo simulation?
Explain how it works.
9. Name three distributions that are useful in simulation and give their main characteristics.
10. To generate random numbers, list some distributions covered in this textbook that can be created in
(a) Excel, (b) MegaStat, and (c) Minitab.
11. Explain the meaning of (a) dynamic simulation, (b) well-disciplined queue, and (c) infinite queue.
12. (a) List five variables that can be studied using a queuing model. (b) List two kinds of queuing models that are more complex than a single-server queue.
13. (a) Why do we need packages like @Risk? (b) What factors limit their use?
CHAPTER EXERCISES
18.1 (a) Use Excel's function =NORM.INV(RAND(),50,8) to create 100 random numbers from the normal
distribution N(50, 8). Hint: Refer to Table 18.5. (b) Calculate the sample mean and standard deviation
and then compare them with their theoretical values. (c) Is the range what you would expect from this
normal distribution? Explain. (d) Make a histogram or similar display. Does the shape appear normal?
18.2 Create 100 random numbers with mean 0 and standard deviation 1 from the standard normal
distribution N(0, 1). Use the built-in random number generators (not your own functions) from as
many of these as you can: (a) Excel, (b) MegaStat, (c) Minitab. List pros and cons of each package’s capabilities and ease of use.
18.3 (a) Use Excel’s function =RAND() to create 100 uniform U(0, 1) random numbers. (b) Calculate the
sample mean and standard deviation and compare them with their theoretical values (see Chapter 7).
18.4 (a) Use Excel’s Data Analysis > Random Number Generation to create 100 Poisson random numbers
with mean λ = 2.5. (b) Calculate the sample mean and standard deviation and compare them with
their theoretical values (see Chapter 6).
18.5 (a) Use Excel’s Data Analysis > Random Number Generation to create 100 binomial random numbers
with n = 30 and π = .25. (b) Calculate the sample mean and standard deviation and compare them
with their theoretical values (see Chapter 6).
18.6 (a) Use Excel’s Data Analysis > Random Number Generation to create 100 discrete random numbers
with values x = 0, 50, 100, 200, 500, 1,000 whose respective probabilities are P(x) = .40, .25, .15,
.10, .05, .05. (b) Calculate the sample mean and compare it with its theoretical value, using the definition of E(X)
in Chapter 6.
18.7 (a) Use the method in Table 18.5 to create 100 exponential random numbers with mean waiting
time 1/λ = 0.40. Discuss the characteristics of the resulting sample (minimum, maximum, mode,
etc.). (b) Why did the simulation of product demand in this chapter (see Tables 18.3 and 18.4)
round the exponential values to integers?
18.8 Use the freezer simulation in LearningStats Unit 18 McGraw-Hill Connect® downloads to observe
temperature samples. (a) Press the F9 key 5 times. Did you get any sample means above the UCL
or below the LCL? Repeat. (b) How did Monte Carlo simulation help demonstrate control charts
for a mean?
Freezer
18.9 In LearningStats Unit 8 McGraw-Hill Connect® downloads, use the bootstrap simulation for a
mean. (a) Select a normal population. Observe the histogram of the 20 sample items as you press
F9 10 times. Are the samples consistent with the specified population shape? Press F9 10 more
times, observing the confidence intervals. Are they similar? (b) Repeat the previous exercise, using
a uniform population. (c) Repeat the previous exercise, using a skewed population. (d) Repeat the
previous exercise, using a strongly skewed population. (e) How did simulation help demonstrate
the bootstrap concept? Note: The mean is 50 for each distribution.
Bootstrap
18.10 In LearningStats Unit 18 McGraw-Hill Connect® downloads, choose one of the three simulation projects. The three scenarios use only the normal N(μ, σ) and triangular T(a, b, c) distributions because they are flexible yet easy to understand. Each scenario involves a problem faced
by a hypothetical character named Bob. If you have access to @Risk, use the @Risk version.
Otherwise, use the Excel-only version. Use the 100-iteration worksheet to answer the questions
posed below:
a. Scenario 1: Bob's Stochastic Balance Sheet. Questions: (i) How often does Bob's net
worth exceed $10,000? (ii) What is Bob’s expected net worth? (iii) On a given day, is there a
50 percent chance that Bob’s net worth exceeds $10,000? (iv) What are the 25 percent and
75 percent points of his daily net worth? The 5 percent and 95 percent points? (v) Verify that
the stock prices have the desired means and standard deviations. (vi) Verify that the Mustang
and beer can means are equal to (Min + Max + Mode)/3.
b. Scenario 2: Bob’s Mail-Order Business. Questions: (i) What is Bob’s expected profit? (ii) What
is Bob’s median profit? (iii) Estimate the 5 percent and 95 percent outcomes and interpret
them. (iv) Estimate the quartiles of Bob’s profit and interpret them. (v) Why might Bob undertake this business venture? Why might he not? Explain. (vi) Compare the mean of each input
variable with its expected value. The expected value of a triangular variable is (Min + Max +
Mode)/3. (vii) If Bob had enough capital to mail twice as many flyers, would it change the
outcome?
c. Scenario 3: Bob's Statistics Grades. Questions: (i) What do the parameters say about Bob's self-evaluation? (ii) What is Bob's expected grade? (iii) What is the chance that Bob's overall grade
will be below 70? What is his chance of exceeding 80? (iv) Estimate and interpret the quartile
points for his overall grade. (v) Estimate and interpret the 5 percent and 95 percent points for
his overall grade. (vi) From the histogram, what grade range is most likely? (vii) To check the
simulation, inspect the mean and standard deviation of each input variable. Are they about
what they are supposed to be?
MONTE CARLO SIMULATION PROJECT
18.11 Objective: To demonstrate that an expected value E(X) is an average, and that there is variation
around the average.
Scenario: A life insurance company charges $1,500 for a $100,000 one-year term life insurance
policy for a 60-year-old white male. If the insured lives, the company gains $1,500. If the insured
dies, the company loses $98,500 (the $100,000 face value of the policy minus the $1,500
prepaid premium). The probability of the insured’s death during the year is .012.
Instructions: (a) Calculate the company’s expected payout ($100,000 with probability .012,
$0 with probability .988). (b) Calculate the expected net profit by subtracting the expected
payout from $1,500. (c) To perform a Monte Carlo simulation of net profit for 1,000 insurance
policies, enter the Excel formula =IF(RAND()<0.012,-98500,1500) into cell A1 and then copy
the formula into cells A1:A1000. (d) To get the simulated net profit, in cell C1 enter the formula
=AVERAGE(A1:A1000). (e) Press F9 10 times, each time writing down the average in cell C1. (f)
Was the average net profit close to the expected net profit from part (b)? (g) To count the number of times the company had to pay, enter =COUNTIF(A1:A1000,"=-98500") in cell C2. (h)
Press F9 10 times and write down how many times the company had to pay.
Bottom Line Questions: How much variability is there in the number of claims paid and in the
net profit for 1,000 policies? Why is the expected value an incomplete description of net profit?
Why does an insurance company need to issue lots of insurance policies in order to have stable
profits? Would 1,000 policies be enough?
Useful Websites

@Risk: www.palisade.com
Simio: www.simio.com/index.html
XLSim: http://xlsim.software.informer.com/
YASAI: www.yasai.rutgers.edu/

RELATED READING
Banks, Jerry; John Carlson; Barry L. Nelson; and David Nicol. Discrete Event System Simulation. 5th ed.
Pearson, 2010.
Chernick, Michael R., and Robert A. LaBudde. An Introduction to Bootstrap Methods with Applications to
R. Wiley, 2012.
Conway, Richard W., and John O. McClain. "The Conduct of an Effective Simulation Study." INFORMS
Transactions on Education 3, no. 3 (May 2003), pp. 13–22.
Gentle, James E. Random Number Generation and Monte Carlo Methods. Springer-Verlag, 2003.
Kelton, W. David; Randall P. Sadowski; and Nancy B. Swets. Simulation with Arena. 6th ed. McGraw-Hill,
2015.
McLeish, Don L. Monte Carlo Simulation and Finance. Wiley, 2005.
Robert, Christian P., and George Casella. Monte Carlo Statistical Methods. 2nd ed. Springer-Verlag, 2004.
Rossetti, Manuel D. Simulation Modeling and Arena, 2nd ed. Wiley, 2015.
Rubinstein, Reuven Y., and Dirk P. Kroese. Simulation and the Monte Carlo Method. 3rd ed. Wiley, 2016.
Vose, David. Risk Analysis: A Quantitative Guide. 3rd ed. Wiley, 2007.
CHAPTER 18 More Learning Resources
You can access these LearningStats demonstrations through McGraw-Hill’s Connect® to help you
understand simulation.
Topic: Overview
  Overview of Simulation

Topic: Using Excel
  How to Create Random Data
  Random Normal Data Explained

Topic: Examples
  Axolotl's Three Products
  Freezer Temperature Control Chart
  Your Annual Fuel Cost
  Bootstrap Simulation

Topic: Excel projects
  Project Instructions
  Bob's Balance Sheet (Excel)
  Bob's New Business (Excel)
  Bob's Statistics Grades (Excel)

Topic: @Risk projects
  Bob's Balance Sheet (@Risk)
  Bob's New Business (@Risk)
  Bob's Statistics Grades (@Risk)