Simulation

advertisement
CIS 2033 based on
Dekking et al. A Modern Introduction to Probability and Statistics. 2007
Michael Baron. Probability and Statistics for Computer Scientists, CRC 2006
Instructor Longin Jan Latecki
Ch. 6 Simulations
What is a simulation?
One uses a model to create specific situations in order to study the response of
the model to them and then interprets this in terms of what would happen to the
system “in the real world”.
Models for such systems involve random variables, and we speak of probabilistic
or stochastic models, and simulating them is stochastic simulation.
Stochastic simulation of a system means generating values for all the random
variables in the model, according to their specified distributions, and recording
and analyzing what happens
We refer to the generated values as realizations of the random variables
Quick exercise 6.1
Describe how you can simulate a coin toss when instead of a coin
you have a die.
Any ideas on how to simulate a roll of a die if you only have a coin?
Quick exercise 6.1
Describe how you can simulate a coin toss when instead of a coin
you have a die.
Any ideas on how to simulate a roll of a die if you only have a coin?
Repeat the coin tosses if you get TTH or TTT.
Quick exercise 6.2
A random variable Y has outcomes 1, 3, and 4 with the following probabilities:
P(Y = 1) = 3/5, P(Y = 3) = 1/5, and P(Y = 4) = 1/5.
Describe how to construct Y from a U(0, 1) random variable.
6.2 Generating realizations: Bernoulli
Simulations are almost always done by computer, which usually have one or more socalled (pseudo) random number generators. Which mimics U(0,1).
Bernoulli random Variables
Suppose U has a U(0,1) distribution. To construct a Ber(p) random variable for some
0 < p < 1.
so that
A Matlab code:
U = rand;
X = (U<p)
Binomial
Binomial Random Variables
We can obtain a binomial RV as a sum of n independent Bernoulli RVs.
For this we start with n uniform RVs, for example:
A Matlab code:
n = 20; p = 0.68;
U = rand(n,1);
X = sum(U<p)
Geometric
Geometric Random Variables
A while loop of Bernoulli trials generates a geometric RV.
We run the loop until the first success occurs.
Variable X in the while loop counts the number of failures.
A Matlab code:
X=1;
while rand > p
X = X + 1;
end;
X
%at least one trail
%continue while there are failures
%stop at the first success
Arbitrary discrete distribution
Arbitrary Discrete Random Variable X
that takes values x0, x1, … with probabilities pi = P(X=xi) such that p0+p1+…=1.
Algorithm
1. Divide interval [0, 1] into subintervals of lengths pi
A0 = [0, p0)
A1 = [p0, p0+p1)
A2 = [p0+p1, p0+p1+p2),
…
2. Obtain U from Uniform(0, 1), the standard uniform distribution
3. Set X = xi if U belongs to interval Ai
A0
0
X has the desired distribution, since
P( X  xi )  P(U  Ai )  pi
A1
A2
A3
1
Continuous Random Variables
GaussianEx1.m plots histogram of samples drawn from
a Gaussian with mean zero and std one and plots the
Gaussian with mean 1 and std 1.
Values of histogram are normalized so that the sum of
the area of the histogram bars (rectangles) is equal to 1.
This way both plots are directly comparable, since the
area under the Gaussian curve is also one.
x = -5:0.1:5;
g=normpdf(x,0,1);
samples=randn(1000,1);
%samples = normrnd(0,1,1000,1);
[hs, hx] = hist(samples,x);
hs=(hs/1000)*(101/10);
figure; bar(x,hs,'r');
hold on
plot(x,g,'LineWidth',3);
Here we drew samples from the target
distribution, which is usually impossible.
Continuous Random Variables
A simple yet surprising fact:
Regardless of the initial distribution of RV X, the cdf of X, FX(X), has uniform distribution:
Let X be a continues RV with a strictly increasing cdf FX(x). Define an RV as U = FX(X).
Then the distribution of U is Uniform(0, 1).
Proof: Since FX(x) is a cdf, 0 ≤ FX(x) ≤ 1 for all x. Hence values of U lie in [0, 1].
We show that the cdf of U is FU(u)=u, hence U is Uniform(0, 1):
FU (u )  P (U  u )
 P ( FX ( X )  u )
 P ( X  F inv X (u ))
 FX ( F
u
inv
X
(u ))
by definition of cdf
Generating Continuous Random Variables
In order to generate a continues RV X with cdf FX, let us revert the formula U = FX(X).
Then X can be obtained from the standard uniform variable U as
X  F inv X (U )
Proof:
P( X  x)  P( F inv X (U )  x)  P( FX ( F inv X (U ))  FX ( x))
 P(U  FX ( x))
since U is Uniform(0, 1)
 FX ( x)
Algorithm
1. Obtain a sample u from
the standard uniform RV U
2. Compute a sample from X as
x  F inv X (u)
Exponential random variables
Applying this method to exponential distribution
F(x) = 1 – e-λx
F(x) = u ↔ x =
ln (1- u) = Finv(u)
if U has a U(0,1) distribution, then the random variable X is defined by
X = Finv(U) =
ln (1- U) =
ln (U)
So we can simulate a RV with Exp(λ) using a Uniform(0,1) RV
SamplingEx1.m plots histogram of samples drawn from
exponential distribution and plots the exponential distribution
Values of histogram are normalized so that the sum of the area
of the histogram bars (rectangles) is equal to one.
This way both plots are directly comparable.
x = 0:0.1:10;
lambda=1;
f=lambda*exp(-lambda*x);
%sample from uniform dist. U(0,1)
samples=rand(1000,1);
%samples = normrnd(0,1,1000,1);
samples=-(1/lambda)*log(1-samples);
[hs, hx] = hist(samples,x);
hs=(hs/1000)*(101/10);
figure; bar(x,hs,'r');
hold on
plot(x,f,'LineWidth',3);
Quick exercise 6.3
A distribution function F is 0 for x < 1 and 1 for x > 3, and
F(x) = 1/4 (x − 1)2 if 1 ≤ x ≤ 3.
Let U be a U(0, 1) random variable.
Construct a random variable with distribution F from U.
Quick exercise 6.3
A distribution function F is 0 for x < 1 and 1 for x > 3, and
F(x) = 1/4 (x − 1)2 if 1 ≤ x ≤ 3.
Let U be a U(0, 1) random variable.
Construct a random variable with distribution F from U.
6.4 A single server queue
There is a well and people want to determine how powerful of a pump they
should buy. The more power would mean less wait times but also increased cost
Let Ti represent the time between consecutive customers, called interarrival time,
e.g., the time between customer 1 and 2 is T2.
Let Si be the length of time that customer i needs to use the pump
The pump capacity v (liters per minute) is a model parameter that we wish to
determine.
If customer i requires Ri liters of water, then Si = Ri / v
To complete the model description, we need to specify the distribution of T and R
Interarrival times: every Ti has an Exp(0.5) distribution (minutes)
Service requirement: every Ri has U(2,5) distribution (liters)
6.4 cont
Wi denotes the waiting time of customer i,
for customer W1 = 0 since he doesn’t have to wait.
The waiting time of customer i depend on customer i-1: Wi = max{Wi-1 + Si-1 - Ti, 0}
We are interested in average waiting time:
n=
The average wait time for v = 2 turns out
to be around 2 minutes
The average wait time for v = 3 turns out
to be around .5 minutes
plot of (n,
n)
for customers n = 1,2,…
6.4 Work in system
One approach to determining how busy the
pump is at any one time is to record at
every moment how much work there is in
the system.
For example if I am halfway through filling
my 4 liter container and 3 people are
waiting who require 2, 3, and 5 liters,
then there are 12 liters to go;
at v = 2, there are 6 minutes of work in the
system and at v = 3 there is just 4.
The amount of work in the system just
before a customer arrives equals the
waiting time of the customer, this is also
called virtual waiting time.
Figure 6.8 illustrates the work in the system
for v = 2 and v = 3
Download