Slide set 14
Stat 330 (Spring 2015)
Last update: February 3, 2015
Gamma Example (Baron 4.7)
Compilation of a computer program consists of 3 blocks that are processed sequentially, one after the other. Each block takes an Exponential time with a mean of 5 minutes, independently of the other blocks.
(a) Compute the expectation and variance of the total compilation time.
For a Gamma random variable T with α = 3 and λ = 1/5,

E(T) = α/λ = 3/(1/5) = 15 (min)   and   Var(T) = α/λ² = 3/(1/5)² = 75 (min²)
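As a quick numerical check (a minimal sketch, not part of the original slides, assuming SciPy is available), stats.gamma reproduces these values; note that SciPy parameterizes the Gamma by a scale equal to 1/λ:

```python
from scipy import stats

# SciPy's Gamma uses shape a = alpha and scale = 1/lambda.
alpha, lam = 3, 1/5
T = stats.gamma(a=alpha, scale=1/lam)

print(T.mean())  # 15.0 (minutes)
print(T.var())   # 75.0 (minutes^2)
```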
(b) Compute the probability for the entire program to be compiled in less
than 12 minutes.
This can be done using repeated integration by parts (see Baron p.87).
Gamma Example (Cont’d)
However, we will use the Gamma-Poisson formula:
For T ∼ Gamma(α, λ) and X ∼ Po(λt),

P(T > t) = P(X < α)   and   P(T ≤ t) = P(X ≥ α)

We need P(T < 12), where T ∼ Gamma(3, 1/5).
Note that t = 12, so X ∼ Po(12/5), i.e., X ∼ Po(2.4).
From the Gamma-Poisson formula,

P(T < 12) = P(T ≤ 12) = P(X ≥ 3) = 1 − Po_{2.4}(2) = 1 − 0.5697 = 0.4303
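A minimal sketch (assuming SciPy is available) verifying part (b) both directly and via the Gamma-Poisson formula:

```python
from scipy import stats

# Direct Gamma cdf: P(T < 12) with alpha = 3, lambda = 1/5 (scale = 5).
p_gamma = stats.gamma(a=3, scale=5).cdf(12)

# Gamma-Poisson formula: P(T <= 12) = P(X >= 3) for X ~ Po(2.4).
p_poisson = 1 - stats.poisson(mu=2.4).cdf(2)

print(p_gamma, p_poisson)  # both approximately 0.4303
```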
Erlang distribution
Hits on a web page. Recall: we modeled the waiting time until the first hit as Exp(2). How long do we have to wait for the second hit?
To calculate the waiting time for the second hit, we add the waiting time until the first hit and the time between the first and the second hit.
Let Y1 = the waiting time until the first hit. Then Y1 ∼ Exponential with λ = 2.
Let Y2 = the time between the first and second hits. By the memoryless property of the exponential distribution, Y2 has the same distribution as the waiting time for the first hit.
That is, Y2 ∼ Exponential with λ = 2.
We want the total time for the second hit X := Y1 + Y2.
This is the sum of two independent exponential random variables.
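A small simulation sketch (not from the slides; assumes NumPy) illustrating this setup before we derive the distribution of X:

```python
import numpy as np

rng = np.random.default_rng(330)
lam, n = 2.0, 100_000

y1 = rng.exponential(scale=1/lam, size=n)  # time until the first hit
y2 = rng.exponential(scale=1/lam, size=n)  # time between first and second hits
x = y1 + y2                                # waiting time until the second hit

# Approximately 1.0 and 0.5, i.e. k/lam and k/lam^2 with k = 2.
print(x.mean(), x.var())
```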
Erlang distribution (cont’d)
If Y1, . . . , Yk are k independent exponential random variables with parameter
λ, their sum X has an Erlang distribution:
X := Y1 + Y2 + · · · + Yk = Σ_{i=1}^{k} Yi

is Erlang(k, λ). The Erlang density f_{k,λ} is

f_{k,λ}(x) = λ^k x^{k−1} e^{−λx} / (k − 1)!   for x ≥ 0

where k is called the stage parameter and λ the rate parameter.
Note that this is the same density as that of Gamma(k, λ): it is the Gamma density whose shape parameter α = k is an integer.
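A minimal sketch (assuming NumPy and SciPy) confirming that the Erlang density coincides with the Gamma(k, λ) density for integer k:

```python
import math
import numpy as np
from scipy import stats

k, lam = 2, 2.0
x = np.linspace(0.01, 3, 5)

# Erlang density from the formula above.
erlang_pdf = lam**k * x**(k - 1) * np.exp(-lam * x) / math.factorial(k - 1)
# Gamma(k, lam) density from SciPy (scale = 1/lam).
gamma_pdf = stats.gamma(a=k, scale=1/lam).pdf(x)

print(np.allclose(erlang_pdf, gamma_pdf))  # True
```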
Erlang distribution (cont’d)
Expected value and variance of an Erlang distributed variable X can be
computed using the properties of expected value and variance for sums of
independent random variables:
E[X] = E[ Σ_{i=1}^{k} Yi ] = Σ_{i=1}^{k} E[Yi] = k · (1/λ)

Var[X] = Var[ Σ_{i=1}^{k} Yi ] = Σ_{i=1}^{k} Var[Yi] = k · (1/λ²)
Alternatively, we can use the formulas for the expectation and variance of a Gamma(k, λ) random variable.
This is so because the Erlang(k, λ) distribution is the same as a Gamma(k, λ) distribution where k is an integer.
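A minimal sketch (assuming SciPy) comparing these formulas with SciPy's closed-form Gamma moments:

```python
from scipy import stats

k, lam = 2, 2.0
mean, var = stats.gamma(a=k, scale=1/lam).stats(moments="mv")

print(mean, k / lam)     # 1.0  1.0
print(var, k / lam**2)   # 0.5  0.5
```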
Erlang distribution (cont’d)
Thus, in order to compute the distribution function, we can use the Gamma-Poisson formula.
We need the cdf of the Erlang random variable X, denoted by Erlang_{k,λ}(x):

Erlang_{k,λ}(t) = P(X ≤ t)

In order to use the Gamma-Poisson formula, we now consider the distribution of X as X ∼ Gamma(k, λ).
From the Gamma-Poisson formula,

P(X ≤ t) = P(Y ≥ k), where Y ∼ Po(λt)

Thus

Erlang_{k,λ}(t) = P(Y ≥ k) = 1 − P(Y ≤ k − 1) = 1 − Po_{λt}(k − 1)
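A minimal sketch (assuming SciPy) of the Erlang cdf computed through the Gamma-Poisson formula, cross-checked against the Gamma cdf; the helper name erlang_cdf is ours, not SciPy's:

```python
from scipy import stats

def erlang_cdf(t, k, lam):
    # P(X <= t) for X ~ Erlang(k, lam), via 1 - Po_{lam*t}(k - 1).
    return 1 - stats.poisson(mu=lam * t).cdf(k - 1)

print(erlang_cdf(1.0, 3, 2.0))                 # about 0.3233
print(stats.gamma(a=3, scale=1/2.0).cdf(1.0))  # same value
```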
Erlang distribution: Example
Hits on a web page (continued)
1. What is the density of the waiting time until the second hit?
We previously defined X as the sum of two exponential variables, each with rate λ = 2.
Thus X has an Erlang distribution with stage parameter k = 2, and the density of X is

f_X(x) = f_{2,2}(x) = 4x e^{−2x}   for x ≥ 0
2. Find the probability that we have to wait > 1 min for the 3rd hit.
Z := the waiting time until the third hit has an Erlang(3, 2) distribution. Thus

P(Z > 1) = 1 − Erlang_{3,2}(1) = 1 − (1 − Po_{2·1}(3 − 1)) = Po_{2}(2) = 0.677
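A minimal sketch (assuming NumPy and SciPy) reproducing both parts of this example:

```python
import numpy as np
from scipy import stats

# Part 1: the Erlang(2, 2) density equals 4x e^{-2x}.
x = np.array([0.5, 1.0, 2.0])
print(stats.gamma(a=2, scale=1/2).pdf(x))  # matches 4 * x * np.exp(-2 * x)

# Part 2: P(Z > 1) for Z ~ Erlang(3, 2), via the Poisson cdf.
print(stats.poisson(mu=2 * 1).cdf(3 - 1))  # about 0.677
```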
We will come across the Erlang distribution again when modelling the waiting times in queueing systems, where customers arrive at a Poisson rate and need an exponential time to be served.
Normal distribution
The normal density is a “bell-shaped” density. The density has two parameters, µ and σ², and is

f_{µ,σ²}(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}   for −∞ < x < ∞
The expected value and variance of a normally distributed r.v. X are:

E[X] = ∫_{−∞}^{∞} x f_{µ,σ²}(x) dx = . . . = µ

Var[X] = ∫_{−∞}^{∞} (x − µ)² f_{µ,σ²}(x) dx = . . . = σ².
Thus, the parameters µ and σ 2 are actually the mean and the variance of
the N (µ, σ 2) distribution.
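A minimal sketch (assuming SciPy) checking these two integrals numerically for one choice of µ and σ²:

```python
import numpy as np
from scipy import integrate, stats

mu, sigma2 = 1.0, 2.0
f = stats.norm(loc=mu, scale=np.sqrt(sigma2)).pdf

mean, _ = integrate.quad(lambda x: x * f(x), -np.inf, np.inf)
var, _ = integrate.quad(lambda x: (x - mu)**2 * f(x), -np.inf, np.inf)

print(mean, var)  # approximately 1.0 and 2.0
```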
Normal densities for several parameters
µ determines the location of the peak on the x-axis; σ² determines the “width” of the bell.
Normal distribution (cont’d)
The cumulative distribution function (cdf) of X is

N_{µ,σ²}(t) := F_{µ,σ²}(t) = ∫_{−∞}^{t} f_{µ,σ²}(x) dx
Unfortunately, there is no closed form for this integral; however, computing probabilities requires evaluating it.
Fortunately, tables of the cdf of the standard normal distribution N(0, 1), the normal distribution with mean 0 and variance 1, are available.
We can use these tables to compute the cdf of the normal distribution N(µ, σ²) for any values of µ and σ. How?
We use the fact that X ∼ N(µ, σ²) can be standardized to obtain a random variable Z ∼ N(0, 1) as follows:

Z = (X − µ)/σ
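A minimal sketch (assuming SciPy) showing that standardization gives the same cdf values as evaluating N(µ, σ²) directly; the values µ = 1, σ² = 2, t = 2 are ours for illustration:

```python
import numpy as np
from scipy import stats

mu, sigma, t = 1.0, np.sqrt(2.0), 2.0

direct = stats.norm(loc=mu, scale=sigma).cdf(t)
standardized = stats.norm().cdf((t - mu) / sigma)

print(direct, standardized)  # identical, about 0.7602
```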
Standard Normal distribution
If X ∼ N(µ, σ²), then Z = (X − µ)/σ ∼ N(0, 1). Thus

E[Z] = (1/σ)(E[X] − µ) = 0

Var[Z] = (1/σ²) Var[X] = 1
It is common practice to denote the cdf N_{0,1}(t) by Φ(t) (more commonly written as Φ(z)).
The values of Φ(z) are tabulated in tables usually called standard normal tables (or Z tables); however, these tables are (sometimes) only available for positive values of z.
Such a table is sufficient because Φ(−z) = 1 − Φ(z), as f_{0,1} is symmetric around 0.
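A minimal sketch (assuming SciPy) of the symmetry identity Φ(−z) = 1 − Φ(z):

```python
from scipy import stats

Phi = stats.norm().cdf
for z in (0.5, 1.0, 2.31):
    print(Phi(-z), 1 - Phi(z))  # each pair agrees
```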
Standard Normal distribution (cont’d)
Recall that the area under the density to the left of a vertical line at z represents the probability P(Z < z).
It is easy to see that the area in the left tail is equal to the area in the right tail:

P(Z ≤ −z) = P(Z ≥ +z).

This is true because P(Z ≥ +z) = 1 − P(Z ≤ z) = 1 − Φ(z) = Φ(−z) = P(Z ≤ −z), which proves the above statement.
Using the Z-table
Suppose Z is a standard normal random variable.
• P(Z < 1) = Φ(1) = 0.8413.   (straight look-up)

• P(0 < Z < 1) = P(Z < 1) − P(Z < 0) = Φ(1) − Φ(0) = 0.8413 − 0.5 = 0.3413.   (look-up)

• P(Z < −2.31) = 1 − Φ(2.31) = 1 − 0.9896 = 0.0104,
  or P(Z < −2.31) = Φ(−2.31) = 0.0104.   (look-up)

• P(|Z| > 2) = P(Z < −2) + P(Z > 2) = 2(1 − Φ(2)) = 2(1 − 0.9772) = 0.0456,
  or P(|Z| > 2) = P(Z < −2) + P(Z > 2) = 2Φ(−2) = 2 × 0.0228 = 0.0456.   (look-up)
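A minimal sketch (assuming SciPy) reproducing these look-ups without a printed Z table:

```python
from scipy import stats

Phi = stats.norm().cdf

print(Phi(1))           # 0.8413
print(Phi(1) - Phi(0))  # 0.3413
print(Phi(-2.31))       # 0.0104
print(2 * Phi(-2))      # 0.0455 (table rounding gives 0.0456)
```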
Using the Z-table (cont’d)
Suppose X ∼ N(1, 2) and that we need to calculate P(1 < X < 2).
Standardizing X gives Z := (X − 1)/√2. Thus:

P(1 < X < 2) = P( (1 − 1)/√2 < (X − 1)/√2 < (2 − 1)/√2 )
             = P(0 < Z < 0.5√2) = Φ(0.71) − Φ(0)
             = 0.7611 − 0.5 = 0.2611.
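A minimal sketch (assuming SciPy) checking this example without the table; the small discrepancy comes from rounding 0.5√2 ≈ 0.71 in the table look-up:

```python
import numpy as np
from scipy import stats

mu, sigma = 1.0, np.sqrt(2.0)

via_z = stats.norm().cdf(0.5 * np.sqrt(2)) - stats.norm().cdf(0)
direct = stats.norm(mu, sigma).cdf(2) - stats.norm(mu, sigma).cdf(1)

print(via_z, direct)  # both approximately 0.2602 (table rounding gives 0.2611)
```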
Note that the standard normal table only shows probabilities for z < 3.99.
This is all we need, though, since P (Z ≥ 4) ≤ 0.0001.
Review Examples 4.10, 4.11, and 4.12 from Baron