Slide set 18 Stat 330 (Spring 2015) Last update: February 16, 2015

Stochastic Processes
Review: What is a Random variable?
Definition: A stochastic process is a set of random variables indexed by
some index set, typically time t, and is usually denoted by X(t).
Some remarks:
1. A stochastic process is a mathematical model of reality.
2. Modeling usually requires somehow specifying the joint distribution of
$(X(t_1), \dots, X(t_k))$, i.e., $P(X(t_1) \in A_1, \dots, X(t_k) \in A_k)$.
3. Values of X(t) are called states, the set of all possible values for X(t) is
called the state space.
The example about 'hits on a webpage' is a typical example of a stochastic
process, and it has a special name: Poisson process.
Poisson Process
Review: What are the Exponential and Poisson distributions?
1. Exponential: $P(T \le t) = 1 - e^{-\lambda t}$ for all $t \ge 0$, where T is the
waiting time for a rare event to happen (once).
2. Poisson: $P(X = k) = e^{-\lambda} \lambda^k / k!$ for $k = 0, 1, 2, \dots$, where X is the
number of observations of a rare event during a certain time period (or region of space).
3. pdf of the Exponential distribution: $f_T(t) = \lambda e^{-\lambda t}$ for $t \ge 0$, where λ is the
rate, in units of 1/time. What are E(T) and Var(T)? What are E(X) and Var(X)?
4. Lack of memory property for the Exponential:
$P(T > t + s \mid T > t) = P(T > s)$
(this is key for the Poisson process later)
5. Exponential race: $P(\min(S, T) > t) = P(S > t, T > t) = e^{-(\lambda+\mu)t}$ if
$T \sim \mathrm{Exp}_\lambda$ and $S \sim \mathrm{Exp}_\mu$ are independent. What about $P(\min(T_1, \dots, T_n) > t)$?
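The two Exponential properties in items 4 and 5 can be checked numerically from the survival function $P(T > t) = e^{-\lambda t}$. The following is a minimal Python sketch; the function names and the particular values of the rates and times are illustrative assumptions, not part of the course material.

```python
import math


def exp_survival(rate, t):
    """P(T > t) for T ~ Exp(rate), with t >= 0."""
    return math.exp(-rate * t)


def min_survival(rates, t):
    """P(min(T_1, ..., T_n) > t) for independent exponentials.

    The minimum exceeds t exactly when every T_i exceeds t, so by
    independence the survival probabilities multiply.
    """
    prob = 1.0
    for rate in rates:
        prob *= exp_survival(rate, t)
    return prob


# Illustrative values (assumptions): rates lam, mu and times s, t.
lam, mu, s, t = 2.0, 3.0, 0.5, 1.2

# Lack of memory: P(T > t + s | T > t) = P(T > t + s) / P(T > t).
conditional = exp_survival(lam, t + s) / exp_survival(lam, t)
print(conditional, exp_survival(lam, s))

# Exponential race: the minimum behaves like Exp(lam + mu).
print(min_survival([lam, mu], t), exp_survival(lam + mu, t))
```

The same product argument answers the question in item 5: the minimum of n independent exponentials is again exponential, with rate equal to the sum of the individual rates.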
Poisson process
Definition: A stochastic process X(t) is called a homogeneous Poisson process
with rate λ if
1. for t > 0, X(t) takes values in {0, 1, 2, 3, . . .}.
2. the distribution of an increment depends only on the length of the
interval: for any $0 \le t_1 < t_2$,
$X(t_2) - X(t_1) \sim \mathrm{Po}_{\lambda(t_2 - t_1)}$
3. increments over non-overlapping intervals are independent: for any
$0 \le t_1 < t_2 \le t_3 < t_4$,
$X(t_2) - X(t_1)$ is independent of $X(t_4) - X(t_3)$
Jargon: X(t) is a “counting process” with independent Poisson increments.
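As a small illustration of conditions 2 and 3, the sketch below computes increment probabilities for a homogeneous Poisson process. The helper names `poisson_pmf` and `increment_pmf` and the chosen rate and intervals are assumptions made for this example only.

```python
import math


def poisson_pmf(mean, k):
    """P(X = k) for X ~ Po(mean)."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def increment_pmf(rate, t1, t2, k):
    """P(X(t2) - X(t1) = k) for a homogeneous Poisson process with the given rate."""
    if not 0 <= t1 < t2:
        raise ValueError("need 0 <= t1 < t2")
    return poisson_pmf(rate * (t2 - t1), k)


rate = 2.0  # illustrative rate, e.g. 2 hits per minute

# Condition 2: only the interval length matters, not its position.
p_early = increment_pmf(rate, 0.0, 1.0, 3)
p_late = increment_pmf(rate, 5.0, 6.0, 3)
print(p_early, p_late)

# Condition 3: for non-overlapping intervals the joint probability factorizes,
# e.g. P(1 hit in (0, 1] and 0 hits in (1, 3]).
joint = increment_pmf(rate, 0.0, 1.0, 1) * increment_pmf(rate, 1.0, 3.0, 0)
print(joint)
```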
Example
♣ A counter of the number of hits on our webpage is an example of a
Poisson process with rate λ = 2/min.
♥ Here inter-arrival times are generated from Exp(2). X(t) counts the number of
hits up to time t (in minutes).
♦ For example, we find that X(t) = 3 for t ∈ [5, 8] minutes; i.e., there are exactly 3
hits up to any time between 5 and 8 minutes.
Example (cont’d)
Remarks
1. X(t) can be thought of as the number of occurrences until time t.
2. Similarly, X(t2) − X(t1) is the number of occurrences in the interval
(t1, t2].
3. By the same argument, X(0) = 0, always!
4. The distribution of X(t) is Poisson with parameter λt, since:
$X(t) = X(t) - X(0) \sim \mathrm{Po}_{\lambda(t - 0)}$
Example (Cont’d)
Based on the last example:
For a given Poisson process X(t) we define the occurrence times
$O_0 = 0$, $O_j$ = time of the j-th occurrence = the first t for which $X(t) \ge j$
and the inter-arrival times between successive hits:
$I_j = O_j - O_{j-1}$ for j = 1, 2, . . .
The time until the k-th hit, $O_k$, is therefore given as the sum of inter-arrival
times: $O_k = I_1 + \dots + I_k$.
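To make the relation between X(t), the occurrence times and the inter-arrival times concrete, here is a minimal Python sketch. The occurrence times in `occurrences` are made-up illustrative values (not data from the course), chosen so that X(t) = 3 on [5, 8]; the helper name `count_until` is hypothetical.

```python
import bisect

# Hypothetical occurrence times O_1, ..., O_4 in minutes (illustrative only).
occurrences = [0.4, 1.1, 4.7, 8.3]

# Inter-arrival times I_j = O_j - O_{j-1}, with O_0 = 0.
previous = [0.0] + occurrences[:-1]
inter_arrivals = [o - p for o, p in zip(occurrences, previous)]


def count_until(occurrence_times, t):
    """X(t): the number of occurrences with O_j <= t (times must be sorted)."""
    return bisect.bisect_right(occurrence_times, t)


print(inter_arrivals)
print(count_until(occurrences, 5.0), count_until(occurrences, 8.0))

# O_k is the sum of the first k inter-arrival times (up to rounding error).
k = 3
print(sum(inter_arrivals[:k]), occurrences[k - 1])
```

Using `bisect_right` means an occurrence exactly at time t is counted in X(t), which matches the definition of $O_j$ as the first t with $X(t) \ge j$.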
Equivalence theorem
Equivalence theorem:
X(t) is a Poisson process with rate λ iff the inter-arrival times $I_1, I_2, \dots$
are i.i.d. $\mathrm{Exp}_\lambda$.
Corollary:
If X(t) is a Poisson process with rate λ, then the time until the k-th hit,
$O_k$, is an $\mathrm{Erlang}_{k,\lambda}$ distributed variable.
Note: This theorem is very important! It links the Poisson, Exponential,
and Erlang distributions tightly together. Some thoughts:
• Why is the Poisson process so important?
• We mention the homogeneous Poisson process; what is meant by
homogeneous?
• What is a nonhomogeneous process?
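One practical consequence of the equivalence theorem is that a Poisson process can be simulated by accumulating i.i.d. exponential inter-arrival times. The sketch below assumes Python's standard `random.expovariate`, whose argument is the rate λ; the function name `simulate_by_interarrivals`, the seed, the rate and the time horizon are illustrative choices.

```python
import random


def simulate_by_interarrivals(rate, horizon, rng):
    """Occurrence times in [0, horizon], built by summing Exp(rate) gaps."""
    times = []
    t = rng.expovariate(rate)  # time of the first occurrence, O_1 = I_1
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)  # O_{j+1} = O_j + I_{j+1}
    return times


rng = random.Random(330)  # fixed seed so the run is reproducible
hits = simulate_by_interarrivals(rate=2.0, horizon=10.0, rng=rng)

# len(hits) is one realization of X(10); its distribution is Po(2 * 10).
print(len(hits))
print(hits[:3])
```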
Example
Hits on a website: Hits on a popular Web page occur according to a Poisson
Process with a rate of 10 hits/min. One begins observation at exactly noon.
1. Evaluate the probability of 2 or fewer hits in the first minute.
Let X be the number of hits in the first minute; then X is a Poisson
variable with λ = 10:
$P(X \le 2) = \mathrm{Po}_{10}(2) = e^{-10} + 10 e^{-10} + \frac{10^2}{2} e^{-10} = 0.0028.$
(You may also check the Poisson cdf table).
2. Evaluate the probability that the time till the first hit exceeds 10 seconds.
Let Y be the time until the first hit; then Y has an Exponential
distribution with parameter λ = 10 per minute or λ = 1/6 per second.
$P(Y \ge 10) = 1 - P(Y \le 10) = 1 - (1 - e^{-10 \cdot 1/6}) = e^{-5/3} = 0.1889.$
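The two calculations above can be reproduced with a few lines of Python. This is only a sketch for checking the arithmetic; the helper name `poisson_cdf` is an assumption of this example.

```python
import math


def poisson_cdf(mean, k):
    """P(X <= k) for X ~ Po(mean), summing the pmf term by term."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))


# Part 1: at most 2 hits in the first minute, X ~ Po(10).
p_at_most_two = poisson_cdf(10.0, 2)

# Part 2: first hit later than 10 seconds, Y ~ Exp(1/6 per second).
rate_per_second = 10.0 / 60.0
p_wait_exceeds_10s = math.exp(-rate_per_second * 10.0)

print(round(p_at_most_two, 4))       # 0.0028
print(round(p_wait_exceeds_10s, 4))  # 0.1889
```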
3. Evaluate the mean and the variance of the time till the 4th hit.
Let T be the time till the 4th hit. Then T has an Erlang distribution with
stage parameter k = 4 and λ = 10 per minute.
$E[T] = \frac{k}{\lambda} = \frac{4}{10} = 0.4$ minutes
$Var[T] = \frac{k}{\lambda^2} = \frac{4}{100} = 0.04$ minutes$^2$.
4. Evaluate the probability that the time till the 4th hit exceeds 24 seconds.
We need P(T > 24/60), where T ∼ Erlang(4, 10) and T is in minutes, so
we'll use the Gamma-Poisson formula:
$P(T > 0.4) = P(X < 4)$ where $X \sim \mathrm{Po}_{\lambda t}$
$= P(X \le 3)$ where $X \sim \mathrm{Po}_{10 \cdot 0.4}$
$= \mathrm{Po}_4(3) = 0.433$ (Website table, p. 786 or Baron p. 384)
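The Erlang results in parts 3 and 4 can be checked numerically. The sketch below relies on the Gamma-Poisson formula: the k-th hit comes after time t exactly when fewer than k hits occur in [0, t]. The helper names `poisson_cdf` and `erlang_survival` are assumptions of this example.

```python
import math


def poisson_cdf(mean, k):
    """P(X <= k) for X ~ Po(mean)."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))


def erlang_survival(k, rate, t):
    """P(T > t) for T ~ Erlang(k, rate), via P(X <= k - 1) with X ~ Po(rate * t)."""
    return poisson_cdf(rate * t, k - 1)


k, rate = 4, 10.0          # 4th hit, 10 hits per minute

mean_T = k / rate          # sum of k i.i.d. Exp(rate) means, in minutes
var_T = k / rate**2        # sum of k i.i.d. Exp(rate) variances, in minutes^2
p_exceeds_24s = erlang_survival(k, rate, 24.0 / 60.0)

print(mean_T, var_T)
print(round(p_exceeds_24s, 3))  # 0.433
```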
5. The number of hits in the first hour is Poisson with mean 600.
You would like to know the probability of more than 650 hits. Exact
calculation isn’t really feasible. So approximate this probability and justify
your approximation.
Recall that a Poisson distribution with large rate λ can be approximated
by a normal distribution with mean µ = λ and variance σ² = λ.
Then $X \overset{approx}{\sim} N(600, 600)$, so $Z := \frac{X - 600}{\sqrt{600}} \overset{approx}{\sim} N(0, 1)$.
Then:
$P(X > 650) = 1 - P(X \le 650) \approx 1 - P\left(Z \le \frac{650 - 600}{\sqrt{600}}\right) \approx 1 - \Phi(2.04)$
$= 1 - 0.9793 = 0.0207$ (Webpage table, p. 789 or Baron p. 386)
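Although the exact Poisson tail is impractical by hand, a computer can evaluate it, which gives a way to judge the normal approximation. The sketch below is one possible check, not part of the course material: `normal_cdf` uses the standard identity between Φ and the error function, and `poisson_sf` sums pmf terms on the log scale to avoid overflow. Both helper names are assumptions.

```python
import math


def normal_cdf(z):
    """Standard normal cdf, Phi(z), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def poisson_sf(mean, k):
    """Exact P(X > k) for X ~ Po(mean); each pmf term is computed via logs."""
    cdf = sum(
        math.exp(-mean + i * math.log(mean) - math.lgamma(i + 1))
        for i in range(k + 1)
    )
    return 1.0 - cdf


mean = 600.0
z = (650.0 - mean) / math.sqrt(mean)  # standardized threshold, about 2.04
approx = 1.0 - normal_cdf(z)          # normal approximation to P(X > 650)
exact = poisson_sf(mean, 650)         # exact Poisson tail probability

print(round(z, 2))
print(approx, exact)
```

Both values come out close to 0.02, so the approximation is adequate at this level of precision.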
Poisson Process: Conditioning
The Poisson process possesses an interesting property that is consistent with
thinking of it as "random occurrences" in time, which leads to the
conditioning theorem.
Theorem: Let X(t) be a Poisson process. Given that X(T) = k, the
conditional distribution of the times of the k occurrences $O_1, \dots, O_k$ is the
same as the distribution of $T \cdot U_{(1)}, T \cdot U_{(2)}, \dots, T \cdot U_{(k)}$, where
$U_{(1)} \le U_{(2)} \le \dots \le U_{(k)}$ are k ordered independent standard uniform variables.
♣ In other words, given that there were k arrivals, the set of arrival times is
the same as the locations of k darts thrown at random on the interval [0, T].
♠ This tells us a way to simulate a Poisson process with rate λ on the
interval (0, T ).
Simulating a Poisson Process
• first, draw a Poisson value w from $\mathrm{Po}_{\lambda T}$. (This tells us how many
uniform values $U_i$ we need to simulate.)
• second, generate w standard uniform values $u_1, \dots, u_w$.
• third, define $o_i = T \cdot u_{(i)}$, where $u_{(i)}$ is the i-th smallest value among
$u_1, \dots, u_w$.
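The three steps above can be sketched in Python as follows. Since the standard library has no Poisson sampler, the helper `draw_poisson` uses inverse-transform sampling; that choice, like the function names, the seed and the parameter values, is an assumption of this example and not something prescribed by the slides.

```python
import math
import random


def draw_poisson(mean, rng):
    """Inverse-transform draw from Po(mean); adequate for moderate means."""
    u = rng.random()
    k = 0
    pmf = math.exp(-mean)  # P(X = 0)
    cdf = pmf
    # Walk up the cdf until it reaches u; stop if the pmf underflows to 0.
    while u > cdf and pmf > 0.0:
        k += 1
        pmf *= mean / k    # P(X = k) from P(X = k - 1)
        cdf += pmf
    return k


def simulate_by_conditioning(rate, horizon, rng):
    """Occurrence times on [0, horizon) via the conditioning theorem."""
    w = draw_poisson(rate * horizon, rng)              # step 1: number of hits
    uniforms = sorted(rng.random() for _ in range(w))  # step 2, then ordering
    return [horizon * u for u in uniforms]             # step 3: o_i = T * u_(i)


rng = random.Random(330)  # fixed seed so the run is reproducible
occurrence_times = simulate_by_conditioning(rate=2.0, horizon=10.0, rng=rng)
print(len(occurrence_times))
print(occurrence_times[:3])
```

Compared with accumulating exponential gaps, this approach fixes the number of occurrences first and only then places them, which is convenient when the whole interval (0, T) is simulated at once.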
♥ The above theorem tells us that, if we pick k values at random from
an interval (0, t) and order them, then for large k the distance between
two successive values has approximately an exponential distribution with
rate λ = k/t.
♦ So far, we are looking only at arrivals of events. Besides that, we could,
for example, look at the number of surfers that are on our web site at the
same time.
♣ There, we have departures as well and, related to that, the time each
surfer stays - which we will call service time (from the perspective of the
web server).