Geometric distribution

Stat 330 (Spring 2015): slide set 10
Geometric distribution Example 2

Example 2. Watch the input queue at the alpha farm for a job that times out. The probability that a job times out is 0.05. Let Y be the index of the first job to time out; then $Y \sim \mathrm{Geo}_{0.05}$. What is the probability that

• the third job is the first to time out?
  $P(Y = 3) = 0.95^{2} \cdot 0.05 = 0.045$
• Y is less than 3?
  $P(Y < 3) = P(Y \le 2) = 1 - 0.95^{2} = 0.0975$
• the first job to time out is between the third and the seventh?
  $P(3 \le Y \le 7) = P(Y \le 7) - P(Y \le 2) = (1 - 0.95^{7}) - (1 - 0.95^{2}) = 0.204$
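The three probabilities of Example 2 can be rechecked with a few lines of Python. This is only a sketch: the helper names `geo_pmf` and `geo_cdf` are illustrative and do not come from the slides, and they assume the geometric distribution counts trials starting at 1.

```python
def geo_pmf(k: int, p: float) -> float:
    """P(Y = k): k - 1 failures followed by one success."""
    return (1 - p) ** (k - 1) * p


def geo_cdf(t: int, p: float) -> float:
    """P(Y <= t) = 1 - (1 - p)^t for an integer t >= 0."""
    return 1 - (1 - p) ** t


p = 0.05  # probability that a single job times out

p_third = geo_pmf(3, p)                    # first time-out is job 3
p_less_than_3 = geo_cdf(2, p)              # P(Y < 3) = P(Y <= 2)
p_3_to_7 = geo_cdf(7, p) - geo_cdf(2, p)   # P(3 <= Y <= 7)

print(round(p_third, 4), round(p_less_than_3, 4), round(p_3_to_7, 4))
```

Rounded to three significant digits, the printed values should agree with the 0.045, 0.0975 and 0.204 reported in the example.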
Last update: January 28, 2015
Stat 330 (Spring 2015)
Slide set 10
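For $X \sim \mathrm{Geo}_p$, the pmf $(1-p)^{k-1} \cdot p$ is the probability of $k-1$ failures followed by one success, and the variance is

$\mathrm{Var}[X] = \frac{1-p}{p^{2}}$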
Geometric distribution Example 2 (cont’d)

What is the expected value of Y, and what is Var[Y]?

Plugging p = 0.05 into the formulas $E[Y] = 1/p$ and $\mathrm{Var}[Y] = (1-p)/p^{2}$ gives us:

$E[Y] = \frac{1}{p} = 20$   (we expect the 20th job to be the first to time out)
$\mathrm{Var}[Y] = \frac{1-p}{p^{2}} = 380$   (very spread out!)

Interesting property of the Geometric distribution

If $X \sim \mathrm{Geo}_p$, then $P(X > i + j \mid X > i) = P(X > j)$ for $i, j = 0, 1, 2, \ldots$

That is, X is memoryless: it “does not remember that it has already counted up to i”!
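The memoryless property can be checked numerically in its tail form $P(X > i + j \mid X > i) = P(X > j)$, using $P(X > n) = (1 - p)^{n}$. The snippet below is a minimal sketch under that assumption; the function names `geo_tail` and `conditional_tail` are made up for illustration.

```python
def geo_tail(n: int, p: float) -> float:
    """P(X > n) = (1 - p)^n: the first n trials are all failures."""
    return (1 - p) ** n


def conditional_tail(i: int, j: int, p: float) -> float:
    """P(X > i + j | X > i), computed from the definition of conditional probability."""
    return geo_tail(i + j, p) / geo_tail(i, p)


p = 0.05

# Compare the conditional tail with the unconditional tail on a small grid of (i, j).
checks = [
    (i, j, conditional_tail(i, j, p), geo_tail(j, p))
    for i in range(5)
    for j in range(5)
]
memoryless_holds = all(abs(lhs - rhs) < 1e-12 for _, _, lhs, rhs in checks)

# Mean and variance for p = 0.05, as in Example 2.
mean = 1 / p
variance = (1 - p) / p ** 2

print(memoryless_holds, mean, round(variance, 6))
```

The check only covers a finite grid, so it illustrates the property rather than proving it; the algebraic argument is that $(1-p)^{i+j} / (1-p)^{i} = (1-p)^{j}$.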
Geometric distribution

Review: X = number of repetitions of the experiment until we have the first success in a Bernoulli experiment.

1. The pmf is: $p_X(k) = P(X = k) = (1 - p)^{k-1} \cdot p$ for $k = 1, 2, 3, \ldots$
2. Expectation: $E[X] = \frac{1}{p}$
3. The cdf is: $F_X(t) = P(X \le t) = 1 - (1 - p)^{t}$

Example 1: Examine the following programming statement:

    Repeat S until B

Solution: Assume P(B = true) = 0.1 and let X be the number of times S is executed. Then X has a geometric distribution with pmf:

$P(X = k) = p_X(k) = 0.9^{k-1} \cdot 0.1$

How often is S executed on average? That is, what is E[X]?
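Example 1 can also be explored by simulation. The sketch below assumes that B is an independent coin flip that comes up true with probability 0.1 after every execution of S; the function name `run_repeat_until`, the seed and the number of runs are arbitrary illustrative choices. With $E[X] = 1/p$ and p = 0.1, the simulated average should land near 10.

```python
import random


def run_repeat_until(p_b: float, rng: random.Random) -> int:
    """Simulate 'Repeat S until B' and return how many times S was executed."""
    executions = 0
    while True:
        executions += 1          # S runs (at least once)
        if rng.random() < p_b:   # B is true with probability p_b
            return executions


rng = random.Random(330)  # fixed seed so the run is reproducible
n_runs = 100_000
average = sum(run_repeat_until(0.1, rng) for _ in range(n_runs)) / n_runs

print(average)
```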
Poisson distribution

Situation: The Poisson distribution follows from a certain set of assumptions about the occurrence of “rare” events in time or space.

Examples:
X = # of alpha particles emitted from a polonium bar in an 8 minute period.
Y = # of flaws on a standard size piece of manufactured product (e.g., 100m coaxial cable, 100 sq.meter plastic sheeting)
Z = # of hits on a web page in a 24h period.

Definition: The Poisson probability mass function (pmf) is defined as:

$p(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$ for $x = 0, 1, 2, 3, \ldots$

λ is called the rate parameter.
We denote the cdf by $Po_{\lambda}(t)$.

Poisson pmf (cont’d)

Check that p(x) defined above is actually a probability mass function. How?

1. Obviously, all values of p(x) ≥ 0 for x ≥ 0.
2. Do all probabilities sum to 1?

$\sum_{x=0}^{\infty} p(x) = \sum_{x=0}^{\infty} \frac{e^{-\lambda} \lambda^{x}}{x!} = e^{-\lambda} \cdot \sum_{x=0}^{\infty} \frac{\lambda^{x}}{x!} = e^{-\lambda} e^{\lambda} = 1$

Expected Value and Variance of $X \sim Po_{\lambda}$ are:

• $E[X] = \sum_{x=0}^{\infty} x \, \frac{e^{-\lambda} \lambda^{x}}{x!} = 0 + e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^{x}}{(x-1)!} = e^{-\lambda} \lambda \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!} = e^{-\lambda} \lambda \sum_{y=0}^{\infty} \frac{\lambda^{y}}{y!} = \lambda$
• Var[X] = . . . = λ (left as an exercise)

Poisson distribution: Example 3.22 (Baron)

New Accounts: Customers of an internet service provider initiate new accounts at the average rate of 10 accounts per day.

Part (a) What is the probability that more than 8 new accounts will be initiated today?

Part (b) What is the probability that more than 16 new accounts will be initiated in two days?

Poisson distribution: Example 3.22 (Baron) (cont’d)

The number of initiations per day, X, has a Poisson distribution with parameter λ = 10.
(The above assumes that an account initiation is a rare event within the time period of one day because no two customers can open an account at the same time.)

Then we have $P(X > 8) = 1 - Po_{10}(8) = 1 - 0.333 = 0.667$

The number of initiations in a two-day period, Y, has a Poisson distribution with parameter λ = 20.
(Note carefully that the average number of initiations for a two-day period is 20.)

Then we have $P(Y > 16) = 1 - Po_{20}(16) = 1 - 0.221 = 0.779$

Note that X and Y are random variables with different Poisson distributions because the events they represent occur during different time intervals.
This is a key step in solving Poisson distribution related problems.
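The table lookups in Example 3.22 can be reproduced by summing the Poisson pmf directly. This is a rough sketch rather than the slides' own method: the helper names `poisson_pmf` and `poisson_cdf` are illustrative, and the infinite sums used for the pmf check are truncated at 100 terms, which is an assumption that is harmless for λ = 10 because the remaining tail is negligible.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """p(x) = e^(-lam) * lam^x / x!"""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def poisson_cdf(t: int, lam: float) -> float:
    """Po_lam(t) = P(X <= t), summed term by term."""
    return sum(poisson_pmf(x, lam) for x in range(t + 1))


# Part (a): one day, lam = 10.  Part (b): two days, lam = 20.
p_a = 1 - poisson_cdf(8, 10.0)
p_b = 1 - poisson_cdf(16, 20.0)

# Truncated checks that the pmf sums to 1 and has mean lam (here lam = 10).
total = sum(poisson_pmf(x, 10.0) for x in range(101))
mean = sum(x * poisson_pmf(x, 10.0) for x in range(101))

print(round(p_a, 3), round(p_b, 3), round(total, 6), round(mean, 6))
```

Note that the only thing that changes between the two parts is λ, which is scaled to the length of the time interval.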
Poisson distribution: Another Example

How do we choose λ in an example? Look at the expected value!

Example: A manufacturer of chips produces 1% defectives. What is the probability that in a box of 100 chips no defective is found?

Solution: Let X be the number of defective chips found in the box. Model X as a Binomial variable with distribution $B_{100,0.01}$. Then

$P(X = 0) = \binom{100}{0} \, 0.99^{100} \, 0.01^{0} = 0.366.$

Note that we expect 100 · 0.01 = 1 chip out of the box to be defective.

Poisson distribution: Example (cont’d)

Approximation: On the other hand, a defective chip can be considered to be a rare event, since p is small (p = 0.01). So, approximate X as a Poisson variable.

We need to obtain a value for λ!

We know that the expected value of X is λ. In this example, therefore, we take λ = 1.

Then

$P(X = 0) = \frac{e^{-1} 1^{0}}{0!} = 0.3679.$

Poisson to approximate Binomial

Ramification: For larger k, however, the binomial coefficient $\binom{n}{k}$ becomes hard to compute, and it is easier to use the Poisson distribution instead of the Binomial distribution.

Result (not a theorem): For large n, the Binomial distribution can be approximated by the Poisson distribution, where λ is taken as np:

$\binom{n}{k} p^{k} (1 - p)^{n-k} \approx e^{-np} \frac{(np)^{k}}{k!}$

Rule of thumb: use the Poisson approximation if n ≥ 20 and (at the same time) p ≤ 0.05.

Theorem: If $\{X_n\}$ is a sequence of random variables such that $X_n \sim \mathrm{Bin}(N_n, p_n)$ with $N_n \to \infty$, $p_n \to 0$ and $N_n p_n \to \lambda \in (0, \infty)$, then

$X_n \to X \sim \mathrm{Poisson}(\lambda)$ in distribution.

Such a beautiful result requires very delicate mathematics.

Poisson to approximate Binomial (example)

Example: (Typos) Imagine you are supposed to proofread a paper. Let us assume that there are on average 2 typos on a page and a page has 1000 words. This gives a probability of 0.002 for each word to contain a typo. The number of typos on a page X is then a Binomial random variable, i.e. $X \sim B_{1000,0.002}$.

The probability for no typo on a page is P(X = 0), i.e.

$P(X = 0) = (1 - 0.002)^{1000} = 0.998^{1000} = 0.13506$

alternatively

$P(X = 0) = \left(1 - \frac{2}{1000}\right)^{1000} \approx e^{-2} = 0.13534$

since $(1 - x/n)^{n} \to e^{-x}$ as $n \to \infty$.
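The chip example and the limit behind the approximation can be illustrated numerically. The following is a sketch only: `binom_pmf` and `poisson_pmf` are illustrative helper names, and the sequence N = 10, 100, 1000, 10000 with p = 1/N is an arbitrary choice made so that Np stays fixed at λ = 1.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """Exact Binomial probability C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability e^(-lam) * lam^k / k!"""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Chip example: 100 chips, 1% defective, probability of no defective chip.
n, p = 100, 0.01
exact = binom_pmf(0, n, p)
approx = poisson_pmf(0, n * p)  # lam = np = 1

# Keeping N * p = 1 fixed while N grows, the gap to the Poisson value shrinks.
errors = [
    abs(binom_pmf(0, big_n, 1 / big_n) - poisson_pmf(0, 1.0))
    for big_n in (10, 100, 1000, 10000)
]

print(round(exact, 4), round(approx, 4), errors)
```

The shrinking errors only illustrate the convergence for k = 0; they are not a substitute for the theorem.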
The probability of one typo on a page is

$P(X = 1) = \binom{1000}{1} \, 0.002 \cdot 0.998^{999} = 0.27067$

and

$P(X = 1) = 1000 \cdot \frac{2}{1000} \left(1 - \frac{2}{1000}\right)^{999} \approx 2 \cdot e^{-2} = 0.27067!$

So basically, we are calculating this probability using the Poisson pmf with

$\lambda = 1000 \cdot 0.002 = 2$

That is, use $P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$ to calculate

$P(X = 1) \approx \frac{e^{-2} 2^{1}}{1!} = 2 \cdot e^{-2} = 0.27067$
Poisson to approximate Binomial (example cont’d)

The probability for two typos on a page is P(X = 2), i.e.

$P(X = 2) = \binom{1000}{2} (1 - 0.002)^{998} \, 0.002^{2} = 0.27094$

alternatively, using $X \approx Po_{2}$

$P(X = 2) \approx \frac{e^{-2} 2^{2}}{2!} = 0.27067$
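The typo probabilities for k = 0, 1, 2 can be tabulated side by side, exact Binomial against the Poisson approximation with λ = np = 2. This is a minimal sketch; the helper names `binom_pmf` and `poisson_pmf` are illustrative and not taken from the slides.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """Exact Binomial probability C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability e^(-lam) * lam^k / k!"""
    return math.exp(-lam) * lam ** k / math.factorial(k)


n, p = 1000, 0.002   # 1000 words per page, typo probability per word
lam = n * p          # expected number of typos per page

# One row per k: (k, exact Binomial value, Poisson approximation).
rows = [(k, binom_pmf(k, n, p), poisson_pmf(k, lam)) for k in range(3)]

for k, exact, approx in rows:
    print(f"k={k}  binomial={exact:.5f}  poisson={approx:.5f}")
```

Here n = 1000 and p = 0.002 comfortably satisfy the rule of thumb (n ≥ 20, p ≤ 0.05), which is consistent with the two columns agreeing to about three decimal places.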