Uploaded by Haluk Karakaş

Discrete Distributions METU

IE 265
• Discrete Distributions – Part 2
• Hypergeometric distribution
• Poisson distribution
2
Hypergeometric distribution
• Suppose we have N items, k of which are labeled as success.
• Suppose n items are randomly selected from N without replacement.
• Let X be the # of successes in our sample.
• Then, X is a hypergeometric random variable.
• In general,
$$p(x) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}, \quad x = 0, 1, 2, \ldots, \min\{n, k\}$$
X ~ Hypgeo(N, k, n)
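The pmf can be evaluated directly with binomial coefficients. The following is a minimal sketch; the function name `hypergeom_pmf` and the numbers N = 20, k = 7, n = 5 are assumptions made only for illustration and do not come from the slides.

```python
from math import comb

def hypergeom_pmf(x: int, N: int, k: int, n: int) -> float:
    """P(X = x) for X ~ Hypgeo(N, k, n): N items, k successes, n drawn."""
    if x < 0 or x > n:
        return 0.0
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible values of x automatically get probability 0.
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Hypothetical numbers, chosen only for illustration.
N, k, n = 20, 7, 5
probs = [hypergeom_pmf(x, N, k, n) for x in range(min(n, k) + 1)]
print(probs)
print(sum(probs))  # should be 1 up to rounding
```

Summing the pmf over x = 0, ..., min{n, k} is a quick sanity check that the formula was typed correctly.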
3
Hypergeometric vs Binomial
• If the selection is without replacement, then
• trials are dependent and
• we use hypergeometric distribution
• If the selection is with replacement, then
• trials are independent and
• we use binomial distribution
4
Hypergeometric vs Binomial
• r.v. X~Hypgeo(100,90,15)
• success probability in trial 1: 90/100.
• success probability in trial 2: 89/99 for S (in the previous
trial), 90/99 for F (in the previous trial).
• success probability in trial 3: 88/98 for SS, 89/98 for SF
and FS, 90/98 for FF.
• dependence of a trial on the previous trials
• r.v. Y~Bin(15,90/100)
• success probability: (constant) 90/100 for all trials.
• independence of the trials
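The trial-by-trial probabilities listed for X can be reproduced by counting what is left in the population after each draw. This is a small sketch; the helper `next_success_prob` and its 'S'/'F' string encoding are hypothetical names introduced here, not part of the slides.

```python
from fractions import Fraction

def next_success_prob(history: str, N: int = 100, k: int = 90) -> Fraction:
    """P(success on the next draw | earlier outcomes), without replacement.

    `history` is a string of 'S'/'F' outcomes of the earlier draws.
    """
    successes_left = k - history.count("S")
    items_left = N - len(history)
    return Fraction(successes_left, items_left)

# Fractions are printed in lowest terms, e.g. 88/98 appears as 44/49.
for h in ["", "S", "F", "SS", "SF", "FS", "FF"]:
    print(repr(h), next_success_prob(h))
```

With replacement, the same quantity would be 90/100 for every history, which is the binomial case.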
5
Hypergeometric distribution
Mean and variance
$$E(X) = n\left(\frac{k}{N}\right)$$
similar to binomial where E(X) = μ = np
$$\mathrm{Var}(X) = n\left(\frac{k}{N}\right)\left(1 - \frac{k}{N}\right)\left(\frac{N-n}{N-1}\right)$$
similar to binomial where Var(X) = np(1 − p), but has a correction
factor for dependent trials
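As a numerical check of the two formulas, the mean and variance can also be computed by summing over the pmf. The sketch below reuses the Hypgeo(100, 90, 15) example from the earlier slide; the variable names are arbitrary.

```python
from math import comb

N, k, n = 100, 90, 15  # the Hypgeo(100, 90, 15) example used earlier

# comb() returns 0 for impossible x, so the range 0..min(n, k) is enough.
xs = range(min(n, k) + 1)
pmf = [comb(k, x) * comb(N - k, n - x) / comb(N, n) for x in xs]

mean_sum = sum(x * p for x, p in zip(xs, pmf))
var_sum = sum((x - mean_sum) ** 2 * p for x, p in zip(xs, pmf))

p = k / N
mean_formula = n * p
var_formula = n * p * (1 - p) * (N - n) / (N - 1)

print(mean_sum, mean_formula)
print(var_sum, var_formula)
print("binomial variance for comparison:", n * p * (1 - p))
```

The hypergeometric variance comes out smaller than the binomial one because the correction factor (N − n)/(N − 1) is below 1 whenever n > 1.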
6
Hypergeometric distribution
• Ex: A warehouse contains 10 printing machines, four of
which are defective. Five machines are selected for
purchase.
a) What is the probability that all five of these are nondefective?
b) What is the probability that the third defective is found in the fifth
trial?
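One possible way to work this example is sketched below. It assumes the five machines are drawn at random, one at a time and without replacement; the slides do not state this explicitly. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from math import comb

N, k, n = 10, 4, 5   # 10 machines, 4 defective, 5 selected
good = N - k         # 6 nondefective machines

# a) all five selected machines are nondefective
p_a = Fraction(comb(good, 5) * comb(k, 0), comb(N, n))

# b) third defective found on the fifth draw:
#    exactly 2 defectives among the first 4 draws, then a defective on draw 5
p_two_in_four = Fraction(comb(k, 2) * comb(good, 2), comb(N, 4))
p_fifth_defective = Fraction(k - 2, N - 4)
p_b = p_two_in_four * p_fifth_defective

print(p_a, float(p_a))
print(p_b, float(p_b))
```

Part (b) is not a plain hypergeometric probability: the position of the third defective matters, so the hypergeometric pmf is applied to the first four draws only and then multiplied by the conditional probability for the fifth.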
7
Binomial Approximation to Hypergeometric
• If the sampling fraction n/N is small (say < 0.1), then
Hypgeo(N, k, n) is approximately Bin(n, k/N).
• For example, if we draw a sample of 9 parts from a batch
of 1000, we can ignore the dependence caused by sampling
without replacement, and assume p is constant at k/N
for every draw.
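The quality of the approximation can be inspected numerically. In the sketch below, N = 1000 and n = 9 come from the example, while k = 100 is an assumed value, since the slides do not give the number of successes in the batch.

```python
from math import comb

def hypergeom_pmf(x, N, k, n):
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

N, n = 1000, 9   # batch size and sample size from the text
k = 100          # assumed # of "success" parts; not given in the text

for x in range(n + 1):
    h = hypergeom_pmf(x, N, k, n)
    b = binom_pmf(x, n, k / N)
    print(x, round(h, 5), round(b, 5))

max_gap = max(
    abs(hypergeom_pmf(x, N, k, n) - binom_pmf(x, n, k / N))
    for x in range(n + 1)
)
print("largest absolute difference:", max_gap)
```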
8
Poisson distribution
• Poisson distribution is used to model the occurrence of rare
events.
• It is a limiting distribution for the binomial distribution as n → ∞
and p → 0 such that np remains constant, i.e. we take the
limit under the restriction that the mean of the binomial
distribution, np, remains constant at a value λ.
• Let λ be the average # of occurrences per unit time, e.g. the
average # of accidents at a particular highway intersection
during a year (rare compared to all those vehicles crossing
the intersection).
9
Poisson distribution
• Suppose we divide the unit time interval (e.g. one year) into n
small subintervals such that:
1. P(one occurrence in a subinterval) = p
2. P(no occurrence in a subinterval) = 1 − p
3. P(more than one occurrence in a subinterval) = 0
• Let $X_i = 1$ if there is an occurrence in subinterval i, and $X_i = 0$ otherwise.
• $X = \sum_{i=1}^{n} X_i$ gives the total # of occurrences per unit time interval,
e.g. # of accidents per year
• Then, X ~ Bin(n, p) where E(X) = np = λ and p = λ/n
10
Poisson distribution
• We want to investigate the behavior of X as n → ∞, p → 0,
and np remains constant, i.e. we wish to find
$$\lim_{\substack{n \to \infty \\ p \to 0 \\ np = \lambda}} P(X = x) = \lim_{\substack{n \to \infty \\ p \to 0 \\ np = \lambda}} \binom{n}{x} p^{x} (1 - p)^{n-x}$$
where p = λ/n
11
Poisson distribution
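$$\lim_{\substack{n \to \infty \\ p \to 0 \\ np = \lambda}} P(X = x) = \lim_{n \to \infty} \frac{n!}{(n-x)!\,x!} \left(\frac{\lambda}{n}\right)^{x} \left(1 - \frac{\lambda}{n}\right)^{n-x}, \quad \text{since } p = \frac{\lambda}{n}$$
$$= \lim_{n \to \infty} \frac{n(n-1)(n-2)\cdots(n-x+1)}{n^{x}} \, \frac{\lambda^{x}}{x!} \left(1 - \frac{\lambda}{n}\right)^{n-x}$$
where
$$\frac{n(n-1)(n-2)\cdots(n-x+1)}{n^{x}} = \frac{n}{n}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{x-1}{n}\right)$$
(each term → 1 as n → ∞)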
12
Poisson distribution
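$$\lim_{\substack{n \to \infty \\ p \to 0 \\ np = \lambda}} P(X = x) = \lim_{n \to \infty} \frac{\lambda^{x}}{x!} \cdot \frac{\left(1 - \frac{\lambda}{n}\right)^{n}}{\left(1 - \frac{\lambda}{n}\right)^{x}}$$
where the denominator is
$$\left(1 - \frac{\lambda}{n}\right)^{x} = \underbrace{\left(1 - \frac{\lambda}{n}\right)\left(1 - \frac{\lambda}{n}\right)\cdots\left(1 - \frac{\lambda}{n}\right)}_{x \text{ of them}}$$
(each term → 1 as n → ∞)
$$= \lim_{n \to \infty} \frac{\lambda^{x}}{x!}\left(1 - \frac{\lambda}{n}\right)^{n} = \frac{e^{-\lambda}\lambda^{x}}{x!} \quad \text{(Poisson pmf)},$$
given $\lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{n} = e^{-\lambda}$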
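The limit can also be observed numerically by fixing λ and letting n grow with p = λ/n. This is only an illustration of the convergence, with assumed values λ = 3 and x = 2 that are not taken from the slides.

```python
from math import comb, exp, factorial

lam, x = 3.0, 2  # assumed values, for illustration only

poisson = exp(-lam) * lam**x / factorial(x)

gaps = []
for n in (10, 100, 1000, 10000):
    p = lam / n  # keeps np = lam fixed while n grows
    binom = comb(n, x) * p**x * (1 - p) ** (n - x)
    gaps.append(abs(binom - poisson))
    print(n, binom, gaps[-1])

print("Poisson limit:", poisson)
```

For these values the gap shrinks by roughly a factor of 10 each time n is multiplied by 10.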
13
Poisson distribution
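• In general,
$$p(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots$$
X ~ Poisson(λ)
• If λ is the average (mean) # of occurrences per unit time,
then p(x) is the probability of x occurrences per unit time.
• Validity of p(x):
i. $p(x) \ge 0, \ \forall x$
ii. $\sum_{x=0}^{\infty} \frac{e^{-\lambda}\lambda^{x}}{x!} = e^{-\lambda}\underbrace{\left(1 + \lambda + \frac{\lambda^{2}}{2!} + \frac{\lambda^{3}}{3!} + \cdots\right)}_{e^{\lambda}} = 1$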
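Both validity conditions can be checked numerically for a given rate. The sketch below uses a hypothetical helper `poisson_pmf` and an assumed rate λ = 4; since the sum in (ii) is infinite, the code only shows partial sums approaching 1.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

lam = 4.0  # assumed rate, for illustration only

# Partial sums of p(x); the infinite sum is 1, so these should approach 1.
for upper in (5, 10, 20, 40):
    total = sum(poisson_pmf(x, lam) for x in range(upper + 1))
    print(upper, total)
```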
14
Poisson distribution
[Figure: p(x) and F(x) plotted against x = 0, ..., 20 for Poisson(1), Poisson(4), and Poisson(10)]
15
Poisson distribution
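Mean and variance
$$E(X) = \sum_{x=0}^{\infty} x \, \frac{e^{-\lambda}\lambda^{x}}{x!} = \sum_{x=1}^{\infty} x \, \frac{e^{-\lambda}\lambda^{x}}{x!} = \sum_{x=1}^{\infty} \frac{e^{-\lambda}\lambda^{x}}{(x-1)!}$$
$$= e^{-\lambda}\lambda \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!} = e^{-\lambda}\lambda \sum_{y=0}^{\infty} \frac{\lambda^{y}}{y!} = e^{-\lambda}\lambda e^{\lambda} = \lambda$$
$$\mathrm{Var}(X) = \lambda$$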
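The variance is stated without its steps. One standard route, sketched here as a supplement and not taken from the slides, goes through the factorial moment E[X(X − 1)] and uses the same index shift as the mean derivation:

```latex
\begin{align*}
E[X(X-1)] &= \sum_{x=2}^{\infty} x(x-1)\,\frac{e^{-\lambda}\lambda^{x}}{x!}
           = e^{-\lambda}\lambda^{2} \sum_{x=2}^{\infty} \frac{\lambda^{x-2}}{(x-2)!}
           = e^{-\lambda}\lambda^{2} e^{\lambda} = \lambda^{2} \\
\mathrm{Var}(X) &= E[X(X-1)] + E(X) - [E(X)]^{2}
                 = \lambda^{2} + \lambda - \lambda^{2} = \lambda
\end{align*}
```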
16
Poisson distribution
• Ex: Assume that, during the course of the semester,
there are 1000 lectures in the engineering school. Each
lecture has a probability 1/106 of having a stranger walk
in.
a) What is the probability that a stranger walks into one lecture in a
semester?
b) In an academic year (ignoring the summer school)? (see the
next slide before working on this part)
17
Poisson distribution
• What if 𝜆 is given as the average # of occurrences per
unit time, but 𝑋 is the # of occurrences in 𝑡 consecutive
time units? Then, 𝑋~𝑃𝑜𝑖𝑠𝑠𝑜𝑛(𝜆𝑡).
• Let random variables 𝑌𝑚 ~𝐵𝑖𝑛(𝑛, 𝑝) with large 𝑛 and
small 𝑝 for each 𝑚.
• Approximately, 𝑌𝑚 ~𝑃𝑜𝑖𝑠𝑠𝑜𝑛(𝑛𝑝) where 𝑛𝑝 = 𝜆.
• For 𝑡 = 2:
• exact distribution: 𝑋 = 𝑌1 + 𝑌2 , 𝑋~𝐵𝑖𝑛(2𝑛, 𝑝)
• approximate distribution: 𝑋 = 𝑌1 + 𝑌2 , 𝑋~𝑃𝑜𝑖𝑠𝑠𝑜𝑛(2𝑛𝑝)
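The t = 2 case can be checked on the Poisson side as well: if Y1 and Y2 are independent Poisson(λ) variables, the pmf of Y1 + Y2 obtained by convolution should match Poisson(2λ). The sketch below assumes independence and uses an arbitrary rate λ = 1.5; the helper names are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

lam = 1.5  # assumed rate per unit time, for illustration only

def sum_pmf(x):
    """P(Y1 + Y2 = x) by convolution, Y1 and Y2 independent Poisson(lam)."""
    return sum(poisson_pmf(j, lam) * poisson_pmf(x - j, lam) for j in range(x + 1))

for x in range(6):
    print(x, sum_pmf(x), poisson_pmf(x, 2 * lam))
```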
18
Poisson distribution
• Ex: The # of accidents per day occurring on a highway is
distributed as Poisson with a mean rate of three accidents
per day.
a) What is the probability that three or more accidents occur today?
b) What is the probability that at least three accidents occur in two
days?
c) What is the probability that three or more accidents will occur
today given that at least one accident occurred today?
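A possible way to compute the three parts is sketched below. It uses the complement rule for "at least" events, Poisson(λt) with t = 2 for part (b), and the definition of conditional probability for part (c); `poisson_cdf` is a hypothetical helper, not something defined in the slides.

```python
from math import exp, factorial

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Poisson(lam)."""
    return sum(exp(-lam) * lam**j / factorial(j) for j in range(x + 1))

lam = 3.0  # mean # of accidents per day

p_a = 1 - poisson_cdf(2, lam)          # a) P(X >= 3) in one day
p_b = 1 - poisson_cdf(2, 2 * lam)      # b) two days: X ~ Poisson(6)
# c) {X >= 3} is contained in {X >= 1}, so the intersection is {X >= 3}
p_c = p_a / (1 - poisson_cdf(0, lam))  # P(X >= 3 | X >= 1)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```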
19
Poisson distribution
• Ex: What is the probability that a machine used in
production will break down five times in a year, if its mean
time between breakdowns is four months?
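One way to set this up, assuming breakdowns follow a Poisson process so that a mean time between breakdowns of four months corresponds to a mean of 12/4 = 3 breakdowns per year:

```python
from math import exp, factorial

months_between = 4
lam = 12 / months_between  # mean # of breakdowns per year = 3
x = 5                      # # of breakdowns of interest

p = exp(-lam) * lam**x / factorial(x)  # P(X = 5) for X ~ Poisson(3)
print(p)
```

The key step is converting the mean time between events into a rate over the time unit in which X is counted.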
20
Poisson Approximation to Binomial
• For small p, large n, and relatively constant
λ = np:
Bin(n, p) is approximately Poisson(λ).
• In general, the approximation works well when 𝑛 ≥ 100 and
𝑝 ≤ 0.1 (criteria vary a bit from reference to reference).
• Ex: A process is known to produce 5% defectives. What is the
probability that there will be more than 10 defectives in the next
batch of 1000 parts?
X: the # of defectives in the next batch of 1000 parts
Since p = 0.05 is small and n = 1000 is large, the binomial
distribution can be approximated by Poisson with λ = np = 50.
$$P(X > 10) = 1 - \sum_{x=0}^{10} \frac{e^{-50}\, 50^{x}}{x!}$$
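The Poisson approximation can be compared with the exact binomial tail for this example. The sketch below evaluates both sums directly; the variable names are arbitrary.

```python
from math import comb, exp, factorial

n, p = 1000, 0.05
lam = n * p  # 50

# P(X > 10) = 1 - P(X <= 10) under each model
poisson_tail = 1 - sum(exp(-lam) * lam**x / factorial(x) for x in range(11))
binom_tail = 1 - sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(11))

print(poisson_tail, binom_tail)
```

Because the mean is 50, seeing 10 or fewer defectives is extremely unlikely under either model, so both tail probabilities are very close to 1.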