Binomial and Geometric Random Variables

Binomial Random Variables
Binomial experiment
• A sequence of n trials (called Bernoulli trials),
each of which results in either a “success” or a
“failure”.
• The trials are independent and so the probability
of success, p, remains the same for each trial.
• Define a random variable Y as the number of
successes observed during the n trials.
• What is the probability p(y), for y = 0, 1, …, n ?
• How many successes may we expect? E(Y) = ?
Returning Students
• Suppose the retention rate for a school indicates
the probability a freshman returns for their
sophomore year is 0.65. Among 12 randomly
selected freshmen, what is the probability that exactly 8 of
them return to school next year?
Each student either returns or doesn’t.
Think of each selected student as a trial,
so n = 12.
If we consider “student returns” to be a
success, then p = 0.65.
12 trials, 8 successes
• To find the probability of this event, consider the
probability for just one sample point in the event.
• For example, the probability the first 8 students return
and the last 4 don’t.
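• Since the trials are independent, we just multiply the probabilities
(here $R_i$ is the event that student i returns, and a bar marks the complement):
$$P((S,S,S,S,S,S,S,S,F,F,F,F)) = P(R_1 \cap R_2 \cap \cdots \cap R_8 \cap \overline{R_9} \cap \overline{R_{10}} \cap \overline{R_{11}} \cap \overline{R_{12}})$$
$$= P(R_1)\,P(R_2) \cdots P(R_8)\,P(\overline{R_9}) \cdots P(\overline{R_{12}}) = (0.65)^8 (1 - 0.65)^4$$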
12 trials, 8 successes
• For the probability of this event, we sum the
probabilities for each sample point in the event.
• How many sample points are in this event?
• How many ways can 8 successes and 4 failures occur?
$C^{12}_{8}\,C^{4}_{4}$, or simply $C^{12}_{8}$
• Each of these sample points has the same probability.
• Hence, summing these probabilities yields
$$P(\text{8 successes in 12 trials}) = C^{12}_{8}\,(0.65)^8 (0.35)^4 \approx 0.237$$
Binomial Probability Function
• A random variable has a binomial distribution
with parameters n and p if its probability function
is given by
$$p(y) = C^{n}_{y}\, p^{y} (1 - p)^{n - y}, \quad y = 0, 1, \ldots, n$$
Rats!
• In a research study, rats are injected with a drug.
The probability that a rat will die from the drug
before the experiment is over is 0.16.
Ten rats are injected with the drug.
• What is the probability that at least 8 will survive?
• Would you be surprised if at least 5 died during the experiment?
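One way to approach both questions is to sum binomial probabilities over the relevant values of y. The sketch below assumes each rat's outcome is independent, so that n = 10, the survival probability is 1 - 0.16 = 0.84, and "success" can mean either survival or death depending on the question. The helper name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(y, n, p):
    """P(Y = y) for a binomial random variable with parameters n and p."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

n = 10
p_die = 0.16
p_survive = 1 - p_die

# "At least 8 survive": treat survival as the success, sum y = 8, 9, 10.
p_at_least_8_survive = sum(binomial_pmf(y, n, p_survive) for y in range(8, n + 1))

# "At least 5 die": treat death as the success, sum y = 5, ..., 10.
p_at_least_5_die = sum(binomial_pmf(y, n, p_die) for y in range(5, n + 1))

print(p_at_least_8_survive)
print(p_at_least_5_die)
```

The second probability comes out close to 0.013, so five or more deaths would indeed be surprising under these assumptions.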
Quality Control
• For parts machined by a particular lathe, on
average, 95% of the parts are within the
acceptable tolerance.
• If 20 parts are checked, what is the probability that
at least 18 are acceptable?
• If 20 parts are checked, what is the probability that
at most 18 are acceptable?
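"At least" and "at most" questions are often easiest through a cumulative sum and its complement. The sketch below assumes the 20 parts are independent with p = 0.95; the helper `binomial_cdf` is a hypothetical name introduced here.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(Y <= k) for a binomial random variable with parameters n and p."""
    return sum(comb(n, y) * p**y * (1 - p)**(n - y) for y in range(k + 1))

n, p = 20, 0.95

# P(Y >= 18) is the complement of P(Y <= 17).
p_at_least_18 = 1 - binomial_cdf(17, n, p)

# P(Y <= 18) is read directly from the cumulative sum.
p_at_most_18 = binomial_cdf(18, n, p)

print(p_at_least_18)
print(p_at_most_18)
```

Note that the two events overlap at Y = 18, so the two probabilities do not sum to 1.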
Binomial Theorem
• As we saw in our Discrete class,
the Binomial Theorem allows us to expand
$$(p + q)^n = \sum_{y=0}^{n} C^{n}_{y}\, p^{y} q^{n-y}$$
• As a result, summing the binomial probabilities,
where q = 1 - p is the probability of a failure,
$$\sum_{y=0}^{n} P(Y = y) = \sum_{y=0}^{n} C^{n}_{y}\, p^{y} (1 - p)^{n-y} = (p + (1 - p))^n = 1$$
Mean and Variance
• If Y is a binomial random variable with
parameters n and p, the expected value and
variance for Y are given by
$$E(Y) = np \quad \text{and} \quad V(Y) = np(1 - p)$$
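These formulas, along with the fact that the binomial probabilities sum to 1, can be checked numerically from the definitions of expected value and variance. The sketch below reuses n = 12 and p = 0.65 from the returning-students example; any other valid n and p would do.

```python
from math import comb

n, p = 12, 0.65

# Binomial probabilities p(0), p(1), ..., p(n).
pmf = [comb(n, y) * p**y * (1 - p)**(n - y) for y in range(n + 1)]

total = sum(pmf)                                            # should be 1
mean = sum(y * pmf[y] for y in range(n + 1))                # E(Y)
second_moment = sum(y**2 * pmf[y] for y in range(n + 1))    # E(Y^2)
variance = second_moment - mean**2                          # V(Y) = E(Y^2) - [E(Y)]^2

print(total)
print(mean, n * p)
print(variance, n * p * (1 - p))
```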
Deriving Expected Value
$$E(Y) = \sum_{y=0}^{n} y\, p(y) = \sum_{y=1}^{n} y\, C^{n}_{y}\, p^{y} q^{n-y}, \quad \text{where } q = 1 - p$$
(When y = 0, the summand is zero, so we may just as well start at y = 1.)
$$= \sum_{y=1}^{n} y \left[ \frac{n!}{y!\,(n-y)!} \right] p^{y} q^{n-y}$$
$$= \sum_{y=1}^{n} \left[ \frac{n!}{(y-1)!\,(n-y)!} \right] p^{y} q^{n-y}$$
$$= np \sum_{y=1}^{n} \left[ \frac{(n-1)!}{(y-1)!\,(n-y)!} \right] p^{y-1} q^{n-y}$$
And deriving…
$$E(Y) = np \sum_{y=1}^{n} \left[ \frac{(n-1)!}{(y-1)!\,(n-y)!} \right] p^{y-1} q^{n-y}$$
$$= np \sum_{y=1}^{n} \left[ \frac{(n-1)!}{(y-1)!\,(n-1-(y-1))!} \right] p^{y-1} q^{n-1-(y-1)}$$
$$= np \sum_{y=1}^{n} C^{n-1}_{y-1}\, p^{y-1} q^{n-1-(y-1)}$$
$$= np \sum_{z=0}^{n-1} C^{n-1}_{z}\, p^{z} q^{n-1-z}, \quad \text{where } z = y - 1$$
$$= np\,(p + q)^{n-1}, \quad \text{and } (p + q)^{n-1} = 1$$
$$= np$$
Deriving Variance?
Just the highlights (see page 104 for details).
One can show that $E[Y(Y - 1)] = n(n - 1)p^2$,
and so
$$E(Y^2) = E[Y^2 - Y] + E(Y) = E[Y(Y - 1)] + E(Y) = n(n - 1)p^2 + np$$
Thus,
$$V(Y) = E(Y^2) - [E(Y)]^2 = [n(n - 1)p^2 + np] - (np)^2$$
$$= n^2p^2 - np^2 + np - n^2p^2$$
$$= np - np^2$$
$$= np(1 - p) = npq$$
It is a “fairly common trick” to use E[Y(Y - 1)] to find E(Y^2).
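For readers without the text at hand, the omitted step can be sketched with the same factoring idea used for the expected value. This is our own outline of the standard argument, not a transcription of the textbook's page; the terms for y = 0 and y = 1 vanish, so the sum starts at y = 2.

```latex
\begin{aligned}
E[Y(Y-1)] &= \sum_{y=2}^{n} y(y-1)\,\frac{n!}{y!\,(n-y)!}\,p^{y} q^{n-y} \\
          &= n(n-1)p^{2} \sum_{y=2}^{n} \frac{(n-2)!}{(y-2)!\,(n-y)!}\,p^{y-2} q^{n-y} \\
          &= n(n-1)p^{2} \sum_{z=0}^{n-2} C^{n-2}_{z}\,p^{z} q^{n-2-z}, \quad z = y-2 \\
          &= n(n-1)p^{2}\,(p+q)^{n-2} = n(n-1)p^{2}
\end{aligned}
```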
Rats!
• In a research study, rats are injected with a drug.
The probability that a rat will die from the drug
before the experiment is over is 0.16.
Ten rats are injected with the drug.
• How many of the rats are expected to survive?
• Find the variance for the number of survivors.
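A worked version, assuming the rats' outcomes are independent so that the number of survivors Y is binomial with n = 10 and survival probability p = 1 - 0.16 = 0.84:

```latex
\begin{aligned}
E(Y) &= np = 10(0.84) = 8.4 \\
V(Y) &= np(1-p) = 10(0.84)(0.16) = 1.344
\end{aligned}
```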
Geometric Random Variables
Your 1st Success
• Similar to the binomial experiment, we consider:
• A sequence of independent Bernoulli trials.
• The probability of “success” equals p on each trial.
• Define a random variable Y as the number of the
trial on which the 1st success occurs.
(Stop the trials after the first success occurs.)
• What is the probability p(y), for y = 1,2, … ?
• On which trial is the first success expected?
Finding the probability
• Consider the values of Y and the sample point for each:
y = 1: (S), so p(1) = p
y = 2: (F, S), so p(2) = (q)(p)
y = 3: (F, F, S), so p(3) = (q^2)(p)
y = 4: (F, F, F, S), so p(4) = (q^3)(p)
and so on…
Geometric Probability Function
• A random variable has a geometric distribution
with parameter p if its probability function is
given by
$$p(y) = q^{y-1} p, \quad \text{where } q = 1 - p, \text{ for } y = 1, 2, \ldots$$
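A minimal sketch of the geometric probability function in Python follows. The function name `geometric_pmf` and the value p = 0.2 are illustrative choices, not taken from the slides.

```python
def geometric_pmf(y, p):
    """P(Y = y): the first success occurs on trial y (y = 1, 2, ...)."""
    if y < 1:
        raise ValueError("y must be a positive integer")
    q = 1 - p
    # y - 1 failures, each with probability q, followed by one success.
    return q**(y - 1) * p

p = 0.2  # illustrative success probability
for y in range(1, 6):
    print(y, geometric_pmf(y, p))
```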
Success?
• Of course, you need to be clear on what you
consider a “success”.
• For example, the 1st success might mean finding
the 1st defective item!
Sample points: (D), (G, D), (G, G, D), and so on, where D marks a defective item and G a non-defective one.
Geometric Mean, Variance
• If Y is a geometric random variable with
parameter p, the expected value and variance for Y
are given by
$$E(Y) = \frac{1}{p} \quad \text{and} \quad V(Y) = \frac{1 - p}{p^2}$$
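Because the geometric sums are infinite, a numerical check has to truncate them. The sketch below assumes p = 0.2 (an illustrative value) and stops at 2000 terms, where the remaining tail is negligible.

```python
p = 0.2  # illustrative success probability, not from the slides
q = 1 - p

# Truncate the infinite sums; q**2000 is far below floating-point precision.
terms = range(1, 2001)

mean = sum(y * q**(y - 1) * p for y in terms)              # approximates E(Y)
second_moment = sum(y**2 * q**(y - 1) * p for y in terms)  # approximates E(Y^2)
variance = second_moment - mean**2

print(mean, 1 / p)               # compare with 1/p
print(variance, (1 - p) / p**2)  # compare with (1 - p)/p^2
```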
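Deriving the Mean
$$E(Y) = \sum_{y=1}^{\infty} y\, p(y) = \sum_{y=1}^{\infty} y\, (q^{y-1})(p), \quad \text{where } q = 1 - p$$
$$= p \sum_{y=1}^{\infty} y\, q^{y-1}$$
$$= p\,(1 + 2q + 3q^2 + 4q^3 + \cdots)$$
$$= p\, \frac{d}{dq}\,(q + q^2 + q^3 + q^4 + \cdots)$$
$$= p\, \frac{d}{dq} \left[ \frac{q}{1 - q} \right] = p \left[ \frac{1}{(1 - q)^2} \right] = \frac{1}{p}$$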
Deriving Variance
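Using the “trick” of finding E[Y(Y - 1)] to get E(Y^2)…
$$E[Y(Y - 1)] = \sum_{y=1}^{\infty} y(y - 1)\, p(y) = \sum_{y=2}^{\infty} y(y - 1)(q^{y-1})(p)$$
$$= pq \sum_{y=2}^{\infty} y(y - 1)\, q^{y-2}$$
$$= pq\,(2 + 6q + 12q^2 + \cdots)$$
$$= pq\, \frac{d^2}{dq^2}\,(q + q^2 + q^3 + q^4 + \cdots)$$
$$= pq\, \frac{d^2}{dq^2} \left[ \frac{q}{1 - q} \right] = pq \left[ \frac{2}{(1 - q)^3} \right] = \frac{2q}{p^2}$$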
Deriving Variance
Now, forming the second moment, E(Y^2)…
$$E(Y^2) = E[Y(Y - 1)] + E[Y] = \frac{2q}{p^2} + \frac{1}{p} = \frac{2q + p}{p^2}$$
And so, we find the variance…
$$V(Y) = E(Y^2) - [E(Y)]^2 = \frac{2q + p}{p^2} - \left( \frac{1}{p} \right)^2 = \frac{2q + p - 1}{p^2} = \frac{1 - p}{p^2}$$
At least ‘a’ trials? (#3.55)
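• For a geometric random variable and a positive integer a,
show
$P(Y > a) = q^a$
• Consider
$$P(Y > a) = 1 - P(Y \le a) = 1 - p\,(1 + q + q^2 + \cdots + q^{a-1}) = 1 - p \cdot \frac{1 - q^a}{1 - q} = q^a,$$
based on the sum of a geometric series and the fact that 1 - q = p.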
At least b more trials?
• Based on the result, it follows
$P(Y > a + b) = q^{a+b}$
• Also, the conditional probability
$$P(Y > a + b \mid Y > a) = \frac{P(Y > a + b)}{P(Y > a)} = \frac{q^{a+b}}{q^a} = q^b = P(Y > b)$$
“the memoryless property”
No Memory?
• For the geometric distribution
$P(Y > a + b \mid Y > a) = q^b = P(Y > b)$
• This implies
$P(Y > 7 \mid Y > 2) = q^5 = P(Y > 5)$
“knowing the first two trials were failures, the
probability a success won’t occur on the next
5 trials”
as compared to
“just starting the trials and a success won’t occur
on the first 5 trials”
same probability?!
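The two probabilities really are the same, which a short numerical check can confirm. The sketch below computes the tail probability P(Y > a) directly from the geometric probability function, without using the closed form $q^a$. The value p = 0.3 and the helper name `geometric_tail` are illustrative assumptions.

```python
def geometric_tail(a, p):
    """P(Y > a), computed as 1 minus the sum of p(1), ..., p(a)."""
    q = 1 - p
    return 1 - sum(q**(y - 1) * p for y in range(1, a + 1))

p = 0.3  # illustrative success probability
q = 1 - p
a, b = 2, 5

# P(Y > a + b | Y > a) = P(Y > a + b) / P(Y > a), since {Y > a + b} implies {Y > a}.
conditional = geometric_tail(a + b, p) / geometric_tail(a, p)
unconditional = geometric_tail(b, p)

print(conditional, unconditional, q**b)
```

All three printed values agree up to floating-point rounding: having already seen two failures does not change the chance of waiting more than five further trials.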
Estimating p (example 3.13)
• We are considering implementing a new policy in a large
company, so we ask employees whether or not
they favor the new policy.
• Suppose the first four reject the new policy, but
the 5th individual is in favor of the policy.
What does this tell us about the percentage of
employees we might expect to favor the policy?
Can we estimate the probability p of getting a
favorable vote on any given trial?
What value of p is most likely?
• We wish to find the value of p which would
make it highly probable that the 5th individual
turns out to be the first “success”.
• That is, let’s maximize the probability of
finding the first success on trial 5, where
$p(5) = (1 - p)^4 p$
• For what value of p is this probability a max?
Find the Extrema
Using the derivative to locate the maximum
$$\frac{d}{dp} \left[ (1 - p)^4 p \right] = (1 - p)^3 (1 - 5p)$$
The derivative is zero and the probability is at its
maximum when p = 0.2
“the method of maximum likelihood”
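The calculus result can be cross-checked with a simple grid search over candidate values of p. This is only a sketch: the grid spacing of 0.001 and the function name `likelihood` are our own choices, and a grid search is a stand-in for the derivative argument, not a replacement for it.

```python
def likelihood(p):
    """Probability that the first success occurs on trial 5: (1 - p)^4 p."""
    return (1 - p)**4 * p

# Candidate values 0.001, 0.002, ..., 0.999.
grid = [k / 1000 for k in range(1, 1000)]

# The maximum likelihood estimate is the grid point with the largest likelihood.
p_hat = max(grid, key=likelihood)

print(p_hat, likelihood(p_hat))
```

The search lands on p = 0.2, matching the value where the derivative vanishes.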