math2200
• Suppose Shaq makes 45.1% of his free throws on average
• Let X be the number of free throws Shaq needs to shoot until he makes one
• Pr(X=2)=?
• Pr(X=5)=?
• E(X)=?
• Only two possible outcomes
– Success or failure
• Probability of success, denoted by p , is the same for every trial
• The trials are independent
• Examples
– tossing a coin
– Free throw in a basketball game
• Be careful when sampling without replacement from a finite population
• Strictly speaking, these draws are not independent
• But if the population is large enough, we can treat them as independent
– Rule of thumb (10% Condition): the sample size is smaller than 10% of the population
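To see why the 10% rule of thumb is reasonable, the sketch below compares the exact probability of drawing five successes in a row without replacement with the independent-trials value p^5. The population sizes (20 and 1,000) and the 40% share of successes are made-up numbers chosen only for illustration.

```python
def prob_all_success_without_replacement(pop_size: int, n_success: int, draws: int) -> float:
    """Exact P(every draw is a success) when sampling without replacement."""
    prob = 1.0
    for i in range(draws):
        # After i successes are removed, fewer successes and fewer items remain.
        prob *= (n_success - i) / (pop_size - i)
    return prob


def prob_all_success_independent(p: float, draws: int) -> float:
    """P(every draw is a success) if the draws were independent Bernoulli trials."""
    return p ** draws


draws = 5
# Hypothetical populations, both with 40% successes.
small = prob_all_success_without_replacement(20, 8, draws)       # sample is 25% of the population
large = prob_all_success_without_replacement(1000, 400, draws)   # sample is 0.5% of the population
indep = prob_all_success_independent(0.4, draws)

print(f"small population: {small:.5f}")
print(f"large population: {large:.5f}")
print(f"independent:      {indep:.5f}")
```

With the small population the two answers differ substantially; with the large one they nearly agree, which is the behavior the 10% Condition relies on.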
• How long does it take to achieve a success in Bernoulli trials?
• A Geometric probability model tells us the probability for a random variable that counts the number of Bernoulli trials until the first success
• Geom(p)
– p = probability of success
– q = 1 − p = probability of failure
– X: number of trials until the first success occurs
– $P(X = x) = q^{x-1} p$
– $E(X) = 1/p$
– $Var(X) = q/p^2$
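The Geometric formulas can be applied to the opening Shaq questions. The following is a minimal sketch; the function names are illustrative and not part of the slides, and X is taken to follow Geom(0.451).

```python
def geom_pmf(p: float, x: int) -> float:
    """P(X = x): x - 1 failures followed by the first success on trial x."""
    if x < 1:
        return 0.0
    return (1 - p) ** (x - 1) * p


def geom_mean(p: float) -> float:
    """E(X) = 1/p."""
    return 1 / p


def geom_var(p: float) -> float:
    """Var(X) = q/p^2 with q = 1 - p."""
    return (1 - p) / p ** 2


p = 0.451  # Shaq's free-throw success rate from the slides
print(f"P(X=2) = {geom_pmf(p, 2):.4f}")
print(f"P(X=5) = {geom_pmf(p, 5):.4f}")
print(f"E(X)   = {geom_mean(p):.3f}")
print(f"Var(X) = {geom_var(p):.3f}")
```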
• What is the probability that Shaq makes at least one successful throw in the first four attempts?
– $1 - P(\text{NNNN}) = 1 - (1 - 0.451)^4 = 0.9092$
– P(X=1)+P(X=2)+P(X=3)+P(X=4)
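Both routes to the answer can be checked numerically. This short sketch assumes X ~ Geom(0.451), as in the Shaq example, and confirms that the complement rule and the direct sum agree.

```python
p = 0.451
q = 1 - p

# Route 1: complement of four misses in a row.
via_complement = 1 - q ** 4

# Route 2: add the Geometric probabilities P(X=1) + ... + P(X=4).
via_sum = sum(q ** (x - 1) * p for x in range(1, 5))

print(f"complement rule: {via_complement:.4f}")
print(f"direct sum:      {via_sum:.4f}")
```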
• A Binomial model tells us the probability for a random variable that counts the number of successes in a fixed number of Bernoulli trials.
• Binom(n, p)
• Let X be the number of successes in n Bernoulli trials
• p = probability of success
• q = 1 − p = probability of failure
• In n trials, there are $_nC_k = \frac{n!}{k!\,(n-k)!}$ ways to have k successes.
– Read $_nC_k$ as “n choose k.”
• Note: $n! = n \times (n-1) \times \cdots \times 2 \times 1$, and n! is read as “n factorial.”
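As a quick check of the counting formula, the sketch below computes "n choose k" from factorials and compares it with a brute-force count of the ways to place k successes among n trials. The values n = 5 and k = 2 are arbitrary illustration values.

```python
from itertools import combinations
from math import comb, factorial


def n_choose_k(n: int, k: int) -> int:
    """Number of ways to have k successes in n trials: n! / (k! (n-k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))


n, k = 5, 2  # arbitrary small example
by_formula = n_choose_k(n, k)
# Enumerate every choice of which k trials are the successes.
by_enumeration = sum(1 for _ in combinations(range(n), k))

print(by_formula, by_enumeration, comb(n, k))
```

Python's built-in `math.comb` returns the same count without computing the factorials explicitly.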
Binomial probability model for Bernoulli trials: Binom(n, p)
– n = number of trials
– p = probability of success
– q = 1 − p = probability of failure
– X = number of successes in n trials
– $P(X = x) = {}_nC_x \, p^x q^{n-x}$, where $_nC_x = \frac{n!}{x!\,(n-x)!}$
– $E(X) = np$, $Var(X) = npq$ (so $SD(X) = \sqrt{npq}$)
• Use P(X=x) directly
• A Binomial random variable can be viewed as the sum of the outcomes of n Bernoulli trials
• Let Y1,…,Yn be the outcomes of the individual Bernoulli trials (1 for success, 0 for failure)
• E(Y1)=…=E(Yn)=p*1+q*0=p
• Var(Y1)=…=Var(Yn) = $(1-p)^2 p + (0-p)^2 q = pq$
• Suppose Y1,…,Yn are independent and have the same mean µ and variance $\sigma^2$
• Let X = Y1+…+Yn
• E(X) = E(Y1)+…+E(Yn) = nµ
• Var(X) = Var(Y1)+…+Var(Yn) = $n\sigma^2$
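The sum argument above gives E(X) = np and Var(X) = npq for a Binomial X. As a sanity check, the following sketch computes the mean and variance straight from the Binomial probabilities and compares them with those formulas. The choice n = 20, p = 0.451 borrows the Shaq numbers; any other values would do.

```python
from math import comb


def binom_pmf(n: int, p: float, x: int) -> float:
    """P(X = x) for X ~ Binom(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 20, 0.451
q = 1 - p

pmf = [binom_pmf(n, p, x) for x in range(n + 1)]

# Mean and variance from the definition, summing over every possible value of X.
mean = sum(x * px for x, px in enumerate(pmf))
var = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(f"mean from pmf = {mean:.4f}, np  = {n * p:.4f}")
print(f"var from pmf  = {var:.4f}, npq = {n * p * q:.4f}")
```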
• If Shaq shoots 20 free throws, what is the probability that he makes no more than two?
• Binom(n,p), p=0.451, n=20
• P(X ≤ 2) = P(X=0) + P(X=1) + P(X=2) = 0.0009
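The three terms in this sum can be computed one by one. This is a small sketch using the slide's values n = 20 and p = 0.451; the variable names are illustrative.

```python
from math import comb

n, p = 20, 0.451
q = 1 - p

# P(X = 0), P(X = 1), P(X = 2) for X ~ Binom(20, 0.451)
terms = [comb(n, x) * p ** x * q ** (n - x) for x in range(3)]
prob_at_most_two = sum(terms)

for x, term in enumerate(terms):
    print(f"P(X={x}) = {term:.6f}")
print(f"P(X<=2) = {prob_at_most_two:.4f}")
```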
• If X ~ Binomial(n,p), n=10000
• P(X<2000)=?
• When dealing with a large number of trials in a Binomial situation, making direct calculations of the probabilities becomes tedious (or outright impossible).
• When n is large and np is neither too small nor too large, Binomial(n,p) looks similar to a Normal model with mean = np and variance = npq
• P(X<2000)=P(Z<(2000-np)/sqrt(npq))
• Success/failure condition : np>=10 and nq>=10
• When we use the Normal model to approximate the Binomial model, we are using a continuous random variable to approximate a discrete random variable.
• So, when we use the Normal model, we no longer calculate the probability that the random variable equals a particular value, but only that it lies between two values.
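The Normal approximation can be compared with the exact Binomial probability. The slides do not give p for the n = 10000 example, so p = 0.2 below is an assumption made purely for illustration. The exact sum works on the log scale (via `lgamma`) to avoid overflow in the binomial coefficients.

```python
from math import erf, exp, lgamma, log, sqrt


def normal_cdf(z: float) -> float:
    """Standard Normal CDF, P(Z <= z), written with the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def binom_log_pmf(n: int, p: float, x: int) -> float:
    """log P(X = x) for X ~ Binom(n, p); the log scale avoids huge factorials."""
    return (lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
            + x * log(p) + (n - x) * log(1 - p))


n, p = 10_000, 0.2   # p is an assumed value; the slides leave it unspecified
cutoff = 2000

mean = n * p
sd = sqrt(n * p * (1 - p))

# Success/failure condition from the slides: np >= 10 and nq >= 10.
assert n * p >= 10 and n * (1 - p) >= 10

approx = normal_cdf((cutoff - mean) / sd)                           # P(Z < (2000 - np)/sqrt(npq))
exact = sum(exp(binom_log_pmf(n, p, x)) for x in range(cutoff))     # P(X <= 1999)

print(f"Normal approximation: {approx:.4f}")
print(f"Exact Binomial:       {exact:.4f}")
```

The small gap between the two numbers reflects the point made above: a continuous model is standing in for a discrete one.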
• For small p and large n, even when np<10, we can approximate Binomial(n,p) by Poisson(np)
• With λ = np, we can use the Poisson model to approximate the probability.
• Poisson(λ)
– λ : mean number of occurrences
– X: number of occurrences
– $P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$
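To illustrate how closely Poisson(np) tracks Binomial(n, p) when p is small and n is large, the sketch below tabulates both side by side. The values n = 1000 and p = 0.002 are hypothetical, chosen so that np = 2 is below 10.

```python
from math import comb, exp, factorial


def poisson_pmf(lam: float, x: int) -> float:
    """P(X = x) for X ~ Poisson(lam): e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam ** x / factorial(x)


def binom_pmf(n: int, p: float, x: int) -> float:
    """P(X = x) for X ~ Binom(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 1000, 0.002   # hypothetical: large n, small p
lam = n * p          # Poisson mean

comparison = [(x, binom_pmf(n, p, x), poisson_pmf(lam, x)) for x in range(6)]

for x, b, po in comparison:
    print(f"x={x}: Binomial {b:.5f}   Poisson {po:.5f}")
```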
• Although it was originally an approximation to the Binomial, the Poisson model is also used directly to model the probability of the occurrence of events for a variety of phenomena.
– It’s a good model to consider whenever your data consist of counts of occurrences.
– It requires only that the events be independent and that the mean number of occurrences stays constant.
• It scales to the sample size
– The average occurrence in a sample of size 35,000 is 3.85
– The average occurrence in a sample of size 3,500 is 0.385
• Occurrence of the past events doesn’t change the probability of future events
– Even though the events appear to cluster, the probability of another event occurring is still the same
• In 1946, the British statistician R.D. Clarke studied the distribution of hits of flying bombs in London during World War II.
• He wanted to know whether the Germans were targeting particular districts or whether the distribution of hits was due to chance.
• Clarke began by dividing an area into hundreds of tiny, equally sized plots.
• The average number of hits per square is then 537/576 = 0.9323
# of hits | # of plots with that many hits | Poisson fit
0 | 229 | 226.7
1 | 211 | 211.4
2 | 93 | 98.5
3 | 35 | 30.6
4 | 7 | 7.1
5 | 1 | 1.6
• No need to move people from one sector to another, even after several hits!
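The Poisson fit can be reproduced from the two totals on the slide, 537 hits over 576 plots. The sketch below is one way to do it. Treating the last row as "five or more hits" is an assumption on my part: it is what makes the computed value round to the tabulated 1.6, whereas exactly five hits would give a smaller expected count.

```python
from math import exp, factorial


def poisson_pmf(lam: float, x: int) -> float:
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** x / factorial(x)


total_hits = 537
plots = 576
observed = [229, 211, 93, 35, 7, 1]   # plots with 0, 1, 2, 3, 4, 5 hits

lam = total_hits / plots              # average hits per plot

# Expected number of plots with 0..4 hits under Poisson(lam).
expected = [plots * poisson_pmf(lam, k) for k in range(5)]
# Assumption: the final row pools everything from five hits upward.
expected.append(plots - sum(expected))

for k, (obs, exp_count) in enumerate(zip(observed, expected)):
    print(f"{k} hits: observed {obs:3d}, Poisson fit {exp_count:6.1f}")
```

The observed and fitted counts sit close together in every row, which is the basis for concluding that the hits were consistent with chance.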
• Be sure you have Bernoulli trials.
– You need two outcomes per trial, a constant probability of success, and independence.
– Remember that the 10% Condition provides a reasonable substitute for independence.
• Don’t confuse Geometric and Binomial models.
• Don’t use the Normal approximation with small n .
– You need at least 10 successes and 10 failures to use the Normal approximation.
– Geometric model
• When we’re interested in the number of Bernoulli trials until the next success.
– Binomial model
• When we’re interested in the number of successes in a certain number of Bernoulli trials.
– Normal model
• To approximate a Binomial model when we expect at least 10 successes and 10 failures.
– Poisson model
• To approximate a Binomial model when the probability of success, p , is very small and the number of trials, n , is very large.
• 2nd + VARS (DISTR)
• pdf: P(X=x)
– geometpdf(prob,x)
– binompdf(n,prob,x)
– poissonpdf(mean,x)
• cdf: P(X<=x)
– geometcdf(prob,x)
– binomcdf(n,prob,x)
– poissoncdf(mean,x)
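For readers without the calculator, the same six quantities can be sketched in Python. The functions below are stand-ins that borrow the calculator names and argument order for convenience. They are not the calculator's own implementation and assume x is a non-negative integer (at least 1 for the Geometric functions).

```python
from math import comb, exp, factorial


def geometpdf(prob: float, x: int) -> float:
    """P(X = x): first success on trial x."""
    return (1 - prob) ** (x - 1) * prob


def geometcdf(prob: float, x: int) -> float:
    """P(X <= x): at least one success in the first x trials."""
    return 1 - (1 - prob) ** x


def binompdf(n: int, prob: float, x: int) -> float:
    """P(X = x): exactly x successes in n trials."""
    return comb(n, x) * prob ** x * (1 - prob) ** (n - x)


def binomcdf(n: int, prob: float, x: int) -> float:
    """P(X <= x): at most x successes in n trials."""
    return sum(binompdf(n, prob, k) for k in range(x + 1))


def poissonpdf(mean: float, x: int) -> float:
    """P(X = x): exactly x occurrences."""
    return exp(-mean) * mean ** x / factorial(x)


def poissoncdf(mean: float, x: int) -> float:
    """P(X <= x): at most x occurrences."""
    return sum(poissonpdf(mean, k) for k in range(x + 1))


# Examples drawn from earlier slides:
print(f"{geometcdf(0.451, 4):.4f}")       # at least one make in four attempts
print(f"{binomcdf(20, 0.451, 2):.4f}")    # no more than two makes in 20 attempts
```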