Probability and Statistics
Lecture 6
Dr.-Ing. Erwin Sitompul
President University
http://zitompul.wordpress.com
President University
Erwin Sitompul
PBST 6/1
Chapter 5
Some Discrete Probability Distributions
Chapter 5.1
Introduction
• Often, the observations generated by different statistical experiments have the same general type of behavior.
• The discrete random variables associated with these experiments can be described by essentially the same probability distribution, expressed in a single formula.
• In fact, only a handful of important probability distributions are needed to describe many of the discrete random variables encountered in practice.
• This chapter presents these commonly used distributions with various examples.
Chapter 5.2
Discrete Uniform Distribution
• If the random variable X assumes the values x1, x2, ..., xk with equal probabilities, then the discrete uniform distribution is given by
$$f(x; k) = \frac{1}{k}, \quad x = x_1, x_2, \ldots, x_k$$
When a light bulb is selected at random from a box that contains a
40-watt bulb, a 60-watt bulb, a 75-watt bulb, and a 100-watt bulb,
each element of the sample space S = {40, 60, 75, 100} occurs with
probability 1/4.
Therefore, we have a uniform distribution, with
$$f(x; 4) = \frac{1}{4}, \quad x = 40, 60, 75, 100$$
When a die is tossed, each element of the sample space S = {1, 2, 3, 4, 5, 6} occurs with probability 1/6.
Therefore, we have a uniform distribution with
$$f(x; 6) = \frac{1}{6}, \quad x = 1, 2, 3, 4, 5, 6$$
Chapter 5.3
Binomial and Multinomial Distributions
Bernoulli Process
• An experiment often consists of repeated trials, each with two possible outcomes that may be labeled success or failure. We may choose to define either outcome as a success.
• The process is referred to as a Bernoulli process.
• Each trial is called a Bernoulli trial.
• Strictly speaking, the Bernoulli process must possess the following properties:
1. The experiment consists of n repeated trials.
2. Each trial results in an outcome that may be classified as a
success or a failure.
3. The probability of success, denoted by p, remains constant from
trial to trial.
4. The repeated trials are independent.
Consider the set of Bernoulli trials where three items are selected at random from a manufacturing process, inspected, and classified as defective (D) or non-defective (N). A defective item is designated a success.
The number of successes is a random variable X assuming integer values from 0 to 3.
The items are selected independently from a process that we shall assume produces 25% defectives.
The probability of the outcome NDN can be calculated as
$$P(NDN) = P(N)\,P(D)\,P(N) = \left(\frac{3}{4}\right)\left(\frac{1}{4}\right)\left(\frac{3}{4}\right) = \frac{9}{64}$$
The probabilities for the other possible outcomes can be calculated in the same way to obtain the probability distribution of X.
• The number X of successes in n Bernoulli trials is called a binomial random variable.
• The probability distribution of this discrete random variable is called the binomial distribution, and is denoted by b(x; n, p).
$$P(X = 2) = f(2) = b\left(2; 3, \tfrac{1}{4}\right) = \frac{9}{64}$$
Binomial Distribution
• |Binomial Distribution| A Bernoulli trial can result in a success with probability p and a failure with probability q = 1 – p. Then the probability distribution of the binomial random variable X, the number of successes in n independent trials, is
$$b(x; n, p) = {}_{n}C_{x}\, p^{x} q^{n-x}, \quad x = 0, 1, 2, \ldots, n$$
• The mean and variance of the binomial distribution b(x; n, p) are
$$\mu = np, \qquad \sigma^{2} = npq$$
The probability that a certain kind of component will survive a given shock test is 3/4. Find the probability that exactly 2 of the next 4 components tested will survive.
$$p = \frac{3}{4}$$
$$b\left(2; 4, \tfrac{3}{4}\right) = {}_{4}C_{2} \left(\frac{3}{4}\right)^{2} \left(\frac{1}{4}\right)^{2} = \frac{54}{256}$$
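The shock-test calculation can be checked numerically. The following is a minimal Python sketch, not part of the original lecture; the function name `binomial_pmf` is a hypothetical helper chosen for illustration.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Shock-test example: exactly 2 of 4 components survive, with p = 3/4.
prob = binomial_pmf(2, 4, 0.75)

# Mean and variance of b(x; 4, 3/4) from mu = np and sigma^2 = npq.
mean = 4 * 0.75
variance = 4 * 0.75 * 0.25

print(prob, mean, variance)
```

The printed probability equals 54/256, in agreement with the hand calculation.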
The probability that a patient recovers from a rare blood disease is
0.4. If 15 people are known to have contracted this disease, what is
the probability that (a) at least 10 survive, (b) from 3 to 8 survive,
and (c) exactly 5 survive?
Let X be the number of people who survive. Table A.1 can be used to evaluate the cumulative sums.
(a)
$$P(X \ge 10) = 1 - P(X < 10) = 1 - \sum_{x=0}^{9} b(x; 15, 0.4) = 1 - 0.9662 = 0.0338$$
Can you calculate $\sum_{x=10}^{15} b(x; 15, 0.4)$ manually?
(b)
$$P(3 \le X \le 8) = \sum_{x=3}^{8} b(x; 15, 0.4) = \sum_{x=0}^{8} b(x; 15, 0.4) - \sum_{x=0}^{2} b(x; 15, 0.4) = 0.9050 - 0.0271 = 0.8779$$
(c)
$$P(X = 5) = b(5; 15, 0.4) = {}_{15}C_{5}\, (0.4)^{5} (0.6)^{10} = 0.1859$$
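The cumulative sums that Table A.1 tabulates can also be computed directly. This is a small Python sketch under the assumption that summing the probability function term by term is acceptable for n = 15; `binomial_pmf` and `binomial_cdf` are illustrative names, not functions from the lecture.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p) for a single value of x."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(r: int, n: int, p: float) -> float:
    """Sum of b(x; n, p) for x = 0..r (the kind of sum listed in Table A.1)."""
    return sum(binomial_pmf(x, n, p) for x in range(r + 1))


n, p = 15, 0.4
at_least_10 = 1 - binomial_cdf(9, n, p)                      # part (a)
from_3_to_8 = binomial_cdf(8, n, p) - binomial_cdf(2, n, p)  # part (b)
exactly_5 = binomial_pmf(5, n, p)                            # part (c)

# The direct tail sum answers the "manual" question for part (a).
tail_sum = sum(binomial_pmf(x, n, p) for x in range(10, 16))

print(round(at_least_10, 4), round(from_3_to_8, 4), round(exactly_5, 4))
```

Both routes to part (a), the complement and the direct tail sum, give the same value up to floating-point rounding.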
[Table A.1: Binomial Probability Sums]
A large chain retailer purchases a certain kind of electronic device from a manufacturer. The manufacturer indicates that the defective rate of the device is 3%.
(a) The inspector of the retailer randomly picks 20 items from a shipment. What is the probability that there will be at least one defective item among these 20?
(b) Suppose that the retailer receives 10 shipments in a month and the inspector randomly tests 20 devices per shipment. What is the probability that there will be exactly 3 shipments containing at least one defective device?
Let X be the number of defective devices among the 20 items.
(a)
$$P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - {}_{20}C_{0}\, (0.03)^{0} (1 - 0.03)^{20} = 0.4562$$
(b) Let Y be the number of shipments, out of 10, in which the test finds at least one defective device. Each shipment is such a shipment with probability p = 0.4562, so
$$P(Y = 3) = b(3; 10, 0.4562) = {}_{10}C_{3}\, (0.4562)^{3} (1 - 0.4562)^{10-3} = 0.1602$$
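The two-stage structure of this example (a binomial result feeding a second binomial) can be sketched in Python as follows. The variable names are illustrative assumptions; only the numbers 0.03, 20, 10, and 3 come from the example.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Stage 1: probability that a sample of 20 contains at least one defective.
p_flagged = 1 - binomial_pmf(0, 20, 0.03)

# Stage 2: each of the 10 shipments is "flagged" independently with that
# probability, so the number of flagged shipments is again binomial.
p_three_flagged = binomial_pmf(3, 10, p_flagged)

print(round(p_flagged, 4), round(p_three_flagged, 4))
```

Using the unrounded stage-1 probability instead of 0.4562 changes the final answer only beyond the fourth decimal place.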
It is conjectured that an impurity exists in 30% of all drinking wells
in a certain rural community. In order to gain some insight on this
problem, it is determined that some tests should be made. It is too
expensive to test all of the many wells in the area so 10 were
randomly selected for testing.
(a) Using the binomial distribution, what is the probability that
exactly three wells have the impurity assuming that the
conjecture is correct?
(b) What is the probability that more than three wells are impure?
(a)
$$P(X = 3) = b(3; 10, 0.3) = {}_{10}C_{3}\, (0.3)^{3} (1 - 0.3)^{10-3} = 0.2668$$
Try also to use Table A.1 to find this value.
(b)
$$P(X > 3) = 1 - P(X \le 3) = 1 - \sum_{x=0}^{3} b(x; 10, 0.3) = 1 - (0.0282 + 0.1211 + 0.2335 + 0.2668) = 0.3504$$
Consider the previous “drinking wells” example. The “30% are
impure” is merely a conjecture put forth by the area water board.
Suppose 10 wells are randomly selected and 6 are found to contain
the impurity. What does this imply about the conjecture? Use a
probability statement.
$$P(X = 6) = b(6; 10, 0.3) = {}_{10}C_{6}\, (0.3)^{6} (1 - 0.3)^{10-6} = 0.0368$$
• If the 30% impurity conjecture is true, there is only a 3.68% chance that exactly 6 of the 10 selected wells are found contaminated.
• The investigation therefore suggests that the impurity problem is more severe than the conjectured 30%.
• The binomial experiment becomes a multinomial experiment if we let each trial have more than 2 possible outcomes.
• |Multinomial Distribution| If a given trial can result in the k outcomes E1, E2, ..., Ek with probabilities p1, p2, ..., pk, then the probability distribution of the random variables X1, X2, ..., Xk, representing the numbers of occurrences of E1, E2, ..., Ek in n independent trials, is
$$f(x_1, x_2, \ldots, x_k; p_1, p_2, \ldots, p_k, n) = \frac{n!}{x_1!\, x_2! \cdots x_k!}\; p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$$
with
$$\sum_{i=1}^{k} x_i = n \quad \text{and} \quad \sum_{i=1}^{k} p_i = 1$$
The complexity of arrivals and departures at an airport is such that computer simulation is often used to model the “ideal” conditions.
For a certain airport containing three runways, it is known that in the ideal setting the probabilities that the individual runways are accessed by a randomly arriving commercial jet are 2/9, 1/6, and 11/18 for runway 1, runway 2, and runway 3, respectively.
If there are 6 randomly arriving airplanes, what is the probability that 2 airplanes will land on runway 1, 1 on runway 2, and 3 on runway 3?
$$f\left(2, 1, 3; \tfrac{2}{9}, \tfrac{1}{6}, \tfrac{11}{18}, 6\right) = \frac{6!}{2!\,1!\,3!} \left(\frac{2}{9}\right)^{2} \left(\frac{1}{6}\right)^{1} \left(\frac{11}{18}\right)^{3} = 0.1127$$
What is the probability that 2 airplanes will land on runway 1?
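The runway calculation can be reproduced with exact fractions. This is a minimal sketch; `multinomial_pmf` is a hypothetical helper name, and the use of `fractions.Fraction` is a choice made here to avoid rounding, not something prescribed by the lecture.

```python
from fractions import Fraction
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """f(x1..xk; p1..pk, n) with n = sum(counts)."""
    n = sum(counts)
    coefficient = factorial(n)
    for x in counts:
        coefficient //= factorial(x)  # stays an exact integer at every step
    return coefficient * prod(p**x for p, x in zip(probs, counts))


runway_probs = [Fraction(2, 9), Fraction(1, 6), Fraction(11, 18)]
assert sum(runway_probs) == 1  # the p_i must sum to 1

landing = multinomial_pmf([2, 1, 3], runway_probs)
print(landing, float(landing))
```

The result is an exact fraction whose decimal value rounds to 0.1127.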
Chapter 5.4
Hypergeometric Distribution
• As opposed to the binomial distribution, the hypergeometric distribution is based on sampling done without replacement. Independence among trials is not required.
• Applications for the hypergeometric distribution are found in many areas, with heavy use in acceptance sampling, electronic sampling, and quality assurance.
• The experiment where the hypergeometric distribution applies must possess the following two properties:
1. A random sample of size n is selected without replacement from N items.
2. k of the N items may be classified as successes and N–k are classified as failures.
• The number X of successes of a hypergeometric experiment is called a hypergeometric random variable.
• The hypergeometric distribution of such a variable is denoted by h(x; N, n, k).
A particular part that is used as an injection device is sold in lots of 10. The producer feels that a lot is acceptable if no more than one defective is in the lot.
Some lots are sampled, and the sampling plan involves randomly selecting and testing 3 of the 10 parts. If none of the 3 is defective, the lot is accepted. Comment on the utility of this plan.
If the lot contains 2 defectives:
$$P(X = 0) = \frac{{}_{2}C_{0}\; {}_{8}C_{3}}{{}_{10}C_{3}} = \frac{56}{120} = 0.467$$
• Even with 2 defectives, there is still a 46.7% chance that the lot is accepted.
If the lot contains 3 defectives:
$$P(X = 0) = \frac{{}_{3}C_{0}\; {}_{7}C_{3}}{{}_{10}C_{3}} = \frac{35}{120} \approx 0.292$$
• Even with 3 defectives, there is still a 29.2% chance that the lot is accepted.
• In conclusion, this quality-control plan is faulty: an unacceptable lot can still be accepted with high probability.
• A sample of 3 is not enough. The sample size must be increased.
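The acceptance probabilities, and the effect of a larger sample, can be explored with a short Python sketch. It assumes the hypergeometric probability function h(x; N, n, k) in its usual form; `hypergeom_pmf` is an illustrative name, and the sweep over sample sizes goes beyond the original example.

```python
from math import comb


def hypergeom_pmf(x: int, N: int, n: int, k: int) -> float:
    """h(x; N, n, k): x successes in a sample of n drawn without
    replacement from N items, k of which are successes."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


# Probability that a lot of 10 is accepted (no defective among 3 tested).
accept_with_2 = hypergeom_pmf(0, 10, 3, 2)  # 56/120
accept_with_3 = hypergeom_pmf(0, 10, 3, 3)  # 35/120
print(accept_with_2, accept_with_3)

# How the acceptance probability of a lot with 2 defectives falls
# as the sample size grows.
for sample_size in range(3, 8):
    print(sample_size, hypergeom_pmf(0, 10, sample_size, 2))
```

The loop illustrates the closing remark: the chance of accepting a bad lot drops steadily as more parts are tested.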
• The probability distribution of the hypergeometric random variable X, the number of successes in a random sample of size n selected from N items of which k are labeled success and N–k labeled failure, is
$$h(x; N, n, k) = \frac{{}_{k}C_{x}\; {}_{N-k}C_{n-x}}{{}_{N}C_{n}}, \quad x = 0, 1, 2, \ldots, n$$
• The mean and variance of the hypergeometric distribution h(x; N, n, k) are
$$\mu = \frac{nk}{N} \quad \text{and} \quad \sigma^{2} = \frac{N-n}{N-1} \cdot n \cdot \frac{k}{N} \left(1 - \frac{k}{N}\right)$$
Lots of 40 components each are called unacceptable if they contain
as many as 3 defectives or more. The procedure for sampling the lot
is to select 5 components at random and to reject the lot if a
defective is found.
(a) What is the probability that exactly 1 defective is found in the
sample if there are 3 defectives in the entire lot?
(b) Find the mean and variance of the random variable and use
Chebyshev’s theorem to interpret the interval μ ± 2σ.
(a)
$$h(1; 40, 5, 3) = \frac{{}_{3}C_{1}\; {}_{37}C_{4}}{{}_{40}C_{5}} = 0.3011$$
• Again, this method of testing is not acceptable, since it detects a bad lot (with 3 defectives) only about 30% of the time.
(b)
$$\mu = \frac{nk}{N} = \frac{(5)(3)}{40} = \frac{3}{8} = 0.375$$
$$\sigma^{2} = \frac{N-n}{N-1} \cdot n \cdot \frac{k}{N} \left(1 - \frac{k}{N}\right) = \left(\frac{40-5}{40-1}\right) (5) \left(\frac{3}{40}\right) \left(1 - \frac{3}{40}\right) = 0.3113, \qquad \sigma = 0.558$$
$$\mu \pm 2\sigma = 0.375 \pm (2)(0.558)$$
• By Chebyshev's theorem, at least 3/4 of the time the number of defectives in the sample will be between –0.741 and 1.491 components, that is, either 0 or 1.
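The numbers in this example follow directly from the formulas for h(x; N, n, k), its mean, and its variance. The Python sketch below recomputes them; the variable names are chosen here for illustration only.

```python
from math import comb, sqrt

N, n, k = 40, 5, 3  # lot size, sample size, defectives in the lot

# (a) exactly one defective in the sample
h1 = comb(k, 1) * comb(N - k, n - 1) / comb(N, n)

# (b) mean, variance, and the Chebyshev interval mu +/- 2 sigma
mu = n * k / N
variance = (N - n) / (N - 1) * n * (k / N) * (1 - k / N)
sigma = sqrt(variance)
low, high = mu - 2 * sigma, mu + 2 * sigma

print(round(h1, 4), mu, round(variance, 4), round(sigma, 3))
print(round(low, 3), round(high, 3))
```

Because the count of defectives cannot be negative or fractional, the interval effectively covers the values 0 and 1.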
• If the sample size n is small compared to the population size N, the nature of the N items changes very little with each draw, even though sampling is done without replacement.
• In this case, when n/N ≤ 0.05, the binomial distribution with p = k/N can be used to approximate the hypergeometric distribution.
A manufacturer of automobile tires reports that among a shipment of 5000 sent to a local distributor, 1000 are slightly blemished. If one purchases 10 of these tires at random from the distributor, what is the probability that exactly 3 are blemished?
Exact hypergeometric probability:
$$h(3; 5000, 10, 1000) = \frac{{}_{1000}C_{3}\; {}_{4000}C_{7}}{{}_{5000}C_{10}} = 0.2015$$
Approximation using the binomial distribution, with p = 1000/5000 = 1/5:
$$b\left(3; 10, \tfrac{1}{5}\right) = {}_{10}C_{3} \left(\frac{1}{5}\right)^{3} \left(\frac{4}{5}\right)^{10-3} = 0.2013$$
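The quality of the binomial approximation in the tire example can be checked numerically. This sketch relies on Python's arbitrary-precision integers, so the large binomial coefficients are computed exactly before the final division; the variable names are illustrative.

```python
from math import comb

N, n, k, x = 5000, 10, 1000, 3

# Exact hypergeometric probability h(3; 5000, 10, 1000).
exact = comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Binomial approximation with p = k/N, justified here because n/N <= 0.05.
p = k / N
approx = comb(n, x) * p**x * (1 - p) ** (n - x)

print(round(exact, 4), round(approx, 4), n / N)
```

Here n/N = 0.002, well below the 0.05 guideline, and the two values differ by only a few ten-thousandths.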
Chapter 5.5
Negative Binomial and Geometric Distributions
Negative Binomial Distribution
• Consider an experiment where the properties are the same as those listed for a binomial experiment, with the exception that the trials will be repeated until a fixed number of successes occur.
• We are interested in the probability that the kth success occurs on the xth trial.
• This kind of experiment is called a negative binomial experiment.
• The number X of trials to produce k successes in a negative binomial experiment is called a negative binomial random variable, and its probability distribution is called the negative binomial distribution.
• |Negative Binomial Distribution| If repeated independent trials can result in a success with probability p and a failure with probability q = 1–p, then the probability distribution of the random variable X, the number of the trial on which the kth success occurs, is
$$b^{*}(x; k, p) = {}_{x-1}C_{k-1}\, p^{k} q^{x-k}, \quad x = k, k+1, k+2, \ldots$$
In an NBA (National Basketball Association) championship series, the
team which wins four games out of seven will be the winner.
Suppose that team A has probability 0.55 of winning over the team
B and both teams A and B face each other in the championship
games.
(a) What is the probability that team A will win the series in six
games?
(b) What is the probability that team A will win the series?
(a)
$$b^{*}(6; 4, 0.55) = {}_{6-1}C_{4-1}\, (0.55)^{4} (0.45)^{6-4} = 0.1853$$
(b)
$$P(\text{team A wins the championship series}) = b^{*}(4; 4, 0.55) + b^{*}(5; 4, 0.55) + b^{*}(6; 4, 0.55) + b^{*}(7; 4, 0.55)$$
$$= 0.0915 + 0.1647 + 0.1853 + 0.1668 = 0.6083$$
(c) If both teams face each other in a regional playoff series and the
winner is decided by winning three out of five games, what is the
probability that team A will win a playoff?
(c)
$$P(\text{team A wins the regional series}) = b^{*}(3; 3, 0.55) + b^{*}(4; 3, 0.55) + b^{*}(5; 3, 0.55)$$
$$= 0.1664 + 0.2246 + 0.2021 = 0.5931$$
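The series probabilities are sums of negative binomial terms, which a few lines of Python can reproduce. The helper names `neg_binomial_pmf` and `series_win_probability` are assumptions made for this sketch, and the best-of-(2k–1) generalization is an illustration rather than part of the original example.

```python
from math import comb


def neg_binomial_pmf(x: int, k: int, p: float) -> float:
    """b*(x; k, p): probability that the k-th success occurs on trial x."""
    return comb(x - 1, k - 1) * p**k * (1 - p) ** (x - k)


def series_win_probability(k: int, p: float) -> float:
    """Probability of reaching k wins within a best-of-(2k - 1) series."""
    return sum(neg_binomial_pmf(x, k, p) for x in range(k, 2 * k))


win_in_six = neg_binomial_pmf(6, 4, 0.55)             # part (a)
win_best_of_seven = series_win_probability(4, 0.55)   # part (b)
win_best_of_five = series_win_probability(3, 0.55)    # part (c)

print(round(win_in_six, 4), round(win_best_of_seven, 4), round(win_best_of_five, 4))
```

The longer series favors the stronger team slightly more, which matches the comparison of parts (b) and (c).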
Geometric Distribution
• If we consider the special case of the negative binomial distribution where k = 1, we have a probability distribution for the number of trials required for a single success.
• If repeated independent trials can result in a success with probability p and a failure with probability q = 1–p, then the probability distribution of the random variable X, the number of the trial on which the first success occurs, is
$$g(x; p) = p\, q^{x-1}, \quad x = 1, 2, 3, \ldots$$
• The mean and variance of a random variable following the geometric distribution are
$$\mu = \frac{1}{p} \quad \text{and} \quad \sigma^{2} = \frac{1-p}{p^{2}}$$
In a certain manufacturing process it is known that, on the average,
1 in every 100 items is defective. What is the probability that the
fifth item inspected is the first defective item found?
$$g(5; 0.01) = (0.01)(0.99)^{4} = 0.0096$$
At “busy time” a telephone exchange is very near capacity, so callers
have difficulty placing their calls. It may be of interest to know the
number of attempts necessary in order to gain a connection.
Suppose that we let p = 0.05 be the probability of a connection
during busy time. We are interested in knowing the probability that 5
attempts are necessary for a successful call.
$$P(X = 5) = g(5; 0.05) = (0.05)(0.95)^{4} = 0.041$$
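Both geometric examples reduce to a single expression, sketched below in Python. The function name `geometric_pmf` is illustrative, and the mean number of attempts is an extra quantity computed from μ = 1/p rather than something asked for in the examples.

```python
def geometric_pmf(x: int, p: float) -> float:
    """g(x; p): probability that the first success occurs on trial x."""
    return p * (1 - p) ** (x - 1)


first_defective_on_fifth = geometric_pmf(5, 0.01)  # manufacturing example
connect_on_fifth_try = geometric_pmf(5, 0.05)      # telephone example

# Mean number of attempts needed for a connection, from mu = 1/p.
expected_attempts = 1 / 0.05

print(round(first_defective_on_fifth, 4), round(connect_on_fifth_try, 3), expected_attempts)
```

With p = 0.05 a caller needs 20 attempts on average, which puts the 4.1% chance of succeeding exactly on the fifth attempt in context.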
Chapter 5.6
Poisson Distribution and Poisson Process
• Experiments yielding numerical values of a random variable X, the number of outcomes occurring during a given time interval or in a specified region, are called Poisson experiments.
• The time interval may be of any length, such as a minute, a day, a week, or a month.
• The specified region may be a line segment, an area, a volume, or a piece of material.
Properties of Poisson Process
• A Poisson experiment is derived from the Poisson process and possesses the following properties:
1. The number of outcomes occurring in one time interval or
specified region is independent of the number that occurs in any
other disjoint time interval or region of space.
2. The probability that a single outcome will occur during a very
short time interval or in a small region is proportional to the
length of the time interval or the size of the region and does not
depend on the number of outcomes occurring outside this time
interval or region.
3. The probability that more than one outcome will occur in such a
short time interval or fall in such a small region is negligible
• |Poisson Distribution| The probability distribution of the Poisson random variable X, representing the number of outcomes occurring in a given time interval or specified region denoted by t, is
$$p(x; \lambda t) = \frac{e^{-\lambda t} (\lambda t)^{x}}{x!}, \quad x = 0, 1, 2, \ldots$$
where λ is the average number of outcomes per unit time or region, and e = 2.71828... is the base of the natural logarithm.
• The mean and variance of the Poisson distribution p(x; λt) both have the value λt.
During a laboratory experiment the average number of radioactive
particles passing through a counter in 1 millisecond is 4. What is the
probability that 6 particles enter the counter in a given millisecond?
$$x = 6, \quad \lambda t = 4$$
$$p(6; 4) = \frac{e^{-4} (4)^{6}}{6!} = 0.1042$$
Ten is the average number of oil tankers arriving each day at a
certain port city. The facilities at the port can handle at most 15
tankers per day. What is the probability that on a given day tankers
have to be turned away?
$$P(X > 15) = 1 - P(X \le 15) = 1 - \sum_{x=0}^{15} p(x; 10)$$
$$= 1 - \left( \frac{e^{-10} (10)^{0}}{0!} + \frac{e^{-10} (10)^{1}}{1!} + \frac{e^{-10} (10)^{2}}{2!} + \cdots + \frac{e^{-10} (10)^{15}}{15!} \right) = 1 - 0.9513 = 0.0487$$
Table A.2 can be used to evaluate the sum.
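The two Poisson examples can be verified without Table A.2 by summing the probability function directly. This is a minimal Python sketch; `poisson_pmf` and `poisson_cdf` are hypothetical helper names, and the argument `mu` stands for the product λt.

```python
from math import exp, factorial


def poisson_pmf(x: int, mu: float) -> float:
    """p(x; mu) with mu = lambda * t."""
    return exp(-mu) * mu**x / factorial(x)


def poisson_cdf(r: int, mu: float) -> float:
    """Sum of p(x; mu) for x = 0..r (the kind of sum listed in Table A.2)."""
    return sum(poisson_pmf(x, mu) for x in range(r + 1))


six_particles = poisson_pmf(6, 4)            # radioactive particle example
tankers_turned_away = 1 - poisson_cdf(15, 10)  # oil tanker example

print(round(six_particles, 4), round(tankers_turned_away, 4))
```

The cumulative sum includes the x = 0 term, consistent with the lower limit of the summation in the tanker example.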
[Table A.2: Poisson Probability Sums]
Poisson Distribution As a Limit of Binomial
• It should be clear from the three properties of the Poisson process that the Poisson distribution relates to the binomial distribution.
• In the case of the binomial, if n is quite large and p is small, the conditions begin to simulate the continuous space or time region implications of the Poisson process.
• The Poisson distribution can be taken as a limiting form of the binomial distribution when n → ∞ and p → 0, and np remains constant.
• If the conditions are fulfilled, the Poisson distribution can be used with μ = np to approximate the binomial distribution.
• Let X be a binomial random variable with probability distribution b(x; n, p). When n → ∞ and p → 0, and μ = np remains constant,
$$b(x; n, p) \to p(x; \mu)$$
In a certain industrial facility accidents occur infrequently. It is
known that the probability of an accident on any given day is 0.005
and accidents are independent of each other.
(a) What is the probability that in any given period of 400 days there
will be an accident on one day?
(b) What is the probability that there are at most three days with an
accident?
$$\lambda t = (0.005)(400) = 2$$
(a) Considered as a Poisson process, with x = 1:
$$p(1; 2) = \frac{e^{-2} (2)^{1}}{1!} = 0.2707$$
Considered as a Bernoulli process:
$$b(1; 400, 0.005) = {}_{400}C_{1}\, (0.005)^{1} (0.995)^{399} = 0.2707$$
(b) Considered as a Poisson process:
$$P(X \le 3) = \sum_{x=0}^{3} p(x; 2) = \sum_{x=0}^{3} \frac{e^{-2} (2)^{x}}{x!} = 0.8571$$
Considered as a Bernoulli process:
$$P(X \le 3) = \sum_{x=0}^{3} b(x; 400, 0.005) \approx 0.8576$$
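The closeness of the Poisson and binomial answers for the accident example can be checked side by side. The sketch below is illustrative Python with hypothetical helper names; it evaluates both models with n = 400, p = 0.005, and μ = np = 2.

```python
from math import comb, exp, factorial


def binomial_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p): exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x: int, mu: float) -> float:
    """p(x; mu): Poisson probability with mean mu."""
    return exp(-mu) * mu**x / factorial(x)


n, p = 400, 0.005
mu = n * p

poisson_one = poisson_pmf(1, mu)
binomial_one = binomial_pmf(1, n, p)
poisson_at_most_3 = sum(poisson_pmf(x, mu) for x in range(4))
binomial_at_most_3 = sum(binomial_pmf(x, n, p) for x in range(4))

print(round(poisson_one, 4), round(binomial_one, 4))
print(round(poisson_at_most_3, 4), round(binomial_at_most_3, 4))
```

For part (a) the two models agree to four decimal places. For part (b) the exact binomial sum comes out near 0.8576, within 0.001 of the Poisson value 0.8571.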
In a manufacturing process where glass products are produced,
defects or bubbles occur, occasionally rendering the piece
undesirable for marketing. It is known that, on average, 1 in every
1000 of these items produced has one or more bubbles. What is the
probability that a random sample of 8000 will yield fewer than 7
items possessing bubbles?
This is actually a problem for the binomial distribution, with
$$\mu = (8000)(0.001) = 8$$
$$P(X < 7) = \sum_{x=0}^{6} b(x; 8000, 0.001)$$
It is solved by approximation using the Poisson distribution:
$$P(X < 7) \approx \sum_{x=0}^{6} p(x; 8) = 0.3134$$
Probability and Statistics
Homework 6
1. A communications system consists of n components, each of which will,
independently, function with probability p. The total system will be able
to operate effectively if at least one-half of its components function. For
what values of p is a 5-component system more likely to operate
effectively than a 3-component system?
(Ro.E5.1c s144)
2. It has been established that the number of defective stereos produced
daily at a certain plant is Poisson distributed with mean 4. Over a 2-day
span, what is the probability that the number of defective stereos does
not exceed 3?
(Ro.E5.2f s+10)
3. The probability of hitting a target is 1/5 and ten shots are fired
independently.
(a) What is the probability of the target being hit at least twice?
(b) Find the conditional probability that the target is hit at least twice,
assuming that at least one hit is scored.
(Fe.VI.10.5-6 s16.9)