STAT 211
Handout 3 (Chapter 3): Discrete Random Variables
Random variable (r.v.): A random variable is a function from the sample space S to the
real line R; that is, a random variable assigns a real number to each element of S.
Discrete random variable: Its possible values are isolated points along the number line.
Random variables have their own sample space, or set of possible values. If this set is
finite or countable, the r.v. is said to be discrete.
Example 1: Define the numeric values each random variable can take and their
probabilities. This example is used to demonstrate most of the properties and tools in
this handout.
(i) Consider an experiment in which each of three cars either comes to a complete stop
(C) at the intersection or does not (N). Let the random variable X be the number of
cars that come to a complete stop.
(ii) Consider an experiment of four home mortgages that are classified as fixed rate (F)
or variable rate (V). Let the random variable Y be the number of homes with a fixed
mortgage rate.
Probabilistic properties of a discrete random variable, X:
P(X ≤ a) + P(X > a) = 1, where a is a constant integer, so P(X > a) = 1 - P(X ≤ a)
P(X ≥ a) = 1 - P(X < a) = 1 - P(X ≤ a-1)
P(a < X < b) = P(X < b) - P(X ≤ a) = P(X ≤ b-1) - P(X ≤ a), where b is also a constant
integer
P(a ≤ X ≤ b) = P(X ≤ b) - P(X < a) = P(X ≤ b) - P(X ≤ a-1)
Discrete probability distribution: A probability distribution describes the possible
values and their probabilities of occurring. A discrete probability distribution is called a
probability mass function (pmf), p(·), and needs to satisfy the following conditions:
• 0 ≤ p(x) = P(X = x) ≤ 1 for all x, where X is a discrete r.v.
• Σ_{all x} p(x) = 1
Examples: Discrete Uniform, Bernoulli, Binomial, Hypergeometric, Negative Binomial,
Geometric Distributions.
Example 1(ii) (continued):
y      0     1     2     3     4
p(y)   1/16  4/16  6/16  4/16  1/16
• All probabilities p(y) are between 0 and 1.
• When you sum the probabilities over all possible y values, they add up to 1.
Therefore p(y) is a legitimate probability mass function (pmf) of Y.
P(Y>2)=p(3)+p(4) or 1-P(Y≤2)=1-p(0)-p(1)-p(2)=5/16
P(Y≥2)=p(2)+p(3)+p(4)=11/16 or 1-P(Y<2)=1-p(0)-p(1)=11/16
P(1<Y<3)=p(2) or P(Y<3)-P(Y≤1)=(p(0)+p(1)+p(2))-(p(0)+p(1))=6/16
P(1≤Y≤3)=p(1)+p(2)+p(3) or P(Y≤3)-P(Y<1)=(p(0)+p(1)+p(2)+p(3))-(p(0))=14/16
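These pmf conditions and probability rules can be checked computationally. Below is a
minimal Python sketch (not part of the original handout; it only restates the table above
in code, using the standard-library fractions module for exact arithmetic):

```python
# Check the mortgage pmf and the probability rules with exact fractions.
from fractions import Fraction

# pmf of Y = number of fixed-rate mortgages among 4
p = {y: Fraction(c, 16) for y, c in zip(range(5), [1, 4, 6, 4, 1])}

# pmf conditions: every p(y) is in [0, 1] and the probabilities sum to 1
assert all(0 <= py <= 1 for py in p.values())
assert sum(p.values()) == 1

# P(Y > 2) = p(3) + p(4) = 1 - P(Y <= 2)
print(p[3] + p[4], 1 - sum(p[y] for y in range(3)))   # 5/16 both ways
# P(Y >= 2) = 1 - P(Y <= 1)
print(sum(p[y] for y in [2, 3, 4]), 1 - p[0] - p[1])  # 11/16 both ways
# P(1 <= Y <= 3) = P(Y <= 3) - P(Y <= 0)
print(p[1] + p[2] + p[3])                             # 14/16, printed as 7/8
```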
Example 2: A pizza shop sells pizzas in four different sizes. The 1000 most recent orders
for a single pizza gave the following proportions for the various sizes.
Size        12"    14"    16"    18"
Proportion  0.20   0.25   0.50   0.05
With X denoting the size of a pizza in a single-pizza order, is the table above a valid
pmf of X?
Example 3: Could p(x) = x²/50 for x = 1, 2, 3, 4, 5 be the pmf of X? If it is not, is it
possible to modify it to find a valid pmf of X?
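A short sketch (mine, not the handout's worked solution) shows one way to check: the
proposed p(x) sums to 55/50 rather than 1, so it is not a pmf, but renormalizing by the
actual total, i.e., p(x) = x²/55, does produce a valid pmf:

```python
# Example 3 check: is p(x) = x^2/50 a pmf on x = 1..5? If not, renormalize.
from fractions import Fraction

raw = {x: Fraction(x * x, 50) for x in range(1, 6)}
total = sum(raw.values())
print(total)                                  # 11/10 (= 55/50), not 1
pmf = {x: v / total for x, v in raw.items()}  # now p(x) = x^2/55
print(sum(pmf.values()))                      # 1, so this is a valid pmf
```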
Cumulative Distribution Function (CDF): F(x) = P(X ≤ x) = Σ_{y ≤ x} p(y)
Example 1(ii) (continued): For the home mortgages example,
F(0⁻) = P(Y < 0) = 0
F(0) = P(Y ≤ 0) = p(0) = 1/16
F(1) = P(Y ≤ 1) = p(0) + p(1) = F(0) + p(1) = 5/16
F(2) = P(Y ≤ 2) = p(0) + p(1) + p(2) = F(1) + p(2) = 11/16
F(3) = P(Y ≤ 3) = p(0) + p(1) + p(2) + p(3) = F(2) + p(3) = 15/16
F(4) = P(Y ≤ 4) = p(0) + p(1) + p(2) + p(3) + p(4) = F(3) + p(4) = 16/16 = 1
0,
1 / 16,

5 / 16,
F ( y)  
11 / 16,
15 / 16,

1,
y0
0  y 1
1 y  2
2 y3
3 y 4
y4
Example 2 (continued): For the pizza example,
F(12⁻) = P(X < 12) = 0
F(12) = P(X ≤ 12) = 0.20
F(14) = P(X ≤ 14) = 0.45
F(16) = P(X ≤ 16) = 0.95
F(18) = P(X ≤ 18) = 1

F(x) = 0,      x < 12
       0.20,   12 ≤ x < 14
       0.45,   14 ≤ x < 16
       0.95,   16 ≤ x < 18
       1,      x ≥ 18
P(a ≤ X ≤ b) = F(b) - F(a-1) and P(X = a) = F(a) - F(a-1), where a and b are
integers.
P(14 ≤ X ≤ 16) = F(16) - F(13) = 0.95 - 0.20 = 0.75, or P(X=14) + P(X=16) = 0.75
(note that F(13) = F(12) = 0.20, since there is no probability mass between 12 and 14)
P(X = 14) = F(14) - F(13) = 0.45 - 0.20 = 0.25
The expected value of a random variable, X: μ = E(X), the value at which the
population distribution of X is centered.
Expected value for the discrete random variable, X: the weighted average of the
possible values, μ = E(X) = Σ_{all x} x·p(x)
Rules of expected value:
(i) For any constant a and the random variable X, E(aX) = a·E(X)
(ii) For a constant b, E(b) = b
(iii) For any constants a and b, E(aX + b) = a·E(X) + b
(iv) For constants a, b, and c and the random variables X and Y,
E(aX ± bY ± c) = a·E(X) ± b·E(Y) ± c
Example 1(ii) (continued): Using the home mortgages example, determine the expected
value of Y.
μ_Y = E(Y) = Σ_{all y} y·p(y) = 0(1/16) + 1(4/16) + 2(6/16) + 3(4/16) + 4(1/16) = 32/16 = 2
On average, 2 of the homes are expected to have a fixed mortgage rate.
Example 2 (continued): Using the pizza example, show that the expected value of X is
approximately 14.8".
μ_X = E(X) = Σ_{all x} x·p(x) = 12(0.20) + 14(0.25) + 16(0.50) + 18(0.05) = 14.8
On average, a 14.8" pizza is expected to be ordered.
If we have a new variable Y = 2X, the pmf of Y is
y      24     28     32     36
p(y)   0.20   0.25   0.50   0.05
μ_Y = E(Y) = Σ_{all y} y·p(y) = 24(0.20) + 28(0.25) + 32(0.50) + 36(0.05) = 2μ_X = 29.6
What is the approximate probability that X is within 2" of its mean value?
P(12.8 ≤ X ≤ 16.8)=P(X=14)+P(X=16)=0.25+0.50=0.75
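The weighted-average formula E(X) = Σ x·p(x) is a one-line computation. A short
sketch (mine, not the handout's; numpy is an assumed tool) for the pizza example:

```python
# E(X) as a weighted average, the linearity rule E(2X) = 2E(X), and the
# "within 2 inches of the mean" probability from the pmf table.
import numpy as np

x = np.array([12, 14, 16, 18])
p = np.array([0.20, 0.25, 0.50, 0.05])

mean_x = np.sum(x * p)
print(mean_x)                # 14.8
print(np.sum(2 * x * p))     # E(2X) = 2 * E(X) = 29.6

# P(12.8 <= X <= 16.8): only x = 14 and x = 16 fall within 2" of the mean
within = np.abs(x - mean_x) <= 2
print(np.sum(p[within]))     # 0.25 + 0.50 = 0.75
```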
The variance of a random variable, X: a measure of dispersion,
σ² = Var(X) = E[(X - μ)²] = E(X²) - μ² (the variability in the population
distribution of X).
The standard deviation of a random variable, X: σ = √σ²
Variance for the discrete random variable, X: σ² = Var(X) = Σ_{all x} (x - μ)²·p(x), or
the suggested shortcut is σ² = Var(X) = E(X²) - μ², where E(X²) = Σ_{all x} x²·p(x).
If h(X) is a function of the random variable X,
E(h(X)) = Σ_{all x} h(x)·p(x)
Var(h(X)) = Σ_{all x} (h(x) - E(h(X)))²·p(x)
If h(X) is a linear function of X, the rules of the mean and the variance can be used
directly instead of going through the mathematics.
Rules of variance:
(i) For any constant a and the random variable X, Var(aX) = a²·Var(X)
(ii) For a constant b, Var(b) = 0
(iii) For constants a and b and the random variable X, Var(aX ± b) = a²·Var(X)
(iv) For constants a, b, and c and the random variables X₁ and X₂,
Var(aX₁ ± bX₂ ± c) = a²·Var(X₁) + b²·Var(X₂) ± 2ab·Cov(X₁, X₂) (this case will be
used in Chapter 5)
(v) For constants a₁ to aₙ and the random variables X₁ to Xₙ,
Var(Σ_{i=1}^{n} aᵢXᵢ) = Σ_{i=1}^{n} aᵢ²·Var(Xᵢ) + 2·Σ_{i<j} aᵢaⱼ·Cov(Xᵢ, Xⱼ) (this case
will be used in Chapter 5)
Example 1(ii) (continued): Using the home mortgages example, what is the variance of
Y?
σ²_Y = Var(Y) = E(Y²) - (E(Y))² = (Σ_{all y} y²·p(y)) - 2²
     = (0²(1/16) + 1²(4/16) + 2²(6/16) + 3²(4/16) + 4²(1/16)) - 2² = 5 - 4 = 1
Example 2 (continued): Using the pizza example, what is the variance of X?
σ²_X = Var(X) = E(X²) - (E(X))² = (Σ_{all x} x²·p(x)) - 14.8²
     = (12²(0.20) + 14²(0.25) + 16²(0.50) + 18²(0.05)) - 14.8² = 222 - 219.04 = 2.96
and the standard deviation of X is σ_X = √2.96 = 1.72
If we have a new variable Y = 2X, the pmf of Y is
y      24     28     32     36
p(y)   0.20   0.25   0.50   0.05
σ²_Y = Var(Y) = E(Y²) - (E(Y))² = (Σ_{all y} y²·p(y)) - 29.6²
     = (24²(0.20) + 28²(0.25) + 32²(0.50) + 36²(0.05)) - 29.6² = 11.84 = 4σ²_X
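Both variance formulas (the definition and the shortcut), plus the rule
Var(2X) = 4·Var(X), can be confirmed with a few lines. A sketch assuming the pmf
tables above (numpy is an assumed tool, not part of the handout):

```python
# Variance two ways for the pizza pmf, plus the scaling rule Var(2X) = 4 Var(X).
import numpy as np

x = np.array([12, 14, 16, 18])
p = np.array([0.20, 0.25, 0.50, 0.05])
mu = np.sum(x * p)                               # 14.8

var_def = np.sum((x - mu) ** 2 * p)              # definition E[(X - mu)^2]
var_short = np.sum(x ** 2 * p) - mu ** 2         # shortcut E(X^2) - mu^2
print(var_def, var_short)                        # 2.96 both ways
print(np.sqrt(var_short))                        # sd = 1.72

y = 2 * x                                        # Y = 2X keeps the same probabilities
print(np.sum(y ** 2 * p) - (2 * mu) ** 2)        # 11.84 = 4 * 2.96
```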
Parameter: If P(X=x) depends on a quantity that can be assigned any one of a number
of possible values, with each different value determining a different probability
distribution, that quantity is called a parameter of the distribution.
Bernoulli Distribution: It is based on a Bernoulli trial (an experiment with two, and
only two, possible outcomes). A r.v. X has a Bernoulli(p) distribution, where p is the
parameter with 0 ≤ p ≤ 1, if
X = 1 with probability p
    0 with probability 1 - p
P(X = x) = p^x·(1-p)^(1-x), x = 0, 1
Examples 4: (i) Flip a coin once. Let X be the number of tails observed. Let
P(heads) = 0.55, so P(tails) = 0.45; in this example p = P(tails) = 0.45.
(ii) A single battery is tested for the viability of its charge. Let X be 1 if the
battery is OK and 0 otherwise. Let P(battery is OK) = 0.90, so P(battery is not
OK) = 0.10; in this example p = P(battery is OK) = 0.90.
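A tiny sketch of the Bernoulli pmf for the battery example (scipy.stats is an assumed
tool, not part of the handout):

```python
# Bernoulli(p) pmf for the battery example: P(X=x) = p^x (1-p)^(1-x), x = 0, 1.
from scipy.stats import bernoulli

p = 0.90
print(bernoulli.pmf([0, 1], p))   # [0.1, 0.9]
```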
Binomial Distribution: Approximate probability model for sampling without
replacement from a finite dichotomous population. X ~ Binomial(n, p).
• n fixed trials
• each trial is identical and results in success or failure
• independent trials
• the probability of success (p) is constant from trial to trial
• X is the number of successes among the n trials
P(X = x) = C(n, x)·p^x·(1-p)^(n-x), x = 0, 1, 2, ..., n,
where C(n, x) = n!/(x!(n-x)!) is the binomial coefficient.
E(X) = np and Var(X) = np(1-p)
Binomial Theorem: For any real numbers x and y and integer n ≥ 0,
(x + y)^n = Σ_{i=0}^{n} C(n, i)·x^i·y^(n-i)
Cumulative distribution function: F(x) = Σ_{k=0}^{x} P(X = k) = Σ_{k=0}^{x} C(n, k)·p^k·(1-p)^(n-k)
Table A.1 gives cumulative distribution function values for n = 5, 10, 15, 20, 25 with
different p values.
Example 5: A lopsided coin has a 70% chance of landing "heads". It is tossed 20 times.
Suppose
X: number of heads observed in 20 tosses ~ Binomial(n=20, p=0.70)
Y: number of tails observed in 20 tosses ~ Binomial(n=20, p=0.30)
Determine the following probabilities for the possible results:
a. at least 10 heads
P(X ≥ 10) = 1 - P(X < 10) = 1 - P(X ≤ 9) = 1 - 0.017 = 0.983, or P(Y ≤ 10) = 0.983
b. at most 13 heads
P(X ≤ 13) = 0.392, or P(Y ≥ 7) = 1 - P(Y < 7) = 1 - P(Y ≤ 6) = 1 - 0.608 = 0.392
c. exactly 12 heads
P(X = 12) = P(X ≤ 12) - P(X ≤ 11) = 0.228 - 0.113 = 0.115, or
P(Y = 8) = P(Y ≤ 8) - P(Y ≤ 7) = 0.887 - 0.772 = 0.115
d. between 8 and 14 heads (inclusive)
P(8 ≤ X ≤ 14) = P(X ≤ 14) - P(X ≤ 7) = 0.584 - 0.001 = 0.583, or
P(6 ≤ Y ≤ 12) = P(Y ≤ 12) - P(Y ≤ 5) = 0.999 - 0.416 = 0.583
e. fewer than 9 heads
P(X < 9) = P(X ≤ 8) = 0.005, or P(Y > 11) = 1 - P(Y ≤ 11) = 1 - 0.995 = 0.005
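The Table A.1 lookups above can be reproduced directly. A minimal sketch, assuming
scipy is available (scipy.stats.binom is not part of the handout, which uses the table):

```python
# Example 5 probabilities via the binomial pmf/cdf instead of Table A.1.
from scipy.stats import binom

n, p = 20, 0.70                                    # X = number of heads
print(1 - binom.cdf(9, n, p))                      # a. P(X >= 10) = 0.983
print(binom.cdf(13, n, p))                         # b. P(X <= 13) = 0.392
print(binom.pmf(12, n, p))                         # c. P(X = 12)  = 0.115
print(binom.cdf(14, n, p) - binom.cdf(7, n, p))    # d. P(8 <= X <= 14) = 0.583
print(binom.cdf(8, n, p))                          # e. P(X < 9)   = 0.005
```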
Hypergeometric Distribution: Exact probability model for the number of successes in
the sample.
X ~ Hyper(M, N, n)
P(X = x) = [C(M, x)·C(N-M, n-x)] / C(N, n),   max(0, n-N+M) ≤ x ≤ min(n, M)
Let X be the number of successes in a random sample of size n drawn from a population
with size N consisting of M successes and (N-M) failures.
E(X) = n·(M/N), where M/N is the proportion of successes in the population.
Var(X) = [(N-n)/(N-1)]·n·(M/N)·(1 - M/N), where (N-n)/(N-1) is the finite
population correction factor.
Example 6: An urn is filled with N balls that are identical in every way except that M
are red and N-M are green. We reach in and select n balls at random (the n balls are
taken all at once, a case of sampling without replacement). What is the probability that
exactly x of the balls are red?
C(N, n): total number of samples of size n that can be drawn from the N balls.
C(M, x): number of ways that x of the balls will be red, out of the M red balls.
C(N-M, n-x): number of ways that the remaining n-x balls will be green.
X, the number of red balls drawn in a sample of n balls, has a hypergeometric
distribution, and the answer is P(X = x), where x = 0, 1, 2, 3, ..., n.
Example 7: A quality-control inspector accepts shipments whenever a sample of size 5
contains no defectives, and she rejects otherwise.
a. Determine the probability that she will accept a poor shipment of 50 items in which
20% are defective.
Let X be the number of defective items in the sample, where a sample of n = 5 is
selected from the N = 50 items, of which M = 50(0.20) = 10 are defective.
P(accept shipment) = P(X = 0) = [C(10, 0)·C(40, 5)] / C(50, 5) = 0.3106
b. Determine the probability that she will reject a good shipment of 100 items in which
2% are defective.
Let X be the number of defective items in the sample, where a sample of n = 5 is
selected from the N = 100 items, of which M = 100(0.02) = 2 are defective.
P(reject shipment) = P(X ≥ 1) = P(X = 1) + P(X = 2)
= [C(2, 1)·C(98, 4)] / C(100, 5) + [C(2, 2)·C(98, 3)] / C(100, 5) = 0.098
Negative Binomial Distribution: The Binomial Distribution counts the number of
successes in a fixed number of Bernoulli trials; the Negative Binomial Distribution
instead counts the Bernoulli trials (here, the number of failures) required to get a fixed
number of successes.
X ~ NegativeBinomial(r, p)
P(X = x) = C(x+r-1, r-1)·p^r·(1-p)^x, x = 0, 1, 2, 3, ...
X: number of failures before the r-th success
p: probability of success
r: number of successes
E(X) = r(1-p)/p and Var(X) = r(1-p)/p²
Example 8 (Exercise 3-71, 6th edition which is Exercise 3-69, 5th edition ):
P(male birth)=0.5
A couple wished to have exactly 2 female children in their family. They will have
children until this condition is fulfilled.
(a) What is the probability that the family has x male children?
X: number of male children until they have 2 girls
p=P(female)=0.5
r:number of girls=2
X ~ NegativeBinomial(r = 2, p = 0.5)
P(X = x) = C(x+2-1, 2-1)·(0.5)²·(1-0.5)^x = (x+1)·(0.5)^(x+2), x = 0, 1, 2, 3, ...
(b) What is the probability that the family has four children? (Answer=0.1875)
(c) What is the probability that the family has at most 4 children? (Answer=0.6875)
(d) How many male children would you expect this family to have? (Answer=2)
How many children would you expect this family to have? (Answer=4)
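The answers to (b)-(d) can be verified with scipy (an assumed tool); scipy's nbinom
also counts failures (here, boys) before the r-th success, matching the handout's
definition of X:

```python
# Example 8 checks: X = number of boys before the 2nd girl, X ~ NB(r=2, p=0.5).
from scipy.stats import nbinom

r, p = 2, 0.5
print(nbinom.pmf(2, r, p))    # (b) four children total <=> X = 2 boys: 0.1875
print(nbinom.cdf(2, r, p))    # (c) at most 4 children <=> X <= 2: 0.6875
print(nbinom.mean(r, p))      # (d) E(X) = r(1-p)/p = 2 expected boys
```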
The Geometric Distribution is the simplest of the waiting-time distributions and is a
special case of the negative binomial distribution (r = 1).
P(X = x) = p·(1-p)^(x-1), x = 1, 2, 3, ...
p: probability of success
X: the trial at which the first success occurs (waiting time for a success)
E(X) = 1/p,   Var(X) = (1-p)/p²,   and P(X > x) = (1-p)^x
Need to remember that, for |a| < 1,
Σ_{x=1}^∞ a^(x-1) = 1/(1-a)   and   Σ_{x=n+1}^∞ a^(x-1) = a^n/(1-a)
Example 9: A series of experiments was conducted in order to reduce the proportion of
cells being scrapped by a battery plant because of internal shorts. The experiments were
successful in reducing the percentage of manufactured cells with internal shorts to around
1%. Suppose we are interested in the number of the test at which the first short is
discovered. Find the probability that at least 50 cells are tested without finding a short.
X: the number of tests until the first short ~ Geometric(p)
p: probability of an internal short = 0.01
P(X > 50) = (1-p)^50 = (1 - 0.01)^50 = 0.605
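A quick check with scipy (an assumed tool); scipy's geom is the trial-of-first-success
version used here, with support 1, 2, 3, ...:

```python
# Example 9: probability the first 50 tests find no short.
from scipy.stats import geom

p = 0.01
print(geom.sf(50, p))    # survival function: P(X > 50) = 0.605
print((1 - p) ** 50)     # same value from the closed form (1-p)^x
```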
Poisson Distribution: models the number of occurrences over a period of time or region
of space (customers arriving at a bank, buses arriving at a stop, etc.). The expected
number of occurrences is proportional to the length of the period observed.
P(X = x) = e^(-λ)·λ^x / x!, x = 0, 1, 2, 3, ..., λ > 0
λ: rate per unit time or per unit area
X: number of occurrences in a given time period or place (example: # of parts
produced/hour, or # of fractures/blade, and so on)
Note that e^λ = Σ_{i=0}^∞ λ^i / i!, and E(X) = Var(X) = λ.
Cumulative distribution function: F(x) = Σ_{k=0}^{x} P(X = k) = Σ_{k=0}^{x} e^(-λ)·λ^k / k!
Table A.2 gives cumulative distribution function values for different λ values.
Example 10: Transmission line interruptions in a telecommunications network occur at
an average rate of 1 per day.
Let X be the number of line interruptions in t days; then
E(X) = λ = 1·(t) = t interruptions in t days.
Find the probability that the line experiences
a. no interruptions in 5 days
P(X = 0) = e^(-5)·5^0 / 0! = 0.0067
b. exactly 2 interruptions in 3 days
P(X = 2) = e^(-3)·3^2 / 2! = 0.224
c. at least 1 interruption in 4 days
P(X ≥ 1) = 1 - P(X = 0) = 1 - e^(-4)·4^0 / 0! = 0.9817
d. at least 2 interruptions in 5 days
P(X ≥ 2) = 1 - P(X = 0) - P(X = 1) = 1 - e^(-5)·5^0 / 0! - e^(-5)·5^1 / 1! = 0.9596
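A sketch of Example 10 with scipy (an assumed tool, not the handout's Table A.2); the
same calls with μ = 0.2 reproduce Example 11 below:

```python
# Example 10: Poisson counts with mean = t for t days at rate 1/day.
from scipy.stats import poisson

print(poisson.pmf(0, 5))        # a. no interruptions in 5 days: 0.0067
print(poisson.pmf(2, 3))        # b. exactly 2 in 3 days: 0.224
print(1 - poisson.pmf(0, 4))    # c. at least 1 in 4 days: 0.9817
print(1 - poisson.cdf(1, 5))    # d. at least 2 in 5 days: 0.9596
```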
Example 11 (Exercise 3.78, 6th edition, which is Exercise 3.76, 5th edition):
X: the number of missing pulses when writing onto a computer disk and then sending it
through a certifier ~ Poisson(λ = 0.2)
E(X) = λ = 0.2
(a) What is the probability that a disk has exactly one missing pulse?
P(X = 1) = e^(-0.2)·0.2^1 / 1! = 0.1638
(b) What is the probability that a disk has at least 2 missing pulses?
P(X ≥ 2) = 1 - P(X = 0) - P(X = 1) = 1 - e^(-0.2)·0.2^0 / 0! - e^(-0.2)·0.2^1 / 1! = 0.0175
(c) If two disks are independently selected, what is the probability that neither
contains a missing pulse?
P(X = 0) = e^(-0.2)·0.2^0 / 0! = 0.8187 = P(one disk contains no missing pulse)
P(neither contains a missing pulse) = [P(X = 0)]² = 0.6703
Proposition: Suppose that in the binomial probability mass function we let n → ∞ and
p → 0 in such a way that np remains fixed at a value λ > 0. Then binomial probabilities
can be approximated by Poisson probabilities.
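A small sketch illustrating the proposition (not from the handout; the values
n = 1000, p = 0.002, λ = 2 are hypothetical, chosen only so that np = λ):

```python
# Poisson approximation to the binomial: large n, small p, np = lambda fixed.
from scipy.stats import binom, poisson

n, lam = 1000, 2.0
p = lam / n
for x in range(5):
    # the two columns agree to about three decimal places
    print(x, binom.pmf(x, n, p), poisson.pmf(x, lam))
```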