CHAPTER 1 PROBABILITY

1. COMBINATORIAL ANALYSIS
1.1 Counting Principles
1. Theorem (The basic principle of counting): If the set E contains n elements and the set F
contains m elements, there are nm ways in which we can choose, first, an element of E and
then an element of F.
2. Theorem (The generalized basic principle of counting): If r experiments that are to be performed are such that the first one may result in any of $n_1$ possible outcomes, and if for each of these $n_1$ possible outcomes there are $n_2$ possible outcomes of the second experiment, and if for each of the possible outcomes of the first two experiments there are $n_3$ possible outcomes of the third experiment, and if …, then there is a total of $n_1 n_2 \cdots n_r$ possible outcomes of the r experiments.
3. Theorem: A set with n elements has $2^n$ subsets.
4. Tree diagrams
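A quick brute-force check of the basic counting principle (a Python sketch; the sets E and F below are illustrative):

```python
# Enumerate all ways to choose first an element of E, then an element of F;
# the basic counting principle says there are n*m of them.
from itertools import product

E = ["a", "b", "c"]            # n = 3 elements
F = [1, 2]                     # m = 2 elements

pairs = list(product(E, F))    # every (e, f) choice in order
assert len(pairs) == len(E) * len(F)   # 3 * 2 = 6
print(pairs)
```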
1.2 Permutations
1. Permutation: $n!$
The number of permutations of n things taken r at a time: $P^n_r = \frac{n!}{(n-r)!}$
2. Theorem: The number of distinguishable permutations of n objects of k different types, where $n_1$ are alike, $n_2$ are alike, …, $n_k$ are alike and $n = n_1 + n_2 + \cdots + n_k$, is $\frac{n!}{n_1!\,n_2!\,\cdots\,n_k!}$.
1.3 Combinations
1. Combination: The number of combinations of n things taken r at a time: $C^n_r = \frac{n!}{r!\,(n-r)!}$ (combinatorial coefficient; binomial coefficient)
2. Binomial theorem: $(x + y)^n = \sum_{i=0}^{n} C^n_i\, x^i y^{n-i}$
3. Multinomial expansion: In the expansion of $(x_1 + x_2 + \cdots + x_k)^n$, the coefficient of the term $x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k}$, $n_1 + n_2 + \cdots + n_k = n$, is $\frac{n!}{n_1!\,n_2!\,\cdots\,n_k!}$. Therefore,
$(x_1 + x_2 + \cdots + x_k)^n = \sum_{n_1 + n_2 + \cdots + n_k = n} \frac{n!}{n_1!\,n_2!\,\cdots\,n_k!}\, x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k}$.
Note that the sum is taken over all nonnegative integers $n_1, n_2, \ldots, n_k$ such that $n_1 + n_2 + \cdots + n_k = n$.
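The formulas of 1.2 and 1.3 are easy to evaluate and sanity-check with the standard library (a sketch; `math.comb` computes $C^n_r$):

```python
from math import comb, factorial

n, r = 10, 4
P = factorial(n) // factorial(n - r)    # P(n, r) = n!/(n-r)! = 5040
C = comb(n, r)                          # C(n, r) = 210

def multinomial(*parts):
    """n!/(n1! n2! ... nk!) where n = n1 + ... + nk."""
    out = factorial(sum(parts))
    for k in parts:
        out //= factorial(k)
    return out

# Binomial theorem at x = y = 1: sum of C(n, i) over i equals 2^n.
assert sum(comb(n, i) for i in range(n + 1)) == 2 ** n
print(P, C, multinomial(3, 2, 5))       # -> 5040 210 2520
```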
1.4 The Number of Integer Solutions of Equations
1. There are $C^{n-1}_{r-1}$ distinct positive integer-valued vectors $(x_1, x_2, \ldots, x_r)$ satisfying $x_1 + x_2 + \cdots + x_r = n$, $x_i > 0$, $i = 1, 2, \ldots, r$.
2. There are $C^{n+r-1}_{r-1}$ distinct nonnegative integer-valued vectors $(x_1, x_2, \ldots, x_r)$ satisfying $x_1 + x_2 + \cdots + x_r = n$.
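Both counts can be verified by brute force for small n and r (a sketch):

```python
from itertools import product
from math import comb

n, r = 7, 3
# Positive versus nonnegative solutions of x1 + x2 + x3 = 7.
positive = sum(1 for v in product(range(1, n + 1), repeat=r) if sum(v) == n)
nonneg   = sum(1 for v in product(range(n + 1), repeat=r) if sum(v) == n)

assert positive == comb(n - 1, r - 1)      # C(6, 2) = 15
assert nonneg   == comb(n + r - 1, r - 1)  # C(9, 2) = 36
print(positive, nonneg)                    # -> 15 36
```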
2. AXIOMS OF PROBABILITY
2.1 Sample Space and Events
1. Set theory concepts: set, element, roster method, rule method, subset, null set (empty set).
2. Complement: The complement of an event A with respect to S is the subset of all elements of S that are not in A. We denote the complement of A by the symbol $A'$ (or $A^c$).
3. Intersection: The intersection of two events A and B, denoted by the symbol $A \cap B$, is the event containing all elements that are common to A and B.
-- Two events A and B are mutually exclusive, or disjoint, if $A \cap B = \emptyset$; that is, if A and B have no elements in common.
4. Union: The union of two events A and B, denoted by the symbol $A \cup B$, is the event containing all the elements that belong to A or B or both.
5. Venn diagram:
6. Sample space of an experiment: All possible outcomes (points)
7. Events: subsets of the sample space
impossible events (impossibility): $\emptyset$; sure events (certainty): S.
8. DeMorgan's laws: $\left(\bigcup_{i=1}^{n} E_i\right)^c = \bigcap_{i=1}^{n} E_i^c$, $\left(\bigcap_{i=1}^{n} E_i\right)^c = \bigcup_{i=1}^{n} E_i^c$
2.2 Axioms of Probability
1. Probability axioms: (1) $0 \le P(A) \le 1$; (2) $P(S) = 1$; (3) $P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots$ if $A_1, A_2, \ldots$ is a sequence of mutually exclusive events.
2. Equally likely outcomes: the probabilities of the single-element events are all equal
2.3 Basic Theorems
1. (1) $0 \le P(A) \le 1$;
(2) $P(A) = \frac{N(A)}{N(S)}$ for equally likely outcomes;
(3) complementary events: $P(A^c) = 1 - P(A)$;
(4) $P(A \cup B) = P(A) + P(B) - P(A \cap B)$: inclusion-exclusion principle;
(5) If $A_1, A_2, \ldots, A_n$ is a partition of sample space S, then $P(A_1 \cup A_2 \cup \cdots \cup A_n) = P(A_1) + P(A_2) + \cdots + P(A_n) = P(S) = 1$;
(6) If A and $A'$ are complementary events, then $P(A) + P(A') = 1$.
6
3. CONDITIONAL PROBABILITY AND INDEPENDENCE
3.1 Conditional Probability
1. Conditional probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$.
2. If in an experiment the events A and B can both occur, then $P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$.
The multiplication rule: $P(A \cap B \cap C) = P(A \cap B)\,P(C \mid A \cap B) = P(A)\,P(B \mid A)\,P(C \mid A \cap B)$,
$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1 \cap A_2) \cdots P(A_n \mid A_1 \cap A_2 \cap \cdots \cap A_{n-1})$.
3. Partition: Let $\{B_1, B_2, \ldots, B_n\}$ be a set of nonempty subsets of the sample space S of an experiment. If the events $B_1, B_2, \ldots, B_n$ are mutually exclusive and $B_1 \cup B_2 \cup \cdots \cup B_n = S$, the set $\{B_1, B_2, \ldots, B_n\}$ is called a partition of S.
4. Theorem of total probability: If $\{B_1, B_2, \ldots\}$ is a partition of S, and A is any event, then $P(A) = \sum_{i=1}^{\infty} P(A \mid B_i)\,P(B_i)$.
5. Bayes' Theorem: If $\{B_1, B_2, \ldots\}$ is a partition of S, and A is any event, then $P(B_i \mid A) = \frac{P(B_i \cap A)}{P(A)} = \frac{P(B_i)\,P(A \mid B_i)}{\sum_{j=1}^{\infty} P(B_j)\,P(A \mid B_j)}$.
3.2 Independence
1. Independent events: A and B are independent events $\Leftrightarrow P(A \cap B) = P(A)\,P(B)$.
2. Theorem: If A and B are independent, then A and $B^c$; $A^c$ and $B^c$ are independent.
3. The events A, B, and C are called independent if $P(A \cap B) = P(A)P(B)$, $P(A \cap C) = P(A)P(C)$, $P(B \cap C) = P(B)P(C)$, and $P(A \cap B \cap C) = P(A)P(B)P(C)$. If A, B, and C are independent events, we say that $\{A, B, C\}$ is an independent set of events.
4. The set of events $\{A_1, A_2, \ldots, A_n\}$ is called independent if for every subset $\{A_{i_1}, A_{i_2}, \ldots, A_{i_k}\}$, $k \ge 2$, of $\{A_1, A_2, \ldots, A_n\}$, $P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = P(A_{i_1})\,P(A_{i_2}) \cdots P(A_{i_k})$.
4. DISTRIBUTION FUNCTIONS AND DISCRETE RANDOM VARIABLES
4.1 Random Variable
1. Random variable X: Let S be the sample space of an experiment. A real-valued function $X : S \to \mathbb{R}$ is called a random variable of the experiment if, for each interval $I \subseteq \mathbb{R}$, $\{s : X(s) \in I\}$ is an event.
2. Probability function $p_X(x)$: (a) $p_X(x) > 0$ if $x \in$ range $R_X$; (b) $p_X(x) = 0$ if $x \notin R_X$; (c) $\sum_{x \in R_X} p_X(x) = 1$.
4.2 Distribution Functions
1. Cumulative distribution function (cdf): $F_X(t) = P(X \le t)$, $-\infty < t < \infty$.
2. $F_X(t)$ is non-decreasing; $0 \le F_X(t) \le 1$; $\lim_{t \to -\infty} F_X(t) = 0$, $\lim_{t \to \infty} F_X(t) = 1$.
3. If $c < d$, then $F_X(c) \le F_X(d)$; $P(c < X \le d) = F_X(d) - F_X(c)$; $P(X > c) = 1 - F_X(c)$.
4. The cdf of a discrete random variable: a step function.
4.3 Expectations of Discrete Random Variables
1. Expected value (mean value or average value or expectation) for a random variable X:
$E(X) = \mu_X = \sum_{x \in R_X} x\,p_X(x)$.
2. Let g be a real-valued function. Then $g(X)$ is a random variable with $E[g(X)] = \sum_{x \in R_X} g(x)\,p_X(x)$.
3. Let $g_1, g_2, \ldots, g_n$ be real-valued functions, and let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be real numbers. Then $E[\alpha_1 g_1(X) + \alpha_2 g_2(X) + \cdots + \alpha_n g_n(X)] = \alpha_1 E[g_1(X)] + \alpha_2 E[g_2(X)] + \cdots + \alpha_n E[g_n(X)]$.
4.4 Variances of Discrete Random Variables
1. The variance of a random variable: the average squared distance between X and its mean $\mu_X$:
$Var[X] = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2$.
2. Standard deviation: $\sigma_X = \sqrt{Var[X]}$.
3. Let X be a discrete random variable; then $Var[X] = 0$ if and only if X is a constant with probability 1.
4. Let X be a discrete random variable; then for constants a and b: $Var[aX + b] = a^2\,Var[X]$, $\sigma_{aX+b} = |a|\,\sigma_X$.
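The definitions of sections 4.3 and 4.4 translate directly into code (a sketch; the pmf below is an arbitrary illustration):

```python
from math import sqrt

pmf = {0: 0.2, 1: 0.5, 2: 0.3}        # p_X(x) on R_X = {0, 1, 2}
assert abs(sum(pmf.values()) - 1) < 1e-12

EX  = sum(x * p for x, p in pmf.items())      # E[X] = sum of x p_X(x)
EX2 = sum(x * x * p for x, p in pmf.items())  # E[X^2]
var = EX2 - EX ** 2                           # Var[X] = E[X^2] - mu^2
print(EX, var, sqrt(var))                     # -> 1.1 0.49 0.7
```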
5. SPECIAL DISCRETE DISTRIBUTIONS
5.1 Bernoulli and Binomial Random Variables
1. Bernoulli trials: an experiment with two different possible outcomes
Bernoulli random variable X with parameter p, where p is the probability of a success
$p_X(x) = p^x (1-p)^{1-x} = p^x q^{1-x}$, for $x \in R_X = \{0, 1\}$
expected value: $E[X] = \mu_X = p$; variance: $Var[X] = \sigma_X^2 = p(1-p) = pq$
Example: If in a throw of a fair die obtaining 4 or 6 is called a success, and obtaining 1, 2, 3, or 5 is called a failure, then the indicator of a success is a Bernoulli random variable with $p = 1/3$.
2. Binomial distribution: number of successes to occur in n repeated, independent Bernoulli
trials
binomial random variable Y with parameters n and p
$p_Y(y) = C^n_y\, p^y q^{n-y}$, for $y \in R_Y = \{0, 1, 2, \ldots, n\}$
expected value: $E(Y) = \mu_Y = np$; variance: $Var[Y] = \sigma_Y^2 = npq$
Example: A restaurant serves 8 entrees of fish, 12 of beef, and 10 of poultry. If customers
select from these entrees randomly, what is the probability that two of the next
four customers order fish entrees?
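A one-line check of the restaurant example (a sketch, taking p = 8/30 since 8 of the 30 entrees are fish):

```python
from math import comb

n, p, y = 4, 8 / 30, 2
prob = comb(n, y) * p**y * (1 - p)**(n - y)
print(prob)    # P(exactly 2 of the next 4 order fish) ~= 0.2295
```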
5.2 Multinomial Random Variables
1. Multinomial trials: an experiment with $k \ge 2$ different possible outcomes
2. Multinomial distribution: n independent multinomial trials
multinomial random variables $X_1, X_2, \ldots, X_k$ with parameters $n, p_1, p_2, \ldots, p_k$; $X_i$: the number of outcomes of the ith type; $\sum_{i=1}^{k} x_i = n$; $\sum_{i=1}^{k} p_i = 1$; $R_{X_i} = \{0, 1, \ldots, n\}$
$p_{X_1, X_2, \ldots, X_k}(x_1, x_2, \ldots, x_k) = \binom{n}{x_1, x_2, \ldots, x_k} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$, for $(x_1, x_2, \ldots, x_k) \in R_{X_1, X_2, \ldots, X_k}$
Example: Draw 15 balls with replacement from a box containing 20 red, 10 white, 30 black,
and 50 green. What is the probability of 7R, 2W, 4B, 2G?
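A sketch for the ball-drawing example (20 + 10 + 30 + 50 = 110 balls, so the draw probabilities for R, W, B, G are 20/110, 10/110, 30/110, 50/110):

```python
from math import factorial

counts = [7, 2, 4, 2]                        # 7R, 2W, 4B, 2G; sums to 15
probs  = [20/110, 10/110, 30/110, 50/110]

coef = factorial(sum(counts))                # multinomial coefficient
for c in counts:
    coef //= factorial(c)

prob = coef
for c, p in zip(counts, probs):
    prob *= p ** c
print(prob)                                  # -> ~0.000168
```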
5.3 Geometric distribution: trial number of the first success to occur in a sequence of independent
Bernoulli trials
geometric random variable N with parameter p
$p_N(n) = p\,q^{n-1}$, for $n \in R_N = \{1, 2, 3, \ldots\}$
geometric series: $1 + q + q^2 + \cdots = \sum_{i=0}^{\infty} q^i = \frac{1}{1-q}$, if $|q| < 1$
expected value: $E(N) = \mu_N = \frac{1}{p}$; variance: $Var[N] = \sigma_N^2 = \frac{q}{p^2}$
Example: From an ordinary deck of 52 cards we draw cards at random, with replacement, and
successively until an ace is drawn. What is the probability that at least 10 draws are
needed?
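The card example reduces to one power of q (a sketch): at least 10 draws are needed exactly when the first 9 draws are all failures.

```python
p = 4 / 52          # P(ace) on any draw, with replacement
q = 1 - p
print(q ** 9)       # P(N >= 10) = q^9 ~= 0.4865
```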
Memoryless property of geometric random variables: In successive independent Bernoulli trials,
the probability that the next n outcomes are all failures does not change if we are given
that the previous m successive outcomes were all failures.
$P(N > n + m \mid N > m) = P(N > n)$
5.4 Negative binomial distribution: trial number of the rth success to occur
negative binomial random variable $N_r$ with parameters r, p
$P_{N_2}(n) = p \cdot C^{n-1}_1\, p\, q^{n-2} = (n-1)\, p^2 q^{n-2}$, for $n \in R_{N_2} = \{2, 3, 4, \ldots\}$
$P_{N_r}(n) = C^{n-1}_{r-1}\, p^r q^{n-r}$, for $n \in R_{N_r} = \{r, r+1, r+2, \ldots\}$
expected value: $E(N_r) = \mu_{N_r} = \frac{r}{p}$; variance: $Var[N_r] = \sigma_{N_r}^2 = \frac{rq}{p^2}$
Example: Sharon and Ann play a series of backgammon games until one of them wins five games. Suppose that the games are independent and the probability that Sharon wins a game is 0.58. Find the probability that the series ends in seven games.
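A sketch for the backgammon example: the series ends in exactly seven games if either player collects her fifth win on game 7 (negative binomial pmf with r = 5, n = 7):

```python
from math import comb

p = 0.58                                   # P(Sharon wins a game)

def nb_pmf(n, r, p):
    """P(rth success on trial n) = C(n-1, r-1) p^r (1-p)^(n-r)."""
    return comb(n - 1, r - 1) * p**r * (1 - p)**(n - r)

prob = nb_pmf(7, 5, p) + nb_pmf(7, 5, 1 - p)   # Sharon or Ann finishes
print(prob)                                    # -> ~0.2396
```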
5.5 Poisson Distribution
1. The Poisson probability function: $P_K(k) = \frac{e^{-\lambda} \lambda^k}{k!}$, for $k = 0, 1, 2, 3, \ldots$
Poisson random variable K with parameter $\lambda$
expected value: $E(K) = \mu_K = \lambda$; variance: $Var[K] = \sigma_K^2 = \lambda$
2. The Poisson approximation to the binomial: If X is a binomial random variable with parameters n and $p = \lambda/n$, then $\lim_{n \to \infty} P_X(x) = \frac{e^{-\lambda} \lambda^x}{x!}$, for $x \in R_X = \{0, 1, 2, \ldots\}$
Example (application of the Poisson to the number of successes in Bernoulli trials and the number of arrivals in a time period): Your record as a typist shows that you make an average of 3 mistakes per page. What is the probability that you make 10 mistakes on page 437?
3. Poisson processes
Example: Suppose that children are born at a Poisson rate of five per day in a certain
hospital. What is the probability that at least two babies are born during the
next six hours?
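Both Poisson examples follow from the pmf (a sketch): typos with $\lambda = 3$ per page, and births with $\lambda t = 5 \times 6/24 = 1.25$ for a six-hour window.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

print(poisson_pmf(10, 3))               # P(10 mistakes on a page) ~= 0.00081

lam = 5 * 6 / 24                        # expected births in six hours
print(1 - poisson_pmf(0, lam) - poisson_pmf(1, lam))   # P(>= 2) ~= 0.3554
```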
6. CONTINUOUS RANDOM VARIABLES
6.1 Probability Density Functions
1. Densities
2. Probability density function (pdf) for a continuous random variable X: $f_X(x)$
$f_X(x) \ge 0$ for all x; $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$; $P(a \le X \le b) = \int_a^b f_X(x)\,dx$;
$P(X \le b) = \int_{-\infty}^{b} f_X(x)\,dx = F_X(b)$; $P(X \ge a) = \int_a^{\infty} f_X(x)\,dx$; $f_X(-\infty) = f_X(\infty) = 0$;
$P(X = x) = 0$; $P(x \le X \le x + dx) \approx f_X(x)\,dx$
Example: Experience has shown that while walking in a certain park, the time X, in minutes,
between seeing two people smoking has a density function of the form:
$f_X(x) = \lambda x e^{-x}$, $x > 0$. (a) Calculate the value of $\lambda$. (b) Find the probability distribution function of X. (c) What is the probability that Jeff, who has just seen a person smoking, will see another person smoking in 2 to 5 minutes?
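A numeric sketch of the smoking example, assuming SciPy is available: part (a) fixes $\lambda$ by total probability, and part (c) integrates the density over [2, 5].

```python
from math import exp, inf
from scipy.integrate import quad

total, _ = quad(lambda x: x * exp(-x), 0, inf)
lam = 1 / total                     # -> 1.0, since x e^{-x} integrates to 1

prob, _ = quad(lambda x: lam * x * exp(-x), 2, 5)
print(lam, prob)                    # -> 1.0 ~0.3656
```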
6.3 Cumulative Distribution Functions (cdf)
1. $F_X(t) = P(X \le t) = \begin{cases} \sum_{x \le t} p_X(x), & \text{for a discrete random variable } X \\ \int_{-\infty}^{t} f_X(x)\,dx, & \text{for a continuous random variable } X \end{cases}$, $-\infty < t < \infty$
2. The cdf of a discrete random variable: a step function; the cdf of a continuous random
variable: a continuous function
3. The probability function of a discrete random variable: size of the jump in $F_X(t)$; the pdf of a continuous random variable: $f_X(t) = \frac{d}{dt} F_X(t)$
6.3 Expectations and Variances
1. Definition: If X is a continuous random variable with pdf $f_X(x)$, the expected value of X is defined by $E[X] = \mu_X = \int_{-\infty}^{\infty} x f_X(x)\,dx$.
Example: In a group of adult males, the difference between the uric acid value and 6, the
standard value, is a random variable X with the following pdf:
$f_X(x) = \frac{27}{490}(3x^2 - 2x)$ if $2/3 < x < 3$. Calculate the mean of these differences for the group.
2. Theorem: Let X be a continuous random variable with pdf $f_X(x)$; then for any function h: $E[h(X)] = \int_{-\infty}^{\infty} h(x) f_X(x)\,dx$.
3. Corollary: Let X be a continuous random variable with pdf $f_X(x)$. Let $h_1, h_2, \ldots, h_n$ be real-valued functions and $\alpha_1, \alpha_2, \ldots, \alpha_n$ be real numbers. Then $E[\alpha_1 h_1(X) + \alpha_2 h_2(X) + \cdots + \alpha_n h_n(X)] = \alpha_1 E[h_1(X)] + \alpha_2 E[h_2(X)] + \cdots + \alpha_n E[h_n(X)]$.
4. Definition: If X is a continuous random variable with $E[X] = \mu_X$, then $Var[X]$ and $\sigma_X$, called the variance and standard deviation of X, respectively, are defined by $Var[X] = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2$, $\sigma_X = \sqrt{Var[X]}$.
7. SPECIAL CONTINUOUS DISTRIBUTIONS
7.1 Uniform Random Variable
1. Density of a uniformly distributed random variable: $f_X(x) = \frac{1}{b-a}$, for $a < x < b$.
2. The cdf of a uniformly distributed random variable:
$F_X(x) = \begin{cases} 0 & \text{if } x < a \\ \dfrac{x-a}{b-a} & \text{if } a \le x < b \\ 1 & \text{if } x \ge b \end{cases}$
3. The expected value and variance of the uniform random variable: $\mu_X = E[X] = \frac{a+b}{2}$; $Var[X] = \sigma_X^2 = \frac{(b-a)^2}{12}$.
Example: Starting at 5:00 A.M., every half hour there is a flight from San Francisco airport
to Los Angeles International airport. Suppose that none of these planes is
completely sold out and that they always have room for passengers. A person
who wants to fly to L.A. arrives at the airport at a random time between 8:45
A.M. and 9:45 A.M. Find the probability that she waits (a) at most 10 minutes;
(b) at least 15 minutes.
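A Monte Carlo sketch of the flight example (departures hard-coded at 15, 45, and 75 minutes past 8:45, matching the half-hourly schedule):

```python
import random

def wait(t):
    """Minutes until the next departure, for arrival t minutes past 8:45."""
    return min(d - t for d in (15, 45, 75) if d >= t)

N = 10**5
waits = [wait(random.uniform(0, 60)) for _ in range(N)]
print(sum(w <= 10 for w in waits) / N)   # (a) -> ~1/3
print(sum(w >= 15 for w in waits) / N)   # (b) -> ~1/2
```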
7.2 The Exponential Distribution
1. The exponential probability law with parameter $\lambda$:
$F_{T_1}(t) = 1 - e^{-\lambda t}$, $f_{T_1}(t) = \lambda e^{-\lambda t}$, for $t \ge 0$.
$T_1$ is the time of occurrence of the first event in a Poisson process with parameter $\lambda$, starting at an arbitrary origin $t = 0$; $\{X_t = 0\} \Leftrightarrow \{T_1 > t\}$, where the Poisson random variable $X_t$ is the number of events to occur in $[0, t]$ with parameter $\mu = \lambda t$.
2. The expected value and variance of the exponential random variable: $\mu_{T_1} = E[T_1] = \frac{1}{\lambda}$; $Var[T_1] = \sigma_{T_1}^2 = \frac{1}{\lambda^2}$.
Example: Suppose that every three months, on average, an earthquake occurs in California.
What is the probability that the next earthquake occurs after three but before
seven months?
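A sketch for the earthquake example: one quake per three months on average gives $\lambda = 1/3$ per month, and we want $P(3 < T_1 < 7) = F(7) - F(3)$.

```python
from math import exp

lam = 1 / 3
F = lambda t: 1 - exp(-lam * t)     # exponential cdf
print(F(7) - F(3))                  # e^{-1} - e^{-7/3} ~= 0.2709
```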
3. Memoryless feature of the exponential: $P(T_1 > a + b \mid T_1 > b) = P(T_1 > a)$, where a and b are any two positive constants.
7.3 The Erlang Distribution
1. The cdf and pdf for $T_2$:
$F_{T_2}(t) = 1 - (1 + \lambda t)e^{-\lambda t}$, $f_{T_2}(t) = \lambda^2 t\, e^{-\lambda t}$, for $t \ge 0$.
$T_2$ is the time of occurrence of the second event in a Poisson process with parameter $\lambda$, starting at an arbitrary origin $t = 0$; $\{X_t \le 1\} \Leftrightarrow \{T_2 > t\}$, where the Poisson random variable $X_t$ is the number of events to occur in $[0, t]$ with parameter $\mu = \lambda t$.
2. The Erlang probability law with parameters r and $\lambda$:
$F_{T_r}(t) = \sum_{k=r}^{\infty} \frac{(\lambda t)^k}{k!} e^{-\lambda t}$, $f_{T_r}(t) = \frac{\lambda^r t^{r-1}}{(r-1)!} e^{-\lambda t}$, for $t \ge 0$.
$T_r$ is the time of occurrence of the rth event in a Poisson process with parameter $\lambda$, starting at an arbitrary origin $t = 0$; $\{X_t \le r - 1\} \Leftrightarrow \{T_r > t\}$, where $X_t$ is the number of events to occur in $[0, t]$.
3. The expected value and variance of the Erlang random variable: $\mu_{T_r} = E[T_r] = \frac{r}{\lambda}$; $Var[T_r] = \sigma_{T_r}^2 = \frac{r}{\lambda^2}$.
Example: Suppose that, on average, the number of $\alpha$-particles emitted from a radioactive substance is four every second. What is the probability that it takes at least 2 seconds before the next two $\alpha$-particles are emitted?
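A sketch for the $\alpha$-particle example: with $\lambda = 4$ per second, $T_2 \ge 2$ exactly when at most one particle is emitted in $[0, 2]$, a Poisson count with mean $\lambda t = 8$.

```python
from math import exp, factorial

lam, t, r = 4, 2, 2
mu = lam * t
prob = sum(exp(-mu) * mu**k / factorial(k) for k in range(r))
print(prob)    # P(T_2 >= 2) = e^{-8}(1 + 8) ~= 0.0030
```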
7.4 The Gamma Distribution
1. The gamma probability law with parameters n and $\lambda$: $f_U(u) = \frac{\lambda^n u^{n-1}}{\Gamma(n)} e^{-\lambda u}$, for $u \ge 0$, $n > 0$, $\lambda > 0$.
Gamma function: $\Gamma(n) = \int_0^{\infty} x^{n-1} e^{-x}\,dx$, $n > 0$; $\Gamma(n) = (n-1)\,\Gamma(n-1)$; $\Gamma(n) = (n-1)!$ if n is a positive integer.
The Erlang random variable is a particular case of a gamma random variable, where n is
restricted to the integer values r  1, 2, 3,  .
2. The expected value and variance of the gamma random variable: $E[U] = \mu_U = \frac{n}{\lambda}$; $Var[U] = \sigma_U^2 = \frac{n}{\lambda^2}$.
7.5 The Normal (Gaussian) Distribution
1. The normal probability law with parameters $\mu$ and $\sigma$: $f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / 2\sigma^2}$, for $-\infty < x < \infty$, $\sigma > 0$.
2. The expected value and variance of the normal random variable: $\mu_X = E[X] = \mu$; $Var[X] = \sigma_X^2 = \sigma^2$.
3. Standard normal random variable (the unit normal): $f_Z(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$, $\mu = 0$, $\sigma = 1$.
4. New random variables: If X is normal with mean $\mu$ and variance $\sigma^2$, then $X + b$ is normal with mean $\mu + b$ and variance $\sigma^2$; $aX$ is normal with mean $a\mu$ and variance $a^2\sigma^2$; $aX + b$ is normal with mean $a\mu + b$ and variance $a^2\sigma^2$.
5. Switching from a non-unit normal to the unit normal: $Z = \frac{X - \mu}{\sigma}$, $\mu_Z = 0$, $\sigma_Z = 1$.
Example: Suppose that height X is normal with $\mu = 66$ inches and $\sigma^2 = 9$. You can find the probability that a person picked at random is under 6 feet, using only the table for the standard normal.
6. The probability that a normal random variable is within k standard deviations of the mean: $P(\mu - k\sigma < X < \mu + k\sigma) = P(-k < X^* < k)$, where $X^*$ is the unit normal.
7. The normal approximation to the binomial: If X is a binomial random variable with parameters n and p, where n is large, then X is approximately normal with $\mu = np$, $\sigma^2 = npq$.
Example: A coin has $P(\text{heads}) = 0.3$. Toss the coin 1000 times so that the expected number of heads is 300. Find the probability that the number of heads is 400 or more.
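Both normal examples need only the standard normal cdf $\Phi$, which can be written with `math.erf` (a sketch; the coin calculation uses a continuity correction):

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Height: X ~ N(66, 9); P(X < 72) = Phi((72 - 66)/3) = Phi(2).
print(Phi(2))                            # -> ~0.9772

# Coin: mu = np = 300, sigma = sqrt(npq) = sqrt(210) ~= 14.49.
mu, sigma = 300, sqrt(1000 * 0.3 * 0.7)
print(1 - Phi((399.5 - mu) / sigma))     # P(X >= 400): essentially 0
```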
7.6 The Election Problem
4.6 Functions of a Random Variable
1. The distribution function (cdf) method: if X is a random variable with cdf $F_X(t)$, g is a monotonic function for $x \in R_X$, and $W = g(X)$, then
$F_W(t) = \begin{cases} F_X(g^{-1}(t)), & \text{if } g \text{ is increasing} \\ 1 - \left(F_X(g^{-1}(t)) - P(X = g^{-1}(t))\right), & \text{if } g \text{ is decreasing} \end{cases}$
2. The density function (pdf) method: $f_W(t) = f_X(g^{-1}(t)) \left|\frac{d\,g^{-1}(t)}{dt}\right|$
4.7 Simulating a Random Variable
1. Simulating a continuous random variable
2. Simulating a discrete random variable
3. Simulating a mixed random variable
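A minimal inverse-transform sketch behind items 1 and 2: if U is uniform on (0, 1), then $F^{-1}(U)$ has cdf F (the exponential rate and the pmf below are illustrative):

```python
import random
from math import log

def sim_exponential(lam):
    u = random.random()
    return -log(1 - u) / lam      # F^{-1}(u) for F(t) = 1 - e^{-lam t}

def sim_discrete(pmf):
    """pmf: list of (value, probability) pairs summing to 1."""
    u, cum = random.random(), 0.0
    for x, p in pmf:
        cum += p
        if u < cum:               # first value whose cumulative prob exceeds u
            return x
    return pmf[-1][0]             # guard against floating-point rounding

samples = [sim_exponential(0.5) for _ in range(10**5)]
print(sum(samples) / len(samples))            # ~= 1/lam = 2
print(sim_discrete([(0, 0.2), (1, 0.5), (2, 0.3)]))
```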
8. JOINTLY DISTRIBUTED RANDOM VARIABLES
8.1 Joint Densities
1. Jointly distributed random variables: two or more random variables are jointly distributed if their observed values are simultaneously determined by the same random mechanism.
2. The discrete random variables $X_1, X_2$: the joint probability function is $p_{X_1, X_2}(x_1, x_2)$; $p_{X_1, X_2}(x_1, x_2) \ge 0$ for all $x_1, x_2$; $\sum_{R} p_{X_1, X_2}(x_1, x_2) = 1$. The continuous random variables $X_1, X_2$: the joint pdf $f_{X_1, X_2}(x_1, x_2)$ is positive on the range; $f_{X_1, X_2}(x_1, x_2) \ge 0$ for all $x_1, x_2$; $\iint_{R} f_{X_1, X_2}(x_1, x_2)\,dx_1\,dx_2 = 1$; the probability of an event is given by integrating $f_{X_1, X_2}(x_1, x_2)$ over the event.
3. The joint density of independent random variables: The random variables X, Y are
independent if and only if
$p_{X,Y}(x, y) = p_X(x)\,p_Y(y)$, when X, Y are discrete;
$f_{X,Y}(x, y) = f_X(x)\,f_Y(y)$, when X, Y are continuous.
4. Uniform joint densities: If X and Y are jointly uniform on a region, the joint pdf is $f_{X,Y}(x, y) = \frac{1}{\text{area of region}}$, for (x, y) in the region.
The joint density of independent uniform random variables: $P(\text{event}) = \frac{\text{favorable area}}{\text{total area}}$
8.2 Marginal Densities
1. The discrete random variables $X_1, X_2$ with probability function $p_{X_1, X_2}(x_1, x_2)$ and range $R_{X_1, X_2}$: the marginal probability function for $X_1$ is $p_{X_1}(x_1) = \sum_{x_2} p_{X_1, X_2}(x_1, x_2)$ for $x_1 \in R_{X_1}$. $R_{X_1}$ is the marginal range for $X_1$, the set of first elements of $(x_1, x_2) \in R_{X_1, X_2}$.
2. The continuous random variables $X_1, X_2$ with pdf $f_{X_1, X_2}(x_1, x_2)$ and range $R_{X_1, X_2}$: the marginal pdf for $X_1$ is $f_{X_1}(x_1) = \int_{x_2} f_{X_1, X_2}(x_1, x_2)\,dx_2$ for $x_1 \in R_{X_1}$. $R_{X_1}$ is the marginal range for $X_1$, the set of first elements of $(x_1, x_2) \in R_{X_1, X_2}$.
8.3 Functions of Several Random Variables
8.4 Sums of Independent Random Variables
1. The sum of independent binomials with a common p: $X_1$ with parameters $n_1, p$ and $X_2$ with parameters $n_2, p$; then $X_1 + X_2$ is binomial with parameters $n_1 + n_2,\ p$.
2. The sum of independent Poissons: $X_1$ with parameter $\lambda_1$ and $X_2$ with parameter $\lambda_2$; then $X_1 + X_2$ is Poisson with parameter $\lambda_1 + \lambda_2$.
3. The sum of independent exponentials with a common $\lambda$: $X_1, \ldots, X_n$ each with parameter $\lambda$; then $X_1 + \cdots + X_n$ has a gamma (Erlang) distribution with parameters $n, \lambda$.
4. The density of the sum of two arbitrary independent random variables: the convolution of the individual pdfs, $f_{X+Y}(z) = \int_{R_X} f_X(x)\,f_Y(z - x)\,dx = \int_{R_Y} f_X(z - y)\,f_Y(y)\,dy$
5. The sum of independent normals: $X_1$ with mean $\mu_1$ and variance $\sigma_1^2$ and $X_2$ with mean $\mu_2$ and variance $\sigma_2^2$; then $X_1 + X_2$ is normal with mean $\mu_1 + \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$
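A Monte Carlo sketch of item 3 above: the sum of n independent exponentials with a common $\lambda$ should show the Erlang mean $n/\lambda$ and variance $n/\lambda^2$.

```python
import random

lam, n, N = 2.0, 3, 10**5
sums = [sum(random.expovariate(lam) for _ in range(n)) for _ in range(N)]
mean = sum(sums) / N
var = sum((s - mean) ** 2 for s in sums) / N
print(mean, var)    # ~= n/lam = 1.5 and n/lam^2 = 0.75
```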
9. EXPECTATION
9.1 Expectation of a Random Variable
1. Definition of expected value (continuous): $\mu_X = E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx$
2. Mean of a uniformly distributed random variable: $\mu_X = E[X] = \frac{a+b}{2}$, if X is uniform on [a, b]
3. The expected value of $g(X)$, some function of X: $E[g(X)] = \sum_{x} g(x)\,P(X = x)$, if X is discrete; $E[g(X)] = \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx$, if X is continuous. The expected value of $g(X, Y)$, some function of X and Y: $E[g(X, Y)] = \iint_{\text{plane}} g(x, y)\,f_{X,Y}(x, y)\,dA$
4. Properties of expectation: i. $E[k] = k$ (constant k); ii. $E[X + Y] = E[X] + E[Y]$; iii. $E[kX] = kE[X]$ (constant k); iv. If X and Y are independent, then $E[XY] = E[X]\,E[Y]$.
5. The expected value of the normal random variable: $E[X] = \mu$
9.2 Variance
1. Definition of variance (continuous): $Var[X] = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2$
2. The variance of a uniformly distributed random variable: $Var[X] = \sigma_X^2 = \frac{(b-a)^2}{12}$, if X is uniform on [a, b]
3. The variance of the Erlang random variable: $Var[T_r] = \sigma_{T_r}^2 = \frac{r}{\lambda^2}$
4. The variance of the gamma random variable: $Var[U] = \sigma_U^2 = \frac{n}{\lambda^2}$
5. The variance of the normal random variable: $Var[X] = \sigma^2$
5.2 Conditional Distributions and Independence
1. Definition 5.3: Conditional probability function
2. Definition 5.4: Conditional pdf
3. Definition 5.5: Independence
4. Definition 5.6:
5.3 Multinomial and Bivariate Normal Probability Laws
1. Multinomial trial
2. Theorem 5.3: Multinomial probability function
3. Bivariate normal probability law