Math 2B - Recitation Notes 5

Maria M. Nastasescu
Exam instructions: Please use a standard bluebook and put your name and section on the outside. You may use the textbook (Pitman), handouts, solution sets, your notes and homework, and TA notes (someone else's notes hand-copied by you are OK). Calculators and computers are allowed, but only to do elementary numerical calculations (like a finite sum) and to use elementary built-in functions like logs and normal distribution functions (as an alternative to using tables), not to do programming, simulations, numerical integration, calculus, symbolic computations, or functions like ones that calculate binomial or Poisson probabilities (built-in discrete probability density functions). Please indicate clearly any work done in overtime. Points will be recorded separately and considered informally in course grades.
1 Basic Formulas in Probability Theory
• P(A^c) = 1 − P(A)
• If A and B are independent, P(AB) = P(A)P(B)
• If A and B are disjoint, P(A ∪ B) = P(A) + P(B)
• Inclusion-Exclusion: P(A ∪ B) = P(A) + P(B) − P(AB)
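These identities can be sanity-checked by enumerating a small finite sample space. Below is a minimal Python sketch (not part of the original notes; the two-dice events are an arbitrary illustration):

    from fractions import Fraction
    from itertools import product

    # Sample space of two fair dice: 36 equally likely outcomes.
    omega = list(product(range(1, 7), repeat=2))

    def prob(event):
        return Fraction(sum(1 for w in omega if event(w)), len(omega))

    A = lambda w: w[0] + w[1] == 7      # "the sum is 7"
    B = lambda w: w[0] == 1             # "the first die shows 1"

    P_A, P_B = prob(A), prob(B)
    P_AB = prob(lambda w: A(w) and B(w))

    assert prob(lambda w: not A(w)) == 1 - P_A                # P(A^c) = 1 - P(A)
    assert prob(lambda w: A(w) or B(w)) == P_A + P_B - P_AB   # inclusion-exclusion
    assert P_AB == P_A * P_B                                  # A and B happen to be independent here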
2 Conditional Probability
• Definition: The conditional probability of A given B is P(A|B) = P(AB)/P(B).
• Average and Bayes’ Rules, special case: for any events A and B,
  P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)    (2.1)
  P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B^c)P(B^c)]    (2.2)
• Average and Bayes’ Rules, general case: Suppose B1, B2, · · · , Bn partition the sample space. Then
  P(A) = P(A|B1)P(B1) + · · · + P(A|Bn)P(Bn)    (2.3)
  P(Bi|A) = P(A|Bi)P(Bi) / [P(A|B1)P(B1) + · · · + P(A|Bn)P(Bn)]    (2.4)
Example. Suppose people can be classified in terms of the risk they have of getting involved in an accident: small risk (5% chance of an accident in any given year), average risk (15%) and high risk (30%). Suppose 20% of the population has a small risk of accidents, 50% of the population has an average risk of accidents and 30% of the population has a high risk of accidents. What proportion of the population have accidents in a fixed year? If a given person has no accidents in a year, what is the probability that he or she is small risk?
Proof. Let S be the event that a person has small risk, A that the person has average risk and H that the person has high risk. Let X be the event that a person has an accident in a given year. We know P(X|S) = 0.05, P(X|A) = 0.15 and P(X|H) = 0.30. We also know that P(S) = 0.20, P(A) = 0.50 and P(H) = 0.30.
P(X) = P(X|S)P(S) + P(X|A)P(A) + P(X|H)P(H) = 0.175
P(S|X^c) = P(X^c|S)P(S) / [P(X^c|S)P(S) + P(X^c|A)P(A) + P(X^c|H)P(H)]    (2.5)
         = (1 − 0.05) · 0.20 / [(1 − 0.05) · 0.20 + (1 − 0.15) · 0.50 + (1 − 0.30) · 0.30]    (2.6)
         = 0.2303    (2.7)
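The arithmetic above is easy to verify directly; here is a minimal Python check (the dictionary layout is just for this sketch):

    # (group proportion, accident probability) for each risk class
    risks = {"small": (0.20, 0.05), "average": (0.50, 0.15), "high": (0.30, 0.30)}

    # Average rule: P(X)
    P_X = sum(pg * pa for pg, pa in risks.values())
    print(P_X)                                    # 0.175

    # Bayes' rule: P(S | X^c)
    num = risks["small"][0] * (1 - risks["small"][1])
    den = sum(pg * (1 - pa) for pg, pa in risks.values())
    print(num / den)                              # approximately 0.2303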
3 Binomial Distribution
• If Xi is a random variable which is 1 with probability p and 0 otherwise, and if the Xi are independent, then X = Σ_{i=1}^n Xi is a random variable and it has the binomial distribution
  P(X = k) = (n choose k) p^k (1 − p)^{n−k}    (3.1)
• If p is close to 0.5 and n is large, then we can approximate the binomial distribution with the normal distribution. Let µ = np and σ = √(np(1 − p)). If X is a random variable with the binomial(n, p) distribution, then
  P(a ≤ X ≤ b) ≈ Φ((b + 1/2 − µ)/σ) − Φ((a − 1/2 − µ)/σ).    (3.2)
Remark: If we don’t have the 1/2 in the above formula we just get the result of the central
limit theorem. The 1/2 however makes this approximation better.
• If p is small (and n is large) then
  P(X = k) ≈ e^{−µ} µ^k / k!    (3.3)
where µ = np.
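A quick numerical comparison of the normal approximation (3.2) with the exact binomial probability, using only the Python standard library (Φ is computed via math.erf, math.comb needs Python 3.8+, and the test case n = 100, p = 0.5, 45 ≤ X ≤ 55 is arbitrary):

    import math

    def phi(z):
        """Standard normal c.d.f."""
        return 0.5 * (1 + math.erf(z / math.sqrt(2)))

    n, p, a, b = 100, 0.5, 45, 55
    mu, sigma = n * p, math.sqrt(n * p * (1 - p))

    # Exact binomial probability P(a <= X <= b)
    exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a, b + 1))
    # Normal approximation with the 1/2 continuity correction, as in (3.2)
    approx = phi((b + 0.5 - mu) / sigma) - phi((a - 0.5 - mu) / sigma)
    print(exact, approx)          # both approximately 0.73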
Example. In a sample of 80,000 married couples, what is the probability that in at least one couple both partners were born on December 1st? For a given couple, the probability that both partners were born on December 1st is (1/365) · (1/365) = 7.51 · 10^{−6}, which is clearly very small. The mean is then µ = 80,000 · 7.51 · 10^{−6} ≈ 0.6. Let X be the number of couples that share December 1st as their birthday. Then
P(X ≥ 1) = 1 − P(X = 0) = 1 − e^{−0.6} (0.6)^0/0! = 0.4512.    (3.4)
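The Poisson answer can be checked against the exact binomial probability with plain Python (nothing here beyond the numbers already in the example):

    import math

    n = 80000
    p = (1 / 365) ** 2                  # both partners born on December 1st
    mu = n * p                          # approximately 0.6

    exact = 1 - (1 - p) ** n            # exact binomial P(X >= 1)
    poisson = 1 - math.exp(-mu)         # Poisson approximation (3.3)
    print(exact, poisson)               # both approximately 0.45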
4 Continuous Distributions
• Given a probability density function f we have
  P(a < X < b) = ∫_a^b f(x) dx    (4.1)
• The cumulative distribution function is
  F(t) = P(X < t) = ∫_{−∞}^t f(x) dx    (4.2)
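As a concrete instance of (4.1) (an added check, not in the notes): for the exponential density f(x) = λe^{−λx} on (0, ∞), the integral can be evaluated numerically and compared with the exact value e^{−λa} − e^{−λb}. The choices λ = 2, a = 0.5, b = 1.5 below are arbitrary.

    import math

    lam, a, b = 2.0, 0.5, 1.5

    def f(x):
        """Exponential(lambda) density."""
        return lam * math.exp(-lam * x)

    # Midpoint Riemann sum for the integral in (4.1)
    N = 100000
    dx = (b - a) / N
    numeric = sum(f(a + (i + 0.5) * dx) for i in range(N)) * dx

    exact = math.exp(-lam * a) - math.exp(-lam * b)
    print(numeric, exact)       # both approximately 0.318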
5 Expectation
• E(X) = ∫_{−∞}^{∞} x f(x) dx, where f is the density function
• E(X^i) = ∫_{−∞}^{∞} x^i f(x) dx
• E(Σ_i Xi) = Σ_i E(Xi) (this holds even if the Xi are not independent)
• E(XY) = E(X)E(Y) if X and Y are independent
• Var(X) = E(X^2) − (E(X))^2
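These identities can be checked by simulation; a rough sketch with the Python standard library (the exponential parameters and sample size are arbitrary):

    import random
    random.seed(0)

    n = 200000
    X = [random.expovariate(2.0) for _ in range(n)]   # E(X) = 1/2, Var(X) = 1/4
    Y = [random.expovariate(1.0) for _ in range(n)]   # E(Y) = 1, independent of X

    def mean(v):
        return sum(v) / len(v)

    EX, EY = mean(X), mean(Y)
    print(mean([x + y for x, y in zip(X, Y)]), EX + EY)   # E(X+Y) = E(X) + E(Y)
    print(mean([x * y for x, y in zip(X, Y)]), EX * EY)   # E(XY) = E(X)E(Y)
    print(mean([x * x for x in X]) - EX**2)               # Var(X), close to 0.25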
6 Central Limit Theorem
The Central Limit Theorem says that if you have n independent, identically distributed random variables (with finite variance), then as n → ∞, the sum (or average) of the n variables converges in distribution to the normal distribution.
Let Sn = X1 + · · · + Xn be the sum of n independent random variables with the same distribution. Then E(Sn) = nµ and SD(Sn) = σ√n, where µ = E(Xi) and σ = SD(Xi). For n large and for all a ≤ b,
  P(a ≤ (Sn − nµ)/(σ√n) ≤ b) ≈ Φ(b) − Φ(a)    (6.1)
where Φ is the standard normal cumulative distribution function. The error of this approximation
goes to 0 as n → ∞.
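A small simulation makes the statement concrete (an added sketch; uniform(0, 1) summands, n = 30 and the interval [−1, 1] are arbitrary choices):

    import math
    import random
    random.seed(1)

    def phi(z):
        return 0.5 * (1 + math.erf(z / math.sqrt(2)))

    # X_i uniform on (0, 1): mu = 1/2, sigma = sqrt(1/12)
    n, trials, a, b = 30, 50000, -1.0, 1.0
    mu, sigma = 0.5, math.sqrt(1 / 12)

    hits = 0
    for _ in range(trials):
        S = sum(random.random() for _ in range(n))
        if a <= (S - n * mu) / (sigma * math.sqrt(n)) <= b:
            hits += 1

    print(hits / trials, phi(b) - phi(a))    # both approximately 0.68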
7 Change of variables
Let X be a random variable with density fX (x) on (a, b). Let Y = g(X), where g is either strictly
increasing or strictly decreasing on (a, b). The density of Y on (g(a), g(b)) is then
  fY(y) = fX(x) · 1/|dy/dx|    (7.1)
where y = g(x). Here we solve y = g(x) for x in terms of y and then substitute this value of x into
fX(x) and dy/dx.
From the proof of the above theorem (see p. 304) we can write the above expression in another form:
  fY(y) = fX(x(y)) |dx/dy|.    (7.2)
Another way to do it is the following:
  fY(t) = dFY(t)/dt = dP(g(X) < t)/dt = dP(X < g^{−1}(t))/dt = dFX(g^{−1}(t))/dt = fX(g^{−1}(t)) · dg^{−1}(t)/dt    (7.3)
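As an added illustration of (7.2) (not an example from the notes): let X have the exponential(1) density fX(x) = e^{−x} on (0, ∞) and let Y = g(X) = X^2, which is strictly increasing there. Then x(y) = √y and dx/dy = 1/(2√y), so
  fY(y) = fX(√y) · 1/(2√y) = e^{−√y}/(2√y),  y > 0.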
8 Independent Normal Variables
If X and Y are independent with normal(λ, σ^2) and normal(µ, τ^2) distributions, then X + Y has the normal(λ + µ, σ^2 + τ^2) distribution.
Example. Let X and Y be independent random variables with normal(0, 9) and normal(3, 16) distributions respectively. What is P(X < Y)? From the above we obtain that X − Y has the normal(−3, 25) distribution. Thus we have
  P(X < Y) = P(X − Y < 0) = P((X − Y + 3)/√25 < (0 + 3)/√25) = Φ(0.6) = 0.7257    (8.1)
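The value Φ(0.6) and the whole calculation can be checked with the Python standard library (NormalDist needs Python 3.8+). Note that the notes' normal(µ, σ^2) notation uses the variance, while NormalDist and random.gauss take the standard deviation:

    import random
    from statistics import NormalDist
    random.seed(2)

    # X - Y is normal with mean -3 and standard deviation 5
    print(NormalDist(-3, 5).cdf(0))     # approximately 0.7257

    # Monte Carlo check with X ~ normal(0, 9), Y ~ normal(3, 16)
    n = 200000
    hits = sum(random.gauss(0, 3) < random.gauss(3, 4) for _ in range(n))
    print(hits / n)                     # also approximately 0.726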
9 Continuous Joint Distributions
If f is the joint density of X and Y then
  P((X, Y) ∈ B) = ∫∫_B f(x, y) dx dy.    (9.1)
The marginal density function of X is
  fX(x) = ∫_{−∞}^{∞} f(x, y) dy.    (9.2)
Example. Suppose X1 and X2 are independent exponential random variables with parameters λ1 and λ2 respectively. Find the distribution of Z = X1/X2. Also compute P(X1 < X2).
First note that fX1X2(x1, x2) = λ1 λ2 e^{−λ1 x1} e^{−λ2 x2}. We start with the c.d.f. for Z = X1/X2 and then take the derivative to get fZ. We have
  FZ(a) = P(X1/X2 ≤ a) = P(X1 ≤ aX2) = ∫_0^∞ ∫_0^{a x2} λ1 λ2 e^{−λ1 x1} e^{−λ2 x2} dx1 dx2    (9.3)
        = ∫_0^∞ λ1 λ2 e^{−λ2 x2} [−(1/λ1) e^{−λ1 x1}]_0^{a x2} dx2 = λ2 ∫_0^∞ e^{−λ2 x2} (1 − e^{−λ1 a x2}) dx2    (9.4)
        = λ2 ∫_0^∞ (e^{−λ2 x2} − e^{−x2(λ2 + λ1 a)}) dx2 = λ2 [−(1/λ2) e^{−λ2 x2} + (1/(λ2 + λ1 a)) e^{−x2(λ2 + λ1 a)}]_0^∞    (9.5)
        = λ2 (1/λ2 − 1/(λ2 + λ1 a)) = 1 − λ2/(λ2 + λ1 a) = λ1 a/(λ2 + λ1 a)    (9.6)
Then
  fZ(a) = F'Z(a) = λ1 λ2/(λ1 a + λ2)^2    (9.7)
Finally,
  P(X1 < X2) = P(X1/X2 < 1) = λ1/(λ2 + λ1).    (9.8)
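A Monte Carlo check of both results using the Python standard library (λ1 = 2, λ2 = 3 and the test point a = 1.5 are arbitrary):

    import random
    random.seed(3)

    lam1, lam2, n = 2.0, 3.0, 200000
    pairs = [(random.expovariate(lam1), random.expovariate(lam2)) for _ in range(n)]

    # P(X1 < X2) should equal lam1 / (lam1 + lam2) = 0.4
    print(sum(x1 < x2 for x1, x2 in pairs) / n)

    # F_Z(a) = lam1*a / (lam1*a + lam2), checked at a = 1.5
    a = 1.5
    print(sum(x1 / x2 <= a for x1, x2 in pairs) / n, lam1 * a / (lam1 * a + lam2))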
Example. Suppose that the random variables X1, · · · , Xk are independent and Xi has an exponential distribution with parameter λi. Let Y = min{X1, · · · , Xk}. Show that Y has an exponential distribution with parameter λ1 + · · · + λk.
If X has an exponential distribution with parameter λ, then for every t,
  1 − FX(t) = P(X ≥ t) = ∫_t^∞ λ e^{−λx} dx = e^{−λt}    (9.9)
Because Y = min{X1, · · · , Xk}, for every t > 0
  P(Y > t) = P(X1 > t, · · · , Xk > t) = P(X1 > t)P(X2 > t) · · · P(Xk > t) = e^{−λ1 t} · · · e^{−λk t} = e^{−t(λ1 + ··· + λk)}    (9.10)
Therefore Y has an exponential distribution with parameter λ = λ1 + · · · + λk .
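A quick simulation check of this result (the rates 0.5, 1.0 and 2.5 are arbitrary):

    import math
    import random
    random.seed(4)

    lams = [0.5, 1.0, 2.5]
    n = 200000
    Y = [min(random.expovariate(l) for l in lams) for _ in range(n)]

    # If Y is exponential with rate sum(lams) = 4, then E(Y) = 1/4 and P(Y > 0.3) = e^{-4 * 0.3}
    print(sum(Y) / n, 1 / sum(lams))
    print(sum(y > 0.3 for y in Y) / n, math.exp(-sum(lams) * 0.3))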
10 Density Convolution Formula
If (X, Y) has density f(x, y) in the plane, then X + Y has density on the line
  fX+Y(z) = ∫_{−∞}^{∞} f(x, z − x) dx    (10.1)
If X and Y are independent then
  fX+Y(z) = ∫_{−∞}^{∞} fX(x) fY(z − x) dx.    (10.2)
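As an added check of (10.2) (not in the notes): for two independent exponential(1) variables the convolution has the closed form fX+Y(z) = z e^{−z} for z > 0, which a crude numerical integration reproduces.

    import math

    def f(t):
        """Exponential(1) density."""
        return math.exp(-t) if t >= 0 else 0.0

    z, N = 1.7, 100000                 # z is an arbitrary test point
    dx = z / N                         # the integrand vanishes outside [0, z]
    numeric = sum(f((i + 0.5) * dx) * f(z - (i + 0.5) * dx) for i in range(N)) * dx

    print(numeric, z * math.exp(-z))   # both approximately 0.31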
Example. [Pitman, 5.4.7] Let X and Y have joint density f(x, y). Find formulae for the densities of each of the following random variables: a) XY; b) X − Y; c) X + 2Y.
For the first part, we apply the method used to derive the distribution of ratios on p. 382, so that
  P(z < Z < z + dz) = ∫_x P(x < X < x + dx, z < XY < z + dz).    (10.3)
Instead of the cone on p. 382 we now have an area between curves xy = z and xy = z + dz. Thus
we have that the area of the parallelogram for fixed x is approximately equal to
  dx ((z + dz)/x − z/x) = dx dz/x    (10.4)
We get the density of Z by integration over x. Thus,
  fZ(z) = ∫_x (1/x) f(x, z/x) dx    (10.5)
For the second part, we can apply the linear change of variable formula and get that f−Y(t) = fY(−t) (the factor 1/|dy/dx| equals 1 here). Then if X and Y are independent,
  fX−Y(t) = ∫ fX(x) f−Y(t − x) dx = ∫ fX(x) fY(x − t) dx    (10.6)
Finally, for the third part we have the density f2Y(t) = (1/2) fY(t/2). We then conclude that
  fX+2Y(t) = ∫_x f(x, (t − x)/2) · (1/2) dx    (10.7)