.a
c.k
r
Solutions Manual to
Introduction to
st
Probability and Statistics
lee
@
ka
i
for Engineers and Scientists
Original Text by
m
gk
Sheldon M. Ross
이명규 지음
Myeongkyu Lee
mgklee@kaist.ac.kr
Descriptive Statistics
Chapter 3
Elements of Probability
Chapter 4
Random Variables and Expectation
Chapter 5
Special Random Variables
Chapter 6
Distributions of Sampling Statistics
Chapter 7
Parameter Estimation
Chapter 8
Hypothesis Testing
Chapter 9
Regression
Chapter 10
Analysis of Variance
m
gk
lee
@
ka
i
st
Chapter 2
.a
c.k
r
Contents
ii
1
3
5
12
18
19
24
29
35
Chapter 2
.a
c.k
r
Descriptive Statistics
2.11. The sample mean of the entire data set is
99 · 120 + 99 · 100
= 110.
198
st
(a) 50 values are less than or equal to 100, and other 50 values are greater than or equal to 120. The sample
median is between 100 and 120.
ka
i
(b) Nothing can be inferred about the sample mode with current information.
■
lee
@
2.14. Let x and y be the unknown values. By definition, we know that
x + y + 102 + 100 + 105
= 104,
5
2
2
2
(x − 104) + (y − 104) + (102 − 104) + (100 − 104)2 + (105 − 104)2
= 16.
5−1
m
gk
From the first equation, y = 213 − x. By substitution,
Therefore, the two values are
213 +
2
(x − 104)2 + (109 − x)2 = 43,
x2 − 213x + 11327 = 0.
√
61
and
213 −
2
√
61
.
2.18. (a) By the formulas,
x1 = 3,
4−3
7
= ,
2
2
7 7 − 27
14
x3 = +
=
,
2
3
3
14 2 − 14
3
x4 =
+
= 4,
3
4
9−4
x5 = 4 +
= 5,
5
x2 = 3 +
1
■
Chapter 2. Descriptive Statistics
x6 = 5 +
6−5
31
=
,
6
6
and
2
7
1
−3 = ,
1
2
2
2
1 1
14 7
13
2
s3 = · + 3
−
=
,
2 2
3
2
3
2
14
14
2 13
+4 4−
=
,
s24 = ·
3 3
3
3
3 14
17
2
s25 = ·
+ 5 (5 − 4) =
,
4 3
2
2
31
209
4 17
+6
−5 =
.
s26 = ·
5 2
6
30
.a
c.k
r
0
s22 = · 0 + 2
(b) By usual computation,
3+4+7+2+9+6
31
=
,
6
6
2
6 1 X
31
209
s2 =
xi −
=
.
6 − 1 i=1
6
30
(c) By definition,
(j + 1)xj+1 =
j
X
i=1
ka
i
st
x=
xi + xj+1 = jxj + xj+1 = (j + 1)xj + xj+1 − xj .
lee
@
Dividing both sides by j + 1 yields the formula.
m
gk
■
2
Chapter 3
.a
c.k
r
Elements of Probability
3.2. Let H and T denote a head and a tail, respectively. The sample space of this experiment is
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
■
st
HHH, HHT, HTH, and THH have more heads than tails.
ka
i
3.11. The proof is by induction.
(i) For the base case n = 1, P (E1 ) ≤ P (E1 ).
(ii) For the inductive step, assume the inequality for n = k. Then
!
k
[
lee
@
k+1
[
P
Ei
=P
i=1
!
Ei ∪ Ek+1
i=1
=P
k
[
!
+ P (Ek+1 ) − P
Ei
i=1
m
gk
≤P
k
[
k
[
!
Ei ∩ Ek+1
i=1
!
Ei
+ P (Ek+1 )
i=1
≤
k
X
P (Ei ) + P (Ek+1 )
i=1
=
k+1
X
P (Ei ).
i=1
The inequality also holds for n = k + 1.
By induction, Boole’s inequality is true for every n ∈ N.
3.30. Let An be the event that the first n balls chosen are colored red.
Let Bn be the event that n balls in the urn are colored red.
(a) By Bayes’ formula,
P (A2 | B2 )P (B2 )
P (B2 | A2 ) = P2
i=0 P (A2 | Bi )P (Bi )
3
■
Chapter 3. Elements of Probability
=
1 · 14
1
1 2 1
0 · 4 + ( 2 ) · 2 + 1 · 14
=
2
.
3
(b) By the law of total probability,
P2
P (A3 | Bi )P (Bi )
P (A3 )
P (A3 | A2 ) =
= Pi=0
2
P (A2 )
i=0 P (A2 | Bi )P (Bi )
0 · 41 + ( 12 )3 · 21 + 1 · 14
5
= .
6
0 · 41 + ( 12 )2 · 21 + 1 · 14
.a
c.k
r
=
3.47. P (A) = 0.2, P (B) = 0.3, P (C) = 0.4
(a) If P (A ∩ B) = 0, then
■
st
P (A ∪ B) = P (A) + P (B) − P (A ∩ B) = 0.2 + 0.3 − 0 = 0.5.
(b) If P (A ∩ B) = P (A)P (B), then
ka
i
P (A ∪ B) = P (A) + P (B) − P (A ∩ B) = 0.2 + 0.3 − 0.2 · 0.3 = 0.44.
(c) If A, B, and C are independent, by definition,
lee
@
P (A ∩ B ∩ C) = P (A)P (B)P (C) = 0.024.
(d) If P (A ∩ B) = P (A ∩ C) = P (B ∩ C) = 0, then P (A ∩ B ∩ C) = 0.
■
m
gk
3.50. Given that P (A) = 0.6 and
P (B | Ac ) =
P (B ∩ Ac )
P (B ∩ Ac )
=
= 0.1,
P (Ac )
1 − 0.6
P (B ∩ Ac ) = 0.04. It follows from
P (B) = P (B ∩ A) + P (B ∩ Ac )
that
P (A ∪ B) = P (A) + P (B) − P (A ∩ B) = P (A) + P (B ∩ Ac ) = 0.64.
■
4
Chapter 4
4.5.
.a
c.k
r
Random Variables and Expectation
(a) Since f is a probability density function, we must have that
Z ∞
Z 1
f (x) dx =
−∞
cx3 dx =
0
st
so c = 4.
(b) Hence,
Z 0.8
4x3 dx = (0.8)4 − (0.4)4 = 0.384.
ka
i
P {0.4 < X < 0.8} =
c
= 1,
4
0.4
■
lee
@
4.7. The probability that such a tube in a radio set will have to be replaced within the first 150 hours of
operation is
Z 150
P {X ≤ 150} =
Z 150
f (x) dx =
−∞
100
100
1
dx = .
2
x
3
2 3
2
80
5
1
=
.
3
3
243
2
■
m
gk
Therefore, the probability of our interest is
4.11. Since X1 , X2 , · · · , Xn are independent uniform random variables,
FM (x) = P {M ≤ x} = P {X1 ≤ x, · · · , Xn ≤ x} =
n
Y
i=1
P {Xi ≤ x} =
n
Y
x = xn
i=1
for x ∈ [0, 1]. Therefore, the probability density function of M is
fM (x) =
d
FM (x) = nxn−1
dx
for x ∈ [0, 1] and fM (x) = 0 elsewhere.
■
5
Chapter 4. Random Variables and Expectation
4.13. By Equations (4.3.5) and (4.3.6),
Z ∞
fX (x) =
Z 1
2 dy = 2(1 − x) for x ∈ (0, 1),
f (x, y) dy =
−∞
Z ∞
x
Z y
2 dx = 2y for y ∈ (0, 1).
f (x, y) dx =
fY (y) =
−∞
0
X and Y are not independent because f (x, y) ̸= fX (x)fY (y) for some x and y.
■
(a)
ZZ
P {X + Y ≤ a} =
.a
c.k
r
4.16. Since X and Y are independent continuous random variables, f (x, y) = fX (x)fY (y) for all x and y.
f (x, y) dxdy
x+y≤a
Z ∞ Z a−y
fX (x)fY (y) dxdy
=
−∞
Z ∞
=
−∞
Z a−y
fY (y)
fX (x) dx dy
−∞
st
−∞
Z ∞
FX (a − y)fY (y) dy.
=
ka
i
−∞
(b)
ZZ
lee
@
P {X ≤ Y } =
f (x, y) dxdy
x≤y
Z ∞Z y
=
−∞
Z ∞
=
fX (x)fY (y) dxdy
Z y
fY (y)
fX (x) dx dy
−∞
−∞
−∞
Z ∞
=
FX (y)fY (y) dy.
gk
−∞
■
m
4.20. We assume that pY (y) > 0 and fY (y) > 0 for all y. For all x and y,
(a) pX|Y (x | y) =
p(x, y)
= pX (x) if and only if p(x, y) = pX (x)pY (y).
pY (y)
(b) fX|Y (x | y) =
f (x, y)
= fX (x) if and only if f (x, y) = fX (x)fY (y).
fY (y)
Therefore, each condition is equivalent to independence of X and Y .
■
4.25. (a) E[X] should be larger than E[Y ] because a student with more students on the same bus is more
likely to be selected than one with fewer students.
6
Chapter 4. Random Variables and Expectation
40
33
25
50
2907
+ 33 ·
+ 25 ·
+ 50 ·
=
,
148
148
148
148
74
40 + 33 + 25 + 50
= 37.
E[Y ] =
4
(b) E[X] = 40 ·
■
4.29. Let M = max{X1 , · · · , Xn } and N = min{X1 , · · · , Xn }. Since X1 , X2 , · · · , Xn are independent uniform
random variables,
=
n
Y
P {Xi ≤ x} =
i=1
n
Y
x = xn ,
i=1
.a
c.k
r
FM (x) = P {M ≤ x} = P {X1 ≤ x, · · · , Xn ≤ x}
FN (x) = P {N ≤ x} = 1 − P {N > x} = 1 − P {X1 > x, · · · , Xn > x}
=1−
n
Y
P {Xi > x} = 1 −
i=1
n
Y
(1 − x) = 1 − (1 − x)n .
i=1
for x ∈ [0, 1]. Hence, the probability density functions of M and N are
0
nxn dx =
n
.
n+1
lee
@
Z 1
(a) E[M ] =
ka
i
st
nxn−1 if x ∈ (0, 1),
d
FM (x) =
fM (x) =
0
dx
otherwise.
n(1 − x)n−1 if x ∈ (0, 1),
d
fN (x) =
FN (x) =
0
dx
otherwise.
(b) By integration by parts,
Z 1
E[N ] =
nx(1 − x)n−1 dx
m
gk
0
1
Z 1
−(1 − x)n
−(1 − x)n
−n
= nx ·
dx
n
n
0
0
1
−(1 − x)n+1
1
=
.
=
n+1
n
+
1
0
■
4.40. We know that p3 = 1 − p1 − p2 and that E[X] = p1 + 2p2 + 3p3 = 2.
Hence, p1 + 2p2 + 3(1 − p1 − p2 ) = 2, so 2p1 + p2 = 1.
Var(X) = (1 − 2)2 p1 + (2 − 2)2 p2 + (3 − 2)2 p3 = p1 + p3 = 1 − p2 . Since pi ∈ [0, 1],
(a) When (p1 , p2 , p3 ) = ( 12 , 0, 12 ), Var(X) attains its maximum 1.
(b) When (p1 , p2 , p3 ) = (0, 1, 0), Var(X) attains its minimum 0.
■
7
Chapter 4. Random Variables and Expectation
4.45. (a) The marginal probability distributions of X1 and X2 are
P {X1 = x1 } =
3
16
1
8
5
16
3
8
if x1 = 0,
if x1 = 1,
P {X2 = x2 } =
if x1 = 2,
1
2
if x2 = 1,
1
if x2 = 2.
2
if x1 = 3,
respectively.
3
1
5
3
15
+1· +2·
+3· =
.
16
8
16
8
8
1
1
3
E[X2 ] = 1 · + 2 · = .
2
2
2 Recall that Var(X) = E X 2 − (E[X])2 . (Equation 4.6.1)
2
3
79
1
5
3
15
Var(X1 ) = 02 ·
=
+ 12 · + 2 2 ·
+ 32 · −
.
16
8
16
8
8
64
2
3
1
2 1
2 1
= .
Var(X2 ) = 1 · + 2 · −
2
2
2
4
Recall that Cov(X1 , X2 ) = E[X1 X2 ] − E[X1 ]E[X2 ]. (Equation 4.7.1)
1
1
1
1
1
47 15 3
1
3
+1·
+2· +3· +4· +6· =
−
· = .
Cov(X1 , X2 ) = 0 ·
16
16
4
8
8
4
16
8 2
8
4.49. Var(X) = σx2 and Var(Y ) = σy2 .
■
X
Y
X Y
= Var
+ Var
+ 2 Cov
,
σx
σy
σx σy
Var(X) Var(Y ) 2 Cov(X, Y )
=
+
+
σx2
σy2
σx σy
lee
@
0 ≤ Var
X
Y
+
σx
σy
ka
i
st
.a
c.k
r
(b) E[X1 ] = 0 ·
= 2 + 2 Corr(X, Y ).
m
gk
0 ≤ Var
∴ −1 ≤ Corr(X, Y ).
X
Y
X
Y
X Y
−
,
= Var
+ Var
− 2 Cov
σx
σy
σx
σy
σx σy
Var(X) Var(Y ) 2 Cov(X, Y )
=
+
−
σx2
σy2
σx σy
= 2 − 2 Corr(X, Y ).
∴ 1 ≥ Corr(X, Y ).
Therefore, we conclude that −1 ≤ Corr(X, Y ) ≤ 1.
X
Y
∓
=0
σx
σy
Y
X
∓
= c for some constant c
=⇒
σx
σy
σy
⇐⇒ Y = ∓cσy ±
X.
σx
Corr(X, Y ) = ±1 ⇐⇒ Var
b=±
σy
is positive when Corr(X, Y ) = 1 and negative when Corr(X, Y ) = −1.
σx
8
■
Chapter 4. Random Variables and Expectation
4.50. For i ∈ [n] = {1, · · · , n}, let
Xi =
1
if trial i results in outcome 1,
0
if trial i does not result in outcome 1.
E[Xi ] = 1 · p1 + 0 · (1 − p1 ) = p1 . Similarly, for j ∈ [n], let
if trial j results in outcome 2,
0
if trial j does not result in outcome 2.
.a
c.k
r
Yj =
1
E[Yj ] = 1 · p2 + 0 · (1 − p2 ) = p2 . It follows from the definitions of N1 and N2 that
N1 =
n
X
Xi ,
N2 =
i=1
n
X
By Proposition 4.7.2,
Cov(N1 , N2 ) = Cov
n
X
Xi ,
i=1
n
X
Yj =
n X
n
X
Cov(Xi , Yj ).
st
Yj .
j=1
j=1
i=1 j=1
ka
i
Because each trial is independent, Xi and Yj must be independent if they are from two distinct trials, that is,
i ̸= j. By Theorem 4.7.4, Cov(Xi , Yj ) = 0 if i ̸= j. Also, observe that
if trial i results in outcome 1,
(0, 1)
if trial i results in outcome 2,
(0, 0)
if trial i results in outcome 3,
lee
@
(Xi , Yi ) =
(1, 0)
so Xi Yi = 0 for each trial i ∈ [n]. Hence,
m
gk
Cov(N1 , N2 ) =
=
=
n X
n
X
Cov(Xi , Yj )
i=1 j=1
n
X
Cov(Xi , Yi )
i=1
n
X
(E[Xi Yi ] − E[Xi ]E[Yi ])
i=1
=−
=−
n
X
i=1
n
X
E[Xi ]E[Yi ]
p1 p2 = −np1 p2 .
i=1
Intuitively, the covariance is negative because N1 + N2 ≤ n; the sum of N1 and N2 is bounded above by n.
The more trials result in outcome 1, the fewer trials can result in outcome 2, so N1 and N2 are negatively
correlated.
■
9
Chapter 4. Random Variables and Expectation
4.54. We first compute the moment generating function ϕX (t) of X.
=
∞ n−1
X
t
n=1
(n)
For each n ∈ N, ϕX (0) =
n!
=
∞
X
.a
c.k
r
Z 1
ϕX (t) = E etX =
etx dx
0
t
e − 1 if t ̸= 0,
t
=
1
if t = 0
!
∞
1 X tn
−1
if t ̸= 0,
= t n=0 n!
1
if t = 0
tn
.
(n + 1)!
n=0
n!
1
=
. By definition, the nth moment of X is
(n + 1)!
n+1
E[X n ] =
Z 1
xn dx =
st
0
1
.
n+1
(n)
Therefore, we have verified that ϕX (0) = E[X n ] for each n ∈ N.
ka
i
■
4.56. Let X be the random variable with E[X] = 75. X takes only nonnegative values.
lee
@
(a) By Markov’s inequality (Proposition 4.9.1),
P {X > 85} ≤
E[X]
15
=
.
85
17
(b) By Chebyshev’s inequality (Proposition 4.9.2),
m
gk
P {65 ≤ X ≤ 85} = P {|X − E[X]| ≤ 10}
= 1 − P {|X − E[X]| ≥ 10}
Var(X)
102
1
3
=1− = .
4
4
≥1−
n
1X
(c) For n test scores X1 , · · · , Xn of n students taking her final examination, let X =
Xi be the class
n i=1
average. Since Xi ’s are independent and identically distributed,
n
1X
E X =
E[Xi ] = E[X],
n i=1
n
1 X
Var(X)
Var X = 2
Var (Xi ) =
.
n i=1
n
10
Chapter 4. Random Variables and Expectation
By Chebyshev’s inequality,
P
X −E X ≥5
Var X
≥1−
52
1
=1−
n
≥ 0.9.
X − E[X] ≤ 5 = 1 − P
m
gk
lee
@
ka
i
st
.a
c.k
r
Therefore, n ≥ 10.
11
■
Chapter 5
.a
c.k
r
Special Random Variables
5.1. Let X be the number of components independently in working condition. Then X ∼ B(4, 0.6). Since the
system can function adequately if X ≥ 2, the probability is
2 2 3 4
4
3
2
4
3
2
4
3
513
+
=
+
.
2
5
5
3
5
5
4
5
625
■
ka
i
st
P {X ≥ 2} =
5.5. A 4-engine plane can operate if at least 2 engines function, while a 2-engine plane can if at least 1 engine
functions. We find values of the probability p ∈ [0, 1] such that
2
< p < 1.
3
■
gk
Therefore,
lee
@
4 2
4 3
4 4
2
2 2
2
p (1 − p) +
p (1 − p) +
p >
p(1 − p) +
p
2
3
4
1
2
p2 6(1 − p)2 + 4p(1 − p) + p2 > p[2(1 − p) + p]
p 3p3 − 8p2 + 7p − 2 = p(p − 1)2 (3p − 2) > 0.
5.10. X ∼ B(n, p). Poisson approximation: P {X = k} ≈
e−np (np)k
.
k!
m
2 8
10
1
9
(a) P {X = 2} =
= 0.1937102445.
10
10
2
By approximation, P {X = 2} ≈
(b) P {X = 0} =
10
10
9
= 0.3486784401.
0
10
By approximation, P {X = 0} ≈
(c) P {X = 4} =
e−1
≈ 0.1839397206.
2!
e−1
≈ 0.3678794412.
2!
4 5
9
1
4
= 0.066060288.
5
5
4
By approximation, P {X = 4} ≈
e−1.8 (1.8)4
≈ 0.072301734.
4!
■
12
Chapter 5. Special Random Variables
5.11. Let X be the number of prizes I will win. Given that X ∼ B(50, 0.01), Recall that, for a large n and a
small p, we can approximate
P {X = k} ≈ e−λ
λk
k!
where λ = np = 0.5.
(a) P {X ≥ 1} = 1 − P {X = 0} = 1 −
99
100
50
≈ 1 − e−1/2 .
.a
c.k
r
49
50
99
e−1/2
1
(b) P {X = 1} =
≈
.
1
100
100
2
50 49
99
50
99
3e−1/2
1
(c) P {X ≥ 2} = 1 − P {X ≤ 1} = 1 −
−
≈1−
.
100
1
100
100
2
■
5.12. Let A be the event that an individual who tries the drug for a year has 0 colds in that time. Let B be
the event that the drug is beneficial for him or her. By Bayes’ formula,
P (A | B)P (B)
P (A | B)P (B) + P (A | B c )P (B c )
e−2 · 3/4
3e
= −2
=
.
−3
e · 3/4 + e · 1/4
3e + 1
ka
i
st
P (B | A) =
■
P {X = i} =
lee
@
5.17. If X is a Poisson random variable with E[X] = λ, then we know that X ∼ Poisson(λ). Hence,
e−λ λi
for i ∈ Z≥0 . Observe that
i!
P {X = i + 1}
λ
=
≥ 1 ⇐⇒ i + 1 ≤ λ.
P {X = i}
i+1
gk
Therefore, P {X = i} first increases and then decreases as i increases. To be more precise, P {X = λ − 1} =
P {X = λ} if λ ∈ N. It reaches its maximum value when i = ⌊λ⌋ anyway.
■
m
5.20. We shall assume that p ∈ (0, 1].
(a) X = k if the first k − 1 trials are all failures (with probability 1 − p) and the kth trial is a success (with
probability p). Thus, P {X = k} = (1 − p)k−1 p for k ∈ N.
(b) We know that
∞
X
1
=
xk
1−x
k=0
for |x| < 1. It can be shown that this power series allows term-by-term differentiation, so
∞
X
1
=
kxk−1
2
(1 − x)
k=1
13
Chapter 5. Special Random Variables
for |x| < 1. Hence,
E[X] =
∞
X
kP {X = k} = p
k=1
∞
X
k(1 − p)k−1 = p ·
k=1
1
1
= .
2
p
p
(c) In order for Y to equal k, r − 1 successes must result in the first k − 1 trials and a success must be the
outcome of the kth trial. Thus,
P {Y = k} =
k − 1 r−1
k−1 r
k−r
p (1 − p)
·p=
p (1 − p)k−r
r−1
r−1
(d) Write Y =
.a
c.k
r
for k ≥ r.
Pr
i=1 Yi where Yi is the number of trials needed to go from a total of i − 1 to a total of i
successes. Then each Yi is a geometric random variable with probability p. Therefore, by part (b),
E[Y ] = E
" r
X
#
Yi =
i=1
r
X
E[Yi ] =
p
i=1
=
r
.
p
■
st
i=1
r
X
1
ka
i
5.21. Since U is uniformly distributed on (0, 1), its cumulative distribution function is
if x ≤ 0,
if 0 < x < 1,
if x ≥ 1.
lee
@
0
FU (x) = x
1
For a, b ∈ R with a < b, let V = a + (b − a)U . Then U =
m
gk
FU
y−a
b−a
=
0
y − a
b−a
1
0
y − a
FV (y) =
b−a
1
V −a
. The cumulative distribution function of V is
b−a
y−a
≤ 0,
b−a
y−a
if 0 <
< 1,
b−a
y−a
if
≥1
b−a
if
if y ≤ a,
if a < y < b,
if y ≥ b.
Hence, the probability density function of V is
1
dFV (y) b − a
fV (y) =
=
dy
0
Therefore, V = a + (b − a)U is uniform on (a, b).
if a < y < b,
otherwise.
■
14
Chapter 5. Special Random Variables
5.27. Let X be the output of a bulb. We know that X ∼ N µ, σ2 where µ = 2000 and σ = 85. We find the
value of L such that
X −µ
L − 2000
P {X ≥ L} = P Z =
≥
σ
85
L − 2000
=1−Φ
= 0.95
85
where Φ(x) is the standard normal distribution function,
Z z
2
e−x /2 dx,
−∞
for Z ∼ N 0, 12 . Therefore,
.a
c.k
r
1
Φ(z) = P {Z ≤ z} = √
2π
L = 85Φ−1 (0.05) + 2000 ≈ 85 · (−1.645) + 2000 = 1860.175.
∞
5.29. Let I = −∞
e−x /2 dx.
2
st
R
ka
i
(a) Let t = (x − µ)/σ. Then dx = σdt.
Z ∞
2
2
1
e−(x−µ) /(2σ ) dx
2πσ −∞
Z ∞
2
1
e−t /2 σdt
=√
2πσ −∞
I
=√
2π
lee
@
1= √
is equivalent to I =
√
2π.
m
gk
(b) We evaluate the double integral by means of a change of variables to polar coordinates. Note that
dx
dr
J=
dy
dr
dx
cos θ
dθ
=
dy
sin θ
dθ
−r sin θ
r cos θ
= r,
so dxdy = |J|drdθ = rdrdθ. Thus,
Z ∞
2
2
e−x /2 dx
e−y /2 dy
−∞
−∞
Z ∞Z ∞
2
2
=
e−(x +y )/2 dxdy
I2 =
Z ∞
−∞
−∞
Z 2π Z ∞
=
0
2
e−r /2 rdrdθ
0
Z ∞
=
Z 2π
−r 2 /2
re
dr
dθ
0
i
∞
2
= 2π −e−r /2
= 2π.
0
h
0
15
■
Chapter 5. Special Random Variables
Since I is an integral of a positive function over R, I ≥ 0. Therefore, I =
√
2π.
■
5.30. If log X ∼ N µ, σ2 ,
P {X ≤ x} =
0
if x ≤ 0,
.a
c.k
r
P {log X ≤ log x} if x > 0
if x ≤ 0,
0
Z log x (t−µ)2
=
1
√
e− 2σ2 dt if x > 0.
2πσ −∞
5.37. Repair time (in hours): X ∼ Exp(1).
Z ∞
(a) P {X > 2} =
e−x dx = e−2 .
st
2
■
ka
i
R ∞ −x
e dx
P {X ≥ 3}
1
(b) P {X ≥ 3 | X > 2} =
= R3∞ −x
= .
P {X > 2}
e
e dx
2
■
lee
@
5.41. Recall that a Poisson process has independent and stationary increments. By Proposition 5.6.2,
P {N (t) = k} = e−λt
(λt)k
k!
for k ∈ Z≥0 . If t is in years, here λ = 5.
(a) P {N (1/2) ≥ 2} = 1 − P {N (1/2) = 0} − P {N (1/2) = 1} = 1 − 7/2e−5/2 .
gk
(b) The number of events that occur in disjoint time intervals are independent (independent increments), so
P {N (3/4) = 0} = e−15/4 .
m
(c)
P {N (3/4) ≥ 4 ∧ N (1/2) ≥ 2}
P {N (1/2) ≥ 2}
P {N (1/2) = 2 ∧ N (3/4) − N (1/2) ≥ 2}
=
P {N (1/2) ≥ 2}
P {N (1/2) = 3 ∧ N (3/4) − N (1/2) ≥ 1} P {N (1/2) ≥ 4}
+
+
P {N (1/2) ≥ 2}
P {N (1/2) ≥ 2}
P {N (1/2) = 2}P {N (1/4) ≥ 2}
=
P {N (1/2) ≥ 2}
P {N (1/2) = 3}P {N (1/4) ≥ 1} P {N (1/2) ≥ 4}
+
+
P {N (1/2) ≥ 2}
P {N (1/2) ≥ 2}
−5/2
2 1
e
(5/2)
=
1 − e−5/4 − 5/4e−5/4
−5/2
2
1 − 7/2e
P {N (3/4) ≥ 4 | N (1/2) ≥ 2} =
16
Chapter 5. Special Random Variables
#
3
X
e−5/2 (5/2)k
e−5/2 (5/2)3 −5/4
+1−
1−e
.
+
6
k!
k=0
m
gk
lee
@
ka
i
st
.a
c.k
r
■
17
Chapter 6
.a
c.k
r
Distributions of Sampling Statistics
6.4. Let Xi be the number I win for each bet. X1 , · · · , Xn are independent and identically distributed random
variables with P {Xi = 35} = 1/38 and P {Xi = −1} = 37/38.
Let Yn be the number of bets for which I win 35 among n bets; then Yn ∼ B(n, 1/38). We have
n
X
i=1
Y34 >
(b) P
= 1 − P {Y34 = 0} = 1 −
1000
Y1000 >
36
(
≈P
Z≥
(
100000
Y100000 >
36
≈P
1000
1000
36√ − 38
37000
38
!2
37
38
)
Z≥
by the central limit theorem.
34
≈ 0.596.
≈ 1 − Φ(0.29) ≈ 0.3859.
100000
100000
36√ −
38
3700000
38
gk
(c) P
34
36
37n
38
lee
@
(a) P
√
ka
i
n
If n is large, then approximately Yn ∼ N ,
38
n
.
36
st
Xi = 35Yn − (n − Yn ) > 0 ⇐⇒ Yn >
)
≈ 1 − Φ(2.89) ≈ 0.0019.
■
4
2
9
4
m
6.20. By Theorem 6.5.1, S12 ∼ χ29 and S22 ∼ χ24 .
P S12 < S22 = P
2
S12
4χ9 /9
1
<1 =P
< 1 = P F9,4 <
.
S22
2χ24 /4
2
■
18
Chapter 7
7.1. The likelihood function is
exp (nθ − Pn
if θ ≤ xi for each 1 ≤ i ≤ n,
0
otherwise.
i=1 xi )
st
f (x1 , · · · , xn | θ) =
.a
c.k
r
Parameter Estimation
f is an increasing function of θ subject to θ ≤ xi for each 1 ≤ i ≤ n. Hence, θ̂ = min{x1 , · · · xn } maximizes f .
ka
i
Therefore, the maximum likelihood estimator of θ is min{X1 , · · · , Xn }.
■
7.5. Let σ2 > 0 be the common variance. We have
f (x1 , · · · , xn , y1 , · · · , yn , w1 , · · · , wn | µ1 , µ2 ) =
=
Taking logs,
i=1
lee
@
1
√
2π|σ|
3n
n
Y
fXi (xi )
n
Y
fYi (yi )
i=1
n
Y
fWi (wi )
i=1
!
n
1 X
2
2
2
(xi − µ1 ) + (yi − µ2 ) + (wi − µ1 − µ2 )
.
exp − 2
2σ i=1
gk
g(µ1 , µ2 ) = log f (x1 , · · · , xn , y1 , · · · , yn , w1 , · · · , wn | µ1 , µ2 )
n
√
1 X
= −3n log
2π|σ| − 2
(xi − µ1 )2 + (yi − µ2 )2 + (wi − µ1 − µ2 )2 .
2σ i=1
m
To find critical points of g, we solve
n
∂g
1 X
= 2
[(xi − µ1 ) + (wi − µ1 − µ2 )] = 0,
∂µ1
σ i=1
n
∂g
1 X
= 2
[(yi − µ2 ) + (wi − µ1 − µ2 )] = 0,
∂µ2
σ i=1
that is,
"
2n
n
#" #
µ1
n
2n
µ2
#
Pn
x
+
w
i
i
i=1
.
= Pi=1
Pn
n
y
+
i
i=1
i=1 wi
"Pn
19
Chapter 7. Parameter Estimation
Observe that
2n
∂2g
= − 2 < 0 and
∂µ21
σ
∂2g
∂µ21
∂2g
∂µ1 ∂µ2
∂2g
∂µ2 ∂µ1
∂2g
∂µ22
−
=
2n
σ2
2n
n n 3n2
− 2 − − 2
− 2 = 4 > 0.
σ
σ
σ
σ
By the second derivative test, the maximum likelihood estimates of µ1 and µ2 are
µ̂2
"
# "Pn
#
Pn
2n −n
1
i=1 xi +
i=1 wi
= 2
Pn
Pn
3n −n 2n
i=1 yi +
i=1 wi
" Pn
#
Pn
Pn
1 2 i=1 xi + i=1 wi − i=1 yi
.
=
P
P
P
3n 2 ni=1 yi + ni=1 wi − ni=1 xi
Therefore, the maximum likelihood estimators of µ1 and µ2 are
2
n
X
Xi +
i=1
n
X
i=1
Wi −
n
X
!
Yi
i=1
1
and
3n
2
n
X
Yi +
i=1
n
X
Wi −
i=1
n
X
!
Xi
,
i=1
st
1
3n
.a
c.k
r
" #
µ̂1
7.9. x = 11.48, σ = 0.08, and n = 10.
■
ka
i
respectively.
(a) A 95 percent confidence interval for the PCB level of this fish is
1.96 · 0.08
1.96 · 0.08
√
√
11.48 −
, 11.48 +
10
10
lee
@
≈ (11.4304, 11.5296).
(b) A 95 percent lower confidence interval is
gk
1.645 · 0.08
√
−∞, 11.48 +
≈ (−∞, 11.5216).
10
m
(c) A 95 percent upper confidence interval is
1.645 · 0.08
√
, ∞ ≈ (11.4384, ∞).
11.48 −
10
■
7.18. Assume that the sample is normal. x ≈ 133.222, s ≈ 10.213, and n = 18. t0.05,17 = 1.740, t0.025,17 =
2.110 according to Table A3.
(a) A 95 percent confidence interval estimate of the average IQ score of all students at the university is
133.222 −
2.110 · 10.213
2.110 · 10.213
√
√
, 133.222 +
18
18
20
≈ (128.143, 138.301).
Chapter 7. Parameter Estimation
(b) A 95 percent lower confidence interval estimate is
1.740 · 10.213
√
−∞, 133.222 +
18
≈ (−∞, 137.411).
(c) A 95 percent upper confidence interval estimate is
1.740 · 10.213
√
133.222 −
, ∞ ≈ (129.033, ∞).
18
.a
c.k
r
■
7.26. x = 2062.75, s ≈ 104.343, and n = 20. t0.05,19 = 1.729, t0.025,19 = 2.093, and t0.005,19 = 2.861 according
to Table A3.
(a) A 95 percent two-sided confidence interval for the mean number of steps is
2.093 · 104.343
2.093 · 104.343
√
√
, 2062.75 +
2062.75 −
20
20
≈ (2013.92, 2111.58).
st
(b) A 99 percent two-sided confidence interval for the mean number of steps is
2062.75 −
2.861 · 104.343
2.861 · 104.343
√
√
, 2062.75 +
20
20
ka
i
≈ (1996.00, 2129.50).
lee
@
(c) A 95 percent upper confidence interval is
1.729 · 104.343
√
2062.75 −
,∞
20
≈ (2022.41, ∞).
gk
Therefore, v ≈ 2022.41.
■
7.38. The pooled estimate of σ2 is
m
s2p =
4s21 + 2s22
4 · 2.5 + 2 · 7
=
= 4.
6
6
Thus, an estimate of σ is sp = 2.
■
7.39. µ = 3.180, s2 ≈ 6.484 × 10−5 , and n = 8.
(a) By Equation (7.2.3), the maximum likelihood estimate σ̂ of σ is
"
n
1X
(xi − µ)2
n i=1
#1/2
21
≈ 7.706 × 10−3 .
Chapter 7. Parameter Estimation
(b) Because each (Xi − µ)/σ ∼ N (0, 1) is independent, it follows that
2
n X
Xi − µ
σ
i=1
∼ χ2n .
According to Table A2, χ20.05,8 = 15.507 and χ20.95,8 = 2.733. A 90 percent confidence interval for σ 2 is
7 · 6.484 × 10−5 7 · 6.484 × 10−5
,
15.507
2.733
≈ (3.063 × 10−5 , 1.738 × 10−4 ).
.a
c.k
r
Taking square roots, a 90 percent confidence interval for σ is
(5.535 × 10−3 , 1.318 × 10−2 ).
■
7.41. We assume that the samples are normal with a common variance. x = 3358.1, y = 3130.4, and
sp =
(n − 1)s21 + (m − 1)s22
≈
n+m−2
r
124419 + 17729
≈ 266.597.
2
ka
i
s
st
n = m = 10. According to Table A3, t0.025,18 = 2.101 and t0.05,18 = 1.734.
(a) By Equation (7.4.4), a 95 percent two-sided confidence interval for µ1 − µ2 is
r
lee
@
x − y − t0.025,18 sp
1
1
+ , x − y + t0.025,18 sp
n m
r
1
1
+
n m
!
≈ (−22.79, 478.19).
(b) A 95 percent one-sided upper confidence interval for µ1 − µ2 is
gk
x − y − t0.05,18 sp
r
!
1
1
+ , ∞ ≈ (20.96, ∞).
n m
(c) A 95 percent one-sided upper confidence interval for µ1 − µ2 is
r
m
−∞, x − y + t0.05,18 sp
1
1
+
n m
!
≈ (−∞, 434.44).
■
7.44. x = 532.1, y = 548.6, and n = m = 10. According to Table A3, t0.005,18 = 2.878.
s
sp =
(n − 1)s21 + (m − 1)s22
≈
n+m−2
r
2932.8 + 1191.6
≈ 45.412.
2
By Equation (7.4.4), a 99 percent two-sided confidence interval for µ1 − µ2 is
r
x − y − t0.005,18 sp
1
1
+ , x − y + t0.005,18 sp
n m
22
r
1
1
+
n m
!
≈ (−74.949, 41.949).
Chapter 7. Parameter Estimation
■
7.55. Let p be the probability that a person contracting lung cancer will die within 5 years. Let {X1 , · · · , X100 }
be a random sample from a Bernoulli distribution with parameter p; then E[Xi ] = p. We know that
P100
i=1 xi =
67.
1
(a) By Example 7.7a, an unbiased estimate of p is p̂ = 100
P100
i=1 xi = 0.67.
(b) A 95 percent two-sided confidence interval for p is
p̂ − 1.96
p̂(1 − p̂)
, p̂ + 1.96
n
We find some n ∈ N such that
r
1.96
r
p̂(1 − p̂)
n
!
.a
c.k
r
r
.
p̂(1 − p̂)
< 0.02 ⇐⇒ n ≥ 2124.
n
Therefore, an additional sample of at least 2024 data is required to be 95 percent confident that the error
■
ka
i
st
in estimating the probability in part (a) is less than .02.
7.63. If X is an exponential random variable with E[X] = 1/λ, then X ∼ Exp(λ). Hence,
where
lee
@
f (x | λ) = λe−λx⃗1[0,∞) (x),
⃗1A (x) =
if x ∈ A,
0
if x ∈
/ A.
P20
i=1 xi = 4.6. The posterior density function of λ is
f (x1 , · · · , x20 | λ)g(λ)
⃗1(0,∞) (λ)
f (x1 , · · · , x20 | λ)g(λ) dλ
0
P20
λ20 exp −λ i=1 xi 21 e−λ λ2
⃗1(0,∞) (λ)
= R∞
P20
1 −λ 2
20 exp −λ
λ
x
e
λ
dλ
i=1 i 2
0
gk
1
We know that 20
1
m
f (λ | x1 , · · · x20 ) = R ∞
λ22 e−93λ
⃗1(0,∞) (λ).
λ22 e−93λ dλ
0
= R∞
Therefore, the Bayes estimate of λ is
R ∞ 23 −93λ
λ e
dλ
E [λ | x1 , · · · x20 ] = R0∞ 22 −93λ
λ e
dλ
0
= R∞
0
1
λ22 e−93λ dλ
∞ Z ∞
e−93λ
e−93λ
λ23
−
23λ22
dλ
−93 0
−93
0
23
.
=
93
■
23
Chapter 8
.a
c.k
r
Hypothesis Testing
8.2. We test the null hypothesis H0 : µ = 32 against the alternative hypothesis H1 : µ ̸= 32. We first compute
the test statistic
T =
X − µ0
30.4 − 32
√ =
√
= −2.
σ/ n
4/ 25
■
st
|T | > z0.025 = 1.96, so we reject H0 at the 5% level of significance.
compute the test statistic
T =
ka
i
8.5. We test the null hypothesis H0 : µ ≥ 200 against the alternative hypothesis H1 : µ < 200. We first
199.1125 − 200
X − µ0
√ =
√
≈ −0.502.
σ/ n
5/ 8
lee
@
The p-value is P {Z ≤ t} ≈ Φ(−0.502) ≈ 0.308 where Z ∼ N (0, 1). It is greater than α = 0.05 and α = 0.1, so
we accept H0 at both levels of significance.
■
8.10. We test the null hypothesis H0 : µ ≥ 7.6 against the alternative hypothesis H1 : µ < 7.6. We first
compute the test statistic
X − µ0
7.2 − 7.6
4
√ =− .
√ =
3
σ/ n
1.2/ 16
gk
T =
The p-value is P {Z ≤ −4/3} = P {Z ≥ 4/3} ≈ 1 − Φ(1.33) = 0.0918 where Z ∼ N (0, 1). It is greater than
m
α = 0.01 and α = 0.05, so we accept H0 at both levels of significance.
■
8.14. We test the null hypothesis H0 : µ = 24 against the alternative hypothesis H1 : µ ̸= 24. We first
compute the test statistic
T =
90
X − µ0
22.5 − 24
√
√ =
=−
≈ −2.903.
31
S/ n
3.1/ 36
|T | > 2.045 > t0.025,35 , so we reject H0 at the 5% level of significance.
■
8.20. We test the null hypothesis H0 : µ ≥ 30 against the alternative hypothesis H1 : µ < 30. We first
compute the test statistic
r
X − µ0
26.4 − 30
3
√ = q
T =
= −9
.
23
S/ n
2 46 √1
15
24
10
Chapter 8. Hypothesis Testing
)
3
where T9 is a t-random variable with 9 degrees of freedom. We accept H0 if
The p-value is P T9 < −9
23
the significance level α is less than the p-value.
■
(
r
8.29. We test the null hypothesis H0 : µ1 ≤ µ2 against the alternative hypothesis H1 : µ1 > µ2 . We first
compute the test statistic
X −Y
.
T =p 2
σ1 /n + σ22 /m
.a
c.k
r
Let Z ∼ N (0, 1).
(a) When σ2 = 5, T ≈ 2.65, so the p-value is approximately P {Z > 2.65} = 1 − Φ(2.65) = 0.0041.
(b) When σ2 = 10, T ≈ 2.10, so the p-value is approximately P {Z > 2.10} = 1 − Φ(2.10) = 0.0179.
(c) When σ2 = 20, T ≈ 1.33, so the p-value is approximately P {Z > 1.33} = 1 − Φ(1.33) = 0.0918.
■
st
8.31. We test the null hypothesis H0 : µ1 = µ2 against the alternative hypothesis H1 : µ1 ̸= µ2 . We first
compute the test statistic
X −Y
≈ 0.4371,
Sp2 (1/n + 1/m)
ka
i
T =q
where Sp2 is the pooled estimator of the common variance σ 2 given by
(n − 1)S12 + (m − 1)S22
.
n+m−2
lee
@
Sp2 =
The p-value is 2P {T11 ≥ 0.4371} ≈ 0.67 where T11 is a t-random variable with 11 degrees of freedom.
■
Hence,
gk
8.44. By Theorem 6.5.1, (n − 1)S 2 /σ2 ∼ χ2n−1 . Suppose that H0 is true. Then we have
m
P
(n − 1)S 2
(n − 1)S 2
≤
.
2
σ0
σ2
(n − 1)S 2
> χ2α,n−1
σ02
≤P
(n − 1)S 2
> χ2α,n−1
σ2
= α.
Therefore, we
(n − 1)S 2
≤ χ2α,n−1 ,
σ02
(n − 1)S 2
reject H0 if
> χ2α,n−1
σ02
accept H0 if
at the significance level α.
■
25
Chapter 8. Hypothesis Testing
8.45. Because each (Xi − µ)/σ ∼ N (0, 1) is independent, it follows that
2
n X
Xi − µ
σ
i=1
∼ χ2n .
Suppose that H0 is true. Then we have
2
n X
Xi − µ
i=1
σ0
≤
2
n X
Xi − µ
i=1
σ
.
Hence,
σ0
i=1
)
> χ2α,n
≤P
( n X Xi − µ 2
i=1
Therefore, we
accept H0 if
2
n X
Xi − µ
σ0
i=1
2
n X
Xi − µ
σ0
σ
> χ2α,n
= α.
≤ χ2α,n ,
> χ2α,n
st
reject H0 if
)
.a
c.k
r
P
( n X Xi − µ 2
i=1
■
ka
i
at the significance level α.
2
8.50. We assume that {Xi }ni=1 and {Yj }m
j=1 are independent samples from two normal populations N (µx , σx )
and N (µy , σy2 ), respectively. Let
n
m
2
2
1 X
1 X
Xi − X and Sy2 =
Yj − Y .
n − 1 i=1
m − 1 j=1
lee
@
Sx2 =
Hence,
gk
By Theorem 6.5.1, (n − 1)Sx2 /σx2 ∼ χ2n−1 and (m − 1)Sy2 /σy2 ∼ χ2m−1 . When H0 is true,
m
P
Sx2 /σx2
S2
≥ x2 ∼ Fn−1,m−1 .
2
2
Sy /σy
Sy
Sx2
> Fα,n−1,m−1
Sy2
≤P
Sx2 /σx2
> Fα,n−1,m−1
Sy2 /σy2
= α.
Therefore, we
accept H0 if
Sx2
≤ Fα,n−1,m−1 ,
Sy2
reject H0 if
Sx2
> Fα,n−1,m−1
Sy2
at the significance level α.
■
8.60. Let p1 and p2 be the probabilities that the first and the second treatments are effective, respectively.
26
Chapter 8. Hypothesis Testing
(a) We test the null hypothesis H0 : p1 = p2 against the alternative hypothesis H1 : p1 ̸= p2 using the
Fisher-Irwin test. We compute P {X ≤ 39} ≈ 0.649 and P {X ≥ 39} ≈ 0.475, where
P {X = i} =
72
i
84
83 − i
156
83
for 0 ≤ i ≤ 83. The p-value is considerably large, so we accept H0 in general.
(b) Let Y ∼ B(156, p). We test the null hypothesis H0 : p = 0.5 against the alternative hypothesis H1 : p ̸= 0.5.
.a
c.k
r
Suppose that H0 is true. We compute the p-value 2P {Y ≥ 84} ≈ 0.379. Hence, we accept H0 in general.
Therefore, the fact is consistent with the claim that the determination of the treatment to be given to
each patient was made in a totally random fashion.
■
8.63. Let X1 ∼ B(n1 , p1 ) and X2 ∼ B(n2 , p2 ). When n1 and n2 are large, by the central limit theorem,
st
X1 ∼
˙ N (n1 p1 , n1 p1 (1 − p1 )) and X2 ∼
˙ N (n2 p2 , n2 p2 (1 − p2 )). Hence,
ka
i
p1 (1 − p1 )
X1
∼
˙ N p1 ,
,
n1
n1
p2 (1 − p2 )
X2
∼
˙ N p2 ,
,
n2
n2
X2
p1 (1 − p1 ) p2 (1 − p2 )
X1
−
∼
˙ N p1 − p2 ,
+
,
n1
n2
n1
n2
lee
@
X1
X2
−
− (p1 − p2 )
n1
n2
r
∼
˙ N (0, 1).
p1 (1 − p1 ) p2 (1 − p2 )
+
n1
n2
We test the null hypothesis H0 : p1 = p2 against the alternative hypothesis H1 : p1 ̸= p2 . When H0 is true,
their common value can be best estimated by (X1 + X2 )/(n1 + n2 ) because it is the proportion of the total
gk
X1 + X2 successes to the total n1 + n2 trials.
X1 /n1 − X2 /n2
X1 /n1 − X2 /n2
˙ N (0, 1).
≈s
∼
1
X1 + X2
1
1
X1 + X2
1
+
+
p(1 − p)
1−
n1
n2
n1 + n2
n1 + n2
n1
n2
m
s
Therefore, we reject H0 if the test statistic is greater than zα/2 .
■
8.64. We first compute the test statistic
39 44
−
X1 /n1 − X2 /n2
72 84
s
=
T =s
≈ 0.208.
X1 + X2
X1 + X2
1
1
83 73
1
1
·
+
1−
+
n1 + n2
n1 + n2
n1
n2
156 156 72 84
The p-value is 2P {Z ≥ 0.208} ≈ 0.835. Therefore, we accept H0 in general.
27
■
Chapter 8. Hypothesis Testing
8.68. Let X ∼ Poisson(λ). We test the null hypothesis H0 : λ = 52 against the alternative hypothesis
H1 : λ ̸= 52. Suppose that H0 is true. Since
P8
˙ N (416, 416) and 46 + 62 + 60 + 58 +
i=1 Xi ∼ Poisson(416) ∼
47 + 50 + 59 + 49 = 431, we compute the p-value
2P
( 8
X
)
Xi ≥ 431
≈ 2P
Z≥
i=1
431 − 416
√
416
≈ 0.462.
Therefore, we accept H0 in general.
■
.a
c.k
r
8.71. To investigate the relationship between smoking and heart attacks, all the other factors should be
controlled. Thus, the scientist should choose samples such that each smoker is associated with a nonsmoker of
m
gk
lee
@
ka
i
st
almost the same age.
28
■
Chapter 9
93
1319
1775
21413
,Y =
, Sxx =
, and SxY =
. Hence, the least squares estimators are
8
80
8
80
A = Y − Bx =
43727
SxY
21413
and B =
=
.
17750
Sxx
17750
43727 + 21413x
≈ 2.463 + 1.206x.
17750
ka
i
25
20
15
10
5
10
15
20
Moisture x of a wet mix of a certain product
■
m
gk
lee
@
Density Y of the finished product
The estimated regression line is y = A + Bx =
st
9.1. n = 8, x =
.a
c.k
r
Regression
9.8. By Proposition 9.2.1,
A = Y − Bx
n
n
1X
x X
Yi −
(xi − x)Yi
n i=1
Sxx i=1
n X
1
x(xi − x)
=
−
Yi .
n
Sxx
i=1
=
29
Chapter 9. Regression
Because Y1 , · · · , Yn are independent normal random variables with common variance σ 2 ,
Var(A) =
n X
1
−
n
n
X
i=1
x(xi − x)
Sxx
2
Var(Yi )
9.10. By definition,
n
X
(Yi − A − Bxi )2
■
st
SSR =
.a
c.k
r
1
2x(xi − x) x2 (xi − x)2
−
+
2
2
n
nSxx
Sxx
i=1
1
x2
= σ2
+
n Sxx
2
σ
=
Sxx + nx2
nSxx
n
σ2 X 2
=
x .
nSxx i=1 i
= σ2
i=1
ka
i
2
n X
SxY
(xi − x)
=
Yi − Y −
Sxx
i=1
n
X
2 2SxY
S2
2
Yi − Y −
=
(x
−
x)
(xi − x) Yi − Y + xY
i
2
Sxx
Sxx
i=1
2
2SxY
S2
+ xY
Sxx
Sxx
2
Sxx SY Y − SxY
=
.
Sxx
lee
@
= SY Y −
■
SSR =
2
Sxx SY Y − SxY
≈ 872.369.
Sxx
m
gk
9.26. n = 10, x = 2325, Y = 2286, Sxx = 15686250, SxY = 15573000, SY Y = 15461440, and
(a)
30
Chapter 9. Regression
3,000
2,000
1,000
1,000
2,000
.a
c.k
r
Shear Stregth Y (psi)
4,000
3,000
4,000
Weld Diameter x (.0001 in.)
(b) A = Y − Bx ≈ −22.214 and B =
SxY
≈ 0.9928.
Sxx
(n − 2)Sxx
(B − β) ∼ tn−2 .
SSR
ka
i
s
st
(c) We test the null hypothesis H0 : β = 1 against the alternative hypothesis H1 : β ̸= 1. By Equation 9.4.2,
If H0 is true, then the test statistic is
8Sxx
(B − 1) = −2.738 < −t0.025,8 = −2.306.
SSR
lee
@
r
Hence, we reject H0 at the 5% level of significance.
(d) E[α + 2500β] = A + 2500B = 2459.737.
gk
(e) We use the Prediction Interval for a Response at the Input Level x0 on page 381:
s
A + Bx0 ± ta/2,n−2
m
Such a prediction interval is (2186.282, 2236.801).
(f )
31
n + 1 (x0 − x)2 SSR
+
.
n
Sxx
n−2
Standard Residual
Chapter 9. Regression
10
−10
0
1,000
2,000
.a
c.k
r
0
3,000
4,000
Weld Diameter x (.0001 in.)
(g) As indicated both by its scatter diagram and the random nature of its standardized residuals, the plot
st
appears to fit the straight-line model quite well.
■
ka
i
9.30. n = 8, x = 24.6875, Y = 126.5, Sxx = 222.38875, SxY = 425.65, SY Y = 1084, and
SSR =
2
Sxx SY Y − SxY
≈ 269.310.
Sxx
lee
@
We use the Prediction Interval for a Response at the Input Level x0 on page 381:
A + Bx0 ± ta/2,n−2
s
n + 1 (x0 − x)2 SSR
+
.
n
Sxx
n−2
Such a prediction interval is (111.564, 146.460).
gk
9.32. Taking logs,
■
log S = log A − m log N ⇐⇒ log N =
log A log S
−
.
m
m
m
Let Y = log N , x = log S, α = (log A)/m, and β = −1/m. Then we obtain the usual regression equation
Y = α + βx + e.
n = 10, x ≈ 3.698, Y ≈ 3.286, Sxx ≈ 0.2902, and SxY ≈ −4.068. Hence, the least squares estimators are
SxY
A′ = Y − B ′ x ≈ 55.124 and B ′ =
≈ −14.017. Thus, we can estimate
Sxx
m=−
1
A′
≈
0.07134
and
A
=
exp
−
≈ 51.038.
B′
B′
■
32
Chapter 9. Regression
1
1
1
1
1
1
1
⃗
X = 1
1
1
1
1
1
1
1
7.1
0.68
9.9
0.64
3.6
0.58
9.3
0.21
2.3
0.89
4.6
0.00
0.2
0.37
5.4
0.11
8.2
0.87
7.1
0.00
4.7
0.76
5.4
0.87
1.7
0.52
1.9
0.31
9.2
0.19
41.53
63.75
1
16.38
1
45.54
3
5
15.52
8
28.55
5.65
5
⃗ =
25.02 .
3 and Y
52.49
4
38.05
6
30.76
0
8
39.69
17.59
1
13.22
3
50.98
5
4
.a
c.k
r
9.47. Let
st
⃗ is invertible. By Equation 9.10.3, the least squares estimators are given by
⃗ has full rank, so X
⃗TX
X
ka
i
−2.8278
B0
−1
B1
⃗ = X
⃗TX
⃗
⃗TY
⃗ = 5.3707 .
=B
X
B
9.8157
2
0.4482
B3
lee
@
Hence,
⃗ TY
⃗ −B
⃗TX
⃗TY
⃗ = 201.9692.
SSR = Y
(a) The estimated regression plane is
Y = −2.8278 + 5.3707x1 + 9.8157x2 + 0.4482x3 .
gk
(b) By Equation 9.10.6,
−1
⃗ = σ2 X
⃗TX
⃗
Cov(B)
m
where σ 2 is the variance of a normal random error e whose mean is 0. Thus,
Var(B0 ) = σ
2
⃗TX
⃗
X
−1 = 0.7472σ 2 .
11
We test the null hypothesis H0 : β0 = 0 against the alternative hypothesis H1 : β0 ̸= 0.
■
9.50. With 100(1 − a) percent confidence,
(a) Confidence Interval Estimator of α + βx0 on page 378:
s
A + Bx0 ± ta/2,n−2
33
1
(x0 − x)2 SSR
+
.
n
Sxx
n−2
Chapter 9. Regression
(b) Prediction Interval for a Response at the Input Level x0 on page 381:
s
A + Bx0 ± ta/2,n−2
n + 1 (x0 − x)2 SSR
+
.
n
Sxx
n−2
A + Bx0 is an unbiased estimator of the mean response α + βx0 such that
A + Bx0 ∼ N
α + βx0 , σ
2
(x0 − x)2
1
+
n
Sxx
.
(9.4.4)
.a
c.k
r
Let Y denote the future response whose input level is x0 . We know that
Y ∼ N α + βx0 , σ 2 .
To obtain a prediction interval, we consider the distribution of Y − A − Bx0 . Since the future response Y is
independent of the past observations Y1 , · · · , Yn that were used to determine A and B, it follows that Y is also
independent of A + Bx0 . Hence,
st
(x0 − x)2
2
2 1
+
+σ .
Y − A − Bx0 ∼ N 0, σ
n
Sxx
Var(Y −A−Bx0 ) = Var(A+Bx0 )+Var(Y ). Var(A+Bx0 ) is the variance of the estimator of the mean response.
ka
i
Var(Y ) = σ 2 is the variance of a single observation. Therefore, for the same data, a prediction interval for a
m
gk
lee
@
future response always contains the corresponding confidence interval for the mean response.
34
■
Chapter 10
.a
c.k
r
Analysis of Variance
10.3. t-tests are to examine the hypothesis that two normal populations have the same mean value. To test
the hypothesis H0 : µ1 = µ2 = · · · = µm , one may think of running t-tests on all of the
m
2
pairs of samples.
However, multiple testing may increase the chance of making a Type I error. In general, the probability that
m independent tests produce at least one Type I error is 1 − (1 − α)m . Because multiple t-tests here use the
st
same data several times, they may not be independent and would have complicated dependencies. Therefore,
■
ka
i
the analysis of variance is the preferred method to test H0 .
10.6. We test the null hypothesis H0 : µ1 = µ2 against the alternative hypothesis H1 : µ1 ̸= µ2 . m = 2 and
n = 10. By definition,
m
1 X
17.5 + 17.87
Xi =
= 17.685,
m i=1
2
lee
@
X.. =
SSb = n
SSW =
m
X
X i − X..
i=1
m X
n
X
2
= 0.6845,
(Xij − Xi. )2 = 934.681.
i=1 j=1
gk
We compute the test statistic
F =
SSb /(m − 1)
= 0.01318.
SSW /(nm − m)
m
F < F0.05,1,18 = 4.41, and the p-value is 0.9099. Therefore, we accept H0 at the 5% level of significance.
10.8. By definition,
Si2 =
■
n
1 X
(Xij − Xi. )2 .
n − 1 j=1
Hence,
SSW =
m X
n
X
(Xij − Xi. )2 =
i=1 j=1
m
X
(n − 1)Si2 .
i=1
■
35
Chapter 10. Analysis of Variance
10.12. m = 3 and n = 12. By Exercise 10.8,
SSW = (n − 1)
m
X
Si2 = 11 · (145 + 138 + 150) = 4763.
i=1
By definition,
m
1 X
X i = 34,
X.. =
m i=1
m
X
X i − X..
2
= 12 · (−2)2 + 62 + (−4)2 = 672.
.a
c.k
r
SSb = n
i=1
(a) We test the null hypothesis H0 : µ1 = µ2 = µ3 against the alternative hypothesis H1 : not all the means
are equal. We compute the test statistic
F =
1008
SSb /(m − 1)
=
.
SSW /(nm − m)
433
st
F < F0.05,2,33 = 3.29, and the p-value is 0.1133. Therefore, we accept H0 at the 5% level of significance.
(b) Let
ka
i
r
√
C(3, 33, 0.05) 433
1
SSW
=
.
W = √ C(m, nm − m, α)
nm − m
6
n
By the T -method,
P {µ1 − µ2 ∈ (X1. − X2. − W, X1. − X2. + W ) = (−8 − W, −8 + W )} = 0.95,
lee
@
P {µ1 − µ3 ∈ (X1. − X3. − W, X1. − X3. + W ) = (2 − W, 2 + W )} = 0.95,
P {µ2 − µ3 ∈ (X2. − X3. − W, X2. − X3. + W ) = (10 − W, 10 + W )} = 0.95.
■
gk
10.13. m = 3 and n = 5. By definition,
m
1 X
33 + 34 + 32.8
499
Xi =
=
,
m i=1
3
15
"
2 2 2 #
m
X
2
4
11
7
62
SSb = n
X i − X.. = 5 ·
−
+
+ −
=
,
15
15
15
15
i=1
m
X.. =
SSW =
m X
n
X
744
(Xij − Xi. )2 =
.
5
i=1 j=1
(a) We test the null hypothesis H0 : µ1 = µ2 = µ3 against the alternative hypothesis H1 : not all the means
are equal. We compute the test statistic
F =
SSb /(m − 1)
1
= .
SSW /(nm − m)
6
F < F0.05,2,12 = 3.89, and the p-value is 0.8484. Therefore, we accept H0 at the 5% level of significance.
36
Chapter 10. Analysis of Variance
(b) Let
1
W = √ C(m, nm − m, α)
n
r
SSW
≈ 5.937.
nm − m
By the T -method,
P {µ1 − µ2 ∈ (X1. − X2. − W, X1. − X2. + W ) ≈ (−6.937, 4.937)} = 0.95,
P {µ1 − µ3 ∈ (X1. − X3. − W, X1. − X3. + W ) ≈ (−5.737, 6.137)} = 0.95,
.a
c.k
r
P {µ2 − µ3 ∈ (X2. − X3. − W, X2. − X3. + W ) ≈ (−4.737, 7.137)} = 0.95.
■
10.16. Because x.. is the average of finitely many terms, we can change the order of summation. By definition,
m
gk
lee
@
ka
i
st
m n
n m
1 XX
1 XX
x.. =
xij =
xij
mn i=1 j=1
nm j=1 i=1
n
m
m
1 X
1 X X xij
=
xi.
=
m i=1 j=1 n
m i=1
!
n
m
n
1 X X xij
1X
=
=
x.j .
n j=1 i=1 m
n j=1
37
■
0
You can add this document to your study collection(s)
Sign in Available only to authorized usersYou can add this document to your saved list
Sign in Available only to authorized users(For complaints, use another form )