Stat 330 (Spring 2015): Homework 12

Due: May 1, 2015
Show all of your work, and please staple your assignment if you use more than one sheet. Write your name,
the course number and the section on every sheet. Show all work to earn partial credit. Problems marked
with * will be graded and one additional randomly chosen problem will be graded.
1. * A student interested in purchasing a multifunction inkjet printer wanted to study the relationship
between the retail price (in dollars) and the printing speed (pages per minute) of a typical personal
printer. He obtained data for 20 inkjet printers from a current PC magazine and calculated the following
statistics:
$n = 20$
$\sum x_i^2 = 167.51$
$\sum x_i = 56.3$
$\sum y_i^2 = 658528$
$\sum y_i = 3232.0$
$\sum x_i y_i = 9918.3$
(a) Estimate the linear regression equation predicting the price of an inkjet printer, y, based on the
printing speed, x.
(b) Predict the price in dollars for a printer with a speed of 3.5 pages a minute.
(c) Calculate the sample correlation between x and y and the coefficient of determination, $R^2$. What
does the $R^2$ value tell you about the fit of your line?
(d) Provide an estimate of the error variance $\sigma^2$.
Answer:
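We first compute $S_{xx}$, $S_{xy}$ and $S_{yy}$:
$$S_{xx} = \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n = 167.51 - 56.3^2/20 = 9.0255$$
$$S_{xy} = \sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right) / n = 9918.3 - (56.3)(3232)/20 = 820.22$$
$$S_{yy} = \sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2 / n = 658525 - 3232^2/20 = 136233.8$$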
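The arithmetic above can be rechecked directly from the summary statistics. The following is a minimal Python sketch, not part of the original solution; the variable names are illustrative, and it assumes $\sum y_i^2 = 658525$, the value that appears in the worked $S_{yy}$ line.

```python
# Summary statistics for the n = 20 printers.
n = 20
sum_x, sum_x2 = 56.3, 167.51
sum_y = 3232.0
# Assumption: use 658525, the value in the worked S_yy line
# (the problem statement lists 658528).
sum_y2 = 658525.0
sum_xy = 9918.3

# Corrected sums of squares and cross-products.
Sxx = sum_x2 - sum_x**2 / n
Sxy = sum_xy - sum_x * sum_y / n
Syy = sum_y2 - sum_y**2 / n

print(f"Sxx = {Sxx:.4f}, Sxy = {Sxy:.2f}, Syy = {Syy:.1f}")
```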
(a) The regression equation is $E(Y_i) = \beta_0 + \beta_1 x_i$, where $Y_i$ is the price for printer $i$ and $x_i$ is the speed
of the printer. We estimate $\beta_0$ and $\beta_1$ using the following formulas:
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \frac{S_{xy}}{S_{xx}} = \frac{820.22}{9.0255} = 90.8780677$$
$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x = 3232/20 - 90.8780677 \times (56.3/20) = -94.22176057$$
So the prediction equation is $\hat y_i = -94.2218 + 90.8781 x_i$.
(b) To predict the price in dollars for a printer with a speed of 3.5 pages a minute, plug the speed
$x_i = 3.5$ into the prediction equation: $\hat y = -94.2218 + 90.8781 \times 3.5 = 223.85$ dollars.
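Parts (a) and (b) can be reproduced in a few lines. This is a sketch under the assumption that $S_{xx} = 9.0255$ and $S_{xy} = 820.22$ are taken as given from the computation above; `predict_price` is a hypothetical helper name introduced only for illustration.

```python
# Inputs taken from the worked solution.
n = 20
sum_x, sum_y = 56.3, 3232.0
Sxx, Sxy = 9.0255, 820.22

# Least-squares slope and intercept.
b1 = Sxy / Sxx
b0 = sum_y / n - b1 * (sum_x / n)

def predict_price(speed):
    """Predicted price in dollars for a printing speed in pages per minute."""
    return b0 + b1 * speed

print(f"slope = {b1:.4f}, intercept = {b0:.4f}")
print(f"predicted price at 3.5 ppm = {predict_price(3.5):.2f}")
```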
(c) We can estimate the correlation $r$ and $R^2$ using the following formulas:
$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{820.22}{\sqrt{9.0255}\,\sqrt{136233.8}} = 0.74$$
$$R^2 = r^2 = 0.55$$
The linear model describes only 55% of the variability in the price of inkjet printers, so this is only
a weak to moderate fit to the data.
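The correlation and coefficient of determination follow from the same three quantities. A small Python sketch, assuming the values of $S_{xx}$, $S_{xy}$ and $S_{yy}$ computed earlier in this answer:

```python
from math import sqrt

# Values taken from the worked solution.
Sxx, Sxy, Syy = 9.0255, 820.22, 136233.8

r = Sxy / sqrt(Sxx * Syy)  # sample correlation
r_squared = r**2           # coefficient of determination

print(f"r = {r:.2f}, R^2 = {r_squared:.2f}")
```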
(d) An estimate of the error variance $\sigma^2$ is given by
$$\hat\sigma^2 = \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat y_i)^2 = \frac{SSE}{n-2}.$$
So we first need to find $SSE = SST - SSR$. The regression sum of squares $SSR$ is given by
$$SSR = \hat\beta_1 \times S_{xy} = 90.8780677 \times 820.22 = 74540.0$$
and $SST = S_{yy} = 136233.8$. Thus $SSE = 136233.8 - 74540.0 = 61693.8$, giving
$$\hat\sigma^2 = 61693.8/(20 - 2) = 3427.43.$$
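The sum-of-squares decomposition in (d) can be checked numerically. This is a minimal sketch that assumes the values of $S_{xx}$, $S_{xy}$ and $S_{yy}$ from the start of this answer; the variable names are illustrative.

```python
# Values taken from the worked solution.
n = 20
Sxx, Sxy, Syy = 9.0255, 820.22, 136233.8

b1 = Sxy / Sxx             # estimated slope
SSR = b1 * Sxy             # regression sum of squares
SST = Syy                  # total sum of squares
SSE = SST - SSR            # error (residual) sum of squares
sigma2_hat = SSE / (n - 2) # estimate of the error variance

print(f"SSR = {SSR:.1f}, SSE = {SSE:.1f}, sigma^2 estimate = {sigma2_hat:.2f}")
```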
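2. * (Baron’s book): 9.10
Answer:
(c) The 90% confidence interval for $\sigma$ is
$$\left[\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}},\ \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}\right] = \left[\sqrt{\frac{(2)(400)}{5.99}},\ \sqrt{\frac{(2)(400)}{0.10}}\right] = [11.6, 89.4] \text{ (thousand dollars)}$$
9.10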
(a) Find $\hat p = 24/200 = 0.12$. Then for $\alpha = 1 - 0.96 = 0.04$, find $z_{\alpha/2} = z_{0.02} = 2.054$
(the easiest way is to use Table A5 with $\infty$ degrees of freedom).
$$\hat p \pm z_{0.02} \sqrt{\frac{\hat p (1 - \hat p)}{n}} = 0.12 \pm (2.054) \sqrt{\frac{0.12(1 - 0.12)}{200}} = 0.12 \pm 0.047 \text{ or } [0.073, 0.167]$$
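The interval in (a) can be reproduced without tables. The sketch below uses `statistics.NormalDist` from the Python standard library in place of Table A5; `proportion_ci` is a hypothetical helper name, not something from the textbook.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """Large-sample (normal approximation) confidence interval for a proportion."""
    p_hat = successes / n
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 24 of 200 sampled items, 96% confidence, as in the solution above.
lower, upper = proportion_ci(24, 200, 0.96)
print(f"[{lower:.3f}, {upper:.3f}]")
```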
(b) Test $H_0: p \le 0.1$ (or $H_0: p = 0.1$) vs $H_A: p > 0.1$. Disproving the manufacturer’s claim means rejecting $H_0$ in favor of this $H_A$.
This is a one-sided test, therefore our two-sided confidence interval in (a) cannot
be used to conduct this test.
The observed test statistic is
$$Z = \frac{\hat p - p_0}{\sqrt{\hat p (1 - \hat p)/n}} = \frac{0.12 - 0.1}{\sqrt{0.12(1 - 0.12)/200}} = 0.8704.$$
In order to consider different significance levels, let us compute the P-value,
$$P = P\{Z > 0.8704\} = 1 - \Phi(0.8704) = 1 - 0.8078 = 0.1922,$$
from Table A4.
The P-value exceeds both 0.04 and 0.15. Therefore, we do not have significant
evidence, at the mentioned levels, to disprove the manufacturer’s claim.
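The one-sided test in 9.10 (b) can be checked with the standard normal CDF from the Python standard library instead of Table A4. This is a sketch only; it follows the solution above in using $\hat p$ (rather than $p_0$) in the standard error, so the last digits may differ slightly from the table-based values.

```python
from math import sqrt
from statistics import NormalDist

p_hat, p0, n = 24 / 200, 0.1, 200

# Standard error estimated with p_hat, as in the solution above.
se = sqrt(p_hat * (1 - p_hat) / n)
z = (p_hat - p0) / se

# Right-tail P-value for H_A: p > 0.1.
p_value = 1 - NormalDist().cdf(z)
print(f"Z = {z:.4f}, P-value = {p_value:.4f}")

# Compare against the two significance levels mentioned in the solution.
for alpha in (0.04, 0.15):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha}: {decision}")
```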
9.11 Test $H_0: p_1 = p_2$ vs $H_A: p_1 > p_2$. Higher quality means lower proportion of defective items.
Given $\hat p_1 = 0.12$ from a sample of size $n = 200$ and $\hat p_2 = 13/150 = 0.0867$ from a
sample of size $m = 150$, we compute the pooled proportion
$$\hat p(\text{pooled}) = \frac{n \hat p_1 + m \hat p_2}{n + m} = \frac{24 + 13}{200 + 150} = 0.1057.$$
Then, the test statistic is
$$Z = \frac{0.12 - 0.0867}{\sqrt{(0.1057)(1 - 0.1057)\left(\frac{1}{200} + \frac{1}{150}\right)}} = 1.0027.$$
Finally, we compute the P-value
$$P = P\{Z > 1.0027\} = 1 - 0.8413 = 0.1587$$
(Table A4). It is rather large, and we conclude that there is no significant evidence
that the quality of items produced by the new supplier is higher than the quality of
items in Exercise 9.10.
3. * (Baron’s book): 9.16
Answer:
Then the test statistic is
$$Z = \frac{0.6 - 0.59}{\sqrt{(0.5941)(1 - 0.5941)\left(\frac{1}{70} + \frac{1}{100}\right)}} = 0.1307.$$
The P-value equals
$$P = 2P\{Z > |0.1307|\} = 2(1 - 0.5517) = 0.8966$$
(from Table A4). This is a very high P-value, thus there is no significant difference between
the support of the candidate in the two towns.
9.16 Here $n_1 = 250$, $n_2 = 300$, $\hat p_1 = 10/250 = 0.04$, and $\hat p_2 = 18/300 = 0.06$.
(a) A 98% confidence interval for $p_1 - p_2$ is
$$\hat p_1 - \hat p_2 \pm z_{0.02/2} \sqrt{\frac{\hat p_1 (1 - \hat p_1)}{n_1} + \frac{\hat p_2 (1 - \hat p_2)}{n_2}} = 0.04 - 0.06 \pm 2.326 \sqrt{\frac{(0.04)(0.96)}{250} + \frac{(0.06)(0.94)}{300}} = -0.02 \pm 0.043 \text{ or } [-0.063, 0.023]$$
(b) The null hypothesis $H_0: p_1 = p_2$ ($p_1 - p_2 = 0$) is not rejected against the two-sided alternative
$H_A: p_1 \ne p_2$ at the 2% level because the 98% confidence interval
for $p_1 - p_2$ contains 0. No, there is no significant difference between the quality
of the two lots.
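The two-sample calculations in 9.11 and 9.16 can be sketched in Python as follows. The helper names `diff_ci` and `pooled_z_test` are hypothetical, introduced only for illustration, and `statistics.NormalDist` stands in for Tables A4 and A5. Because the code works with unrounded proportions, the 9.11 statistic comes out slightly different from the hand-rounded 1.0027.

```python
from math import sqrt
from statistics import NormalDist

def diff_ci(x1, n1, x2, n2, confidence):
    """Large-sample confidence interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - margin, (p1 - p2) + margin

def pooled_z_test(x1, n1, x2, n2):
    """Z statistic and right-tail P-value for H0: p1 = p2 vs HA: p1 > p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 1 - NormalDist().cdf(z)

# Exercise 9.16 (a): 10 of 250 vs 18 of 300, 98% confidence.
lower, upper = diff_ci(10, 250, 18, 300, 0.98)
print(f"98% CI for p1 - p2: [{lower:.3f}, {upper:.3f}]")
print("contains 0:", lower <= 0 <= upper)

# Exercise 9.11: 24 of 200 vs 13 of 150, one-sided pooled test.
z, p_value = pooled_z_test(24, 200, 13, 150)
print(f"Z = {z:.4f}, P-value = {p_value:.4f}")
```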
9.17 For $\hat p_1 = 45\%$ support of candidate A, the margin of error is
$$z_{0.025} \sqrt{\frac{\hat p_1 (1 - \hat p_1)}{n}} = 1.96 \sqrt{\frac{(0.45)(0.55)}{900}} = 0.0325 \text{ or } 3.25\%$$
For p̂2 = 35% support of candidate B, the margin of error is