Class 19 Selected Practice Problems with answers

Practice Problems for Exam 2
Paraphrased problems from EMBS.
30. (p 484) A large auto insurance company selected random samples of single and married male policyholders and recorded the number who made a claim over the preceding three-year period.
Single Male Policy-Holders
n=400
Number making claims = 76
Married Male Policy-Holders
n=900
Number making claims = 90
Is the observed difference in claim rates statistically significant?
OBSERVED
            claim     none    total
single         76      324      400
married        90      810      900
total         166     1134     1300

EXPECTED
            claim     none
single       51.1    348.9
married     114.9    785.1

DISTANCES
            claim     none
single       12.2      1.8
married       5.4      0.8

Calculated Chi-squared = 20.1
p-value = chidist(20.1,1) = 7.20624E-06
p-value = chitest(observed,expected) = 7.20624E-06
YES, we reject the null hypothesis of independence because the p-value is less than 0.05. The difference in
sample claim rates is statistically significant.
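The expected counts, distances, and p-value above can be reproduced outside Excel. The following is a minimal sketch using only the Python standard library, not the spreadsheet's own method. It relies on the fact that a chi-squared variable with 1 degree of freedom has upper tail erfc(sqrt(x/2)), and it cross-checks the result against the pooled two-proportion z statistic, whose square should equal the chi-squared statistic for a 2x2 table. Variable names are illustrative.

```python
import math

# Observed counts from the problem: rows = single, married; columns = claim, none.
observed = [[76, 324], [90, 810]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-squared statistic = sum of (observed - expected)^2 / expected over all cells.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# With exactly 1 degree of freedom, the upper tail is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2))

# Cross-check: pooled two-proportion z statistic; z squared should match chi_sq.
p_single, p_married = 76 / 400, 90 / 900
p_pooled = (76 + 90) / (400 + 900)
z = (p_single - p_married) / math.sqrt(p_pooled * (1 - p_pooled) * (1 / 400 + 1 / 900))

print(round(chi_sq, 1), p_value, round(z ** 2, 1))
```

The shortcut for the p-value only works for 1 degree of freedom; larger tables need a general chi-squared tail function.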
40. (p 486) The Wall Street Journal Subscriber Study gathered data on the employment status of a
random sample of subscribers. Sample results were broken out for subscribers of the eastern and
western editions.
                               Region
EMPLOYMENT STATUS   Eastern Edition   Western Edition
Full-Time                      1105               574
Part-Time                        31                15
Self Employed                   229               186
Not employed                    485               344
Test the hypothesis that employment status is independent of region.
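EXPECTED
                    Eastern Edition   Western Edition   Total
Full-Time                    1046.2             632.8    1679
Part-Time                      28.7              17.3      46
Self Employed                 258.6             156.4     415
Not employed                  516.6             312.4     829
Total                          1850              1119    2969

DISTANCES
                    Eastern Edition   Western Edition
Full-Time                      3.31              5.46
Part-Time                      0.19              0.32
Self Employed                  3.39              5.60
Not employed                   1.93              3.19

Calculated Chi-square = 23.4
p-value = chidist(23.4,3) = 3.3759E-05
p-value = chitest(observed,expected) = 3.3759E-05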
We reject the null hypothesis that employment status and region are independent.
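For a table with more than two rows, the same calculation generalizes: the degrees of freedom are (rows - 1) x (columns - 1), here 3. The sketch below is an illustration in standard-library Python, not the spreadsheet's method. The helper names are hypothetical, and the tail function is a closed form that holds only for exactly 3 degrees of freedom.

```python
import math

# Observed counts: rows = employment status, columns = (Eastern, Western) edition.
observed = [
    [1105, 574],  # Full-Time
    [31, 15],     # Part-Time
    [229, 186],   # Self Employed
    [485, 344],   # Not employed
]


def chi_square_statistic(table):
    """Return (statistic, degrees of freedom) for a test of independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count under independence
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi_square_tail_df3(x):
    """Upper-tail probability of a chi-squared variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


stat, df = chi_square_statistic(observed)
p_value = chi_square_tail_df3(stat)
print(df, round(stat, 1), p_value)
```

The statistic passed to the tail function is the calculated chi-square (about 23.4), not the sample size.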
PEP Problem NOT in the Book. THE MARK and UPTOWN KITCHEN are separate restaurants owned by
the same firm. The weekend revenue generated by THE MARK is normally distributed with mean
$80,000 and standard deviation $10,000. The UPTOWN KITCHEN revenues are normally distributed with
mean $100,000 and standard deviation of $20,000. The following questions refer to the total weekend
revenue enjoyed by the firm.
a. What is the mean?
b. What is the standard deviation?
c. What is the shape of the distribution?
e. Which of your answers to a,b,c requires revenues from the two restaurants to be independent?
f. Which of your answers to a,b,c requires that revenues at each store are normal?
g. (Difficult?) Next weekend, which restaurant will bring in more revenue?
a. The mean of the sum is the sum of the means, always. The mean total revenue is $180,000.
b. The variance of the sum is the sum of the variances IF the revenues are independent. So with independent
revenues, the standard deviation of total revenue is (10000^2+20000^2)^.5 = $22,361.
c. Normal. Sums of normal variables are normal (strictly, this requires the two revenues to be jointly
normal, which independence guarantees).
e. Only the answer to b requires independence, so that we can add the variances.
f. Only the answer to c requires normality of the individual revenues. The central limit theorem does not
apply because n is only 2.
g. The trick to answering this is to realize it is a question about the difference in revenues. The difference
in revenues (UPTOWN - MARK) will be normal with mean $20,000 and standard deviation $22,361 (the
variance of a difference is the sum of the variances given independence, so the difference is just as
unpredictable as the sum). So the difference will be negative (THE MARK will bring in more revenue) with
probability NORMDIST(0,20000,22361,true) = 0.186. So THE MARK will bring in more revenue than
UPTOWN KITCHEN with probability 0.186 (given independence, which is NOT a reasonable assumption, but
without it we can't do much).
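The sum and difference calculations can be checked with Python's statistics.NormalDist, whose + and - operators between two NormalDist objects treat them as independent. This is a minimal sketch under that same independence assumption; the variable names are illustrative.

```python
from statistics import NormalDist

# Weekend revenue distributions stated in the problem.
mark = NormalDist(mu=80_000, sigma=10_000)
uptown = NormalDist(mu=100_000, sigma=20_000)

# Adding or subtracting NormalDist objects assumes the two are independent:
# means add (or subtract), variances add in both cases.
total = mark + uptown
difference = uptown - mark

# Probability that THE MARK out-earns UPTOWN KITCHEN, i.e. the difference is negative.
p_mark_wins = difference.cdf(0)

print(total.mean, round(total.stdev), difference.mean, round(p_mark_wins, 3))
```

Note that the sum and the difference share the same standard deviation, which is the point made in answer g.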
46 (p 341) AARP estimated that the average annual expenditure on restaurants and carryout food was
$1,873 for people 50 and over. Suppose this estimate is based on a random sample of 80 people and
that the sample standard deviation is $550.
b. What is the 95% confidence interval for the population mean amount spent on restaurants and
carryout.
sample mean          1873
t.inv.2t(.05,79)     1.99045
s                    550
s/n^.5               61.5
margin of error      122.4
Lower limit          1750.6
Upper limit          1995.4
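The interval is just the sample mean plus or minus the t critical value times the standard error. A minimal sketch of that arithmetic follows; the critical value 1.99045 is copied from the T.INV.2T(.05,79) result above rather than computed, since the Python standard library has no inverse t function.

```python
import math

sample_mean = 1873.0
s = 550.0          # sample standard deviation
n = 80
t_crit = 1.99045   # taken from T.INV.2T(.05,79) in the table above, not computed here

standard_error = s / math.sqrt(n)
margin = t_crit * standard_error
lower, upper = sample_mean - margin, sample_mean + margin

print(round(standard_error, 1), round(margin, 1), round(lower, 1), round(upper, 1))
```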
d. If the distribution of amounts is positively skewed (not normal), would you expect the sample median
amount to be greater or less than $1,873?
We would expect the sample median to be less than the sample mean for a positively skewed
distribution.
e. (not in the book). If the distribution of amounts is positively skewed (not normal), does this make your
answer to b.) invalid? Briefly explain.
A normal population is not required for our confidence interval to be valid. This is because the central
limit theorem says sample means are approximately normal when n is large, regardless of the shape of the
population distribution. Here n is 80, which is large enough for the central limit theorem to apply.
52 (p 390). The chamber of commerce of a Florida Gulf Coast community advertises that residential
property is available at a mean cost of $125,000 or less per acre. A random sample of 32 properties has
a sample mean cost per acre of $130,000 and sample standard deviation of $12,500. Comment on the
validity of the advertising statement.
H0 is mean = 125 and Ha is mean > 125 (in thousands of dollars). T-stat is (130-125)/(12.5/32^.5) = 2.27.
P-value = t.dist.rt(2.27,31) = 0.015. This is statistically significant. We can reject the hypothesis that
the claim is truthful in favor of the alternative hypothesis that the mean is greater than $125K.
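The one-tailed test can be reproduced without a statistics package. The sketch below computes the t statistic and then approximates the right-tail probability by integrating the Student t density numerically with Simpson's rule. The helper t_upper_tail is a hypothetical stand-in for Excel's T.DIST.RT, not a library function, and the result should land close to the 0.015 reported above.

```python
import math


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 by Simpson's rule on the t density over [0, t]."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def density(u):
        return const * (1 + u * u / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    area = density(0.0) + density(t)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * density(i * h)
    # Half the probability lies above 0; subtract the part between 0 and t.
    return 0.5 - area * h / 3


# Values in thousands of dollars, as in the answer above.
sample_mean, claimed_mean, s, n = 130.0, 125.0, 12.5, 32

t_stat = (sample_mean - claimed_mean) / (s / math.sqrt(n))
p_value = t_upper_tail(t_stat, n - 1)

print(round(t_stat, 2), round(p_value, 3))
```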
56 (p391). Virtual call centers are staffed by individuals working from home. Regional Airways is
considering switching from its traditional call center to a virtual one but only if a level of customer
satisfaction greater than 80% can be maintained. In a test of home agents, 252 out of 300 randomly
chosen customers reported being satisfied with their experience with the call center.
a. Formulate a relevant null and alternative hypothesis.
b. What is the sample proportion of satisfied customers?
c. What is the p-value of your test of hypothesis?
d. What is your conclusion?
Let H0 be P (the satisfaction probability) = 0.8 and Ha: P>0.8. Here we hope to reject the null hypothesis.
We have a variety of ways to calculate the p-value:
Using the binomial
p-value = Pr(X>=252 given H0) = 1-BINOMDIST(251,300,.8,true) = 0.046.
Using the normal approximation to the binomial
p-value = Pr(X>=252 given H0) = 1-NORMDIST(252,300*.8,(300*.8*.2)^.5,true) = 0.042.
Using the Z-statistic calculated using number satisfied
p-value = Pr(Z>=(252-240)/(300*.8*.2)^.5) = Pr(Z>=1.73) = 1-NORM.S.DIST(1.73,true) = 0.042.
Using the Z-statistic calculated using sample proportion satisfied
p-value = Pr(Z>= (252/300-.8)/(.8*.2/300)^.5) = 1-NORM.S.DIST(1.73,true) = 0.042.
The sample proportion satisfied is 252/300 = 0.84, and our conclusion is that this sample proportion is
statistically significantly higher than 0.8.
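The exact binomial and normal-approximation p-values can be compared side by side. This is a minimal standard-library sketch that mirrors the Excel formulas above; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p0, satisfied = 300, 0.8, 252

# Exact binomial: Pr(X >= 252) when the true satisfaction probability is 0.8.
p_exact = sum(
    math.comb(n, k) * p0 ** k * (1 - p0) ** (n - k)
    for k in range(satisfied, n + 1)
)

# Normal approximation to the count, with no continuity correction
# (this mirrors the NORMDIST formula above).
mean = n * p0
sd = math.sqrt(n * p0 * (1 - p0))
p_normal = 1 - NormalDist(mean, sd).cdf(satisfied)

# Z statistic from the sample proportion; it equals the Z from the count.
z = (satisfied / n - p0) / math.sqrt(p0 * (1 - p0) / n)

print(round(p_exact, 3), round(p_normal, 3), round(z, 2))
```

The exact value is somewhat larger than the approximation because the normal formula here ignores the discreteness of the count.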
42 (p446). Mutual funds are either load or no-load. Because load mutual funds charge fees not charged
by no load funds, the question is whether load funds provide a higher mean return. Returns from a
random sample of 30 load and 30 no load mutual funds are provided in an accompanying Excel
spreadsheet.
a. Formulate a null and alternative hypothesis such that rejection of the null leads to the conclusion that
load funds have higher mean returns.
b. Use the data to test your hypothesis. What is the p-value and your conclusion?
H0: mean returns are equal. Ha: mean return is greater for load funds. This will be a two-sample,
one-tailed t-test of differences in means. The resulting p-value is 0.28, which means we cannot reject the
null in favor of load funds having a higher mean return. The difference in sample means is not statistically
significant.
44 (p 446). Typical prices of single-family homes in Florida are shown for a random sample of 15
metropolitan areas (Naples Daily News, February 23, 2003). Data are in thousands of dollars and are in
the spreadsheet.
Metro Area               Jan-03   Jan-02
Daytona Beach               117       96
Fort Lauderdale             207      169
Fort Myers                  143      129
Fort Walton Beach           139      134
Gainesville                 131      119
Jacksonville                128      119
Lakeland                     91       85
Miami                       193      165
Naples                      263      233
Ocala                        86       90
Orlando                     134      121
Pensacola                   111      105
Sarasota-Bradenton          168      141
Tallahassee                 140      130
Tampa-St. Petersburg        139      129
Have mean prices changed across the two years? Formulate and test an appropriate hypothesis.
H0: means are equal. Ha: they are not. Use a two-sample paired t-test.
The p-value is 0.000166, so we reject the null hypothesis. There is a statistically significant difference in
means.
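A paired test works on the 15 within-area differences (Jan-03 minus Jan-02) as a single sample. The sketch below computes the paired t statistic from the listed prices and approximates the two-tailed p-value by numerical integration of the t density. The helper t_upper_tail is hypothetical and stands in for a spreadsheet t-distribution function.

```python
import math
from statistics import mean, stdev

# Prices in thousands of dollars, in the order the metro areas are listed.
jan_03 = [117, 207, 143, 139, 131, 128, 91, 193, 263, 86, 134, 111, 168, 140, 139]
jan_02 = [96, 169, 129, 134, 119, 119, 85, 165, 233, 90, 121, 105, 141, 130, 129]


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 by Simpson's rule on the t density over [0, t]."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def density(u):
        return const * (1 + u * u / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    area = density(0.0) + density(t)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * density(i * h)
    return 0.5 - area * h / 3


# Paired test: analyze the within-area differences as one sample.
diffs = [new - old for new, old in zip(jan_03, jan_02)]
n = len(diffs)
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
p_two_tailed = 2 * t_upper_tail(abs(t_stat), n - 1)

print(mean(diffs), round(t_stat, 2), p_two_tailed)
```

Pairing matters here: prices vary far more between metro areas than between years, so an unpaired comparison would bury the year-to-year change in area-to-area noise.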
46 (p 448). A study claimed that self-employed individuals do not experience greater job satisfaction
than individuals who are not self-employed. Job satisfaction was measured using 18 questions with
answers ranging from 1 to 5. The total score was the measure of job satisfaction. Scores for individuals
in four separate professions are given below and in the spreadsheet.
Lawyer   Physical Therapist   Cabinetmaker   Systems Analyst
    44                   55             54                44
    42                   78             65                73
    74                   80             79                71
    42                   86             69                60
    53                   60             79                64
    50                   59             64                66
    45                   62             59                41
    48                   52             78                55
    64                   55             84                76
    38                   50             60                62
Are the differences in sample mean job satisfaction scores across the four professions statistically
significant?
Yes, the p-value from the ANOVA single factor test of equality of the four means is only 0.0061. The
differences in sample means ARE statistically significant.
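The single-factor ANOVA can be reproduced by hand: split the total variation into between-group and within-group sums of squares and compare their mean squares. The sketch below ASSUMES the listed scores read row by row across the four professions, giving 10 scores each. The helper f_upper_tail is hypothetical and approximates the F-distribution tail through the regularized incomplete beta function by Simpson's rule.

```python
import math

# ASSUMPTION: the scores are listed row by row across the four professions.
scores = {
    "Lawyer": [44, 42, 74, 42, 53, 50, 45, 48, 64, 38],
    "Physical Therapist": [55, 78, 80, 86, 60, 59, 62, 52, 55, 50],
    "Cabinetmaker": [54, 65, 79, 69, 79, 64, 59, 78, 84, 60],
    "Systems Analyst": [44, 73, 71, 60, 64, 66, 41, 55, 76, 62],
}

groups = list(scores.values())
all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)


def f_upper_tail(f, d1, d2, steps=2000):
    """Approximate P(F > f) as I_x(d2/2, d1/2) with x = d2 / (d2 + d1 * f).

    Simpson's rule is adequate here because d1 >= 2 and d2 >= 2 keep the integrand bounded.
    """
    a, b = d2 / 2, d1 / 2
    x = d2 / (d2 + d1 * f)
    beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    h = x / steps  # steps must be even for Simpson's rule
    area = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * integrand(i * h)
    return area * h / (3 * beta)


p_value = f_upper_tail(f_stat, df_between, df_within)
print(round(f_stat, 2), round(p_value, 4))
```

Under the row-wise reading, the group means are 50.0, 63.7, 69.1, and 61.2, and the resulting p-value should be close to the 0.0061 reported above.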