Uploaded by Louis Angel Ramirez

Math 258: Intro to Statistics B
Final Exam
Spring 2023
Name:
Please read the instructions carefully.
• You have 120 minutes to complete the exam.
• Including this page, your exam booklet should have 9 pages.
• Read each question carefully.
• You must show all necessary work to receive full credit.
• You are permitted the use of a calculator – cell phones are not allowed.
• No textbooks, notes, or neighboring students may be referenced during the exam.
• Round all calculations to 3 decimal places unless otherwise instructed.
Q        Score
1        ____
2        ____
3        ____
4        ____
5        ____
Total    ____
1) Age of the Bride The data below represent the ages of brides, in years, from a random sample of 20
marriage certificates filed in Cook County, Illinois.
35  27  24  30  24  27  31  24  23  22
25  23  28  22  26  21  33  25  24  27
Note: Summary statistics and a normality test for this data can be found in the Minitab Output.
1a) Use the sign test to determine whether the median age of brides in Cook County, Illinois is less than
the national average age of 25.8 years, at level of significance α = 0.05.
• State the hypotheses to be tested.
• Calculate the test statistic k₀.
• Identify the appropriate critical value(s).
• State both your formal and proper conclusions.
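The sign-test arithmetic in 1a can be checked with a short Python sketch. It assumes the left-tailed convention in which k₀ counts the observations above the hypothesized median (the plus signs), and it reports an exact binomial P-value instead of a table critical value. The variable names are illustrative and not part of the exam.

```python
from math import comb

# Bride ages from Question 1 (n = 20)
ages = [35, 27, 24, 30, 24, 27, 31, 24, 23, 22,
        25, 23, 28, 22, 26, 21, 33, 25, 24, 27]
m0 = 25.8  # hypothesized median

plus = sum(a > m0 for a in ages)    # observations above m0
minus = sum(a < m0 for a in ages)   # observations below m0
n = plus + minus                    # observations equal to m0 are dropped

# Assumed convention for H1: median < m0 -> k0 is the number of plus signs
k0 = plus

# Exact left-tail P-value under H0, where the signs follow Binomial(n, 0.5)
p_value = sum(comb(n, i) for i in range(k0 + 1)) / 2 ** n

print(f"plus={plus}, minus={minus}, k0={k0}, P-value={p_value:.4f}")
```

A P-value above α = 0.05 corresponds to failing to reject H₀ under this convention.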
1b) Use the one-sample t-test to determine whether the mean age of brides in Cook County, Illinois is less
than the national average age of 25.8 years, at level of significance α = 0.05.
• State the hypotheses to be tested.
• Calculate the test statistic t₀.
• Identify the appropriate critical value(s), and sketch the rejection region for this test.
• State both your formal and proper conclusions.
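As a check on the hand calculation in 1b, the statistic t₀ = (x̄ − μ₀) / (s / √n) can be sketched in Python as follows. The critical value is hard-coded from a standard t-table (one tail, α = 0.05, 19 degrees of freedom); the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Bride ages from Question 1 (n = 20)
ages = [35, 27, 24, 30, 24, 27, 31, 24, 23, 22,
        25, 23, 28, 22, 26, 21, 33, 25, 24, 27]
mu0 = 25.8  # hypothesized mean

n = len(ages)
x_bar = mean(ages)   # sample mean
s = stdev(ages)      # sample standard deviation (n - 1 in the denominator)

t0 = (x_bar - mu0) / (s / sqrt(n))

# Left-tailed test: reject H0 when t0 < -t(0.05, df = 19)
t_crit = -1.729      # table value, assumed rather than computed
reject = t0 < t_crit

print(f"x_bar={x_bar:.3f}, s={s:.3f}, t0={t0:.3f}, reject H0: {reject}")
```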
1c) Which of the above tests is more appropriate for the given sample? Justify your answer.
2) Comparing Athletes The resting heart rates, in beats per minute (bpm), were measured for independent random samples of 11 swimmers and 9 track athletes, producing the following data.
Swimmers   Track
79         82
66         81
62         70
75         70
79         75
78         72
77         68
71         63
75         65
68
77
62  63  65  66  68  68  70  70  71  72  75  75  75  77  77  78  79  79  81  82
Note: Summary statistics and a normality test for this data can be found in the Minitab Output.
2a) Use the Mann-Whitney test to determine whether the median resting heart rate of swimmers is greater
than that of track athletes, at level of significance α = 0.05.
• State the hypotheses to be tested.
• Calculate the test statistic U.
• Identify the appropriate critical value(s).
• State both your formal and proper conclusions.
2b) Use Welch’s t-test to determine whether the mean resting heart rate of swimmers is greater than that
of track athletes, at level of significance α = 0.05.
• State the hypotheses to be tested.
• Calculate the test statistic t₀.
• Identify the appropriate critical value(s), and sketch the rejection region for this test.
• State both your formal and proper conclusions.
2c) Which of the above tests is more appropriate for the given sample? Justify your answer.
3) The Salary Gap The following data represent the annual salaries, in thousands of dollars, for independent random samples of non-minority women, non-minority men, and minority individuals employed by
a particular shoe company.
Women   Men   Minorities
23      45    18
41      55    30
54      60    34
60      70    41
78      72    44
18  23  30  34  41  41  44  45  54  55  60  60  70  72  78
3a) Use the Kruskal-Wallis test to determine whether the median annual salary at this company differs
between women, men, and minorities, at level of significance α = 0.05.
• Calculate the test statistic H.
• Identify the appropriate critical value.
• State both your formal and proper conclusions.
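The rank bookkeeping for 3a can be sketched in Python as below. The sketch assumes the salary table splits as Women 23, 41, 54, 60, 78; Men 45, 55, 60, 70, 72; Minorities 18, 30, 34, 41, 44 (a split that reproduces the mean differences quoted in 3e). It uses H = 12 / (N(N + 1)) · Σ Rᵢ²/nᵢ − 3(N + 1) with midranks for ties and no tie correction, and the chi-square critical value is taken from a table. Names such as `midrank` are illustrative.

```python
# Salaries in thousands of dollars (assumed column split, see the note above)
groups = {
    "women": [23, 41, 54, 60, 78],
    "men": [45, 55, 60, 70, 72],
    "minorities": [18, 30, 34, 41, 44],
}

pooled = sorted(v for values in groups.values() for v in values)
N = len(pooled)

def midrank(value):
    """Average of the ranks occupied by `value` in the pooled sample."""
    first = pooled.index(value) + 1    # 1-based rank of the first occurrence
    count = pooled.count(value)        # number of tied observations
    return first + (count - 1) / 2

rank_sums = {name: sum(midrank(v) for v in values)
             for name, values in groups.items()}

# Kruskal-Wallis statistic without a tie correction
H = (12 / (N * (N + 1))
     * sum(rank_sums[name] ** 2 / len(values) for name, values in groups.items())
     - 3 * (N + 1))

chi2_crit = 5.991   # table value for alpha = 0.05, df = k - 1 = 2
print(rank_sums, f"H={H:.3f}", "reject H0" if H > chi2_crit else "FTR H0")
```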
3b) Use one-way ANOVA to determine whether the mean annual salary at this company differs between
women, men, and minorities, at level of significance α = 0.05.
• Complete the ANOVA table below to calculate the test statistic F₀.
Source       df      SS         MS      F
Treatment    ____    1884.13    ____    ____
Error        ____    2615.20    ____
• Identify the appropriate critical value, and sketch the rejection region for this test.
• State both your formal and proper conclusions.
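The remaining ANOVA table entries follow from the two sums of squares given in 3b. A minimal Python sketch, assuming k = 3 treatments and N = 15 observations in total, with the F critical value hard-coded from a table:

```python
k, N = 3, 15                      # number of treatments, total sample size
ss_treatment = 1884.13            # given in the exam
ss_error = 2615.20                # given in the exam

df_treatment = k - 1
df_error = N - k

ms_treatment = ss_treatment / df_treatment   # mean square for treatments
ms_error = ss_error / df_error               # mean square error (MSE)

F0 = ms_treatment / ms_error

f_crit = 3.885                    # table value F(0.05; 2, 12), assumed
print(f"df=({df_treatment}, {df_error}), MST={ms_treatment:.3f}, "
      f"MSE={ms_error:.3f}, F0={F0:.3f}, reject H0: {F0 > f_crit}")
```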
3c) Which of the above tests is more appropriate for the given sample? Justify your answer.
Note: Summary statistics and a normal probability plot for this data can be found in the Minitab Output.
Remember to check both of the necessary conditions for ANOVA.
3d) Use pairwise Mann-Whitney tests with Bonferroni adjustment to determine which pairs of treatments,
if any, have significantly different medians, at level of significance α = 0.05.
• The level of significance to be used in each individual comparison is α/m = ____
• Circle the correct conclusions for each comparison.
Comparison              Mann-Whitney P-value    Formal Conclusion    Population Medians are...
women vs. men           0.4633                  (reject / FTR) H₀    (similar / different)
women vs. minorities    0.1732                  (reject / FTR) H₀    (similar / different)
men vs. minorities      0.0122                  (reject / FTR) H₀    (similar / different)
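The Bonferroni step in 3d amounts to dividing α by the number of comparisons m and comparing each P-value with the result. A small Python sketch using the P-values listed in the table; the dictionary keys are illustrative labels:

```python
alpha = 0.05
p_values = {
    "women vs. men": 0.4633,
    "women vs. minorities": 0.1732,
    "men vs. minorities": 0.0122,
}

m = len(p_values)          # number of pairwise comparisons
alpha_each = alpha / m     # Bonferroni-adjusted level for each comparison

decisions = {pair: ("reject" if p < alpha_each else "FTR")
             for pair, p in p_values.items()}

print(f"alpha/m = {alpha_each:.4f}")
for pair, decision in decisions.items():
    print(f"{pair}: {decision} H0")
```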
3e) Use Tukey’s test to determine which pairs of treatments, if any, have significantly different means, at
level of significance α = 0.05.
• Calculate the test statistics q₀(i, j).
• Identify the appropriate critical value.
• Circle the correct conclusions for each comparison.
x̄i − x̄j                         q₀(i, j)    q(α, ν, k)    Formal Conclusion    Population Means are...
x̄men − x̄minorities = 27.0       ____        ____          (reject / FTR) H₀    (similar / different)
x̄women − x̄minorities = 17.8     ____        ____          (reject / FTR) H₀    (similar / different)
x̄men − x̄women = 9.2             ____        ____          (reject / FTR) H₀    (similar / different)
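For equal group sizes, the Tukey statistic in 3e reduces to q₀(i, j) = (x̄i − x̄j) / √(MSE / n). The sketch below assumes n = 5 per group, takes MSE from the error row of the ANOVA table in 3b (SS = 2615.20 on 12 degrees of freedom), and hard-codes the studentized range critical value from a table. Names are illustrative.

```python
from math import sqrt

ms_error = 2615.20 / 12        # MSE from the ANOVA error row (df = 12)
n_per_group = 5                # assumed equal group sizes

se = sqrt(ms_error / n_per_group)   # denominator of the Tukey statistic

# Differences in sample means, as given in the exam table
diffs = {
    "men - minorities": 27.0,
    "women - minorities": 17.8,
    "men - women": 9.2,
}

q0 = {pair: d / se for pair, d in diffs.items()}

q_crit = 3.77                  # table value q(0.05; nu = 12, k = 3), assumed
for pair, q in q0.items():
    verdict = "reject" if q > q_crit else "FTR"
    print(f"{pair}: q0 = {q:.3f} -> {verdict} H0")
```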
3f) Which of the above techniques is more appropriate for the given sample? Justify your answer.
Bonus: Based on whichever set of multiple comparisons is most appropriate, use the underlining technique
to identify groups of treatments with similar means/medians.
men     women     minorities
4a) Effective Marketing To compare the effectiveness of a particular company’s national marketing
campaign in two different cities, researchers randomly selected individuals from each city, and asked them
to report their level of awareness of the company’s product. Out of 200 people sampled in New York, 91
reported they had purchased or were aware of the company’s product. Out of 300 people in Los Angeles, 132
had purchased or were aware of the product. Use the two-sample normal approximation test of proportions
to determine whether there is a difference in the proportions of people in these cities who are familiar with
this product, at level of significance α = 0.05.
• State the hypotheses to be tested.
• Calculate the test statistic z₀.
• Identify the appropriate critical value(s), and sketch the rejection region.
• State both your formal and proper conclusions.
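The pooled two-proportion statistic for 4a can be checked with the following Python sketch. It uses the counts stated in the problem and the standard-library `statistics.NormalDist` for the two-sided critical value; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 91, 200     # New York: aware or purchased, sample size
x2, n2 = 132, 300    # Los Angeles: aware or purchased, sample size

p1_hat = x1 / n1
p2_hat = x2 / n2
p_pool = (x1 + x2) / (n1 + n2)   # pooled proportion under H0: p1 = p2

se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z0 = (p1_hat - p2_hat) / se

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
reject = abs(z0) > z_crit

print(f"p1={p1_hat:.3f}, p2={p2_hat:.3f}, pooled={p_pool:.3f}, "
      f"z0={z0:.3f}, z_crit=±{z_crit:.3f}, reject H0: {reject}")
```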
4b) Effective Marketing The complete set of data collected for the above study is provided in the table
below. Researchers randomly sampled individuals from New York, Los Angeles, and Chicago, and asked
them to report whether they had purchased, were aware of, or were not aware of the company’s product. Use
the appropriate chi-square test (independence or homogeneity) to determine whether there is a difference in
the level of awareness of this company’s product among these three cities, at level of significance α = 0.05.
                               Awareness
City               Purchased    Aware     Not Aware    (Total)
New York           36           55        109          200
  (Expected)       41.54        ____      100.31
  (Contribution)   0.74         ____      0.75
Chicago            45           56        49           150
  (Expected)       31.15        43.62     ____
  (Contribution)   6.15         3.52      ____
Los Angeles        54           78        168          300
  (Expected)       ____         87.23     150.46
  (Contribution)   ____         0.98      2.04
(Total)            135          189       326          650
• Identify which test is to be performed.
• Calculate any missing expected counts, and contributions to the test statistic.
• Calculate the test statistic χ₀².
• Identify the appropriate critical value, and sketch the rejection region.
• State both your formal and proper conclusions.
• Identify which cell deviates most from H₀.
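The expected counts and cell contributions for 4b can be cross-checked with a short Python sketch. Each expected count is (row total × column total) / grand total, and each contribution is (observed − expected)² / expected. The observed counts are read from the table (Chicago's Not Aware count of 49 is the value that makes its row total 150), and the chi-square critical value for 4 degrees of freedom is taken from a table. Names are illustrative.

```python
levels = ["Purchased", "Aware", "Not Aware"]
observed = {
    "New York": [36, 55, 109],
    "Chicago": [45, 56, 49],
    "Los Angeles": [54, 78, 168],
}

row_totals = {city: sum(counts) for city, counts in observed.items()}
col_totals = [sum(counts[j] for counts in observed.values())
              for j in range(len(levels))]
grand_total = sum(row_totals.values())

contributions = {}
for city, counts in observed.items():
    for j, obs in enumerate(counts):
        expected = row_totals[city] * col_totals[j] / grand_total
        contributions[(city, levels[j])] = (obs - expected) ** 2 / expected

chi2_0 = sum(contributions.values())
df = (len(observed) - 1) * (len(levels) - 1)

chi2_crit = 9.488   # table value for alpha = 0.05, df = 4, assumed
largest_cell = max(contributions, key=contributions.get)

print(f"chi2_0={chi2_0:.3f}, df={df}, reject H0: {chi2_0 > chi2_crit}")
print("largest contribution:", largest_cell)
```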
5) Boiling at Altitude In the 1850s, Scottish physicist James Forbes recorded the barometric pressure
(in inches Hg) and the boiling point of water (in degrees Fahrenheit) at different points in the Swiss Alps.
A simple linear regression model was fit to this data, treating barometric pressure as predictor and boiling
point as response. Note: The output from this analysis is provided in the Minitab Output.
• State the estimated equation of the regression line for these variables. What should we expect the
boiling point at altitude to be when the barometric pressure is 26 inches Hg?
• Identify and interpret the coefficient of determination for this model.
• Construct and interpret a 95% confidence interval for the slope β₁.
Is this relationship significant? Justify your answer.
• Construct and interpret the appropriate interval to estimate the mean boiling point of water in the
Swiss Alps when the barometric pressure is 26 inches Hg.
• Use the Minitab Output to assess model conditions for this data. Briefly justify your answers.
◦ (Yes / No) Does a linear model appear to be appropriate for the given data?
◦ (Yes / No) Do the residuals appear to have constant variance as x varies?
◦ (Yes / No) Do the residuals appear to be normally distributed?
◦ (Yes / No) Would it be appropriate to use this model for prediction or inference?
Final (Practice) Exam Minitab Express Output
Question 1:
Question 2:
Question 3:
Question 5: