Statistics Practice Paper

MTRM6002 / STATISTICAL METHODS
HOMEWORK #1
DUE: 5 OCTOBER 2021 @ 6:00 PM
Dr. Nordia Thomas
This is an individual assignment.
MTRM6002 – Dr. Nordia Thomas – Homework #1
PROBLEM 1
Weekly Salaries. Professor Hassett spent one summer working for a small mathematical consulting firm.
The firm employed a few senior consultants, who made between $800 and $1050 per week; a few junior
consultants, who made between $400 and $450 per week; and several clerical workers, who made $300 per
week. The following table displays the salary data (in $).
300   400   300   300   940
300   450   1050  400   300
a. Obtain the sample mean and sample standard deviation of this (ungrouped) data set.
[6]
b. A frequency distribution using single-value grouping is presented in the first two columns of the
following table. The third column of the table is for the xf-values, that is, class mark or midpoint
(which here is the same as the class) times class frequency. Complete the missing entries in the table
and then use the grouped-data formula to obtain the sample mean.
[5]
Salary    Frequency    Salary · Frequency
x         f            xf
300       5            1500
400       2
450       1
940       1
1050      1
c. Compare the answers that you obtained for the sample mean in parts (a) and (b). Explain why
the grouped-data formula always yields the actual sample mean when the data are grouped by using
single-value grouping. (Hint: What does xf represent for each class?)
[2]
d. Construct a table similar to the one in part (b) but with columns for x, f, x − x̄, (x − x̄)², and (x − x̄)²f.
Use the table and the grouped-data formula to obtain the sample standard deviation.
[13]
e. Compare your answers for the sample standard deviation in parts (a) and (d). Explain why the
grouped-data formula always yields the actual sample standard deviation when the data are grouped
by using single-value grouping.
[2]
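The grouped-data formulas referred to in parts (b) and (d) can be sketched in Python as follows. This is only an illustration, not the method prescribed for the assignment: the function names grouped_mean and grouped_stdev are hypothetical, and the formulas assumed are x̄ = Σxf / n and s = sqrt(Σ(x − x̄)²f / (n − 1)), with n = Σf. The salary values are those given in Problem 1.

```python
import math
import statistics

def grouped_mean(values, freqs):
    """Grouped-data mean: sum of x*f divided by n = sum of f."""
    n = sum(freqs)
    return sum(x * f for x, f in zip(values, freqs)) / n

def grouped_stdev(values, freqs):
    """Grouped-data sample standard deviation with divisor n - 1."""
    n = sum(freqs)
    xbar = grouped_mean(values, freqs)
    ss = sum((x - xbar) ** 2 * f for x, f in zip(values, freqs))
    return math.sqrt(ss / (n - 1))

# Salary data from Problem 1, raw and with single-value grouping.
raw = [300, 400, 300, 300, 940, 300, 450, 1050, 400, 300]
x = [300, 400, 450, 940, 1050]
f = [5, 2, 1, 1, 1]

# With single-value grouping, each x*f is the exact total of its class,
# so the grouped results should agree with the ungrouped ones.
same_mean = math.isclose(grouped_mean(x, f), statistics.mean(raw))
same_stdev = math.isclose(grouped_stdev(x, f), statistics.stdev(raw))
print(same_mean, same_stdev)
```

The comparison is made against statistics.mean and statistics.stdev from the Python standard library, which work on the raw observations directly.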
PROBLEM 2
Days to Maturity. The first two columns of the following table provide a frequency distribution, using
limit grouping, for the days to maturity of 40 short-term investments, as found in BARRON’S. The third
column shows the class marks.
Days to maturity    Frequency    Class mark
                    f            x
30-39               3            34.5
40-49               1            44.5
50-59               8            54.5
60-69               10           64.5
70-79               7            74.5
80-89               7            84.5
90-99               4            94.5
a. Use the grouped-data formulas to estimate the sample mean and sample standard deviation of the
days-to-maturity data. Round your final answers to one decimal place.
[10]
b. The following table gives the raw days-to-maturity data.
70  62  75  57  51  64  38  56  53  36
99  67  71  47  63  55  70  51  50  66
64  60  99  55  85  89  69  68  81  79
87  78  95  80  83  65  39  86  98  70
Find the true sample mean and sample standard deviation, rounded to one decimal place. Compare
these actual values of x̄ and s to the estimates from part (a). Explain why the grouped-data formulas
generally yield only approximations to the sample mean and sample standard deviation for
non-single-value grouping.
[8]
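The class-mark estimate in part (a) can be sketched in Python as follows. This is an illustration under stated assumptions rather than a required method: the helper names class_mark and grouped_estimates are hypothetical, each class mark is taken as the midpoint of the class limits, and the standard deviation uses the divisor n − 1. The classes and frequencies are those in the Problem 2 table.

```python
import math

def class_mark(lower, upper):
    """Midpoint of a class given its lower and upper limits."""
    return (lower + upper) / 2

def grouped_estimates(classes, freqs):
    """Return (mean, stdev) estimated from class marks and frequencies."""
    marks = [class_mark(lo, hi) for lo, hi in classes]
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(marks, freqs)) / n
    ss = sum((x - mean) ** 2 * f for x, f in zip(marks, freqs))
    return mean, math.sqrt(ss / (n - 1))

# Classes and frequencies from the Problem 2 table.
classes = [(30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]
freqs = [3, 1, 8, 10, 7, 7, 4]

est_mean, est_stdev = grouped_estimates(classes, freqs)
print(round(est_mean, 1), round(est_stdev, 1))
```

Every observation in a class is treated as if it were equal to the class mark, so the spread of values inside each class is ignored.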
PROBLEM 3
Stressed-Out Bus Drivers. Frustrated passengers, congested streets, time schedules, and air and noise
pollution are just some of the physical and social pressures that lead many urban bus drivers to retire
prematurely with disabilities such as coronary heart disease and stomach disorders. An intervention program
designed by the Stockholm Transit District was implemented to improve the work conditions of the city’s
bus drivers. Improvements were evaluated by G. Evans et al., who collected physiological and psychological
data for bus drivers who drove on the improved routes (intervention) and for drivers who were assigned the
normal routes (control). Their findings were published in the article “Hassles on the Job: A Study of a Job
Intervention With Urban Bus Drivers” (Journal of Organizational Behavior, Vol. 20, pp. 199-208). Following
are data, based on the results of the study, for the heart rates, in beats per minute, of the intervention and
control drivers.
Intervention
68  66  74  58  69  63  68  73  64  76
74  77  60  66  63  52  53  77  71  73

Control
67  63  77  76  54  73  63  60  68  66  55
71  59  68  64  57  54  64  84  82  80
For the intervention and control drivers:
a. Find the mean.
[4]
b. Find the median.
[6]
c. Find the mode.
[2]
d. Find the range.
[4]
e. Find the mean absolute deviation.
[6]
f. Find the 80th percentile.
[2]
g. Find the lower and upper quartiles.
[4]
h. Find the interquartile range and semi-interquartile range.
[3]
i. Find the variance and standard deviation.
[5]
j. Find the skewness.
[4]
k. Find the kurtosis.
[4]
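Several of the measures requested in Problem 3 have more than one convention in use, so the following Python sketch is only one possible reading and may differ from the definitions used in the course. It assumes the mean absolute deviation is taken about the mean, skewness and kurtosis are the moment coefficients m3 / m2^1.5 and m4 / m2^2 with divisor n, and quartiles follow the default "exclusive" rule of the standard-library function statistics.quantiles. The helper names and the five-value sample are hypothetical.

```python
import statistics

def mean_abs_dev(data):
    """Mean absolute deviation about the mean."""
    xbar = statistics.mean(data)
    return sum(abs(x - xbar) for x in data) / len(data)

def central_moment(data, k):
    """k-th central moment with divisor n."""
    xbar = statistics.mean(data)
    return sum((x - xbar) ** k for x in data) / len(data)

def skewness(data):
    """Moment coefficient of skewness, m3 / m2**1.5."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    """Moment coefficient of kurtosis, m4 / m2**2 (not excess kurtosis)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

# Hypothetical sample, chosen only to illustrate the calls.
sample = [1, 2, 3, 4, 5]

# Quartiles under the default "exclusive" method of statistics.quantiles.
q1, q2, q3 = statistics.quantiles(sample, n=4)
iqr = q3 - q1

print(mean_abs_dev(sample), skewness(sample), kurtosis(sample))
print(q1, q2, q3, iqr, iqr / 2)
```

A symmetric sample such as this one has zero skewness under this definition; other textbooks use sample-adjusted versions of both coefficients, and other percentile rules can give slightly different quartiles.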
PROBLEM 4
Outliers and Trimmed Means. Some data sets contain outliers, observations that fall well outside the
overall pattern of the data. Suppose, for instance, that you are interested in the ability of high school
algebra students to compute square roots. You decide to give a square-root exam to 10 of these students.
Unfortunately, one of the students had a fight with his girlfriend and cannot concentrate — he gets a 0. The
10 scores are displayed in increasing order in the following table. The score of 0 is an outlier.
0   58   61   63   67   69   70   71   78   80
Statisticians have a systematic method for avoiding extreme observations and outliers when they calculate
means. They compute trimmed means, in which high and low observations are deleted or “trimmed off”
before the mean is calculated. For instance, to compute the 10% trimmed mean of the test-score data, we
first delete both the bottom 10% and the top 10% of the ordered data, that is, 0 and 80. Then we calculate
the mean of the remaining data. Thus the 10% trimmed mean of the test-score data is
(58 + 61 + 63 + 67 + 69 + 70 + 71 + 78) / 8 = 67.125.
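The trimming procedure just described can be sketched in Python as follows. This is a minimal illustration: the function name trimmed_mean is hypothetical, and it assumes that the number of observations times the trimming proportion is a whole number, as it is in this example.

```python
def trimmed_mean(data, proportion):
    """Mean after dropping the lowest and highest `proportion` of the data."""
    ordered = sorted(data)
    k = round(len(ordered) * proportion)  # observations cut from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Test scores from the square-root exam example.
scores = [0, 58, 61, 63, 67, 69, 70, 71, 78, 80]

print(trimmed_mean(scores, 0.10))  # 67.125, matching the worked value above
```

With a proportion of 0 nothing is removed, and the function returns the usual mean.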
The following table displays a set of scores for a 40-question algebra final exam.
2    4    15   15   16   16   16   17   19   20
21   21   21   24   25   25   26   27   27   28
a. Do any of the scores look like outliers?
[2]
b. Compute the usual mean of the data.
[2]
c. Compute the 5% trimmed mean of the data.
[2]
d. Compute the 10% trimmed mean of the data.
[2]
e. Compare the means you obtained in parts (b) through (d). Which of the three means provides the best
measure of center for the data?
[2]