Solutions to Exercises for Chapter 2
2.1.
a) Range = 54
CI = 2 – leads to 27 intervals – bad choice
CI = 3 – leads to 18 intervals – a possibility
CI = 5 – leads to 10–11 intervals – also a possibility
Choice: With only 57 scores, we would choose CI width = 5, giving 10–11 intervals.
Lowest interval would be 20–24.
b) Range = 154
CI = 5 – leads to 30–31 intervals – bad choice
CI = 10 – leads to 15–16 intervals – OK
CI = 15 – leads to 10–11 intervals – OK
Choice: With only 32 scores, I would use CI = 15, yielding 10–11 intervals. Lowest
interval would be 120–134.
c) Range = 29
CI = 2 – leads to 14–15 intervals – OK
CI = 3 – leads to 9–10 intervals – OK?
Choice: With the relatively large number of scores (112), I would opt for CI = 2. Lowest
interval would be 8–9.
d) Range = 120
CI = 5 – leads to 24 intervals – not too good
CI = 10 – leads to 12 intervals – OK
CI = 15 – leads to 8 intervals – not too good
Choice: CI = 10, giving about 12 intervals. Lowest interval would be 390–399.
In actual practice, if there are two possible solutions, you can construct both grouped distributions and
then decide which solution best depicts the data, in your opinion.
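The arithmetic behind parts (a) through (d) can be sketched in R. The helper name intervals_for is our own and does not come from the text; it only divides the range by each candidate width, so a fractional result such as 10.8 corresponds to an answer like "10–11 intervals".

```r
# Hypothetical helper (not from the text): approximate number of class
# intervals, obtained by dividing the range by each candidate width.
intervals_for <- function(range, widths) {
  n <- range / widths
  names(n) <- paste0("CI=", widths)
  n
}

intervals_for(54,  c(2, 3, 5))     # part (a)
intervals_for(154, c(5, 10, 15))   # part (b)
intervals_for(29,  c(2, 3))        # part (c)
intervals_for(120, c(5, 10, 15))   # part (d)
```

The exact count still depends on where the lowest interval starts, which is why the solutions quote ranges such as 10–11.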
2.2. The first step in constructing a frequency distribution is to identify the largest and smallest values in
the data set. For these data, the largest score is 29 and the smallest score is 4. To construct the ungrouped
distribution, you list all possible values between the largest and smallest scores, as is done below in Part A.
The final step in constructing an ungrouped frequency distribution is to convert the “tallies” to numbers,
as in Part B.
Part A               Part B           Part C
Y     f(Y)           Y     f(Y)       Class Interval   Freq.
29    |              29    1          28-29            1
28                   28    0          26-27            0
27                   27    0          24-25            1
26                   26    0          22-23            4
25    |              25    1          20-21            2
24                   24    0          18-19            3
23    |              23    1          16-17            5
22    |||            22    3          14-15            3
21    ||             21    2          12-13            7
20                   20    0          10-11            7
19    |              19    1          8-9              9
18    ||             18    2          6-7              11
17    |||            17    3          4-5              7
16    ||             16    2
15    ||             15    2
14    |              14    1
13    |||            13    3
12    ||||           12    4
11    ||||           11    4
10    |||            10    3
9     ||||| |        9     6
8     |||            8     3
7     |||||          7     5
6     ||||| |        6     6
5     |||||          5     5
4     ||             4     2
The ungrouped distribution indicates that there are more scores in the low end than in the high
end. Given that there are 26 values in the ungrouped distribution, one might wish to construct a grouped
distribution. Applying the guidelines presented earlier in this chapter, a class interval width of two would
yield about 13 intervals. An interval width of three would result in about eight or nine intervals.
Therefore, we would opt for the solution using a width of two. However, one could argue that the class
interval width should be three, given that √60 is between 7 and 8. The grouped frequency distribution is
presented in Part C above. Note that the nominal lower limit of each interval is a multiple of the interval
width.
In addition, we entered the data into an Excel spreadsheet. In the first cell (A1), we entered the
name, “jobsat”, without the quotation marks. Immediately below, we entered the 60 values, all in column
1. Then, we saved the Excel file as a text file named chap2.ex2. Excel automatically appended the
extension, .txt. We then started R and executed the following commands:
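# Read the text file saved from Excel; the first row holds the variable name
chap2.ex2 <- read.table("c:/bookdatar/chap2.ex2.txt", header = TRUE)
attach(chap2.ex2)   # make jobsat accessible by name
names(chap2.ex2)    # variable names in the data frame
length(jobsat)      # number of scores
table(jobsat)       # ungrouped frequency distribution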
The output from R:
> names(chap2.ex2)
[1] "jobsat"
> length(jobsat)
[1] 60
> table(jobsat)
jobsat
 4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 21 22 23 25 29
 2  5  6  5  3  6  3  4  4  3  1  2  2  3  2  1  2  3  1  1  1
As you can see, R provides us with the ungrouped frequency distribution. We can go from here to
construct a grouped frequency distribution as before.
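As a minimal sketch of that last step, the grouping can also be done in R with cut(). The scores are rebuilt here from the table(jobsat) output so that the example runs without the data file; the object names values, counts, lower, and grouped are ours, not the text's.

```r
# Rebuild the 60 jobsat scores from the table(jobsat) output shown above.
values <- c(4:19, 21, 22, 23, 25, 29)
counts <- c(2, 5, 6, 5, 3, 6, 3, 4, 4, 3, 1, 2, 2, 3, 2, 1, 2, 3, 1, 1, 1)
jobsat <- rep(values, times = counts)

# Class intervals of width 2 with nominal lower limits 4, 6, ..., 28.
# The breaks sit at the real limits 3.5, 5.5, ..., 29.5.
lower <- seq(4, 28, by = 2)
grouped <- table(cut(jobsat,
                     breaks = seq(3.5, 29.5, by = 2),
                     labels = paste(lower, lower + 1, sep = "-")))

# List the intervals from highest to lowest, as in Part C.
rev(grouped)
```

Because cut() returns a factor with one level per interval, table() keeps the empty interval 26-27 with a frequency of 0.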
2.3. First, let’s use R to create a table of the distribution of these computer aptitude (comp.apt) scores.
We created an Excel file as we did with the previous exercise. The results of the table command:
> table(comp.apt)
comp.apt
17 26 35 42 45 48 49 51 53 54 55 57 58 59 60 61 62 63 64 66 67 68 69 70 71 72
 1  1  1  1  2  1  2  3  1  2  1  2  2  2  1  1  1  1  2  3  1  1  3  2  2  3
75 76 77 79 81 83 93 97
 1  1  1  1  1  2  1  1
In order to construct a grouped frequency distribution for these data, we identify the extreme values: the
highest is 97 and the lowest is 17. Thus, the cases span about 80 values. If we were to use an interval
width of 2, we would have about 40 intervals, not a satisfactory solution. An interval width of 3 would
result in about 27 intervals, still unsatisfactory. Using an interval width of 5 would yield about 16
intervals, thus meeting our criterion of between 10 and 20 intervals. If we were to employ an interval
width of 10, we would have about 8 intervals, too few to meet the criterion. However, one (again) could
argue that √52 is approximately 7. For now, we select five as our interval width. To construct the class
intervals, we need to use multiples of 5 as the nominal lower limits, starting with the interval that will
contain the smallest value, and proceeding up. Given that the smallest value is 17, we would start with the
interval 15–19. The entire solution is shown below. The results of the “tallying” are shown in Part A.
Converting the “tallies” to numbers results in Part B, the final solution.
Part A                          Part B
Class Interval   f(Y)           Class Interval   f(Y)
95–99            |              95–99            1
90–94            |              90–94            1
85–89                           85–89            0
80–84            |||            80–84            3
75–79            ||||           75–79            4
70–74            ||||| ||       70–74            7
65–69            ||||| |||      65–69            8
60–64            ||||| |        60–64            6
55–59            ||||| ||       55–59            7
50–54            ||||| |        50–54            6
45–49            |||||          45–49            5
40–44            |              40–44            1
35–39            |              35–39            1
30–34                           30–34            0
25–29            |              25–29            1
20–24                           20–24            0
15–19            |              15–19            1
Nearly all the values fall between 45 and 84. There are two extremely high scores and a tail of
extremely low scores.
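The rule for the nominal lower limits and the grouping itself can be sketched in R. The scores are rebuilt from the table(comp.apt) output so that the example runs on its own; the object names width, lowest, lower, and grouped are ours, not the text's.

```r
# Rebuild the 52 comp.apt scores from the table(comp.apt) output above.
values <- c(17, 26, 35, 42, 45, 48, 49, 51, 53, 54, 55, 57, 58, 59, 60, 61,
            62, 63, 64, 66, 67, 68, 69, 70, 71, 72, 75, 76, 77, 79, 81, 83,
            93, 97)
counts <- c(1, 1, 1, 1, 2, 1, 2, 3, 1, 2, 1, 2, 2, 2, 1, 1,
            1, 1, 2, 3, 1, 1, 3, 2, 2, 3, 1, 1, 1, 1, 1, 2,
            1, 1)
comp.apt <- rep(values, times = counts)

width <- 5
# Nominal lower limit of the lowest interval: the largest multiple of the
# width that does not exceed the smallest score.
lowest <- floor(min(comp.apt) / width) * width
lower  <- seq(lowest, max(comp.apt), by = width)

# Breaks sit at the real limits, half a unit below each nominal lower limit.
grouped <- table(cut(comp.apt,
                     breaks = c(lower, max(lower) + width) - 0.5,
                     labels = paste(lower, lower + width - 1, sep = "-")))

# List the intervals from highest to lowest, as in Part B.
rev(grouped)
```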
2.4. After rolling the die 60 times, we observed the number of times each side of the die appeared as
below. Your solutions will be different from ours.
Y     f(Y)
6     11
5     9
4     11
3     13
2     7
1     9
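If no die is at hand, the 60 rolls can be simulated in R. This is only a sketch: the seed value and the object names rolls and die.freq are arbitrary choices of ours, and the resulting frequencies will differ from the ones tabled above.

```r
# Simulate 60 rolls of a fair six-sided die. The seed is an arbitrary
# choice that only makes the run repeatable.
set.seed(1)
rolls <- sample(1:6, size = 60, replace = TRUE)

# Frequency of each face, listed from 6 down to 1 as in the table above.
# Using a factor keeps a face in the table even if it never appears.
die.freq <- table(factor(rolls, levels = 6:1))
die.freq
```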
2.5. In our solution to Exercise 2.2 above, we noted that we might have used a class interval width of
either 2 or 3, although we used 2 in the solution. In this exercise, we are asked to draw a histogram of our
grouped solution. We used R to examine solutions for both CI widths (2 and 3). First, using a CI of 2,
and then 3:
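detach(chap2.ex3)
attach(chap2.ex2)
# Class interval width of 2: breaks at the real limits 3.5, 5.5, ..., 29.5
hist(jobsat, breaks = seq(3.5, 29.5, 2),
     xlab = "Job Satisfaction Scores for 60 Public School Teachers")
rug(jitter(jobsat))   # mark the individual scores, jittered, along the x-axis
# Class interval width of 3: breaks at 2.5, 5.5, ..., 29.5
hist(jobsat, breaks = seq(2.5, 29.5, 3),
     xlab = "Job Satisfaction Scores for 60 Public School Teachers")
rug(jitter(jobsat))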
[Figure: two histograms titled "Histogram of jobsat", one for each class interval width; x-axis "Job Satisfaction Scores for 60 Public School Teachers", y-axis "Frequency".]
Take your pick!
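To compare the two groupings numerically rather than by eye, the bar heights can be extracted without drawing anything. This is a sketch; the scores are rebuilt from the table(jobsat) output in Exercise 2.2, and the names h2 and h3 are ours.

```r
# Rebuild the 60 jobsat scores from the table(jobsat) output in Exercise 2.2.
jobsat <- rep(c(4:19, 21, 22, 23, 25, 29),
              times = c(2, 5, 6, 5, 3, 6, 3, 4, 4, 3, 1,
                        2, 2, 3, 2, 1, 2, 3, 1, 1, 1))

# plot = FALSE returns the histogram object without drawing it.
h2 <- hist(jobsat, breaks = seq(3.5, 29.5, 2), plot = FALSE)
h3 <- hist(jobsat, breaks = seq(2.5, 29.5, 3), plot = FALSE)

h2$counts   # bar heights for a class interval width of 2
h3$counts   # bar heights for a class interval width of 3
```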
2.6.
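detach(chap2.ex2)
attach(chap2.ex3)
# Density-scaled histogram with a class interval width of 5
hist(comp.apt, prob = TRUE, breaks = seq(14.5, 99.5, 5),
     xlab = "Computer Aptitude Scores for 52 College Professors")
lines(density(comp.apt))   # overlay a kernel density estimate
rug(jitter(comp.apt))      # mark the individual scores along the x-axis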
[Figure: "Histogram of comp.apt" with an overlaid density curve; x-axis "Computer Aptitude Scores for 52 College Professors", y-axis "Density".]
2.7. Using R with the data from Exercise 2.3, we have constructed two stem-and-leaf displays, one with
an interval width of 5 (corresponding to the histogram and frequency polygon) and the other with an
interval width of 10.
> stem(comp.apt)
The decimal point is 1 digit(s) to the right of the |
1 | 7
2 | 6
3 | 5
4 | 255899
5 | 1113445778899
6 | 01234466678999
7 | 00112225679
8 | 133
9 | 37
> stem(comp.apt,scale=2)
The decimal point is 1 digit(s) to the right of the |
1 | 7
2 |
2 | 6
3 |
3 | 5
4 | 2
4 | 55899
5 | 111344
5 | 5778899
6 | 012344
6 | 66678999
7 | 0011222
7 | 5679
8 | 133
8 |
9 | 3
9 | 7