“An Aggie does not lie, cheat, or steal or tolerate those who do.”
Answer Key for Homework 1
The raw data (Excel format) and the description file (Word format) for the body fat data are posted on the
course webpage under Datasets. Use this dataset and its description to answer the following questions. This
is a large dataset, so I suggest using a spreadsheet such as Excel to save time and avoid arithmetic
mistakes.
1.
(12 pt.) Carefully determine the skewness of each boxplot
[Figure: side-by-side boxplots of age, weight, and height; vertical axis from 0 to 400]
(a) Is age negatively skewed, positively skewed or symmetric?
Positively skewed
(b) Is weight negatively skewed, positively skewed or symmetric?
Positively skewed
(c) Is height negatively skewed, positively skewed or symmetric?
Negatively skewed
(d) Are there any outliers in the age data?
No
(e) Look at the outlier(s) in the weight data. Is there anything interesting about their
characteristics?
Middle age (45, 46), reasonably tall
(f) Look at the outlier(s) in the height data. Is there anything interesting about their
characteristics?
Middle age (44), overweight
2.
(12 pt. each)
(a) What percentage of these men are older than 50 years?
76/252 = 30.16%
(b) What percentage of these men are younger than 50 years?
169/252 = 67.06%
(c) What percentage of these men are younger than 25 years and weigh less than 160 pounds?
4/252 = 1.59%
(d) What percentage of these men are older than 70 years and taller than 60 inches?
7/252 = 2.78%
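The percentages above are simple ratios of a count to n = 252. A minimal sketch that recomputes them from the counts reported in the answers (the dictionary labels are illustrative, and the last step assumes every man has a recorded age):

```python
# Counts reported in the answers above, out of n = 252 men.
n = 252
counts = {
    "older than 50": 76,
    "younger than 50": 169,
    "younger than 25 and under 160 lb": 4,
    "older than 70 and taller than 60 in": 7,
}

# Convert each count to a percentage rounded to two decimals.
percentages = {label: round(100 * k / n, 2) for label, k in counts.items()}
for label, pct in percentages.items():
    print(f"{label}: {pct}%")

# Parts (a) and (b) do not add up to 100% because men aged exactly 50
# belong to neither group (assuming no ages are missing).
exactly_50 = n - counts["older than 50"] - counts["younger than 50"]
print("aged exactly 50:", exactly_50)
```

The leftover count explains why 30.16% and 67.06% sum to less than 100%.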
3.
(6 pt.) The following is the stem and leaf display for the density variable.
Stem-and-Leaf Display: density
Stem-and-leaf of density   n = 252
Leaf Unit = 0.0010

    1    99  5
    1   100
    4   101  048
   23   102  0001355555666778889
   56   103  001111222334445557777777888889999
  100   104  00011111112222333333555666677777888888889999
 (48)   105  000000111122222222233444444556666677777888889999
  104   106  000011112222334444444555666667777778899
   65   107  000000011122222334444555566777778899999
   26   108  00111123444455677
    9   109  00011289
    1   110  8
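The left-hand column of the display is a depth count: observations are accumulated from each end toward the row that contains the median, and that row shows its own count in parentheses. A minimal sketch that recomputes the column from the leaves, assuming the median falls inside a single row (the variable names are illustrative):

```python
# Leaves per stem, copied from the display (stem 100 has no leaves).
leaves = {
    99: "5",
    100: "",
    101: "048",
    102: "0001355555666778889",
    103: "001111222334445557777777888889999",
    104: "00011111112222333333555666677777888888889999",
    105: "000000111122222222233444444556666677777888889999",
    106: "000011112222334444444555666667777778899",
    107: "000000011122222334444555566777778899999",
    108: "00111123444455677",
    109: "00011289",
    110: "8",
}

stems = sorted(leaves)
counts = [len(leaves[s]) for s in stems]
n = sum(counts)

# Running totals from the low end and from the high end.
cum_low = [sum(counts[: i + 1]) for i in range(len(counts))]
cum_high = [sum(counts[i:]) for i in range(len(counts))]

# The median row is the first row whose running total reaches (n + 1) / 2.
median_row = next(i for i, c in enumerate(cum_low) if c >= (n + 1) / 2)

depths = []
for i in range(len(stems)):
    if i < median_row:
        depths.append(str(cum_low[i]))
    elif i == median_row:
        depths.append(f"({counts[i]})")
    else:
        depths.append(str(cum_high[i]))

for depth, stem in zip(depths, stems):
    print(f"{depth:>5} {stem:>4}  {leaves[stem]}")

# With a leaf unit of 0.001, a value is (10 * stem + leaf) / 1000.
smallest = (10 * stems[0] + int(leaves[stems[0]][0])) / 1000
largest = (10 * stems[-1] + int(leaves[stems[-1]][-1])) / 1000
print("smallest density:", smallest, "largest density:", largest)
```

The leaf counts sum to 252, which matches n and confirms that stem 100 is an empty row.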
(a) Is this unimodal data? (Yes/ No)
Yes
(b) Is this data negatively skewed, positively skewed or symmetric?
Negatively skewed
4.
(24 pt.) Construct a frequency distribution for the age variable.
Age                             Frequency   Relative frequency   Cumulative relative frequency
At least 20 and less than 30       36        36/252 = 0.1429       36/252 = 0.1429
At least 30 and less than 40       39        39/252 = 0.1548       75/252 = 0.2976
At least 40 and less than 50       94        94/252 = 0.3730      169/252 = 0.6706
At least 50 and less than 60       47        47/252 = 0.1865      216/252 = 0.8571
At least 60 and less than 70       27        27/252 = 0.1071      243/252 = 0.9643
At least 70 and less than 80        8         8/252 = 0.0317      251/252 = 0.9960
At least 80 and less than 90        1         1/252 = 0.0040      252/252 = 1
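The relative and cumulative relative frequencies follow mechanically from the class counts. A minimal sketch using the frequencies from the table (the short class labels are illustrative):

```python
from itertools import accumulate

# Class intervals [lower, upper) and their frequencies from the table.
classes = ["20-30", "30-40", "40-50", "50-60", "60-70", "70-80", "80-90"]
freq = [36, 39, 94, 47, 27, 8, 1]

n = sum(freq)                               # total number of men
rel = [f / n for f in freq]                 # relative frequency
cum_counts = list(accumulate(freq))         # running total of counts
cum_rel = [c / n for c in cum_counts]       # cumulative relative frequency

print(f"{'Age':<8}{'Freq':>6}{'Rel':>9}{'Cum rel':>10}")
for label, f, r, c in zip(classes, freq, rel, cum_rel):
    print(f"{label:<8}{f:>6}{r:>9.4f}{c:>10.4f}")
```

Computing the cumulative column from running counts, rather than by adding rounded relative frequencies, avoids accumulating rounding error.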
5.
(6 pt.) If you were constructing a histogram for the age variable with the same class intervals as in
the previous question, would there be any gaps in the histogram? According to the histogram, is the
distribution positively skewed, negatively skewed or symmetric?
No gaps
Positively skewed
6.
(18 pt.) Calculate the mean, median, lower quartile, upper quartile, minimum, maximum for the age
variable.
                   MINITAB   by COUNTING   EXCEL
Mean             = 44.885
Median           = 43.000
Lower Quartile   = 35.250       35.5       35.75
Upper Quartile   = 54.000
Minimum          = 22.000
Maximum          = 81.000
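The three lower-quartile values differ only because each method places the quartile at a different position in the sorted data. A sketch under two assumptions: the commonly documented conventions (position 0.25(n + 1) for Minitab, the median of the lower half for hand counting, position 0.25(n - 1) + 1 for Excel's QUARTILE function), and 63rd and 64th ordered ages of 35 and 36. Those two ages are inferred from the reported results, not read from the data file.

```python
n = 252                 # number of men in the dataset
x63, x64 = 35, 36       # ASSUMED 63rd and 64th ordered ages (inferred, not from the file)

# 1-based position of the lower quartile under each convention.
positions = {
    "MINITAB": 0.25 * (n + 1),          # 63.25
    "by COUNTING": (n // 2 + 1) / 2,    # median of the lower 126 values: 63.5
    "EXCEL": 0.25 * (n - 1) + 1,        # 63.75
}

lower_quartile = {}
for method, pos in positions.items():
    # All three positions lie between the 63rd and 64th ordered values,
    # so interpolate linearly between them.
    frac = pos - 63
    lower_quartile[method] = x63 + frac * (x64 - x63)
    print(f"{method}: position {pos}, lower quartile {lower_quartile[method]}")
```

The upper quartile agrees across methods, which is what happens when the two neighboring ordered values are equal, so the interpolation fraction has no effect.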
7.
(12 pt.) Calculate the range, interquartile range, variance, standard deviation for the age variable.
                      MINITAB   by COUNTING   EXCEL
Range               = 59
Interquartile Range = 18.75        18.5       18.25
Variance            = 158.811
Standard Deviation  = 12.602
8.
(10 pt.) The following descriptive statistics give the five-number summary for the body fat
percentage variable. Construct a boxplot for this variable using these descriptive statistics. Make sure
to check the data for the cutoffs and the outliers.
Variable     N     Mean   Median   TrMean   StDev   SE Mean
%bodyfat   252   19.151   19.200   19.074   8.369     0.527

Variable   Minimum   Maximum       Q1       Q3
%bodyfat     0.000    47.500   12.425   25.300
1.5(IQR) = 1.5(25.3 - 12.425) = 19.3125
Q1 - 1.5(IQR) = -6.8875
Q3 + 1.5(IQR) = 44.6125
The lower edge of the rectangle is at the lower quartile, 12.425.
The upper edge of the rectangle is at the upper quartile, 25.3.
The lower whisker can go as low as -6.8875, but the smallest observation is 0, so it stops
at 0.
The upper whisker can go as high as 44.6125, but the largest observation below this cutoff
is 40.1, so it stops at 40.1.
The one remaining observation (47.5) is shown with a star; it is an
outlier.
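The cutoff arithmetic can be checked with a short sketch. It uses only the quartiles and the three observations named in the answer (0, 40.1, and 47.5); the function name is illustrative:

```python
q1, q3 = 12.425, 25.3            # quartiles from the descriptive statistics

iqr = q3 - q1                    # interquartile range
step = 1.5 * iqr                 # maximum whisker reach beyond each quartile
lower_fence = q1 - step          # lower cutoff
upper_fence = q3 + step          # upper cutoff

print("1.5(IQR)    =", round(step, 4))
print("lower fence =", round(lower_fence, 4))
print("upper fence =", round(upper_fence, 4))


def is_outlier(x):
    """An observation is an outlier if it lies outside the fences."""
    return x < lower_fence or x > upper_fence


# Observations mentioned in the answer: minimum, upper whisker end, maximum.
for value in (0.0, 40.1, 47.5):
    print(value, "outlier" if is_outlier(value) else "inside the fences")
```

Each whisker ends at the most extreme observation that is still inside its fence, which is why finding the whisker ends requires looking at the data and not only at the five-number summary.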
[Figure: boxplot of %bodyfat; vertical axis from 0 to 50]