“An Aggie does not lie, cheat, or steal or tolerate those who do.”

Answer Key for Homework 1

You will see the raw data (Excel format) and the description file (Word format) for the body fat data on the course webpage under the datasets. Use this dataset and its descriptions to answer the following questions. This is a big dataset. I suggest you use a spreadsheet such as Excel to reduce both the time spent and the numerical mistakes in your computations.

1. (12 pt.) Carefully determine the skewness of each boxplot.

[Figure: side-by-side boxplots of age, weight and height; vertical axis from 0 to 400]

(a) Is age negatively skewed, positively skewed or symmetric?
    Positively skewed
(b) Is weight negatively skewed, positively skewed or symmetric?
    Positively skewed
(c) Is height negatively skewed, positively skewed or symmetric?
    Negatively skewed
(d) Are there any outliers in the age data?
    No
(e) Look at the outlier(s) in the weight data. Do you see anything interesting in their characteristics?
    Middle age (45, 46), reasonably tall
(f) Look at the outlier(s) in the height data. Do you see anything interesting in their characteristics?
    Middle age (44), overweight

2. (12 pt. each)
(a) What percentage of these men are older than 50 years?
    76/252 = 30.16%
(b) What percentage of these men are younger than 50 years?
    169/252 = 67.06%
(c) What percentage of these men are younger than 25 years and weigh less than 160 pounds?
    4/252 = 1.59%
(d) What percentage of these men are older than 70 years and taller than 60 inches?
    7/252 = 2.78%

3. (6 pt.) The following is the stem and leaf display for the density variable.
Stem-and-Leaf Display: density

Stem-and-leaf of density   n = 252
Leaf Unit = 0.0010

    1    99  5
    1   100
    4   101  048
   23   102  0001355555666778889
   56   103  001111222334445557777777888889999
  100   104  00011111112222333333555666677777888888889999
 (48)   105  000000111122222222233444444556666677777888889999
  104   106  000011112222334444444555666667777778899
   65   107  000000011122222334444555566777778899999
   26   108  00111123444455677
    9   109  00011289
    1   110  8

(a) Is this unimodal data? (Yes/No)
    Yes
(b) Is this data negatively skewed, positively skewed or symmetric?
    Negatively skewed

4. (24 pt.) Construct a frequency distribution for the age variable.

Age                            Frequency   Relative frequency   Cumulative relative frequency
At least 20 and less than 30   36          36/252 = 0.1429      36/252 = 0.1429
At least 30 and less than 40   39          39/252 = 0.1548      75/252 = 0.2976
At least 40 and less than 50   94          94/252 = 0.3730      169/252 = 0.6706
At least 50 and less than 60   47          47/252 = 0.1865      216/252 = 0.8571
At least 60 and less than 70   27          27/252 = 0.1071      243/252 = 0.9643
At least 70 and less than 80   8           8/252 = 0.0317       251/252 = 0.9960
At least 80 and less than 90   1           1/252 = 0.0040       1

5. (6 pt.) If you were constructing a histogram for the age variable with the same class intervals as in the previous question, would there be any gaps in the histogram? According to the histogram, is the data positively skewed, negatively skewed or symmetric?
    No gaps
    Positively skewed

6. (18 pt.) Calculate the mean, median, lower quartile, upper quartile, minimum and maximum for the age variable.

                     MINITAB   By counting   Excel
Mean               = 44.885
Median             = 43.000
Lower Quartile     = 35.250    35.5          35.75
Upper Quartile     = 54.000
Minimum            = 22.000
Maximum            = 81.000

7. (12 pt.) Calculate the range, interquartile range, variance and standard deviation for the age variable.

                        MINITAB   By counting   Excel
Range                 = 59
Interquartile Range   = 18.75     18.5          18.25
Variance              = 158.811
Standard Deviation    = 12.602

8. (10 pt.)
The following descriptive statistics give you the five-number summary for the bodyfat percentage variable. Construct a boxplot for this variable using these descriptive statistics. Make sure to check the data for the cutoffs and the outliers.

Variable    N     Mean     Median   TrMean   StDev   SE Mean
%bodyfat    252   19.151   19.200   19.074   8.369   0.527

Variable    Minimum   Maximum   Q1       Q3
%bodyfat    0.000     47.500    12.425   25.300

1.5(IQR) = 1.5(25.3 - 12.425) = 19.3125
Q1 - 1.5(IQR) = -6.8875
Q3 + 1.5(IQR) = 44.6125

The lower edge of the rectangle is at the lower quartile, 12.425.
The upper edge of the rectangle is at the upper quartile, 25.3.
The lower whisker can go as low as -6.8875, but the smallest observation is 0, so it stops at 0.
The upper whisker can go as high as 44.6125, but the largest observation below this cutoff is 40.1, so it stops at 40.1.
The one remaining observation (47.5) is shown with a star; it is an outlier.

[Figure: boxplot of %bodyfat; vertical axis from 0 to 50]
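
The cutoff arithmetic for the boxplot can be double-checked with a short script. The following is a minimal sketch in Python, not the Minitab procedure used to produce the output above. The function names boxplot_fences and whisker_ends are hypothetical, and only the three extreme observations named in the answer (0, 40.1 and 47.5) stand in for the full dataset, which is not reproduced here.

```python
def boxplot_fences(q1, q3, k=1.5):
    """Return (iqr, lower_fence, upper_fence) for the k*IQR cutoff rule."""
    iqr = q3 - q1
    step = k * iqr
    return iqr, q1 - step, q3 + step


def whisker_ends(values, lower_fence, upper_fence):
    """Whiskers stop at the most extreme observations inside the fences.

    Returns (lowest value inside, highest value inside, list of outliers).
    """
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return min(inside), max(inside), outliers


# Quartiles quoted in the descriptive statistics for %bodyfat.
q1, q3 = 12.425, 25.3
iqr, lower_fence, upper_fence = boxplot_fences(q1, q3)

# Assumption: only the extreme observations mentioned in the answer are used;
# the other 249 values lie between them and do not change the result.
extremes = [0.0, 40.1, 47.5]
low_whisker, high_whisker, outliers = whisker_ends(
    extremes, lower_fence, upper_fence
)

print(f"1.5(IQR) = {1.5 * iqr:.4f}")
print(f"cutoffs: {lower_fence:.4f} and {upper_fence:.4f}")
print(f"whiskers stop at {low_whisker} and {high_whisker}; outliers: {outliers}")
```

With the full data column in place of the extremes list, the same two functions would give the whisker ends and outliers directly, provided the quartiles are computed with the same convention as Minitab (the three lower-quartile values in question 6 show that conventions differ).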