HSA 523 Homework #3 Answer Key
Dr. Robert Jantzen
Economics Department

1. (Generate the statistics w/ SPSS by using <Analyze><DescriptiveStats><Frequencies> and clicking on the <Statistics> tab to select the statistics you want.)

Statistics
                    average cost per         full-time staff
                    licensed bed (1000s)     per bed
N Valid             15                       15
N Missing           0                        0
Mean                238.47                   3.533
Median              238.00                   3.600
Std. Deviation      33.768                   .4030
Minimum             165                      2.9
Maximum             293                      4.3
Percentiles 25      228.00                   3.200
Percentiles 50      238.00                   3.600
Percentiles 75      264.00                   3.900

a. The mean measures the center of the #s w/ the arithmetic average.
b. The median is the # exactly in the middle of an ordered series of #s.
c. The first quartile divides the bottom 25% of the #s from the top 75%. The third quartile divides the bottom 75% from the top 25%.
d. The five-# summaries show the minimum, first quartile, median, third quartile & maximum. They also show how wide the intervals are that capture each quarter of all the #s. If the middle two intervals are equally wide and the outer two are also equally wide, the #s are symmetrically distributed.
e. The interquartile range (third quartile - first quartile) shows the interval that captures the middle 50% of the #s. (IQRs are 36 & .7)
f. The sample standard deviation shows how much the #s vary from the mean.
g. The coefficient of variation shows how large the standard deviation is, as a % of the mean. (CVs are .142 = 14.2% & .114 = 11.4%)
h. The Pearson skewness coefficient shows whether the # series is skewed (an absolute value > .1 indicates skewed #s). (Pearsons are .014 & -.166, so average cost is essentially symmetric while staff per bed is mildly skewed to the low numbers.)

2. The box plot shows that although chicken dogs generally have fewer calories, the top 25% of chicken dogs have more calories than the bottom 25% of either meat or beef dogs.

[Box plot: calories per hot dog (axis from 80 to 200) by type of hot dog (beef, meat, chicken)]

3. (Generate the boxplots & histograms by using <Graphs><Interactive> & then the type of graph you want to generate.)
Both chart types indicate that the number of beds and the average cost of care are skewed to the high numbers.

[Histograms (percent) and boxplots of avgcost - avg cost/day routine care (ahd) and bedsahd - total beds (ahd)]

(Generate the statistics w/ SPSS by using <Analyze><DescriptiveStats><Frequencies> and clicking on the <Statistics> tab to select the statistics you want. Also turn off <Display frequency tables>, otherwise you’ll generate a lot of “output.”)

Statistics
                    bedsahd -                avgcost - avg cost/day
                    total beds (ahd)         routine care (ahd)
N Valid             717                      717
N Missing           0                        0
Mean                199.1339                 454.2999
Median              154.0000                 423.0000
Std. Deviation      165.20221                191.22003
Minimum             12.00                    166.00
Maximum             1068.00                  3006.00
Percentiles 25      78.5000                  357.0000
Percentiles 50      154.0000                 423.0000
Percentiles 75      276.0000                 497.5000

a. The five-# summaries can be found above.
b. The means & medians are above. Since the means are bigger than the medians, the #s could be skewed to the high numbers.
c. The Pearson skewness coefficients are .27 for bed size & .164 for average cost ((mean - median) divided by std. dev.). Both exceed .1, which indicates skew to the high numbers.
d. The IQRs are 197.5 & 140.5 (3rd quartile - 1st quartile). The IQR shows the interval that contains the middle 50% of the #s.
e. The std. deviations are above. Use if the #s are not skewed.
f. The coefficients of variation are .83 or 83% for beds and .421 or 42.1% for average cost (CV = std. dev./mean). Use if the #s are not skewed.

4. If the mean cholesterol count is 280 w/ a standard deviation of 25:
a. For any kind of # distribution we can use Chebyshev’s Rule, which says that at least 75% of the #s are within 2 SDs of the mean (between 230 and 330) and that at least 89% are within 3 SDs of the mean (between 205 and 355).
b.
for a symmetrical bell-shaped distribution we can use the Empirical Rule, which says that about 68% of the #s are within 1 SD of the mean (between 255 and 305), about 95% are within 2 SDs (between 230 and 330) and nearly all (99.7%) are within 3 SDs of the mean (between 205 and 355).

5. Given mean CEO pay of 300K w/ a SD of 40K and mean LPN pay of 40K w/ a SD of 2K:
a. A CEO earning 260K has a z score of -1 [=(260-300)/40] while an LPN earning 36K has a z score of -2 [=(36-40)/2]. The LPN is more underpaid because (s)he is 2 standard deviations below average while the CEO is only 1 SD below average.
b. In order to use the mean and SD in these calculations, we must assume that the earnings distributions for CEOs and LPNs are both bell-shaped and symmetrical.
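
The arithmetic behind several of these answers (IQR, CV, Pearson skewness, the SD intervals in question 4 and the z scores in question 5) can be rechecked with a few lines of Python. This is only a sketch: the function names are hypothetical and not part of the assignment, which uses SPSS output and hand calculation, and the Pearson coefficient is taken in the (mean - median)/SD form used in this key.

```python
# Hypothetical helper functions for rechecking the arithmetic in this key.
# The inputs are the summary statistics reported above, not raw data.

def iqr(q1, q3):
    """Interquartile range: third quartile minus first quartile."""
    return q3 - q1

def coefficient_of_variation(sd, mean):
    """Standard deviation as a fraction of the mean."""
    return sd / mean

def pearson_skewness(mean, median, sd):
    """(mean - median) / SD, the form used in this answer key."""
    return (mean - median) / sd

def sd_interval(mean, sd, k):
    """Interval covering k standard deviations on either side of the mean."""
    return (mean - k * sd, mean + k * sd)

def chebyshev_lower_bound(k):
    """Chebyshev's Rule: at least 1 - 1/k^2 of the #s lie within k SDs (k > 1)."""
    return 1 - 1 / k**2

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

if __name__ == "__main__":
    # Question 1: average cost per licensed bed (1000s)
    print("IQR:", iqr(228.00, 264.00))
    print("CV:", round(coefficient_of_variation(33.768, 238.47), 3))
    print("Pearson:", round(pearson_skewness(238.47, 238.00, 33.768), 3))

    # Question 4: cholesterol, mean 280 and SD 25
    for k in (2, 3):
        print(k, "SDs:", sd_interval(280, 25, k),
              "Chebyshev at least", round(chebyshev_lower_bound(k), 2))

    # Question 5: CEO earning 260K vs. LPN earning 36K
    print("CEO z:", z_score(260, 300, 40))
    print("LPN z:", z_score(36, 40, 2))
```

The same functions can be applied to the question 3 statistics (bedsahd and avgcost) by substituting the values from that table.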