STATISTICS 402 - Assignment 5 Solution

1. An experiment will be performed on individuals with elevated serum cholesterol. We want to see if the type of drug (Lipitor® or Zocor®) has an effect on the change in serum cholesterol (serum cholesterol level before treatment – serum cholesterol level after three months of treatment). A second factor, the amount of drug (0 mg/day, 10 mg/day and 20 mg/day), will also be investigated in a two factor completely randomized experiment.

a) What is the response?
Change in serum cholesterol (serum cholesterol level before treatment – serum cholesterol level after three months of treatment).

b) What are the experimental units?
The experimental units are individuals with elevated serum cholesterol.

c) The experimenter will use factorial crossing to create the treatment combinations. How many treatment combinations will there be in the experiment? List all the treatment combinations.
There will be 2 x 3 = 6 treatments: 0 mg Lipitor®, 0 mg Zocor®, 10 mg Lipitor®, 10 mg Zocor®, 20 mg Lipitor®, 20 mg Zocor®.

d) The experimenter would like to be able to detect a difference in treatment population means of 1.4 standard deviations with Alpha = 0.05 and Beta = 0.10. How many experimental units will the experimenter need?
The experimenter will need 18 experimental units for each of the 6 treatments, for a total of 108 individuals with elevated serum cholesterol.

e) With this number of units, what size difference in factor level population means can be detected when Alpha=0.05 and Beta=0.10?
A difference of between 0.6 and 0.7 standard deviations for drug means.
A difference of between 0.8 and 0.9 standard deviations for amount means.

f) Because of budget constraints, only 6 units are available for each treatment combination. How does this choice affect the size of the detectable difference in treatment population means? In factor level population means? Use Alpha=0.10 and Beta=0.10.
With 6 units for each treatment, one can detect a 2.5 standard deviation difference in treatment population means, a 1 standard deviation difference in drug population means and a 1.4 standard deviation difference in amount population means.

g) Explain how you would randomly assign treatments to the experimental units. Include a table that indicates the random assignment of treatments for this experiment to experimental units. Remember the budget constraints in f).
Number the 36 individuals with elevated serum cholesterol with a unique number between 1 and 36. Use JMP to do the random assignment. Enter Lipitor® in 18 rows and Zocor® in 18 rows of a column named Drug. In a second column named Amount have six rows with 0 mg, six rows with 10 mg and six rows with 20 mg for each drug. In a third column labeled Individual use Column – Formula – Random – Col Shuffle. For each row, the Drug and Amount are assigned to the individual whose number is in the Individual column. The resulting assignment, grouped by treatment combination, is:

Drug      Amount  Individual
Lipitor®  0 mg    10, 2, 15, 31, 33, 8
Lipitor®  10 mg   6, 23, 13, 26, 24, 18
Lipitor®  20 mg   32, 17, 14, 19, 36, 5
Zocor®    0 mg    22, 27, 30, 34, 21, 4
Zocor®    10 mg   9, 1, 25, 3, 7, 11
Zocor®    20 mg   12, 29, 28, 35, 16, 20

2. An experiment is performed by researchers in kinesiology on the effects of step height and stepping frequency on the change in heart rate. Forty-eight individuals are recruited from students majoring in kinesiology. They all give their consent to participate in the experiment. Each individual is asked to step up onto a platform and step down repeatedly.
The height of the platform will be either 15 cm or 30 cm. The rate at which a participant steps will be 14, 21, 28 or 35 steps per minute. A combination of platform height and stepping frequency will be assigned to each participant completely at random so that 6 participants are assigned each combination. Participants step up and down off the platform for 5 minutes. The change in heart rate (heart rate after stepping – heart rate before stepping) is calculated for each participant.

a) What is the response?
The response is the change in heart rate (heart rate after stepping – heart rate before stepping).

b) What are the conditions?
The conditions are step height (15 cm and 30 cm) and stepping frequency (14, 21, 28 and 35 steps per minute).

c) How many treatments are there?
There are 2 x 4 = 8 treatments.

d) What are the experimental units?
The experimental units are students majoring in kinesiology.

e) What is an outside variable that is controlled in this experiment?
The amount of time spent stepping is the same. All students step for 5 minutes.

f) What is an outside variable that is not controlled in this experiment?
The fitness level of each of the students is not controlled. Some may be very fit and some may not be as fit.

g) Describe, in some detail, what contributes to chance, random, error variation in this experiment.
In general, differences among experimental units that are treated the same contribute to chance, random, error variation. For this experiment, differences in the fitness level of students assigned a particular combination of step height and stepping frequency will contribute to chance, random, error variation.

h) Give the full model describing the relationship between the response, conditions and random error. Be sure to define all the terms in the model within the context of the problem.
$Y = \mu + \tau_{ij} + \varepsilon$
$Y = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon$
$Y$ is the change in heart rate.
$\mu$ is the overall population mean.
$\tau_{ij}$ is the treatment effect for step height, i, and stepping frequency, j.
$\alpha_i$ is the step height effect of step height, i.
$\beta_j$ is the stepping frequency effect of stepping frequency, j.
$(\alpha\beta)_{ij}$ is the interaction effect for step height, i, and stepping frequency, j.
$\varepsilon$ is the random error.

A JMP data table is posted on the course website. Use JMP to analyze the data.

i) Estimate the treatment effects.
The overall mean change in heart rate is 34.6875. The means for the 8 treatments are given in the following table.

              Frequency=14  Frequency=21  Frequency=28  Frequency=35
Height=15 cm         12.00         19.50         28.83         39.50
Height=30 cm         19.50         37.00         49.33         71.83

The estimated treatment effects (treatment mean minus overall mean) are given in the following table.

              Frequency=14  Frequency=21  Frequency=28  Frequency=35
Height=15 cm      -22.6875      -15.1875       -5.8542        4.8125
Height=30 cm      -15.1875        2.3125       14.6458       37.1458

j) Test the hypothesis that all the treatment effects are zero against the alternative that at least one is not zero. Be sure to give an appropriate null and alternative hypothesis. Report the value of the appropriate test statistic, P-value, decision, reason for the decision and a conclusion within the context of the problem.
$H_0$: all the $\tau_{ij} = 0$
$H_A$: some $\tau_{ij} \neq 0$
F = 30.6498, P-value < 0.0001
Because the P-value is so small (< 0.05) we reject the null hypothesis and conclude that some of the combinations of step height and stepping frequency affect the change in heart rate.

k) Estimate the step height effects.

              Estimated Effect
Height=15 cm  24.9583 - 34.6875 = -9.7292
Height=30 cm  44.4167 - 34.6875 = +9.7292

l) Test the hypothesis that the effects of step height are zero against the alternative that at least one is not zero. Be sure to give an appropriate null and alternative hypothesis using the notation from your answer to h). Report the value of the appropriate test statistic, P-value, decision, reason for the decision and a conclusion within the context of the problem.
$H_0$: all the $\alpha_i = 0$
$H_A$: some $\alpha_i \neq 0$
F = 61.7012, P-value < 0.0001
Because the P-value is so small (< 0.05) we reject the null hypothesis and conclude that the two step heights affect the change in heart rate differently.

m) Where are the statistically significant differences in step height sample means? Support your answer statistically.
Because there are only two step heights, 15 cm has a statistically different mean change in heart rate than 30 cm. The F test in l) supports this conclusion.

n) Estimate the stepping frequency effects.

              Estimated Effect
Frequency=14  15.7500 - 34.6875 = -18.9375
Frequency=21  28.2500 - 34.6875 =  -6.4375
Frequency=28  39.0833 - 34.6875 =  +4.3958
Frequency=35  55.6667 - 34.6875 = +20.9792

o) Test the hypothesis that all the stepping frequency effects are zero against the alternative that at least one is not zero. Be sure to give an appropriate null and alternative hypothesis using the notation from your answer to h). Report the value of the appropriate test statistic, P-value, decision, reason for the decision and a conclusion within the context of the problem.
$H_0$: all the $\beta_j = 0$
$H_A$: some $\beta_j \neq 0$
F = 46.6892, P-value < 0.0001
Because the P-value is so small (< 0.05) we reject the null hypothesis and conclude that some of the stepping frequencies affect the change in heart rate.

p) Where are the statistically significant differences in stepping frequency sample means? Support your answer statistically.
HSD = 2.68042(3.50327) = 9.39, so any differences in stepping frequency sample means greater than 9.39 are deemed statistically significant.

Level  Letter  Mean
35     A       55.6667
28     B       39.0833
21     C       28.2500
14     D       15.7500

Levels not connected with the same letter are significantly different. Therefore each stepping frequency has a sample mean change in heart rate that is different from all of the other stepping frequencies.

q) Are there any interaction effects that are different from zero?
Be sure to give an appropriate null and alternative hypothesis using the notation from your answer to h). Report the value of the appropriate test statistic, P-value, decision, reason for the decision and a conclusion within the context of the problem.
$H_0$: all the $(\alpha\beta)_{ij} = 0$
$H_A$: some $(\alpha\beta)_{ij} \neq 0$
F = 4.2599, P-value = 0.0106
Because the P-value is small (< 0.05) we reject the null hypothesis and conclude that there is statistically significant interaction between step height and stepping frequency.

r) Construct an interaction plot. Comment on the plot and what it tells you about the interaction between the two factors. Be specific and be sure your answer deals with the context of this experiment.
As stepping frequency increases, the average change in heart rate tends to increase. As height increases, the average change in heart rate tends to increase. However, the effect of height is greater when the stepping frequency is 35 steps per minute (an increase of 32.33 from 15 cm to 30 cm) than when it is 14 steps per minute (an increase of 7.50). This is an indication of interaction. For 21 steps per minute and 28 steps per minute the effect of height is about the same (increases of 17.50 and 20.50).

s) Based on the analysis of these data, what combination of step height and stepping frequency would you recommend if someone wanted to increase their heart rate the most?
Using the 35 steps per minute frequency and 30 cm step height produced a mean increase in heart rate of 71.83 beats per minute. The HSD for comparing treatment combination means is 3.19651(4.95438) = 15.84. Because the mean for 35 steps per minute frequency and 30 cm step height is more than 15.84 larger than any of the other treatment combination means, the differences between this treatment combination and all others are statistically significant. Therefore I would recommend 35 steps per minute frequency and a 30 cm height in order for kinesiology majors similar to those who participated in the experiment to increase their heart rate the most after 5 minutes of stepping.
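
The comparisons used in r) and s) can be checked by hand from the eight treatment means. The following is a minimal Python sketch of that arithmetic, not part of the JMP analysis: the variable names are illustrative assumptions, and the means and the HSD factors are copied from the solution.

```python
# Treatment means (change in heart rate), keyed by (frequency, height).
# Values are copied from the table of treatment means in part i).
cell_means = {
    (14, 15): 12.00, (21, 15): 19.50, (28, 15): 28.83, (35, 15): 39.50,
    (14, 30): 19.50, (21, 30): 37.00, (28, 30): 49.33, (35, 30): 71.83,
}

# Tukey HSD for treatment combination means, as reported in part s).
hsd = 3.19651 * 4.95438

# Effect of height (30 cm mean minus 15 cm mean) at each stepping frequency.
# Unequal effects across frequencies are what the interaction plot shows.
height_effect = {
    freq: round(cell_means[(freq, 30)] - cell_means[(freq, 15)], 2)
    for freq in (14, 21, 28, 35)
}

# Treatment combination with the largest mean, and its gap to every other mean.
best = max(cell_means, key=cell_means.get)
gaps = {
    combo: round(cell_means[best] - mean, 2)
    for combo, mean in cell_means.items()
    if combo != best
}

# The recommendation in s) relies on every gap exceeding the HSD.
all_significant = all(gap > hsd for gap in gaps.values())

print("Height effect by frequency:", height_effect)
print("Best combination (frequency, height):", best)
print("HSD:", round(hsd, 2), "| all gaps exceed HSD:", all_significant)
```

The smallest gap is the one to the 28 steps per minute, 30 cm combination, so that single comparison decides whether the recommendation is supported.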
t) Construct plots of residuals versus the levels of the two factors. Describe the plots and indicate what this tells you about the Fisher conditions necessary for the analysis of variance.

Residuals by step height:
Level  n   Mean  Std Dev
15     24  0     5.94845
30     24  0     9.62711

Residuals by stepping frequency:
Level  n   Mean  Std Dev
14     12  0      4.8006
21     12  0      9.7211
28     12  0     10.0280
35     12  0      7.0475

The amount of variation within each step height is not too different. The ratio of the largest standard deviation to the smallest is less than 2. The amount of variation for each stepping frequency differs. The ratio of the largest standard deviation to the smallest is greater than 2. The equal standard deviation condition may not be met.

u) Construct plots of the distribution of residuals. Describe each of the plots in the distribution of residuals. Indicate what this tells you about the Fisher conditions necessary for the analysis of variance.

[Figure: distribution of Residual]

The histogram looks fairly symmetric (there may be a slight skew to the right). The box plot looks fairly symmetric (there may be a slight skew to the right, with the mean slightly bigger than the median). The points on the normal quantile plot follow the diagonal (normal model) line fairly well. It appears that the errors could be normally distributed.

Turn in the JMP output that you have used to answer the questions.

Response: Change

Summary of Fit
RSquare                     0.842859
RSquare Adj                 0.815359
Root Mean Square Error      8.58123
Mean of Response            34.6875
Observations (or Sum Wgts)  48

Analysis of Variance
Source    DF  Sum of Squares  Mean Square  F Ratio  Prob > F
Model      7       15798.813      2256.97  30.6498  <.0001*
Error     40        2945.500        73.64
C. Total  47       18744.313

Effect Tests
Source            DF  Sum of Squares  F Ratio  Prob > F
Frequency          3       10314.229  46.6892  <.0001*
Height             1        4543.521  61.7012  <.0001*
Frequency*Height   3         941.063   4.2599  0.0106*

Effect Details

Frequency
Least Squares Means Table
Level  Least Sq Mean  Std Error  Mean
14         15.750000  2.4771876  15.7500
21         28.250000  2.4771876  28.2500
28         39.083333  2.4771876  39.0833
35         55.666667  2.4771876  55.6667

LSMeans Differences Tukey HSD
α=0.050  Q=2.68042
HSD = 2.68042(3.50327) = 9.39

LSMean[i]  LSMean[j]  Mean[i]-Mean[j]  Std Err Dif  Lower CL Dif  Upper CL Dif
14         14                0             0             0             0
21         14               12.5           3.50327       3.10976      21.8902
28         14               23.3333        3.50327      13.9431       32.7236
35         14               39.9167        3.50327      30.5264       49.3069
14         21              -12.5           3.50327     -21.89         -3.1098
21         21                0             0             0             0
28         21               10.8333        3.50327       1.44309      20.2236
35         21               27.4167        3.50327      18.0264       36.8069
14         28              -23.333         3.50327     -32.724       -13.943
21         28              -10.833         3.50327     -20.224        -1.4431
28         28                0             0             0             0
35         28               16.5833        3.50327       7.19309      25.9736
14         35              -39.917         3.50327     -49.307       -30.526
21         35              -27.417         3.50327     -36.807       -18.026
28         35              -16.583         3.50327     -25.974        -7.1931
35         35                0             0             0             0

Level  Letter  Least Sq Mean
35     A           55.666667
28     B           39.083333
21     C           28.250000
14     D           15.750000
Levels not connected by same letter are significantly different.

Height
Least Squares Means Table
Level  Least Sq Mean  Std Error  Mean
15         24.958333  1.7516361  24.9583
30         44.416667  1.7516361  44.4167

Frequency*Height
Least Squares Means Table
Level  Least Sq Mean  Std Error
14,15      12.000000  3.5032723
14,30      19.500000  3.5032723
21,15      19.500000  3.5032723
21,30      37.000000  3.5032723
28,15      28.833333  3.5032723
28,30      49.333333  3.5032723
35,15      39.500000  3.5032723
35,30      71.833333  3.5032723

[Figure: LS Means Plot]

Level  Letters  Least Sq Mean
35,30  A            71.833333
28,30  B            49.333333
35,15  B C          39.500000
21,30  B C          37.000000
28,15  C D          28.833333
21,15  D E          19.500000
14,30  D E          19.500000
14,15  E            12.000000
Levels not connected by same letter are significantly different.
LSMeans Differences Tukey HSD
α=0.050  Q=3.19651
HSD = 3.19651(4.95438) = 15.84

LSMean[i]  LSMean[j]  Mean[i]-Mean[j]  Std Err Dif  Lower CL Dif  Upper CL Dif
14,15      14,15             0             0             0             0
14,30      14,15             7.5           4.95438      -8.3367       23.3367
21,15      14,15             7.5           4.95438      -8.3367       23.3367
21,30      14,15            25             4.95438       9.1633       40.8367
28,15      14,15            16.8333        4.95438       0.99663      32.67
28,30      14,15            37.3333        4.95438      21.4966       53.17
35,15      14,15            27.5           4.95438      11.6633       43.3367
35,30      14,15            59.8333        4.95438      43.9966       75.67
14,15      14,30            -7.5           4.95438     -23.337         8.3367
14,30      14,30             0             0             0             0
21,15      14,30             3.6e-15       4.95438     -15.837        15.8367
21,30      14,30            17.5           4.95438       1.6633       33.3367
28,15      14,30             9.33333       4.95438      -6.5034       25.17
28,30      14,30            29.8333        4.95438      13.9966       45.67
35,15      14,30            20             4.95438       4.1633       35.8367
35,30      14,30            52.3333        4.95438      36.4966       68.17
14,15      21,15            -7.5           4.95438     -23.337         8.3367
14,30      21,15            -4e-15         4.95438     -15.837        15.8367
21,15      21,15             0             0             0             0
21,30      21,15            17.5           4.95438       1.6633       33.3367
28,15      21,15             9.33333       4.95438      -6.5034       25.17
28,30      21,15            29.8333        4.95438      13.9966       45.67
35,15      21,15            20             4.95438       4.1633       35.8367
35,30      21,15            52.3333        4.95438      36.4966       68.17
14,15      21,30           -25             4.95438     -40.837        -9.1633
14,30      21,30           -17.5           4.95438     -33.337        -1.6633
21,15      21,30           -17.5           4.95438     -33.337        -1.6633
21,30      21,30             0             0             0             0
28,15      21,30            -8.1667        4.95438     -24.003         7.67004
28,30      21,30            12.3333        4.95438      -3.5034       28.17
35,15      21,30             2.5           4.95438     -13.337        18.3367
35,30      21,30            34.8333        4.95438      18.9966       50.67
14,15      28,15           -16.833         4.95438     -32.67         -0.9966
14,30      28,15            -9.3333        4.95438     -25.17          6.50337
21,15      28,15            -9.3333        4.95438     -25.17          6.50337
21,30      28,15             8.16667       4.95438      -7.67         24.0034
28,15      28,15             0             0             0             0
28,30      28,15            20.5           4.95438       4.6633       36.3367
35,15      28,15            10.6667        4.95438      -5.17         26.5034
35,30      28,15            43             4.95438      27.1633       58.8367
14,15      28,30           -37.333         4.95438     -53.17        -21.497
14,30      28,30           -29.833         4.95438     -45.67        -13.997
21,15      28,30           -29.833         4.95438     -45.67        -13.997
21,30      28,30           -12.333         4.95438     -28.17          3.50337
28,15      28,30           -20.5           4.95438     -36.337        -4.6633
28,30      28,30             0             0             0             0
35,15      28,30            -9.8333        4.95438     -25.67          6.00337
35,30      28,30            22.5           4.95438       6.6633       38.3367
14,15      35,15           -27.5           4.95438     -43.337       -11.663
14,30      35,15           -20             4.95438     -35.837        -4.1633
21,15      35,15           -20             4.95438     -35.837        -4.1633
21,30      35,15            -2.5           4.95438     -18.337        13.3367
28,15      35,15           -10.667         4.95438     -26.503         5.17004
28,30      35,15             9.83333       4.95438      -6.0034       25.67
35,15      35,15             0             0             0             0
35,30      35,15            32.3333        4.95438      16.4966       48.17
14,15      35,30           -59.833         4.95438     -75.67        -43.997
14,30      35,30           -52.333         4.95438     -68.17        -36.497
21,15      35,30           -52.333         4.95438     -68.17        -36.497
21,30      35,30           -34.833         4.95438     -50.67        -18.997
28,15      35,30           -43             4.95438     -58.837       -27.163
28,30      35,30           -22.5           4.95438     -38.337        -6.6633
35,15      35,30           -32.333         4.95438     -48.17        -16.497
35,30      35,30             0             0             0             0
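
As a cross-check on the output, the F ratios, the standard errors of the differences and the two HSD values can be recomputed from the sums of squares and degrees of freedom reported by JMP. The following is a minimal Python sketch, assuming a balanced design (6 observations per treatment combination, so 12 per stepping frequency). The variable and function names are illustrative, the Q values are taken from the output rather than computed, and this is not a description of JMP's own computation.

```python
import math

# Sums of squares and degrees of freedom copied from the JMP tables.
ss = {"Model": 15798.813, "Frequency": 10314.229,
      "Height": 4543.521, "Frequency*Height": 941.063}
df = {"Model": 7, "Frequency": 3, "Height": 1, "Frequency*Height": 3}
ss_error, df_error = 2945.500, 40

# Mean square error and Root Mean Square Error.
mse = ss_error / df_error
rmse = math.sqrt(mse)

# F ratio for each source: (sum of squares / DF) / mean square error.
f_ratio = {source: (ss[source] / df[source]) / mse for source in ss}


def se_diff(n):
    """Standard error of the difference between two means of n observations each."""
    return math.sqrt(2 * mse / n)


se_freq = se_diff(12)  # stepping frequency means: 12 observations each
se_cell = se_diff(6)   # treatment combination means: 6 observations each

# HSD = Q * (standard error of the difference); Q values are from the JMP output.
hsd_freq = 2.68042 * se_freq
hsd_cell = 3.19651 * se_cell

for source, f in f_ratio.items():
    print(f"{source:18s} F = {f:.4f}")
print(f"RMSE = {rmse:.5f}")
print(f"Std Err Dif: frequency = {se_freq:.5f}, treatments = {se_cell:.5f}")
print(f"HSD: frequency = {hsd_freq:.2f}, treatments = {hsd_cell:.2f}")
```

The recomputed values agree with the reported F ratios, Root Mean Square Error, Std Err Dif and HSD figures to the precision shown in the output.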