Assignment 2, Inferential Statistics I

Student Name:
Grade:

[Please download a copy of this document and put all your answers in this document.]

[You may study with other students on this assignment, but you must write the answers yourself. If your paper is the same as another student's, both of you will receive a low grade.]

When a question asks for software output, place the output right after your answer to that part of the question.

Part I

Please use the sample data from students (links to two versions of the data are listed below) to perform the following tasks. Delete the extreme values, i.e., the case with height = 6 and the case with weight = 21, before your analysis. The goal is to understand the student population based on the students in the sample.

Link to Data: http://people.ysu.edu/~gchang/stat/Classdata2012.xls
Link to Data: http://people.ysu.edu/~gchang/stat/Classdata2012.csv

a) Test whether the average Body Mass Index (BMI) of the student population from which the sample was drawn is less than 26, at the 5% level of significance, using the sample data above. You must state the null and alternative hypotheses, check the normality assumption, report the p-value and the test statistic value, and draw a proper conclusion.

Null hypothesis:
Alternative hypothesis:
Report p-value from the normality test and conclusion:
Report the value of the t-test statistic =
Report p-value from the t-test and the conclusion:

b) Find the 95% confidence interval for estimating the average BMI of the sampled population.

c) Find the sample size needed to have 90% power to detect an average BMI that is 0.5 units lower than 26 (i.e., μa = 25.5), at the 5% level of significance, using the estimated standard deviation from the sample. (Use the sample size calculation formula in the lecture notes.)

Part II

A group of investigators is studying a treatment that can help reduce LDL cholesterol levels.
The following data show the LDL level at the beginning and at the end of the observation period for a sample of participants randomly selected from a specific patient population who received the treatment.

Data:

Subject ID   LDL at the beginning   LDL at the end
1            186                    142
2            144                    113
3            154                    101
4            174                    122
5            165                    129
6            172                    136
7            158                    139

a) Perform a t-test to test whether the average reduction in LDL (LDL at the beginning minus LDL at the end) is greater than 30, at the 5% level of significance. Please include output tables or charts from statistical software that are useful in supporting and interpreting your result. You must state the null and alternative hypotheses, check the normality assumption, and report the p-value and the test statistic value.

Null hypothesis:
Alternative hypothesis:
Report p-value from the normality test and conclusion:
Report the value of the t-test statistic =
Report p-value from the t-test and the conclusion:

b) Find the 95% confidence interval for estimating the average reduction in LDL from the treatment for the sampled population.

c) From past studies, the standard deviation of the reduction in LDL for this population under this treatment is around 10. Find the sample size needed to have 90% power to detect a 4-unit average reduction in LDL, at the 5% level of significance, for a one-sided t-test. (Use the sample size calculation formula in the lecture notes.)

Part III

Please use the sample data from students (links to two versions of the data are listed below) to perform the following tasks. Delete the extreme value, i.e., the case with height = 6, before your analysis. The goal is to understand the student population based on the students in the sample.
Link to Data: http://people.ysu.edu/~gchang/stat/Classdata2012.xls
Link to Data: http://people.ysu.edu/~gchang/stat/Classdata2012.csv

Perform a t-test to see whether there is a statistically significant difference in average BMI between students who exercised regularly and those who did not, at the 5% level of significance. Also, properly conclude your analysis by commenting on the findings. (The normality and equality-of-variances assumptions must be checked, and the output from the statistical software should be included to support your comments.)

Part IV

Please use the student data below to find the 95% confidence interval for estimating the percentage of students who exercised regularly in the past year, using the asymptotic method. (Please also show the software output for this estimation.)

Link to Data: http://people.ysu.edu/~gchang/stat/Classdata2012.xls
Link to Data: http://people.ysu.edu/~gchang/stat/Classdata2013.csv
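
For reference, the asymptotic method in Part IV can be sketched in Python as below. This is a minimal sketch that assumes "asymptotic method" means the standard normal-approximation (Wald) interval, p-hat +/- z * sqrt(p-hat * (1 - p-hat) / n); the lecture notes may define it differently. The function name wald_interval and the counts used in the example are hypothetical and are not taken from the class data.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Normal-approximation (Wald) confidence interval for a proportion.

    Assumption: this is the 'asymptotic method' the assignment refers to.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical counts for illustration only; NOT taken from the class data.
lower, upper = wald_interval(successes=30, n=50)
print(f"95% CI for the proportion: ({lower:.3f}, {upper:.3f})")
```

To apply this to the assignment, replace the hypothetical counts with the number of students in the data who exercised regularly and the total number of students with a recorded answer, then multiply the limits by 100 to report percentages. The software output requested in the question should still be included alongside any hand calculation.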