Assignment 2

Assignment 2, Inferential Statistics I
Student Name:
Grade:
[Please download a copy of this document and put all your answers in this document.]
[You may study with other students on this assignment, but you must write the answers
yourself. If your assignment paper is the same as another student's, both of your grades
will be low.] When a question asks for software output, please place it right after
your answer for each part of the question.
Part I
Please use the sample data from students (links to two versions of the data are listed
below) to perform the following tasks. Delete the extreme values, i.e., the case with
height = 6 and the case with weight = 21, for your analysis. The goal is to understand
the student population based on the students in the sample.
Link to Data: http://people.ysu.edu/~gchang/stat/Classdata2012.xls
Link to Data: http://people.ysu.edu/~gchang/stat/Classdata2012.csv
a) Test whether the average Body Mass Index (BMI) for the student population from
which the student sample was drawn is less than 26, at the 5% level of significance,
using the sample data above. You must state the null and alternative hypotheses,
check the normality assumption, report the p-value and the test statistic value, and
draw a proper conclusion.
Null hypothesis:
Alternative hypothesis:
Report p-value from the normality test and conclusion:
Report the value of the t-test statistic =
Report p-value from the t-test and the conclusion:
b) Find the 95% confidence interval for estimating the average BMI for the sampled
population.
c) Find the sample size so that one can have 90% power to detect a BMI average
that is 0.5 unit lower than 26 (i.e., μa = 25.5), at the 5% level of significance,
using the estimated standard deviation from the sample. (Use the sample size
calculation formula in the lecture notes.)
Part II
A group of investigators is studying a treatment that can help reduce LDL cholesterol
levels. The following data show the LDL at the beginning and the end of the observation
period for a sample of participants randomly selected from a specific patient population
who received the treatment.
Data:
Subject ID    LDL at the beginning    LDL at the end
1             186                     142
2             144                     113
3             154                     101
4             174                     122
5             165                     129
6             172                     136
7             158                     139
a) Perform a t-test to test whether the average reduction in LDL (use LDL at the
beginning minus LDL at the end) is greater than 30, at the 5% level of significance.
Please include output tables or charts from statistical software that are useful in
supporting and interpreting your result. You must state the null and alternative
hypotheses, check the normality assumption, present the p-value, and report the test
statistic value.
Null hypothesis:
Alternative hypothesis:
Report p-value from the normality test and conclusion:
Report the value of the t-test statistic =
Report p-value from the t-test and the conclusion:
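As a rough sketch of the arithmetic behind the test in part a), the Python below takes the differences (beginning minus end) from the Part II data and computes a one-sample t statistic against the hypothesized reduction of 30. The function name `paired_t_statistic` is illustrative and not part of the assignment. It uses only the standard library, so the normality check and the p-value (from a t distribution with n - 1 degrees of freedom) still have to come from your statistical software.

```python
from math import sqrt
from statistics import mean, stdev

# LDL values copied from the Part II data table.
ldl_begin = [186, 144, 154, 174, 165, 172, 158]
ldl_end = [142, 113, 101, 122, 129, 136, 139]


def paired_t_statistic(before, after, mu0):
    """Return (mean difference, t statistic, degrees of freedom).

    Differences are before - after; mu0 is the hypothesized mean difference.
    The name and signature are hypothetical, chosen for this sketch.
    """
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = mean(diffs)
    se = stdev(diffs) / sqrt(n)  # standard error using the sample SD (n - 1)
    return d_bar, (d_bar - mu0) / se, n - 1


d_bar, t_stat, df = paired_t_statistic(ldl_begin, ldl_end, mu0=30)
print(f"mean reduction = {d_bar:.3f}, t = {t_stat:.3f}, df = {df}")
```

The one-sided p-value is the upper-tail probability of this t statistic, which is what the software output should report for the "greater than 30" alternative.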
b) Find the 95% confidence interval for estimating the average reduction in LDL from
the treatment for the sampled population.
c) From past studies, the standard deviation of the reduction in LDL for this
population from this treatment is around 10. Find the sample size so that one can
have 90% power to detect a 4-unit average reduction in LDL at the 5% level of
significance for a one-sided t-test. (Use the sample size calculation formula in the
lecture notes.)
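The lecture-note formula is not reproduced in this handout, so the sketch below assumes the common normal-approximation formula for a one-sided, one-sample test, n = ((z_alpha + z_beta) * sigma / delta)^2, rounded up. If the lecture notes use a different formula (for example, one with a t-distribution correction), follow the notes instead. The function name `sample_size_one_sided` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_one_sided(sigma, delta, alpha=0.05, power=0.90):
    """Approximate n for a one-sided, one-sample test (normal approximation).

    sigma: assumed standard deviation; delta: difference to detect.
    This is an assumed formula, not necessarily the one in the lecture notes.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)       # quantile matching the power
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


# Values from part c): SD around 10, detect a 4-unit reduction.
n = sample_size_one_sided(sigma=10, delta=4)
print(n)
```

The same function applies to Part I c) by passing the sample standard deviation of BMI as `sigma` and 0.5 as `delta`.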
Part III
Please use the sample data from students (links to two versions of the data are listed
below) to perform the following tasks. Delete the extreme value, i.e., the case with
height = 6, for your analysis. The goal is to understand the student population based on
the students in the sample.
Link to Data: http://people.ysu.edu/~gchang/stat/Classdata2012.xls
Link to Data: http://people.ysu.edu/~gchang/stat/Classdata2012.csv
Perform a t-test to see if there is a statistically significant difference in average BMI
between students who exercised regularly and those who did not, at the 5% level of
significance. Also, properly conclude your analysis by providing your comments on the
findings. (The normality and equality of variances assumptions must be checked, and
the output from the statistical software should be included to support your comments.)
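The outcome of the equality-of-variances check decides which form of the two-sample t statistic applies. The sketch below shows both the pooled (equal variances) and Welch (unequal variances) versions using only the standard library. The BMI values are made up for illustration and are not from the class data, and the function name `two_sample_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(x, y, equal_var=True):
    """Return (t statistic, degrees of freedom) for two independent samples."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x), variance(y)  # sample variances (n - 1 denominator)
    if equal_var:
        # Pooled variance, appropriate when the variances look equal.
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = sqrt(sp2 * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Welch statistic with Satterthwaite degrees of freedom.
        a, b = v1 / n1, v2 / n2
        se = sqrt(a + b)
        df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return (mean(x) - mean(y)) / se, df


# Hypothetical BMI values, NOT taken from the class data set.
bmi_exercise = [22.1, 23.4, 21.8, 24.9, 22.7]
bmi_no_exercise = [25.3, 27.1, 24.0, 26.6, 28.2]

print(two_sample_t(bmi_exercise, bmi_no_exercise, equal_var=True))
print(two_sample_t(bmi_exercise, bmi_no_exercise, equal_var=False))
```

The two-sided p-value then comes from a t distribution with the returned degrees of freedom, which the statistical software reports alongside the statistic.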
Part IV
Please use the student data below to find the 95% confidence interval for estimating the
percentage of students who exercised regularly in the past year, using the asymptotic
method. (Please also show the software output for this estimation.)
Link to Data: http://people.ysu.edu/~gchang/stat/Classdata2012.xls
Link to Data: http://people.ysu.edu/~gchang/stat/Classdata2013.csv
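Assuming "asymptotic method" means the usual normal-approximation (Wald) interval, p_hat ± z * sqrt(p_hat * (1 - p_hat) / n), a minimal sketch looks like the following. The counts in the example call are hypothetical and are not taken from the class data, and the function name `wald_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided quantile
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical counts: 30 regular exercisers out of 50 students.
low, high = wald_interval(successes=30, n=50)
print(f"95% CI: ({low:.4f}, {high:.4f})")
```

Multiply the endpoints by 100 to report the interval as a percentage, and compare it with the software output requested above.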