Thursday week 4

Class work for Thursday April 21st
A study investigated how the use of two different cooking oils affected total cholesterol. The oils tasted and looked
about the same, and each individual was given a bottle of oil without a label, so the participants did not know
which of the two types of oil they were using. For context, I have included Figure 1, which shows the estimated
distribution of total cholesterol for adults in the United States.
Figure 1: the distribution (sort of a histogram) of total cholesterol for the US adult population. From the paper "Changing Disease Definitions:
Implications for Disease Prevalence. Analysis of the Third National Health and Nutrition Examination Survey, 1988-1994," Effective Clinical
Practice, March/April 1999, 2:76-85.
The following exercises will involve simulating many data sets, conducting a test on each data set, and then looking
at the p-values for the different tests.
We will do this using the normal random number generator in R. The line of code
groupA <- rnorm(n=100, mean=180, sd=40)
creates a sample of 100 random numbers from a normal distribution with mean 180 and standard deviation 40.
The values are put into the vector groupA.
mean(groupA) gives us the mean of the sample which we expect to be close to but not exactly 180.
If we generate another group we can use a function called t.test to test for equality of the two means.
groupB <- rnorm(n=100, mean=190, sd=40)
Here I’ve set the mean for the second population to 190.
To implement a t-test I use:
t.test(groupA,groupB)
To record the p-value I can use:
P_value <- t.test(groupA,groupB)$p.value
######### Exercise 1 (when the null hypothesis is true)
Let's generate some data from two populations where the means are the same.
Each is 100 random values from a normal distribution with mean 180 and standard deviation 40.
groupA <- rnorm(n=100, mean=180, sd=40)
groupB <- rnorm(n=100, mean=180, sd=40)
We can look at the data using two stacked histograms
par(mfrow=c(2,1))
cBreaks <- seq(-1000,1000,by=25)
hist(groupA,breaks=cBreaks,xlim=c(50,350),col="gray")
hist(groupB,breaks=cBreaks,xlim=c(50,350),col="gray")
# you can apply a simple t-test to test for a difference of means
t.test(groupA,groupB)
# briefly describe your conclusions for this experiment, pretending that you don't know
# that there was no difference in the effects of the oils on total cholesterol.
# If we repeated this experiment many times (with a different group of subjects each time)
# about how often would you expect to get a p-value smaller than 0.05?
# Check your answer by simulating a bunch of experiments and keeping track of the p-values
p_values <- numeric(1000)
for(i in 1:1000){
groupA <- rnorm(n=100, mean=180, sd=40)
groupB <- rnorm(n=100, mean=180, sd=40)
p_values[i] <- t.test(groupA,groupB)$p.value
}
What proportion of the p-values were less than 0.05?
Does this seem about right?
Hint: here's one way to calculate the number of p-values less than or equal to 0.05
length(p_values[p_values <= 0.05])
Plot a histogram of the p-values and add a vertical line at 0.05 (hint: abline(v=0.05))
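Putting the pieces of Exercise 1 together, a sketch of the whole workflow might look like this (the seed, number of simulations, and histogram breaks are my own choices):

```r
# Simulate 1000 experiments under a true null hypothesis, then summarize
# the resulting p-values.
set.seed(42)  # arbitrary seed, just for reproducibility
p_values <- numeric(1000)
for(i in 1:1000){
  groupA <- rnorm(n=100, mean=180, sd=40)
  groupB <- rnorm(n=100, mean=180, sd=40)
  p_values[i] <- t.test(groupA, groupB)$p.value
}
propSig <- length(p_values[p_values <= 0.05]) / length(p_values)
propSig  # should land near 0.05
hist(p_values, breaks=seq(0, 1, by=0.05), col="gray")
abline(v=0.05)
```

Since the null hypothesis is true in every simulation, the p-values should be roughly uniform between 0 and 1, which is why about 5% of them fall below 0.05.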
######### Exercise 2 (statistical significance versus practical significance)
Now let's look at an example where there is a very small difference in the mean total cholesterol
between the two groups, and the number of subjects is much larger.
groupA <- rnorm(10000, 181, 40)
groupB <- rnorm(10000, 179, 40)
First plot the data from a single simulation, do a t-test, and describe your results.
When you plot the data, do the total cholesterol levels look different?
How important do you think this difference is?
Then repeat the simulations many times as above.
How often is the p-value < 0.05?
Do you think the sample size they used was sufficient (i.e. 10,000)?
Why or why not?
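One way the repeated-simulation step might be sketched for this exercise (the seed is my own choice, and 500 repetitions keeps the run time modest):

```r
# Sketch for Exercise 2: a tiny true difference (181 vs 179) with n=10000
# per group. Track the p-value from each simulated experiment.
set.seed(2)
nSims <- 500
p_values <- numeric(nSims)
for(i in 1:nSims){
  groupA <- rnorm(10000, 181, 40)
  groupB <- rnorm(10000, 179, 40)
  p_values[i] <- t.test(groupA, groupB)$p.value
}
rejRate <- mean(p_values < 0.05)
rejRate  # with n this large, even a trivial 2-unit difference is usually "significant"
```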
####### Exercise 3
Now let's consider an example where there is a relatively large difference between the groups but
the sample size is small.
groupA <- rnorm(20, 190, 40)
groupB <- rnorm(20, 170, 40)
Repeat the steps above and answer the questions.
Do you think there will be more or fewer p-values <= 0.05?
What proportion of the p-values were less than 0.05?
Does this seem about right?
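A sketch of the repeated simulations for this exercise (seed and simulation count are my own choices):

```r
# Sketch for Exercise 3: a large true difference (190 vs 170) but only
# n=20 per group.
set.seed(3)
nSims <- 1000
p_values <- numeric(nSims)
for(i in 1:nSims){
  groupA <- rnorm(20, 190, 40)
  groupB <- rnorm(20, 170, 40)
  p_values[i] <- t.test(groupA, groupB)$p.value
}
rejRate <- mean(p_values <= 0.05)
rejRate  # well below 1: many of these small studies miss the real difference
```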
##### Exercise 4: (probability that null hypothesis is true)
Now consider 20,000 scientists around the world, each conducting an experiment and publishing
their results only if they get a p-value <= 0.05.
Let's assume that in half of the cases the null hypothesis is true, and for the other half there is
a true difference of 5.
They are all using very small sample sizes (5 in each group).
What do you think the probability of rejecting is for the two groups?
Repeat the simulations for each group:
# no difference, null hypothesis is true
groupA <- rnorm(5, 180, 40)
groupB <- rnorm(5, 180, 40)
# moderate difference null hypothesis is false
groupA <- rnorm(5, 180, 40)
groupB <- rnorm(5, 185, 40)
How does the probability of rejecting compare for the two groups?
How frequently was the null hypothesis false when you rejected?
If you saw the published results of one of these studies with wonderful news about the benefits
of a certain type of oil how convinced would you be? Why?
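The two scenarios can be simulated side by side; here is one possible sketch (the seed and the 10,000-per-scenario count are my own choices):

```r
# Sketch for Exercise 4: 10,000 experiments with a true null and 10,000
# with a true difference of 5, all with only n=5 per group.
set.seed(4)
nSims <- 10000
pNull <- numeric(nSims)  # null hypothesis true (180 vs 180)
pAlt  <- numeric(nSims)  # null hypothesis false (180 vs 185)
for(i in 1:nSims){
  pNull[i] <- t.test(rnorm(5, 180, 40), rnorm(5, 180, 40))$p.value
  pAlt[i]  <- t.test(rnorm(5, 180, 40), rnorm(5, 185, 40))$p.value
}
rejNull <- sum(pNull <= 0.05)  # rejections when the null was true
rejAlt  <- sum(pAlt <= 0.05)   # rejections when the null was false
fracFalseNull <- rejAlt / (rejNull + rejAlt)
fracFalseNull  # among "published" (rejected) results, how often was the null really false?
```

With samples this small the power is barely above 0.05, so roughly half of the published results come from experiments where the null hypothesis was actually true.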
###### Exercise 5 (magnitude and direction of errors in estimated differences)
As in exercise 4, consider a situation where there is a moderate difference in the mean, and a
small sample size.
Simulate using:
groupA <- rnorm(5, 180, 40)
groupB <- rnorm(5, 185, 40)
But also keep track of the size of the difference.
so add:
cDiff <- numeric(10000)
and inside the for loop:
cDiff[i] <- mean(groupB)-mean(groupA)
Plot a histogram of all of the differences.
What do you think of this?
Plot a histogram of the differences where the p-value was <= 0.05.
Answer all of the following questions using the histogram. I just need very approximate answers!
So if you rejected the null hypothesis, what are the chances that the observed difference in the
means is close to the true difference between the populations?
If you reject, what are the chances that you correctly identified which group had the higher mean?
If you reject, by about what amount do you tend to over-estimate the difference? (just consider
estimated differences > 0 here)
(hint: cDiff[p_values<=0.05])
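Here is one sketch of the whole Exercise 5 loop with cDiff added (seed and simulation count are my own choices):

```r
# Sketch for Exercise 5: track the estimated difference in means alongside
# the p-value for each simulated experiment.
set.seed(5)
nSims <- 10000
p_values <- numeric(nSims)
cDiff <- numeric(nSims)
for(i in 1:nSims){
  groupA <- rnorm(5, 180, 40)
  groupB <- rnorm(5, 185, 40)
  p_values[i] <- t.test(groupA, groupB)$p.value
  cDiff[i] <- mean(groupB) - mean(groupA)
}
sigDiff <- cDiff[p_values <= 0.05]  # differences from "significant" experiments
par(mfrow=c(2,1))
hist(cDiff, col="gray")    # all estimated differences
hist(sigDiff, col="gray")  # only the "significant" ones
mean(abs(sigDiff))  # typically far larger than the true difference of 5
```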
Group Assignments due Tuesday April 25th in Class
Between Thursday and Tuesday work together or individually on the following problems. Use the same group that you were in on Thursday.
You can start on the problems in class if you have time.
You will have 10 minutes to get organized at the beginning of class (on Tuesday) and then each group will present their results. You can just give
a verbal presentation, draw your results on the white board or email me 1 or 2 figures or slides. My email is martin.liermann@noaa.gov
Group 1 Assignment (based on exercise 1)
What happens when the data come from weird distributions?
Replace both instances of rnorm(n=100,mean=180, sd=40) above with some of the distributions you played around with in the
central limit theorem exercise. Try some really strange ones.
Keep the null hypothesis true (in this case let's say the null hypothesis is that there is no difference between the means).
Examples might include:
- change both to rgamma(10,.1,.1)*180
- change one to rnorm(10,180,20) and the other to rnorm(5,180,80)
You can also try other tests: wilcox.test, perm.test
Did the proportion of times that p<=0.05 stay the same?
What does this mean?
You can report your results using a simple table or you can use plots
Note: use install.packages("exactRankTests") and then library(exactRankTests) to make the permutation test perm.test available
Group 2 Assignment (based on exercise 2)
Change the sample size from 10,000 to 100 and then, through trial and error, try to find two means that give you about the same number of p-values <=0.05.
Don't worry about getting it exactly the same! (e.g. 0.8 would be close enough to 0.85)
For example you could try 185 and 175 instead of 181 and 179 (so a difference of 10 instead of 2).
Repeat this again for a sample size of 10.
What does this say about using the p-value to judge the practical significance of a result?
Group 3 assignment (based on exercise 3)
Repeat the simulations for a range of sample sizes and plot the proportion of the p-values less than 0.05 against sample size.
(Four different sample sizes would be fine. Also, you can plot by hand, in excel, or in R)
Interpret your results.
How should this inform a study design?
What happens if you look at p-values <= 0.01 instead of 0.05?
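A sketch of how the Group 3 power curve might be produced in R (the four sample sizes, seed, and 1000 repetitions per sample size are my own choices):

```r
# Sketch for the Group 3 assignment: estimate the proportion of p-values
# <= 0.05 for several sample sizes, then plot that proportion against n.
set.seed(6)
sampleSizes <- c(10, 20, 50, 100)
power <- numeric(length(sampleSizes))
for(j in seq_along(sampleSizes)){
  n <- sampleSizes[j]
  p_values <- numeric(1000)
  for(i in 1:1000){
    p_values[i] <- t.test(rnorm(n, 190, 40), rnorm(n, 170, 40))$p.value
  }
  power[j] <- mean(p_values <= 0.05)
}
plot(sampleSizes, power, type="b",
     xlab="sample size per group", ylab="proportion of p-values <= 0.05")
```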
Group 4 assignment (based on exercise 4)
Repeat the experiment above over a range of sample sizes (go up to at least 1000) (four different sample sizes would be fine)
Plot the proportion of times the null was false, when you rejected, against the sample size. (you can plot by hand, in excel, or in R)
Interpret your results.
How can this be used to help you interpret the results of a study you see in the media?
Group 5 assignment (based on exercise 5)
Repeat the experiment above over a wide range of sample sizes (go up to at least 100) (four different sample sizes would be fine)
Plot the average amount by which you overestimate when you reject. (You can plot by hand, in Excel, or in R.)
Hint: overEstimates <- cDiff[p_values<=0.05 & cDiff>0]-5 or cDiff[p_values<=0.05 & cDiff>0]/5
Interpret your results.
How can this be used to help you interpret the results of a study you see in the media?