Tuesday week 5

Advertisement
Tests and beyond
Group Presentations
• Group 1: How do weird distributions affect the p-value when the null hypothesis is true?
• Group 2: What does the p-value say about the effect size?
• Group 3: How does sample size affect the probability of rejecting (p ≤ 0.05) the null when the null hypothesis is true?
• Group 4: How does increasing sample size change your interpretation of rejecting (p ≤ 0.05) the null?
• Group 5: What is the estimated effect size if you reject?
Power
How frequently would I reject if there were a
difference of X in the population means?
Example of a power analysis
How often would I reject if the difference
between the means was 15?
p_values <- numeric(1000)
for(i in 1:1000){
  groupA <- rnorm(20, 180, 40)
  groupB <- rnorm(20, 180 + 15, 40)
  p_values[i] <- perm.test(groupA, groupB)$p.value
}
length(p_values[p_values <= 0.05]) / 1000
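The same simulation can be sketched with base R's t.test standing in for perm.test (perm.test here is assumed to come from the course materials or a permutation-test package):

```r
# Estimate power by simulation: how often do we reject at alpha = 0.05
# when the true difference in means is 15 (sd = 40, n = 20 per group)?
set.seed(1)
p_values <- replicate(1000, {
  groupA <- rnorm(20, 180, 40)        # first group: mean 180, sd 40
  groupB <- rnorm(20, 180 + 15, 40)   # second group: true difference of 15
  t.test(groupA, groupB)$p.value
})
mean(p_values <= 0.05)   # estimated power, roughly 0.2
```

mean(p_values <= 0.05) is a compact equivalent of length(p_values[p_values <= 0.05])/1000.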
[Histograms of the 1000 simulated p-values (frequency vs p-value, 0 to 1); the shaded region marks p ≤ 0.05.]
Power = 0.21
Probability of rejecting given:
• A difference of 15
• Standard deviation = 40
• Sample size = 20
Where do we get the difference and
standard deviation?
• Theory (e.g. expect at most a difference of XX).
• Other studies.
• A pilot study.
• Regulations (must show an increase of at least ZZ).
Total Cholesterol Example
Example of a power analysis
sampleSizes <- c(5,10,20,40,60,80,100,120,140,160)
power_vals <- numeric(10)
for(j in 1:10){
  sampSize <- sampleSizes[j]
  p_values <- numeric(10000)
  for(i in 1:10000){
    groupA <- rnorm(sampSize, 180, 40)
    groupB <- rnorm(sampSize, 180 + 15, 40)
    p_values[i] <- t.test(groupA, groupB)$p.value
  }
  power_vals[j] <- length(p_values[p_values <= 0.05]) / 10000
}
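Since the inner test is a plain t-test, the simulated curve can be cross-checked analytically with power.t.test from base R's stats package:

```r
# Analytic power for the same scenario via stats::power.t.test:
# difference of 15, sd of 40 per group, two-sided test at alpha = 0.05
sampleSizes <- c(5, 10, 20, 40, 60, 80, 100, 120, 140, 160)
analytic <- sapply(sampleSizes, function(n)
  power.t.test(n = n, delta = 15, sd = 40, sig.level = 0.05)$power)
round(analytic, 2)   # rises toward 1 as the sample size grows
```

These analytic values should track the simulated power_vals closely.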
[Power curves: power (0 to 1) against sample size (0 to 150), one curve for effect size = 10 and one for effect size = 15.]
This stuff makes my head hurt.
There’s got to be an easier way!
• Hypothesis Testing -> Reject
• Estimation
Estimation & Confidence Intervals
What is a Confidence Interval?
• Unfortunately it’s not: a 95% probability that the true mean lies in the CI.
• It is: 95% of the time the CI includes (covers) the true mean (difference, etc.).
[Diagram: repeated CIs, most of them covering the true difference.]
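The coverage interpretation can be checked by simulation, here with the same cholesterol-style numbers used above (difference 15, sd 40, n = 20 per group):

```r
# Coverage check: about 95% of the 95% CIs should cover the true difference
set.seed(1)
true_diff <- 15
covered <- replicate(10000, {
  groupA <- rnorm(20, 180, 40)
  groupB <- rnorm(20, 180 + true_diff, 40)
  ci <- t.test(groupB, groupA)$conf.int   # CI for mean(B) - mean(A)
  ci[1] <= true_diff && true_diff <= ci[2]
})
mean(covered)   # close to 0.95
```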
Issues with Tests
• Statistical vs practical significance.
• Power of a test.
• What does not rejecting mean?
• What does rejecting mean?
[CI diagrams for the total-cholesterol difference: intervals centered near 15, compared against 0, narrowing as N = 10, 20, 50, 75.]
Smaller CIs -> higher power.
Issues with Tests
The CI extends about 2 x se on each side of the estimate, where

se = sqrt((var1 + var2) / N)

With a standard deviation of 40 in each group (variance 1600):

se = sqrt((1600 + 1600) / N) ≈ 57 / sqrt(N)

[CI diagrams around 15 and 0, narrowing as N = 10, 20, 50, 75.]
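A quick check of that arithmetic (sqrt(3200) ≈ 56.6, rounded to 57 above):

```r
# CI half-width 2*se for the cholesterol example, sd = 40 in each group
N  <- c(10, 20, 50, 75)
se <- sqrt((1600 + 1600) / N)   # equivalently about 57 / sqrt(N)
round(2 * se, 1)                # half-widths shrink as N grows
```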
[Animation build: repeated "Issues with Tests" slides with CI diagrams around 0 and the true effect size, illustrating what rejecting means.]
Thursday
• Study Design
End
How would you do a power analysis?
How large would my samples need to be in
order to reject 90% of the time?
Need:
1) A reasonable difference that you would like
to be able to detect.
2) An approximate standard deviation of the
data you expect to collect.
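Given those two inputs, stats::power.t.test can solve for the sample size directly; a sketch with the numbers used throughout (difference 15, sd 40):

```r
# Per-group sample size for 90% power to detect a difference of 15
# when the sd is about 40 (two-sided t-test, alpha = 0.05)
n_needed <- power.t.test(delta = 15, sd = 40, power = 0.9)$n
ceiling(n_needed)   # around 150 per group
```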
Confidence intervals
For the Cholesterol example
• Simulation for power analysis.
• Simulation for checking your test.
Define Power
Use a motivating example.
Confidence Intervals
Use bootstrap to motivate, then introduce 2*se.
Explain the connection between confidence intervals
and tests: subtract the mean difference.
Given the mean is X, there is a 95% probability that
the CI will cover it.
Given the difference is zero, there is a less-than-5% probability
that this confidence interval will exclude zero.
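That duality can be demonstrated directly: for the t-test, rejecting at the 0.05 level and the 95% CI excluding zero are the same event.

```r
# A 0.05-level t-test rejects exactly when its 95% CI excludes zero
set.seed(1)
agree <- replicate(1000, {
  groupA <- rnorm(20, 180, 40)
  groupB <- rnorm(20, 180, 40)   # the null is true here
  tt <- tt <- t.test(groupA, groupB)
  rejects  <- tt$p.value <= 0.05
  excludes <- tt$conf.int[1] > 0 || tt$conf.int[2] < 0
  rejects == excludes
})
all(agree)   # TRUE: the test and the CI always agree
```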
Rules of Thumb that are useful!
• Standard error.
• 2*standard error (t-test).
• Again, don’t get caught up in the minutiae.
• The difference between 2 and 1.96 is small compared to other things.
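The 2-vs-1.96 point can be seen from the quantiles themselves; the degrees of freedom matter far more than that distinction:

```r
# 1.96 vs 2 is a small difference compared with the effect of small df
qnorm(0.975)         # about 1.96
qt(0.975, df = 10)   # about 2.23
qt(0.975, df = 100)  # about 1.98
```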