Robert Ploutz-Snyder, Ph.D. Biostatistician NASA JSC

Robert Ploutz-Snyder, Ph.D.
Biostatistician NASA JSC
USRA / Division of Space Life Sciences
Robert.Ploutz-Snyder-1@nasa.gov
Biostats Lab
Overview

Review key terms & concepts relevant
to sample size considerations

Demonstrate easily accessible method
for calculating sample size requirements

Additional Strategies & Topics
Key Terms & Concepts
Demonstration
Additional Strategies
Primary Outcome(s)

Primary Outcome variable (a.k.a. Dependent Variable)
 Is it continuous or discrete?
 How is it distributed in the population?
 Do you have any pilot data with this outcome?
 Is it sensitive enough to detect the effects you are looking for?
○ Precision, Reliability, Validity

Secondary Outcomes
 Do you really need 30 different “primary” outcomes?
Effect

“Effect”: what are you trying to observe?
What are you hypothesizing will change, or be different?
 Pre vs. Post Changes in Y Within Group
○ Main Effect for Time
○ “We anticipate subjects’ coordination to decrease in response to
simulated microgravity, with significant differences observed Post,
relative to Pre.”
 Mean Outcome of Y in Treatment A vs. Treatment B
○ Main Effect for Treatment
○ “We expect participants randomized to receive our novel
intervention to report lower mean back pain ratings relative to
participants randomized to receive standard care.”
 Pre/Post Changes in Y for one group relative to
another
○ Time x Treatment Interaction Effect
○ “We hypothesize that our novel intervention will result in a
reduction in the amount of strength loss over time, relative to
controls who receive the intervention typically delivered to LD
Astronauts.”
Effect (cont.)

Note that “statistical effect” (ex. main effect,
interaction effect) ≠ hypothesized effect

However, merit review panels tend to respond
well to descriptions of your anticipated Effect in
terms of:
 The outcome variable of interest
 The experimental design
 The statistical effect that would support it
Effect Size

“Effect Size”: how big of a change
(difference) do you anticipate observing in the
proposed study?

Measured in the metric of your Outcome
variable
 With specified standard deviation (σ)

Typically based on evidence
 Pilot data collected from your lab
 Data published by others in a related study

Even better if consistent with clinically,
scientifically, or operationally relevant changes
that would be considered impressive
Effect Size (cont.)

Smaller effects are more difficult to detect
 Variability is an important consideration

Be realistically optimistic about effect size estimates
Does it really matter that you detected a
“significant difference” in XYZ, if that amount of
difference would not affect …
 A person’s health, safety, performance?
 The way you’d rehab/treat persons affected “this
much (p<.05)” different?
 Decisions NASA would or could make based on your
results (ex. did you have the right control group?)

Statistical significance isn’t the reason we do
research
Power
 “Power”: the probability of detecting an
effect, given the effect size, experimental
design & outcome variable(s) you have
chosen, and your assumptions regarding
Type I (α) and Type II (β) errors.
Type I & II Errors & Power
The Truth is: | H0 Really is True (there’s no effect) | H0 is Actually False (there is an effect)
You Rejected H0 Due to a Statistically Significant Result | Wrong Conclusion | Right Conclusion
You Accepted H0 Due to a Non-Significant Result | Right Conclusion | Wrong Conclusion
Type I & II Errors & Power
The Truth is: | H0 Really is True (there’s no effect) | H0 is Actually False (there is an effect)
You Rejected H0 Due to a Statistically Significant Result | Type I Error, Probability = α | Power, Probability = (1-β)
You Accepted H0 Due to a Non-Significant Result | Probability = 1-α | Type II Error, Probability = β
What is a Power Analysis?
 Calculations
performed when planning a
future study that help determine the
minimum number of subjects you will need
to have a high likelihood of detecting the
effect that you expect, given everything
discussed thus far.
 Note that if the effect isn’t real, no amount of
power will help you find it…
Parts of the Calculations
 Anticipated Effect size
 Variation of the outcome (σ)
 Assumed α risk
 Assumed Power
 Sample Size

Alpha, Effect Size & σ, and Power together determine the Minimum N
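How these pieces combine can be sketched in a few lines of Python. This is a minimal illustration, not the calculation the Biostats Lab software performs: it assumes a two-tailed one-sample (or paired) comparison and uses the large-sample normal approximation, so exact t-based software will return somewhat larger n for small samples. The function name `min_n_one_sample` and the example numbers (delta = 10, sigma = 20) are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def min_n_one_sample(delta, sigma, alpha=0.05, power=0.80):
    """Approximate minimum n to detect a mean difference `delta`.

    Normal approximation for a two-tailed one-sample/paired test:
        n = ((z_{1-alpha/2} + z_{power}) * sigma / delta) ** 2
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the alpha risk
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(((z_alpha + z_power) * sigma / delta) ** 2)


# Hypothetical inputs: detect a 10-unit change when the SD of change is 20
print(min_n_one_sample(delta=10, sigma=20))  # 32
# Halving the effect size roughly quadruples the required n
print(min_n_one_sample(delta=5, sigma=20))   # 126
```

The second call shows why smaller effects are harder to detect: required n grows with the square of sigma/delta.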
Other Considerations

Keep it Simple!
 Power the PRIMARY outcomes, not all of them
 Distill your effect into a “t-test”-like comparison if
possible
Be mindful of Experimental Design issues
that can increase power
 Use data from multiple sources to validate
 Use software that has been validated
 Consult the Biostatistics lab if you need help!

On-line Power Software?



Not all of it is good stuff!
The Biostats Lab uses and recommends
http://www.stat.uiowa.edu/~rlenth/Power/
 Drs. Feiveson or Ploutz-Snyder can assist you if you want to
learn how to use it…

Example of Recent Power
Analysis
Exercise Lab – Novel Intervention
 Knee extensor strength (KES) and
endurance (KEE) are reduced in ISS astronauts
due to negative effects of space flight

We have a new intervention that we
think will reduce these negative effects
Study Challenges

For good reasons… we have no pilot data
 Thus difficult to project the likely effect size

Our desire for a “Usual Care” comparison
group is competing with our desire to collect
data on participants using our novel
intervention
 And it’s largely out of our control because of
self-selection
What did we do?
Hypothesize a reduced decline (i.e.
Interaction effect) associated with a novel
intervention…
 Powered to detect a wide range of strength
differences (pre/post), including

 Changes similar to historical ISS data
○ i.e. no benefit above what we already do
 5% reduction in the mean change
 10% reduction in the mean change
 15% reduction in mean change
 5% increase (i.e. worse) in the mean change
How did we do it?

Used historical ISS data as
a baseline for Pre/Post
changes observed

Assumed similar SD-change (σ)
Assumed 2-tailed α = 0.05
Created Power/Sample Size curves associated with the
five different effect sizes described in prior slide
Did this for all of the Primary outcomes



Note that

While our statistical plan was far more
sophisticated, the power analysis is very simple
 Simplifying assumptions were conservative on
alpha risk

We did not set a “critical” power to detect
(as most NIH grant applications typically do)
 We allow the reviewer to examine the trade-off
between Power and Sample Size
 But we choose n based on priorities, logistics, etc.
Info Needed to do this:

Mean & SD of Change from ISS Data
 KES: mean = 145.02; SD=155.5

Projected mean changes under different
scenarios
 5% reduction ~ 145.02*.95 = 137.8
MS Excel or similar (graphing)
 Web access to Power software:

 http://www.stat.uiowa.edu/~rlenth/Power/
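The projected means for every scenario follow the same arithmetic as the 5% example above. A small sketch, using the KES mean from the slide; the scenario labels and the dictionary layout are illustrative choices, not part of the original analysis:

```python
baseline_mean = 145.02  # historical ISS mean change in KES (from the slide)

# Multiplier applied to the historical mean change under each scenario
scenarios = {
    "5% increase (worse)": 1.05,
    "no benefit (historical)": 1.00,
    "5% reduction": 0.95,
    "10% reduction": 0.90,
    "15% reduction": 0.85,
}

# Projected mean change, rounded to one decimal as on the slide
projected = {label: round(baseline_mean * f, 1) for label, f in scenarios.items()}

for label, mean in projected.items():
    print(f"{label:>24}: projected mean change = {mean}")
```

Each projected mean, paired with the assumed SD of 155.5, becomes one "Mean Diff" entry in the power software.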
How did we do it?
Choose one-sample t-test
How did we do it?
Sigma (σ) = 155.5
Mean Diff = 145
α =.05
(default)
Doesn’t matter what you set
Power for, as we’ll go into
the “Options” menu
anyway!
How did we do it?
Click on the “Options”
Menu, and select “Graph”
How did we do it?
Nice graph for
starters… but I
want the data, not
the picture.
Click on “Show
Data” to get it.
How did we do it?
Now we have the data that
we can copy into EXCEL (or
other program) for
comparisons with other
power/sample size curves in
this model.
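For readers without access to the applet, the same kind of power-by-sample-size table can be approximated in code. This is a sketch under stated assumptions: it uses the normal approximation to the one-sample t-test rather than the exact noncentral-t calculation the applet performs, so its power values will run slightly higher than the applet's at small n. The function name `approx_power` is hypothetical; the mean and SD are the KES values from the earlier slide.

```python
from math import sqrt
from statistics import NormalDist

SIGMA = 155.5  # SD of change in KES from historical ISS data (from the slides)


def approx_power(mean_diff, sigma, n, alpha=0.05):
    """Two-tailed power to detect mean_diff vs. 0 (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(mean_diff) / sigma * sqrt(n)  # standardized effect times sqrt(n)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Projected mean changes: historical ISS mean scaled by each scenario factor
scenario_means = {
    "5% increase": 145.02 * 1.05,
    "historical": 145.02,
    "5% reduction": 145.02 * 0.95,
    "10% reduction": 145.02 * 0.90,
    "15% reduction": 145.02 * 0.85,
}

# One row per sample size, one column per scenario (tab-separated for a spreadsheet)
print("n\t" + "\t".join(scenario_means))
for n in range(10, 26):
    row = [f"{approx_power(m, SIGMA, n):.3f}" for m in scenario_means.values()]
    print(f"{n}\t" + "\t".join(row))
```

The tab-separated output can be pasted into a spreadsheet and plotted as one curve per scenario.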
Repeat for other effect sizes
Here, we hope that with our new intervention
we will not observe the effect we’ve seen
previously
 PI determined clinically relevant differences
that would be meaningful

 5%, 10%, 15% Better

PI also wanted to power the study to detect
a worsening of spaceflight effects, should
that occur
End result?
[Figure: Power as a function of Sample Size (Knee Extension Endurance). Y axis: Power, 30% to 100%. X axis: Sample Size per Group, 10 to 25. Curves: 5% increase; Observed Mean Decrease in KEE (ISS n=17); 5% reduction; 10% reduction; 15% reduction.]
Interpretation

With as few as 15 subjects in the novel
intervention, we exceed 80% power to detect
changes in Knee Extensor Endurance ranging
from 5% more than what historical ISS data
projects to as much as 15% less.
Too much of a good thing?

Some disciplines have the luxury of very precise
and sensitive outcome measures
 High signal-to-noise ratio
 Low within or between person variability

Can lead to overpowered studies
 Studies where even a small n is sufficient to reveal
statistically significant differences for very small effects

“In order to be a difference, it has to make a
difference”
 Dr. Bill Paloski, former NASA Researcher/Manager
What if required n were too high?

Underpowered Studies:
 Less likely to detect differences: inconclusive
results & frustration!
 Waste of resources, given “no-answer” potential
 Less justifiable risk to participants
 Displacement of other research that could have
taken place
 Limited scientific contributions
 …but… is it useless?

What can I do to increase my
power honestly?
Choose appropriate & powerful statistical
techniques
 Repeated-Measures & Mixed-Model designs
 Co-vary nuisance variance contributors

Choose more sensitive outcome measures
 Continuously scaled vs. ordinal/categorical
Choose more reliable outcome measures
 Reduce error variance
 Narrow subject selection (& inference space)


What can I do to increase my
power honestly?
Challenge traditional notions of what
constitutes “important” results
 Is α = .05 really necessary?
 Is 2-tailed alpha testing appropriate?
 Is Power = 80% a reasonable cut-off?

Should you be operating under traditional
inferential boundaries at all?
 Characterization, descriptive, feasibility studies are
part of science too.
What’s my alternative?
Instead of “powering to detect a significant
difference,” some research is more preliminary
or exploratory in nature.
 If so, project the range of effects you will be
able to detect, given n.

 In other words, turn the equation around. “Given
n=??, the effects that I will be able to detect are of
### magnitude.”
Back to our example:

If I know that I can only recruit 10 astronauts to
participate…
Solve for effect size,
given n, σ, α
Back to our example:
Graph Power (Y) vs.
Effect Size (X) to
understand what
you’ll be able to
detect…
Is it worth it to
pursue?
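Turning the equation around can be sketched as follows. As before, this assumes a two-tailed one-sample comparison with the normal approximation, which is optimistic at n = 10 (an exact t-based calculation would require a somewhat larger effect for the same power). The function names are hypothetical; σ = 155.5 is the KES value from the earlier slide.

```python
from math import sqrt
from statistics import NormalDist


def detectable_effect(n, sigma, alpha=0.05, power=0.80):
    """Smallest mean change detectable with the given power (normal approximation)."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * sigma / sqrt(n)


def approx_power(mean_diff, sigma, n, alpha=0.05):
    """Two-tailed power to detect mean_diff vs. 0 (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(mean_diff) / sigma * sqrt(n)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


n, sigma = 10, 155.5
print(f"80% power at n={n}: mean change of about {detectable_effect(n, sigma):.0f}")

# Power (Y) as a function of effect size (X) for the fixed n
for effect in range(50, 201, 25):
    print(f"effect={effect:>3}  power={approx_power(effect, sigma, n):.2f}")
```

Under this approximation, 10 participants give about 80% power only for mean changes of roughly 138 units or more, which is the kind of statement the "is it worth it to pursue?" question needs.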
Another Approach

Report on the information the study will
provide, given n¹
 When only able to collect data from a sample of size n, why bother
estimating a required sample size?
 Instead: “For a sample of size Y, I get information Z”
 “From sample data of size n, I can characterize the
effects that I’m interested in with what level of
precision?”
¹ Parker, R., & Berman, N. (2003). Sample Size: More than Calculations. The American Statistician,
August 2003, Vol 57(3).
Precision


Precision here refers to
our accuracy in
reporting the effect that
we will observe in a
future study, given n
Commonly measured
as the width of the 95%
Confidence Interval
 Or “Half-width”
 The narrower the better…

Returning to our example…
Given the σ of the differences observed on ISS
so far, if I collect data from a future study that
assumes similar variability (but hopefully less
decline in KEE), I will be able to characterize
my observed effects with what level of
precision?
How would you do this?
Use historical data for σ (recall σ ≈ 155)
 Apply the usual calculations for CI’s
 Except here, we estimate many CI’s for
different sample sizes, and we do so for our
metric of precision based on distance (D)
between the edges of the CI and the middle
(i.e. the “half-width”)
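A sketch of the half-width calculation, assuming the usual t-based confidence interval for a mean with σ taken from historical data (the symbols below are assumptions, not recovered from the slide):

```latex
\text{CI} = \bar{x} \pm D,
\qquad
D = t_{1-\alpha/2,\; n-1} \cdot \frac{\sigma}{\sqrt{n}}
```

For example, with σ = 155, n = 10 and a 95% interval, t ≈ 2.262, so D ≈ 2.262 × 155 / √10 ≈ 111.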

Precision

Common to plot the curves
representing relationship of
half-width and sample size.
 You could do this manually (or
Excel) by applying the
formula for CI’s
 You could ask for assistance
from the Biostatistics lab
 You could purchase
specialized software (ex.
PASS) to assist you
http://www.ncss.com/pass.html
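The manual route can be sketched in a few lines. This version is a simplification: it uses the normal critical value (about 1.96 for 95%) instead of the t critical value, so it understates the half-width at small n; a t-based table or statistical software would be needed for exact figures. The function name `half_width` is hypothetical, and σ = 155 is the historical value recalled earlier.

```python
from math import sqrt
from statistics import NormalDist


def half_width(n, sigma, confidence=0.95):
    """Approximate CI half-width for a mean, using a normal critical value."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_crit * sigma / sqrt(n)


# Half-width as a function of sample size, ready to plot
for n in range(5, 31, 5):
    print(f"n={n:>2}  half-width ~ {half_width(n, 155):.1f}")
```

Because the half-width shrinks with 1/√n, the curve drops steeply at first and then flattens, which is why each added subject buys less precision at larger n.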
Interpretation?

GIVEN n= whatever it is…
report on the precision.
 Is that level of precision
meaningful enough to
pursue?

Also consider where the
slope begins to change,
here around n~10
 Is your possible n close to
that?
 What would it take to get you
there??
Recap

Sample size & power calculations:
 Should happen BEFORE you plan your research
 Require knowledge about anticipated effect sizes &
σ, and are best appreciated as a continuum, rather
than a cut-off
 Are affected by experimental design, outcome
variables & statistical plan
Recap

Not all studies have the luxury of recruiting the
right number of subjects
 Overpowered studies can send false alarms
 Underpowered studies can lead to false conclusions

Not all studies should be designed to “detect
statistical differences”; such studies benefit from a
different type of sample size analysis that
focuses on Precision, rather than Power
What if I need help?

Consult the Biostatistics lab at JSC in
the early phases of planning your study
 Dr. Al Feiveson
○ Director of Biostatistics Lab
○ alan.h.feiveson@nasa.gov
 Dr. Rob Ploutz-Snyder
○ robert.ploutz-snyder-1@nasa.gov

But please allow us adequate time to
devote to your study…