sample size estimation

SAMPLE SIZE ESTIMATION
DR. SHRIRAM V. GOSAVI
MODERATED BY: BHARAMBE SIR
FRAMEWORK
• What is sample size, and why is it required?
• Practical issues in determining sample sizes
• Determining sample size
• Sample size calculation by different ways
• Sample size estimation for descriptive studies
• Sample size estimation for hypothesis testing
• Summary
• References
WHAT IS SAMPLE SIZE? & WHY IS IT REQUIRED?
• Sample size ("n") is the number of subjects included in a study.
• When planning any research, it is important to know how many subjects should be included in the study (the sample size) and how these subjects should be selected (the sampling method).
• If a study does not have an optimum sample size, true differences that exist in reality may not be detected as significant.
• That is, the study would lack the power to detect significant differences because of an inadequate sample size.
How Big a Sample
Do You Need?
• Small sample size (less than the optimum sample size)
– May fail to detect a clinically important difference,
– or may estimate those effects or associations too
imprecisely,
– Even the most rigorously executed study may fail to
answer its research question
• Very large sample size (more than the optimum size):
– Involves extra patients
– Costs more
– Difficult to maintain high data quality
NOT VERY SMALL AND NOT VERY LARGE
Practical issues in Determining Sample
Sizes
• Importance of the Research Issue: If the
results of the survey research are very critical,
then the sample size should be increased. As
sample size increases, the width of the
confidence interval decreases.
• Heterogeneity of the population: If there are likely to be wide variations in the results obtained from various respondents, the sample size should be increased.
• Funding: quite often, budgetary constraints
limit the sample size for the study
• Number of sub-groups to analyze: If multiple
sub-groups in a population are going to be
analyzed, the sample size should be increased
to ensure that adequate numbers are
obtained for each sub-group
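To illustrate the earlier point that the confidence interval narrows as the sample grows, here is a minimal Python sketch that computes the half-width of a 95% confidence interval for a mean at several sample sizes. The function name `ci_half_width` and the standard deviation of 10 are assumptions chosen only for illustration, and a normal approximation is assumed.

```python
from math import sqrt
from statistics import NormalDist


def ci_half_width(sd: float, n: int, confidence: float = 0.95) -> float:
    """Half-width of a normal-approximation confidence interval for a mean."""
    # Two-sided Z value for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sd / sqrt(n)


# Hypothetical SD of 10; each fourfold increase in n halves the half-width.
for n in (25, 100, 400):
    print(n, round(ci_half_width(10.0, n), 2))
```

Because the half-width shrinks with the square root of n, quadrupling the sample only halves the interval, which is why very high precision quickly becomes expensive.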
Determining sample size
The things you need to know:
• Random error
• Systematic error
• Validity & precision
• Null hypothesis
• Alternate hypothesis
• Hypothesis testing
• Type I & II error
• Power
• Effect size
• Design effect
Random error
• It describes the role of chance,
• Sources of random error include:
- sampling variability,
- subject to subject differences &
- measurement errors.
• It can be controlled and reduced to acceptably
low levels by:
- Averaging,
- Increasing the sample size &
- Repeating the experiment
Systematic error (Bias)
• It describes deviations that are not a consequence of
chance alone.
• Several factors, including patient selection criteria, might contribute to it.
• These factors may not be amenable to measurement.
• It can be removed or reduced by good design and conduct of the experiment.
• A strong bias can yield an estimate very far from the
true value.
Validity and Precision (1)
Fundamental concern: avoidance and/or control of error
Error = difference between true values and study results
• Accuracy = lack of error
• Validity = lack or control of systematic error
• Precision = lack of random error
Validity and Precision (2)
[Figure: diagram relating the study results, the actual estimator and the target estimator, with validity and precision marked.]
Any possibility of errors?
• Since our decision is based on the sample we
chose from the population, there is a
possibility that we make a wrong decision
• A type I error occurs when the null hypothesis is rejected when it is in fact true.
• A type II error occurs when the null hypothesis is not rejected when it is in fact false.
Summary of possible results of any hypothesis test

                              Researcher's decision
Reality of null hypothesis    Accepted                   Rejected
True                          Correct                    Type I error (α error)
False                         Type II error (β error)    Correct; Power = (1 − β)
Type I error / α error
• The probability of making a type I error is called the level of significance (α), conventionally set at 0.05 (5%).
• For computing the sample size, it must be specified in terms of Zα.
• The quantity Zα is the value from the standard normal distribution corresponding to α.
• Sample size is inversely related to α: the smaller the accepted type I error, the larger the sample required.
Type II error / β error
• For computing the sample size, it must be specified in terms of Zβ.
• The quantity Zβ is the value from the standard normal distribution corresponding to β.
• A type II error is frequently due to a small sample size.
• The exact probability of a type II error is generally unknown.
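The Z values for chosen error rates can be looked up in tables or computed directly. The following is a minimal sketch using Python's standard library; the helper names `z_alpha` and `z_beta` are assumptions introduced here, and a two-sided test is assumed for α.

```python
from statistics import NormalDist


def z_alpha(alpha: float, two_sided: bool = True) -> float:
    """Standard normal deviate for a type I error rate alpha."""
    # A two-sided test splits alpha equally between the two tails.
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


def z_beta(beta: float) -> float:
    """Standard normal deviate for a type II error rate beta (power = 1 - beta)."""
    return NormalDist().inv_cdf(1 - beta)


print(round(z_alpha(0.05), 2))  # alpha = 5%, two-sided: 1.96
print(round(z_beta(0.10), 2))   # beta = 10% (power 90%): 1.28
print(round(z_beta(0.20), 2))   # beta = 20% (power 80%): 0.84
```

Lowering α or β raises the corresponding Z value, which in turn raises the required sample size.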
Power of the study:
• Probability that the test will correctly identify a significant difference, effect or association in the sample, should one exist in the population.
• Power = 1 − β. It corresponds to the sensitivity of a diagnostic test, i.e. the probability of making a positive diagnosis when disease is present.
• Thus, sample size is directly related to the power of the study.
• A well-designed trial should have a power of at least 0.8.
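As a rough illustration of how power grows with sample size, the sketch below approximates the power of a two-sided comparison of two group means under a normal approximation. The function name `power_two_means` and the input values (a difference of 5 with an SD of 10) are assumptions for illustration, not a definitive power calculation.

```python
from math import sqrt
from statistics import NormalDist


def power_two_means(n_per_group: int, diff: float, sd: float,
                    alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test comparing two group means."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two means with a common SD.
    se = sd * sqrt(2 / n_per_group)
    # Ignores the negligible chance of rejecting in the wrong direction.
    return std_normal.cdf(diff / se - z_alpha)


# Power increases as the number of subjects per group increases.
for n in (20, 40, 85):
    print(n, round(power_two_means(n, diff=5, sd=10), 2))
```

With these illustrative inputs, only the largest of the three sample sizes clears the conventional 0.8 threshold.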
Effect size
– It should represent smallest difference that would
be of clinical or biological significance.
– If the effect size is increased, the type II error
decreases
– A large sample size is needed for detection of a
minute difference.
– Thus, the sample size is inversely related to the
effect size.
Variability of the measurement:
– The variability of measurements is reflected by the
standard deviation or the variance.
– The higher the standard deviation, the larger the sample size required.
– Thus, sample size is directly related to the SD.
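Both relations, inverse to effect size and direct to SD, can be read off the standard formula for comparing two group means. This is a sketch under the usual normal approximation, where d is the difference to be detected and S is the common standard deviation.

```latex
n = \frac{2\,(Z_\alpha + Z_\beta)^2\, S^2}{d^2}
\quad\Longrightarrow\quad
n \propto \frac{S^2}{d^2}
```

So halving d, or doubling S, multiplies the required sample size by four.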
Types of Problems in Medical Research
Estimation: (Prevalence/Descriptive Study)
- Given proportion of prevalence
- Given mean & standard deviation
Testing hypothesis: (Cohort/Case Control/Clinical
Trial)
- Given two proportions or incidence rates
- Given two group means and standard
deviations
SAMPLE SIZE CALCULATION BY
DIFFERENT WAYS
• By use of formulae
• Computer software
• Readymade tables
• Nomograms
Formulae & Problems
Sample size
Quantitative: $n = \dfrac{Z^{2}\sigma^{2}}{d^{2}}$
Qualitative: $n = \dfrac{Z^{2}\, p(1-p)}{d^{2}}$
Descriptive study
A. When proportion is the parameter of our study
$n = Z_{\alpha}^{2}\, p\, q / d^{2}$
where
Zα = standardized normal deviate (Z value) for the chosen α
p = proportion or prevalence of interest (from pilot study or literature survey), expressed as a percentage
q = 100 − p
d = clinically expected variation (precision), in the same percentage units as p
Example
A pilot study reported that 28% of headache patients had vascular headache. It was decided to use a 95% CI and to allow 10% relative variability in the estimated 28%. How many patients are necessary to conduct the study?
ANSWER
p = 28%, q = 72%
Zα = 1.96 for α = 0.05
d = 10% of 28% = 2.8
$n = (1.96)^{2} \times 28 \times 72 / (2.8)^{2} \approx 987.8$, i.e. 988 patients after rounding up
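The worked example above can be checked with a short Python sketch. The function name `n_for_proportion` is an assumption introduced here, proportions are entered as fractions rather than percentages, and the result is rounded up because a fraction of a patient cannot be recruited.

```python
import math
from statistics import NormalDist


def n_for_proportion(p: float, d: float, confidence: float = 0.95) -> float:
    """Unrounded sample size to estimate a proportion p within +/- d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z ** 2 * p * (1 - p) / d ** 2


# Values from the example: p = 0.28, d = 10% of 0.28 = 0.028.
raw = n_for_proportion(p=0.28, d=0.10 * 0.28)
print(round(raw, 1), math.ceil(raw))
```

Working in fractions (0.28 and 0.028) gives the same answer as working in percentages (28 and 2.8), since the scale factor cancels between numerator and denominator.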
B. When mean is the parameter of our study
$n = Z_{\alpha}^{2}\, S^{2} / d^{2}$
where
Zα = standardized normal deviate (Z value)
S = sample standard deviation
d = clinically expected variation
Example
In a health survey of school children, the mean hemoglobin level of 55 boys was found to be 10.2 g/100 ml, with a standard deviation of 2.1 g/100 ml. The clinically meaningful difference is taken as 0.8 g/100 ml.
Mean = 10.2
Standard deviation = 2.1
Zα = 1.96 for α = 0.05
d = 0.8
$n = (1.96)^{2} \times (2.1)^{2} / (0.8)^{2} \approx 26.5$, i.e. 27 after rounding up
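A minimal Python sketch of the same calculation follows. The function name `n_for_mean` is an assumption introduced here; the inputs are the values from the example, and the result is rounded up to the next whole subject.

```python
import math
from statistics import NormalDist


def n_for_mean(sd: float, d: float, confidence: float = 0.95) -> float:
    """Unrounded sample size to estimate a mean within +/- d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z ** 2 * sd ** 2 / d ** 2


# Values from the example: SD = 2.1, d = 0.8.
raw = n_for_mean(sd=2.1, d=0.8)
print(round(raw, 1), math.ceil(raw))
```

Rounding up rather than to the nearest integer keeps the achieved precision at least as good as the one requested.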
Testing Hypothesis
Formulae & Problems
When mean is the parameter of our study
$n = (Z_{\alpha} + Z_{\beta})^{2} \times S^{2} \times 2 / d^{2}$
where
Zα = Z value for α error
Zβ = Z value for β error
S = common standard deviation of the two groups
d = clinically meaningful difference
Example: Quantitative
• An investigator compares the change in blood pressure due to placebo with that due to a drug. If the investigator is looking for a difference between groups of 5 mmHg, with a between-subject SD of 10 mmHg, how many patients should he recruit?
ANSWER
$n = (Z_{\alpha} + Z_{\beta})^{2} \times S^{2} \times 2 / d^{2}$
Zα = 1.96 at α = 5%
Zβ = 1.28 at β = 10% (power 90%)
S = 10
d = 5
$n = (1.96 + 1.28)^{2} \times 10^{2} \times 2 / 5^{2} \approx 84$
Hence, n = 85 per group (the unrounded Z values give 84.06, which is rounded up)
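The answer can be reproduced with a short Python sketch. The function name `n_two_means` is an assumption introduced here; a two-sided α and unrounded Z values are assumed, and the result is taken as the number of patients in each group.

```python
import math
from statistics import NormalDist


def n_two_means(sd: float, d: float, alpha: float = 0.05,
                beta: float = 0.10) -> float:
    """Unrounded per-group sample size to detect a difference d in means."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided alpha
    z_beta = std_normal.inv_cdf(1 - beta)        # power = 1 - beta
    return (z_alpha + z_beta) ** 2 * sd ** 2 * 2 / d ** 2


# Values from the example: S = 10 mmHg, d = 5 mmHg, alpha = 5%, beta = 10%.
raw = n_two_means(sd=10, d=5)
print(round(raw, 2), math.ceil(raw))
```

With the Z values rounded to 1.96 and 1.28 the formula gives about 83.98, so the unrounded Z values are what push the requirement to 85 per group.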
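When proportion is the parameter of our study
• Formula:
$n = Z_{\alpha}^{2}\,[P_1(1-P_1) + P_2(1-P_2)] / d^{2}$
where
n = sample size in each group
Zα = Z value for the chosen confidence level
P1 = estimated proportion (larger)
P2 = estimated proportion (smaller)
d = clinically meaningful difference, used here as the desired precision of the estimated difference P1 − P2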
EXAMPLE
• What sample size should be selected from each of two groups of people to estimate a risk difference to within 3 percentage points of the true difference at 95% confidence, when the anticipated P1 and P2 are 40% and 32% respectively?
ANSWER
Available information:
Zα = 1.96
P1 = 0.40
P2 = 0.32
d = 0.03
$n = (1.96)^{2}\,[0.40(1-0.40) + 0.32(1-0.32)] / (0.03)^{2} \approx 1953.2$
n = 1954 per group after rounding up
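The same calculation as a minimal Python sketch. The function name `n_risk_difference` is an assumption introduced here; proportions are entered as fractions and the result is the size of each group before and after rounding up.

```python
import math
from statistics import NormalDist


def n_risk_difference(p1: float, p2: float, d: float,
                      confidence: float = 0.95) -> float:
    """Unrounded per-group size to estimate p1 - p2 within +/- d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Variance of a difference of two independent proportions (per subject).
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return z ** 2 * variance_sum / d ** 2


# Values from the example: P1 = 0.40, P2 = 0.32, d = 0.03.
raw = n_risk_difference(p1=0.40, p2=0.32, d=0.03)
print(round(raw, 1), math.ceil(raw))
```

Note that this formula targets the precision of the estimated difference; it contains no Zβ term, so it does not by itself guarantee any particular power for a hypothesis test.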
SUMMARY:
Steps in Estimating Sample Size
• 1. Identify the major study variables.
• 2. Determine the types of estimates of study
variables, such as means or proportions.
• 3. Select the population or subgroups of interest
(based on study objectives and design).
• 4a. Indicate what you expect the population
value to be.
• 4b. Estimate the standard deviation of the
estimate.
• 5. Decide on a desired level of confidence
in the estimate (confidence interval).
• 6. Decide on a tolerable range of error in
the estimate (desired precision).
• 7. Compute sample size, based on study
assumptions.
COMPUTER SOFTWARE USED IN
ESTIMATION OF SAMPLE SIZE
REFERENCES
• Lwanga SK, Lemeshow S. Sample size determination in health studies: A practical manual. 1st ed. Geneva: World Health Organization; 1991.
• Zodpey SP, Ughade SN. Workshop manual: Workshop on Sample Size Considerations in Medical Research. Nagpur: MCIAPSM; 1999.
• Rao Vishweswara K. Biostatistics: A manual of statistical methods for use in health, nutrition and anthropology. 2nd ed. New Delhi: Jaypee Brothers; 2007.
• Chadha VK. Sample size determination in health studies. NTI Bulletin 2006; 42(3&4): 55-62.