
Samples and measurement for students

American University of Antigua College of Medicine
Epidemiology & Biostatistics 1
Preclinical Science
October 2023
Juan Manuel Lozano, MD, MSc
Division of Medical and Population Health Sciences Education and Research
Herbert Wertheim College of Medicine, Florida International University
Populations and Samples
Juan Manuel Lozano, MD, MSc
Division of Medical and Population Health Sciences Research
Department of Medical Education
Herbert Wertheim College of Medicine
Florida International University
Objectives
Goal RU1.EB1.5 Understand advanced concepts in populations and samples in health research: inference, generalizability, and errors in research.
Given a clinical or research scenario, graph or table, students should be able to:
RU1.EB1.5.1 Describe the main differences between probabilistic and nonprobabilistic sampling methods in descriptive and analytical studies.
RU1.EB1.5.2 Contrast the most common sampling methods used in epidemiological health research in terms of inference (generalizability).
RU1.EB1.5.3 Describe the main sources of error in the selection of samples: random error (chance) and systematic error (bias).
RU1.EB1.5.4 Correlate validity with eligibility criteria, sampling methods, and the generation of systematic and random errors.
Samples in health research
• All subjects in a population:
  o can rarely be studied.
  o do not need to be studied.
  o should not be studied.
• We can work on a sample of those subjects (from the population) by design and then generalize to the population (inference).
Target or reference population: group to which the researcher intends to generalize findings.
e.g., all children with learning disabilities in the US.
Accessible population: portion of the target population that has a chance of being selected.
e.g., children with learning disabilities in Miami's school system.
Sample: subset of the accessible population that is selected.
e.g., children with learning disabilities who attended school X and Y at a given date.
Characteristics of a good sample
1. Representative of the target population:
• More easily achieved using probabilistic sampling methods.
• Particularly critical for descriptive studies like surveys.
• Main sources of bias:
  o Volunteer bias.
  o Nonresponse.
2. Adequate size:
• Large enough to control random sources of error.
• Not too large (avoid wasting resources).
Steps for acquiring study subjects
1. Selection criteria: characteristics of the target population that serve the research question well.
• Inclusion: requirements to get subjects in.
• Exclusion: reasons to leave subjects out.
2. Sampling: selecting a subset of individuals from the accessible population.
Selection criteria

Inclusion: requirements to get in the study (relevant for research question, efficient for the study):
• Demographic (age, gender, etc.).
• Clinical (specific condition, severity, comorbidities, etc.).
• Geographic: place and setting.
• Temporal: time frame of the study.
Exclusion criteria: subset of those who fulfill the inclusion criteria that need to be excluded because:
• Need to control some confounders.
• High likelihood of being lost to follow-up.
• Inability to provide good data.
• High risk of potential side effects.
Table 3.1 Designing Selection Criteria for a Clinical Trial of Low Dose Testosterone to Enhance Libido in Menopause
Design feature: Inclusion criteria (be specific). Specifying populations relevant to the research question and efficient for study:
• Demographic characteristics. Example: White women 50 to 60 years old
• Clinical characteristics. Example: Good general health; has a sexual partner
• Geographic characteristics. Example: Patients attending clinic at the investigator's hospital
• Temporal characteristics. Example: Between January 1 and December 31 of specified year
Design feature: Exclusion criteria (be parsimonious). Specifying subsets of the population that will not be studied because of:
• High likelihood of being lost to follow-up. Example: Alcoholic or plan to move out of state
• Inability to provide good data. Example: Disoriented or have a language barrier*
• High risk of possible adverse effects. Example: History of myocardial infarction or stroke
*Alternatives to exclusion (when these subgroups are important to the research question) would be collecting nonverbal data or using bilingual staff and questionnaires.
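As an illustration only, the selection criteria in Table 3.1 can be written as a screening function: a candidate is eligible when every inclusion criterion is met and no exclusion criterion applies. The `Candidate` record and all of its field names are hypothetical, and the temporal criterion is left out for brevity.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Candidate:
    # Hypothetical record; field names are illustrative, not from the source.
    sex: str
    race: str
    age: int
    good_general_health: bool
    has_sexual_partner: bool
    attends_investigator_clinic: bool
    alcoholic: bool
    plans_to_move_out_of_state: bool
    disoriented_or_language_barrier: bool
    history_of_mi_or_stroke: bool


def meets_inclusion(c: Candidate) -> bool:
    """Inclusion: demographic, clinical, and geographic criteria from Table 3.1."""
    return (
        c.sex == "F"
        and c.race == "White"
        and 50 <= c.age <= 60
        and c.good_general_health
        and c.has_sexual_partner
        and c.attends_investigator_clinic
    )


def meets_exclusion(c: Candidate) -> bool:
    """Exclusion: any single reason is enough to leave the candidate out."""
    return (
        c.alcoholic
        or c.plans_to_move_out_of_state
        or c.disoriented_or_language_barrier
        or c.history_of_mi_or_stroke
    )


def is_eligible(c: Candidate) -> bool:
    return meets_inclusion(c) and not meets_exclusion(c)


candidate = Candidate(
    sex="F", race="White", age=55,
    good_general_health=True, has_sexual_partner=True,
    attends_investigator_clinic=True, alcoholic=False,
    plans_to_move_out_of_state=False,
    disoriented_or_language_barrier=False,
    history_of_mi_or_stroke=False,
)
print(is_eligible(candidate))                                         # True
print(is_eligible(replace(candidate, history_of_mi_or_stroke=True)))  # False
```

Exclusion is checked only after inclusion, mirroring the idea that exclusion criteria apply to a subset of those who already fulfill the inclusion criteria.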

Types of samples (some)
Probability (random):
• Driven by probabilistic methods (like a raffle).
• Participants have a known (but not necessarily equal) probability of being selected.
Non-Probability:
• Driven by nonprobabilistic methods (“cherry picking”).

Probability (random):
• Simple
• Systematic
• Stratified
  o Proportional
  o Disproportional
• Cluster
• Multistage
Non-Probability:
• Convenience
• Purposive
• Quota
• Snowball
Simple random sampling



• Each member of the population has an equal probability of being chosen.
• Requires having a list of all members of the population.
• Random selections can be made:
  o Drawing slips of paper from a bag.
  o Using a table of random numbers.
  o Using a computer.
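A minimal sketch of the "using a computer" option, assuming the list of all members (the sampling frame) is already available. The function name and the numbered frame of 12 individuals are illustrative assumptions.

```python
import random


def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n members without replacement; every member is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(sampling_frame, n)


# Hypothetical sampling frame: 12 individuals numbered 1 to 12.
frame = list(range(1, 13))
chosen = simple_random_sample(frame, 4, seed=1)
print(sorted(chosen))
```

Without the complete list there is nothing to draw from, which is why this method requires a full sampling frame.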

Systematic random sampling


• Requires having a list of all members of the population.
• Steps:
1. Obtain the sampling interval by dividing the number in the population by the number of individuals to be selected (figure: 12 / 4 = every 3rd individual).
2. Select the starting point at random (figure: among individuals 1 to 3; No. 2).
3. Use the sampling interval to choose consecutively from the population.
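The three steps can be sketched as follows, using the figure's numbers (12 individuals, 4 to be selected). The function name is an assumption, and the interval is rounded down when the population size is not an exact multiple of the sample size.

```python
import random


def systematic_sample(sampling_frame, n, seed=None):
    """Systematic random sample: random start, then every k-th member."""
    if not 0 < n <= len(sampling_frame):
        raise ValueError("n must be between 1 and the population size")
    interval = len(sampling_frame) // n   # step 1: sampling interval (12 // 4 = 3)
    rng = random.Random(seed)
    start = rng.randrange(interval)       # step 2: random start within the first interval
    # step 3: take every interval-th member from the starting point
    return [sampling_frame[start + i * interval] for i in range(n)]


frame = list(range(1, 13))                # individuals No. 1 to No. 12
selected = systematic_sample(frame, 4, seed=0)
print(selected)
```

If the random start lands on individual No. 2, as in the figure, the sample is individuals 2, 5, 8, and 11.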
Stratified random sampling
[Figure: a population divided into strata, sampled proportionally and disproportionally]
1. Divide the population into subgroups (strata) according to some characteristics (sex, race).
2. Take a random sample from each stratum, which can be:
• Proportional (strata distributed as in the population), or
• Disproportional (overrepresenting some strata in the sample)
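A small sketch of both allocations, assuming a hypothetical population of 60 women and 40 men stratified by sex. The helper names and the population itself are illustrative, and the rounding used for proportional sizes may not always add up exactly to the requested total.

```python
import random
from collections import defaultdict


def split_into_strata(units, stratum_of):
    """Step 1: group the population into strata."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    return strata


def proportional_sizes(strata, n):
    """Allocate n across strata in proportion to their size (rounded)."""
    total = sum(len(members) for members in strata.values())
    return {name: round(n * len(members) / total) for name, members in strata.items()}


def stratified_sample(strata, sizes, seed=None):
    """Step 2: simple random sample within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, sizes[name]) for name, members in strata.items()}


# Hypothetical population: 60 women and 40 men.
population = [("F", i) for i in range(60)] + [("M", i) for i in range(40)]
strata = split_into_strata(population, stratum_of=lambda unit: unit[0])

proportional = stratified_sample(strata, proportional_sizes(strata, 10), seed=0)
disproportional = stratified_sample(strata, {"F": 5, "M": 5}, seed=0)

print({k: len(v) for k, v in proportional.items()})     # {'F': 6, 'M': 4}
print({k: len(v) for k, v in disproportional.items()})  # {'F': 5, 'M': 5}
```

In the disproportional sample men are overrepresented (50% of the sample versus 40% of the population), so population-level estimates would need weighting.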

Cluster random sampling


• The population is divided into natural groups (clusters) of individuals (states, cities, hospitals, blocks, etc.).
• Steps:
1. Obtain the list of clusters.
2. Select some clusters at random.
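A minimal sketch of the two steps, assuming a one-stage design in which everyone in a selected cluster is included (the slide does not say what happens inside the chosen clusters). The hospital names and patient labels are hypothetical.

```python
import random


def cluster_sample(clusters, k, seed=None):
    """Select k whole clusters at random; randomization is on clusters, not individuals."""
    rng = random.Random(seed)
    cluster_names = sorted(clusters)        # step 1: the list of clusters
    chosen = rng.sample(cluster_names, k)   # step 2: select some clusters at random
    # Assumption: all individuals in each chosen cluster enter the sample.
    return {name: clusters[name] for name in chosen}


# Hypothetical clusters: five hospitals with their patients.
hospitals = {
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2"],
    "C": ["c1", "c2", "c3", "c4"],
    "D": ["d1"],
    "E": ["e1", "e2"],
}
sample = cluster_sample(hospitals, 2, seed=0)
print(sorted(sample))
```

Only the list of clusters is needed up front, not a list of every individual in the population.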
Convenience / nonprobability sampling



• Subjects are chosen because they meet the selection criteria and are easily accessible to the investigator.
• The most common form of nonprobability sample.
• Some variations:
1. Consecutive sample: recruits all participants who meet the selection criteria as they become available.
2. Volunteer / self-selection.
Youth Risk Behavior Surveillance System (YRBSS)
• Surveys that track behaviors that can lead to poor health in students in grades 9 to 12.
• Administered every other year.
• Some of the health-related behaviors and experiences monitored are:
  o Student demographics: sex, sexual identity, race and ethnicity, and grade.
  o Youth health behaviors and conditions: sexual behaviors, injury and violence, bullying, diet and physical activity, obesity, and mental health, including suicide.
  o Substance use behaviors: electronic vapor product and tobacco product use, alcohol use, and other drug use.
  o Student experiences: parental monitoring, school connectedness, unstable housing, and exposure to community violence.
YRBSS Sampling, Weighting, and Response Rates
National Survey
• Target population: all public and private school students in grades 9–12 in the 50 states and the District of Columbia.
• Three-stage cluster sample design to obtain a nationally representative sample of U.S. students in grades 9–12:
1. Primary sampling units (PSUs) consisting of large counties or groups of smaller, adjacent counties.
2. Schools (both public and private) are selected from PSUs with probability proportional to school enrollment size.
3. Classes (one or two) are selected randomly in each chosen school and in each of grades 9–12.
• Average sample size of 14,517 and average school, student, and overall response rates of 78%, 86%, and 71%, respectively.
Activity 1. Selection and sampling in an RCT
In the report of an RCT, identify:
1. The selection (inclusion, exclusion) criteria.
2. The sampling method.
Participants
Eligible patients were English speaking adults 18 years or older who were awaiting elective upper abdominal surgery that required general anaesthesia, a minimum overnight hospital stay, and a 5 cm or longer incision above, or extending above, the umbilicus, and who attended an outpatient preadmission assessment clinic. Participants were recruited at three tertiary public hospitals in Australia and New Zealand from June 2013 to August 2015. We excluded patients if they were current hospital inpatients, required organ transplants, required abdominal hernia repairs, were unable to ambulate for more than one minute, and were unable to participate in a single physiotherapy preoperative session within six weeks of surgery. Site investigators screened preadmission clinics daily and invited eligible patients to participate in the trial. Written informed consent was gained before randomisation.
Results
From June 2013 to August 2015, we assessed 504 patients listed for elective upper abdominal surgery for eligibility. Of these, 441 met the inclusion criteria and were randomly assigned to receive either an information booklet (n=219; control) or preoperative physiotherapy (n=222; intervention). Nine (2%) patients were withdrawn from the trial, leaving 432 (98%) included for primary analysis (fig 1).
Fig 1. Flow of patients through trial.
Inclusion criteria and how they apply in this trial
a) Demographic characteristics:
• Adults aged ≥ 18 y.
• English speaking.
b) Clinical characteristics:
• Within 6 w of elective major upper abdominal surgery.
• Minimum overnight hospital stay.
• Incision ≥ 5 cm above / extending above umbilicus.
• Attended outpatient preadmission assessment clinic.
c) Geographical characteristics: 3 tertiary public hospitals in Australia / New Zealand.
d) Temporal characteristics: June 2013 to August 2015.
Exclusion criteria and how they apply in this trial
e) Control for confounders:
• Current hospital inpatients.
• Required organ transplant or abdominal hernia repair.
• Unable to ambulate > 1 min.
f) High risk of being lost to follow-up:
• Unable to participate in a single physiotherapy session within 6 w of surgery.
g) Inability to provide good data:
• Unable to understand English.
• Cognitive impairment.
h) High risk of side effects: NA
The investigators used the inclusion and exclusion
criteria to select the sample of patients for the study.
Which of the following sampling methods was used in
this trial?
A. Simple random.
B. Systematic random.
C. Stratified random.
D. Cluster random.
E. Non-random (convenience).
You have up to 12 minutes
Target population: patients aged ≥ 18 y within 6 w of elective major upper abdominal surgery.
Accessible population: eligible patients in three hospitals in Australia / New Zealand from June 2013 to August 2015.
Intended sample and actual sample (Fig 1. Flow of patients through trial).
Major problems with samples
1. Systematic error (bias): the sample does not represent the target population:
• Inadequate selection criteria.
• Low response rate.
2. Random error: the sample size does not provide good precision in the estimate obtained in the study:
• Insufficient number of participants.
• Random error is unavoidable, but can be reduced to adequate levels.
• Large samples do not solve issues with bias.
Common objectives in health research
1. To describe (descriptive studies):
• The frequency / distribution of a problem or finding
Test of hypotheses? No
Sampling? Most often probabilistic
2. To compare (analytical studies):
• Causality: exposed vs. non exposed
• Interventions: treated vs. untreated
• Diagnosis: a test vs. a “gold standard”
• Prognosis: exposed vs. non exposed
Test of hypotheses? Yes (always)
Sampling? Most often convenience
Take home messages



• Most health research is based on the study of samples.
• Advantage: efficiency: examining a subset of subjects reduces cost, time and effort.
• Disadvantage: error may be introduced if the sample:
  o Does not represent the target population (findings may not generalize).
  o Is not large enough (poor precision on estimates).
• The appropriate sample depends on the research question:
  o Descriptive: Random sample is usually more appropriate.
  o Analytical: Convenience sample is more often used.
Take home messages


• Nonprobability (convenience) samples are used more often in clinical research because of the difficulties in obtaining true access to populations; patients are recruited as they become available, making true random selection impossible.
• Researchers must be aware of the limitations inherent to any sampling method and should try to incorporate elements of random sampling whenever possible.
“Sampling, like life, is a compromise.”
Measurement
Objectives
Goal RU1.EB1.6 Identify the main conceptual elements of measurement: variables and the relationship with sources of error in health research.
Given a clinical or research scenario, graph or table, students should be able to:
RU1.EB1.6.1 Describe the roles of descriptive, independent, dependent, and confounding variables in health research.
RU1.EB1.6.2 Identify the type of variables in a given study.
RU1.EB1.6.3 Correlate the four different measurement scales (NOIR: nominal, ordinal, interval, ratio) with health research variables.
RU1.EB1.6.4 Determine the measurement scales used for the variables in a study after reading a methods section in published research.
RU1.EB1.6.5 Define internal validity, external validity, and reliability as they relate to measurement of variables in research.
Variable

• “Anything that is measured or manipulated in a study” (Norman GR. Statistics. BC Decker Inc., 2003)
• “Any attribute, phenomenon or event that can have different values. Any amount that varies” (Last JM. A Dictionary of Epidemiology. Oxford University Press, 2001)
• “A concept, or factor, that can have more than one value” (Portney LG. Foundations of Clinical Research. Prentice-Hall, Inc., 2000)
How can we classify research variables?
• By the role that they play in the research question
• By the measurement scale
Analytical research = assessing associations
Association: the quantity of a variable varies with the quantity of one or more other variables.
[Figure: two panels, "Positive or Direct" and "Negative or Inverse" association]
Role of variables in analytical research
• Independent
• Dependent
• Confounding
“The CRASH trial (corticosteroid randomisation after significant head injury) is a large, international, randomised placebo-controlled trial of the effect of early administration of 48 h infusion of methylprednisolone on risk of death and disability after head injury.”
Independent variable
Also called:
• Explanatory
• Exposure
• Experimental
• Manipulated
• Predictor
• Causal
Dependent variable
Also called:
• Outcome
• Endpoint
• Response
• Explained
Confounding variable
Also called:
• Confounder
• Extraneous
• Intervening
• Controlled
Activity 2. Variables in an RCT
In the report of an RCT, identify the following variables:
1. Independent or exposure.
2. Main dependent or main outcome(s).
3. Some confounders.
OBJECTIVE
To assess the efficacy of a single preoperative physiotherapy session to reduce postoperative pulmonary complications (PPCs) after upper abdominal surgery.
INTERVENTIONS
Preoperatively, participants received an information booklet (control) or an additional 30-minute physiotherapy education and breathing exercise training session (intervention). Education focused on PPCs and their prevention through early ambulation and self-directed breathing exercises to be initiated immediately on regaining consciousness after surgery. Postoperatively, all participants received standardised early ambulation, and no additional respiratory physiotherapy was provided.
MAIN OUTCOME MEASURES
The primary outcome was a PPC within 14 postoperative hospital days assessed daily using the Melbourne group score. Secondary outcomes were hospital acquired pneumonia, length of hospital stay, utilisation of intensive care unit services, and hospital costs. Patient reported health related quality of life, physical function, and post-discharge complications were measured at six weeks, and all cause mortality was measured to 12 months.
Role of variables in the trial
Independent: preoperative management, Booklet + Education Session (E) vs Booklet only (C)
Dependent: having a PPC within 14 days (Yes or No)
Confounders:
• Age
• Gender
• BMI
• ASA physical health status
• Comorbidities
• Smoking
• Type of surgery
• Etc.
Up to 6 minutes (≈ 2 min / question)
How can we classify research variables?
• By the role that they play in the research question
• By the measurement scale
One mnemonic
N: Nominal
O: Ordinal
I: Interval
R: Ratio
Scale: Qualitative (categorical)
• Nominal: values fit into categories with no hierarchy. Examples: sex, blood type, type of cancer.
• Ordinal: values fit into categories with implicit hierarchy. Examples: cancer staging, Apgar or Glasgow Coma scores.
Scale: Quantitative
• Interval: zero is arbitrary (i.e., it is not “0”), can have fractions. Examples: temperature (°C, °F), time.
• Ratio: zero is not arbitrary (it is “0”), can have fractions. Examples: age, weight, blood pressure.
Scale: Qualitative (categorical)
• Nominal: values fit into categories with no hierarchy. Examples: sex, blood type, type of cancer.
• Ordinal: values fit into categories with implicit hierarchy. Examples: cancer staging, Apgar or Glasgow Coma scores.
Scale: Quantitative
• Discrete: measured in integers, limited number. Examples: number of pregnancies, number of treatments.
• Interval: zero is arbitrary (i.e., it is not “0”), can have fractions. Examples: temperature (°C, °F), time.
• Ratio: zero is not arbitrary (it is “0”), can have fractions. Examples: age, weight, blood pressure.
And why is this important?
Because it allows us to:
1. Decide how to collect the data
2. Decide how to analyze the data
Birth weight in 100 neonates admitted to NICU in five hospitals
Case   Hospital   Gender   Birth weight (g)   LBW (<2500 g)
1      A          Fem      2800               No
2      B          Mal      3200               No
3      C          Mal      2300               Yes
4      D          Fem      2950               No
5      E          Fem      1800               Yes
6      E          Mal      2700               No
7      D          Mal      1500               Yes
...
100    B          Fem      3800               No
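To illustrate how the measurement scale drives the analysis, the sketch below summarizes the same measurement two ways: birth weight as a ratio variable (mean and standard deviation) and low birth weight as a nominal yes/no variable (count and percentage). It uses only the eight rows displayed in the table, not the full 100 neonates, so the numbers are purely illustrative.

```python
from statistics import mean, stdev

# The eight rows shown in the table (cases 1-7 and case 100).
birth_weights_g = [2800, 3200, 2300, 2950, 1800, 2700, 1500, 3800]

# Ratio scale: summarize with mean and standard deviation.
mean_weight = mean(birth_weights_g)
sd_weight = stdev(birth_weights_g)

# Nominal scale: derive LBW (< 2500 g) and summarize with a proportion.
low_birth_weight = [weight < 2500 for weight in birth_weights_g]
lbw_count = sum(low_birth_weight)
lbw_percent = 100 * lbw_count / len(low_birth_weight)

print(f"Mean birth weight: {mean_weight:.2f} g (SD {sd_weight:.0f} g)")
print(f"LBW: {lbw_count} of {len(low_birth_weight)} ({lbw_percent:.1f}%)")
```

Converting the continuous weight into a yes/no category discards information, and it also changes which descriptive measures and tests are appropriate.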
Statistical tests (just a few)
Measurement scale: Categorical (nominal, ordinal)
• Descriptive: proportions, percentages
• Comparative: Chi square, Fisher, Mann-Whitney U, Mantel-Haenszel, survival analysis, RR, OR, etc.
Measurement scale: Continuous (interval, ratio)
• Descriptive: mean / standard deviation, median / range, mode
• Comparative: t-test, analysis of variance (ANOVA), linear regression
Activity 3. Variables in an RCT
In the report of an RCT, identify the measurement scale (nominal, ordinal, interval, ratio) for variables:
Table 1 | Baseline demographic and clinical characteristics of the study population. Values are numbers (percentages) unless stated otherwise
Characteristics                    Preoperative physiotherapy (n=218)   Information booklet (n=214)
Median (IQR) age (years)           63.4 (51.5-71.9)                     67.5 (56.3-75.3)
Mean (SD) body mass index          28.5 (5.9)                           (6.2)
Body mass index >35                25 (12)                              30 (14)
ASA physical health status:
  1-2                              150 (69)                             (58)
  3-4                              67 (31)                              90 (42)
Recent chest infection             12 (6)                               5 (2)
Smoking status:
  Never smoked                     76 (35)                              71 (33)
  Former smoker                    93 (43)                              86 (40)
  Current smoker                   49 (21)                              57 (27)
Mean (SD) average pack year        18.1 (23.7)                          20.6 (24.7)
Comorbidities:
  Respiratory disease              42 (19)                              55 (26)
  Diabetes mellitus                33 (15)                              41 (19)
  Cancer                           148 (68)                             148 (69)
  Cardiac disease                  26 (12)                              34 (16)
Surgical category and procedure:
  Colorectal                       108 (50)                             101 (47)
  Hepatobiliary / upper GI         49 (22)                              59 (28)
  Renal / urology / other          61 (28)                              54 (24)
Scale: Qualitative (categorical)
• Nominal: values fit into categories with no natural order. Examples:
  o BMI > 35 (Y, N)
  o Recent chest infection (Y, N)
  o Comorbidities (respiratory, DM, cancer, heart)
  o Surgical category & procedure (colorectal, hepatobiliary, renal)
• Ordinal: values fit into categories with natural order. Examples:
  o ASA physical health status (1-2, 3-4)
  o Smoking status (never, former, current)
Scale: Quantitative
• Interval: zero is arbitrary (it is not “0”), can have decimals and negative values. Examples:
  o None that we can identify in Table 1
• Ratio: zero is not arbitrary (it is “0”), can have decimals but not negative values. Examples:
  o Age
  o BMI
  o (Tobacco) Pack years
Take home messages


• Variable: anything that is measured or controlled for in clinical measurements or in health research.
• Role of variables in comparative research:
  o Independent (exposure or intervention)
  o Dependent (outcome, endpoint)
  o Confounders
• Measurement scales - NOIR:
  o Determine how to collect and analyze data
Validity

• “How well the measurement represents the phenomenon of interest” (Hulley SB. Designing Clinical Research. Lippincott Williams & Wilkins, 2001)
• “The results of a measurement correspond to the true state of the phenomenon being measured” (Fletcher RH. Clinical Epidemiology. Lippincott Williams & Wilkins, 2005)
• “The value of the parameter that is the object of measurement is estimated with little error” (Rothman KJ. Modern Epidemiology. Wolters Kluwer, 2008)
What is validity?
Validity:
• A reflection of the truth
• When the measurement tool or the research report (or study) reflects the truth
Reliability (a related but different concept):
• Consistency in reporting in repeated measurements (several observations, different observers)
• Reproducibility, although not necessarily the truth
Examples of health measurements

Clinical:
• Body weight, measured with a scale
• Severity of head trauma, measured with the Glasgow Coma Score
Health research:
• Proportion of FIU community members that endorse the ban for tobacco in the university (measured in a survey)
• Risk of death after head injury when steroids are used (measured in a randomized controlled trial)
Types of validity in studies



• Internal: the study measured correctly what it was supposed to measure in the study sample (the study subjects)
• External: the findings of the study can be generalized to subjects who did not participate in the study
• Consequence: internal validity is required before considering external validity
Internal validity

• “Degree to which the results of the study are correct for the sample of participants being studied”. Fletcher et al, Clinical Epidemiology
• Depends largely on the procedures of the study and how rigorously it is performed.
• It is not a "yes or no" concept. Instead, we consider how confident we can be with the findings of a study, based on whether it avoids errors that may make the findings questionable.
External validity

• “Degree to which the results of the study hold true in other settings. Another term is generalizability”. Fletcher et al, Clinical Epidemiology
• Depends largely on how similar individuals in other settings are to those included in the study.
• Again, it is not a "yes or no" concept. Instead, we consider how confident we can be that the findings of a study apply to others.
Activity 4. Describing a health situation



• Goal: to measure the frequency (%) of Covid-19 vaccination at FIU.
• Target population: FIU community of ≈ 55,000 people comprising students, employees and faculty members.
• Study: observational - descriptive (survey).
Activity 4. Describing a health situation



• Goal: to measure the frequency (%) of Covid-19 vaccination at FIU.
• Participants: stratified random sample of 100 HWCOM members (students, employees, faculty).
• Methods:
  o Observational descriptive (survey).
  o Self-administered questionnaire completed anonymously online.
  o The questionnaire has been validated and used in many similar studies.
  o 100% response rate.
• Main finding: vaccination frequency 85% (95% confidence interval: 77% to 91%).
• Conclusion: 85% of FIU community members are vaccinated.
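The interval reported with the main finding can be approximated as shown below. The activity does not state which method produced it; the Wilson score interval used here is an assumption, chosen because it is a standard method for a single proportion. With 85 vaccinated out of 100, it gives limits that round to 77% and 91%.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for 95% confidence)."""
    p = successes / n
    denominator = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denominator
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denominator
    return center - half_width, center + half_width


# 85 of the 100 sampled HWCOM members reported being vaccinated.
low, high = wilson_interval(85, 100)
print(f"{low:.0%} to {high:.0%}")
```

The interval quantifies random error only. It says nothing about whether a sample of HWCOM members represents the whole FIU community, which is a question of bias and external validity.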
4A. How do you feel about the internal validity of the study?
1. I have very serious concerns about it.
2. I have some concerns about it.
3. I don’t know how I feel about it.
4. I feel somewhat confident about it.
5. I am very confident about it.
4B. How does your team feel about the external validity of the study (i.e., do the findings apply to FIU overall)?
1. I have very serious concerns about it.
2. I have some concerns about it.
3. I don’t know how I feel about it.
4. I feel somewhat confident about it.
5. I am very confident about it.
4C. Studies that have serious methodological issues and therefore low internal validity can, however, have good external validity.
1. True.
2. False.
3. I don’t know.
Up to 9 minutes (≈ 3 min / question)
Describing a situation



• Goal: to measure the frequency of opinions (favorable or oppositional) towards the FIU “Free Tobacco Smoke” policy
• Population: FIU community (≈55,000 people)
• Study: observational descriptive
Take home messages

• Validity in research:
  o Internal: are the findings correct for the sample of participants studied?
  o External: do the results of the study hold true in other settings?
• Sources of error:
  o Bias: findings deviate systematically from the truth.
  o Random: findings are distributed above and below the truth.
  o Not mutually exclusive; rather, their effects are cumulative.
  o More about both in future sessions.
Random error (chance)


• Health research is almost always conducted studying samples
• Even appropriate sampling techniques entail the chance of random error
Describing a situation



• Goal: to measure the frequency of opinions (favorable or oppositional) towards the FIU “Free Tobacco Smoke” policy
• Population: FIU community (≈55,000 people)
• Study: observational descriptive
Describing a situation
Survey in a random sample (n = 500) from the whole FIU community
[Figure: study sample drawn from the FIU community; 0.65 (≈65%) endorsement]
Population and multiple samples
Population % = ?
Sample 1 = 65%
Sample 2 = 60%
Sample 3 = 70%
Sample 4 = 72%
Sample 5 = 61%
Sample n = n%
Why?
Random (albeit representative) samples of equal size produce different estimates of the population parameter
What is the cause of this variation?
• Each sample is different (i.e., composed of different subjects), so each gives different results
• Chance determines the composition of each sample
• Therefore, chance is the cause of the variability in the estimates from different samples
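This sample-to-sample variation can be reproduced with a short simulation. It is a sketch under stated assumptions: a community of 55,000 people in which the true endorsement is set to 65% (in a real study this value is unknown), with five simple random samples of 500 drawn from it.

```python
import random


def simulate_estimates(pop_size=55_000, true_prop=0.65, n=500, n_samples=5, seed=0):
    """Draw repeated random samples and return the estimated proportion from each."""
    rng = random.Random(seed)
    n_endorse = round(pop_size * true_prop)
    # 1 = endorses the policy, 0 = does not (hypothetical population).
    population = [1] * n_endorse + [0] * (pop_size - n_endorse)
    return [sum(rng.sample(population, n)) / n for _ in range(n_samples)]


estimates = simulate_estimates()
for number, estimate in enumerate(estimates, start=1):
    print(f"Sample {number} = {estimate:.0%}")
```

Every sample comes from the same population by the same method, so the differences between the estimates are due to chance alone.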
There is (at least in theory) one way to
eliminate random error; what is it?
Is it possible to eliminate random error?


• Very rarely: including all subjects from the population in the sample
• Thus, random error is almost always present in clinical and health research
If so, what can be done?
• Can be minimized by design (sample size calculation)
• Can be measured by statistical techniques (by calculating p values and confidence intervals)
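A brief sketch of why sample size calculation helps: the approximate 95% margin of error of an estimated proportion shrinks as the sample grows. The formula below is the usual normal approximation, which is an assumption here, and 65% is an illustrative proportion rather than a known value.

```python
from math import sqrt


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from n subjects."""
    return z * sqrt(p * (1 - p) / n)


for n in (100, 500, 2000):
    print(f"n = {n:>4}: 65% ± {margin_of_error(0.65, n):.1%}")
```

Under this approximation the sample size must be quadrupled to halve the margin of error, so random error can be reduced to an adequate level but is never eliminated by sampling alone.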
Take home message: potential
explanations for research findings
1. The study findings do not represent the truth:
• Bias: systematic errors lead to study findings that do not represent the truth
  o Control measures: good design and conduct
• Chance: random error leads to study findings that do not represent the truth
  o Control measures: random error cannot be eliminated; it can be minimized and quantified
2. The study findings represent the truth