Introduction

• Populations and Samples
– Population - Set of all individuals or units of
interest to investigators. Sometimes we may
refer to a population of measurements as
opposed to individuals or units.
– Sample - Subset of a population that is
observed and measured by investigators.
Quantitative and Qualitative Variables
• Quantitative variables take on numeric values. They can
be further classified as:
– Continuous variables can take on values along an interval
(e.g. blood pressure, temperature)
– Discrete variables can take on distinct values with “breaks”
(e.g. Woman’s parity, Number of prior cardiac events)
• Qualitative variables take on various categories. They
can be classified as:
– Nominal variables take on values with no inherent ordering
(e.g. Presence/Absence of parasite, gender, race)
– Ordinal variables take on categories that can be ordered (e.g.
Prognosis, Attitude toward a proposal)
Dependent and Independent Variables
• Dependent variables are outcomes of interest to
investigators. Also referred to as Responses or
Endpoints
• Independent variables are Factors that are often
hypothesized to affect the outcomes (levels of dependent
variables). Also referred to as Predictor or Explanatory
Variables
• Research question: Does the I.V. affect the D.V.?
Example - Clinical Trials of Cialis
• Clinical trials conducted worldwide to study efficacy
and safety of Cialis (Tadalafil) for ED
• Patients randomized to Placebo, 10mg, and 20mg
• Co-Primary outcomes:
– Change from baseline in erectile function domain of the
International Index of Erectile Function (Numeric)
– Response to: “Were you able to insert your P… into your
partner’s V…?” (Nominal: Yes/No)
– Response to: “Did your erection last long enough for you to
have successful intercourse?” (Nominal: Yes/No)
Source: Carson, et al. (2004).
Example - Clinical Trials of Cialis
• Population: All adult males suffering from erectile
dysfunction
• Sample: 2102 men with mild-to-severe ED in 11
randomized clinical trials
• Dependent Variable(s): Co-primary outcomes
listed on previous slide
• Independent Variable: Cialis Dose: (0, 10, 20 mg)
• Research Questions: Does use of Cialis improve
erectile function?
Parameters and Statistics
• Parameters: Numerical descriptive measures for
Populations:
– $\mu$ - Mean (average) of a numeric variable
– $\sigma^2$ - Variance
– $\sigma$ - Standard deviation of a numeric variable
– CV - Coefficient of variation of a numeric variable
– $\pi$ - Proportion of population with a nominal characteristic
Parameters and Statistics
• Statistics: Numerical descriptive measures for Samples
– Sample Mean (of a sample of size n):
$$\bar{y} = \hat{\mu} = \frac{\sum y}{n}$$
– Sample Variance ($s^2$) and standard deviation ($s$):
$$s^2 = \frac{\sum (y - \bar{y})^2}{n - 1} \qquad s = \sqrt{s^2}$$
– Sample coefficient of variation (cv):
$$cv = \frac{s}{\bar{y}} \times 100\%$$
– Sample Proportion with a characteristic: $\hat{\pi}$
Example - Carbonate of Bismuth
• Samples of Carbonate of Bismuth from a sample of 6
London manufacturing chemists
• Measurements: Quantity of Teroxide (Theoretically should
be 88.30 per 100 parts)
• Measured levels: 89, 88.5, 86.16, 87.66, 87.66, 86
$$\bar{y} = \frac{89 + 88.5 + 86.16 + 87.66 + 87.66 + 86}{6} = 87.50$$
$$s^2 = \frac{(89 - 87.5)^2 + (88.5 - 87.5)^2 + (86.16 - 87.5)^2 + (87.66 - 87.5)^2 + (87.66 - 87.5)^2 + (86 - 87.5)^2}{6 - 1} = 1.47$$
$$s = \sqrt{1.47} = 1.21$$
$$cv = \frac{1.21}{87.5} \times 100\% = 1.39\%$$
Source: Umney (1864)
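The hand calculation for the Carbonate of Bismuth data can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the original example.

```python
import statistics

# Teroxide measurements (per 100 parts) from the six chemists
levels = [89, 88.5, 86.16, 87.66, 87.66, 86]

y_bar = statistics.mean(levels)      # sample mean
s2 = statistics.variance(levels)     # sample variance (divides by n - 1)
s = statistics.stdev(levels)         # sample standard deviation
cv = 100 * s / y_bar                 # sample coefficient of variation, in percent

print(f"mean = {y_bar:.2f}, s^2 = {s2:.2f}, s = {s:.2f}, cv = {cv:.2f}%")
```

Note that `statistics.variance` uses the n - 1 divisor, matching the sample variance formula; `statistics.pvariance` would divide by n instead.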
Example - Clinical Trials of Cialis
• Among the 638 patients receiving placebo
(dose=0), 198 responded “Yes” to “Did your
erection last long enough for you to have successful
intercourse?”
• Of 321 receiving 10mg dose, 186 replied “Yes”
• Of 1143 receiving 20mg dose, 777 replied “Yes”
$$\hat{\pi}_0 = \frac{198}{638} = 0.31 \qquad \hat{\pi}_{10} = \frac{186}{321} = 0.58 \qquad \hat{\pi}_{20} = \frac{777}{1143} = 0.68$$
Note that proportions are often reported as percentages (number with
characteristic per 100 exposed) or as rates per 10,000 such as mortality
rates for rare causes
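The three sample proportions can be reproduced in a few lines. This is a small sketch; the dictionary layout and names are assumptions made for illustration, while the counts come from the slide.

```python
# (number answering "Yes", number of patients) for each dose group, in mg
responses = {0: (198, 638), 10: (186, 321), 20: (777, 1143)}

# Sample proportion for each dose group
proportions = {dose: yes / n for dose, (yes, n) in responses.items()}

for dose, p_hat in proportions.items():
    # Report as a proportion and as a percentage (per 100 patients)
    print(f"dose {dose:>2} mg: {p_hat:.2f} ({100 * p_hat:.0f}%)")
```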
Graphical Techniques
• Pictures are worth a bunch of words and
computer packages make graphing easy!
– Histograms show the number or percent by
category or within ranges of values
– Pie charts show proportionally the number or
percent by category or within ranges of values
– Scatterplots plot a dependent variable on the
vertical axis versus an independent variable
with each subject being a point on the chart
Histogram of ED Severity Level
• In the Cialis trial, the baseline severity level was
reported for 2099 patients on an ordinal scale:
1=Normal, 2=Mild, 3=Moderate, 4=Severe
[Figure: histogram of SEVERITY (1.0 to 4.0), cases weighted by PATIENTS. Mean = 2.9, Std. Dev = .92, N = 2099]
Pie Chart of ED Severity Level
[Figure: pie chart with slices Normal, Mild, Moderate, Severe, cases weighted by PATIENTS]
Histogram of Disposition by Dose (Count=%)
[Figure: bar charts of Count by disposition code (dispose), one panel per dose: Placebo, 10mg, 20mg. Bars show counts]
Disposition: 1=Completed, 2=Adverse event, 3=Lack of Efficacy,
4=Lost to follow-up, 5=Patient Decision, 6=Protocol Violation, 7=Others
Scatterplot of Math Score vs LSD Level
• Response - Mean Math score for 7 subjects
• Predictor - Mean LSD Concentration
[Figure: bivariate scattergram of mathscore (vertical axis) versus lsdconc (horizontal axis)]

Conc    Score
1.17    78.93
2.97    58.20
3.26    67.47
4.69    37.47
5.83    45.65
6.00    32.92
6.41    29.97
Source: Wagner and Bing (1968)
Basic Probability
• Probability measures the likelihood or chances of particular
outcomes (or events) of a random experiment or observation
• Let A and B be two events, with probabilities P(A) & P(B):
– Intersection - Event that both A and B occur (Notation: $A \cap B$)
– Union - Event that either A and/or B occur (Notation: $A \cup B$)
– Complement - Event that the event does not occur (Notation: $\bar{A}$)
• Probability Rules:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad \text{(= P(A occurs given B has occurred))}$$
$$P(A \cap B) = P(A) P(B \mid A) = P(B) P(A \mid B)$$
$$P(\bar{A}) = 1 - P(A)$$
Example - High Cholesterol By Age and Sex
• WHO MONICA Survey of 50000 Adults
• Proportions by Age, Gender, and Cholesterol:
        Male                   Female
Age     High Chol  Low Chol    High Chol  Low Chol    Total
25-29   0.0066     0.0486      0.0046     0.0527      0.1125
30-34   0.0111     0.0542      0.0056     0.0563      0.1272
35-39   0.0167     0.0476      0.0079     0.0582      0.1304
40-44   0.0196     0.0457      0.0100     0.0526      0.1279
45-49   0.0207     0.0440      0.0172     0.0490      0.1309
50-54   0.0222     0.0430      0.0256     0.0400      0.1308
55-59   0.0229     0.0425      0.0303     0.0342      0.1299
60-64   0.0185     0.0344      0.0304     0.0280      0.1113
Total   0.1383     0.3600      0.1316     0.3710      1
Source: Gostynski, et al (2004)
Example - High Cholesterol By Age and Sex
• Probability a Randomly Selected Subject is Male:
$$P(M) = P(M \,\&\, HC) + P(M \,\&\, LC) = .1383 + .3600 = .4983$$
• Probability a Randomly Selected Subject is 40 years or older:
$$P(\geq 40) = P(40\text{-}44) + P(45\text{-}49) + P(50\text{-}54) + P(55\text{-}59) + P(60\text{-}64)$$
$$= .1279 + .1309 + .1308 + .1299 + .1113 = .6308$$
• Probability Female given subject has High Cholesterol:
$$P(F \,\&\, HC) = .1316$$
$$P(HC) = P(M \,\&\, HC) + P(F \,\&\, HC) = .1383 + .1316 = .2699$$
$$P(F \mid HC) = \frac{P(F \,\&\, HC)}{P(HC)} = \frac{.1316}{.2699} = .4876$$
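The marginal and conditional probabilities in this example can be recomputed directly from the joint proportions. The sketch below is illustrative only: the dictionary layout and variable names are assumptions, and the cell values are copied from the WHO MONICA table.

```python
# Joint proportions by age group:
# (male high chol, male low chol, female high chol, female low chol)
table = {
    "25-29": (0.0066, 0.0486, 0.0046, 0.0527),
    "30-34": (0.0111, 0.0542, 0.0056, 0.0563),
    "35-39": (0.0167, 0.0476, 0.0079, 0.0582),
    "40-44": (0.0196, 0.0457, 0.0100, 0.0526),
    "45-49": (0.0207, 0.0440, 0.0172, 0.0490),
    "50-54": (0.0222, 0.0430, 0.0256, 0.0400),
    "55-59": (0.0229, 0.0425, 0.0303, 0.0342),
    "60-64": (0.0185, 0.0344, 0.0304, 0.0280),
}

# Column totals (marginal sums over age groups)
male_hc = sum(row[0] for row in table.values())
male_lc = sum(row[1] for row in table.values())
female_hc = sum(row[2] for row in table.values())

p_male = male_hc + male_lc                         # P(M)
older_groups = ("40-44", "45-49", "50-54", "55-59", "60-64")
p_40_plus = sum(sum(table[age]) for age in older_groups)
p_hc = male_hc + female_hc                         # P(HC)
p_female_given_hc = female_hc / p_hc               # P(F | HC)

print(f"P(M) = {p_male:.4f}")
print(f"P(40 or older) = {p_40_plus:.4f}")
print(f"P(F | HC) = {p_female_given_hc:.4f}")
```

Because the published cells are rounded, they sum to slightly more than 1, so probabilities built from different cells may differ in the last digit from those obtained with the complement rule.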
Independence
• Two events A and B are independent if:
P(A|B) = P(A) or, equivalently, P(B|A) = P(B)
• Cholesterol Example:
$$P(F) = 1 - P(M) = 1 - .4983 = .5017$$
$$P(F \mid HC) = .4876$$
Since P(F | HC) differs from P(F), the occurrence of high
cholesterol is not independent of gender
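The independence check compares a conditional probability with the corresponding unconditional one. The sketch below does this for the cholesterol totals. The tolerance is an arbitrary assumption used only to absorb rounding in the table; it is not a statistical test of independence.

```python
# Column totals taken from the cholesterol table
p_male = 0.1383 + 0.3600
p_female = 1 - p_male                      # complement rule
p_hc = 0.1383 + 0.1316                     # P(HC)
p_female_given_hc = 0.1316 / p_hc          # P(F | HC)

# Independence requires P(F | HC) == P(F).
# Hypothetical tolerance for rounding error in the published proportions.
tolerance = 0.005
independent = abs(p_female_given_hc - p_female) < tolerance

print(f"P(F) = {p_female:.4f}, P(F | HC) = {p_female_given_hc:.4f}")
print("independent" if independent else "not independent")
```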
Diagnostic Tests
• True state: Disease Present (D+) or Absent (D-) based on
a gold standard
• Diagnostic test result: Positive (T+) or Negative (T-)
• Subjects can be classified in the following table (where
a, b, c, and d are the numbers of subjects in the 4 cells):

Test Result \ True State   Positive (D+)   Negative (D-)   Total
Positive (T+)              a               b               a+b
Negative (T-)              c               d               c+d
Total                      a+c             b+d             a+b+c+d
Diagnostic Tests
• Sensitivity - The ability of the test to detect that
the disease is present: P(T+ | D+)
• Specificity - The ability of the test to detect that
the disease is absent: P(T- | D-)
• Positive Predictive Value (PPV) - Proportion of subjects with
positive test results who actually have the disease: P(D+ | T+)
• Negative Predictive Value (NPV) - Proportion of subjects with
negative test results who do not have the disease: P(D- | T-)
• Overall Accuracy - Proportion of subjects who
are correctly diagnosed
Diagnostic Tests
Test Result \ True State   Positive (D+)   Negative (D-)   Total
Positive (T+)              a               b               a+b
Negative (T-)              c               d               c+d
Total                      a+c             b+d             a+b+c+d

$$\text{Sensitivity: } P(T^+ \mid D^+) = \frac{a}{a+c}$$
$$\text{Specificity: } P(T^- \mid D^-) = \frac{d}{b+d}$$
$$\text{PPV}^*\text{: } P(D^+ \mid T^+) = \frac{a}{a+b}$$
$$\text{NPV}^*\text{: } P(D^- \mid T^-) = \frac{d}{c+d}$$
$$\text{Accuracy}^*\text{: } \frac{a+d}{a+b+c+d}$$

* Assuming the prevalence rate in test subjects is the same as in
the population
Example - Paracheck Test for
Plasmodium Falciparum (Pf)
• Goal: Develop an inexpensive test for Pf in
asymptomatic children in remote parts of India
• Gold Standard: Microscopy
• Diagnostic Test: Paracheck ($0.65/test)
Test Result \ True State   Positive (D+)   Negative (D-)   Total
Positive (T+)              119             49              168
Negative (T-)              7               398             405
Total                      126             447             573
Source: Singh, et al (2002)
Example - Paracheck Test for
Plasmodium Falciparum (Pf)
Test Result \ True State   Positive (D+)   Negative (D-)   Total
Positive (T+)              119             49              168
Negative (T-)              7               398             405
Total                      126             447             573

$$\text{Sensitivity} = \frac{119}{126} = .9444 \quad (94.44\%)$$
$$\text{Specificity} = \frac{398}{447} = .8904 \quad (89.04\%)$$
$$\text{PPV} = \frac{119}{168} = .7083 \quad (70.83\%)$$
$$\text{NPV} = \frac{398}{405} = .9827 \quad (98.27\%)$$
$$\text{Accuracy} = \frac{119 + 398}{573} = .9023 \quad (90.23\%)$$
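The five diagnostic measures follow directly from the four cell counts, so they are easy to compute together. The function below is a minimal sketch: the name `diagnostic_measures` and the dictionary keys are illustrative choices, and the cell labels follow the a, b, c, d layout of the classification table (a = T+ and D+, b = T+ and D-, c = T- and D+, d = T- and D-).

```python
def diagnostic_measures(a, b, c, d):
    """Compute test performance measures from a 2x2 table.

    a = T+ and D+, b = T+ and D-, c = T- and D+, d = T- and D-.
    """
    return {
        "sensitivity": a / (a + c),            # P(T+ | D+)
        "specificity": d / (b + d),            # P(T- | D-)
        "ppv": a / (a + b),                    # P(D+ | T+)
        "npv": d / (c + d),                    # P(D- | T-)
        "accuracy": (a + d) / (a + b + c + d), # correctly classified
    }

# Paracheck counts from the table (Singh, et al 2002)
paracheck = diagnostic_measures(a=119, b=49, c=7, d=398)
for name, value in paracheck.items():
    print(f"{name}: {value:.4f}")
```

As noted for the formulas, PPV, NPV, and accuracy are only meaningful for the population if disease prevalence among test subjects matches the population prevalence.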
Basic Study Designs
• Studies can generally be classified as
observational or experimental
– Observational - Subjects (or nature) select their
groups (levels of the independent variable)
• Studies comparing ethnicities or sexes with respect to drug disposition
• Studies of effects of smoking or other behaviors
• Studies comparing outcomes of patients on different therapies
– Experimental - Researchers assign subjects to
treatment groups
• Clinical trials with patients being randomized to active drug
or placebo. Typically double-blind (patient/assessor)
Observational Studies
• Case-Control -- Subjects are identified based on
presence/absence of the outcome of interest (D.V.). It is
then determined whether the subject had been exposed
to risk factor (I.V.). Retrospective Studies.
• Cohort -- Subjects are identified by risk factor or
treatment (I.V.) and followed over time to observe
outcome (D.V.). Prospective Studies.
• Cross-sectional -- Subjects sampled at random from
population and levels of both I.V. and D.V. are
simultaneously observed. Many studies based on large
medical databases are cross-sectional
Example - Case-Control Study
• Purpose: Study Risk Factors of Hepatitis-A in
Hispanic Children living in U.S. on Mexican
border (San Diego, CA)
• Cases: 132 Children with Hepatitis-A
• Controls: 354 Children without Hepatitis-A
• Risk Factors:
– Travel outside U.S. (67% of cases, 25% of controls)
– Eating food at taco stand/street vendor on travel
– Eating salad/lettuce on travel
Source: Weinberg, et al (2004)
Example - Cohort Study
• Purpose: Determine whether male adolescents
who develop schizophrenia were more likely to
smoke prior to onset
• Subjects: Israeli male military recruits, not
suffering major psychopathology who complete
smoking questionnaire
• Cohorts: 4052 smokers, 10196 non-smokers
• Follow-up/outcome: 4-16 year follow-up for onset
of schizophrenia (20 smokers, 24 nonsmokers)
Source: Weiser, et al (2004)
Example - Cross-Sectional Study
• Purpose - Investigate effect of high altitude on
maternal hemorheology
• Subjects - Pregnant and non-pregnant women at
high altitude and at sea level
• Measurements - Blood/Plasma viscosities,
Hematocrit, total protein, Fibrinogen, Albumin
• Selected Findings - Blood and Plasma viscosities
are higher in pregnant and non-pregnant women at
higher altitudes
Source: Kametas, et al (2004)
Experimental Studies
• Randomized Clinical Trials - Studies where
investigators assign subjects at random to treatments
• Special Cases (more than one may apply):
– Parallel Groups - Each subject receives only one treatment
– Crossover - Each subject receives each treatment (in random order)
– Placebo Controlled - One group receives only a placebo
– Double Blind - Neither subject nor assessor is aware of which
treatment the subject receives
– Double Dummy - Subjects receive regimens that are similar in
appearance, used when the different drugs look different
– Intention-to-Treat - Analysis is based on all subjects
randomized, including those lost to follow-up
– Completed Protocol - Analysis based on only subjects who
completed study
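Random assignment in a parallel-groups design can be sketched as shuffling the subject list and dealing subjects to treatments in rotation. The function below is a hypothetical illustration, not the procedure used in any trial cited here; actual trials typically use blocked or stratified schemes prepared in advance. The function name, the subject IDs, and the seed are all assumptions.

```python
import random

def randomize_parallel(subject_ids, treatments, seed=None):
    """Assign each subject to exactly one treatment, with near-equal group sizes."""
    rng = random.Random(seed)          # seeded generator for a reproducible list
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)              # random order of subjects
    # Deal the shuffled subjects to treatments in rotation
    return {
        subject: treatments[i % len(treatments)]
        for i, subject in enumerate(shuffled)
    }

# Hypothetical example: 12 subjects, three dose groups
assignment = randomize_parallel(range(1, 13), ["Placebo", "10mg", "20mg"], seed=2024)
for subject in sorted(assignment):
    print(subject, assignment[subject])
```

Each subject appears once, which is what makes the design parallel groups. A crossover design would instead give every subject a randomly ordered sequence of all treatments.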
Example - Randomized Clinical Trial
• Purpose - Three treatments for primary dysmenorrhea in women
• Subjects - 337 women (18-40) suffering dysmenorrhea during past
3 consecutive menstrual cycles
• Treatments (Parallel Groups, double-blind, double-dummy)
– Group 1: 1 tablet meloxicam 7.5mg o.a.d.
1 tablet placebo matching meloxicam 15mg o.a.d.
1 tablet placebo matching mefenamic acid 500mg t.i.d.
– Group 2: 1 tablet meloxicam 15mg o.a.d.
1 tablet placebo matching meloxicam 7.5mg o.a.d.
1 tablet placebo matching mefenamic acid 500mg t.i.d.
– Group 3: 1 tablet mefenamic acid 500mg t.i.d.
1 tablet placebo matching meloxicam 7.5mg o.a.d.
1 tablet placebo matching meloxicam 15.0mg o.a.d.
• Outcomes: Ordinal global assessment of safety/tolerability by
patients and investigators (Good, Satisfactory, Not satisfactory, Bad)
Source: de Mello, et al (2004)