The Importance of Statistical Design and Analysis in the Laboratory

Center for Biofilm Engineering
Al Parker, Biostatistician
Standardized Biofilm Methods Research Team
Montana State University
Feb, 2011
Standardized Biofilm Methods Laboratory
Darla Goeres
Al Parker
Marty Hamilton
Lindsey Lorenz
Paul Sturman
Diane Walker
Kelli Buckingham-Meyer
What is statistical thinking?
• Data
  (pixel intensity in an image? log(cfu) from viable plate counts?)
• Experimental design
  - controls
  - randomization
  - replication (How many coupons? experiments? technicians? labs?)
• Uncertainty and variability assessment
Why statistical thinking?
• Anticipate criticism
  (design method and experiments accordingly)
• Provide convincing results
  (establish statistical properties)
• Increase efficiency
  (conduct the fewest experiments needed)
• Improve communication
Why statistical thinking?
Standardized Methods
Attributes of a standard method: Seven R’s
• Relevance
• Reasonableness
• Resemblance
• Repeatability (intra-laboratory)
• Ruggedness
• Responsiveness
• Reproducibility (inter-laboratory)
Resemblance of Controls
Independent repeats of the same experiment in
the same laboratory produce nearly the same
control data, as indicated by a small
repeatability standard deviation.
Statistical tool:
nested analysis of variance (ANOVA)
Resemblance Example: MBEC
• 86 mm x 128 mm plastic plate with 96 wells
• Lid has 96 pegs
MBEC Challenge Plate
Columns 1-5: disinfectant (concentration in %). Columns 6-7: neutralizer test. Column 8: control.

Row   1       2       3       4       5       6      7    8    9-12
A     100     100     100     100     100     50:N   N    GC   SC
B     50      50      50      50      50      50:N   N    GC   SC
C     25      25      25      25      25      50:N   N    GC   SC
D     12.5    12.5    12.5    12.5    12.5    50:N   N    GC
E     6.25    6.25    6.25    6.25    6.25    50:N   N    GC
F     3.125   3.125   3.125   3.125   3.125   50:N   N    GC
G     1.563   1.563   1.563   1.563   1.563   50:N   N    GC
H     0.781   0.781   0.781   0.781   0.781   50:N   N    GC
Resemblance Example: MBEC
Control data: LD = log10(cfu/mm²) from viable plate counts

Row   cfu/mm²       log10(cfu/mm²)
A     5.15 x 10^5   5.71
B     9.01 x 10^5   5.95
C     6.00 x 10^5   5.78
D     3.00 x 10^5   5.48
E     3.86 x 10^5   5.59
F     2.14 x 10^5   5.33
G     8.58 x 10^4   4.93
H     4.29 x 10^5   5.63

Mean LD = 5.55
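The log transformation and the mean LD can be checked with a few lines of Python. This is a minimal sketch using the densities listed on the slide; the variable names are illustrative and not part of the original material.

```python
import math

# Control well densities (cfu/mm^2) for rows A-H, as listed on the slide.
densities = {
    "A": 5.15e5, "B": 9.01e5, "C": 6.00e5, "D": 3.00e5,
    "E": 3.86e5, "F": 2.14e5, "G": 8.58e4, "H": 4.29e5,
}

# Log density (LD) of each well: log10 of the density.
log_densities = {row: math.log10(d) for row, d in densities.items()}

# The mean LD averages the log densities; it is not the log of the mean density.
mean_ld = sum(log_densities.values()) / len(log_densities)

for row, ld in log_densities.items():
    print(f"{row}: {ld:.2f}")
print(f"Mean LD = {mean_ld:.2f}")
```

Averaging on the log scale is the design choice that matters here: the LD values, not the raw densities, are what the later variance and log reduction calculations use.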
Resemblance Example: MBEC
Control LDs by experiment

Row       Exp 1 LD   Exp 2 LD
A         5.71       5.41
B         5.95       5.71
C         5.78       5.54
D         5.48       5.33
E         5.59       5.11
F         5.33       5.48
G         4.93       5.33
H         5.63       5.41

Mean LD   5.55       5.41
SD        0.31       0.17
Resemblance from experiment to experiment
Mean LD = 5.48
$S_r$ = 0.26, the typical distance between a control well LD from an experiment and the true mean LD.
The variance $S_r^2$ can be partitioned:
• 2% due to between-experiment sources
• 98% due to within-experiment sources
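The partition can be sketched as a balanced one-way random-effects ANOVA, with wells nested in experiments. The code below is an illustration under that assumption and uses the rounded control LDs from the slides, so its output should land close to, but not necessarily exactly on, the reported values. The function name `variance_components` is hypothetical.

```python
from statistics import mean, variance

# Rounded control LDs (rows A-H) from the two experiments on the slides.
control_lds = {
    1: [5.71, 5.95, 5.78, 5.48, 5.59, 5.33, 4.93, 5.63],
    2: [5.41, 5.71, 5.54, 5.33, 5.11, 5.48, 5.33, 5.41],
}

def variance_components(groups):
    """Balanced one-way random-effects ANOVA (wells nested in experiments).

    Returns (within, between) variance components. A negative estimate of
    the between component is truncated at zero.
    """
    samples = list(groups.values())
    n = len(samples[0])                                   # wells per experiment
    within = mean([variance(g) for g in samples])         # pooled within variance
    ms_between = n * variance([mean(g) for g in samples]) # between mean square
    between = max((ms_between - within) / n, 0.0)
    return within, between

within, between = variance_components(control_lds)
total = within + between
s_r = total ** 0.5

print(f"S_r = {s_r:.2f}")
print(f"within-experiment share  = {100 * within / total:.0f}%")
print(f"between-experiment share = {100 * between / total:.0f}%")
```

Because almost all of the control variability sits within an experiment, the within-experiment component is the larger of the two by a wide margin.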
Formula for the SE of the mean control LD, averaged over experiments
$S_c^2$ = within-experiment variance of control LDs
$S_E^2$ = among-experiment variance of control LDs
$n_c$ = number of control replicates per experiment
$m$ = number of experiments
$$\text{SE of mean control LD} = \sqrt{\frac{S_c^2}{n_c \cdot m} + \frac{S_E^2}{m}}$$
CI for the true mean control LD = mean LD ± $t_{m-1}$ × SE
Formula for the SE of the mean control LD, averaged over experiments
$S_c^2$ = 0.98 × (0.26)² ≈ 0.06408
$S_E^2$ = 0.02 × (0.26)² ≈ 0.00124
$n_c$ = 8
$m$ = 2
$$\text{SE of mean control LD} = \sqrt{\frac{0.06408}{8 \cdot 2} + \frac{0.00124}{2}} = 0.0680$$
95% CI for the true mean control LD = 5.48 ± 12.7 × 0.0680
= 5.48 ± 0.86
= (4.62, 6.34)
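The standard error and interval can be checked numerically. This sketch assumes the within-experiment variance is the 98% share of $S_r^2$ (about 0.064) and the among-experiment variance is the 2% share (about 0.0012). The value 12.706 is the standard two-sided 95% t quantile for 1 degree of freedom. Function and variable names are illustrative.

```python
import math

def se_mean_control_ld(s_c2, s_e2, n_c, m):
    """SE of the mean control LD, averaged over m experiments."""
    return math.sqrt(s_c2 / (n_c * m) + s_e2 / m)

# Assumed inputs: the within-experiment variance is the 98% share of S_r^2,
# the among-experiment variance is the 2% share.
S_C2 = 0.06408
S_E2 = 0.00124
N_C, M = 8, 2

# Two-sided 95% t quantile with m - 1 = 1 degree of freedom (table value).
T_975_DF1 = 12.706

se = se_mean_control_ld(S_C2, S_E2, N_C, M)
mean_ld = 5.48
ci = (mean_ld - T_975_DF1 * se, mean_ld + T_975_DF1 * se)
print(f"SE = {se:.4f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

With only two experiments the t multiplier is very large, which is why the interval is wide even when the SE itself is small.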
Resemblance from technician to technician
Mean LD = 5.44
$S_r$ = 0.36, the typical distance between a control well LD and the true mean LD.
The variance $S_r^2$ can be partitioned:
• 0% due to technician sources
• 24% due to between-experiment sources
• 76% due to within-experiment sources
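To see what these percentages mean on the LD scale, the shares can be converted back to variance components and standard deviations. This is a small sketch built only from the reported $S_r$ and percentages; the dictionary keys are illustrative labels.

```python
S_R = 0.36  # repeatability SD reported on the slide

# Reported shares of the total variance S_r^2.
shares = {
    "technician": 0.00,
    "between experiments": 0.24,
    "within experiment": 0.76,
}

total_variance = S_R ** 2
components = {source: share * total_variance for source, share in shares.items()}

for source, var in components.items():
    print(f"{source}: variance = {var:.4f}, SD = {var ** 0.5:.2f}")
```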
Repeatability
Independent repeats of the same
experiment in the same laboratory produce
nearly the same data, as indicated by a
small repeatability standard deviation.
Statistical tool: nested ANOVA
Repeatability Example
Data: log reduction (LR)
LR = mean(control LDs) – mean(disinfected LDs)
Repeatability Example: MBEC
Disinfected LDs are from the 6.25% wells (columns 1-5).

Exp 1
  Control LD (rows A-H):        5.71  5.95  5.78  5.48  5.59  5.33  4.93  5.63
  Control mean LD:              5.55
  Disinfected LD (cols 1-5):    4.67  4.41  4.33  4.59  4.54
  Disinfected mean LD:          4.51
  LR:                           1.04

Exp 2
  Control LD (rows A-H):        5.41  5.71  5.54  5.33  5.11  5.48  5.33  5.41
  Control mean LD:              5.41
  Disinfected LD (cols 1-5):    4.78  2.71  3.48  3.23  1.82
  Disinfected mean LD:          3.20
  LR:                           2.21

Mean LR = 1.63
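The log reduction for each experiment, and the mean LR across experiments, follow directly from the definition LR = mean(control LDs) − mean(disinfected LDs). The sketch below uses the rounded LDs from the slides, so results may differ from the reported figures in the last digit. The function name `log_reduction` is illustrative.

```python
from statistics import mean, stdev

# Rounded LDs from the slides: control wells (rows A-H) and disinfected
# wells at 6.25% (columns 1-5) for each experiment.
experiments = {
    1: {"control": [5.71, 5.95, 5.78, 5.48, 5.59, 5.33, 4.93, 5.63],
        "disinfected": [4.67, 4.41, 4.33, 4.59, 4.54]},
    2: {"control": [5.41, 5.71, 5.54, 5.33, 5.11, 5.48, 5.33, 5.41],
        "disinfected": [4.78, 2.71, 3.48, 3.23, 1.82]},
}

def log_reduction(control, disinfected):
    """LR = mean control LD minus mean disinfected LD."""
    return mean(control) - mean(disinfected)

lrs = {exp: log_reduction(d["control"], d["disinfected"])
       for exp, d in experiments.items()}
lr_values = list(lrs.values())

mean_lr = mean(lr_values)
sd_lr = stdev(lr_values)  # SD of the per-experiment LRs

for exp, lr in lrs.items():
    print(f"Exp {exp}: LR = {lr:.2f}")
print(f"Mean LR = {mean_lr:.2f}, SD of LRs = {sd_lr:.2f}")
```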
Repeatability Example
Mean LR = 1.63
$S_r$ = 0.83, the typical distance between an LR for an experiment and the true mean LR.
Formula for the SE of the mean LR, averaged over experiments
$S_c^2$ = within-experiment variance of control LDs
$S_d^2$ = within-experiment variance of disinfected LDs
$S_E^2$ = among-experiment variance of LRs
$n_c$ = number of control replicates per experiment
$n_d$ = number of disinfected replicates per experiment
$m$ = number of experiments
$$\text{SE of mean LR} = \sqrt{\frac{S_c^2}{n_c \cdot m} + \frac{S_d^2}{n_d \cdot m} + \frac{S_E^2}{m}}$$
CI for the true mean LR = mean LR ± $t_{m-1}$ × SE
Formula for the SE of the mean LR, averaged over experiments
$S_c^2$ = 0.06408
$S_d^2$ = 0.47950
$S_E^2$ = 0.59285
$n_c$ = 8, $n_d$ = 5, $m$ = 2
$$\text{SE of mean LR} = \sqrt{\frac{0.06408}{8 \cdot 2} + \frac{0.47950}{5 \cdot 2} + \frac{0.59285}{2}} = 0.5902$$
95% CI for the true mean LR = 1.63 ± 12.7 × 0.5902
= 1.63 ± 7.50
= (0.00, 9.13), with the lower limit truncated at 0
How many coupons? experiments?
$$\text{margin of error} = t_{m-1} \times \sqrt{\frac{0.06408}{n_c \cdot m} + \frac{0.47950}{n_d \cdot m} + \frac{0.59285}{m}}$$

Margin of error by number of control coupons ($n_c$), disinfected coupons ($n_d$) and experiments ($m$):

nc    nd    m=2     m=3     m=4     m=6     m=10    m=100
2     2     8.35    2.31    1.48    0.98    0.67    0.18
3     3     7.90    2.19    1.40    0.92    0.63    0.17
5     5     7.53    2.08    1.33    0.88    0.60    0.17
8     5     7.50    2.07    1.33    0.88    0.60    0.17
12    12    7.18    1.98    1.27    0.84    0.57    0.16
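A short script makes it easy to explore other combinations of coupons and experiments. This is a sketch under stated assumptions: $S_c^2$ is taken as the within-experiment control variance (about 0.064, the 98% share of the control $S_r^2$), $S_d^2$ and $S_E^2$ are the values given for the LR formula, and the t quantiles are standard two-sided 95% table values hard-coded for the degrees of freedom needed. The function name `margin_of_error` is illustrative.

```python
import math

# Two-sided 95% t quantiles, keyed by degrees of freedom m - 1 (table values).
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 5: 2.571, 9: 2.262, 99: 1.984}

def margin_of_error(n_c, n_d, m, s_c2=0.06408, s_d2=0.47950, s_e2=0.59285):
    """Half-width of the 95% CI for the mean LR over m experiments."""
    se = math.sqrt(s_c2 / (n_c * m) + s_d2 / (n_d * m) + s_e2 / m)
    return T_975[m - 1] * se

designs = [(2, 2), (3, 3), (5, 5), (8, 5), (12, 12)]  # (n_c, n_d) pairs
experiment_counts = [2, 3, 4, 6, 10, 100]

print("nc  nd  " + "  ".join(f"m={m:<4}" for m in experiment_counts))
for n_c, n_d in designs:
    cells = "  ".join(f"{margin_of_error(n_c, n_d, m):<6.2f}"
                      for m in experiment_counts)
    print(f"{n_c:<3} {n_d:<3} {cells}")
```

Comparing any two cells shows the pattern behind the design advice: adding a third experiment to the smallest design shrinks the margin far more than moving from 2 to 12 coupons within two experiments.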
Responsiveness
A method should be sensitive enough that
it can detect important changes in
parameters of interest.
Statistical tool: regression and t-tests
Responsiveness Example: MBEC
Plate layout as in the MBEC Challenge Plate shown earlier: row A (100% disinfectant) is the high-efficacy end and row H (0.781% disinfectant) is the low-efficacy end.
Responsiveness Example: MBEC
This response curve indicates responsiveness to decreasing efficacy between rows C, D, E and F.
Responsiveness Example: MBEC
Responsiveness can be quantified with a regression line:
LR = 6.08 − 0.97 × row
For each step in the decrease of disinfectant efficacy, the LR decreases on average by 0.97.
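The fitted line can be evaluated row by row. The slides do not state how rows are coded numerically, so this sketch assumes A = 1 through H = 8, which is a guess, and it only evaluates rows C through F, where the response curve is described as responsive. The function name `predicted_lr` is illustrative.

```python
# Fitted line reported on the slide: LR = 6.08 - 0.97 * row.
INTERCEPT = 6.08
SLOPE = -0.97

def predicted_lr(row_letter):
    """Predicted LR for a plate row, assuming rows are coded A=1, ..., H=8."""
    row = "ABCDEFGH".index(row_letter.upper()) + 1  # hypothetical coding
    return INTERCEPT + SLOPE * row

# Rows C-F are where the slides describe the curve as responsive.
for letter in "CDEF":
    print(f"row {letter}: predicted LR = {predicted_lr(letter):.2f}")
```

Whatever the actual coding, the slope is the quantity of interest: each one-row step toward lower concentration lowers the predicted LR by 0.97.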
Summary
• Even though biofilms are complicated, it is feasible to develop biofilm methods that meet the “Seven R” criteria.
• Good experiments use control data!
• Assess uncertainty with SEs and CIs.
• When designing experiments, invest effort in more experiments rather than more replicates (coupons or wells) within an experiment.
Any questions?