Using R For Flexible Modelling Of Pre

advertisement
Comet analysis – Issues,
Limitations and
Recommendations
Jonathan Bright
Discovery Statistics
AstraZeneca
Nested Design
Test substance
Vehicle
Low
Medium
High
Positive
Control
Culture 1 2 3 4 5 6 7 8 9 10 … 14 15 … 19 20 … 24 25
Gel
(Slide)
ab ab …
Measurements on 50 cells per gel
2
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
… ab ab
AstraZeneca Discovery Statistics
Group
Vehicle-treated cells
showing no genetic
damage
Positive controltreated cells
showing severe
genetic damage
3
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
Undamaged and Damaged Cells
Label
4
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
Tail Intensities (linear scale)
Label
5
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
Tail Intensities (linear, jittered)
Label
6
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
Tail Intensities (log scale, jittered)
Summary and Analysis
• Excluding positive control data
• Using PROC MIXED
• Pairwise contrasts
• Main issue is the choice of gel summary, in particular
when:
• Large number of zeros
• Small number of unusually large values
7
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
• Summarise each gel of 50 numbers with a statistic, S
• Analyse S
• Non-zero part of distribution approx lognormal
• Mean is powerful
• Fold-change treatment effects
BUT
• Zeros!
• Add delta (e.g. 0.001) to all tail intensities before
logging and averaging
8
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
Summary Statistic 1. Mean Log
Summary Statistic 1. Mean Log (contd)
• S = mean(log(tail intensity + delta)) = log(delta)
• Delta = 0.001 then S = -3
• Delta = 0.0001 then S = -4 etc
• i.e. S depends critically on delta
• 50 tail intensities all > 0.1, say
• S approx = mean(log(tail intensity))
• i.e. S is approx independent of delta
• Treatment effect depends on delta!
9
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
• 50 zeros on a gel
0.7
i.e. 5-fold
1.0
i.e. 10-fold
Group
10
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
Summary Statistic 1. Example
Summary Statistic 2. Percentiles
• Robust measure of location
• Will fail to detect changes of interest in the upper tail
• 90th percentile
• Better chance of detecting changes in the upper tail
• Not very robust with only 50 values
• 75th percentile
• May offer a better balance than either of the above
• (In the presence of many zeros, even the chosen
percentile may equal zero.)
11
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
• Median
Group
12
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
Summary Statistic 2. Example (linear scale)
Summary Statistic 2. Example (linear, jittered)
Increase from
0.16 to 0.83
“5-fold”
Increase from
1.3 to 3.2
“2.5-fold”
Group
13
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
Increase from
0.004 to 0.175
“44-fold”
Group
14
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
Summary Statistic 2. Example (log, jittered)
Two Summary Statistics?
• Awkward distribution of many zeros and small values
• Too much to ask of a single summary statistic?
• Two summary statistics:
• Proportion of zeros (or proportion of tail intensities < low
threshold)
• Mean(log(all other tail intensities)
15
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
plus some extreme values
Two Summary Statistics. Example (linear, jittered)
0.36
Group
16
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
0.13
i.e. 1.4-fold
Group
AstraZeneca Discovery Statistics
0.64
Recommendations
Animal_Gel
17
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
• Picture the raw data
22
20
18
14
%TI
12
10
8
6
4
2
0
-2
1a_a
1a_b
1b_a
1b_b
1c_a
1c_b
1d_a
1d_b
1e_a
1e_b
1f_a
1f_b
2a_a
Animal_Gel
2a_b
2b_a
2b_b
Vehicle
18
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
2c_a
2c_b
2d_a
2d_b
Treated
2e_a
2e_b
2f_a
2f_b
AstraZeneca Discovery Statistics
%TI
16
Recommendations
of awkward distributions
• Present results as confidence intervals
19
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
• Picture the raw data
• Consider using 2 summary statistics in the presence
Two Summary Statistics. Example (linear, jittered)
0.36
20
0.13
i.e. 1.4-fold
Group
Group
95% 1-sided confidence interval
extends up to 0.4
95% 1-sided confidence interval
extends up to 2.3-fold
95% 2-sided confidence interval
extends from 0.1 up to 0.45
95% 2-sided confidence interval
extends from 0.7-fold up to 2.6-fold
Jonathan Bright, Comet Analysis – Issues, Limitations and Recommendations, NCSC 2010
AstraZeneca Discovery Statistics
0.64
Download