Practical Aspects of Alerting Algorithms in Biosurveillance

Howard S. Burkom
The Johns Hopkins University Applied Physics Laboratory
National Security Technology Department
Biosurveillance Information Exchange Working Group
DIMACS Program/Rutgers University
Piscataway, NJ
February 22, 2006
Outline
• What information do temporal alerting algorithms give the health monitor?
• How can typical data issues introduce bias or other misinformation?
• How do spatial scan statistics and other spatiotemporal methods give the monitor a different look at the data?
• What data issues are important for the quality of this information?
Conceptual Approaches to Aberration Detection
What does 'aberration' mean? Different approaches for a single data source:
• Process control-based: "The underlying data distribution has changed" – many measures
• Model-based: "The data do not fit an analytical model based on a historical baseline" – many models
• Can combine these approaches
• Spatiotemporal: "The relationship of local data to neighboring data differs from expectations based on model or recent history"
Comparing Alerting Algorithms
Criteria:
• Sensitivity
  – Probability of detecting an outbreak signal
  – Depends on the effect of the outbreak in the data
• Specificity (1 – false alert rate)
  – Probability(no alert | no outbreak)
  – May be difficult to prove no outbreak exists
• Timeliness
  – Once the effects of an outbreak appear in the data, how soon is an alert expected?
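These criteria can be scored against a record of daily alert decisions and known outbreak days. A minimal sketch (not from the talk; the per-day scoring convention and function names are illustrative assumptions):

```python
def evaluate_alerts(alerts, outbreak_days, n_days):
    """Score an alerting record against known outbreak days.
    alerts, outbreak_days: sets of day indices; n_days: record length."""
    detected = alerts & outbreak_days
    non_outbreak = set(range(n_days)) - outbreak_days
    false_alerts = alerts & non_outbreak
    # Sensitivity: fraction of outbreak days flagged
    sensitivity = len(detected) / len(outbreak_days) if outbreak_days else float("nan")
    # Specificity: 1 - false alert rate on non-outbreak days
    specificity = 1 - len(false_alerts) / len(non_outbreak)
    # Timeliness: days from outbreak onset to first alert during the outbreak
    timeliness = (min(detected) - min(outbreak_days)) if detected else None
    return sensitivity, specificity, timeliness
```

In practice "no outbreak" days are hard to certify (as noted above), so specificity estimates depend on how the non-outbreak record is labeled.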
Aggregating Data in Time
Data stream(s) to monitor in time are divided into three intervals:
• Baseline interval: used to get some estimate of normal data behavior
  – Mean, variance
  – Regression coefficients
  – Expected covariate distribution (spatial, age category, % of claims/syndrome)
• Guardband: avoids contamination of the baseline with the outbreak signal
• Test interval: counts to be tested for anomaly
  – Nominally 1 day
  – Longer to reduce noise or to test for epicurve shape
  – Will shorten as data acquisition improves
Elements of an Alerting Algorithm
– Values to be tested: raw data, or residuals from a model?
– Baseline period
  • Historical data used to determine expected data behavior
  • Fixed or a sliding window?
  • Outlier removal: to avoid training on unrepresentative data
  • What does the algorithm do when there is all-zero or no baseline data?
  • Is a warmup period of data history required?
– Buffer period (or guardband)
  • Separation between the baseline period and the interval to be tested
– Test period
  • Interval of current data to be tested
– Reset criterion
  • To prevent flooding by persistent alerts caused by extreme values
– Test statistic: value computed to make alerting decisions
– Threshold: alert issued if the test statistic exceeds this value
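The elements above can be assembled into a skeleton detector. A minimal sketch assuming a mean/variance sliding baseline and a z-score test statistic; the window lengths and threshold are illustrative choices, not the talk's settings:

```python
from statistics import mean, stdev

def alert_series(counts, baseline_len=28, guardband=2, threshold=3.0):
    """Return a daily alert flag for each count, using a sliding baseline
    window separated from the tested day by a guardband (buffer period)."""
    alerts = []
    warmup = baseline_len + guardband          # warmup period: no testing yet
    for day, count in enumerate(counts):
        if day < warmup:
            alerts.append(False)
            continue
        # Baseline period: sliding window ending guardband days before today
        base = counts[day - guardband - baseline_len : day - guardband]
        m, s = mean(base), stdev(base)
        stat = (count - m) / s if s > 0 else 0.0   # test statistic (z-score)
        alerts.append(stat > threshold)            # threshold decision
    return alerts
```

Outlier removal, reset logic, and all-zero-baseline handling are omitted here but would slot into the baseline step.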
Rash Syndrome Grouping of Diagnosis Codes
www.bt.cdc.gov/surveillance/syndromedef/word/syndromedefinitions.doc

Rash ICD-9-CM Code List

ICD9CM   ICD9DESCR                                  Consensus
050.0    SMALL POX, VARIOLA MAJOR                   1
050.1    SMALL POX, ALASTRIM                        1
050.2    SMALL POX, MODIFIED                        1
050.9    SMALLPOX NOS                               1
051.0    COWPOX                                     1
051.1    PSEUDOCOWPOX                               1
052.7    VARICELLA COMPLICAT NEC                    1
052.8    VARICELLA W/UNSPECIFIED C                  1
052.9    VARICELLA NOS                              1
057.8    EXANTHEMATA VIRAL OTHER S                  1
057.9    EXANTHEM VIRAL, UNSPECIFI                  1
695.0    ERYTHEMA TOXIC                             1
695.1    ERYTHEMA MULTIFORME                        1
695.2    ERYTHEMA NODOSUM                           1
695.89   ERYTHEMATOUS CONDITIONS O                  1
695.9    ERYTHEMATOUS CONDITION N                   1
692.9    DERMATITIS UNSPECIFIED CA                  2
782.1    RASH/OTHER NONSPEC SKIN E                  2
026.0    SPIRILLARY FEVER                           3
026.1    STREPTOBACILLARY FEVER                     3
026.9    RAT-BITE FEVER UNSPECIFIED                 3
051.2    DERMATITIS PUSTULAR, CONT                  3
051.9    PARAVACCINIA NOS                           3
053.20   HERPES ZOSTER DERMATITIS E                 3
053.79   HERPES ZOSTER WITH OTHER SPECIF COMPLIC    3
053.8    H.Z. W/ UNSPEC. COMPLICATION               3
053.9    HERPES ZOSTER NOS W/O COM                  3
054.0    ECZEMA HERPETICUM                          3
054.79   HERPES SIMPLEX W/OTH.SPEC                  3
Example: Daily Counts with Injected Cases
[Figure: daily Rash_1 syndrome counts (0–14) vs. encounter date, 9/22/96–12/21/96, showing the expected counts and injected cases presumed attributable to an outbreak event]
Example: Algorithm Alerts Indicated
[Figure: the same Rash_1 series with alerts marked where the test statistic exceeds the chosen threshold]
EWMA Monitoring
• Exponentially Weighted Moving Average
• Average with most weight on recent Xk:
  Sk = wSk-1 + (1 – w)Xk, where 0 < w < 1
• Test statistic: Sk compared to expectation from a sliding baseline
• Basic idea: monitor (Sk – mk) / sk
[Figure: Exponentially Weighted Moving Average of daily counts (0–60), 02/25/94–04/01/94, showing the raw Daily Count and Smoothed series]
• Added sensitivity for gradual events
• Smaller w means less smoothing (more weight on the current count)
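The EWMA recursion above can be sketched in a few lines; the default weight and the starting-value convention are illustrative assumptions, not the talk's settings:

```python
def ewma(xs, w=0.6, s0=None):
    """Smoothed series via S_k = w*S_{k-1} + (1 - w)*X_k, 0 < w < 1.
    s0 defaults to the first observation so the series starts unbiased."""
    s = xs[0] if s0 is None else s0
    out = []
    for x in xs:
        s = w * s + (1 - w) * x   # exponential smoothing step
        out.append(s)
    return out
```

For alerting, each S_k would then be standardized against a sliding-baseline mean and standard deviation, i.e. the (S_k – m_k) / s_k statistic above.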
Example with Detection Statistic Plot
[Figure: detection statistic plotted against its threshold, with alerts where the statistic exceeds the threshold]
Example: EWMA applied to Rash Data
Effects of Data Problems
[Figure: alerting on flawed data, showing additional flags and a missed event]
Importance of Spatial Data for Biosurveillance
– Purely temporal methods can find anomalies, IF you know which case counts to monitor
  • Location of outbreak?
  • Extent?
– Advantages of spatial clustering
  • Tracking progression of outbreak
  • Identifying population at risk
Evaluating Candidate Clusters
[Figure: surveillance region with individual cases marked and one candidate cluster outlined]
For a candidate cluster, the scan statistic gives a measure of: "how unlikely is the number of cases inside relative to the number outside, given the expected spatial distribution of cases." (Thus, a populous region won't necessarily flag.)
Selecting Candidate Clusters
[Figure: the same case map, scanned over candidate clusters of varying size and location]
Searching for Spatial Clustering
[Figure: centroids of data collection regions in region A]
• Form cylinders: bases are circles about each centroid in region A; height is time
• Calculate the statistic for the event count in each cylinder relative to the entire region, within space and time limits
• Most significant clusters: regions whose centroids form the base of the cylinder with the maximum statistic
• But how unusual is it? Repeat the procedure with Monte Carlo runs and compare the maximum statistic to the maxima of each of these
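The search and Monte Carlo significance test above can be sketched as follows, using Kulldorff's Poisson likelihood ratio as the scan statistic. For brevity the cylinders are reduced to fixed candidate zones (tuples of region indices) with known expected counts; all names and defaults are illustrative:

```python
import math
import random

def poisson_llr(c, e, C):
    """Kulldorff's Poisson log likelihood ratio for a zone with c observed
    and e expected cases out of C total; 0 unless the zone shows an excess."""
    if c <= e or e <= 0:
        return 0.0
    rest = (C - c) * math.log((C - c) / (C - e)) if C > c else 0.0
    return c * math.log(c / e) + rest

def max_statistic(cases, expected, zones):
    """Most significant zone and its statistic (the 'maximum cylinder')."""
    C = sum(cases)
    best, best_llr = None, 0.0
    for z in zones:
        s = poisson_llr(sum(cases[i] for i in z),
                        sum(expected[i] for i in z), C)
        if s > best_llr:
            best, best_llr = z, s
    return best, best_llr

def scan_p_value(cases, expected, zones, n_sims=99, seed=0):
    """Monte Carlo p-value: redistribute the C total cases by expected
    proportions, recompute each simulated maximum, compare to observed."""
    rng = random.Random(seed)
    C, total_e = sum(cases), sum(expected)
    observed = max_statistic(cases, expected, zones)[1]
    beats = 0
    for _ in range(n_sims):
        sim = [0] * len(expected)
        for _ in range(C):                 # drop each case at random,
            r = rng.random() * total_e     # weighted by expected counts
            for i, e in enumerate(expected):
                r -= e
                if r <= 0:
                    sim[i] += 1
                    break
            else:                          # guard against float round-off
                sim[-1] += 1
        if max_statistic(sim, expected, zones)[1] >= observed:
            beats += 1
    return (beats + 1) / (n_sims + 1)
```

Because every simulated replicate is scored by its own maximum over all zones, the p-value automatically accounts for the multiple testing across candidate clusters.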
Scan Statistic Demo
Scan Statistics: Advantages
• Gives the monitor guidance on cluster size, location, and significance
• Avoids preselection bias regarding cluster size or location
• Significance testing controls for multiple testing
• Can tailor the problem design by data and objective:
  – Location (zipcode, hospital/provider site, patient/customer residence, school/store address)
  – Time windows used (cases, history, guardband)
  – Background estimation method: model, history, population, eligible customers
Surveillance Application
OTC Anti-flu Sales, Dates: 15–24 Apr 2002
Total sales as of 25 Apr: 1804
Potential cluster: center at zip 22311; 63 sales vs. 39 expected from recent data; relative risk = 1.6; p = 0.041
Distribution of Nonsyndromic Visits
[Figure: daily nonsyndromic visit counts over time (days) at 4 San Diego hospitals]
Effect of Data Discontinuities on OTC Cough/Cold Clusters
[Figure: cluster groups by zip code (ordered S to N) over time]
• Before removing problem zips, cluster groups are dominated by zips that "turn on" after sustained periods of zero or abnormally low counts.
• After editing, more interesting cluster groups emerge.
School Nurse Data: All Visits
[Figure: school nurse visit counts over time, with an interval of unreported data]
Cluster Investigation by Record Inspection
Records Corresponding to a Respiratory Cluster
Backups
Cumulative Summation Approach (CUSUM)
• Widely adapted to disease surveillance
• Devised for prompt detection of small shifts
• Look for changes of 2k standard deviations from the mean m (often k = 0.5)
• Take the normalized deviation: Zt = (xt – m) / s
• Compare lower and upper sums to threshold h:
  SH,j = max( 0, (Zt – k) + SH,j-1 )
  SL,j = max( 0, (–Zt – k) + SL,j-1 )
• Phase I sets m, s, h, k
Upper sum: keep adding differences between today's count and k standard deviations above the mean; alert when the sum exceeds threshold h.
[Figure: ER respiratory claim data, number of cases (0–70) vs. date, 12/30/2000–2/28/2001, showing the raw and smoothed series with SH > 1 and SL > 1 flags marked]
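The two-sided recursions above can be sketched directly; m, s, k, and h are the Phase I constants, and the defaults here are illustrative:

```python
def cusum(xs, m, s, k=0.5, h=4.0):
    """Two-sided CUSUM: return (upper_sum, lower_sum, alert) per observation."""
    sh = sl = 0.0
    out = []
    for x in xs:
        z = (x - m) / s                 # normalized deviation Z_t
        sh = max(0.0, (z - k) + sh)     # upper sum S_H: detects increases
        sl = max(0.0, (-z - k) + sl)    # lower sum S_L: detects decreases
        out.append((sh, sl, sh > h or sl > h))
    return out
```

The max(0, ...) reset keeps each sum from drifting negative during in-control periods, so the detector stays primed to accumulate a small sustained shift quickly.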
CuSum Example: CDC EARS Methods C1–C3
Three adaptive methods chosen by the National Center for Infectious Diseases after 9/1/2001 as most consistent
• Look for aberrations representing increases, not decreases
• Fixed mean and variance replaced by values from a sliding baseline (usually 7 days)
Baselines relative to the current day (Day 0):
• C1-MILD: days –1 to –7
• C2-MEDIUM: days –3 to –9
• C3-ULTRA: days –3 to –9
Calculation for C1–C3:
Individual day statistic for day j with lag n:
  Sj,n = max{ 0, ( Countj – [μn + σn] ) / σn }, where
  μn is the 7-day average with an n-day lag (so μ3 is the mean of counts on days j–3 through j–9), and
  σn is the standard deviation of the same 7-day window
• C1 statistic for day k is Sk,1 (no guardband)
• C2 statistic for day k is Sk,3 (2-day guardband)
• C3 statistic for day k is Sk,3 + Sk-1,3 + Sk-2,3, where Sk-1,3 and Sk-2,3 are added only if they do not exceed the threshold
Upper bound threshold of 2: equivalent to 3 standard deviations above the mean
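The C1–C3 statistics defined above can be sketched as follows; the code assumes the count history is long enough to cover each 7-day lagged baseline, and the function names are illustrative:

```python
from statistics import mean, stdev

def day_statistic(counts, j, lag):
    """S_{j,n} = max(0, (Count_j - (mu_n + sigma_n)) / sigma_n), with a
    7-day baseline lagged by `lag` days (days j-lag-6 through j-lag)."""
    base = counts[j - lag - 6 : j - lag + 1]
    mu, sigma = mean(base), stdev(base)
    if sigma == 0:
        return 0.0
    return max(0.0, (counts[j] - (mu + sigma)) / sigma)

def c1(counts, j):
    return day_statistic(counts, j, 1)      # baseline days -1 to -7

def c2(counts, j):
    return day_statistic(counts, j, 3)      # baseline days -3 to -9

def c3(counts, j, threshold=2.0):
    s = c2(counts, j)
    for prior in (j - 1, j - 2):            # add prior-day statistics only
        sp = c2(counts, prior)              # if they did not already alert
        if sp <= threshold:
            s += sp
    return s
```

Each statistic would be compared to the threshold of 2 noted above; a zero-variance baseline is treated here as no signal, one of several conventions in use.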
Detailed Example, I
Fewer alerts AND more sensitive: why?
Detailed Example, II
Signal detected only with 28-day baseline
Detailed Example, III
“the rest of the story”