Practical Aspects of Alerting Algorithms in Biosurveillance

Howard S. Burkom
The Johns Hopkins University Applied Physics Laboratory
National Security Technology Department

Biosurveillance Information Exchange Working Group
DIMACS Program/Rutgers University
Piscataway, NJ
February 22, 2006

Outline
• What information do temporal alerting algorithms give the health monitor?
• How can typical data issues introduce bias or other misinformation?
• How do spatial scan statistics and other spatiotemporal methods give the monitor a different look at the data?
• What data issues are important for the quality of this information?

Conceptual Approaches to Aberration Detection
What does ‘aberration’ mean? Different approaches for a single data source:
• Process control-based: “The underlying data distribution has changed” – many measures
• Model-based: “The data do not fit an analytical model based on a historical baseline” – many models
• Can combine these approaches
• Spatiotemporal approach: “The relationship of local data to neighboring data differs from expectations based on model or recent history”

Comparing Alerting Algorithms
Criteria:
• Sensitivity
  – Probability of detecting an outbreak signal
  – Depends on effect of outbreak in data
• Specificity (1 – false alert rate)
  – Probability(no alert | no outbreak)
  – May be difficult to prove no outbreak exists
• Timeliness
  – Once the effects of an outbreak appear in the data, how soon is an alert expected?

Aggregating Data in Time
Data stream(s) to monitor in time:
Baseline interval: used to get some estimate of normal data behavior
• Mean, variance
• Regression coefficients
• Expected covariate distribution
  – spatial
  – age category
  – % of claims/syndrome
Guardband
• Avoids contamination of baseline with outbreak signal
Test interval
• Counts to be tested for anomaly
• Nominally 1 day
• Longer to reduce noise, test for epicurve shape
• Will shorten as data acquisition improves

Elements of an Alerting Algorithm
– Values to be tested: raw data, or residuals from a model?
– Baseline period
  • Historical data used to determine expected data behavior
  • Fixed or a sliding window?
  • Outlier removal: to avoid training on unrepresentative data
  • What does the algorithm do when the baseline data are all zero or missing?
  • Is a warmup period of data history required?
– Buffer period (or guardband)
  • Separation between the baseline period and interval to be tested
– Test period
  • Interval of current data to be tested
– Reset criterion
  • To prevent flooding by persistent alerts caused by extreme values
– Test statistic: value computed to make alerting decisions
– Threshold: alert issued if test statistic exceeds this value

Rash Syndrome Grouping of Diagnosis Codes
www.bt.cdc.gov/surveillance/syndromedef/word/syndromedefinitions.doc

Rash ICD-9-CM Code List
ICD9CM   ICD9DESCR                                 Consensus
050.0    SMALL POX, VARIOLA MAJOR                  1
050.1    SMALL POX, ALASTRIM                       1
050.2    SMALL POX, MODIFIED                       1
050.9    SMALLPOX NOS                              1
051.0    COWPOX                                    1
051.1    PSEUDOCOWPOX                              1
052.7    VARICELLA COMPLICAT NEC                   1
052.8    VARICELLA W/UNSPECIFIED C                 1
052.9    VARICELLA NOS                             1
057.8    EXANTHEMATA VIRAL OTHER S                 1
057.9    EXANTHEM VIRAL, UNSPECIFI                 1
695.0    ERYTHEMA TOXIC                            1
695.1    ERYTHEMA MULTIFORME                       1
695.2    ERYTHEMA NODOSUM                          1
695.89   ERYTHEMATOUS CONDITIONS O                 1
695.9    ERYTHEMATOUS CONDITION N                  1
692.9    DERMATITIS UNSPECIFIED CA                 2
782.1    RASH/OTHER NONSPEC SKIN E                 2
026.0    SPIRILLARY FEVER                          3
026.1    STREPTOBACILLARY FEVER                    3
026.9    RAT-BITE FEVER UNSPECIFIED                3
051.2    DERMATITIS PUSTULAR, CONT                 3
051.9    PARAVACCINIA NOS                          3
053.20   HERPES ZOSTER DERMATITIS E                3
053.79   HERPES ZOSTER WITH OTHER SPECIF COMPLIC   3
053.8    H.Z. W/ UNSPEC. COMPLICATION              3
053.9    HERPES ZOSTER NOS W/O COM                 3
054.0    ECZEMA HERPETICUM                         3
054.79   HERPES SIMPLEX W/OTH.SPEC                 3

Example: Daily Counts with Injected Cases
Injected Cases Presumed Attributable to Outbreak Event
[Figure: syndrome count by encounter date, 9/22/96 to 12/21/96; series: Rash_1, expected, event-attributable]

Example: Algorithm Alerts Indicated
Test Statistic Exceeds Chosen Threshold
[Figure: syndrome count by encounter date, 9/22/96 to 12/21/96; series: Rash_1, expected, alert, event-attributable]

EWMA Monitoring
• Exponential Weighted Moving Average
• Average with most weight on recent X_k:
    S_k = ω X_k + (1 − ω) S_{k-1}, where 0 < ω < 1
• Test statistic: S_k compared to expectation from sliding baseline
• Basic idea: monitor (S_k − μ_k) / σ_k
[Figure: Exponential Weighted Moving Average; daily count and smoothed series, 02/25/94 to 04/01/94]
• Added sensitivity for gradual events
• Larger ω means less smoothing

Example with Detection Statistic Plot
[Figure: detection statistic plotted against the threshold; annotation: “Statistic Exceeds Threshold”]

Example: EWMA applied to Rash Data

Effects of Data Problems
[Figure annotations: additional flags; missed event]

Importance of spatial data for biosurveillance
– Purely temporal methods can find anomalies, IF you know which case counts to monitor
  • Location of outbreak?
  • Extent?
– Advantages of spatial clustering
  • Tracking progression of outbreak
  • Identifying population at risk

Evaluating Candidate Clusters
[Figure: surveillance region with points marked x and one candidate cluster outlined]
Candidate cluster: the scan statistic gives a measure of “how unlikely is the number of cases inside relative to the number outside, given the expected spatial distribution of cases.” (Thus, a populous region won’t necessarily flag.)
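The slides do not give the formula for the cluster score. As a minimal sketch, assume the Poisson log-likelihood ratio commonly used for scan statistics (Kulldorff's form); the function name, argument names, and all numbers below are hypothetical and chosen only to illustrate why a populous region does not necessarily flag.

```python
import math

def poisson_llr(cases_in, expected_in, total_cases):
    """Assumed Poisson log-likelihood ratio for one candidate cluster.

    cases_in    -- observed cases inside the candidate cluster
    expected_in -- expected cases inside, scaled so that expectations
                   over the whole region sum to total_cases
    Returns 0.0 unless the cluster holds more cases than expected.
    """
    if not 0 < expected_in < total_cases:
        raise ValueError("expected_in must lie strictly between 0 and total_cases")
    if cases_in <= expected_in:
        return 0.0
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:
        llr += cases_out * math.log(cases_out / expected_out)
    return llr

# Hypothetical numbers: a small area with twice its expected count versus
# a populous area with many cases but only slightly above expectation.
small_area = poisson_llr(cases_in=10, expected_in=5.0, total_cases=1000)
populous_area = poisson_llr(cases_in=200, expected_in=190.0, total_cases=1000)
print(f"small area: {small_area:.2f}, populous area: {populous_area:.2f}")
```

The small area scores higher even though it contains far fewer cases, because the score depends on the excess over expectation rather than on the raw count.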
Selecting Candidate Clusters
[Figure: surveillance region with points marked x]

Searching for Spatial Clustering
[Figure: region A, with centroids of data collection regions marked x]
• Form cylinders: bases are circles about each centroid in region A, height is time
• Calculate statistic for event count in each cylinder relative to entire region, within space & time limits
• Most significant clusters: regions whose centroids form base of cylinder with maximum statistic
• But how unusual is it? Repeat procedure with Monte Carlo runs, compare max statistic to maxima of each of these

Scan Statistic Demo

Scan Statistics: Advantages
• Gives monitor guidance for cluster size, location, significance
• Avoids preselection bias regarding cluster size or location
• Significance testing has control for multiple testing
• Can tailor problem design by data, objective:
  – Location (zipcode, hospital/provider site, patient/customer residence, school/store address)
  – Time windows used (cases, history, guardband)
  – Background estimation method: model, history, population, eligible customers

Surveillance Application
OTC Anti-flu Sales, Dates: 15-24Apr2002
Total sales as of 25Apr: 1804
Potential cluster: center at 22311
• 63 sales, 39 expected from recent data
• rel. risk = 1.6
• p = 0.041

Distribution of Nonsyndromic Visits
4 San Diego Hospitals
[Figure: distribution of nonsyndromic visits; x-axis: Days]

Effect of Data Discontinuities on OTC Cough/Cold Clusters
[Figure: cluster groups by zip; axis: Zip (S to N)]
• Before removing problem zips, cluster groups are dominated by zips that “turn on” after sustained periods of zero or abnormally low counts.
• After editing, more interesting cluster groups emerge.
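The search and Monte Carlo steps on the “Searching for Spatial Clustering” slide can be sketched as follows. This is a simplified illustration under stated assumptions, not the method used to produce the results above: it is purely spatial (circles only, no time dimension), it assumes a Poisson log-likelihood ratio as the statistic, and it simulates the null hypothesis by redistributing the observed total in proportion to the expected counts. All names and the 3 x 3 grid of counts are hypothetical.

```python
import math
import random

def poisson_llr(c, e, total):
    """Assumed Poisson log-likelihood ratio; 0.0 when there is no excess."""
    if c <= e:
        return 0.0
    llr = c * math.log(c / e)
    if total > c:
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr

def candidate_clusters(centroids, max_radius):
    """Sets of region indices inside growing circles about each centroid."""
    clusters = set()
    for cx, cy in centroids:
        by_distance = sorted(
            (math.hypot(x - cx, y - cy), i) for i, (x, y) in enumerate(centroids)
        )
        members = []
        for dist, i in by_distance:
            if dist > max_radius:
                break
            members.append(i)
            clusters.add(frozenset(members))
    return clusters

def max_scan_statistic(counts, expected, clusters):
    """Largest statistic over all candidate clusters, and the cluster itself."""
    total = sum(counts)
    best, best_cluster = 0.0, None
    for cluster in clusters:
        c = sum(counts[i] for i in cluster)
        e = sum(expected[i] for i in cluster)
        if e <= 0 or e >= total:  # skip empty or whole-region candidates
            continue
        stat = poisson_llr(c, e, total)
        if stat > best:
            best, best_cluster = stat, cluster
    return best, best_cluster

def scan_p_value(counts, expected, centroids, max_radius, n_sim=499, seed=0):
    """Compare the observed maximum to maxima from Monte Carlo replicates."""
    rng = random.Random(seed)
    clusters = candidate_clusters(centroids, max_radius)
    observed, cluster = max_scan_statistic(counts, expected, clusters)
    total, n = sum(counts), len(counts)
    exceed = 0
    for _ in range(n_sim):
        sim = [0] * n
        for i in rng.choices(range(n), weights=expected, k=total):
            sim[i] += 1
        if max_scan_statistic(sim, expected, clusters)[0] >= observed:
            exceed += 1
    return observed, cluster, (exceed + 1) / (n_sim + 1)

# Hypothetical 3 x 3 grid: 10 expected cases per region, excess in region 8.
centroids = [(x, y) for y in range(3) for x in range(3)]
expected = [10.0] * 9
counts = [8, 9, 8, 9, 8, 9, 8, 9, 22]
observed_stat, best_cluster, p_value = scan_p_value(
    counts, expected, centroids, max_radius=1.0
)
print(sorted(best_cluster), round(observed_stat, 2), p_value)
```

Because each replicate contributes only its maximum over all candidate circles, the resulting p-value accounts for the many overlapping clusters examined, which is the multiple-testing control listed among the advantages.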
School Nurse Data: All Visits
[Figure annotation: unreported]

Cluster Investigation by Record Inspection
Records Corresponding to a Respiratory Cluster

Backups

Cumulative Summation Approach (CUSUM)
• Widely adapted to disease surveillance
• Devised for prompt detection of small shifts
• Look for changes of 2k standard deviations from the mean μ (often k = 0.5)
• Take normalized deviation: often Z_t = (x_t − μ) / σ
• Compare lower, upper sums to threshold h:
    S_{H,t} = max(0, (Z_t − k) + S_{H,t-1})
    S_{L,t} = max(0, (−Z_t − k) + S_{L,t-1})
• Phase I sets μ, σ, h, k
[Figure: ER Respiratory Claim Data; number of cases by date, 12/30 to 2/28 (2000-2001); series: Data, Smoothed; markers: S_H > 1, S_L > 1]
Upper sum: keep adding differences between today’s count and k standard deviations above the mean. Alert when the sum exceeds threshold h.

CuSum Example: CDC EARS Methods C1-C3
Three adaptive methods chosen by National Center for Infectious Diseases after 9/1/2001 as most consistent
• Look for aberrations representing increases, not decreases
• Fixed mean, variance replaced by values from sliding baseline (usually 7 days)
[Figure: timeline from Day-9 to Day 0 (current count)]
• Baseline for C1-MILD: days -1 to -7
• Baseline for C2-MEDIUM: days -3 to -9
• Baseline for C3-ULTRA: days -3 to -9

Calculation for C1-C3:
Individual day statistic for day j with lag n:
    S_{j,n} = max{0, (Count_j − [μ_n + σ_n]) / σ_n},
where μ_n is the 7-day average with n-day lag (so μ_3 is the mean of counts on days j-3 through j-9), and σ_n is the standard deviation of the same 7-day window.
• C1 statistic for day k is S_{k,1} (no lag)
• C2 statistic for day k is S_{k,3} (2-day lag)
• C3 statistic for day k is S_{k,3} + S_{k-1,3} + S_{k-2,3}, where S_{k-1,3} and S_{k-2,3} are added only if they do not exceed the threshold
Upper bound threshold of 2: equivalent to 3 standard deviations above the mean

Detailed Example, I
Fewer alerts AND more sensitive: why?

Detailed Example, II
Signal detected only with 28-day baseline

Detailed Example, III
“the rest of the story”
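The recursions on the CUSUM and C1-C3 backup slides can be sketched as follows. This is an illustration under stated assumptions, not the EARS implementation: the slides do not say whether the standard deviation is the sample or population version (the sample version is assumed here), they do not say what happens when the baseline has zero variance (any count above the baseline mean is treated as an unbounded excess here), and all function names and data are hypothetical.

```python
from statistics import mean, stdev

def cusum(xs, mu, sigma, k, h):
    """Two-sided CUSUM on Z_t = (x_t - mu) / sigma.

    Returns one (upper_sum, lower_sum, alert) tuple per observation.
    """
    upper = lower = 0.0
    results = []
    for x in xs:
        z = (x - mu) / sigma
        upper = max(0.0, (z - k) + upper)
        lower = max(0.0, (-z - k) + lower)
        results.append((upper, lower, upper > h or lower > h))
    return results

def day_statistic(counts, j, lag):
    """S_{j,n}: excess of counts[j] over the 7-day baseline ending `lag` days back."""
    start = j - lag - 6
    if start < 0:
        raise ValueError("not enough history for a 7-day baseline")
    window = counts[start : j - lag + 1]  # days j-lag-6 through j-lag
    mu, sigma = mean(window), stdev(window)  # sample std dev: an assumption
    if sigma == 0:
        # Assumed handling of a flat baseline.
        return float("inf") if counts[j] > mu else 0.0
    return max(0.0, (counts[j] - (mu + sigma)) / sigma)

def c1(counts, k):
    return day_statistic(counts, k, lag=1)

def c2(counts, k):
    return day_statistic(counts, k, lag=3)

def c3(counts, k, threshold=2.0):
    """C2 for day k plus the two prior days' values that did not exceed the threshold."""
    total = c2(counts, k)
    for prior_day in (k - 1, k - 2):
        s = c2(counts, prior_day)
        if s <= threshold:
            total += s
    return total

# Hypothetical daily counts with a jump on the last day (index 11).
counts = [10, 12, 11, 9, 10, 12, 11, 10, 9, 11, 10, 25]
print(c1(counts, 11), c2(counts, 11), c3(counts, 11))

# Hypothetical standardized series: a sustained shift of 2 starting at index 3.
alerts = [alert for _, _, alert in cusum([0, 0, 0, 2, 2, 2], mu=0, sigma=1, k=0.5, h=2)]
print(alerts.index(True))
```

In this sketch a statistic of 2 corresponds to a count of μ + 3σ, matching the slide's note on the threshold. The CUSUM example alerts on the second day of the shift rather than the first, since the upper sum needs two days of accumulated excess to pass h.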