A pilot study - Liza Heslop, Victoria University (PowerPoint presentation)

Developing an adverse event prediction system: A neural network and Bayesian pilot study
Associate Professor Liza Heslop &
Mahdi Bazargani
Acknowledgements: Dean Athan and Gitesh
Raikundalia
May 14th 2013
vu.edu.au
Three-year study with three stages
• Stage One (pilot study): develop a structured neural network based on first-day admission case-mix indicators, to discover the most sensitive indicators that impact on AEs and to refine the neural network threshold values - Neural Network and Bayesian approaches
• Stage Two: daily aggregate adverse events based on daily hospital workload indicators (DHWI) - a Bayesian approach
• Stage Three: discovering the relationship of common comorbidity indices with patients' main CHADx adverse event categories - a Bayesian approach
Surgeons blame pressure from
management for poor safety at
Lincolnshire trust
BMJ 2013;346:f1094
Fourteen hospital trusts are to be
investigated for higher than expected
mortality rates
BMJ 2013;346:f960
“It has been estimated that across the 14 hospitals around
6000 more patients died than expected, with mortality rates
20% higher” BMJ 2013;346:f960
How has current research developed
understandings of hospital-based
workload intensity?
Nurse workforce (measured as nurse overtime working hours) and
nurse-sensitive patient outcome indicators are positively correlated (Liu et
al. 2012)
Nurse staffing (fewer RNs), increased workload, and unstable nursing
unit environments were linked to negative patient outcomes including
falls and medication errors on medical/surgical units in a mixed method
study combining longitudinal data (5 years) and primary data collection
(Duffield et al. 2012)
Workload levels and sources of stressors can vary across different
professional groups (Mazur et al. 2012)
Current measures/variables of workload
intensity
Measure (Source)
• Patient transfers (Blay et al. 2012)
• Composition based on the Clinical Classification System, complications identified by patient safety indicators, and in-hospital mortality (Studnicki et al. 2011)
• Workload of inpatient doctors, measured as the “difficulty of the tasks they perform while admitting patients” (Lamba et al. 2012)
• Nurse staffing levels, hours of nursing care per patient day (HPPD) (numerous)
• Volume measures: total census (the midnight census); number of surgeries (the total number of scheduled and unscheduled surgeries performed that day); add-ons (the number of unscheduled surgeries); percentage add-ons (the number of add-ons as a percentage of the total surgeries performed); and behavioural health admissions (Pedroja 2008)
• Job satisfaction of doctors, measured in survey as ‘workload’ (Khuwaja et al. 2004)
• Subjective measure of doctors’ workflow interruptions (Weigl et al. 2011)
• Nurses’ perceived last-shift patient workload (Kalisch et al. 2011)
• Generic data but need for additional data: time that nurses are off the unit (for code blue response, patient transfers, and accompanying patients for tests); internal transfers/bed moves to accommodate patient-specific issues, particularly infection control issues; and deaths (Fram et al. 2012)
Workforce intensity measures - no common standard
• A range of internal and external research instruments has been used, such as audit, subjective survey responses, and administrative and clinical data records
• Very few studies have sourced coded episode-based hospital administrative data (HAD)
• It is necessary to accurately measure workload: it is a factor that impacts upon the safety and quality of health care, and a useful measure to validate in the prediction model
Objectives of the first stage pilot
study
• Develop a structured neural network based on first-day admission case-mix indicators
• Discover the sensitivity of each input and controlling indicator toward occurrences of AEs
• Know the most sensitive indicators that impact on AEs
• Establish neural network threshold values
• Compare two main machine learning algorithms - Neural Networks (NN) and the Naive Bayes Classifier (NBC)
Methodological objective: Develop a
complex computational relational model
Machine learning methodologies - Neural Network (NN) and Naive Bayes Classifier (NBC). Both contribute in different ways: while the NN has a complex structure, it is suitable for establishing the relational model, which is based on dependent and inter-correlated indicators.
The NBC was employed with a pre-optimization algorithm and trained with independent indicators.
The accuracy of the two methodologies (NN and NBC) is compared using ‘confusion matrices’ and the rates of true positive and true negative AEs.
Sensitivity analyses are reported for the NN model that is finally established on all incorporated indicators.
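The confusion-matrix comparison described above can be sketched as follows. This is a minimal illustration on synthetic data, with a scikit-learn Gaussian naive Bayes model standing in for the study's optimized NBC; the features and outcome definition are hypothetical, not the study's dataset.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))             # 5 hypothetical input indicators
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # 1 = episode contains an AE

nbc = GaussianNB().fit(X, y)
pred = nbc.predict(X)

# Confusion matrix for binary labels: ravel() yields tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
sensitivity = tp / (tp + fn)   # true-positive rate for AEs
specificity = tn / (tn + fp)   # true-negative rate
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

Sensitivity here is the true-positive AE rate and specificity the true-negative rate: the two quantities the study uses to compare the classifiers.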
Generalized Feed Forward Multilayer Perceptron
Neural Network (input, hidden and output layers)
The hidden layer consists of four processing elements (PEs) using the TanhAxon transfer function. Weights are updated by back-propagation with the Momentum rule (momentum = 0.7, step size = 0.1) and batch learning, which improves the speed of training.
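As a rough sketch of a comparable architecture (not the study's implementation), scikit-learn's MLPClassifier can approximate the setup described above: one hidden layer of four units, a tanh transfer function standing in for TanhAxon, SGD with momentum 0.7 and step size 0.1, and full-batch learning. The data here are synthetic.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 8))            # 8 hypothetical input indicators
y = (X.sum(axis=1) > 0).astype(int)      # toy binary AE outcome

# One hidden layer of four processing elements with tanh activation;
# SGD with momentum (0.7) and step size (0.1); batch_size=len(X)
# gives full-batch learning as described on the slide.
net = MLPClassifier(hidden_layer_sizes=(4,), activation="tanh",
                    solver="sgd", momentum=0.7, learning_rate_init=0.1,
                    batch_size=len(X), max_iter=2000, random_state=1)
net.fit(X, y)
print(f"training accuracy: {net.score(X, y):.2f}")
```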
Neural Network is employed for
developing a prediction model
based on dependent and intercorrelated input indicators
There are many structures for a neural network and many methods for training them. For the neural network method, this study employed a Generalized Feed Forward Multilayer Perceptron with three layers: input, hidden and output. This simple structure is suitable for the current coded-episode static dataset, in the absence of any time-series objective for the prediction of AEs.
The input layer is composed of independent variables based on first-day-of-admission information: Table 1, DHVIs; Table 2, patient demographic information and patients’ diagnosis and episode characteristics (used as controlling input indicators); and a numeric score derived from comorbidity indices.
Conceptual design for building
the relational model
Inputs:
• Daily Hospital Volume Indicators (DHVIs)
• Patient demographic information; patient’s diagnosis and episode characteristics; comorbidity indices
Output:
• Likelihood of Patient Adverse Events (CHADx)
Identify daily hospital volume
indicators
Daily hospital volume indicators (DHVIs) are measures of work intensity with the capability for extraction from a coded Australian episode dataset.
Table 1. Daily Hospital Volume Indicators (DHVI) employed as independent or input variables
1. Number of admissions: daily number of admissions on the patient’s admitted day
2. Number of discharges: daily number of discharges on the patient’s admitted day
3. Number of emergency admissions: daily number of admissions where the admission type is Casualty (A&E)
4. Percentage of emergency admissions: the percentage of daily emergency admissions calculated from all admissions
5. Number of surgeries: calculated from assigned DRG codes of type ‘surgical’
6. Number of mid-point surgeries: surgical type calculated from the mid-point of admission and discharge dates
7. Number of patients each day: the number of patients in the hospital each day
8. Number of deaths: extracted from discharge information within the episode dataset
9. Number of Adverse Events: extracted using CHADx business rules on all episodes of care, assigning the time of possible AEs at the mid-point date of hospitalization
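A few of the Table 1 indicators could be derived from a coded episode dataset along these lines. This is a sketch only: the column names and the tiny example data are hypothetical, not the study's schema.

```python
import pandas as pd

# Hypothetical coded episode dataset; column names are illustrative.
episodes = pd.DataFrame({
    "admit_date": pd.to_datetime(["2001-03-01", "2001-03-01", "2001-03-02"]),
    "discharge_date": pd.to_datetime(["2001-03-03", "2001-03-02", "2001-03-05"]),
    "admission_type": ["Casualty (A&E)", "Waiting List", "Casualty (A&E)"],
})

daily = pd.DataFrame({
    # Indicator 1: daily number of admissions
    "n_admissions": episodes.groupby("admit_date").size(),
    # Indicator 3: admissions where the admission type is Casualty (A&E)
    "n_emergency": episodes[episodes["admission_type"] == "Casualty (A&E)"]
                   .groupby("admit_date").size(),
}).fillna(0)

# Indicator 4: percentage of emergency admissions
daily["pct_emergency"] = 100 * daily["n_emergency"] / daily["n_admissions"]
print(daily)
```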
A defined set of controlling indicators
Patient demographic information and patients’ diagnosis and episode characteristics: these indicators are itemised in Table 2.
Table 2. Patient demographic information and patients’ diagnosis and episode characteristics (used as controlling input indicators)

Category: patient demographic information
1. Age: age of patient, extracted from the episode dataset
2. Sex: sex of patient, extracted from the episode dataset
3. Admission type: the type of admission, defined as: 1 - Casualty (A&E); 2 - Waiting List; 3 - Qualified/Unqualified New-born; 4 - Transfers (Other Acute Hospital, External Care, Rehabilitation); 5 - Change from Psychiatric Unit or Psychogeriatric; 6 - Other, including referrals from a Local Medical Officer (LMO) etc.

Category: patient’s diagnosis and episode characteristics (corresponding LOS scores for each category were obtained from the coded episode dataset)
4. Primary procedure: corresponding patient primary procedure LOS score, extracted from the National Hospital Cost Data Collection (NHCDC, 2001)
5. Secondary procedure: corresponding patient secondary procedure LOS score, extracted from the NHCDC (2001)
6. Primary diagnosis: corresponding patient primary diagnosis LOS score, extracted from the NHCDC (2001)
7. Secondary diagnosis: corresponding patient secondary diagnosis LOS score, extracted from the NHCDC (2001)
Determine each patient’s
corresponding LOS scores related
to patient specific characteristics
Primary procedure
Secondary procedure
Primary diagnosis
Secondary diagnosis
LOS scores from the NHCDC (2001) were assigned for each primary and secondary diagnosis and procedure, for each patient in the coded episode dataset.
Comorbidity classification indices used
for obtaining comorbidity LOS scores
• Charlson Comorbidity Index (CCI) (Deyo et al. 1992)
• Elixhauser Index (Elixhauser et al. 1998)
• Disease count (Stineman et al. 1998)
• Shwartz Index (Shwartz et al. 1996)
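A comorbidity index of this kind reduces a patient's recorded conditions to a single numeric score. The sketch below shows the idea with a Charlson-style weighted sum; the handful of condition names and weights is an illustrative subset only, whereas the real Deyo mapping covers the full ICD code set.

```python
# Toy sketch of a Charlson-style comorbidity score. Only a few
# illustrative condition weights are shown; the real Deyo mapping
# assigns weights to conditions identified from full ICD code lists.
CHARLSON_WEIGHTS = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "diabetes_with_complications": 2,
    "moderate_severe_liver_disease": 3,
    "metastatic_solid_tumour": 6,
}

def charlson_score(conditions):
    """Sum the weights of the patient's recorded comorbid conditions."""
    return sum(CHARLSON_WEIGHTS.get(c, 0) for c in conditions)

print(charlson_score(["myocardial_infarction", "diabetes_with_complications"]))
# → 3
```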
Dealing with a coded dataset without an ‘onset flag’
This study used a coded dataset that did not have an onset flag. Onset flags were introduced in 2008, when hospital-acquired conditions (HAC) began to be flagged in the codes. A difficulty of this dataset was therefore the non-existence of an indicator (represented by an onset flag) on each secondary diagnosis to identify it as a comorbidity or a complication. An additional step was necessary to identify the possible comorbidities from the coded episode dataset.
Identify an operational definition for AEs - output from the Neural Network
Each patient episode of care is identified as containing an AE if it satisfies any of the CHADx major categories’ business rules.
According to Utz et al. (2012): “The CHADx offers a comprehensive classification of hospital-acquired conditions available for use with ICD-10-AM. The CHADx was developed as a tool for use within hospitals, allowing hospitals to monitor (assuming constant casemix) and reduce hospital-acquired illness and injury. Within Queensland in 2010/2011, 9.0% of all admissions included at least one hospital-acquired condition (as defined by the CHADx)”.
Results: building of the relational
model (training and validation
components)
The relational model will ascertain relationships between all inter-correlated (dependent) input and controlling indicators toward the output variable, AEs. To some extent these variables are inter-correlated; for example, emergency admissions are correlated with the number of admissions.
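This kind of inter-correlation is easy to demonstrate on synthetic data: when emergency admissions are a noisy fraction of total admissions, the two DHVIs are strongly positively correlated. The numbers below are illustrative, not the study's.

```python
import numpy as np

rng = np.random.default_rng(2)
admissions = rng.poisson(30, size=365)       # synthetic daily admissions
# Emergency admissions drawn as a noisy 40% fraction of each day's total
emergency = rng.binomial(admissions, 0.4)

# Pearson correlation between the two inter-correlated DHVIs
r = np.corrcoef(admissions, emergency)[0, 1]
print(f"r = {r:.2f}")
```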
Table 3. Ordered sensitivity of input indicators (standard deviation) with coefficient = 1 SD and step size = 1000

Indicator Name (Indicator Type): Sensitivity Value (SD)
Secondary Diagnosis LOS (Patient Diagnoses Types): 0.09454
Prime Procedure LOS (Patient Diagnoses Types): 0.06732
Number of Adverse Events (DHVI): 0.05526
Number of Emergency Admissions (DHVI): 0.04406
Elixhauser Index (Comorbidity Index): 0.04397
Sex (Patient Demographic Characteristics): 0.04148
Charlson Index (Comorbidity Index): 0.03726
Mid-Point Number of Surgeries (DHVI): 0.03074
Age (Patient Demographic Characteristics): 0.02915
Number of Surgeries (DHVI): 0.02790
Secondary Procedure LOS (Patient Diagnoses Types): 0.02713
Primary Diagnosis (Patient Diagnoses Types): 0.02649
Number of Discharges (DHVI): 0.02557
Percentage of Emergency Admissions (DHVI): 0.02382
Shwartz Index (Comorbidity Index): 0.01897
Number of Admissions (DHVI): 0.01866
Admission Source (Patient Demographic Characteristics): 0.01784
Number of Patients Each Day (DHVI): 0.01456
Number of Deaths (DHVI): 0.01312
Disease Count Index (Comorbidity Index): 0.00706
Discussion on Table 3
• Most of the DHVIs have small sensitivity toward the output
• The number of adverse events and the number of emergency admissions on the date of admission have the most sensitivity toward AE occurrences
• Among patient diagnoses indicators, all show strong sensitivity toward the likelihood of adverse events, with Secondary Diagnosis LOS having the most effect of all input indicators employed in this pilot study
• Among comorbidity indices, Elixhauser and Charlson show rather strong sensitivity values
• Sex and Age have the highest sensitivities among demographic characteristic indicators
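The sensitivity values in Table 3 come from a perturbation-style analysis: one input is varied across ±1 SD around its mean (the coefficient) over many steps while the other inputs are held at their means, and the standard deviation of the model output is recorded. Below is a minimal sketch of this procedure with a toy model standing in for the trained network; the function and data are illustrative assumptions, not the study's code.

```python
import numpy as np

def sensitivity(model, X, idx, coefficient=1.0, steps=1000):
    """Vary input `idx` across +/- coefficient*SD around its mean while
    holding other inputs at their means; return SD of the model output."""
    means, sds = X.mean(axis=0), X.std(axis=0)
    grid = np.linspace(means[idx] - coefficient * sds[idx],
                       means[idx] + coefficient * sds[idx], steps)
    probe = np.tile(means, (steps, 1))   # all inputs fixed at their means
    probe[:, idx] = grid                 # except the one being varied
    return model(probe).std()

# Toy model: output depends strongly on input 0, weakly on input 1
model = lambda X: 2.0 * X[:, 0] + 0.1 * X[:, 1]
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 2))
print(sensitivity(model, X, 0), sensitivity(model, X, 1))
```

The more influential input yields the larger output spread, which is how the indicators in Table 3 are ranked.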
Table 4. Accuracy of the patient AE classification system using the Neural Network and naive Bayes with different threshold values

Classifier: Accuracy (%) / Sensitivity (%) / Specificity (%)
NN (threshold = 0.55): 65.83 / 38.09 / 93.57
NN (threshold = 0.50): 67.75 / 42.85 / 92.66
NN (threshold = 0.45): 71.33 / 50.00 / 92.66
NN (threshold = 0.40): 72.79 / 54.76 / 90.82
NN (threshold = 0.35): 74.25 / 59.52 / 88.99
NN (threshold = 0.30): 75.71 / 64.28 / 87.15
NN (threshold = 0.25): 73.23 / 66.66 / 79.81
NN (threshold = 0.20): 74.60 / 78.57 / 70.64
NN (threshold = 0.15): 70.00 / 85.71 / 52.29
Enhanced NBC: 64.1 / 33.6 / 94.6
Discussion on Table 4
The Neural Network achieves higher overall accuracy than the optimized NBC at every threshold tested. As the goal of this prediction is a high true-positive rate for AEs (sensitivity), the thresholds 0.15 (sensitivity 85%) and 0.20 (sensitivity 78%) were selected, the latter achieving higher overall accuracy (74% versus 70%). The choice of threshold may also depend on the problem specification and the application of the prediction model. The NBC, by contrast, achieved lower overall accuracy (64%) and a low sensitivity (33%).
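The threshold trade-off in Table 4 can be reproduced in miniature: lowering the decision threshold converts more borderline episodes into predicted AEs, raising sensitivity at the cost of specificity. The scores and outcomes below are synthetic, not the study's data.

```python
import numpy as np

def threshold_metrics(probs, y, threshold):
    """Accuracy, sensitivity (TP rate) and specificity (TN rate) for
    predictions made by thresholding AE probabilities."""
    pred = (probs >= threshold).astype(int)
    tp = np.sum((pred == 1) & (y == 1))
    tn = np.sum((pred == 0) & (y == 0))
    fn = np.sum((pred == 0) & (y == 1))
    fp = np.sum((pred == 1) & (y == 0))
    return ((tp + tn) / len(y), tp / (tp + fn), tn / (tn + fp))

rng = np.random.default_rng(4)
y = rng.integers(0, 2, size=400)
# Synthetic network outputs: AE episodes score higher on average
probs = np.clip(0.35 + 0.3 * y + rng.normal(0, 0.2, size=400), 0, 1)

for t in (0.55, 0.40, 0.20):
    acc, sens, spec = threshold_metrics(probs, y, t)
    print(f"threshold={t:.2f} acc={acc:.2f} sens={sens:.2f} spec={spec:.2f}")
```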
Summary of key findings
• A Neural Network and an NBC trained on the fewest indicators that achieve the highest accuracy
• Ordering of the sensitivity values
• Number of adverse events and number of emergency admissions on the date of admission showed the most sensitivity among the DHVIs
• Elixhauser and Charlson indices showed the most sensitivity among the comorbidity indices
• Sex and Age showed the most sensitivity among patient characteristics toward occurrences of an AE
• Results show the superiority of the Neural Network, with an overall accuracy of 74% (threshold = 0.2) versus 64% for the Naive Bayes Classifier
Lessons for the three stage study
Indicators are very sensitive to the current state of the trained neural network, and the results may differ if the network is trained with a different structure or if new indicators are employed.
Outcomes
A simply structured relational model and neural network that can generate complex computations, based on several weights for each node and several input and hidden nodes: a first step toward a relational model to predict AEs.
Various training iterations were conducted to find the highest accuracy on the validation dataset. This avoided over-training and over-fitting of the network on which the sensitivity analyses are based.
Sensitivity values for the independent indicators have been obtained.
Study limitations
Use of a coded episode data set without an onset flag
Inclusion of complicated steps to distinguish complications arising after admission
Results are not conclusive without further machine
computational processing
Implications of this pilot study for the
next stages of this research
The procedures to overcome the lack of an onset flag have been complex. The accuracy of identifying hospital-acquired conditions in the overall relational model will be improved in the main study.
The sensitivity results will help refine this pilot study when a larger dataset (including the onset flag) is used.
The DHVIs on the date of admission may be eliminated, as they do not show sufficient strength for AE prediction.
Comorbidity diseases and demographic characteristics, along with diagnosis types, are involved:
1. Age
2. Sex
3. Primary procedure
4. Secondary diagnosis
5. Elixhauser
Workload indicators did not appear among those giving the highest accuracy of this relational model, but this finding will require validation in the refinements to the pilot study. It may not support the many research findings which suggest that workload indicators are heavily associated with adverse events.
Next stage of research (continued)
To develop a case mix of input indicators (CMI) across all employed indicators, to reach the highest possible classification accuracy with the employed machine learning algorithm.
This CMI will hold the fewest indicators that achieve the highest classification accuracy.
To firmly establish which indicators to eliminate, because their inclusion does not improve the overall accuracy of the model.
Direction of change as a result of
the pilot study
Further testing of machines other than the Neural Network and Bayesian Network; consider an ensemble of REPTree, which may result in further accuracy.
There are different machines (e.g. Bayes, Neural Networks, Decision Trees, Logistic Regression) involved with different optimization algorithms (Greedy Search, Genetic Algorithm, Ensembles). The next stage will be to obtain the episode data indicators that yield the highest possible accuracy for each machine and each corresponding optimization algorithm.
Correlation (tipping-point or non-linear relationships) may be examined in stage three based on the average rate of DHWIs during all days of the patient's hospitalization, rather than the first day of admission. Modelling correlation types with Neural Networks is very complex, and they are suitable only for classification and prediction results; hence a Bayesian approach is recommended for this part of the study instead.
Development of a composite measure
of hospital workload intensity
A composite measure of hospital workload intensity may be valuable to policy and health service officials at many levels. The future outcome of a valid and reliable workload intensity composite measure will:
• Help clinicians define suitable workload standards for hospital organisations
• Help hospital organisational officials monitor their hospitals’ workload intensity, and possibly capacity
• Support health services researchers in standardizing measures of workload intensity for benchmarking
• Help examine relationships between practice-environment features (for example, as rated on measures of job satisfaction, turnover intentions and assessments of quality of care) and workload intensity in a systematic and standardized way
Development of a composite measure
of hospital workload intensity (cont’d)
Make better use of coded activity-based data to improve the effectiveness of
operational decision-making
For example, Pedroja (2008:36), who used composite indexes to measure hospital workload intensity, suggested:
“Through the identification of a set of indicators that predict stresses on the system,
leaders would have the ability to provide additional resources or system fixes that
would make the operation less vulnerable to health care error and patient harm”
Support national studies that may seek to develop a systemic picture of workload intensity. Most current studies of workload intensity use a range of proxy measures, in small-scale or localised studies, to measure the effort needed for inpatient medical and nursing work.
References
Australian Commission on Safety and Quality in Health Care. Classification of Hospital Acquired Diagnoses (CHADx), 2011.
Thomas JW, Guire KE, Horvat GG. Is patient length of stay related to quality of care? Hospital & Health Services Administration 1997;42(4):489-507.
CONTACT DETAILS
NAME Liza Heslop
DEPARTMENT Western Centre for Health Research and Education, Sunshine
Hospital.
PHONE +0407886201
EMAIL liza.heslop@vu.edu.au
www.vu.edu.au