KMM’2005, Kaiserslautern, Germany April 10-13, 2005
Knowledge Discovery in Microbiology Data:
Analysis of Antibiotic Resistance in
Nosocomial Infections
Mykola Pechenizkiy, Seppo Puuronen
Department of Computer Science
University of Jyväskylä
Finland
Alexey Tsymbal
Department of Computer Science
Trinity College Dublin
Ireland
Michael Shifrin, Irina Alexandrova
N.N.Burdenko Institute of Neurosurgery
Russian Academy of Medical Sciences,
Moscow, Russia
Contents
• Introduction:
  – Antibiotic Resistance in Nosocomial Infections
  – Knowledge Discovery in Databases
• Data Collection and Organization, Dataset’s characteristics
• Experimental results in this paper (pilot studies)
  – Association and classification rules
  – Classifiers
• Experimental results of our further studies (up-to-date)
  – Many-sided analysis
    • Basic classifiers
    • Feature selection
    • Clustering
  – Local dimensionality reduction within natural clusters:
    feature selection (FS) and feature extraction (FE)
    • Conventional PCA and class-conditional FE
    • Sequential FS
  – Tracking Concept Drift
    • 3 evaluation strategies
    • Dynamic integration of classifiers
• Conclusions and future work
Antibiotic Resistance in Nosocomial Infections
• 3-40% of patients admitted to hospital acquire an infection during their
  stay, and the risk of hospital-acquired (nosocomial) infection has risen
  steadily in recent decades.
• The frequency depends mostly on the type of operation performed: it is
  greater for “dirty” operations (10-40%) and smaller for “clean” operations
  (3-7%). For example, a serious infectious complication such as
  postoperative meningitis is often the result of nosocomial infection.
• Antibiotics are the drugs commonly used to fight infections caused by
  bacteria.
• According to statistics from the Centers for Disease Control and
  Prevention (CDC), more than 70% of the bacteria that cause hospital-acquired
  infections are resistant to at least one of the antibiotics most commonly
  used to treat infections.
• Analysis of the microbiological data included in antibiograms collected
  in different institutions over different periods of time is considered one
  of the most important activities to restrain the spread of antibiotic
  resistance and to avoid its negative consequences.
How antibiotics work
• Inhibition of nucleic acid synthesis
  – Rifampicin; Chloroquine
• Inhibition of protein synthesis
  – Tetracyclines; Chloramphenicol
• Action on cell membrane
  – Polyenes; Polymyxin
• Interference with enzyme system
  – Sulphamethoxazole
• Action on cell wall
  – Penicillin; Vancomycin
    • penicillin works by blocking the formation of peptide bonds in the
      bacterial cell wall and thereby weakens it, leaving the bacterium
      susceptible to osmotic lysis
Antibiotic sensitivity of different bacteria
• Comparing the antibiotic sensitivity of different bacteria
© Jim Deacon, Institute of Cell and Molecular Biology, The University of Edinburgh
The emergence of antibiotic resistance
Effects of different antibiotics on growth of a Bacillus strain. The right-hand image
shows a close-up of the novobiocin disk (marked by an arrow on the whole plate). In
this case some individual mutant cells in the bacterial population were resistant to the
antibiotic and have given rise to small colonies in the zone of inhibition.
© Jim Deacon, Institute of Cell and Molecular Biology, The University of Edinburgh
How Antibiotic Resistance Happens
• In spontaneous DNA mutation, bacterial DNA may mutate spontaneously.
  Drug-resistant tuberculosis arises this way.
• In a form of microbial sex called transformation, one bacterium may take
  up DNA from another bacterium. Penicillin-resistant gonorrhea results from
  transformation.
• Resistance can be acquired from a small circle of DNA called a plasmid,
  which can flit from one type of bacterium to another.
  – A single plasmid can provide a slew of different resistances.
  – In 1968, 12,500 people in Guatemala died in an epidemic of Shigella
    diarrhea. The microbe harbored a plasmid carrying resistances to four
    antibiotics!
• Horizontal Gene Transfer (© Grace Yim and Fan Sozzi)
Mechanisms of Antibiotic Resistance
© Grace Yim and Fan Sozzi
Antibiotic                                                 Method of resistance
Chloramphenicol                                            reduced uptake into cell
Tetracycline                                               active efflux from the cell
β-lactams, Erythromycin, Lincomycin                        eliminates or reduces binding of antibiotic to target
β-lactams, Erythromycin                                    hydrolysis
Aminoglycosides, Chloramphenicol, Fosfomycin, Lincomycin   inactivation of antibiotic by enzymatic modification
β-lactams, Fusidic Acid                                    sequestering of the antibiotic by protein binding
Sulfonamides, Trimethoprim                                 metabolic bypass of inhibited reaction
Sulfonamides, Trimethoprim                                 overproduction of antibiotic target (titration)
Bleomycin                                                  binding of specific immunity protein to antibiotic
Problem Formulation
• More global, e.g. for pharmaceutical companies
– Maintain a pool of effective drugs on the market
• Research, develop and test new antimicrobials
• Widespread misuse of antibiotics
• More local, e.g. for a hospital
– Maintain a pool of effective drugs in the hospital
• Monitoring and researching … (concept drift, seasons)
– Predict the sensitivity to a certain antibiotic for a certain
patient with a certain disease
• Various intelligent techniques including KM, KDD and
DM, ML, DSS etc.
Knowledge Discovery in Databases
• Knowledge discovery in databases (KDD) is a combination of data
  warehousing, decision support, and data mining that indicates an
  innovative approach to information and knowledge management.
• KDD is an emerging area that considers the process of finding
  previously unknown and potentially interesting patterns and
  relations in large databases.
• We apply KDD techniques to a selected part of a real clinical
  database to evaluate the possibilities of revealing interesting
  patterns of antibiotic resistance.
The Knowledge Management Process
[Diagram labels: Knowledge Creation & Acquisition; Knowledge Organization &
Storage; Knowledge Distribution & Integration; Knowledge Adaptation &
Application; Knowledge Evaluation, Validation and Refinement; KDD]
The Task of Classification
J classes, n training observations, p features
Given n training instances (xi, yi), where xi are the values of the
attributes and yi is the class.
Goal: given a new x0, predict its class y0.
[Diagram: training set and new instance to be classified -> CLASSIFICATION ->
class membership of the new instance]
Examples:
- prognostics of recurrence of breast cancer;
- diagnosis of thyroid diseases;
- antibiotic resistance prediction
• Predicting Antibiotic Resistance
  – predict the sensitivity of a pathogen to an antibiotic based on
    data about the antibiotic, the isolated pathogen, and the
    demographic and clinical features of the patient.
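The classification task can be illustrated with a minimal nearest-neighbour sketch. It is not the authors' implementation: the feature tuples, the names `pathogen_A`, `antibiotic_X` and so on, and the Hamming distance over nominal attributes are assumptions made purely for illustration.

```python
from collections import Counter

def hamming(a, b):
    """Number of attribute positions at which two instances differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train, x0, k=3):
    """Predict the class of x0 by majority vote among its k nearest training instances."""
    neighbours = sorted(train, key=lambda pair: hamming(pair[0], x0))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (pathogen, antibiotic, isolated_in_ICU) -> sensitivity class
train = [
    (("pathogen_A", "antibiotic_X", False), "S"),
    (("pathogen_A", "antibiotic_Y", True), "R"),
    (("pathogen_B", "antibiotic_Y", False), "S"),
    (("pathogen_B", "antibiotic_X", True), "S"),
    (("pathogen_A", "antibiotic_Y", False), "R"),
]

x0 = ("pathogen_A", "antibiotic_Y", True)   # new instance to be classified
y0 = knn_predict(train, x0, k=3)            # predicted class membership
print(y0)
```

Here the two closest training instances share the pathogen and the antibiotic with x0, so their class dominates the vote.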
Data Collection
• N.N. Burdenko Institute of Neurosurgery
• Bacterial analyzer “Vitek-60” (developed by “bioMérieux”)
• Information Systems
– "Microbiologist" (developed by the Medical Informatics Lab of the
institute)
– "Microbe" (developed by Russian company "MedProject-3").
• Each instance of the data used in analysis represents one
sensitivity test and contains the following features:
– pathogen that is isolated during the bacterial identification
analysis,
– antibiotic that is used in the sensitivity test
– the result of the sensitivity test itself (sensitive, resistant or
intermediate), obtained from “Vitek” according to the guidelines of the
National Committee for Clinical Laboratory Standards (NCCLS).
Data Organization
• The information about sensitivity analysis is
linked to the patient, his or her demographic
data (sex, age) and hospitalization in the Institute
(main department, days spent in ICU, days spent in
the hospital before the test, etc.).
• Each instance of a microbiological test in the database
corresponds to a single specimen (cerebrospinal fluid).
• Pilot exploratory analysis: 1423 sensitivity tests
including the meningitis cases of the year 2002.
Dataset’s characteristics
Feature                                            Values

Patient and hospitalization related
Sex                                                {Male, Female}
Age                                                Integer
Recurring stay                                     {True, False}
Days of stay in NSI                                Integer
Days of stay in ICU                                Integer
Days of stay in NSI before specimen was received   Integer
Bacterium is isolated when patient is in ICU       {True, False}
Main department                                    {1, …, 10}
Department of stay (departments + ICU)             {1, …, 11}

Pathogen and pathogen groups
Pathogen name                                      {Pat_name1, …, Pat_name17}
Gram(+/-)                                          {True, False}
Staphylococcus                                     {True, False}
Enterococcus                                       {True, False}
Enterobacteria                                     {True, False}
Nonfermenters                                      {True, False}

Antibiotic and antibiotic groups
Antibiotic name                                    {Ant_name1, …, Ant_name39}
Group1                                             {True, False}
…                                                  …
Group15                                            {True, False}

sensitivity                                        {Sensitive, Intermediate, Resistant}
Grouping of Pathogens
Pathogen group    S1     S2     S3     total   total%
all               1967   217    2244   4430    100%
gram +            1089   43     1002   2134    48.2%
gram -            880    174    1242   2296    51.8%
staf_g+           1028   43     942    2013    45.4%
enterococ_g+      61     0      60     121     2.7%
enterobac_g-      237    60     486    783     17.7%
nonferm_g-        643    114    756    1513    34.2%
Grouping of Antibiotics
Ant_group     S1    S2    S3    total   total%
pen_ingib     144   29    148   321     7.2%
monobact      28    7     56    91      2.1%
glyco         179   0     2     181     4.1%
amino         309   34    346   689     15.6%
f_hinolon     212   39    261   512     11.6%
sulph         122   0     186   308     7.0%
macrolydes    38    3     124   165     3.7%
tetra         43    2     136   181     4.1%
rif           132   6     26    164     3.7%
nitrofuran    71    5     136   212     4.8%
lact_ses      167   1     14    182     4.1%
fuzid         89    2     20    111     2.5%
b_lactam      435   89    789   1313    29.6%
ceph          183   78    292   553     12.5%
c_penem       116   2     19    137     3.1%
pen           136   9     478   623     14.1%
Results of Pilot Studies
• On the whole set of features, nonparametric approaches such as the
  3-Nearest Neighbor (3NN) classifier resulted in better accuracy than
  parametric approaches such as Naïve Bayes.
• The classes of sensitive and resistant cases are balanced (47% and 48%
  respectively) and easier to predict. In contrast, there were very few
  instances of sensitivity tests where the pathogen's sensitivity was
  intermediate (5%), and it was difficult for classifiers to make good
  predictions for this group of instances. Some algorithms treated
  instances with intermediate (I) sensitivity as noise.
• Naïve Bayes could achieve much higher accuracy when FS is undertaken
  and the classification model is built on the selected subset of features.
• Feature ranking according to the Relief measure shows that most of the
  information is concentrated in the features related to antibiotics, much
  less in the features that describe the pathogen, and even less in the
  features that describe patient demographics and the hospitalization
  context.
Examples of Classification Rules
1: (7.2 < years_old <= 14.4) & (main_dept = 1) => pat_ab_sens = S (81/24)
2: (days_fefore_test < 16) & (main_dept = 2) => pat_ab_sens = S (47/7)
3: (pathogen_name = p_aeruginosa) & (recurring = FALSE) & (sex = M) &
(days_in_ICU < 21) => pat_ab_sens = S (82/14)
4: (antibiotic_name = vancomycin) => pat_ab_sens = S (44/1)
5: (antibiotic_name = tic_clav) & (pathogen_name = a_calc_baumannii) =>
pat_ab_sens = S (6/0)
The numbers in brackets denote the number of instances satisfying the left-hand side of the rule
(support) and the number of exceptions found for this rule.
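The (support/exceptions) notation can be made concrete with a short sketch. The helper names and the four toy instances are hypothetical; only the attribute names `antibiotic_name` and `pat_ab_sens` and the figures 44/1 of rule 4 come from the slide.

```python
def rule_stats(instances, condition, predicted_class, target="pat_ab_sens"):
    """Return (support, exceptions) for a rule 'condition => target = predicted_class'."""
    covered = [row for row in instances if condition(row)]
    exceptions = sum(1 for row in covered if row[target] != predicted_class)
    return len(covered), exceptions

def rule_accuracy(support, exceptions):
    """Fraction of covered instances for which the rule's conclusion holds."""
    if support == 0:
        raise ValueError("rule covers no instances")
    return (support - exceptions) / support

# Hypothetical toy instances, for illustration only
instances = [
    {"antibiotic_name": "vancomycin", "pat_ab_sens": "S"},
    {"antibiotic_name": "vancomycin", "pat_ab_sens": "S"},
    {"antibiotic_name": "vancomycin", "pat_ab_sens": "R"},
    {"antibiotic_name": "tic_clav", "pat_ab_sens": "R"},
]

support, exceptions = rule_stats(
    instances, lambda row: row["antibiotic_name"] == "vancomycin", "S"
)
print(support, exceptions)

# Rule 4 on the slide is reported as (44/1): 43 of the 44 covered instances agree.
rule4_accuracy = rule_accuracy(44, 1)
print(round(rule4_accuracy, 3))
```

Read this way, a rule with large support and few exceptions, such as rule 4, is both general and reliable, whereas rule 5 (6/0) has no exceptions but covers very few instances.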
Experimental Results of Further Studies
• Many-sided data analyses
  – Basic classifiers
    • Naïve Bayes (NB), Bayesian Network (BN),
    • three nearest neighbor classifiers (1NN, 3NN, and 15NN),
    • decision tree classifier (C4.5),
    • rule-based classifier JRip
  – Dimensionality reduction (local and global)
    • Feature Selection
      – Sequential search strategies
        » FFS, BFE, BiS
    • Feature Extraction
      – PCA, class-conditional parametric and nonparametric
  – Clustering
    • Natural clustering
    • Classical techniques like kMeans, EM
• Tracking Concept Drift
Classification Accuracies (1 of 2)
[Chart: classification accuracy (y-axis 0.650 to 0.850) of NB, BN, 1NN, 3NN,
15NN and C4.5; series: gram-, gram+, avg, global]
- 4430 sensitivity tests
- meningitis cases
- Jan 2002 - Jul 2004
Classification Accuracies (2 of 2)
[Chart: classification accuracy (y-axis 0.610 to 0.910) of NB, BN, 1NN, 3NN,
15NN and C4.5; series: ceph, c_penem, pen, avg, b_lactam(global)]
- 4430 sensitivity tests
- meningitis cases
- Jan 2002 - Jul 2004
Problem Representation
High vs. low quality representation spaces (RS) for concept
learning (Michalski, 1995)
Feature selection or transformation
• Features are often correlated (not independent from each
other)
– Feature selection techniques that just assign weights to
individual features are insensitive to interacting or
correlated features.
• That is why the transformation of the given representation
before weighting the features is often preferable.
• Data is often not homogeneous
– For some problems a feature subset may be useful in one
part of the instance space, and at the same time it may be
useless or even misleading in another part of it.
– Therefore, it may be difficult or even impossible to remove
irrelevant and/or redundant features from a data set and
leave only useful ones by means of feature selection.
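The point about interacting features can be illustrated with a toy XOR problem, which is an assumption chosen for clarity and not data from the study. Each feature taken alone has zero correlation with the class, so an individual weighting scheme would rank both as useless, while a transformed feature that combines them determines the class completely.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation, used here as a simple per-feature relevance weight."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Toy data: (feature_1, feature_2, class), where class = feature_1 XOR feature_2
data = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
f1 = [row[0] for row in data]
f2 = [row[1] for row in data]
y = [row[2] for row in data]

score_f1 = pearson(f1, y)   # weight of feature_1 taken individually
score_f2 = pearson(f2, y)   # weight of feature_2 taken individually

# Transforming the representation first exposes the interaction.
combined = [a ^ b for a, b in zip(f1, f2)]
score_combined = pearson(combined, y)

print(score_f1, score_f2, score_combined)
```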
Feature extraction process and eigenvalue-based approaches
• Feature extraction (FE) is a dimensionality reduction technique that
extracts a subset of new features from the original set by means of
some functional mapping, keeping as much information in the data as
possible (Fukunaga 1990).
• Conventional Principal Component Analysis (PCA) is one of the most
commonly used feature extraction techniques. It is based on
extracting the axes on which the data shows the highest variability
(Jolliffe 1986).
PCA has the following properties:
(1) it maximizes the variance of the extracted features;
(2) the extracted features are uncorrelated;
(3) it finds the best linear approximation in the mean-squares sense;
(4) it maximizes the information contained in the extracted features.
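Properties (1) and (2) can be checked numerically on a small example. The sketch below handles only the two-dimensional case, using the closed-form eigen-decomposition of the 2x2 covariance matrix; the sample points and function names are made up for illustration.

```python
import math

def pca_2d(points):
    """PCA for 2-D points via the closed-form eigen-decomposition of the covariance matrix."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centred = [(x - mean_x, y - mean_y) for x, y in points]
    var_x = sum(x * x for x, _ in centred) / (n - 1)
    var_y = sum(y * y for _, y in centred) / (n - 1)
    cov_xy = sum(x * y for x, y in centred) / (n - 1)
    # Eigenvalues of the symmetric matrix [[var_x, cov_xy], [cov_xy, var_y]]
    half_trace = (var_x + var_y) / 2
    radius = math.hypot((var_x - var_y) / 2, cov_xy)
    eigenvalues = (half_trace + radius, half_trace - radius)
    # Direction of the first principal axis; the second is orthogonal to it
    theta = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    pc1 = (math.cos(theta), math.sin(theta))
    pc2 = (-math.sin(theta), math.cos(theta))
    scores = [(x * pc1[0] + y * pc1[1], x * pc2[0] + y * pc2[1]) for x, y in centred]
    return eigenvalues, (pc1, pc2), scores

def sample_cov(us, vs):
    """Sample covariance of two equally long sequences."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    return sum((u - mu) * (v - mv) for u, v in zip(us, vs)) / (n - 1)

# Made-up, strongly correlated 2-D data
points = [(2.0, 1.9), (0.5, 0.7), (3.1, 3.0), (1.2, 1.0), (2.6, 2.9)]
eigenvalues, components, scores = pca_2d(points)
s1 = [s[0] for s in scores]
s2 = [s[1] for s in scores]

var_pc1 = sample_cov(s1, s1)   # equals the largest eigenvalue
var_pc2 = sample_cov(s2, s2)   # equals the smallest eigenvalue
cov_pcs = sample_cov(s1, s2)   # extracted features are uncorrelated
print(var_pc1, var_pc2, cov_pcs)
```

The variance along the first extracted feature equals the largest eigenvalue, and the covariance between the two extracted features vanishes up to rounding.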
The main problem of PCA in classification
PCA gives high weights to features with higher variabilities,
regardless of whether they are useful for classification or not.
[Figure: two scatter plots, a) and b), in the (x1, x2) plane with the principal
components PC(1) and PC(2)]
PCA for classification: a) effective work of PCA, b) the case where an
irrelevant principal component was chosen from the classification point of
view.
Data Heterogeneity
• A feature subset may be useful in one part of
the instance space, and at the same time it may
be useless or even misleading in another part
– Search for homogeneous regions
• Different clustering/partitioning techniques
– kMeans, EM
• Natural clusters
– use of contextual features for splitting
» features that are not useful for classification by themselves
but are useful in combination with other (context-sensitive)
features
Classification within natural clusters
[Chart: classification accuracy (y-axis 0.650 to 0.850) of NB, BN, 1NN, 3NN,
15NN and C4.5; series: gram-, gram+, avg, global]
- 4430 sensitivity tests
- meningitis cases
- Jan 2002 – Jul 2004
• Classification accuracies for two main pathogen clusters
Local DR within Natural Clusters
Comparison of local vs. global 7-NN accuracy results for the
a) whole data, b) ‘gram+‘ cluster and c) ‘gram–’ cluster
with and without applying FE (top part) and FS (bottom part).
(accepted to IEEE CBMS 2005)
Antibiotic Resistance as Concept Drift
• A difficult problem with learning in many real-world domains
is that the concept of interest may depend on some hidden
context, not given explicitly in the form of predictive
features.
• Changes in the hidden context can induce more or less
radical changes in the target concept, which is generally
known as concept drift.
• Even in the most strictly controlled environments, some
unexpected changes may happen and make it necessary to change
the model, e.g.
– due to failure and replacement of some medical equipment, or
– due to changes in personnel.
• An effective learner should be able to track such changes
and to quickly adapt to them
Types of Concept Drift (CD)
• Changes in hidden context may cause
– a change of the target concept (real concept drift);
– a change of the underlying data distribution.
• The need to change the current model due to
a change of the data distribution is called virtual
concept drift.
• Virtual concept drift and real concept drift
often occur together.
• Whether the drift is real, virtual, or both, the model
needs to be changed.
Tracking Concept Drift
[Chart: classification accuracy (y-axis 0.400 to 0.800) at quarterly time
points (Mar, Jun, Sep, Dec, repeated over consecutive years) for strategies
I, II and III; chart label: 1.12.01]
Three strategies were used:
(1) Train on every but the 1st chunk and test on the last chunk
(2) Train on the i-th chunk and test on the (i+1)-th chunk
(3) Train on the 1st chunk and test on every but the 1st chunk
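Strategies (2) and (3) can be sketched as lists of (train chunk, test chunk) index pairs. The majority-class learner and the four toy chunks below are assumptions used only to show how the accuracy changes as the class distribution drifts; they are not the classifiers or data of the study.

```python
from collections import Counter

def strategy_2(num_chunks):
    """Train on the i-th chunk, test on the (i+1)-th chunk (0-based indices)."""
    return [(i, i + 1) for i in range(num_chunks - 1)]

def strategy_3(num_chunks):
    """Train on the 1st chunk, test on every chunk but the 1st (0-based indices)."""
    return [(0, j) for j in range(1, num_chunks)]

def majority_class(labels):
    """Stand-in learner: always predicts the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]

def accuracy(predicted, labels):
    return sum(label == predicted for label in labels) / len(labels)

# Hypothetical chunks of sensitivity labels with a gradual shift from S to R
chunks = [
    ["S"] * 8 + ["R"] * 2,
    ["S"] * 6 + ["R"] * 4,
    ["S"] * 3 + ["R"] * 7,
    ["S"] * 2 + ["R"] * 8,
]

acc_2 = [accuracy(majority_class(chunks[tr]), chunks[te]) for tr, te in strategy_2(len(chunks))]
acc_3 = [accuracy(majority_class(chunks[tr]), chunks[te]) for tr, te in strategy_3(len(chunks))]
print(acc_2)
print(acc_3)
```

In this toy setting the model trained only on the first chunk degrades steadily, while retraining on the preceding chunk recovers once the new distribution has been seen.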
Approaches to Handling CD
• (1) instance selection;
– select instances relevant to the current concept
• generalizing from a window that moves over recently arrived
instances and uses the learnt concepts for prediction only in
the immediate future
• (2) instance weighting;
– according to their “age”, and their competence with regard
to the current concept
• instance weighting techniques handle CD worse than
analogous instance selection techniques due to overfitting
the data
• (3) ensemble learning
– next slide
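Instance selection with a moving window, approach (1), can be sketched as follows. The window size, the majority-class predictor and the toy label stream are illustrative assumptions; the sketch compares a window of the three most recent instances with a learner that keeps the whole history.

```python
from collections import Counter, deque

def prequential_predictions(stream, window_size=None):
    """Predict each label from the current window, then add the true label to the window.

    window_size=None keeps the whole history (no instance selection).
    """
    window = deque(maxlen=window_size)
    predictions = []
    for label in stream:
        if window:
            predictions.append(Counter(window).most_common(1)[0][0])
        else:
            predictions.append(None)   # nothing learnt yet
        window.append(label)           # oldest instance drops out when the window is full
    return predictions

# Hypothetical label stream with an abrupt drift from S to R after five instances
stream = ["S"] * 5 + ["R"] * 5

windowed = prequential_predictions(stream, window_size=3)
full_history = prequential_predictions(stream, window_size=None)

hits_windowed = sum(p == t for p, t in zip(windowed[5:], stream[5:]))
hits_full = sum(p == t for p, t in zip(full_history[5:], stream[5:]))
print(hits_windowed, hits_full)
```

After the drift the windowed learner needs two instances to switch to the new concept, while the full-history learner keeps predicting the old majority class for the rest of this short stream.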
Ensembles & Dyn. Integr. of Classifiers
• Ensemble learning is among the most popular and effective
  approaches to handling concept drift: a set of concept
  descriptions built over different time intervals is maintained,
  and their predictions are combined using a form of voting, or the
  most relevant description is selected.
• The problem with current ensemble approaches is that they are
  not able to deal with local concept drift, which is a common
  case:
  – only particular bacteria may develop resistance to
    certain antibiotics, while resistance to the others remains
    the same;
  – or the data distribution can change for particular bacteria only.
Dynamic Integration of Classifiers (DIC)
• Basic idea of DIC techniques: two main phases.
  – The learning phase
    • local classification errors of each base classifier for each instance
      of the training set are estimated according to the 1/0 loss function
      using cross-validation (CV);
    • the learning phase finishes with training the base classifiers on the
      whole training set.
  – The application phase
    • begins with determining the k nearest neighbours of a new instance;
    • weighted nearest neighbour (WNN) regression is used to predict the
      local classification errors of each base classifier for the new
      instance.
• Dynamic Selection (DS)
  – the classifier with the least predicted local classification error is
    selected.
• Dynamic Voting (DV)
  – each base classifier receives a weight proportional to its estimated
    local accuracy, and the final classification is produced as in weighted
    voting (WV).
• Dynamic Voting with Selection (DVS)
  – the base classifiers with the highest local classification errors are
    discarded (errors that fall into the upper half of the error interval)
    and DV is applied to the remaining classifiers.
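The application phase and the three integration rules can be sketched as below. This is a minimal illustration, not the authors' implementation: the neighbour weights 1/(1 + distance), the function names and the toy numbers are assumptions, and the cross-validated 1/0 errors of the neighbours are taken as given.

```python
from collections import Counter, defaultdict

def predict_local_errors(neighbour_errors, distances):
    """Weighted nearest neighbour regression of each base classifier's local error.

    neighbour_errors[i][j] is the 1/0 error of classifier j on neighbour i.
    Assumption: neighbour i is weighted by 1 / (1 + distance_i).
    """
    weights = [1.0 / (1.0 + d) for d in distances]
    total = sum(weights)
    n_classifiers = len(neighbour_errors[0])
    return [
        sum(w * errs[j] for w, errs in zip(weights, neighbour_errors)) / total
        for j in range(n_classifiers)
    ]

def dynamic_selection(errors, predictions):
    """DS: use the classifier with the least predicted local error."""
    best = min(range(len(errors)), key=lambda j: errors[j])
    return predictions[best]

def dynamic_voting(errors, predictions):
    """DV: weighted vote, each weight being the estimated local accuracy (1 - error)."""
    votes = defaultdict(float)
    for err, pred in zip(errors, predictions):
        votes[pred] += 1.0 - err
    return max(votes, key=votes.get)

def dynamic_voting_with_selection(errors, predictions):
    """DVS: drop classifiers whose error lies in the upper half of the error interval, then DV."""
    threshold = (min(errors) + max(errors)) / 2
    kept = [j for j, err in enumerate(errors) if err <= threshold]
    return dynamic_voting([errors[j] for j in kept], [predictions[j] for j in kept])

# Toy example: 3 base classifiers, k = 3 nearest neighbours of the new instance
neighbour_errors = [[0, 1, 1], [0, 0, 1], [1, 0, 1]]
distances = [1.0, 1.0, 3.0]
predictions = ["S", "R", "R"]   # base classifier outputs for the new instance

local_errors = predict_local_errors(neighbour_errors, distances)
ds = dynamic_selection(local_errors, predictions)
dv = dynamic_voting(local_errors, predictions)
dvs = dynamic_voting_with_selection(local_errors, predictions)
plain_vote = Counter(predictions).most_common(1)[0][0]
print(local_errors, ds, dv, dvs, plain_vote)
```

In this toy case plain voting follows the two locally unreliable classifiers, while DS, DV and DVS all follow the classifier that was most accurate in the neighbourhood of the new instance.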
DIC for Tracking CD
[Chart: classification accuracy (y-axis 0.50 to 0.95) over monthly data
blocks for the integration techniques max, wv, v, ds, dv and dvs]
Classification accuracy over sequential data blocks
(ensembles of C4.5 decision trees)
Dynamic integration techniques improve ensemble
accuracy by more than 10% on average.
Conclusions and Future Work
• Contribution
– Many-sided analyses of microbiological data
• A number of KD techniques are applied
– Locally and globally
– Data analysis as static DB content and as a stream
• Further work
– Communicating the results to the experts
– Identifying other potential cases for application
(hopefully one from CBMS’05) and applying many-sided analyses
• comparing the dependencies found across other contexts
(hospitals, countries, sources of pathogens, etc.)
IEEE CBMS 2005
Trinity College Dublin
June 23-24
The 18th IEEE Symposium on Computer-Based Medical Systems
http://conferences.computer.org/CBMS2005/index.html
Contact info
Mykola Pechenizkiy, Seppo Puuronen
Department of Computer Science
University of Jyväskylä
Finland
mpechen@cs.jyu.fi & sepi@cs.jyu.fi
Alexey Tsymbal
Dept of Computer Science
Trinity College Dublin
Ireland
Alexey.Tsymbal@cs.tcd.ie
Michael Shifrin, Irina Alexandrova
N.N.Burdenko Institute of Neurosurgery
Russian Academy of Medical Sciences, Moscow, Russia
Shifrin@nsi.ru