Healthcare Informatics: From Data Mining to Business Intelligence

Healthcare Informatics:
From Data Mining to
Business Intelligence
Hsinchun Chen, Ph.D.
Director, Artificial Intelligence Lab
BioPortal Program; Caduceus Intelligence Program
University of Arizona
Acknowledgements: NSF, NIH, NLM, CDC, NCI, DOD,
LOC, DHS, DOJ, FBI, CIA
© 2005
Business Intelligence
and Data Mining
Business Intelligence and Analytics
• $3B BI revenue in 2009 (Gartner, 2006)
• The Data Deluge (The Economist, March 2010);
internet traffic 667 exabytes by 2013 (Cisco); total amount
of information in 2010: 1.2 zettabytes (KB-MB-GB-TB-PB-EB-ZB-YB)
• $9.4B BI software spending in 2010 and $14.1B by
2014 (Forrester)
• IBM spent $14B in BI in five years; $9B BI revenue in
2010 (USA Today, November 2010); 24 acquisitions,
10,000 BI software developers, 8,000 BI consultants, 200
BI mathematicians
Business Intelligence and Analytics
• BI: “skills, technologies, applications, and practices used
to help an enterprise better understand its business and
market.”
• Technologies: data warehousing; Extraction,
Transformation, and Load (ETL); Business Performance
Management (BPM); visual dashboards; and advanced
knowledge discovery using data and text mining
• BI 2.0: web intelligence, web analytics, web 2.0, social
media analytics, opinion mining; cloud computing and
web services; real-time monitoring and mining; enterprise
performances (marketing/accounting/finance/healthcare)
BIG and FAST
• Business data from TBs to PBs
• The Big Data Era; “speed to insight”
• Barnes & Noble + Aster Data + MapReduce; McAfee +
Datameer + Hadoop; AdKnowledge + Greenplum +
Hadoop + Amazon EC2
• MapReduce/Hadoop vs. Parallel DBMSs: ETL and
“read once” data sets; complex and powerful analytics;
semi-structured data; quick-and-dirty analyses; limited-budget operations; fault tolerance; performance
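The MapReduce pattern named above can be illustrated with a minimal, single-process sketch. This is not Hadoop code; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample chief-complaint strings are assumptions made purely for illustration. It only shows the map, group-by-key, and reduce steps that a framework would distribute across machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (token, 1) pair for each token in one input record."""
    return [(token.lower(), 1) for token in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values of one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical "read once" records, e.g. free-text chief complaints.
    complaints = ["cough fever", "fever headache", "cough cough"]
    print(run_job(complaints))
```

In a real deployment the map and reduce functions stay this small, while the framework handles partitioning, scheduling, and fault tolerance.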
CS Ecosystem and Impacts
Data, Text, and Web Mining
• Data Mining: ID3, neural networks, genetic
algorithms, SVM; Weka, SPSS, SAS, Microsoft
SQL server data mining, IBM Intelligent Miner,
IBM Cognos
• World Wide Web: ftp, http/html, browser, digital
library, search engines; Mosaic, Alta Vista, Lycos,
Yahoo, Google
• Social Media: collaboration, participation, filtering,
multimedia, social networks; Facebook, YouTube,
Twitter, Second Life
Data Mining Models and Methods
Predictive Modeling: Classification; Value prediction
Database Segmentation: Demographic clustering; Neural clustering
Link Analysis: Associations discovery; Sequential pattern discovery; Similar time sequence discovery
Deviation Detection: Visualization; Statistics
Data Mining: A KDD Process
Grand Challenges for Engineering (NAE), 2010
• Advance health informatics
– “The acquisition, management, and use of information in health”
– Sharing information over regional, national, or global networks
– Offering relevant decision support to clinicians and patients
(Caduceus Intelligence)
– Alleviating doctors’ information overload
– “Just in time, just for me” medical advice at the point of care;
Support personalized medicine
– Accessing medical research information (HelpfulMed &
GeneScene)
– Improving response in public health emergencies (BioPortal)
• Secure cyberspace
• Prevent nuclear terror
• Engineer the tools for scientific discovery
….
Healthcare Informatics Projects at AI Lab:
(1) HelpfulMed and GeneScene: medical
literature, ontology, concept maps, text
mining, visualization (NSF, NIH, NCI)
(2) BioPortal: infectious disease information
sharing, knowledge portal, spatio-temporal
analysis and visualization, sequence
visualization (NSF, CIA, CDC)
(3) Caduceus Intelligence: HIS, LIS, PACS,
clinical decision support, health data
mining, personalized medicine (NSF, Taiwan
NSC)
Hsinchun Chen et al.,
2005
Hsinchun Chen, et al.,
2010
Biomedical Informatics:
Biomedical literature, biomedical
ontologies, linguistic phrasing,
categorization, text mining
HelpfulMED Search of Medical Websites
HelpfulMED search of Evidence-based Databases
What does database cover?
Search which databases?
How many documents?
Enter search term
Consulting HelpfulMED Cancer Space (Thesaurus)
Enter search term
Select relevant search terms
New terms are posted
Search again...
Or find relevant webpages
Browsing HelpfulMED Cancer Map
1
Visual Site Browser
Top level map
2
3
Diagnosis, Differential
4
Brain Neoplasms
5
Brain Tumors
Genescene Overview
Knowledge Base
Integrate gene relations from
literature and outside databases
and provide knowledge for
learning and evaluation in data
mining
Text Mining
Process Medline abstracts
and extract gene relations
automatically from the text
Data Mining
Process gene expression data
(and existing knowledge) and
use different algorithms to
extract regulatory networks
Interface & Visualization
Allow searching for keywords, display a map of the
relations extracted from the text and/or from the
microarray
Genescene Overview
JIF
Ontologies
External
Databases
HUGO
Publications
Medline
XML Parser
Publications &
GO
Meta Information
UMLS
Knowledge
Base
Titles & Abstracts
GeneScene
Text Mart
Relation Parsers
Lexical
lookup
AZ Noun
Phraser
POS
Tagging
Adjuster &
Tagger
Full
Parser
FSA
Relation
Grammar
UMLS
Relations in
flat files
Concept
Space
Relations in
flat files
Co-occurrence
relations
Feature Structures
GeneScene
Data Mart
Text Mining
GeneScene
Information
Retrieval
Visualization
Data Mining
Spring
Algorithm
Micro
Array
Data
Bayesian
Networks
Association
Rule Mining
Problem: Gene Pathway
• Title: Key roles for E2F1 in signaling p53-
dependent apoptosis and in cell division within
developing tumors.
• Abstract: Apoptosis induced by the p53 tumor
suppressor can attenuate cancer growth in
preclinical animal models. Inactivation of the
pRb proteins in mouse brain epithelium by the
T121 oncogene induces aberrant proliferation
and p53-dependent apoptosis. p53 inactivation
causes aggressive tumor growth due to an
85% reduction in apoptosis. Here, we show
that E2F1 signals p53-dependent apoptosis
since E2F1 deficiency causes an 80% apoptosis
reduction. E2F1 acts upstream of p53 since
transcriptional activation of p53 target genes is
also impaired. Yet, E2F1 deficiency does not
accelerate tumor growth. Unlike normal cells,
tumor cell proliferation is impaired without
E2F1, counterbalancing the effect of apoptosis
reduction. These studies may explain the
apparent paradox that E2F1 can act as both an
oncogene and a tumor suppressor in
experimental systems.
Action
Protocols
Graphic
Representation
p53
reads
"E2F1 signals p53-dependent
apoptosis"
E2F1
apoptosis
p53
infers
So, I'm assuming... a straight
line pathway...
E2F1
apoptosis
Expert
errs and
corrects
E2F1
reads
"E2F1 acts upstream of p53"
p53
apoptosis
E2F1
p53
reads
"E2F1 deficiency does not
accelerate tumor growth"
apoptosis
Final
graph
tumor growth
Prepositions: OF/BY/IN
OF
BY
IN
q0
Nominalization
(-ion)
q5
Adjective,
noun,
verb (-ed)
Adjective,
Noun,
verb (-ed)
Nominalization
(-ion)
Nominalization
(-ion)
Negation
q4
NP, 5: str1
NP
q1
Aux, 1: tr13
OF
q6
OF
Nominalization
(-ion)
q7
Aux
mod
Negation
q2
Adjective,
noun,
verb (-ed)
OF
q15
q18
q13
NP
verb
aux
verb
verb
q14
verb
Nominalization
(-ion)
q3
mod
OF
q8
BY
q9
NP
mod
q11
BY
q10
q12
NP
IN
IN
NP
NP
BY
IN
q16
IN
NP
q17
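A finite-state approach to preposition-based relations can be sketched as follows. This is a toy automaton, not the GeneScene relation grammar: the state names, the tag set (`NOM`, `OF`, `NP`, `BY`), and the slot names are all assumptions made for illustration, and the input is assumed to be already tagged and chunked into noun phrases. It recognizes only the pattern "nominalization OF NP [BY NP]".

```python
# Hypothetical transition table: (current state, tag) -> next state.
TRANSITIONS = {
    ("q0", "NOM"): "q1",  # nominalization, e.g. "inactivation"
    ("q1", "OF"): "q2",
    ("q2", "NP"): "q3",   # noun phrase acted upon (target)
    ("q3", "BY"): "q4",
    ("q4", "NP"): "q5",   # noun phrase performing the action (agent)
}
ACCEPTING = {"q3", "q5"}


def parse_relation(tagged_tokens):
    """Return the filled relation slots, or None if the sequence is rejected."""
    state = "q0"
    slots = {}
    for word, tag in tagged_tokens:
        next_state = TRANSITIONS.get((state, tag))
        if next_state is None:
            return None  # no transition defined: reject
        if tag == "NOM":
            slots["action"] = word
        elif tag == "NP" and state == "q2":
            slots["target"] = word
        elif tag == "NP" and state == "q4":
            slots["agent"] = word
        state = next_state
    return slots if state in ACCEPTING else None


if __name__ == "__main__":
    # Simplified from the abstract: "Inactivation of the pRb proteins ... by the T121 oncogene"
    tokens = [("inactivation", "NOM"), ("of", "OF"), ("pRb", "NP"),
              ("by", "BY"), ("T121", "NP")]
    print(parse_relation(tokens))
```

A fuller grammar would add states for IN phrases, negation, and auxiliaries, as the state diagram on the slide suggests.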
Example Map (one abstract)
Select interesting
relations to
visualize
Overview
Double click to
expand
Expanded node
Finding the truth: p38
acts as a negative
feedback for Ras
signaling
BioPortal: infectious disease information
sharing, knowledge portal, spatio-temporal
analysis and visualization, sequence
visualization
Syndromic Surveillance
• A syndrome is a set of symptoms or conditions that
occur together and suggest the presence of a certain
disease or an increased chance of developing the
disease (from NIH/NLM)
• Syndromic surveillance is based on health-related
data that precede diagnosis and signal a sufficient
probability of a case or an outbreak to warrant further
public health response (from CDC)
– Targeting investigation of potential cases
– Detecting outbreaks associated with bioterrorism
Syndromic Surveillance Data Sources in Different Stages of
Developing a Disease
Reproduced from Mandl et al. (2004)
Syndromic Surveillance System Survey
Projects | User population | Stakeholders
RODS | Pennsylvania, Utah, Ohio, New Jersey, Michigan, etc.; 418 facilities connected to RODS | RODS Laboratory, U of Pittsburgh
STEM | N/A | IBM
ESSENCE II | 300 worldwide DoD medical facilities | DoD
EARS | Various city, county, and state public health officials in the United States and abroad | CDC
BioSense | Various city, county, and state public health officials in the United States and abroad | CDC
RSVP (Rapid Syndrome Validation Project) | Kansas, NM | Sandia NL, NM
BioPortal | NY, CA, Kansas, AZ, Taiwan | U of Arizona
Sample Systems and Data Sources Utilized
Projects | Data sources | Techniques
RODS | Chief complaints (CC); OTC medication sales | Free-text Bayesian disease classification
STEM | Simulated disease data | Disease modeling and visualization, SIR
ESSENCE II | Military ambulatory visits; CC; Absenteeism data |
EARS | 911 calls; CC; Absenteeism; OTC drug sales | Human-developed CC classification rules
BioSense | City/state generated geocoded clinical data | Graphing/mapping displays
RSVP | Clinical and demographic data | PDA entry and access
BioPortal | Geo-coded clinical data; Genomic sequences; Multilingual CC | Real-time access and visualization; Web-based hotspot analysis; Sequence visualization; Multilingual ontology-based CC classification
Information Sharing Infrastructure Design
Data Ingest Control
Module
Cleansing / Normalization
Adaptor
Adaptor
SSL/RSA
Adaptor
SSL/RSA
Info-Sharing Infrastructure
Portal Data Store
(MS SQL 2000)
PHINMS
Network
XML/HL7
Network
NYSDOH
CADHS
New
Data Access Infrastructure Design
[Diagram: layered access architecture]
Users: public health professionals, researchers, policy makers, law enforcement agencies & other users
Client: Browser (IE/Mozilla/…) over an SSL connection
WNV-BOT Portal functions: Data Search and Query; Spatial-Temporal Visualization; Analysis / Prediction; HAN or Personal Alert Management; Dataset Privileges Management
Web Server (Tomcat 4.21 / Struts 1.2)
User Access Control API (Java)
Data Store (MS SQL 2000); Access Privilege Def.
BioPortal
Dataset name
Advanced
Spatial / Temporal
Search criteria
Select background maps
Results listed in table
Available
dataset
User main page
Positive cases
Time range
Select NY / CA
population, river and
lakes
County / State
Choose WNV disease
data
Select CA dead bird,
chicken and NY dead
bird data
Positive cases
User Login
Positive cases
Start STV
Specify bird
species
[Screenshot callouts: Spatial-Temporal Visualizer (STV)]
NY dead bird spatial distribution pattern (GIS)
NY dead bird temporal distribution pattern (Timeline)
Periodic Pattern
Control panel
Close; Zoom in on NY
Year 2001 data
Move time slider
2-week window in 1 year
View all 3-year data
3-year window
Overall pattern in 3-year span
Similar time pattern
Concentrated in May / Jun
[Screenshot callouts: STV with population overlay]
Spatial distribution pattern
Overlay population map
Enable population map
Dead bird cases migrate from Long Island into upstate NY
Dead bird cases distribute along populated areas near Hudson River
Season end
Move time slider
BioPortal HotSpot Analysis: RSVC,
SaTScan, and CrimeStat Integrated (first visual, real-time
hotspot analysis system for disease surveillance)
• West Nile virus in California
Hotspot Analysis-Enabled STV
Regular STV
Select baseline and case periods
Select target geographic area
Select algorithms
Hotspots found!
Select hotspot to highlight case points
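The baseline-versus-case-period idea behind the hotspot workflow can be sketched with a simple grid count. This is not the scan statistic used by SaTScan, nor the RSVC or CrimeStat algorithms; it is a deliberately naive stand-in, and the function name `hotspot_cells`, the grid cell size, and the thresholds are all assumptions chosen for illustration.

```python
from collections import Counter


def hotspot_cells(baseline_cases, recent_cases, cell_size=1.0,
                  min_ratio=2.0, min_cases=3):
    """Flag grid cells whose case-period count is high relative to baseline.

    Each case is an (x, y) point. A cell is flagged when it has at least
    `min_cases` recent cases and its count is at least `min_ratio` times
    the smoothed baseline count (baseline + 1, to avoid division by zero).
    """
    def cell(point):
        x, y = point
        return (int(x // cell_size), int(y // cell_size))

    base = Counter(cell(p) for p in baseline_cases)
    recent = Counter(cell(p) for p in recent_cases)

    flagged = []
    for c, n in recent.items():
        ratio = n / (base.get(c, 0) + 1)
        if n >= min_cases and ratio >= min_ratio:
            flagged.append((c, n, ratio))
    # Strongest relative increase first.
    return sorted(flagged, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical coordinates in arbitrary map units.
    baseline = [(0.5, 0.5), (1.5, 1.5)]
    recent = [(0.1, 0.2), (0.3, 0.4), (0.6, 0.9), (0.2, 0.8), (1.2, 1.3)]
    print(hotspot_cells(baseline, recent))
```

A production system would replace the ratio test with a proper significance test and would not depend on a fixed grid.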
Taiwan Hospital Surveillance: Chief Complaints
Grouped by Hospital
Taiwan SARS Network Visualization (Cont.)
Social network visualization with
patients and geographical locations
Scroll bar on time dimension to
see the evolution of a network
Taiwan SARS Network Evolution – Hospital Outbreak
The index patient of Heping
Hospital began to have symptoms.
Security Informatics:
law enforcement information sharing,
crime data mining, counter-terrorism
surveillance, multilingual text mining,
intelligence web mining
• Intelligence and Security
Informatics (ISI): “Development of
advanced information
technologies, systems,
algorithms, and databases for
national security related
applications, through an
integrated technological,
organizational, and policy-based
approach” (Chen et al., 2003a)
• Data, text, and web mining
• From COPLINK to Dark Web
H. Chen, computer scientist,
artificial intelligence, U. of
Arizona (2006)
COPLINK
• 1996-, DOJ, NIJ, NSF, ITIC, DHS
• Connect
• Detect
• Agent
• STV (Spatio-Temporal Visualization)
• CAN (Criminal Activity Network)
• BorderSafe (Mutual Information)
• AI Lab → Knowledge Computing Corporation (KCC)
• Tucson, Phoenix → AZ → 3500 agencies, 20 states
• The largest public safety information sharing and data
mining system!
• The New York Times, November 2, 2002
• ABC News, April 15, 2003
• Newsweek Magazine, March 3, 2003
Private Equity Firm Buys
MIS Spinoff Company for
$40M…July 2009
Dark Web
• 2002-, ITIC, NSF, LOC
• Discussions: FBI, DOD/Dept of Army, NSA, DHS
• Collection:
– Web site spidering
– Forum spidering
– Video spidering
• Analysis and Visualization:
– Link and content analysis (web sites)
– Web metrics analysis (web site sophistication)
– Authorship analysis (forums; CyberGate)
– Sentiment analysis (forums; CyberGate)
– Video coding and analysis (videos; MCT)
• 50,000 terrorist web sites, 1B documents/files about terrorists
• The largest collection of terrorist-generated content: 10 TBs (~LOC)
• The most advanced multilingual Web 2.0 intelligence analysis system
The Dark Web project in the Press
Project Seeks to Track Terror Web
Posts, 11/11/2007
Researchers say tool could trace online posts
to terrorists, 11/11/2007
Mathematicians Work to Help Track Terrorist
Activity, 9/14/2007
Team from the University of Arizona
identifies and tracks terrorists on
the Web, 9/10/2007
Web Site Example: Links to Multimedia and Manuals
Link to “The General of Islam” Radio Station
Azzam
Speeches
Berg
beheading
others
videos of
Zarqawi
Source: http://www.al-ghazawat.110mb.com/,
French and Arabic Web Site
Complete
65 pages
manual of
a 50
caliber rifle
in pdf
Dark Web Forum Tools
Caduceus Intelligence:
HIS, LIS, PACS, clinical decision
support, health data mining, virtual
patients, personalized medicine,
healthcare 2.0 (physicians and patients
like me)
Outline
• Background & Related Literature
– Healthcare issues, healthcare systems, and healthcare IT
• Research Testbed: Taiwan MS Hospital Data Overview
– Tablespace, statistics, and examples
• Techniques for Symptom-Disease-Treatment (SDT) Associations
– Rationale, related work, method, and results
– Dashboards for Physician and Manager
The Healthcare Systems
• Healthcare has been one of the largest and fastest-growing industries in the United States over the past decades.
• Behind this multi-billion-dollar industry are practices that need to be transformed to be more effective.
– It was estimated that in the U.S. about 100,000 people die each year due to preventable medical errors (Institute of Medicine, 2001).
[Chart: Annual Healthcare Expenditure in the United States, 1960-2008; left axis: per capita US$ (0 to 8,000); right axis: % GDP (0.0 to 18.0). Data Source: OECD Health Data 2010]
Effective Healthcare
•
Healthcare in the 21st Century
– Quality metrics: safe, effective, patient-centered, timely, efficient, and
equitable (Institute of Medicine, 2001).
•
Along with medical professionals, computer scientists and information systems
scholars have also been engaged in bringing about effective healthcare, especially
from an information technology (IT) angle (Stead & Lin, 2009).
•
Studies show that health IT (HIT) plays a significant role in transforming
healthcare practice (Agarwal et al., 2010).
– President Obama’s Health Information Technology for Economic and
Clinical Health (HITECH) initiatives encourage healthcare providers to
adopt digitalized solutions to enhance quality of care.
• Financial incentives for “meaningful” use of HIT
• Interoperable data standard for electronic health records (EHR)
Four Domains of HIT
• In their study, Stead and Lin (2009: 29) categorize HIT into four
domains
– automation,
– connectivity,
– decision support, and
– data-mining capabilities
• It is indicated that despite their direct and significant impact in
delivering effective healthcare, decision support and data-mining
capabilities still receive little support in today’s HIT design and
investment.
Health Informatics
• Although often used interchangeably with medical informatics and e-health, health informatics places simultaneous emphasis on decision support, data mining, and visualization across the broad healthcare provision process (Nykänen, 2000; Norris, 2002; Brender et al., 2000).
[Diagram: Health Informatics linked to Decision Support, Data Mining, and Visualization]
Healthcare Decision Support
• Advanced data analysis and decision support have been identified as a key
success factor of HIT (Lau et al., 2010) as well as a promising IS research
area (Sarnikar et al., 2010; Agarwal et al., 2010).
• “Decision technology does not merely facilitate or augment decision-making; rather it reorganizes decision-making practices” (Patel et al.,
2002).
• Decision support in the healthcare process may reside in various
managerial, administrative, and clinical scenarios (Norris, 2002).
Key Pillars for Clinical Decision Support Systems
• Challenges (Sittig et al., 2008)
– Data integration and management
– Complicated workflows
– Reasoning in a high-dimensional
space
Clinical Perspectives
Best Knowledge Available
When Needed
High Adoption & Effective
Use of Clinical Decision
Support Systems
Continuous Improvement
of Clinical Knowledge &
Supporting Tools
Knowledge Availability
Effective IS Use
Extending Knowledge &
Methods
Knowledge & Semantic
Data Management
Technology Acceptance &
Adoption
Medical Informatics &
Healthcare Business
Intelligence
IS Perspectives
(Adapted from Osheroff et al., 2007)
Healthcare Data Mining
• Research indicates the need for domain-driven data mining to enable
actionable knowledge discovery and delivery (Cao, 2010). Health
informatics is distinct from other data mining tasks because of its voluminous
and heterogeneous data as well as its inherent privacy and ethical
issues (Cios & Moore, 2002).
• Previous studies have been applying data mining in disease pattern
recognition (Rao et al., 2002; Wroblewski et al., 2009), risk
assessment (Austin et al., 2010; Hou et al., 2010), and treatment
support (Dahlström et al., 2006; Toussi et al., 2009) in various
medical and healthcare contexts.
Healthcare Visualization
• Users’ preferences and cognitive processes should be considered in
the complex clinical decision process (Fieschi et al., 2003; Kushniruk
& Patel, 2004; Horsky et al., 2003).
• Research shows that information representation
significantly influences human sensemaking and analytical reasoning
processes (Gotz & Zhou, 2009; Qu & Furnas, 2005).
• However, the human-computer interface has also been ranked as one of the top
challenges in clinical decision support (Sittig et al., 2008).
Research Testbed: Taiwan MS Hospital
Current Database Status
ATTRIBUTE | VALUE
Data Storage Size | 39.6 GB
Number of Tables | 297*
Number of Registered Patients | 894,061
Number of Unique Outpatients | 387,014
Time Span of Outpatient Records (from outpatient master table PTOPD) | 2002/01/03-2010/07/12**
Average Number of Daily Outpatients | 2409.95***
Number of Unique Inpatients | 91,717
Time Span of Inpatient Records (from inpatient master table PTIPD) | 2003/11/17-2010/07/13
Average Number of Daily Inpatients | 60.93
* The tables encompass data from Min-Sheng’s Hospital Information System (HIS), Laboratory Information System (LIS), and Picture Archiving and Communication Systems (PACS).
** Records before Dec. 2004 are incomplete. The average number of daily outpatients before Dec. 2004 is 21.49.
*** The calculation of average daily outpatients excludes Sundays and records before Dec. 2004.
Main Tables in the Tablespace
Outpatient Modules
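PTER (emergency registration file)
PTOPD (outpatient/emergency patient file)
CODINGOPDA (outpatient/emergency disease classification diagnosis file)
CODINGOPD (outpatient/emergency disease classification data file)
CODINGOPDP (outpatient/emergency disease classification procedure file)
ACNTOPD (outpatient order detail file)
PRICE (fee schedule file)
PTCOURSE (patient same-course-of-treatment record file)
ORDAOPD (outpatient diagnosis file)
MCHRONIC (chronic disease refill prescription file)
HRECOPD1 (historical outpatient receipt file, header)
HRECOPD2C (historical outpatient receipt file, credit)
HRECOPD2D (historical outpatient receipt file, debit)
HORDERA (outpatient diagnosis file 2)
FREQUENCY (frequency code file)
ORDSOOPD (outpatient S.O. file)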
Patient background:
CHART (medical record basic data file)
AGEGROUP1 (age group master file)
AGEGROUP2 (age group body file)
PTTYPE (patient type code file)
Diagnosis
(Symptoms and
Diseases)
Registration
PTIPD (inpatient basic data file)
IPDINDEX (inpatient index file)
IPDTRANS (inpatient history file)
Treatment
(Procedures and
Orders)
Transaction /
Receipt
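CODING (disease classification header file)
CODINGA (disease classification diagnosis file)
CODINGP (disease classification procedure file)
ACNTIPD (inpatient order detail file)
PRICE (fee schedule file)
ORDFB (inpatient National Health Insurance medical fee order list file)
DTLFB (inpatient National Health Insurance medical fee list file)
DIAGDOCA (admission summary master file)
DIAGDOCAX (admission summary content file)
DIAGDOCI (discharge summary master file)
DIAGDOCIX (discharge summary content file)
FREQUENCY (frequency code file)
RECIPD1 (inpatient receipt file, header)
RECIPD2C (inpatient receipt file, credit)
RECIPD2D (inpatient receipt file, debit)
RECIPDNH (inpatient receipt file, National Health Insurance)
ORDAIPD1 (inpatient diagnosis history file, header)
ORDAIPD2 (inpatient diagnosis history file, body)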
Inpatient Modules
Hospital:
Disease:
Operation:
LIS:
PACS:
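BED (bed master file)
BEDGRADE1 (bed grade code file, header)
BEDGRADE2 (bed grade code file, body)
BEDSTATUS (bed status file)
APDRGD (DRG disease code mapping file)
DRGICD9 (DRG disease code mapping file)
PTOR (surgery patient master file)
PTORDRPT (surgery patient report record detail file)
ORSAMPLE (surgery content template master file)
PTORALLLOG (surgery change record log file)
PTORDIAG (surgery patient postoperative diagnosis)
PTORDRPT (surgery patient report record detail file)
PTORLOG (surgery patient master file log)
PTORSTAFF (surgery staff file)
LABITEM1 (lab test item master file, header)
LABITEM2 (lab test item master file, body)
LABGROUP (lab group code file)
LABP1 (lab pathology header file)
LABP2 (lab pathology body file)
LABS1 (lab histopathology file 1)
LABS2 (lab histopathology file 2)
LABX1 (lab change header file)
LABX2 (lab change body file)
EXAMITEM (examination item file)
PTEXAM (request form master file)
PTEXAMINDEX (request form index file)
PTEXAMITEM (request form examination item file)
PTEXAMRPT (request form report file)
DEPT (department code file)
DIV (division code file)
DOCTOR (physician code file)
HOSPITAL (hospital code file)
ICDGROUP1 (ICD group master file)
ICDGROUP2 (ICD group body file)
ICD (International Classification of Diseases code file)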
Note: Tables with underlines contain free-text data.
Top 10 Tables (Ranked by Number of Records)
TABLE | DESCRIPTION (translated from the Chinese table name) | NUMBER OF RECORDS | TIME SPAN | CATEGORY
ACNTOPD | Outpatient order detail file | 23,404,233 | 2004/07/22-2010/07/12 | Outpatient - Order
ACNTIPD | Inpatient order detail file | 19,822,382 | 2004/05/10-2010/07/13 | Inpatient - Order
HRECOPD2C | Historical outpatient receipt file (credit) | 9,306,611 | 2004/06/30-2010/07/12 | Outpatient - Transaction
CODINGOPDA | Outpatient/emergency disease classification diagnosis file | 6,983,919 | 2004/07/01-2010/04/30 | Outpatient - Diagnosis
DIAGDOCXLOG | Summary content change log file | 6,908,559 | 2004/06/25-2010/07/13 | Inpatient - Diagnosis
ORDFB | Inpatient National Health Insurance medical fee order list file | 6,118,311 | 2006/06-2010/07 | Inpatient - Order
HRECOPD1 | Historical outpatient receipt file (header) | 5,549,295 | 2004/06/30-2010/07/12 | Outpatient - Transaction
LABX2 | Lab change body file | 5,267,898 | 2008/07/09-2010/12/29 | Lab
ORDAOPD | Outpatient diagnosis file | 5,117,329 | 2000/01/03-2010/07/12** | Outpatient - Diagnosis
HRECOPD2D | Historical outpatient receipt file (debit) | 4,046,316 | 2004/06/30-2010/07/12 | Outpatient - Transaction
Important MS Hospital Tables and Their Potential Applications
HIS
TABLE | CONTENT
ORDAOPD & ORDAIPD2 | Signs, symptoms, and diseases
ACNTOPD & ACNTIPD | Orders for examination, procedure, and medication; payment information
DIAGDOCAX & DIAGDOCIX | Chief complaints, personal history of illness and drug, primary diagnosis, plan of examination and treatment, complication, and discharge instructions
APPLICATION: Disease prediction; Disease clustering; Order generation; Treatment recommendation; Physician’s clinical preferences; Clinical workflow; Information extraction; Problem verification
LIS
TABLE | CONTENT
LABX1 & LABX2 | Lab examination results (with safe upper & lower limits) and disease code for the lab exam
LABS2 | Physician’s comments for the lab results
APPLICATION: Patient safety alert; Lab suggestion & interpretation; Disease severity assessment; Treatment outcome assessment
PACS
TABLE | CONTENT
EXAMITEM | Medical image name and position of body
PTEXAMRPT | Physician’s comments for the PACS results
APPLICATION: PACS exam suggestion; Disease severity assessment; Treatment outcome assessment
LIS HIS
PACS
Sample Data
Background Information
Name: Jack
Gender: Male
Age: 55
Division: Pulmonary Medicine
Doctor Name: Dr. John
Admission Date: 2009 Aug 14th
……
Admission Comment
Chief Complaints: cough , white
sputum, poor intake & loss of BW
(about 9 kg ) for 3 months hemoptysis
for 2 weeks mild SOB
History of present illness: ……
Family History: ……
Hospital Information System (HIS): Sample Data (cont.)
DIAGNOSIS & INTERVENTION | ICD/PRICE CODE | DESCRIPTION
Symptom | 786 | Symptoms involving respiratory system and other chest symptoms
Symptom | 786.3 | Hemoptysis
Disease | 162.9 | Malignant neoplasm of bronchus and lung, unspecified
Disease | 197.2 | Secondary malignant neoplasm of pleura
Procedure | 33.24 | Closed (endoscopic) biopsy of bronchus
Procedure | 87.41 | Computerized axial tomography of thorax
Order | OGEFI2 | Gefitinib F.C. (Iressa)
Order | 67051 | Thoraciscopic wedge or Partial resection of the Lung
Order | IETOP1 | Etoposide injection (Etoposide-Teva)
Laboratory Information System (LIS): Sample Data (cont.)
LAB NAME: Surgical pathology Level IV
REPORT TYPE | INTERPRETATION
Diagnosis | Bronchus, left, transbronchial biopsy --- Squamous cell carcinoma. Results of immunohistochemical stains: P63: tumor cell positive. TTF-1: tumor cell negative.
Gross | The specimen consists of 3 pieces of gray tan and soft tissue, up to 0.2 cm. All for section.
Microscopic | Section of the bronchial biopsy shows clusters of tumor cells with large pleomorphic nuclei and abundant cytoplasm. Immunohistochemical stains show that the tumor cells are postive for P63 but negative for TTF-1. Squamous cell carcinoma is considered.
Picture Archiving and Communication Systems (PACS): Sample Data (cont.)
EXAM NAME | INTERPRETATION
CT with/without contrast | PROCEDURE: Clinical information: left hilar mass Technique: Chest CT scan with and without contrast enhancement performed by 64MDCT (LightSpeed VCT, GE) Limitations: None Comparison: No comparison study available Interpretation: Any further question, please feel free to contact with the radiologist, Dr. Mark, who interpreted this study. FINDINGS: The result shows focal areas of decreased opacity (lung destruction) with or without visible walls in … -There are multiple, small, centrilobular lucencies and patchy appearance. They are predominantly upper lobe distribution. It is suggestive of centrilobular emphysema. A large oval shaped mass like lesion, measuring over 5 cm in size, is localized at the left perihilar region with direct invasion to the mediasitnum. Wedge shape consolidation of the left lower lobe is disclosed down to the lung base. Significant adenopathy is positive at the AP window, pericarinal and subcarinal spaces. ====== [Conclusion] ====== 1. Lung cancer (left hilar) with partial collapse of LLL, stage IIIA 2. Mediastinal LN metastases 3. Emphysema is suggested.
Chest PA view | CXR showed normal heart size with slightly increased lung markings at hila and lower lung. Prominent left hilar shadow is found. Suggest F/U.
A Clinical Decision Process
DIAGNOSIS & INTERVENTION | ICD/PRICE CODE | DESCRIPTION
Symptom | 786 | Symptoms involving respiratory system and other chest symptoms
Symptom | 786.3 | Hemoptysis
Disease | 162.9 | Malignant neoplasm of bronchus and lung, unspecified
Disease | 197.2 | Secondary malignant neoplasm of pleura
Procedure | 33.24 | Closed (endoscopic) biopsy of bronchus
Procedure | 87.41 | Computerized axial tomography of thorax
Order | OGEFI2 | Gefitinib F.C. (Iressa)
Order | 67051 | Thoraciscopic wedge or Partial resection of the Lung
Order | IETOP1 | Etoposide injection (Etoposide-Teva)
• Generally, physicians employ their medical knowledge to perform
health care tasks in the iterative sequence of: symptoms, diseases,
and treatments (and outcomes).
A Clinical Decision Process (cont.)
• One of the difficulties in the provision of effective
healthcare is that symptom, disease, and treatment
constitute a high-dimensional, complicated, yet
interrelated space (Sittig et al., 2008).
• Diagnosis errors are common and cannot be mitigated by
EHRs without integrating the digital evidence with
physician’s diagnostic thinking (Schiff & Bates, 2010).
• Techniques to facilitate physician’s diagnostic reasoning
can be very useful.
Association Rule Mining (ARM) in Medicine
• Diagnostic reasoning, planning, and patient management are three
generic medical reasoning tasks (Long, 2001), in which diagnostic
reasoning is a disease prediction process based on symptoms, signs,
lab results, and/or medical images whereas planning is a set of
interventions taken to verify and resolve patient’s illness.
• The past decade has shown increasing interest in applying
association rule mining (ARM) to support the diagnostic reasoning
and planning process. Specifically, ARM was applied to disease
prediction, problem verification, comorbidity analysis, disease
clustering, automatic order generation, treatment suggestion, and
many other medical scenarios.
ARM in Medicine: Symptoms, Diseases, and Treatments
[Figure: association graph linking symptoms to diseases and diseases to treatments; edges are drawn in three confidence bands: 0.05 < confidence <= 0.2, 0.2 < confidence <= 0.5, and confidence > 0.5]
Symptoms: Hemoptysis (786.3); Other dyspnea and respiratory abnormalities (786.09)
Diseases: Unspecified pulmonary tuberculosis, confirmation unspecified (011.90); Pneumonia (486); Malignant neoplasm of bronchus and lung, unspecified (162.9)
Treatments: Terbutaline sulphate 5mg/2ml/vial (ETERBUS); Thoracentesis (34.91); Chest PA view (320011); Pyridoxine HCl Tablets 50mg (OVTB6); Computerized axial tomography of thorax (87.41); Injection or infusion of cancer chemotherapeutic substance (99.25); Direct smear by Gram Stain and Aerobic Culture (codes 130062 and 13007)
Edge confidence values shown in the figure: 0.0640, 0.0689, 0.4525, 0.2502, 0.1456, 0.2097, 0.0640, 0.5562, 0.7615, 0.5496, 0.1158, 0.4882, 0.6777, 0.4194, 0.4646, 0.2707
ARM Research in Medicine
[Diagram: directions of association and their applications]
Symptom, sign, or lab result → Disease: Disease prediction; Problem verification
Disease → Treatment: Automatic order generation; Treatment suggestion
Treatment → Disease: Problem verification
Disease → Disease: Comorbidity analysis; Disease clustering
STUDY | DIRECTION OF ASSOCIATIONS | APPLICATION
Wright et al. (2010) | Lab result to Disease; Treatment to Disease | Problem verification
Hanauer et al. (2009) | Disease to Disease | Disease clustering
Klann et al. (2009) | Disease to Treatment | Automatic order generation
Tai & Chiu (2009) | Disease to Disease | Comorbidity analysis
Ordonez (2006) | Symptom to Disease | Disease prediction
Wright & Sittig (2006) | Disease to Treatment | Automatic order generation
Cao et al. (2005) | Disease to Disease | Comorbidity analysis
Imberman (2002) | Symptom to Disease | Disease prediction
Doddi et al. (2001) | Treatment to Disease | Problem verification
Hanauer, D. A., Rhodes, D. R., & Chinnaiyan, A. M. (2009). Exploring Clinical
Associations Using ‘-Omics’ Based Enrichment Analyses. PLoS ONE, 4(4),
e5203.
Wright, A., Chen, E. S., & Maloney, F. L. (2010). An automated
technique for identifying associations between medications, laboratory
results and problems. JBI, 43(6), 891-901.
Medication | Problem | Support | Confidence | Chi square | Interest | Conviction
Cyclosporine micro (Neoral) | Cardiac transplant | 72 | 47.37% | 15974.05 | 222.76 | 1.9
Ritonavir | HIV/AIDS (b) | 108 | 87.10% | 13584.49 | 126.62 | 7.7
Tenofovir/emtricitabine (a) | HIV/AIDS (b) | 117 | 74.05% | 12484.95 | 107.66 | 3.83
Multivitamin (vitamins A, D, E, K) | Cystic fibrosis | 13 | 76.47% | 12206.84 | 939.93 | 4.25
Atazanavir | HIV/AIDS (b) | 91 | 87.50% | 11495.76 | 127.21 | 7.94
Efavirenz/emtricitabine/tenofovir (a) | HIV/AIDS (b) | 77 | 95.06% | 10576.62 | 138.2 | 20.11
Efavirenz/emtricitabine/tenofovir (a) | HIV positive | 73 | 90.12% | 10525.03 | 145.06 | 10.06
Ritonavir | HIV positive | 90 | 72.58% | 10423.49 | 116.82 | 3.62
Cyclosporine micro (Neoral) | Stress test | 63 | 41.45% | 10390.04 | 166.04 | 1.7
Tenofovir/emtricitabine (a) | HIV positive | 101 | 63.92% | 10284.74 | 102.89 | 2.75
(a), (b): footnote markers from the source table; the footnote text does not appear on the slide.

Laboratory Result | Problem | Support | Confidence | Chi square | Interest | Conviction
Bethesda inhibitor assay | Hemophilia | 7 | 25.00% | 5906.03 | 845.03 | 1.33
vWF multimers | von Willebrand's disease | 8 | 53.33% | 3711.07 | 465.22 | 2.14
Fetal hemoglobin | Sickle cell anemia | 18 | 25.35% | 6647.5 | 370.57 | 1.34
Cotinine | Lung Transplant | 9 | 18.75% | 2452.46 | 274.06 | 1.23
Cotinine | Cystic Fibrosis | 10 | 20.83% | 2545.07 | 256.07 | 1.26
Cotinine | Pulmonary Fibrosis | 9 | 27.27% | 1997.01 | 223.48 | 1.37
Vitamin K | Cystic Fibrosis | 8 | 17.39% | 1696.96 | 213.76 | 1.21
Cyclosporine level | Cardiac Transplant | 101 | 42.98% | 20344.05 | 202.12 | 1.75
Tobramycin level | Cystic Fibrosis | 10 | 16.13% | 1966.38 | 198.25 | 1.19
Cotinine | Pulmonary Fibrosis | 11 | 22.92% | 2048.01 | 187.78 | 1.3
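The measures reported in the tables above (support, confidence, chi square, interest, conviction) can be computed from simple co-occurrence counts. The following is a minimal sketch using the standard textbook definitions; it is not taken from Wright et al. (2010), and the function name, argument names, and example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RuleMeasures:
    support: int        # number of records containing both items
    confidence: float   # P(consequent | antecedent)
    interest: float     # confidence / P(consequent), also called lift
    conviction: float   # (1 - P(consequent)) / (1 - confidence)
    chi_square: float   # Pearson chi-square on the 2x2 contingency table


def rule_measures(n_both, n_antecedent, n_consequent, n_total):
    """Measures for the rule antecedent -> consequent from raw counts."""
    confidence = n_both / n_antecedent
    p_consequent = n_consequent / n_total
    interest = confidence / p_consequent
    if confidence == 1.0:
        conviction = float("inf")  # the rule never fails in the data
    else:
        conviction = (1 - p_consequent) / (1 - confidence)

    # Observed 2x2 table: rows = antecedent present/absent,
    # columns = consequent present/absent.
    observed = [
        [n_both, n_antecedent - n_both],
        [n_consequent - n_both,
         n_total - n_antecedent - n_consequent + n_both],
    ]
    row_totals = [n_antecedent, n_total - n_antecedent]
    col_totals = [n_consequent, n_total - n_consequent]
    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n_total
            chi_square += (observed[i][j] - expected) ** 2 / expected
    return RuleMeasures(n_both, confidence, interest, conviction, chi_square)


# Made-up counts for illustration only (not data from the study).
m = rule_measures(n_both=30, n_antecedent=40, n_consequent=100, n_total=1000)
print(m)
```

Under these definitions, a rule with high interest but low support (such as the rows with support below 20) rests on few records, which is why the tables report both.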
Imberman, S. P., Domanski, B., & Thompson, H. W. (2002). Using
dependency/association rules to find indications for computed tomography
in a head trauma dataset. AI in Medicine, 26(1-2), 55-68.
SDT ARM Analysis
• Goal: To find diagnosing and prescribing associations among symptoms, diseases, and treatments (medical procedures and medicines).
• Process:
1. Input a target code (c) that represents a symptom, disease, or treatment.
2. Identify the set of clinical visits (V) in the inpatient/outpatient records that contain c.
3. Identify the set of codes (O) that are also assigned in V.
4. For each code o in O, calculate the probability of o given the occurrence of c, that is, $P(o \mid c) = \frac{P(o \cap c)}{P(c)}$.
5. Rank and return the codes in O that meet thresholds such as minimum support, confidence, and interest.
[Figure: nested sets. Entire Patient Visits ⊃ Patient Visits with Code c ⊃ Patient Visits with both Codes c and o]
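The five-step SDT process can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: it assumes each visit is available as a set of codes, uses confidence for P(o | c), and uses interest as confidence divided by the overall frequency of o. The function name, threshold defaults, and toy visits are hypothetical.

```python
from collections import Counter


def sdt_associations(visits, target, min_support=2,
                     min_confidence=0.1, min_interest=1.0):
    """Rank codes that co-occur with `target` across visits.

    `visits` is a list of collections of codes, one per clinical visit.
    Returns (code, support, confidence, interest) tuples.
    """
    visits = [set(v) for v in visits]          # count each code once per visit
    n_total = len(visits)
    # Step 2: visits V that contain the target code c
    with_target = [v for v in visits if target in v]
    if not with_target:
        return []
    # Step 3: codes O that are also assigned in V, with co-occurrence counts
    co_counts = Counter(code for v in with_target for code in v
                        if code != target)
    overall = Counter(code for v in visits for code in v)

    results = []
    for code, n_both in co_counts.items():
        # Step 4: P(o | c) = P(o and c) / P(c)
        confidence = n_both / len(with_target)
        interest = confidence / (overall[code] / n_total)
        if (n_both >= min_support and confidence >= min_confidence
                and interest >= min_interest):
            results.append((code, n_both, confidence, interest))
    # Step 5: rank by confidence, then support
    results.sort(key=lambda r: (-r[2], -r[1], r[0]))
    return results


# Toy visits, invented for illustration (not the study data).
toy_visits = [
    {"162.9", "786.2", "OIRES2"},
    {"162.9", "786.2"},
    {"162.9", "486"},
    {"486", "786.2"},
    {"401.9"},
]
ranked = sdt_associations(toy_visits, target="162.9")
print(ranked)
```

With the default thresholds, only the code that co-occurs with the target in at least two visits survives; lowering `min_support` to 1 would also return the single-visit codes.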
Study Case: Lung Cancer
• To illustrate the SDT association, we use lung cancer subjects, for two reasons:
– Cancer continues to be the leading cause of death around the world, accounting for one fourth of deaths in the United States (Jemal et al., 2010) and 27.3% of deaths in Taiwan (Department of Health, Taiwan, 2010).
– Among all cancers, lung cancer ranks highest in patient population, new cases, and deaths (Jemal et al., 2010). Lung cancer (ICD-9 code: 162) is also one of the most common types of cancer in our dataset (534 unique patients and 1131 distinct inpatient visits).
• Among the lung cancer subtypes, subjects are selected when their disease is coded as 162.9 (ICD-9 code for malignant neoplasm of bronchus and lung, unspecified), resulting in a total of 484 subjects and 975 inpatient visits.
Descriptive Statistics of the Data
[Bar chart: Departments]
Pulmonary medicine: 622
Pulmonary medicine and critical illness: 122
Cardiology: 48
Thoracic surgery: 43
Gastroenterology: 30
General surgery: 21
Other 11 Dept. (each < 20): 89

[Bar chart: Physicians]
Bars: M1158; M1584; M0031; Other 64 physicians (each < 50)
Values shown: 386, 175, 312, 102 (the pairing of values to bars is not preserved in this transcript)
Descriptive Statistics of the Data (cont.)
[Bar chart: Patient Genders]
M: 617
F: 358

[Bar chart: Patient Age Groups]
25 to 44: 33
45 to 64: 291
> 65: 651

[Bar chart: Frequently Co-occurring Diagnoses]
Pneumonia organism unspecified: 244
Secondary malignant neoplasm of bone and…: 155
Unspecified essential hypertension: 135
Acute respiratory failure: 131
Unspecified pleural effusion: 130
Secondary malignant neoplasm of brain and…: 110
Secondary malignant neoplasm of pleura: 100
Diabetes mellitus without complication type ii or…: 91
Secondary malignant neoplasm of lung: 83
Secondary malignant neoplasm of liver: 77
Top 10 Identified Symptom & Disease Associations for Lung Cancer (162.9)

Symptom
Rank | ICD Code | Name
1 | 792.1 | Nonspecific abnormal findings in stool contents
2 | 799.4 | Cachexia
3 | 780.71 | Chronic fatigue syndrome
4 | 786.3 | Hemoptysis
5 | 785.9 | Other symptoms involving cardiovascular system
6 | 786 | Symptoms involving respiratory system and other chest symptoms
7 | 786.5 | Chest pain
8 | 786.2 | Cough
9 | 786.9 | Other symptoms involving respiratory system and chest
10 | 799.0 | Asphyxia

Disease
Rank | ICD Code | Name
1 | 162.9 | Malignant neoplasm of bronchus and lung, unspecified
2 | 162 | Malignant neoplasm of trachea, bronchus and lung
3 | 733.11 | Pathologic fracture of humerus
4 | 197.1 | Secondary malignant neoplasm of mediastinum
5 | 162.5 | Malignant neoplasm of lower lobe, bronchus or lung
6 | 163.9 | Malignant neoplasm of pleura, unspecified
7 | 162.0 | Malignant neoplasm of trachea
8 | 197.2 | Secondary malignant neoplasm of pleura
9 | 198.3 | Secondary malignant neoplasm of brain and spinal cord
10 | 162.8 | Malignant neoplasm of other parts of bronchus or lung

Verification (WebMD; Other Sources*): eight V marks appear on the slide; their row positions are not preserved in this transcript.
* Note: other sources include lungcancerbookandnewsletter.com and the Japanese Journal of Clinical Oncology
Top 10 Identified Procedure & Order Associations for Lung Cancer (162.9)

Treatment -- Procedure
Rank | ICD Code | Name
1 | 33.26 | Closed (percutaneous) (needle) biopsy of lung
2 | 33.27 | Closed endoscopic biopsy of lung
3 | 32.4 | Lobectomy of lung
4 | 33.24 | Closed (endoscopic) biopsy of bronchus
5 | 33.93 | Puncture of lung
6 | 34.24 | Pleural biopsy
7 | 92.18 | Total body radioisotope scan
8 | 92.24 | Teleradiotherapy using photons
9 | 40.11 | Biopsy of lymphatic structure
10 | 40.24 | Excision of inguinal lymph node
Verification (WebMD; Other Sources*): nine V marks appear on the slide; their row positions are not preserved in this transcript.

Treatment -- Orders
Rank | Order ID | Name
1 | OIRES2 | Gefitinib F.C. (Iressa)
2 | IPEME5 | Pemetrexed Disodium Heptahydrate (Alimta)
3 | OERLOT | Erlotinib (Tarceva)
4 | OGEFI2 | Gefitinib F.C. (Iressa)
5 | IETOP1 | Etoposide injection (Etoposide-Teva)
6 | IDOCE8 | Docetaxel (Taxotere)
7 | 70213 | Radical lymphadenectomy
8 | SMEGEOS | Megestrol acetate suspension (Megest)
9 | 37038 | Intravenous chemotherapy <=1 hours
10 | 33103 | CT Guide biopsy
Verification (WebMD; Other Sources*): ten V marks appear on the slide; their positions are not preserved in this transcript.
* Note: other sources include mdguidelines.com, Wikipedia, PubMed, and Journal of Clinical Oncology
Scenario-based SDT Association (Personalized Medicine)
• SDT associations can be further extended to illuminate associations in a specific (personalized) scenario, such as:
– Physician-centered SDT association
• Physician's treatment propensity → education and dissemination of best practice
– Patient-centered SDT association
• Influences of patient's demographic background and medical status → patient-centered care
– SDT association for multiple target diseases (complex disease scenarios)
[Figure, left: Entire Patient Visits ⊃ Patient Visits with Code c ⊃ Patient Visits with both Codes c and o]
[Figure, right: Patient Visits in the Scenario ⊃ Patient Visits with Code c ⊃ Patient Visits with both Codes c and o]
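The scenario-based variant differs from the general SDT computation only in the population: the visits are first restricted to those matching the scenario, and the same conditional probability is then computed inside that subset. The following is a minimal sketch under that reading; the record layout (a dict with `physician`, `sex`, and `codes` keys), the function name, and the toy visits are hypothetical.

```python
def conditional_confidence(visits, target, code, scenario=None):
    """P(code | target) among visits matching `scenario`.

    `scenario` is a predicate on a visit record; None means all visits.
    Returns None when no matching visit contains the target code.
    """
    pool = [v for v in visits if scenario is None or scenario(v)]
    with_target = [v for v in pool if target in v["codes"]]
    if not with_target:
        return None
    n_both = sum(1 for v in with_target if code in v["codes"])
    return n_both / len(with_target)


# Toy visit records, invented for illustration (not the study data).
toy_records = [
    {"physician": "M0031", "sex": "F", "codes": {"162.9", "OIRES2"}},
    {"physician": "M0031", "sex": "M", "codes": {"162.9", "OIRES2"}},
    {"physician": "M1158", "sex": "M", "codes": {"162.9", "IPEME5"}},
    {"physician": "M1158", "sex": "F", "codes": {"162.9", "OIRES2"}},
    {"physician": "M1158", "sex": "M", "codes": {"162.9"}},
]

overall = conditional_confidence(toy_records, "162.9", "OIRES2")
by_m0031 = conditional_confidence(
    toy_records, "162.9", "OIRES2",
    scenario=lambda v: v["physician"] == "M0031")
by_m1158 = conditional_confidence(
    toy_records, "162.9", "OIRES2",
    scenario=lambda v: v["physician"] == "M1158")
print(overall, by_m0031, by_m1158)
```

Because the scenario is an arbitrary predicate, the same function covers physician-centered, patient-centered, and combined scenarios; small subgroups yield confidence values based on few visits, so a support threshold would normally accompany it.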
Consistency of Top Treatment Orders
[Table: Top 10 Treatment Orders from the Aggregated Group, marked (V) per subgroup]
Columns (aggregated top 10): Iressa; Alimta; Tarceva; Iressa; Etoposide-Teva; Taxotere; Radical lymphadenectomy; Megest; Intravenous chemotherapy <=1 hours; CT Guide biopsy
Rows (subgroups):
Physician: M0031; M1158; M1584
Department: Pulmonary medicine (PM); PM and critical illness; Cardiology
Gender: F; M
Age group: 25 to 44; 45 to 64; > 65
Co-occurring disease: 198.5; 486; 518.81
(The positions of the V marks are not preserved in this transcript.)
• When ranked by confidence, the top 10 treatment orders in each subgroup show low consistency with those of the aggregated group.
– This result indicates that context has a high impact on SDT associations, which motivates our further development of scenario-based SDT techniques.
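One simple way to quantify this kind of consistency is the overlap between the aggregated top-k list and each subgroup's top-k list when both are ranked by confidence. The sketch below illustrates that idea; the slide does not state how the check marks were derived, so the overlap ratio, the function names, and the toy confidence values are assumptions.

```python
def top_k(confidences, k):
    """Order IDs of the k highest-confidence items (ties broken by ID)."""
    ranked = sorted(confidences.items(), key=lambda item: (-item[1], item[0]))
    return [order for order, _ in ranked[:k]]


def top_k_overlap(aggregated, subgroup, k=10):
    """Aggregated top-k orders that also appear in the subgroup's top-k.

    Returns (shared_orders, fraction_of_aggregated_top_k_shared).
    """
    agg_top = top_k(aggregated, k)
    if not agg_top:
        return [], 0.0
    sub_top = set(top_k(subgroup, k))
    shared = [order for order in agg_top if order in sub_top]
    return shared, len(shared) / len(agg_top)


# Toy confidence values, invented for illustration (not the study data).
aggregated_conf = {"OIRES2": 0.9, "IPEME5": 0.8, "OERLOT": 0.7, "IETOP1": 0.2}
subgroup_conf = {"OIRES2": 0.5, "IETOP1": 0.9, "OERLOT": 0.6, "IPEME5": 0.1}

shared, ratio = top_k_overlap(aggregated_conf, subgroup_conf, k=3)
print(shared, ratio)
```

A ratio near 1 would mean the subgroup prescribes much like the aggregate; low ratios across many subgroups correspond to the low consistency noted above.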
Correlation Analysis
Correlation Matrix for All Treatment Orders

Physician
 | AGGREGATED | M0031 | M1158 | M1584
AGGREGATED | 1 | 0.852 | 0.951 | 0.963
M0031 | 0.852 | 1 | 0.686 | 0.812
M1158 | 0.951 | 0.686 | 1 | 0.904
M1584 | 0.963 | 0.812 | 0.904 | 1

Co-occurring disease
 | AGGREGATED | 198.5 | 486 | 518.81
AGGREGATED | 1 | 0.94 | 0.922 | 0.812
198.5 | 0.94 | 1 | 0.822 | 0.711
486 | 0.922 | 0.822 | 1 | 0.943
518.81 | 0.812 | 0.711 | 0.943 | 1

Department
 | AGGREGATED | Pulmonary medicine (PM) | PM and critical illness | Cardiology
AGGREGATED | 1 | 0.752 | 0.985 | 0.904
Pulmonary medicine (PM) | 0.752 | 1 | 0.669 | 0.771
PM and critical illness | 0.985 | 0.669 | 1 | 0.843
Cardiology | 0.904 | 0.771 | 0.843 | 1

Age group
 | AGGREGATED | 25 to 44 | 45 to 64 | > 65
AGGREGATED | 1 | 0.878 | 0.944 | 0.987
25 to 44 | 0.878 | 1 | 0.862 | 0.84
45 to 64 | 0.944 | 0.862 | 1 | 0.879
> 65 | 0.987 | 0.84 | 0.879 | 1

Gender
 | AGGREGATED | F | M
AGGREGATED | 1 | 0.983 | 0.995
F | 0.983 | 1 | 0.959
M | 0.995 | 0.959 | 1

• Treatment orders in each subgroup are weighted by [support × confidence], and correlation matrices are calculated to compare the treatment orders across subgroups.
• While the correlations are generally high (> 0.6), several subgroups show relatively low correlations with others, such as physician M0031, age group 25 to 44, and the pulmonary medicine department.
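The weighting and correlation step can be sketched as follows: each treatment order receives the weight support × confidence within a group, and two groups are compared by correlating their weight vectors over a shared list of orders. The slide does not name the correlation coefficient, so Pearson correlation is an assumption here, as is treating an order absent from a group as weight 0. All names and toy values are hypothetical.

```python
from math import sqrt


def weight_vector(rules, orders):
    """support * confidence for each order; orders missing from `rules` get 0."""
    vector = []
    for order in orders:
        support, confidence = rules.get(order, (0, 0.0))
        vector.append(support * confidence)
    return vector


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sd_x * sd_y)


# Toy rules: order ID -> (support, confidence). Invented for illustration.
aggregated_rules = {"OIRES2": (10, 0.5), "IPEME5": (4, 0.25), "OERLOT": (2, 0.5)}
subgroup_rules = {"OIRES2": (4, 0.5), "IPEME5": (2, 0.5)}

orders = sorted(set(aggregated_rules) | set(subgroup_rules))
r = pearson(weight_vector(aggregated_rules, orders),
            weight_vector(subgroup_rules, orders))
print(round(r, 3))
```

Repeating the comparison for every pair of groups fills a matrix like those above; the diagonal is 1 because each group correlates perfectly with itself.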
Manager Dashboard
(Hospital Performance Indicators)
• Goal: To understand hospital performance indicators across four categories, including income, physician, and patient.
Physician Dashboard
(Summarized Patient Profile)
Physician Dashboard
(Scenario-based SDT Associations Visualization)
[Dashboard panels: Target; Symptoms; Diseases; Treatment (Procedures); Treatment (Orders)]
Preliminary Findings
• Drawing on data integration and data mining methods, the proposed SDT association techniques aim to facilitate diagnostic reasoning and planning tasks.
• Qualitative verification of the top SDT associations shows a high level of consistency with expert knowledge.
• Scenario-based SDT associations can provide fine-grained associations for specific subgroups (patients or physicians).
• Physician dashboards can potentially help alleviate clinical information and cognitive overload.
Healthcare Data Mining and Business Intelligence
Research Road Maps
• Clinical data mining and decision support
– SDT association rule mining; patient cross-disease analysis; patient clustering and visualization; patient and treatment anomaly detection; patient disease progression analysis
– Physician dashboard visualization; patient information aggregation and progression visualization
– Research support and care information from literature, medical ontology, and web
– Physician support social media and network (PhysiciansLikeMe)
• Patient care and support
– Patient information portal; patient-centered management
– Capture of regular physical exams and lab reports; progression analysis
– Wireless sensors; home care monitoring and reporting
– Information capture from patients and family members
– Patient support social media and network; PatientsLikeMe, Cancer Survivor Network
• Executive information systems
– Resource and facility utilization; department and disease analysis
– Cost analysis and assessment
Reference
• Agarwal, R., Gao, G., DesRoches, C., & Jha, A. K. (2010). Research Commentary: The Digital Transformation of Healthcare: Current Status and the Road Ahead. Information Systems Research, 21(4), 796-809.
• Austin, R. M., Onisko, A., & Druzdzel, M. J. (2010). The Pittsburgh Cervical Cancer Screening Model: a risk assessment tool. Archives of Pathology & Laboratory Medicine, 134(5), 744-750.
• Brender, J., Nøhr, C., & McNair, P. (2000). Research needs and priorities in health informatics. International Journal of Medical Informatics, 58-59, 257-289.
• Cao, L. (2010). Domain-Driven Data Mining: Challenges and Prospects. Knowledge and Data Engineering, IEEE Transactions on, 22(6), 755-769.
• Cios, K. J., & William Moore, G. (2002). Uniqueness of medical data mining. Artificial Intelligence in Medicine, 26(1-2), 1-24.
• Cao, H., Markatou, M., Melton, G. B., Chiang, M. F., & Hripcsak, G. (2005). Mining a clinical data warehouse to discover disease-finding associations using co-occurrence statistics, 2005, 106-110.
• Dahlström, O., Thyberg, I., Hass, U., Skogh, T., & Timpka, T. (2006). Designing a decision support system for existing clinical organizational structures: considerations from a rheumatology clinic. Journal of Medical Systems, 30(5), 325-331.
• Department of Health, Taiwan. (2010, December 13). Twenty leading causes of death. Retrieved February 8, 2011, from http://www.doh.gov.tw/CHT2006/DisplayStatisticFile.aspx?d=78276 (in Chinese)
• Doddi, S., Marathe, A., Ravi, S. S., & Torney, D. C. (2001). Discovery of association rules in medical data. Medical Informatics & the Internet in Medicine, 26(1), 25-33.
• Fieschi, M., Dufour, J. C., Staccini, P., Gouvernet, J., & Bouhaddou, O. (2003). Medical decision support systems: old dilemmas and new paradigms. Methods of Information in Medicine, 42(3), 190-198.
• Gotz, D., & Zhou, M. X. (2009). Characterizing users' visual analytic activity for insight provenance. Information Visualization, 8(1), 42-55.
• Hanauer, D. A., Rhodes, D. R., & Chinnaiyan, A. M. (2009). Exploring Clinical Associations Using '-Omics' Based Enrichment Analyses. PLoS ONE, 4(4), e5203.
Reference (cont.)
• Horsky, J., Kaufman, D. R., Oppenheim, M. I., & Patel, V. L. (2003). A framework for analyzing the cognitive complexity of computer-assisted clinical ordering. Journal of Biomedical Informatics, 36(1-2), 4-22.
• Hou, Q., Lin, Z., Dusing, R. W., Gajewski, B. J., & McCallum, R. W. (2010). A Bayesian hierarchical assessment of gastric emptying with the linear, power exponential and modified power exponential models. Neurogastroenterology and Motility: The Official Journal of the European Gastrointestinal Motility Society.
• Imberman, S. P., Domanski, B., & Thompson, H. W. (2002). Using dependency/association rules to find indications for computed tomography in a head trauma dataset. Artificial Intelligence in Medicine, 26(1-2), 55-68.
• Jemal, A., Siegel, R., Xu, J., & Ward, E. (2010). Cancer Statistics, 2010. CA Cancer J Clin, 60(5), 277-300.
• Institute of Medicine. (2001). Crossing the Quality Chasm: A New Health System for the 21st Century. Washington, D.C.: The National Academies Press.
• Klann, J., Schadow, G., & McCoy, J. M. (2009). A recommendation algorithm for automating corollary order generation. AMIA Annual Symposium Proceedings / AMIA Symposium. AMIA Symposium, 2009, 333-337.
• Kushniruk, A. W., & Patel, V. L. (2004). Cognitive and usability engineering methods for the evaluation of clinical information systems. Journal of Biomedical Informatics, 37(1), 56-76.
• Lau, F., Kuziemsky, C., Price, M., & Gardner, J. (2010). A review on systematic reviews of health information system studies. Journal of the American Medical Informatics Association, 17(6), 637-645.
• Long, W. J. (2001). Medical informatics: reasoning methods. Artificial Intelligence in Medicine, 23(1), 71-87.
• Norris, A. C. (2002). Current trends and challenges in health informatics. Health Informatics Journal, 8(4), 205-213.
• Nykänen, P. (2000). Decision Support Systems from a Health Informatics Perspective.
• Ordonez, C. (2006). Association rule discovery with the train and test approach for heart disease prediction. IEEE Transactions on Information Technology in Biomedicine, 10(2), 334-343.
• Patel, V. L., Shortliffe, E. H., Stefanelli, M., Szolovits, P., Berthold, M. R., Bellazzi, R., & Abu-Hanna, A. (2009). The coming of age of artificial intelligence in medicine. Artificial Intelligence in Medicine, 46(1), 5-17.
Reference (cont.)
• Qu, Y., & Furnas, G. W. (2005). Sources of structure in sensemaking. In CHI '05 extended abstracts on Human factors in computing systems, CHI '05 (pp. 1989-1992). New York, NY, USA: ACM.
• Ramakrishnan, N., Hanauer, D., & Keller, B. (2010). Mining Electronic Health Records. Computer, 43(10), 77-81.
• Rao, B. R., Sandilya, S., Niculescu, R., Germond, C., & Goel, A. (2002). Mining time-dependent patient outcomes from hospital patient records. Proceedings / AMIA ... Annual Symposium. AMIA Symposium, 632-636.
• Stead, W. W., & Lin, H. (Eds.). (2009). Computational technology for effective health care: immediate steps and strategic directions. National Academies Press.
• Sittig, D. F., Wright, A., Osheroff, J. A., Middleton, B., Teich, J. M., Ash, J. S., Campbell, E., & Bates, D. W. (2008). Grand challenges in clinical decision support. Journal of Biomedical Informatics, 41(2), 387-392.
• Tai, Y., & Chiu, H. (2009). Comorbidity study of ADHD: Applying association rule mining (ARM) to National Health Insurance Database of Taiwan. International Journal of Medical Informatics, 78(12), e75-e83.
• Toussi, M., Lamy, J., Le Toumelin, P., & Venot, A. (2009). Using data mining techniques to explore physicians' therapeutic decisions when clinical guidelines do not provide recommendations: methods and example for type 2 diabetes. BMC Medical Informatics and Decision Making, 9, 28.
• Wright, A., Chen, E. S., & Maloney, F. L. (2010). An automated technique for identifying associations between medications, laboratory results and problems. Journal of Biomedical Informatics, 43(6), 891-901.
• Wright, A., & Sittig, D. F. (2006). Automated development of order sets and corollary orders by data mining in an ambulatory computerized physician order entry system. AMIA ... Annual Symposium Proceedings / AMIA Symposium. AMIA Symposium, 819-823.
• Wroblewski, D., Francis, B. A., Chopra, V., Kawji, A. S., Quiros, P., Dustin, L., & Massengill, R. K. (2009). Glaucoma detection and evaluation through pattern recognition in standard automated perimetry data. Graefe's Archive for Clinical and Experimental Ophthalmology = Albrecht Von Graefes Archiv Für Klinische Und Experimentelle Ophthalmologie, 247(11), 1517-1530.