[Figure: Medical Informatics related disciplines]
  Database System (DBS)
  Information Technology (IT)
  Medical Record (MR)
  Electronic Medical Record (EMR)
  Computer Technology (CT) (Web; digital information)
  Medical Informatics (MI)
  Data Mining (DM)
  Evidence Based Medicine (EBM)
  Bio-informatics
  Knowledge Discovery

Medical Information System
[Figure: stages of the medical record along a time axis, up to the 2000s]
  Automated Medical Record: paper base
  Computerized Medical Record: scanning the paper document
  Electronic Medical Record: computer format
  Electronic Patient Record: wider scope of personal information
  Electronic Health Record: wellness information

[Figure: Knowledge System. Labels: Admission; Discharge; Clinic; Department; Reporting; Financial management; Decision making; Data harmonization; Data warehousing; Data mining; Networking; Worldwide data; Public health; Research system; User driver; Technology driver]

Knowledge Discovery & Data Mining

Four basic elements of a database system
  User: end user; casual user; practitioners
  Data: text; graph; image; sound; video
  Software: general; application
  Hardware: CPU; I/O; network

Types of current database system
  Hierarchical database
  Network database
  Object-oriented database
  Relational database (dBASE III Plus; Clipper; FoxPro; Access)

Advantages of integrated database
  Data sharing
  Minimal data redundancy
  Data consistency
  Data standard improvement
  Data integrity improvement
  Data security
  Faster development of new applications

Normalization (best table design)
  Split the tables until every field depends on the primary key only
  Primary key (PK)
    The group of attributes that uniquely (no nulls) determines every other attribute in the relation
  Candidate key
    Any group of attributes that could serve as the PK is a candidate key
  Foreign key
    An attribute of a referencing table whose values are PK values of the home table
  First normal form (1NF)
    If and only if 1) all domains contain single (atomic) values only
  Second normal form (2NF)
    If and only if 1) it is in 1NF, and 2) every nonkey attribute is fully dependent on the PK
  Third normal form (3NF)
    If and only if 1) it is in 2NF,
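    and 2) every nonkey attribute is non-transitively dependent on the PK
  Further normalization
    Boyce-Codd normal form (BCNF)
    Fourth normal form (4NF)
    Fifth normal form (5NF)

[Figure: hospital departments and functions linked by the information system, built up in three stages]
  Stage 1: registration; health examination; pharmacy; outpatient clinic; emergency; subspecialty; laboratory tests; X-ray; pathology; cardiac catheterization; discharge; endoscopy; cytology; blood bank; physical therapy; nutrition; medical records
  Stage 2 adds: meal service; anesthesia; inpatient admission; surgery; nursing care
  Stage 3 adds: physician performance; payroll; personnel management; drug inventory and purchasing; materials and supplies purchasing; cashier; financial accounting; cost accounting

Hospital Information System
  1 Outpatient and emergency information system
      registration system; physician order system; billing management system; accounts management system; health insurance claims system
  2 Inpatient information system
      admission management system; pricing and billing system; health insurance claims, appeals, and dispute system; unit-dose dispensing; nursing station management system; operating room management system; inpatient physician order system; accounts management system
  3 Medical records management system
  4 Pharmacy online system
  5 Physician performance system
  6 Meal service system
  7 Blood bank system
  8 Laboratory and examination report system
  9 X-ray film management system
  10 Emergency patient status tracking system
  11 Disease classification and statistics system
  12 Drug inventory and purchasing system
  13 Materials and supplies inventory and purchasing system
  14 Personnel management system
  15 Payroll management system
  16 Financial accounting system
  17 Cost accounting system
  18 Cashier system
  19 Property management system
  20 Health examination system

Knowledge discovery in databases (KDD)
  An explosive growth in capabilities to both generate and collect data
    scientific data (e.g. remote sensors, space satellites)
    business data (e.g. bar codes for commercial products, credit cards)
    human genome database (human genetic code)
    government transactions (e.g.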
    tax returns)
    health care transactions (control costs, improve quality)
  Advanced data storage technology (faster, higher capacity, cheaper)
  Better database management systems and data warehousing technology
  These allow us to transform this data deluge into "mountains" of stored data
  Such volumes of data overwhelm traditional manual methods of data analysis such as spreadsheets and ad-hoc queries
    these methods can create informative reports from data
    they cannot analyze the contents of reports to focus on important knowledge
  A significant need exists for a new generation of techniques and tools with the ability to intelligently and automatically assist humans in analyzing the mountains of data for nuggets of useful knowledge
  These techniques and tools are the subject of the emerging field of KDD

Knowledge Discovery and Data Mining
  Components
    Model representation
    Model evaluation
    Parameter search & model search
  Methods
    Decision trees and rules
    Neural networks
    Regression (linear and non-linear)
    Classification and clustering
    Probabilistic graphical dependency models (Bayesian networks)
    Relational learning models (autoregressive models)
    etc.
  Applications
    Business management
    Health care fraud detection & prevention
    Astronomy
    Molecular biology
    Global climate change modeling
    Medicine
    etc.

[Figure: Knowledge Discovery. Labels: Data preparation; Search for patterns; Data Mining; Knowledge evaluation; Knowledge interpretation]

Definitions of KDD terms
  Data
    A set of facts F (e.g. cases in a database).
    Example: F is the collection of 23 cases with three fields each, containing the values of debt, income, and loan.
  Pattern
    An expression E in a language describing the facts in a subset F_E of F.
    Example: "If income < $t, then the person has defaulted on the loan".
  Knowledge
    A pattern is called knowledge if its measure of interest exceeds some user-specified threshold. The notion is purely user-oriented, and determined by whatever functions and thresholds the user chooses.
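The data, pattern, and knowledge definitions above can be made concrete with a minimal Python sketch. Everything named here is an assumption for illustration: the eight cases in `F` are invented (they are not the 23 cases mentioned in the lecture), and `evaluate_pattern`, `is_knowledge`, and the two measures (accuracy and coverage) are hypothetical choices standing in for whatever functions and thresholds a user would pick.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    debt: float
    income: float
    defaulted: bool  # loan status: True if the person defaulted


# Hypothetical data set F (illustrative values only, not from the lecture).
F = [
    Case(debt=30, income=20, defaulted=True),
    Case(debt=25, income=25, defaulted=True),
    Case(debt=10, income=28, defaulted=False),
    Case(debt=40, income=35, defaulted=True),
    Case(debt=15, income=45, defaulted=False),
    Case(debt=20, income=50, defaulted=False),
    Case(debt=5, income=60, defaulted=False),
    Case(debt=35, income=70, defaulted=False),
]


def evaluate_pattern(cases, t):
    """Score the pattern E: 'if income < t, then the person has defaulted'.

    Returns (accuracy, coverage), where accuracy is the fraction of the
    covered cases F_E in which the person really defaulted, and coverage
    is the fraction of all cases that E describes.
    """
    f_e = [c for c in cases if c.income < t]  # subset F_E described by E
    if not f_e:
        return 0.0, 0.0
    accuracy = sum(c.defaulted for c in f_e) / len(f_e)
    coverage = len(f_e) / len(cases)
    return accuracy, coverage


def is_knowledge(cases, t, min_accuracy, min_coverage):
    """Assumed rule: E is knowledge if both measures reach the user's thresholds."""
    accuracy, coverage = evaluate_pattern(cases, t)
    return accuracy >= min_accuracy and coverage >= min_coverage
```

With t = 40 the pattern covers half of these cases and is right for three of the four it covers, so it counts as knowledge under thresholds of 0.7 accuracy and 0.4 coverage, but not when the accuracy threshold is raised to 0.9.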
    By appropriate settings of thresholds, one can emphasize accurate predictors or useful patterns over others.
  Data mining
    A step in the KDD process consisting of particular data mining algorithms that, under some acceptable computational efficiency limitations, produce a particular enumeration of patterns over F.
  KDD process
    involves data preparation, search for patterns, knowledge evaluation, and refinement involving iteration after modification.
    is the process of using data mining methods (algorithms) to extract (identify) what is deemed knowledge according to the specifications of measures and thresholds, using the database F.
    is interactive and iterative, involving numerous steps with many decisions being made by the user.

Knowledge discovery in databases vs. data mining (KDD vs. DM)
  KDD
    the term has mostly been used by artificial intelligence and machine learning researchers
    the overall process of discovering useful knowledge from data
  DM
    the term has commonly been used by statisticians, data analysts, and the MIS community
    application of algorithms for extracting patterns from data without the additional steps of the KDD process (such as incorporating appropriate prior knowledge and proper interpretation of the results)
    blind application of DM methods can be a dangerous activity

Links between KDD (and DM) and related fields
  machine learning
  pattern recognition
  databases
  statistics
  artificial intelligence
  knowledge acquisition for expert systems
  data visualization

The National Health Insurance Information System
  Nine major systems:
    Underwriting of insurance
    Payments for medical fees
    Management of medical affairs
    Financial management
    Administrative support
    Decision support system
    Exchange of information
    Management of community-based insuring agencies
    Safety control

Data stored on the National Health Insurance IC card
  The data on the card is stored in four segments; more content may be added in the future as actual needs require.
  1. Basic data segment:
     Mainly used for identification. Contents include: card number, name, national ID number or identity document number, date of birth, sex, card issue date, photograph, and card cancellation flag.
  2. Health insurance data segment:
     Mainly records data related to medical visits, including: insured person's status flag, card expiry date, catastrophic illness flag, number of medical visits available, serial number of the most recent visit, newborn dependent flag,
     medical visit record entries, cumulative visit data, cumulative total of medical expenses, personal insurance premiums, preventive health services, first day of the last menstrual period, expected date of delivery, and prenatal examinations. The data in this segment goes online in phases.
  3. Medical segment:
     Mainly stores outpatient prescriptions, long-term prescriptions, important physician order items, drug allergies, and so on. The data in this segment will go online in phases, taking into account how well medical institutions adapt.
  4. Health administration segment:
     Mainly stores immunization data, organ donation data, and so on.

Medical Informatics as a Discipline (http://www.cpmc.columbia.edu/edu/textbook)
  medical informatics = study and use of computers and information in health care
  purpose of this lecture is to further define the field

definition by the Association of American Medical Colleges (AAMC), 1986
  "Medical informatics is a developing body of knowledge and a set of techniques concerning the organizational management of information in support of medical research, education, and patient care.... Medical informatics combines medical science with several technologies and disciplines in the information and computer sciences and provides methodologies by which these can contribute to better use of the medical knowledge base and ultimately to better medical care."

history of computers
  1800s - Charles Babbage's logic engine
  1890 - Herman Hollerith's punch cards for the census
  1940s - early programmable digital computers (ENIAC)
  1950s - commercially available (UNIVAC)
  1960s - faster, more memory
  1970s - minicomputers
  1980s - microcomputers, networks
  1990s - RISC, workstations, growth of networks

appearance of computers in medicine
  1960s
    practical = early departmental and monolithic
    research = early ECG and diagnosis
  1970s
    practical = monolithic administrative & departmental, imaging (CT), early bibliographic retrieval
    research = alerts, MYCIN (early successes)
  1980s
    practical = results reporting, outpatient, growth of clinical systems and databases
    research = AI, IR
  1990s
    practical = integration, communication
    research = vocab, interfaces, coding, evaluation

factors in lack of use of computers in clinical care
  involves complex organisms (unlike physical processes)
    if over-simplified, not useful (vs. a bank transaction)
    therefore need sophisticated abstraction + detail
  technology for gathering complex info.
  just emerging
    e.g. low use of QMR or DXplain
    therefore providers have not entered info.
  reimbursement has not been linked to clinical info.
    therefore many admin. systems but few clinical
  health care administered by individuals, small groups
    less need for coordination
  inertia
  fear
  ignorance
  money
  security, integrity
  lack of standards
  language
  previous failures
  rapid turnover of technology

factors in recent increase of medical informatics
  increase in use of technology - more data generated
  mobility of population - need to communicate
  specialization - need to communicate
  managed care systems - need to communicate
  rise in health care costs - attempt to control care
  improved hardware - faster and more memory
  improved methods - acquisition, transfer, retrieval
  reduced computer costs
  increased awareness

related fields
  biomedical engineering - ECG, devices
    MI works at a higher level of abstraction
  electrical engineering - hardware
  computer science - algorithms, closer to mathematics
    MI is specific to the health domain
  medical computer science - subdivision of computer science
  cognitive science - AI and psychology
    MI is not concerned with studying the human brain
  information theory - physics of communication
  information (library) science - manages paper/electronic info
    MI is close to this, but MI is not limited to info. storage and retrieval
  software industry - producing products
    MI stresses evaluation
    MI is not dependent on selling every product

makeup of med-info groups
  MDs, RNs, dentists, other health care workers
  PhDs, esp. computer science (also physics, ...)
  administrators, policy planners
  masters, PhD programs in medical informatics
  industry

current issues in clinical care
  cost
  accessibility of health care
  coordinating care and setting policy
  acquisition and retrieval of data (e.g. across institutions)
  acquisition and sharing of knowledge (e.g. specialist)

medical informatics research mirrors clinical issues
  data acquisition - GUI, NLP
  data storage - databases, modeling
  vocabularies - format, content
  organization of data - Larry Weed's problem-oriented medical record (POMR), 1969
  machine interfaces - standards like HL7, security
  data retrieval - query languages
  knowledge acquisition
  knowledge representation - Arden Syntax
  application of knowledge when needed
    decision analysis, alerts, diagnosis
    education
    care plans and practice guidelines