Progress_20130409

Classifying Neurosurgery Operation Notes using Text Mining Techniques
Introduction: Inspiration and Goal
• Taiwan Neurosurgical Society: newly revised surgical case statistics form, ROC year 101 (2012)
• How can we build a ‘reasonable’ classification system for neurosurgery from the informatics point of view?
• Brain tumor
– (1) Glioma: high grade, low grade
– (2) Meningioma
– (3) Pituitary tumor
– (4) Acoustic neuroma
– (5) Others
• Vascular
– Aneurysm (Microsurgery)
– AVM (Excision)
– EC-IC bypass
– Endarterectomy
– Cavernoma (Excision)
– Dural AVF (Microsurgery)
• CVA
– Spontaneous ICH
– Decompression for Infarction
• Head injury
– EDH
– Acute SDH
– Chronic SDH
– Traumatic ICH
– Cranioplasty
• VP Shunt
• Spine
– Tumor: malignant, benign
– HIVD: cervical, lumbar
– Stenosis: cervical, lumbar
– Other Instrumentation: cervical, lumbar
• Peripheral nerves
– Carpal tunnel syndrome
– PNS
• Functional
– Lesioning
– DBS
– MVD
– Hyperhidrosis
– Epilepsy surgery
• Endovascular surgery
– Aneurysm (Coiling)
– AVM
– Dural AVF
– Carotid angioplasty / stent
• Infection (abscess, empyema)
– Aspiration
– Drainage
– Excision
• Others
• Trauma
– Acute EDH → Craniotomy + evacuation
– Acute SDH → Craniotomy + evacuation
– Chronic SDH → Burr hole drainage
– Brain swelling → Decompressive craniectomy
– Skull bone defect → Cranioplasty
• Brain Tumor
– Glioma → Excision (total, subtotal, partial)
– Meningioma → Excision (Simpson grade )
– Ependymoma → Excision
– Lymphoma → Excision / Biopsy
– Epidermoid tumor → Excision
– Cavernous sinus tumor → Excision
– Medulloblastoma → Excision
– Chordoma → Excision
– Craniopharyngioma → Transcranial/transsphenoidal excision
– Pituitary → Transcranial/transsphenoidal adenomectomy
– Brain Metastasis → Excision
– Cavernoma → Excision
– CPA tumor → Excision (retrosigmoid / presigmoid)
• Spinal Tumor
– Spinal Metastasis → Anterior / Posterior decompression (biopsy)
– Neurilemmoma → Excision
– Meningioma → Excision
– Intramedullary tumor → Laminectomy + Excision/Biopsy
– Tethered cord syndrome → Detethering
– (Para)spinal lesion → Nil (CT-guided Bx)
• Spine
– C-stenosis → Laminectomy
– C-HIVD → ACDF (no instrument)
– C-OPLL → Multilevel corpectomy + plating
– L-stenosis → Laminectomy (without/with discectomy)
– L-HIVD → Discectomy (microsurgical)
– Spondylolisthesis → Laminectomy + transpedicle screw (name )
– Epidural abscess → Laminectomy + drainage
• Trauma
– C-spine fracture/dislocation → ACDF w traction
– TL-spine fracture/dislocation → Laminectomy with TPS
– C1-2 subluxation → Posterior fusion (Sonntag, Gallie, TAS, Occipitocervical)
MATERIALS (TNSS2012 poster)
• Between April 2009 and March 2012, 4639 operations were
performed on 2852 patients admitted to the
neurosurgery service of a medical center in northern
Taipei.
• We downloaded these operation notes from the hospital
database using the patient list obtained from our
proprietary software for scheduling admissions and
operations.
• A simple parser was applied to separate the operation
notes into four segments: header, timeline, billing
information and free text note. Free text operation notes
recording procedures performed by neurosurgeons were
extracted.
Materials and Methods
• 4639 semi-structured operation notes stored in HIS downloaded into PC
• Preprocessing
• Keyword selection and identification
• Agglomerative clustering
• Evaluation for appropriateness
4639 Semi-Structured Texts
• Format 1/2/3: 753/2552/2087 notes
– Formats 2 and 3 are equivalent (only the order of the billing data differs)
• Header ~ basic data
– who/what/when/where/how
• Timeline: timing of each operation and
anesthesia stages
• Billing information: NHI codes and counts
• Free text note: recording procedures
performed by neurosurgeons
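A minimal Python sketch of how such a four-way split might look. It assumes the section markers 時間資訊 (timeline), 醫令資訊 (billing) and 摘要 (free text) that appear in the Format 3 sample; the function name `split_note` and the toy note are hypothetical, and the parser actually used for the poster is not shown in these slides.

```python
# Assumed section markers, taken from the Format 3 sample; other formats may differ.
MARKERS = [("timeline", "時間資訊"), ("billing", "醫令資訊"), ("free_text", "摘要")]

def split_note(raw: str) -> dict:
    """Split one operation note into header, timeline, billing and free text."""
    segments = {"header": "", "timeline": "", "billing": "", "free_text": ""}
    # Locate each marker; a missing marker leaves its segment empty.
    found = sorted((raw.find(m), name, m) for name, m in MARKERS if m in raw)
    if not found:
        segments["header"] = raw.strip()
        return segments
    # Everything before the first marker is treated as the header.
    segments["header"] = raw[:found[0][0]].strip()
    for i, (pos, name, marker) in enumerate(found):
        end = found[i + 1][0] if i + 1 < len(found) else len(raw)
        # Drop the marker itself and any underscores that follow it.
        segments[name] = raw[pos + len(marker):end].strip().lstrip("_").strip()
    return segments

# Toy note (hypothetical, heavily shortened)
sample = (
    "NAME (gender;DOB;age)\n診斷 Brain tumor\n"
    "時間資訊\n13:18 進入手術室\n14:10 手術開始\n"
    "醫令資訊 類別 名稱 量 刀 側\n手術 腦瘤切除 1 1 L\n"
    "摘要__ Pre-operative Diagnosis Left convexity meningioma"
)
parts = split_note(sample)
print(parts["free_text"])
```

Only the `free_text` segment would feed the later text-mining steps; the other three segments are already structured.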
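Format3 (I)
NAME (gender;DOB;age)
• Operation date: yyyy/mm/dd
• Attending surgeon: xxx
• Operating area: rr, room xxx, no. yy
• Diagnosis: Brain tumor
• Instrument/procedure: Brain tumor Craniotomy (Others)
• Operation type: Scheduled operation
• Operation site: Head, neck
• Wound classification: Clean
• Anesthesia method: General anesthesia
• Anesthesia attending physician: yyy
• ASA: 2
• Recording physician: yyy
Timeline information (時間資訊)
• 00:00 NPO (unscheduled operation)
• 13:18 Entered operating room
• 13:20 Anesthesia started
• 14:00 Induction completed
• 14:10 Antibiotics administered
• 14:10 Operation started
• 18:05 Operation ended
• 18:05 Anesthesia ended
• 18:15 Patient sent out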
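Format3 (II)
Order information (醫令資訊): Category | Name | Quantity | Operation no. (刀) | Side
• Surgery | Brain tumor excision, operation time within 4 hours | 1 | 1 | L
• Anesthesia | PRE-ANESTHESIA EVALUATION | 1 | 0
• Anesthesia | SEMI-CLOSED INTRATRACHEAL INTUBA | 1 | 0
• Anesthesia | G-anesthesia (2-4 hours,each 30 | 4 | 0
• Anesthesia | SEMI-CLOSED INTRATRACHEAL INTUBA | 2 | 0
• Anesthesia | Peripheral arterial line inserti | 1 | 0
• Anesthesia | C.V.P. catheter intubation | 1 | 0
• Anesthesia | Lactic Acid (lactate) | 1 | 0
• Anesthesia | Complete arterial blood test panel | 1 | 0
• Anesthesia | Hemoglobin (Hb) | 1 | 0
• Anesthesia | Blood glucose measurement | 1 | 0
• Anesthesia | Ca (Calcium) | 1 | 0
• Anesthesia | Na (Sodium) | 1 | 0
• Anesthesia | K (Potassium) | 1 | 0
• Anesthesia | Blood gas analysis | 1 | 0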
Format3 (III)
Summary (摘要)
• Surgical department: Department of Surgery
• Template applied: Craniotomy for ICT
• Ordering physician: YYY
• Order time: 2011/03/31 18:25
• Pre-operative Diagnosis: Left convexity meningioma
• Post-operative Diagnosis: Left convexity meningioma
• Operative Method: Left frontoparietal craniotomy for tumor excision, Simpson grade I
• Specimen Count And Types: Several fragments of one tumor was sent for pathology.
• Pathology: Pending
• Operative Findings: One extraaxial dura-based, firm to elastic, well-capsulated, about 6-7cm, tumor located at left frontoparietal region.
• Operative Procedures: With endotracheal general anaesthesia, the patient was put in supine position with head fixed in Mayfield head clamp. After scalp shaved, scrubbed, disinfected, and then draped, we made one U-shape skin incision at left frontoparietal area. We drilled for burr holes, and then created craniotomy. We made dura incision around the tumor base, and dissected the arachnoid membrane plane around the tumor. The dura was closed in water-tight fashion, and bone graft was fixed back with wires. The wound was closed in layers after CWV insertion.
• Operators: VS XXX
• Assistants: R4 YYY
Free Text Preprocessing
• Convert into lower case
– Chinese characters already dropped
• Replace special characters with a space (“ ”)
– All punctuation removed
– Spaces between words still kept
• Abbreviations: kept in original form
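The preprocessing steps above can be sketched in a few lines. This is an illustrative Python version (the slides mention Perl), and it assumes that anything other than an ASCII letter, including digits, counts as a "special character"; the function name `preprocess` is hypothetical.

```python
import re

def preprocess(note: str) -> str:
    """Lower-case a free-text note and keep only letters separated by single spaces."""
    text = note.lower()
    # Assumption: every character that is not a-z becomes a space,
    # which also drops Chinese characters, digits and punctuation.
    text = re.sub(r"[^a-z]", " ", text)
    # Collapse the runs of spaces left behind, keeping one space between words.
    return re.sub(r" +", " ", text).strip()

print(preprocess("Left fronto-parietal craniotomy, Simpson grade I (腦瘤)."))
# prints: left fronto parietal craniotomy simpson grade i
```

Abbreviations pass through unchanged apart from the lower-casing, which matches "kept in original form" only loosely.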
Punctuation/Stop words
• Punctuation removal in Perl: tr/.,:;!?"(){}//d;
– Only alphabetic characters retained → 9906 words
• Stop word list used to eliminate
meaningless bigrams
– Salton G. The SMART Retrieval System.
Englewood Cliffs, NJ, Prentice Hall; 1971
– ‘Right’ retained → right/left IS important
• Section headings removed
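A rough Python sketch of the bigram filtering idea: a bigram is dropped when either word is a stop word, except that "right" is exempted. The stop-word set below is a tiny hypothetical subset for illustration, not the SMART list itself, and the names `STOP_WORDS`, `KEEP` and `bigrams` are made up for this example.

```python
# Tiny illustrative subset; the real SMART list is far longer.
STOP_WORDS = {"the", "was", "in", "with", "and", "of", "a", "at", "we", "then", "right"}
# Words deliberately kept even though a stop-word list may contain them.
KEEP = {"right", "left"}

def bigrams(text: str) -> list:
    """Return adjacent word pairs in which neither word is an active stop word."""
    words = text.split()
    stops = STOP_WORDS - KEEP
    return [(a, b) for a, b in zip(words, words[1:])
            if a not in stops and b not in stops]

print(bigrams("the dura was closed in water tight fashion at right frontal area"))
# prints: [('water', 'tight'), ('tight', 'fashion'), ('right', 'frontal'), ('frontal', 'area')]
```

Removing "right" from `KEEP` would lose the ("right", "frontal") pair, which is the laterality information the slide wants to preserve.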
Types of Keywords
• 42 manually selected; may be combined or isolated (e.g. fronto-temporo-parietal)
– Wildcards and spaces used
• Anatomy (16): location, structure
• Pathology (11)
• Procedure (15): name, steps
– Including instruments (intra-op and implants)
– May be identical/similar to procedure name
Anatomy Keywords (16)
• Brain (1+6)
– front*, pariet*, tempor*, occipit*, cerebell*,
ventricl*
• Spin* (1+4)
– root, thecal, sac, disc
• Others/common to brain and spine (4)
– pituitary, carotid, trache*
– nerve
Pathology Keywords (11)
• General (3)
– tumor, injury, abscess
• Specific to brain/more common in brain (5)
– parkinso*, aneurysm
– hematoma, hemorrhag*, swelling
• Specific to spine (1)
– spondylo*
• Others (2)
– hyperhidrosis, csf lea* (pituitary)
Procedure/Instrument Kwds (15)
• Brain (7)
– burr, hole, craniotomy, craniectomy, cranioplasty
– ventriculoperitoneal, shunt
• Spine (2)
– fusion, cage
• Common to brain and spine (2)
– decompressi*, scre*
• Others/nonspecific (4)
– radiofrequency (PNS?), debrid*, port a, emergen*
– trache*: already used as anatomy keyword
Issues in Choosing Keywords
• Specificity (anat/path/proc)
– Brain (19): 7/5/7 words
– Spine (8): 5/1/2 words
• C- and L-spine: single letters, too nonspecific
– Brain and spine: decompressi*, screw
• Common bigrams (various specificity)
– Ventriculoperitoneal shunt
– Thecal sac
– Burr hole
“ Trache*”, “Thecal Sac”
• Initially, “trach*” was used to find “tracheostomy”, but it also matched “endotracheal” and “endo-tracheal”
• “trach*” changed into “ trach*” (with a leading space) to eliminate the false positives (FPs)
• Dura, dural sac → nonspecific to spine
• Thecal sac → specific to spine
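The wildcard-and-space convention can be illustrated with a small Python sketch. It assumes that "*" stands for any run of letters and that a space inside a keyword must match a literal space; the helper names `keyword_to_regex` and `matches` are hypothetical, and the original matching code is not shown in the slides.

```python
import re

def keyword_to_regex(keyword: str) -> "re.Pattern":
    """Turn a slide-style keyword ('*' wildcard, literal spaces) into a regex."""
    parts = [re.escape(p) for p in keyword.split("*")]
    # Assumption: '*' matches zero or more lower-case letters.
    return re.compile("[a-z]*".join(parts))

def matches(keyword: str, text: str) -> bool:
    # Pad with spaces so a leading-space keyword can also match the first word.
    return keyword_to_regex(keyword).search(" " + text + " ") is not None

note = "with endotracheal general anaesthesia the patient was put in supine position"
print(matches("trach*", note))    # True: false positive inside "endotracheal"
print(matches(" trach*", note))   # False: the leading space requires a word start
print(matches(" trach*", "tracheostomy was performed"))  # True
```

The same helper covers bigram keywords such as "thecal sac" or "csf lea*" because spaces are matched literally.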
Horizontal Dendrograms
Cluster Dissimilarity (cases → clusters) and Linkage Criterion
• In order to decide which clusters should be
combined (for agglomerative), or where a cluster
should be split (for divisive), a measure of
dissimilarity between sets of observations is
required.
• Achieved by use of an appropriate metric (a
measure of distance between pairs of
observations), and a linkage criterion which
specifies the dissimilarity of sets as a function of
the pairwise distances of observations in the
sets.
Linkage criteria
• The linkage criterion determines the distance
between sets of observations as a function of
the pairwise distances between observations.
• Some commonly used linkage criteria between
two sets of observations A and B are
– Maximum or complete linkage clustering: worst case (the farthest pair)
– Minimum or single-linkage clustering: best case (the closest pair)
– Mean or average linkage clustering, or UPGMA: average of all pairs
Manning, C.D. and Schütze, H. (1999) Foundations of Statistical Natural Language Processing. MIT Press, Cambridge, Mass.
Examples of Different Strategies
Six Variants of Agglomerative
Clustering
• For 45 vectors in 3657 dimensions
• Vector-based variants: easy to compute, but create new vectors:
z = x OR y, z = (x + y)/2, or z = x + y
– Recalculate at each stage of linking
• Set-based variants: applying maximum,
minimum and mean distance between
observations in set pairs
– Lookup table possible
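A minimal sketch of the set-based variants, assuming a precomputed pairwise distance table (the "lookup table") that is never recalculated. The function names and the four-point toy data are hypothetical, and the naive search over all cluster pairs is only for illustration.

```python
from itertools import combinations

def set_distance(a, b, dist, method):
    """Distance between clusters a and b (tuples of indices) from the lookup table."""
    pairs = [dist[i][j] for i in a for j in b]
    if method == "complete":   # maximum: the farthest pair
        return max(pairs)
    if method == "single":     # minimum: the closest pair
        return min(pairs)
    if method == "upgma":      # mean over all pairs
        return sum(pairs) / len(pairs)
    raise ValueError(method)

def agglomerate(dist, method):
    """Merge the closest two clusters until one remains; return the merge history."""
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda p: set_distance(p[0], p[1], dist, method))
        merges.append((a, b, set_distance(a, b, dist, method)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Toy lookup table: four points on a line at 0, 1, 3 and 7
points = [0, 1, 3, 7]
dist = [[abs(p - q) for q in points] for p in points]
for method in ("single", "complete", "upgma"):
    print(method, [round(h, 2) for _, _, h in agglomerate(dist, method)])
```

The three methods give the same merge order on this toy data but different merge heights, which is what changes the shape of the dendrogram.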
Hierarchical Cluster
• Set-based
– Single linkage
– Complete linkage
– UPGMA
• Vector-based
– UPGMC
– WPGMC
– OR
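A sketch of the vector-based variants, in which each merge replaces two vectors with a new one and distances are recalculated. Two things here are assumptions rather than statements from the slides: cosine distance as the metric, and the reading that z = (x + y)/2 corresponds to WPGMC and z = x + y to UPGMC. All names and the toy vectors are hypothetical.

```python
import math
from itertools import combinations

def cosine_distance(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / norm if norm else 1.0

# The three ways of creating the new vector z from x and y
MERGE_RULES = {
    "or":   lambda x, y: [max(a, b) for a, b in zip(x, y)],   # z = x OR y (binary)
    "mean": lambda x, y: [(a + b) / 2 for a, b in zip(x, y)], # z = (x + y)/2
    "sum":  lambda x, y: [a + b for a, b in zip(x, y)],       # z = x + y
}

def cluster_vectors(vectors, rule):
    """Repeatedly merge the closest pair; return the merge tree as nested tuples."""
    items = [(label, list(vec)) for label, vec in vectors.items()]
    merge = MERGE_RULES[rule]
    while len(items) > 1:
        # Distances are recomputed at every stage because vectors change.
        i, j = min(combinations(range(len(items)), 2),
                   key=lambda p: cosine_distance(items[p[0]][1], items[p[1]][1]))
        merged = ((items[i][0], items[j][0]), merge(items[i][1], items[j][1]))
        items = [it for k, it in enumerate(items) if k not in (i, j)] + [merged]
    return items[0][0]

# Toy binary vectors (keyword present or absent in four notes)
toy = {"burr": [1, 1, 0, 0], "hole": [1, 1, 0, 0],
       "fusion": [0, 0, 1, 1], "cage": [0, 0, 1, 0]}
print(cluster_vectors(toy, "or"))
```

On this toy input all three rules produce the same tree; they start to differ once clusters of unequal size are merged, since the mean rule weights the two inputs equally while the sum rule weights every original vector equally.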
Results
Similarity matrix
Similarity matrix derived from the 45 binary TF vectors
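A small sketch of how a similarity matrix could be derived from binary term-frequency vectors. The slides do not name the similarity measure, so the Jaccard coefficient used here is an assumption, as is the reading of one binary vector per keyword across notes; the toy data and function names are hypothetical.

```python
def jaccard(x, y):
    """Jaccard similarity of two binary vectors: shared 1s over positions with any 1."""
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    return both / either if either else 0.0

def similarity_matrix(tf):
    """Return keyword names and the square matrix of pairwise similarities."""
    names = list(tf)
    return names, [[jaccard(tf[a], tf[b]) for b in names] for a in names]

# Toy binary TF: 1 if the keyword occurs in the note, else 0 (five notes)
tf = {
    "burr":   [1, 1, 0, 0, 1],
    "hole":   [1, 1, 0, 0, 1],
    "cage":   [0, 0, 1, 1, 0],
    "fusion": [0, 0, 1, 1, 1],
}
names, sim = similarity_matrix(tf)
for name, row in zip(names, sim):
    print(name, [round(v, 2) for v in row])
```

A distance table for clustering can then be taken as 1 minus each similarity.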
Single linkage
Complete linkage
UPGMA
WPGMC
WPGMC
WPGMC
OR