Author(s): Charles P. Friedman, October 29, 2013

License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License: http://creativecommons.org/licenses/by-nc-sa/3.0/

We have reviewed this material in accordance with U.S. Copyright Law and have tried to maximize your ability to use, share, and adapt it. The citation key on the following slide provides information about how you may share and adapt this material. Copyright holders of content included in this material should contact open.michigan@umich.edu with any questions, corrections, or clarification regarding the use of content. For more information about how to cite these materials visit http://open.umich.edu/education/about/terms-of-use.

Any medical information in this material is intended to inform and educate and is not a tool for self-diagnosis or a replacement for medical evaluation, advice, diagnosis or treatment by a healthcare professional. Please speak to your physician if you have questions about your medical condition. Viewer discretion is advised: Some medical content is graphic and may not be suitable for all viewers.

Citation Key
For more information see: http://open.umich.edu/wiki/CitationPolicy

Use + Share + Adapt
{ Content the copyright holder, author, or law permits you to use, share and adapt. }
Public Domain – Government: Works that are produced by the U.S. Government. (17 USC § 105)
Public Domain – Expired: Works that are no longer protected due to an expired copyright term.
Public Domain – Self Dedicated: Works that a copyright holder has dedicated to the public domain.
Creative Commons – Zero Waiver
Creative Commons – Attribution License
Creative Commons – Attribution Share Alike License
Creative Commons – Attribution Noncommercial License
Creative Commons – Attribution Noncommercial Share Alike License
GNU – Free Documentation License

Make Your Own Assessment
{ Content Open.Michigan believes can be used, shared, and adapted because it is ineligible for copyright. }
Public Domain – Ineligible: Works that are ineligible for copyright protection in the U.S. (17 USC § 102(b)) *laws in your jurisdiction may differ
{ Content Open.Michigan has used under a Fair Use determination. }
Fair Use: Use of works that is determined to be Fair consistent with the U.S. Copyright Act. (17 USC § 107) *laws in your jurisdiction may differ
Our determination DOES NOT mean that all uses of this 3rd-party content are Fair Uses and we DO NOT guarantee that your use of the content is Fair. To use this content you should do your own independent analysis to determine whether or not your use will be Fair.

Data, Computation, Images and Waveforms
Prof. Charles P. Friedman
Introduction to Health Informatics
University of Michigan
October 29, 2013

Where Are We?
• Channel 1
• Method Lectures
1. Health information exchange
2. Knowledge representation
3. Information retrieval
4. Imaging and image analysis (today)
5. Policy development and analysis
6. Organization/management
7. Human-computer interaction
And more to follow…

Key Questions
1. What are the primary data types we deal with in health informatics?
2. How are non-alphanumeric data types represented and made “computable”?
3. What kinds of computations are performed on images and how?
4. How are images managed, curated, and communicated?
We’re going to use simplified examples to emphasize the methods.

Data Types
• Alphanumeric
– Examples: free text, coded text, numerical results of tests and observations, others
A patient presents to emergency department complaining of flu-like symptoms.
Her fever is 40 C and pulse is 87.
• Images
– Examples: photographs, radiographs (x-rays), CT scans, ultrasound, others
• Waveforms
– Examples: ECG results, sounds, others

Non-Alphanumeric Data Capture Modalities
• X-rays and fluoroscopy
• Ultrasound
• Computerized tomography (CT scans and related)
• Magnetic resonance
• Electrocardiograms (ECGs)
• Microphones
• Many others…

EHRs and Data Types
• Today’s EHR is an alphanumeric EHR
• Images typically are managed in separate PACS (Picture Archiving and Communication Systems)
– Don’t say: “PACS systems”
• The EHR of the future is a multimedia EHR that seamlessly integrates data types
Seto B, Friedman C. Moving toward multimedia electronic health records: how do we get there? J Am Med Inform Assoc 2012;19:503-505.

Key Questions
1. What are the primary data types we deal with in health informatics?
2. How are non-alphanumeric data types represented and made “computable”?
3. What kinds of computations are performed on images and how?
4. How are images managed, curated, and communicated?

Computability
• In digital computers, ultimately this requires reduction to binary “bits” (1 or 0)
• Any number can be represented (in Base 2) as a string of 1’s and 0’s
(1011) Base 2 = (11) Base 10
• Using ASCII codes (a standard), any text character can be represented as a number
“C” = (67) ASCII = (1000011) Base 2
• “Chuck” = (67|104|117|99|107) ASCII

So How Do We Make Images and Waveforms Computable?

Computable Representation of Digital Images (2D for now)
• Goal: Make a picture into an array of numbers
• Method:
– Represent an image as a matrix of dots (pixels)
– Each pixel has a location in the matrix, corresponding to a location in the image
– Each pixel can be characterized by intensity and color (if color image)
• Computing on images = mathematical calculations on pixels

2D Image as a Matrix of Pixels
The location, intensity, and color of each pixel completely represent the image in computable form.
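As a minimal sketch of the representations described above, the following Python snippet checks the base-2 and ASCII examples and then builds a small grayscale image as a matrix of pixel intensities. The 16 x 16 image size, the variable names, and the (row, column) indexing order are illustrative assumptions, not part of the lecture.

```python
# Numbers: a base-2 string and its base-10 value are the same quantity.
assert int("1011", 2) == 11
assert format(67, "b") == "1000011"

# Text: each character maps to an ASCII code, so a word becomes a list of numbers.
codes = [ord(ch) for ch in "Chuck"]
print(codes)  # [67, 104, 117, 99, 107]

# Image: a matrix of pixel intensities (0 = black here, by assumption).
# WIDTH and HEIGHT are hypothetical; a diagnostic image would be far larger.
WIDTH, HEIGHT = 16, 16
image = [[0 for _ in range(WIDTH)] for _ in range(HEIGHT)]

# Set one pixel; indexing as image[row][column] is an assumed convention.
image[10][10] = 128
print(image[10][10])  # 128
```

Once the image is held as numbers in this way, every operation on it reduces to arithmetic on the entries of the matrix.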
Pixel (10, 10) = 128

Image Quality Indices
• Spatial resolution: Number of pixels per unit area of actual image (pixel density)
– Diagnostic quality digital x-ray is 2048 x 2048 pixels to cover ~ 200 square inches
• Contrast resolution: Number of bits used to represent the intensity of a pixel
– “12 bit” monochrome image ~ 4000 shades of grey (2^12 = 4096)
• Temporal resolution: Time required to generate an image (important for animation)

Representing Waveforms to Make Them Computable
• Sample the height of the waveform at discrete times.
• As the time interval (sampling interval) diminishes, the sampled waveform approaches the exact one.
• Analogous to a one-dimensional “image”.

Key Questions
1. What are the primary data types we deal with in health informatics?
2. How are non-alphanumeric data types represented and made “computable”?
3. What kinds of computations are performed on images and how?
4. How are images managed, curated, and communicated?

Computing on Images and Waveforms
Once images are in “computable” (numerical) form, what kinds of computations are done on them?
• Display manipulation
• Image compression
• Computing size and distance
• Computing difference (example in depth)
• Computing structure and automated inferencing

And How Is This Done?
• Straightforward manipulations of individual pixels
• Creating a mathematical model of the information in the image or waveform
– Capturing the relationships between pixels

Display Manipulation
To help the viewer inspect the image and detect features of interest.
• Select area of interest
• Zoom in on area of interest
• Enhance brightness
• Enhance contrast

Image Compression
To reduce the number of bytes required to store the image.
• Lossless vs. lossy compression.
• Lossless compression reduces size of image file without loss of fidelity
• Lossy compression comes with loss of fidelity but effect may be imperceptible
• Most compression algorithms require a mathematical model of the image (more later)

Computing Size and Distance
To assist in diagnosis and treatment planning.
• User points to or outlines what is to be measured
• Can be computed from pixel density and physical scale of the image.
• How big is the lesion?
• What is the distance between two structures?

Computing Difference
• Clinically, a difference between two images is often hard to detect “by eye”
• By computing on pixels, differences can be detected

How the Difference Algorithm Works
• Pixel (1, 10): 255 in the new image, 255 in the old image; difference image = 0
• Pixel (10, 10): 23 in one image, 160 in the other; difference image = 137

Computing Difference
A patient presents with this X-ray on a follow-up visit

Computing Difference
Here was his X-ray three months earlier

The “Difference” Image

I “Cheated” the Problem of Image Registration

Computing Structure
Advanced computational methods that enable:
• Automated interpretations of ECGs
• 3-dimensional rendering from 2-dimensional slices (Visible Human Project) http://www.youtube.com/watch?v=ojCNUoVfzh4
• Feature detection: automated “grading” of tumors
• These methods require creation of a mathematical model of the image

Representing Images and Waveforms Mathematically
• The secret of advanced image and waveform processing is to create a mathematical model of the information in the image
• Effectively, this results in a set of equations that:
– Given a location in an image or waveform
– Will return a close approximation to the pixel value (intensity, color) or the waveform height at that location

Any Musical Tone Is a Combination of a Fundamental and Its Harmonics

Key Questions
1. What are the primary data types we deal with in health informatics?
2. How are non-alphanumeric data types represented and made “computable”?
3. What kinds of computations are performed on images and how?
4. How are images managed, curated, and communicated?

How Does One “Retrieve” This Image?

Managing Images Requires a Standardized Way of Characterizing Them
• Images bring out the difference between data and metadata
• The data: the image itself
• The metadata: Mrs. Jane Smith, Chest X-ray, Acquired May 11, 2012

The DICOM Header
DICOM = Digital Imaging and Communications in Medicine
1. SOP Instance UID: Unique identifier for the Study
2. Study Date: Date the Study started, if any previous procedure steps within the same study have already been performed.
3. Acquisition Date: The date the acquisition of data that resulted in sources started.
4. Study Time: The time the acquisition of data that resulted in sources started.
5. Modality: Type of equipment that originally acquired the data used to create the images in this Series.
6. Manufacturer: Manufacturer of the equipment that produced the sources.
7. Institution Name: Institution or organization to which the identified individual is responsible or accountable.
And six more…

Curating and Exchanging Images
• Curating images requires their preservation
– Digital images a big advantage over film
– Storage costs no longer an issue
• Images are a big challenge for information exchange
– Even compressed images are large files (16 Mb for a chest x-ray)
– Images are “bandwidth hogs”
– Need for fluidity of images made the compelling case for the Next Generation Internet

Summary: Key Questions
1. What are the primary data types we deal with in health informatics?
2. How are non-alphanumeric data types represented and made “computable”?
3. What kinds of computations are performed on images and how?
4. How are images managed, curated, and communicated?
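The pixel-by-pixel difference computation described earlier in the lecture can be sketched in a few lines of Python. This is an illustrative sketch, not the lecture's own implementation: the function name `difference_image`, the use of nested lists, and the choice of absolute difference are assumptions, and the two images are assumed to be the same size and already registered (aligned).

```python
def difference_image(image_a, image_b):
    """Return the per-pixel absolute difference of two equal-sized grayscale images.

    Both images are lists of rows, and each row is a list of integer intensities.
    """
    same_shape = len(image_a) == len(image_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(image_a, image_b)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")
    return [
        [abs(a - b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


# Tiny 2 x 2 example reusing the pixel values from the difference slide:
# an unchanged pixel (255 vs. 255) and a changed pixel (160 vs. 23).
first = [[255, 0], [0, 160]]
second = [[255, 0], [0, 23]]
diff = difference_image(first, second)
print(diff)  # [[0, 0], [0, 137]]
```

Unchanged regions come out as 0 and changed regions stand out as nonzero values, which is why the difference image makes subtle changes visible. Without registration, the same arithmetic would mostly highlight misalignment between the two images.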
Image Attributions
• “Thymic large cell neuroendocrine carcinoma: report of a resected case - a case report” by PubMed Central is under a Creative Commons license CC BY 2.0.
• “VistA Img” by an employee of the United States Department of Veterans Affairs, taken or made as part of that person's official duties is in the Public Domain.
• “1st thru 5th harmonics of vibrating string” by http://bbasound.wikispaces.com/ is under a Creative Commons license CC BY-SA 3.0.