Introduction Materials and methods Big data analytics and automated decision making in medicine and healthcare Computer Aided Diagnosis Dr. Oliver Faust1 1 School of Science & Engineerring Habib University 2014 Dr. Faust Computer Aided Diagnosis Introduction Materials and methods Outline 1 Introduction Problem: Information overload Solution: Big data analytics 2 Materials and methods A methodology for data analytics Conclusions Dr. Faust Computer Aided Diagnosis Introduction Materials and methods Problem: Information overload Solution: Big data analytics The case for big data Science and observation Analytical science observes nature and analyzes the data from these observations. Medicine and biomedical engineering Far from being an exception, biomedical data acquisition is on the forefront in the quest for more data. It strives towards extracting relevant information that could accurately diagnose a disease or predict its prognosis. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods Problem: Information overload Solution: Big data analytics Big, as in more and better, data More data sources Medical facilities are equipped with new scanners while older once are still used in parallel. Mobile physiological monitoring equipment. For example a mobile phone with a health application. Better data quality Higher resolution. Specialized targeted applications. Combination of different signal acquisition techniques. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods Problem: Information overload Solution: Big data analytics Example: X-ray evolution Data size: A simple Digital Radiography (DR) such as a chest X-ray takes up about 30 MB (Megabyte). In Computed Tomography (CT), a full study takes more than one GB (Gigabyte). PET-CT combines CT with about 5 MB of Positron Emission Tomography (PET) data. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods Problem: Information overload Solution: Big data analytics Result: Big data Medical data is Big In 2008 it has been estimated that all kinds of medical images occupied 1250 Petabytes. Truly very big In comparison, the giant Internet search company Google used an estimated disk space of 20 to 200 Petabytes in August 2009. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods Problem: Information overload Solution: Big data analytics Problem: Information overload Medical practitioners face: Fatigue; Lag of attention; Optical illusions; Stress: Time pressure; Decision pressure; Responsibility. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods Problem: Information overload Solution: Big data analytics Solution: Big data analytics Analytics for practitioner support Highlight important parts of the data to support the practitioner. + Low system complexity; + Large margin of error – the brain is fault tolerant; Human decision making; – Lag of traceability. Computer aided diagnosis + Traceability; + Automation – riding Moors law; – Small margin of error; – High system complexity. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Diseases Need definition Detect symptoms of a disease in data. These diseases can be: Cardiovascular Disease – The killer: Heart attack; Stroke. Diabetes – Paving the way for the killer diseases: Diabetic retinopathy – A leading cause of blindness; Diabetic foot – Walking becomes painful. Epilepsy – Thunderstorm in the brain. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions What is the nature of the problem? Unsupervised There is no previous knowledge on the data. A possible approach is: Find structure in the data; Characterize the structure. Supervised Medical data, such as body temperature, is used for disease diagnosis since Milena. Feature extraction; Automated decision making. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Blockdiagram Data Offline Online Pre-processing Pre-processing Feature extraction Feature assessment Best Features Calculation of selected features Training Online classification Offline classification Classification assessment Information Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Data One dimensional signals: Electrocardiogram [9]; Heart Rate [3, 17, 15, 18, 9, 8, 14]; Electroencephalogram [7, 10, 12, 10, 13, 6]. Two dimensional signals: Ultrasound [2, 4, 1, 5]; X-ray and mammography; Thermography; Fundus images of the retina [11, 16]. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Fundus image with signs of Diabetes retinopathy Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Ultrasound images of Carotid plaque Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Electroencephalogram Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Blockdiagram Data Offline Online Pre-processing Pre-processing Feature extraction Feature assessment Best Features Calculation of selected features Training Online classification Offline classification Classification assessment Information Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Preprocessing Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Blockdiagram Data Offline Online Pre-processing Pre-processing Feature extraction Feature assessment Best Features Calculation of selected features Training Online classification Offline classification Classification assessment Information Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Feature extraction The aim of feature extraction is to extract significant measures from the preprocessed physiological measurements. In general, there are two different types of feature extraction: Linear Features; Nonlinear Features. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Linear Features Algorithms used for linear feature extraction: Statistical – Mean Variance; Fourier transform coefficients; Discrete wavelet coefficients; Entropy measures; Principal Component Analysis; Recurrence plots. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Linear Feature example: Fourier transform Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Nonlinear Features Algorithms used for nonlinear feature extraction: Largest Lupanov Exponent (LLE); Correlation Dimension (CD); Hurst exponent (H); Approximate entropy (ApEn); Fractal dimension (FD); Phase space plot; Recurrence plots (RP). Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Electroencephalogram of different sleep stages Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Electroencephalogram of different sleep stages Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Electroencephalogram of different sleep stages Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Electroencephalogram of different sleep stages Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Electroencephalogram of different sleep stages Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Electroencephalogram of different sleep stages Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Blockdiagram Data Offline Online Pre-processing Pre-processing Feature extraction Feature assessment Best Features Calculation of selected features Training Online classification Offline classification Classification assessment Information Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Feature assessment and feature selection Analysis Of Variance (ANOVA) ANOVA test yields the so called p-value, which provide a statistical measure of the feature quality. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Blockdiagram Data Offline Online Pre-processing Pre-processing Feature extraction Feature assessment Best Features Calculation of selected features Training Online classification Offline classification Classification assessment Information Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Classification The aim of classification is to obtain an objective decision on the underlying data. For computer aided diagnosis systems, this objective decision is concerned with whether or not the underlying data comes from a diseased person, and if a disease was detected a corollary decision is needed on the nature of the disease. Suppoert Vector Machine (SVM); ADAboost (Adaptive Boosting); Decision Tree (DT); K-Nearest Neighbout (KNN); Artificial Neural Network (ANN); Probabilistic Neural Network (PNN). Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Blockdiagram Data Offline Online Pre-processing Pre-processing Feature extraction Feature assessment Best Features Calculation of selected features Training Online classification Offline classification Classification assessment Information Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Classification assessment This is the most important classification criteria, because it determines the quality of the analysis system. Accuracy; Sensitivity; Specificity; Positive Predictive Value; Confusion Matrix; Receiver operating characteristic. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Cross validation electrocardiogram example Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Conclusion, off-line system Find a suitable algorithm structure through off-line processing and modeling. Know what you are looking for – Define the need; Make the data accessible through preprocessing; Extract features, many more than needed; Test the features and select the best features; Test automated decision algorithms, select the best algorithm. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Conclusion, on-line system Implement the on-line system to benefit the target audience. Clean the data through pre-processing; Extract the best features; Employ the most accurate classification algorithm to get the best decision support; Visualize the decision support result; Distribute the decision support result. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions References I [1] Rajendra U Acharya, Oliver Faust, APC Alvin, S Vinitha Sree, Filippo Molinari, Luca Saba, Andrew Nicolaides, and Jasjit S Suri. “Symptomatic vs. asymptomatic plaque classification in carotid ultrasound”. In: Journal of medical systems 36.3 (2012), pp. 1861–1871. [2] U Rajendra Acharya, Oliver Faust, Subbhuraam Vinitha Sree, Filippo Molinari, Luca Saba, Andrew Nicolaides, and Jasjit S Suri. “An accurate and generalized approach to plaque characterization in 346 carotid ultrasound scans”. In: Instrumentation and Measurement, IEEE Transactions on 61.4 (2012), pp. 1045–1053. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions References II [3] U Rajendra Acharya, Oliver Faust, S Vinitha Sree, Dhanjoo N Ghista, Sumeet Dua, Paul Joseph, VI Thajudin Ahamed, Nittiagandhi Janarthanan, and Toshiyo Tamura. “An integrated diabetic index using heart rate variability signal features for diagnosis of diabetes”. In: Computer methods in biomechanics and biomedical engineering 16.2 (2013), pp. 222–234. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions References III [4] U Rajendra Acharya, Oliver Faust, S V Sree, Ang Peng Chuan Alvin, Ganapathy Krishnamurthi, Jose CR Seabra, João Sanches, and JS Suri. “AtheromaticTM : Symptomatic vs. asymptomatic classification of carotid ultrasound plaque using a combination of HOS, DWT & texture”. In: Engineering in Medicine and Biology Society, EMBC, 2011 Annual International Conference of the IEEE. IEEE. 2011, pp. 4489–4492. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions References IV [5] U Rajendra Acharya, Oliver Faust, APC Alvin, Ganapathy Krishnamurthi, José CR Seabra, João Sanches, Jasjit S Suri, et al. “Understanding symptomatology of atherosclerotic plaque by image-based tissue characterization”. In: Computer methods and programs in biomedicine 110.1 (2013), pp. 66–75. [6] Rajendra Acharya U, Oliver Faust, N Kannathal, TjiLeng Chua, and Swamy Laxminarayan. “Non-linear analysis of EEG signals at various sleep stages”. In: Computer methods and programs in biomedicine 80.1 (2005), pp. 37–45. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions References V [7] O Faust, RU Acharya, AR Allen, and CM Lin. “Analysis of EEG signals during epileptic and alcoholic states using AR modeling techniques”. In: IRBM 29.1 (2008), pp. 44–52. [8] Oliver Faust, Leong Mei Yi, and Lee Mei Hua. “Heart Rate Variability Analysis for Different Age and Gender”. In: Journal of Medical Imaging and Health Informatics 3.3 (2013), pp. 395–400. [9] Oliver Faust and Wenwei Yu. “Cardiac Health Visualization and Diagnosis Using Entropies”. In: Journal of Medical Imaging and Health Informatics 3.3 (2013), pp. 409–416. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions References VI [10] Oliver Faust, Wenwei Yu, and Nahrizul Adib Kadri. “Computer-based identification of normal and alcoholic eeg signals using wavelet packets and energy measures”. In: Journal of Mechanics in Medicine and Biology 13.03 (2013). [11] Oliver Faust, Rajendra Acharya, E Y K Ng, Kwan-Hoong Ng, and Jasjit S Suri. “Algorithms for the automated detection of diabetic retinopathy using digital fundus images: a review”. In: Journal of medical systems 36.1 (2012), pp. 145–157. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions References VII [12] Oliver Faust, U Rajendra Acharya, Lim Choo Min, and Bernhard HC Sputh. “Automatic identification of epileptic and background EEG signals using frequency domain parameters”. In: International journal of neural systems 20.02 (2010), pp. 159–176. [13] OLIVER FAUST, ANG PENG CHUAN ALVIN, SUBHA D PUTHANKATTIL, and PAUL K JOSPEH. “DEPRESSION DIAGNOSIS SUPPORT SYSTEM BASED ON EEG SIGNAL ENTROPIES”. In: Journal of Mechanics in Medicine and Biology (2013). Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions References VIII [14] Oliver Faust, U Rajendra Acharya, Filippo Molinari, Subhagata Chattopadhyay, and Toshiyo Tamura. “Linear and non-linear analysis of cardiac health in diabetic subjects”. In: Biomedical Signal Processing and Control 7.3 (2012), pp. 295–302. [15] Tay Swee Hock, Oliver Faust, Teik-Cheng Lim, and Wenwei Yu. “Automated Detection of Premature Ventricular Contraction Using Recurrence Quantification Analysis on Heart Rate Signals”. In: Journal of Medical Imaging and Health Informatics 3.3 (2013), pp. 462–469. [16] M MUTHU RAMA KRISHNAN and Oliver Faust. “Automated glaucoma detection using hybrid feature extraction in retinal fundus images”. In: Journal of Mechanics in Medicine and Biology 13.01 (2013). Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions References IX [17] Faust Oliver, Acharya U Rajendra, SM Krishnan, and Min Lim. “Analysis of cardiac signals using spatial filling index and time-frequency domain”. In: BioMedical Engineering OnLine 3.30 (2004), pp. 1–11. [18] U Rajendra Acharya, Oliver Faust, Nahrizul Adib Kadri, Jasjit S Suri, and Wenwei Yu. “Automated identification of normal and diabetes heart rate signals using nonlinear measures”. In: Computers in biology and medicine 43.10 (2013), pp. 1523–1529. Dr. Faust Computer Aided Diagnosis Introduction Materials and methods A methodology for data analytics Conclusions Questions and Answers Thesis Data volume is nothing without analytics. Questions? Comments? Dr. Faust Computer Aided Diagnosis