The Cooper Union for the Advancement of Science and Art
Albert Nerken School of Engineering

Diagnosing Alzheimer’s Disease Using Machine Learning Techniques on Neuroimaging Data

David Nummey
Advised by Prof. Fred L. Fontaine
May 10, 2011

S*ProCom²: Center for Signal Processing, Communications, and Computer Engineering Research

Overview
• Introduction
• Alzheimer’s Disease and Neuroimaging
• Machine Learning Techniques
• Implementation
• Results
• Conclusions

Introduction
• AD is a neurodegenerative dementia
• 4.5M Americans affected in 2000; 5.1M in 2007
• 13.2M projected by 2050
• Annual health care costs of $148B and rising
• Research progressing toward:
  1. Early, accurate diagnosis
  2. Identification of predictive biomarkers
  3. Treatment and prevention measures

Our Approach
• Example-based brain models of AD and healthy elderly patients
• Functional and structural brain images
• Machine learning approach

Alzheimer’s Disease
• Disease course known; cause is not
  • Typical pattern of degeneration
  • Amyloid protein ‘plaque’ depositions
  • Tau protein ‘tangles’ strangle neurons
  • Genetics reveal risk factors
• Current diagnostic methods
  • Clinical evaluations
  • Specific cognitive deficits must be present
  • Diagnosis confirmed post-mortem

Neuroimaging for AD
• Structural imaging
  • MRI
  • PIB-PET
  • CT
• Functional imaging
  • FDG-PET
  • fMRI
  • DTI

Magnetic Resonance Imaging
• Create structural image from the body’s magnetic properties
  • Water is highly polar
  • Apply static B field
  • Apply perpendicular pulses
  • Measure EM decay

FDG-PET
• Fluorodeoxyglucose positron emission tomography
• Radiotracer (a glucose analog) accumulates with metabolic activity
• Corresponds to brain functionality by region
[Figure: example FDG-PET scans for AD, MCI, and HC subjects]

ADNI
• Alzheimer’s Disease Neuroimaging Initiative (ADNI)
• International collaboration
• Longitudinal studies of 800+ patients
• Third long-term study in progress
• NIH and NIA funding since 2004

ADNI Data
[Figure or table not recovered]

Machine Learning Overview
• Pattern recognition
  • Automatic classification
  • ‘Train’ models based on examples
  • Successful in many fields, including medical diagnosis
• Supervised learning problem
  • Inputs x_i with known labels y_i
  • Images belong to AD, MCI (mild cognitive impairment), or HC (healthy control) classes

Machine Learning Overview
• Feature extraction
  • PCA, the kernel trick
• Machine learning methods
  • LDA/QDA, GNB, KNN, SVM
  • Majority voting classifier
• Cross-validation
  • Prevent overfitting

Feature Extraction
• Principal Component Analysis (PCA)
  • Orthogonal dimensions with highest variances
  • Project data onto this smaller space

Feature Extraction
• The Kernel Trick
  • Allows non-linear solutions from linear problems/classifiers
  • Replace inner products with a kernel evaluated in a higher-dimensional space
  • No need to determine the higher-dimensional space explicitly, which gives computational savings

Discriminant Analysis
• Linear Discriminant Analysis (LDA)
  • f(x) = w^T x + b: separating hyperplane
  • Minimize ||w|| for ‘maximal margin’
• Quadratic Discriminant Analysis (QDA)
  • Statistical estimations

Gaussian Naïve Bayes
• Naïve Bayes
  • Assume independent processes
  • Estimate statistics, model classes
• GNB: Gaussian PDFs
• ML prediction

K-Nearest Neighbors
• KNN: estimate class PDFs
• Trained models depend on labels of nearest neighbors

Support Vector Machines
• SVM: maximum margin problem
  • Convex optimization
  • Hyperplane represented by ‘support vectors’
• Non-linear solutions with the kernel trick
• Multiclass solutions done heuristically

Majority Voting Classifier
• Ensemble classifiers
  • Use results from multiple methods
  • Extract strengths of each
• Simplest: majority vote
  • Label is the most popular output

Cross-Validation
• Avoid overfitting
  • Want to model general characteristics
  • Train and test on random subsets of data
• K-fold cross-validation
  • Partition training data into K groups
  • Iteratively verify features and models
  • Select best PCs per model

Implementation Overview
• ADNI data retrieval interface
• Data sets constructed
• Classification & evaluation procedures

ADNI Data Extraction
• ADNI data organized by visit IDs
• Perl/SQL interface by Aleksey Orekhov
  • SQL queries to create general list
  • Specifies file paths to best images
• Matlab interface
  • User selects image types & visit periods
  • ‘Least common denominator’ 3-class data set constructed

Data Sets Constructed
1. Best-quality FDG-PET scans, BL visit
2. Subset of (1) restricted to patients who also have best-quality 1.5T MRIs from the screening period
• Grey-matter mask applied to ADNINEW & PETGREY

Classification & Evaluation
• Subsets of data over many iterations
• Feature extraction
  • Principal component analysis
  • Cross-validate training data
• Machine learning methods
  • LDA, QDA, GNB, KNN, SVM, Full SVM
  • One-vs.-one multiclass heuristic
  • Majority voting classifier
• Accuracy, sensitivity, specificity, etc.

Results
• Attempted each classification task
  • AD vs. MCI vs. HC
  • AD/MCI vs. HC
  • AD vs. MCI/HC
  • AD vs. HC
  • AD vs. MCI
  • MCI vs. HC
• Results evaluated statistically

Results – Masked PET Scans
[Figures not recovered; two slides]

Results – Unmasked PET Scans
[Figure not recovered]

Results – MRI Scans
[Figure not recovered]

Performance Analysis
• Three-class formulations need more/better features
• Simpler, shorter simulations are fine
• SVM best, GNB worst
  • Others very similar
• Grey-matter mask tradeoff
• MRI data most separable

Summary
• Taub interface to ADNI database
• FDG-PET and MRI data sets
• Machine learning framework
• Successful AD vs. HC diagnoses
  • 98% ± 6% MRI accuracy; 85% ± 5% FDG-PET accuracy
• ‘Early detection’ (MCI vs. HC) requires further work
  • Combining data types
  • Longitudinal studies

Future Work
• Further classifiers & cross-validation
• New feature extraction techniques
• Outlier detection
• Expand breadth of studies
• Longitudinal studies
• Predict cognitive conversions
• Clinician-friendly interface

Acknowledgements
• Prof. Fred Fontaine and S*ProCom²
• Dr. Christian Habeck and The Taub Institute
• Kamran Mahbobi and MaXentric Technologies
• My mentors, family, and friends here today

Thank you. Any questions?

Q&A – AD Biomarkers
[Figure not recovered]

Q&A – Regions of Interest
• Temporal lobe
  • Language
  • Hippocampus (memory)
  • Amygdala (emotion)
• Parietal lobe
  • Sensory integration
• Frontal lobe in later stages
  • Personality

Q&A – Benchmarking
• Unsimplified MRI data take ~20-25 min/iteration (12 patients per data set to train/test)
• Unsimplified FDG-PET data take ~5-10 min/iteration (same patient selection method)
• FDG-PET data with grey-matter mask take ~1 min/iteration, or ~5-10 min/iteration with univariate ROIs (~33 patients each)

Q&A – Clinical Impact
• Research progressing toward:
  1. Early, accurate diagnosis
  2. Identification of predictive biomarkers
  3. Treatment and prevention measures
• Regulatory approval needed for clinical implementation
• Combined prediction & prevention trials may see some success
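
Q&A – Evaluation Sketch
The evaluation steps named in the Cross-Validation, Majority Voting Classifier, and Classification & Evaluation slides (K-fold partitioning, a majority vote over several classifiers, then accuracy, sensitivity, and specificity) can be illustrated with a small sketch. This is not the thesis code: the framework described above used a Matlab interface, Python is swapped in here only for illustration, and every function name, the tie-breaking rule, and the example labels below are assumptions made up for the sketch.

```python
"""Illustrative sketch only: K-fold partitioning, majority voting, and
binary diagnostic metrics. All names and data are hypothetical."""
import random
from collections import Counter


def kfold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k (train, test) pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal groups
    splits = []
    for i, test in enumerate(folds):
        # Training set is every fold except the held-out one.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def majority_vote(labels):
    """Return the most common label; ties go to the label seen first
    (an assumed tie-breaking rule, not taken from the slides)."""
    return Counter(labels).most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive="AD"):
    """Accuracy, sensitivity (true-positive rate), and specificity
    (true-negative rate) for one class treated as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


if __name__ == "__main__":
    # Made-up outputs of three classifiers on four test scans.
    per_classifier = [
        ["AD", "HC", "AD", "HC"],
        ["AD", "AD", "HC", "HC"],
        ["HC", "HC", "AD", "HC"],
    ]
    # Vote per scan: zip(*...) groups the three outputs for each scan.
    voted = [majority_vote(votes) for votes in zip(*per_classifier)]
    truth = ["AD", "HC", "AD", "AD"]
    print(voted)
    print(binary_metrics(truth, voted))
```

In a full pipeline, each (train, test) pair from `kfold_indices` would be used to fit the individual classifiers on the training indices and score them on the held-out indices; the sketch leaves the classifiers themselves out and starts from their predicted labels.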