Uncertainty and Information Integration in Biomedical Applications
Claudia Plant
Research Group for Bioimaging, TU München

Outline
1) Motivation: massive increase of data
2) Integration and Uncertainty
   • Neurosciences: fMRI and EEG data
   • Proteomics: peptide profiling
3) Conclusion

Motivation: Data Explosion in Medicine and Life Sciences
[Figure: mass spectrum, intensity vs. m/z value]
The amount of scientific data doubles each year (Szalay and Gray, Nature 2006).

BMBF Project: Understanding Resting-State Brain Activity
• The metabolism of the brain at rest is not significantly reduced compared with task performance.
• Other regions become active during rest, the so-called resting-state networks.
Goals of this project:
• Understand the function of resting-state networks.
• Compare healthy persons and subjects with functional brain disorders.
Methods: fMRI, EEG
Challenges for data mining: massive data sets, uncertainty, information integration

fMRI Imaging: Principle and Setup

fMRI Imaging: Spatial Aspect
[Figure: sagittal slice and in-plane slice]
• Voxel (volumetric pixel): 3 mm x 3 mm x 6 mm
• Slice thickness: e.g., 6 mm
• In-plane resolution: e.g., 192 mm / 64 = 3 mm
• Number of slices: e.g., 10
• Matrix size: e.g., 64 x 64
• Field of view (FOV): e.g., 19.2 cm

fMRI Imaging: Temporal Aspect
• Spatial resolution: at 3 x 3 x 6 mm, approximately 80,000 voxels cover the brain.
• Temporal resolution: up to some hundreds of time points.

EEG/MEG
Low spatial but high temporal resolution (milliseconds).
Can we combine the benefits of the two modalities?
fMRI: high spatial, low temporal resolution
EEG/MEG: high temporal, low spatial resolution

The Cocktail Party Problem
[Figure: a brain process observed by electrode / voxel]
Space: (x +/- e1, y +/- e1, z +/- e1), time: t +/- e2
Space: (x +/- e3, y +/- e3, z +/- e3), time: t +/- e4
with e1 >>> e2 and e3 << e4

For a Single Type of Microphone: ICA
[Figure: brain process]
ICA has been successfully applied to spatial and temporal de-mixing of fMRI and EEG data.
V. D. Calhoun, T. Adali, M. Stevens, K. A. Kiehl, and J. J. Pekar, Semi-Blind ICA of FMRI: A Method for Utilizing Hypothesis-Derived Time Courses in a Spatial ICA Analysis, NeuroImage, vol. 25, pp. 527-538, 2005.
V. D. Calhoun, J. J. Pekar, and G. D. Pearlson, Alcohol Intoxication Effects on Simulated Driving: Exploring Alcohol-Dose Effects on Brain Activation Using Functional MRI, Neuropsychopharmacology, vol. 29, pp. 2097-2107, 2004.

Temporal ICA with FastICA
Example of temporal ICA: u = u1, …, un; v = v1, …, vn
1) Centering and whitening: de-correlate and standardize
   $u_w = L^{-1/2} V^T (u - m)$
2) Fixed-point iteration:
   $w_i \leftarrow E\{u_w \, g(w_i^T u_w)\} - E\{g'(w_i^T u_w)\} \, w_i$
3) Convergence:
   $M = V L^{-1/2} W$, $S = X M^{-1}$

Results of Spatial ICA on Task-fMRI
Experiment: the subject hits a button as soon as she sees a red light.
Spatial ICA: X = M S, with M the time series and S the independent components (ICs)
• IC1: visual cortex
• IC2: basal regions
The red time series of IC1 precedes the green time series of IC2.

Existing Approaches to Joint ICA
[Figure: EEG and fMRI]
1) Scale to a common resolution and perform the usual ICA.
   Problem: information loss!
V. D. Calhoun, T. Adali, N. R. Giuliani, J. J. Pekar, K. A. Kiehl, and G. D. Pearlson, Method for multimodal analysis of independent source differences in schizophrenia: combining gray matter structural and auditory oddball functional data, HBM, vol. 27, pp. 47-62, 2006.

Existing Approaches to Joint ICA (continued)
2) Perform ICA on each modality separately.
   Problem: how to interpret the result?
3) Parallel ICA: change the objective function of ICA to find similar components in both modalities.
   Problem: the objective function now has two different goals. How should they be weighted? Parametrization is difficult. Perhaps use concepts of information theory for this? -> Later

Our Idea: Probabilistic ICA
Represent each object (x, y, z, t) as a PDF and perform joint ICA.
How to represent the objects? As PDFs.

Probabilistic ICA Combined with Information-Theoretic Clustering
• The classical ICA model assumes a global mixing matrix A. This assumption does not always hold, especially for data from different modalities.
• Do not force integration by parameters; let the data decide.
• Combine ICA with clustering!

OCI: Outlier-robust Clustering using Independent Components (SIGMOD 2008)
…so far only for certain (i.e., not uncertain) data.
• Parameter-free clustering
• Non-Gaussian clusters
• Noise

Relationship between PDFs and Data Compression
Suppose we know the mixing matrix and have two candidate PDFs for coordinate z_i.
[Figure: candidate PDFs; labels: too many bits, good fit, too few bits]
Information theory: we want to transmit the data, and sender and receiver know the correct PDF. The minimum description length is: [formula not recoverable]
We do not know the correct PDF. Try both!

ICA and Data Compression
ICA yields a mixing matrix with directions of minimal entropy -> efficient coding.
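The "try both" idea for two candidate PDFs can be illustrated with a small Python sketch. It assumes the standard Shannon code length, -log2 p(z_i) bits per value, as the coding cost; the slide's own formula is not reproduced here. The data, the two candidate families (Gaussian and Laplace), and all function names are hypothetical choices for illustration. Parameter transmission cost is ignored, since both candidates have two parameters.

```python
import numpy as np

def gaussian_log_pdf(z, mu, sigma):
    """Natural-log density of a Gaussian with mean mu and standard deviation sigma."""
    return -0.5 * np.log(2.0 * np.pi * sigma**2) - (z - mu) ** 2 / (2.0 * sigma**2)

def laplace_log_pdf(z, mu, b):
    """Natural-log density of a Laplace distribution with location mu and scale b."""
    return -np.log(2.0 * b) - np.abs(z - mu) / b

def coding_cost_bits(log_density_values):
    """Code length in bits, sum of -log2 p(z_i), computed from natural-log densities.

    For continuous data this is defined up to a constant that depends on the
    quantization precision; the constant is the same for every candidate PDF,
    so the comparison between candidates is unaffected.
    """
    return float(-np.sum(log_density_values) / np.log(2.0))

# Hypothetical data for one coordinate z_i: heavy-tailed (Laplace) samples.
rng = np.random.default_rng(0)
z = rng.laplace(loc=0.0, scale=1.0, size=5000)

# Fit each candidate PDF by maximum likelihood.
mu_g, sigma_g = z.mean(), z.std()
mu_l = np.median(z)
b_l = np.mean(np.abs(z - mu_l))

cost_gauss = coding_cost_bits(gaussian_log_pdf(z, mu_g, sigma_g))
cost_laplace = coding_cost_bits(laplace_log_pdf(z, mu_l, b_l))

# "Try both": keep the candidate PDF with the smaller coding cost.
candidates = [("gaussian", cost_gauss), ("laplace", cost_laplace)]
best = min(candidates, key=lambda c: c[1])[0]
print(f"Gaussian: {cost_gauss:.0f} bits, Laplace: {cost_laplace:.0f} bits, best: {best}")
```

The better-fitting PDF needs fewer bits, so the coding cost acts as a parameter-free model selection criterion.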
Apply the FastICA algorithm at the cluster level.
[Figure panels: x before, centering, whitening, x after, after 1 iteration, after 4 iterations]
ICA minimizes entropy -> reduces uncertainty -> reduces compression cost

Data Integration and Information Theory
• Concepts from information theory provide a means to measure how different the information from different sources is.
• If information is similar, it can be compressed effectively together.
• Therefore, information-theoretic clustering is a parameter-free approach to data integration.

Conclusion
• Integrative mining of uncertain data is a challenging task of emerging importance in many applications.
• We discussed an example from the neurosciences and some ideas for possible solutions, but there are many, many others (applications and ideas).
• This is a very interesting problem specification for basic research in data mining.
• Have fun!
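Backup: the three FastICA steps named in the talk (centering and whitening, fixed-point iteration, convergence) can be sketched in Python with NumPy. This is a minimal sketch, not the speaker's implementation. It assumes the nonlinearity g = tanh and symmetric decorrelation between iterations, neither of which is specified on the slides. The function names, the test signals, and the mixing matrix A are hypothetical.

```python
import numpy as np

def whiten(u):
    """Center and whiten u (rows = signals): u_w = L^{-1/2} V^T (u - m)."""
    m = u.mean(axis=1, keepdims=True)
    eigvals, V = np.linalg.eigh(np.cov(u - m))  # covariance = V diag(L) V^T
    K = np.diag(eigvals ** -0.5) @ V.T          # whitening matrix L^{-1/2} V^T
    return K @ (u - m), K

def sym_decorrelate(W):
    """Make the rows of W orthonormal: W <- (W W^T)^{-1/2} W."""
    s, E = np.linalg.eigh(W @ W.T)
    return E @ np.diag(s ** -0.5) @ E.T @ W

def fastica(u_w, max_iter=200, tol=1e-10, seed=0):
    """Fixed-point iteration on whitened data; row i of W is the vector w_i."""
    n, n_samples = u_w.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(max_iter):
        y = W @ u_w
        g = np.tanh(y)            # assumed nonlinearity g
        g_prime = 1.0 - g ** 2    # its derivative g'
        # Row-wise update: w_i <- E{u_w g(w_i^T u_w)} - E{g'(w_i^T u_w)} w_i
        W_new = (g @ u_w.T) / n_samples - g_prime.mean(axis=1)[:, None] * W
        W_new = sym_decorrelate(W_new)
        # Converged when every row is parallel (up to sign) to its predecessor.
        change = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if change < tol:
            break
    return W

# Hypothetical example: two independent non-Gaussian sources, linearly mixed.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 8.0, 2000)
s = np.vstack([np.sin(2.0 * np.pi * t), rng.laplace(size=t.size)])
A = np.array([[1.0, 0.5], [0.6, 1.0]])  # assumed mixing matrix
u = A @ s                               # observed mixtures

u_w, K = whiten(u)
W = fastica(u_w)
s_est = W @ u_w  # estimated sources, up to order, sign, and scale
```

The estimated sources match the true ones only up to permutation, sign, and scale, which is inherent to ICA. For real analyses, an established implementation such as scikit-learn's `FastICA` is likely the better choice.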