Problems in Biological Imaging: Opportunities for Signal Processing

Jelena Kovačević
bimagicLab
Center for Bioimage Informatics
Department of Biomedical Engineering
Department of Electrical and Computer Engineering
Carnegie Mellon University

Cast of Characters

The Roadmap
- Issues
- Revolution in biology
- Tools
- Framework
- What can we do?
- Tasks

Revolution in Biology
- Focus in biology
  - Vertical to horizontal approach
  - “Omics”: genomics, proteomics, …
- Fluorescence microscopy
  - Hugely successful
  - Allows for live-cell imaging
  - Fluorescent markers, starting with GFP
  - Allows for collection of high-dimensional data sets
    - 2D images and 3D volumes
    - At multiple time instants
    - Multiple channels
- Analysis and interpretation
  - Cumbersome, nonreproducible, error prone

Goal
- Imaging in systems biology
- Use informatics to acquire, store, manipulate and share large bioimaging databases
- Leads to automated, efficient and robust processing
- Need: host of sophisticated tools from many areas
[Diagram: processing pipeline with blocks PSF h, A/D, Restoration (Denoising + Deconvolution), Registration, Mosaicing, Segmentation, Tracking, Analysis, Modeling]

The Roadmap: Issues
- Noise levels and types
- Lack of ground truth
- Large deviations
- Low definition and contrast
- Wide range of time-frequency features

Noise Levels and Types
- Shift towards noninvasive
  - Data collected farther from the source
  - Signals typically corrupted by high levels of noise
- Weak biosignals
  - Standard signal processing (SP) techniques are not used, and even those would not work well with such signals
- Types of noise
  - Electrical, neuronal, …
  - Modeling the noise is a problem

Lack of Ground Truth
- Shift towards noninvasive
- No access to ground truth

Large Deviations
- Humans and/or animals as “customers”
- Wide range of states considered “normal”
- What we are looking for is a range rather than a single state
- Large deviations from the range of normal states may characterize what we are looking for
[Figure labels: normal, delayed, abnormal]

Low Definition and Contrast
- Images typically have low contrast and are poorly defined
- Lack of consistent edges

Wide Range of Time- and Frequency-Localized Features
- Bioimages
  - Global behaviors together with spikes and transients
  - Puts time-frequency tools to the test
  - “Speckled” nature: stochastic representation

The Roadmap: Framework
- Continuous-domain image processing
- From continuous to discrete domain
- Discrete-domain image processing

Continuous-Domain Image Processing
- Specimen (object) vs image of it (projection)
- LSI systems
  - Impulse response of the microscope: PSF
- Fourier view
  - FT or FS

From Continuous to Discrete
- Resolution in microscopy
- Filtering before sampling
- Sources of uncertainty

Discrete-Domain Image Processing
- LSI system, digital filtering
- Consider the signal as
  - Infinite signal with finite number of nonzero coefficients
  - Finite signal
- Fourier view
  - DTFT
  - DFT

The Roadmap: Tools
- Signal and image representations
- Fourier analysis
- Gabor analysis
- Multiresolution analysis
- Data-driven representation and analysis

Signal Representations
- “Holy Grail” of signal analysis/processing
  - Understand the “blob”-like structure of the energy distribution in the time-frequency space
  - Design a representation reflecting that
[Figure: time-frequency plane with axes t and f; labels: Dirac basis, FT, STFT, WT, WP, ER, Actin]

Data-Driven Representation & Analysis
- Use representations based on training data and automated learning approaches
  - Wavelet packets
  - PCA & variations
  - ICA
  - …

Estimation Framework
- Random variations introduced by system noise, artifacts, and uncertainty originating from the biological phenomena lead to statistical methods
- Seek the solution optimal in some probabilistic sense
- Optimality criterion
  - MSE, often depends on unknown parameters
  - Bayesian framework, MAP estimators

The Roadmap: Tasks
- Acquisition
- Deblurring, denoising, restoration
- Registration and mosaicing
- Segmentation, tracing and tracking
- Classification and clustering
- Modeling

Acquisition
- Issues in acquisition of fluorescence microscope images
  - Increase resolution
    - Total data acquisition is reduced, speeding up image acquisition
    - Allows a higher frame rate (increased temporal resolution)
    - Allows us to spend more time acquiring the regions of interest (which gives increased spatial resolution)
  - Acquire for longer periods
    - Acquisition process damages both the signal (photobleaching) and the cell (phototoxicity)
    - Efficient acquisition reduces the total amount of data acquired, thus reducing damage to the cell
    - This allows us to observe cellular processes for longer periods
- Intelligent acquisition
  - Acquire only where and when needed (adaptivity)
  - Model driven (microscope model & data model)

Model-Driven Acquisition
- Acquisition
  - Grid acquisition
  - MR adaptive acquisition
  - Markov Random Fields
  - Example-based enhancement
  - Efficient acquisition
- Reconstruction
  - Simple interpolation methods
  - Wavelet reconstruction
  - Model-based reconstruction
- Knowledge extraction
- Modeling

MR Acquisition [Merryman & Kovačević, 2005]
- Problem
  - Why acquire in areas of low fluorescence?
  - Acquire only when and where needed
- Measure of success
  - Accuracy
    - Problem dependent
    - Here: strive to maintain the achieved classification accuracy
  - Compression ratio
- Approach
  - Mimic “Battleship”

Efficient Acquisition and Learning of Fluorescence Microscope Data Models
- Develop a mathematical framework and algorithms to build accurate models of fluorescence microscope data sets as well as design intelligent acquisition systems based on those models
  1. Use all the input from the microscope to model the data set
  2. Choose acquisition regions that allow us to construct an accurate model in the shortest amount of time
[Diagram: loop between 1. Model Building and 2. Intelligent Acquisition; decision "Model satisfactory?" with branch No returning to acquisition and branch Yes producing the Model]

Efficient Acquisition and Learning of Fluorescence Microscope Data Models [Jackson, Murphy & Kovačević, 2007]
- Predict the distribution of fluorescence in the subsequent frame and acquire accordingly
  - Predict likelihood of object moving to any given position
  - Acquire those positions with the highest likelihood
  - Too small an acquisition region may not find the object
  - Too large an acquisition region is inefficient
- Motion models
  - Three motion models commonly observed in practice
    - Random walk
    - Constant velocity
    - Constant acceleration

Efficient Acquisition and Learning of Fluorescence Microscope Data Models: Learning the motion model
- Prediction: Based on current beliefs about the motion model, find the likelihood of each object appearing at any given pixel in the subsequent frame
- Acquisition: Acquire the pixels that have the highest overall likelihood of containing an object
- Observation: Observe the actual location of each object, if found
- Update: Use this information to update our beliefs about the motion models for each object

Efficient Acquisition and Learning of Fluorescence Microscope Data Models: Known motion model
- Single object, random walk of known variance
- Probability distribution of it appearing in any given location in the subsequent frame
- Acquisition regions capture the locations where the object is expected with the highest probabilities

Efficient Acquisition and Learning of Fluorescence Microscope Data Models: Known motion model
- If the object is detected, repeat, centering the new acquisition region at the object’s most recent location
- If the object is not detected, estimate where it is
  - Probability distribution given that the object was not in the acquisition region

Efficient Acquisition and Learning of Fluorescence Microscope Data Models: Known motion model
- Predict this object’s location in the next frame
  - Probability distribution
  - 1D case: choose two disconnected acquisition regions
  - 2D case: choose to acquire between the two black circles

Deblurring, Denoising & Restoration
- Microscope images contain artifacts
  - Blurring caused by a PSF
  - Noise from the electronics of digitization
- Deblurring/deconvolution
  - Widefield microscopy
  - Effect of depth
- Denoising
- Deconvolution + Denoising = Restoration

Registration & Mosaicing
- Registration
  - Find spatial relationship and alignment between images
- Mosaicing
  - Used when fine resolution is needed within a global view
  - Stitching together pieces of an image
  - Usually requires registration, given overlapping pieces

Segmentation, Tracing & Tracking
- Segmentation
  - Methods used: thresholding and watershed
  - Edge-based, region-based, combination
  - Active contours
- Tracing
  - Mostly tracing of axons
  - Typical path-following approaches fail in the presence of noise
- Tracking
  - Molecular dynamics and cell migration
  - Tracking of objects over time

Segmentation
- Separate objects of interest from each other and the background
- Fundamental step in microscopy
- Hand segmentation
  - Not reproducible
  - Not tight
  - Piecewise linear
  - Cannot compute statistics
  - Time-consuming
- Current standard
- Watershed segmentation

Active Contour Segmentation
- Active contour algorithms
  - Contour comparable to an elastic string
  - Moved under external and internal forces
    - External: derived from the image (edges)
    - Internal: geometric properties of the contour (curvature)
- Level set method: a way to track the contour as it evolves
  - Positive inside the contour (mountain)
  - Negative outside the contour (valley)
  - Zero on the contour; C embedded at its zero (sea) level
[Figure: contour with normal n; regions labeled > 0, < 0 and = 0; force directions Fc < 0 and Fc > 0]

STACS
- Combines energy minimization approach with statistical modeling
  - Model matching
  - Pixels inside and outside the contour follow different statistical models
- Modified STACS for fluorescence microscopy images
  - No edge information
  - No obvious shape information
  - Segmentation driven by statistics of the image and contour smoothness
- MSTACS: our level-set evolution equation [equation missing from the extracted text]
- Topology needs to be preserved: TPSTACS
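The level-set convention above (positive inside the contour, negative outside, zero on it) and the idea of letting image statistics drive the evolution can be illustrated with a small sketch. This is not the MSTACS or TPSTACS evolution equation, which is not reproduced in this text. It is a simplified stand-in, assuming a two-region model in which each region is described only by its mean intensity, with a crude neighbor-averaging step in place of a proper smoothness term. All function names, the step size `dt`, and the smoothing weight `mu` are hypothetical choices for illustration.

```python
import numpy as np

def region_force(image, phi):
    """Per-pixel force: positive where a pixel looks more like the inside."""
    inside = phi > 0
    # Assumption: each region is summarized by its mean intensity only.
    c_in = image[inside].mean() if inside.any() else 0.0
    c_out = image[~inside].mean() if (~inside).any() else 0.0
    return (image - c_out) ** 2 - (image - c_in) ** 2

def neighbor_average(phi):
    """Average of the four nearest neighbors (edges replicated)."""
    p = np.pad(phi, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

def evolve(image, phi, steps=200, dt=0.5, mu=0.1):
    """Evolve the level-set function; the contour is the zero level of phi."""
    for _ in range(steps):
        # Statistics-driven update, clipped so phi stays bounded.
        phi = np.clip(phi + dt * region_force(image, phi), -1.0, 1.0)
        # Crude smoothness term: pull each value toward its neighbors.
        phi = (1.0 - mu) * phi + mu * neighbor_average(phi)
    return phi

def demo():
    """Segment a noisy bright rectangle starting from a large circle."""
    rng = np.random.default_rng(0)
    truth = np.zeros((64, 64), dtype=bool)
    truth[20:44, 24:40] = True
    image = truth.astype(float) + 0.05 * rng.standard_normal(truth.shape)
    yy, xx = np.mgrid[:64, :64]
    # Initial phi: positive inside a circle of radius 20, negative outside.
    phi0 = 20.0 - np.hypot(yy - 32, xx - 32)
    phi = evolve(image, phi0)
    return image, phi > 0, truth
```

Calling `demo()` returns the synthetic image, the final mask `phi > 0`, and the ground-truth rectangle, so the agreement between mask and truth can be inspected directly. Note that this sketch does nothing to preserve topology, which is the issue TPSTACS addresses.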
TPSTACS: Results [Coulot, Kirschner, Chebira, Moura, Kovačević, Osuna & Murphy, 2006]
- Successful
- Problem: extremely slow
- Solution: MRSTACS
[Figure panels: Hand-segmented; TPSTACS]

MRSTACS
- Decompose image to L levels
  - Smoothing renders cell easier to discern
- Detect cells using morphological operations
- Get coarse version of contour (TPSTACS)
- Refine contour iteratively: faster segmentation
  - Coarse result < 3 sec
  - Fine result < 30 min
[Figure: 2D filter bank, level 1 decomposition; horizontal filtering with h and g, each followed by ↓2, then vertical filtering with h and g, each followed by ↓2]

A Critical Review of Active Contours
- Flexible
- Can be tuned to be accurate
- Adapt to topological changes in the image
- But…
  - Tuning of parameters is involved
  - Updating the level set function: inefficient
  - What is the ‘contour’ in a digital image?
  - Discrete topological rules: external constraints can cause abruptness
  - Multiresolution: how do we reconstruct the level set function?
  - New math needed

Active Mask Framework: No Contours
- Fluorescence microscope images speckled in nature
  - Estimate densities of bright pixels in local neighborhood at different scales
- Recast computation of force as a transformation
  - No need for the time-consuming extension function
- For image f, transform T is: [equation missing from the extracted text]
  - Windowing function and scale factor a
  - Different conditions (cell lines, resolution, etc.): different windowing function and a
  - TPSTACS: rectangular window, a = 1 and suitable operands
[Figure panels: Original image; A slight blur; Enough to discern the cell boundary; Too much blur: edges rounded]

Active Masks: Results
- HeLa cells: total protein image
- HeLa cells: membrane protein image
- Success
  - Initialization: level set function is identically zero
  - Iterations: 3
  - Time taken: 6.5 sec per iteration

Active Masks
- Pros
  - Framework suited to digital images
  - Can be made specific with the choice of suitable forces, windows and scale factors
  - Performance not critically dependent on initialization
  - Easy and fast to compute
  - Translation, dilation and rotation invariance can be preserved
- Cons
  - Topology preservation is hard
  - Multiple active mask framework

Multiple Active Masks
- Initialization
  - Random initialization with M ≫ M0 masks, where M0 = expected number of objects in the image
- Evolution: driven by distributor functions
  - Can incorporate multiresolution/multiscale
- Convergence
  - Experimentally
  - Working on a proof

Results of STACS on Different Modalities
- Yeast DIC
- Cardiac MRI: endocardium and epicardium
- Brain fMRI: axial, coronal, sagittal
[Figure legend: True Positive, False Positive, False Negative]

Classification Problems in Bioimaging
- Determination of protein subcellular location patterns [Chebira, Barbotin, Jackson, Merryman, Srinivasa, Murphy & Kovačević, 2007]
- Detection of developmental stages in Drosophila embryos [Kellogg, Chebira, Goyal, Cuadra, Zappe, Minden & Kovačević, 2007]
- Classification of histological stem-cell teratomas [Ozolek, Castro, Jenkinson, Chebira, Kovačević, Navara, Sukhwani, Orwig, Ben-Yehudah & Schatten, 2007]
- Fingerprint recognition [Hennings, Thornton, Kovačević & Kumar, 2005] [Chebira, Coelho, Sandryhaila, Lin, Jenkinson, MacSleyne, Hoffman, Cuadra, Jackson, Püschel & Kovačević, 2007]
- Develop an automated system capable of fast, robust and accurate classification

Multiresolution Classification System
- Hypothesis: better classification accuracy is obtained if we use the space-frequency information lying in the MR subspaces
[Diagram: generic classification system compared with MR classification system; shorthand: MR = multiresolution block, FE = feature extraction, C = classifier, W = weighting algorithm]

MR Block
- Compute features in the MR-decomposed subspaces (subbands) instead
- Would like to use wavelet packets
  - Do not have an obvious cost measure
  - Do it implicitly instead
    - Grow full tree to L levels
    - Use all nodes
- MR bases: DWT, DFT, DCT, …
- MR frames: SWT, DT-CWT, DD-DWT
- Our design: LTFT (Lapped Tight Frame Transforms)
  - Build MR transforms for these problems
  - Not many nonredundant ones exist
  - Seed them from higher-dimensional bases

Feature Extraction and Classifier
- Feature extraction
  - New Haralick texture features (T3, 26 features)
  - Morphological features (M, 16 features)
  - Zernike features (Z, 49 features)
- Classifier
  - Neural networks
  - No hidden layers

Weighting Procedure
- Local decisions
  - Decision vectors for each subband of each training image, each containing C numbers
- Goal: combine local decisions into a global one
- Algorithms
  - Open form (iterative)
  - Closed form (analytical)
    - Per data set
    - Per class
- Pruning criteria

Determination of PSL Patterns: Results
- MR significantly outperforms NMR
- MRF (MR frames) outperform MRB (MR bases)
- Per-data-set CF (closed form) slightly outperforms OF (open form)
- Trend is flat → T3 set enough

Why Do MR Frames Work?
- Looking into classes of signals where bases/frames perform better
- Simple example
  - Real plane
  - Two classes
- Decision rule
  - Union of nonoverlapping parallelograms: bases; otherwise: frames

Conclusions and Opportunities
- Issues
- Revolution in biology
- Tools
- Framework
- What can we do?
- Tasks

Conclusions & Opportunities
- The “dream”: automated, efficient and reliable processing as well as knowledge extraction from large bioimage databases
- Dig in!
- Gaps to fill
  - Need tools adapted to specific bioimaging applications
  - Need to adapt state-of-the-art techniques and/or come up with new ones for bioimaging tasks
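As a supplement to the Weighting Procedure slide, the closed-form (analytical), per-data-set variant can be sketched as a least-squares problem. This is one plausible reading, not necessarily the authors' exact formulation. It assumes each training image yields one C-dimensional decision vector per subband and that a single weight per subband is chosen so the weighted sum of decision vectors best matches a one-hot class indicator. The function names and the array layout `(images, subbands, classes)` are hypothetical.

```python
import numpy as np

def closed_form_weights(decisions, labels):
    """Least-squares subband weights.

    decisions: array of shape (N, S, C), local decision vectors for
               N training images, S subbands, C classes (assumed layout).
    labels:    array of shape (N,), integer class of each training image.
    """
    n, s, c = decisions.shape
    targets = np.eye(c)[labels]                      # (N, C) one-hot targets
    # One equation per (image, class) pair, with S unknown weights.
    a = decisions.transpose(0, 2, 1).reshape(n * c, s)
    b = targets.reshape(n * c)
    w, *_ = np.linalg.lstsq(a, b, rcond=None)
    return w

def global_decision(decisions, w):
    """Combine local decisions with weights w and pick the best class."""
    combined = np.einsum("nsc,s->nc", decisions, w)  # (N, C)
    return combined.argmax(axis=1)

def demo():
    """Synthetic check: only subband 0 carries class information."""
    rng = np.random.default_rng(1)
    n, s, c = 300, 4, 3
    labels = rng.integers(0, c, size=n)
    decisions = rng.random((n, s, c))                # uninformative subbands
    decisions[:, 0, :] = np.eye(c)[labels] + 0.1 * rng.standard_normal((n, c))
    w = closed_form_weights(decisions, labels)
    accuracy = (global_decision(decisions, w) == labels).mean()
    return w, accuracy
```

In this synthetic setting the informative subband should receive by far the largest weight. A per-class variant would solve a separate problem of the same form for each class, and the open-form alternative would reach the weights iteratively rather than through a single solve.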