MIT OpenCourseWare
http://ocw.mit.edu

Harvard-MIT Division of Health Sciences and Technology
HST.582J / 6.555J / 16.456J: Biomedical Signal and Image Processing, Spring 2007
Course Director: Dr. Julie Greenberg

For information about citing these materials or our Terms of Use, visit http://ocw.mit.edu/terms.

Cite as: Julie Greenberg, Bertrand Delgutte, William (Sandy) Wells, John Fisher, and Gari Clifford. Course materials for HST.582J / 6.555J / 16.456J, Biomedical Signal and Image Processing, Spring 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Problem Set 5
Due April 26, 2007

Problem 1
Decision Boundaries: Two-Dimensional Gaussian Case

The optimal Bayesian decision rule can be written

    \phi(x) = \begin{cases} 1, & \frac{p_1(x)}{p_0(x)} > \frac{P_0}{P_1} \\ 0, & \text{otherwise.} \end{cases}

It is sometimes useful to express the decision in the log domain, or equivalently

    \phi(x) = \begin{cases} 1, & \ln(p_1(x)) - \ln(p_0(x)) > \ln\left(\frac{P_0}{P_1}\right) \\ 0, & \text{otherwise.} \end{cases}

The decision boundary is defined as the locus of points, x, where the ratios are equal, that is,

    \ln(p_1(x)) - \ln(p_0(x)) = \ln\left(\frac{P_0}{P_1}\right).

If x = [x_1, x_2]^T is a two-dimensional Gaussian variable, its PDF is written

    p_i(x) = \frac{1}{2\pi |\Sigma_i|^{1/2}} \exp\left(-\frac{1}{2}(x - m_i)^T \Sigma_i^{-1} (x - m_i)\right),

where m_i, \Sigma_i are the class-conditional means and covariances, respectively. Plugging this into the log form of the decision boundary above yields

    -\frac{1}{2}(x - m_1)^T \Sigma_1^{-1}(x - m_1) + \frac{1}{2}(x - m_0)^T \Sigma_0^{-1}(x - m_0) + \frac{1}{2}\ln\frac{|\Sigma_0|}{|\Sigma_1|} = \ln\frac{P_0}{P_1}.

Suggestion: You may want to do part (d) of this problem first as a way of checking your answers to the first three parts, although it is not necessary to do so.

a) Suppose

    P_1 = P_0 = \frac{1}{2}, \quad x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad m_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad m_0 = \begin{bmatrix} -1 \\ 0 \end{bmatrix},

    \Sigma_1 = \Sigma_0 = \begin{bmatrix} 1 & \frac{9}{10} \\ \frac{9}{10} & 1 \end{bmatrix}, \quad \Sigma_1^{-1} = \Sigma_0^{-1} = \begin{bmatrix} \frac{100}{19} & -\frac{90}{19} \\ -\frac{90}{19} & \frac{100}{19} \end{bmatrix}, \quad |\Sigma_1| = |\Sigma_0| = \frac{19}{100}.

Express the decision boundary in the form x_2 = f(x_1).

b) If we keep all values from part (a), but set

    \frac{P_0}{P_1} = \exp\left(-\frac{1}{2}\right),

how does the decision boundary change in terms of its relationship to m_1 and m_0? Express the decision boundary in the form x_2 = f(x_1) using the new value of the ratio of P_0 to P_1 and the means and covariances from part (a).

c) Suppose now that

    \Sigma_1 = \Sigma_0 = \begin{bmatrix} 1 & r \\ r & 1 \end{bmatrix},

where |r| < 1 (which is simply a constraint to ensure \Sigma_i is a valid covariance matrix), keeping all other relevant terms from part (a). How does this change the decision boundary as compared to the result of part (a)?

d) Now let

    \Sigma_0 = \begin{bmatrix} 1 & -\frac{9}{10} \\ -\frac{9}{10} & 1 \end{bmatrix}, \quad \Sigma_0^{-1} = \begin{bmatrix} \frac{100}{19} & \frac{90}{19} \\ \frac{90}{19} & \frac{100}{19} \end{bmatrix}, \quad |\Sigma_0| = \frac{19}{100},

setting all other parameters, except P_1 and P_0, the same as in part (a). Use the MATLAB contour function to plot the decision boundary as a function of the ratio of prior probabilities of each class for the values P_0/P_1 = [1/4, 1/2, 1, 2, 4]. Here is some of the code you will need, where llr is a function (named arbitrarily, since "function" is a reserved word in MATLAB) that evaluates the left side of the decision boundary equation, \ln(p_1(x)) - \ln(p_0(x)):

[x1,x2] = meshgrid(-4:0.1:4,-4:0.1:4);
d = llr(x1,x2);
[c,h] = contour(x1,x2,d,log([1/4,1/2,1,2,4]));
clabel(c,h);
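As a starting point for part (d), here is a minimal sketch of one way to compute the grid of values that the llr(x1,x2) call above should produce; the explicit loop, the backslash solves, and the axis labels are illustrative choices, not requirements of the assignment.

% Sketch: evaluate d = ln p1(x) - ln p0(x) over a grid using the
% parameters from parts (a) and (d), then contour at the levels ln(P0/P1).
m1 = [1; 1];  m0 = [-1; 0];           % class-conditional means (part (a))
S1 = [1, 0.9; 0.9, 1];                % Sigma_1 (unchanged from part (a))
S0 = [1, -0.9; -0.9, 1];              % Sigma_0 (part (d))

[x1, x2] = meshgrid(-4:0.1:4, -4:0.1:4);
d = zeros(size(x1));
for k = 1:numel(x1)
    x = [x1(k); x2(k)];
    d(k) = -0.5*(x - m1)'*(S1\(x - m1)) ...  % -(1/2)(x-m1)' S1^{-1} (x-m1)
         +  0.5*(x - m0)'*(S0\(x - m0)) ...  % +(1/2)(x-m0)' S0^{-1} (x-m0)
         +  0.5*log(det(S0)/det(S1));        % +(1/2) ln(|S0|/|S1|)
end

[c, h] = contour(x1, x2, d, log([1/4, 1/2, 1, 2, 4]));
clabel(c, h);
xlabel('x_1'); ylabel('x_2');

Since |\Sigma_1| = |\Sigma_0| = 19/100 here, the log-determinant term is actually zero; it is kept so the code matches the boundary equation term for term.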
Problem 2

Suggestion: Read the entire question; the answer can be stated in one sentence with no calculations.

Suppose you have a three-dimensional measurement vector x = [x_1, x_2, x_3] for a binary classification problem where 0 < P_1 < 1 (i.e., P_1 is strictly greater than 0 and strictly less than 1). Recall that the class-conditional marginal distribution of x_1, x_2 is

    p_i(x_1, x_2) = \int p_i(x_1, x_2, x_3)\, dx_3 = \int p_i(x_1, x_2 \mid x_3)\, p_i(x_3)\, dx_3,

and that the unconditioned marginal density of any single measurement is

    p(x_k) = \sum_{i=0}^{1} P_i\, p_i(x_k),

where k = 1, 2, or 3.

Now consider two different decision functions. The first, \phi(x_1, x_2, x_3), is the optimal classifier using the full measurement vector [x_1, x_2, x_3], while the second, \varphi(x_1, x_2), is the optimal classifier using only [x_1, x_2]. In general, the probability of error using \phi(x_1, x_2, x_3) will be lower than when using \varphi(x_1, x_2) (i.e., when we ignore the third measurement). State a condition under which both classifiers will achieve the same probability of error.
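As an aside, the marginalization identity above can be checked numerically. The sketch below assumes, purely for illustration, a Gaussian class-conditional density with arbitrary example parameters; mvnpdf requires the Statistics Toolbox.

% Numerical check of p_i(x1,x2) = \int p_i(x1,x2,x3) dx3 for one class,
% using an arbitrary 3-D Gaussian as the class-conditional density.
mu = [0, 0, 1];                        % example mean (hypothetical)
S  = [1.0, 0.5, 0.2;
      0.5, 1.0, 0.3;
      0.2, 0.3, 1.0];                  % example covariance (hypothetical)
pt = [0.3, -0.4];                      % test point (x1, x2)
x3 = (-8:0.01:8).';                    % grid wide enough to capture the mass

% Integrate the joint density over x3 with the trapezoidal rule.
vals    = mvnpdf([repmat(pt, numel(x3), 1), x3], mu, S);
numeric = trapz(x3, vals);

% A Gaussian's (x1,x2) marginal is Gaussian with the corresponding
% sub-block mean and covariance, giving an exact value to compare.
analytic = mvnpdf(pt, mu(1:2), S(1:2, 1:2));

fprintf('numeric = %.6f, analytic = %.6f\n', numeric, analytic);

The two printed values should agree to several decimal places, confirming that integrating out x_3 recovers the lower-dimensional class-conditional density used by \varphi.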