Lecture 2: Matched filtering
P. Perona
California Institute of Technology
EE/CNS148 - Spring 2005

1 Detecting a signal (object) in white noise

1.1 Signal in known position

Suppose that you are observing a game of checkers and you are wondering whether a white piece is present at a given location on the board, e.g. E3. There are two possibilities: either the object (the white piece) is at the given position or it is not. If the object were there and we had clean measurements, the signal we would observe would be s; if it were not there, we would observe 0 (we may translate our origin so that the 'no object' case has signal equal to zero; this will simplify our calculations later). Unfortunately, our camera is not perfect (and maybe someone is smoking in the room), so the measured signal is the sum of s, the 'clean' object, and n, some noise (or 0 + n if the object is not there). In order to allow explicit calculations, suppose that the noise is white and Gaussian. Our task is to discriminate between the two situations: is the object (the white checkers piece) there or not?

One possible approach to detecting the presence of the object in the signal is to use a linear operator to 'condition' the signal, followed by a threshold to perform a yes/no decision. First, the inner product of the linear operator, denoted k, with the signal is computed, giving a response R = <k, s + n>. The response is then compared to a threshold T: if R > T the object is declared present; if R <= T the object is declared absent. The design problem consists in choosing the combination of k and T that gives the minimum error rate.

The response R_n of the linear operator to pure Gaussian white noise of mean \mu = 0 and standard deviation \sigma is:

    R_n = \langle k, n \rangle = \sum_i k_i n_i    (1)

This is, unfortunately, a stochastic quantity, with mean and variance:

    E R_n = \sum_i k_i E n_i = 0
    E R_n^2 = \sum_{i,j} k_i k_j E n_i n_j = \sum_i k_i^2 E n_i^2 = \sigma^2 \|k\|^2    (2)

Here we assume that the signal is sampled by an array of pixels, each of which is denoted by the index i. You may want to perform the corresponding calculations for the case in which the signal and noise are defined on a continuum, e.g. s(x), n(x) with x \in (0, 1).

The response to a signal containing the object and no noise is the deterministic quantity R_s:

    R_s = \langle k, s \rangle    (3)

By linearity, the response to the signal with noise is:

    R = \langle k, n \rangle + \langle k, s \rangle    (4)

with mean and variance:

    E R = \langle k, s \rangle,    E (R - E R)^2 = \sigma^2 \|k\|^2    (5)

Suppose for a moment that we had already chosen the operator k; then a reasonable way to choose the threshold is T = (R_s + E R_n)/2 = R_s / 2, the midpoint between the two mean responses. This is called the 'equal error rate threshold' since it produces as many 'false detections' as 'false rejections'. You may calculate the false rejection and false detection rates explicitly thanks to the assumption that the noise is Gaussian.

Now we proceed to designing k. The strategy we will follow is to minimize the error rate with respect to the parameters contained in k. The error rate is minimized when the signal-to-noise ratio R_s / \sqrt{E R_n^2} is maximized. So the best k is the solution to the problem:

    k^* = \arg\max_k \frac{\langle k, s \rangle}{\sigma \|k\|}    (6)

Notice that the ratio is independent of the norm of the kernel k, since ||k|| appears linearly in both the numerator and the denominator. We may as well set ||k|| = 1 from now on. The question is then how to maximize R_s with respect to k (the denominator depends only on the norm of k, which is now fixed to 1, and on the constant \sigma, which we cannot change). R_s is quite clearly maximized when

    k = \frac{s}{\|s\|}

This may be verified either using constrained maximization (e.g. using Lagrange multipliers) or simply by noting that, in order to maximize the inner product of s and k, one needs to align k with s.
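Before moving on to unknown positions, here is a minimal Matlab sketch of the detector just derived. The particular signal s, the noise level sigma, and the number of trials are made-up values for illustration; only the choices k = s/||s|| and T = R_s/2 come from the derivation above.

    % Matched-filter detection at a known position: a minimal sketch.
    % The signal s, noise level sigma, and trial count are made up.
    s = [0 1 2 4 2 1 0]';            % hypothetical 'clean' object signal
    sigma = 2;                       % noise standard deviation
    k = s / norm(s);                 % matched filter k = s/||s||, unit norm
    Rs = k' * s;                     % deterministic response to the clean signal
    T = Rs / 2;                      % equal error rate threshold

    nTrials = 10000; errors = 0;
    for t = 1:nTrials
        present = rand > 0.5;                     % object there or not, 50/50
        x = present * s + sigma * randn(size(s)); % measured signal
        R = k' * x;                               % response of the linear operator
        errors = errors + ((R > T) ~= present);   % threshold, count mistakes
    end
    fprintf('empirical error rate: %.3f\n', errors / nTrials);
    % By the Gaussian assumption this should be close to
    % 1 - Phi(Rs/(2*sigma)), about 0.10 for these made-up values.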
1.2 Signal in unknown position

Now suppose that we have to decide whether a white piece is present on the board, but we do not know its position. Can we generalize the previous approach to this case? The simplest approach is to make a list of all the possible locations where the piece might be, and to test each one of them independently. If the piece could move on the board continuously, then the list of all positions is infinite. The inner product of k with s + n computed at all locations is called a correlation integral:

    R(x) = (s \star k)(x) = \langle s(\cdot - x), k \rangle = \int s(y - x)\, k(y)\, dy

In order to detect likely locations of the signal, we now consider the local maxima of R(x) that exceed a given threshold. See the Matlab sketch at the end of these notes.

What do you do when there are two possible signals (e.g. white pieces and black pieces on the board)? One possibility is to use two filters, k_1 and k_2, and pick the local maximum that is largest.

2 Variable signal

What if the signal is variable (e.g. lighting, deformations, translation)? For example, consider the problem of detecting a chess piece (a knight) under perspective projection (sometimes bigger, sometimes smaller) and with arbitrary orientation of the piece. Suppose that one could only use one kernel; then one would pick the kernel that maximizes the average squared product with the observed signals s_i:

    k = \arg\max_{\|k\|=1} \sum_i \langle k, s_i \rangle^2 = \arg\max_{\|k\|=1} k^T S S^T k

where S = [s_1 ... s_N] (here, for simplicity of notation, we think of both k and the s_i as column vectors). Now notice that A = S S^T is a positive semidefinite symmetric matrix. Therefore its eigenvalues are real and nonnegative, and its eigenvectors form an orthonormal set. Call u_i the eigenvectors of A (equivalently, they are the left singular vectors of S) and call \sigma_i the corresponding eigenvalues, sorted in descending order of magnitude. Then A may be expressed in terms of the basis formed by the u_i, and the maximization has a simple solution:

    A = S S^T = \sum_i \sigma_i u_i u_i^T

    k = \arg\max_k k^T A k = u_1,    \max_k \frac{k^T A k}{\|k\|^2} = \sigma_1

Suppose instead that we allowed k to range over an r-dimensional subspace, with r << N. Then the solution would be the subspace spanned by k_i = u_i, i = 1, ..., r. Use principal component analysis (or, equivalently, the singular value decomposition) to select a suitable subspace; a sketch follows below.
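As a companion to Section 2, here is a minimal Matlab sketch of the kernel choice by principal components. The training signals s_i (scaled, shifted bumps standing in for the varying views of the object), their number N, and the dimensions d and r are all made-up values for illustration.

    % Kernel choice for a variable signal via PCA/SVD (Section 2): a sketch.
    N = 50; d = 32;
    S = zeros(d, N);                 % columns are the training signals s_i
    for i = 1:N
        a = 0.5 + rand;              % made-up amplitude variability
        c = 12 + randi(8);           % made-up position variability
        S(:, i) = a * exp(-((1:d)' - c).^2 / 8);
    end

    [U, Sig, V] = svd(S, 'econ');    % left singular vectors = eigenvectors of S*S'
    k = U(:, 1);                     % best single kernel: top eigenvector u_1
    sigma1 = Sig(1,1)^2;             % top eigenvalue of A = S*S'

    A = S * S';                      % check: u_1 attains the maximum of k'*A*k
    fprintf('k''*A*k = %.2f, sigma_1 = %.2f\n', k' * A * k, sigma1);

    r = 3; K = U(:, 1:r);            % best r-dimensional subspace: u_1, ..., u_r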
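Finally, the Matlab sketch promised in Section 1.2: detection at an unknown position by correlating the signal with the matched filter and thresholding the local maxima. The 1-D signal, the noise level, and the true object position are made-up values for illustration.

    % Detection at an unknown position via correlation (Section 1.2): a sketch.
    s = [0 1 2 4 2 1 0];                 % hypothetical object signal (template)
    k = s / norm(s);                     % matched filter, unit norm
    sigma = 0.5;
    x = zeros(1, 100); truePos = 40;     % object placed at a made-up location
    x(truePos:truePos+length(s)-1) = s;
    x = x + sigma * randn(size(x));

    R = conv(x, fliplr(k), 'same');      % correlation = convolution with flipped kernel
    T = (k * s') / 2;                    % equal error rate threshold from Section 1.1

    % local maxima of R that exceed the threshold are the detections
    isPeak = [false, R(2:end-1) > R(1:end-2) & R(2:end-1) > R(3:end), false];
    fprintf('detections at x = %s (true center: %d)\n', ...
            mat2str(find(isPeak & R > T)), truePos + floor(length(s)/2));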