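Stein Unbiased Risk Estimator
Michael Elad

The Objective

We have a denoising algorithm of some sort, and we want to set its parameters $\theta$ so as to extract the best out of it.

$$ y = x + v, \quad v \sim \mathcal{N}(0, \sigma^2 I) \quad \longrightarrow \quad \text{Algorithm} \quad \longrightarrow \quad \hat{y} = h(y, \theta) $$

$$ \min_{\theta} E\|\hat{y} - x\|_2^2 = \min_{\theta} E\|h(y,\theta) - x\|_2^2 $$

(Charles M. Stein)

Derivation – 1

Let's open the norm into its ingredients:

$$ E\|h(y,\theta) - x\|_2^2 = E\|h(y,\theta)\|_2^2 - 2E\left[x^T h(y,\theta)\right] + E\|x\|_2^2 $$

The first term is easy to compute, the second looks impossible (we do not know $x$), and the third is not important because it does not depend on $\theta$. Therefore, we will proceed with the second term and show that in fact it can be computed.

Derivation – 2

Using the fact that $x = y - v$ we get

$$ E\left[x^T h(y,\theta)\right] = E\left[y^T h(y,\theta)\right] - E\left[v^T h(y,\theta)\right] $$

Again, the first term is fine for us to compute, while the second seems hard (we do not know the noise vector!).

Derivation – 3

Using the definition of expectation:

$$ E\left[v^T h(y,\theta)\right] = \sum_k E\left[v_k h_k(y,\theta)\right] = \sum_k \int_{-\infty}^{\infty} v_k\, h_k(y,\theta)\, \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{v_k^2}{2\sigma^2}\right) dv_k $$

(The integral is written over $v_k$ only; the remaining noise entries are held fixed, and the expectation over them is taken afterwards.) This may look ugly, BUT ...

Derivation – 4

Dropping the index $k$ and the normalization constant for brevity, we notice that the same integral can be written as

$$ \int v\, h(y,\theta) \exp\left(-\frac{v^2}{2\sigma^2}\right) dv = -\sigma^2 \int h(y,\theta)\, \frac{d}{dv}\left[\exp\left(-\frac{v^2}{2\sigma^2}\right)\right] dv $$

which should remind us of integration by parts:

$$ \int f(x)\, \frac{d g(x)}{dx}\, dx = f(x) g(x) - \int \frac{d f(x)}{dx}\, g(x)\, dx $$

Derivation – 5

Applying this to our expression leads to

$$ -\sigma^2 \int h(y,\theta)\, \frac{d}{dv}\left[\exp\left(-\frac{v^2}{2\sigma^2}\right)\right] dv = -\sigma^2 \left[ h(y,\theta) \exp\left(-\frac{v^2}{2\sigma^2}\right) \right]_{v=-\infty}^{v=\infty} + \sigma^2 \int \exp\left(-\frac{v^2}{2\sigma^2}\right) \frac{d h(y,\theta)}{dv}\, dv $$

Assuming that the function $h$ is finite for all $y$, the first (boundary) term is zero. The derivative w.r.t. $v$ can be replaced by a derivative w.r.t. $y$:

$$ \frac{d h(y,\theta)}{dv} = \frac{d h(y,\theta)}{dy} \cdot \frac{dy}{dv} = \frac{d h(y,\theta)}{dy}, \qquad \text{since} \quad \frac{dy}{dv} = \frac{d(x+v)}{dv} = 1 $$

Derivation – 6

One last step: the expression we got is in fact an expectation. Restoring the normalization constant,

$$ \frac{1}{\sqrt{2\pi\sigma^2}} \int v\, h(y,\theta) \exp\left(-\frac{v^2}{2\sigma^2}\right) dv = \sigma^2 \int \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{v^2}{2\sigma^2}\right) \frac{d h(y,\theta)}{dy}\, dv = \sigma^2 E\left[\frac{d h(y,\theta)}{dy}\right] $$

Summing over the entries $k$ gives $E\left[v^T h(y,\theta)\right] = \sigma^2 E\left[\nabla_y \cdot h(y,\theta)\right]$, where $\nabla_y \cdot h(y,\theta) = \sum_k \partial h_k(y,\theta) / \partial y_k$.

Wrap Up (1)

We got the following expression after all the above steps:

$$ E\|h(y,\theta) - x\|_2^2 = E\|h(y,\theta)\|_2^2 - 2E\left[y^T h(y,\theta)\right] + 2\sigma^2 E\left[\nabla_y \cdot h(y,\theta)\right] + \text{const.} $$

First term: the squared norm of the estimated image.
Second term: an inner product between the noisy and the denoised images.
Third term: a sum over the "sensitivity" of our algorithm to perturbations in the input vector.
Fourth term: a constant, $E\|x\|_2^2$.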
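The key identity behind the derivation, $E[v\,h(x+v)] = \sigma^2 E[h'(x+v)]$ for a scalar Gaussian $v$, can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the "denoiser" `h` is a hypothetical smooth, bounded function (`tanh`), the values of `x` and `sigma` are arbitrary, and both expectations are evaluated by a plain trapezoid rule rather than by any method from the slides.

```python
import math

def gaussian_pdf(v, sigma):
    """Density of N(0, sigma^2) at v."""
    return math.exp(-v * v / (2.0 * sigma ** 2)) / math.sqrt(2.0 * math.pi * sigma ** 2)

def trapezoid(f, a, b, m):
    """Composite trapezoid rule for f on [a, b] with m sub-intervals."""
    step = (b - a) / m
    total = 0.5 * (f(a) + f(b))
    for i in range(1, m):
        total += f(a + i * step)
    return total * step

def h(y):
    """Hypothetical smooth, bounded scalar 'denoiser' (an assumption for this demo)."""
    return math.tanh(y)

def dh(y):
    """Derivative of h with respect to its input."""
    return 1.0 - math.tanh(y) ** 2

x, sigma = 0.7, 0.5                      # arbitrary clean value and noise level
lo, hi = -10.0 * sigma, 10.0 * sigma     # the Gaussian tail beyond 10 sigma is negligible

# Left-hand side: E[v * h(x + v)], the term that "seems hard".
lhs = trapezoid(lambda v: v * h(x + v) * gaussian_pdf(v, sigma), lo, hi, 20000)
# Right-hand side: sigma^2 * E[dh/dy], the term obtained after integration by parts.
rhs = sigma ** 2 * trapezoid(lambda v: dh(x + v) * gaussian_pdf(v, sigma), lo, hi, 20000)

print("E[v h(y)]         =", lhs)
print("sigma^2 E[h'(y)]  =", rhs)
```

The two printed numbers should agree to many digits, which is the scalar version of replacing the unknown-noise term by the divergence term.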
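Our estimator is exact up to an unknown constant, which does not depend on $\theta$ and therefore does not affect the minimization.

Wrap Up (2)

Since we cannot compute the expectation, we will simply drop it, with the hope that the summation over all the image pixels is sufficient to provide the desired accuracy:

$$ E\|h(y,\theta) - x\|_2^2 \approx \|h(y,\theta)\|_2^2 - 2y^T h(y,\theta) + 2\sigma^2\, \nabla_y \cdot h(y,\theta) + \text{const.} $$

If you want to set the parameters $\theta$, do this by minimizing the above expression. This implies that the denoising algorithm should be differentiable w.r.t. the input.

Example – Thresholding

Let's come back to the global image denoising scheme by thresholding:

$$ y = x + v, \quad v \sim \mathcal{N}(0, \sigma^2 I) \quad \longrightarrow \quad \text{Algorithm} \quad \longrightarrow \quad \hat{y} = h(y,\theta) = D W S_T\left(W^{-1} D^T y\right) $$

Example – Smoothing

Let's make sure that our estimator is differentiable by smoothing it (assume $k$ is even):

$$ S_T(z) = z \cdot \frac{(z/T)^k}{1 + (z/T)^k} = \frac{z^{k+1}}{z^k + T^k} $$

[Figure: $S_T(z)$ for $z \in [-2, 2]$, with curves for $k=10$, $k=20$, and hard-thresholding.]

$$ \frac{d S_T(z)}{dz} = \frac{(z/T)^{2k} + (k+1)(z/T)^k}{\left(1 + (z/T)^k\right)^2} $$

Example – SURE

SURE in our case is therefore

$$ E\|h(y,\theta) - x\|_2^2 \approx \|h(y,\theta)\|_2^2 - 2y^T h(y,\theta) + 2\sigma^2\, \nabla_y \cdot h(y,\theta) $$

$$ = \left\|D W S_T\left(W^{-1} D^T y\right)\right\|_2^2 - 2y^T D W S_T\left(W^{-1} D^T y\right) + 2\sigma^2\, \mathrm{tr}\left( D W S'_T\left(W^{-1} D^T y\right) W^{-1} D^T \right) $$

Here $S'_T(u)$ denotes the diagonal matrix holding the element-wise derivatives of $S_T$ evaluated at the entries of $u$, so that the matrix inside the trace is the Jacobian of $h$ w.r.t. $y$.

Example – SURE

We can simplify the last term:

$$ \mathrm{tr}\left( D W S'_T\left(W^{-1} D^T y\right) W^{-1} D^T \right) = \mathrm{tr}\left( W S'_T\left(W^{-1} D^T y\right) W^{-1} D^T D \right) = \mathrm{tr}\left( S'_T\left(W^{-1} D^T y\right) D^T D \right) = \mathrm{tr}\left( S'_T\left(W^{-1} D^T y\right) W^{-2} \right) $$

Some properties used above:
$W$ and $S'_T(\cdot)$ are diagonal matrices (and therefore commute).
$\mathrm{tr}(AB) = \mathrm{tr}(BA)$.
$\mathrm{tr}\left(W_1 W_2 W_1^{-1}\right) = \mathrm{tr}(W_2)$.
$\mathrm{tr}(WR) = \mathrm{tr}\left(W\, \mathrm{diag}(R)\right)$ for a diagonal $W$.

The last equality relies on $\mathrm{diag}(D^T D) = W^{-2}$, which holds when the diagonal of $W$ contains the inverse norms of the columns of $D$.

Example – SURE

Bottom line:

$$ E\|\hat{y} - x\|_2^2 \approx \left\|D W S_T\left(W^{-1} D^T y\right)\right\|_2^2 - 2y^T D W S_T\left(W^{-1} D^T y\right) + 2\sigma^2\, \mathrm{tr}\left( S'_T\left(W^{-1} D^T y\right) W^{-2} \right) $$

Does this work?

Example – SURE

Run Chapter_14_Global_SURE.m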
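As a rough illustration of the bottom-line expression, and not a substitute for Chapter_14_Global_SURE.m (which is not reproduced here), the following sketch selects the threshold $T$ by minimizing SURE in the simplest possible setting. Everything specific in it is an assumption made for the demo: the dictionary and weights are taken as $D = W = I$, so the expression reduces to $\|S_T(y)\|_2^2 - 2y^T S_T(y) + 2\sigma^2 \sum_i S'_T(y_i)$; the clean signal is a synthetic sparse vector; and $k = 10$, $\sigma = 1$ and the threshold grid are arbitrary choices.

```python
import random

def smooth_threshold(z, T, k=10):
    """Smoothed hard threshold S_T(z) = z * u / (1 + u), with u = (z/T)^k and k even."""
    u = (z / T) ** k
    return z * u / (1.0 + u)

def smooth_threshold_deriv(z, T, k=10):
    """dS_T/dz = ((k + 1) * u + u^2) / (1 + u)^2, with u = (z/T)^k."""
    u = (z / T) ** k
    return ((k + 1) * u + u * u) / (1.0 + u) ** 2

def sure(y, T, sigma, k=10):
    """SURE (without the unknown constant) for D = W = I:
    ||h||^2 - 2 y^T h + 2 sigma^2 * sum_i S'_T(y_i)."""
    h = [smooth_threshold(yi, T, k) for yi in y]
    divergence = sum(smooth_threshold_deriv(yi, T, k) for yi in y)
    return (sum(hi * hi for hi in h)
            - 2.0 * sum(yi * hi for yi, hi in zip(y, h))
            + 2.0 * sigma ** 2 * divergence)

def squared_error(y, x, T, k=10):
    """True squared error ||h(y) - x||^2; needs the clean signal, so it is an oracle."""
    return sum((smooth_threshold(yi, T, k) - xi) ** 2 for yi, xi in zip(y, x))

rng = random.Random(0)                   # fixed seed so the demo is repeatable
n, sigma = 10000, 1.0
# Synthetic sparse signal: about 5% of the entries are +/-5, the rest are zero.
x = [rng.choice((-5.0, 5.0)) if rng.random() < 0.05 else 0.0 for _ in range(n)]
y = [xi + rng.gauss(0.0, sigma) for xi in x]

thresholds = [0.5 * j for j in range(1, 13)]     # T = 0.5, 1.0, ..., 6.0
sure_vals = [sure(y, T, sigma) for T in thresholds]
true_vals = [squared_error(y, x, T) for T in thresholds]

T_sure = thresholds[sure_vals.index(min(sure_vals))]
T_oracle = thresholds[true_vals.index(min(true_vals))]
print("threshold chosen by SURE:  ", T_sure)
print("threshold chosen by oracle:", T_oracle)
```

The SURE values differ from the true errors by roughly the constant $\|x\|_2^2$, so only the location of the minimum matters when choosing $T$; the oracle curve is computed here solely to compare against it.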