DSP-CIS Chapter-12: Least Squares & RLS Estimation Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven marc.moonen@esat.kuleuven.be www.esat.kuleuven.be/stadius/ Part-III : Optimal & Adaptive Filters Chapter-11 : Optimal & Adaptive Filters - Intro • General Set-Up • Applications • Optimal (Wiener) Filters – : Least Squares & Recursive Least Squares Estimation Chapter-12 • Least Squares Estimation • Recursive Least Squares (RLS) Estimation • Square-Root Algorithms – : Least Means Squares (LMS) Algorithm Chapter-13 Chapter-14 : Fast Recursive Least Squares Algorithms Chapter-15 : Kalman Filtering DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 2 Least Squares & RLS Estimation 3 1. L east Squar es (L S) Est imat ion Pr ot ot ype opt imal/ adapt ive fi lt er r evisit ed filter structure ? filter input → FIR filters (= pragmatic choice) filter filter parameters cost function ? filter output → quadratic cost function (= pragmatic choice) + error DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 desired signal p. 3 Least Squares & RLS Estimation 4 1. L east Squar es (L S) Est imat ion FIR filters (= tapped-delay line filter/ ‘transversal’ filter) filter input N−1 wl · uk− l yk = u[k] u[k-1] uk-1 uk l= 0 u[k-2] w0[k] u[k-3] uk-2 w1[k] uk-3 w2[k] w3[k] 0 yk = w T · u k filter output where wT = + error w0 w1 w2 ··· wN − 1 desired signal b a+bw w a u Tk = uk uk− 1 uk− 2 ··· uk− N + 1 DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 4 Least Squares & RLS Estimation 5 1. L east Squar es (L S) Est imat ion Quadratic cost function M M SE : (see Lecture 8) J M SE (w ) = E{ e2k } = E{ |dk − yk |2} = E{ |dk − w T u k |2} L east -squar es(L S) cr it er ion : if statistical info is not available, may use an alternative ‘data-based’ criterion... L L e2k = J L S(w ) = k= 1 |dk − yk |2 = L T 2 k= 1 |dk − w u k | k= 1 Interpretation? : see below DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 5 Least Squares & RLS Estimation 6 1. L east Squar es (L S) Est imat ion filter input sequence : u 1, u 2, u 3, . . . , u L corresponding desired response sequence is : d1, d2, d3, . . . , dL e1 e2 .. eL error signal e = d1 d2 .. dL d cost function : J L S(w) = u T1 − u T2 .. u TL U L 2 k= 1 ek = w0 w · .1 . wN w e 22 = d − Uw 22 → linear least squares problem : minw d − Uw 22 DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 6 Least Squares & RLS Estimation 7 1. L east Squar es (L S) Est imat ion L e2k = e 22 = eT · e = d − Uw 22 J L S(w) = k= 1 minimum obtained by setting gradient = 0 : ∂J L S(w) ∂ ]w = w L S = [ (d T d + w T U T Uw − 2w T U T d)]w = w L ∂w ∂w = [2U T U w − 2U T d ]w = w L S 0= [ X uu X du X uu · w L S = X du → w L S = X −uu1X du DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 ‘normal equations’. p. 7 Least Squares & RLS Estimation 9 1. L east Squar es (L S) Est imat ion N ot e : correspondences with Wiener filter theory ? ¯ uu and X¯ du by time-averaging (ergodicity!) ♣ estimate X ¯ uu} = 1 estimate{ X L ¯ du } = 1 estimate{ X L L u k · u Tk = 1 T 1 U · U = X uu L L u k · dk = 1 T 1 U · d = X du L L k= 1 L k= 1 leads to same optimal filter : estimate{ w W F } = ( L1 X uu)− 1 · ( L1 X du) = X −uu1 · X du = w L S DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 8 Least Squares & RLS Estimation 10 1. L east Squar es (L S) Est imat ion N ot e : correspondenceswith Wiener filter theory ? (continued) ♣ Furthermore (for ergodic processes!) : 1 X¯ uu = lim L →∞ L 1 X¯ du = lim L →∞ L L k= 1 u k · u Tk = lim 1 X uu L u k · dk = lim 1 X du L L →∞ L k= 1 L →∞ so that limL →∞ w L S = w W F . DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 9 Least Squares & RLS Estimation 11 2. R ecur sive L east Squar es (R L S) For a fixed data segment 1...L , least squares problem is minw d − Uw 22 d= d1 d2 .. dL u T1 U= u T2 .. u TL w LS = [U T U]− 1 · U T d . Wanted : recursive/ adaptive algorithms L → L + 1, sliding windows, etc... DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 10 Least Squares & RLS Estimation 2. R ecur sive L east Squar es (R L S) 12 • given opt imal solut ion at t ime L ... w L S(L ) = [X uu (L )]− 1X du(L ) = [U(L )T U(L )]− 1[U(L )T d(L )] d1 d(L ) = u T1 d2 ... U(L ) = dL u T2 ... u TL • ...comput e opt imal solut ion at t ime L + 1 w L S(L + 1) = ... d(L + 1) = d1 u T1 d2 ... u T2 ... u TL + 1 dL + 1 U(L + 1) = W ant ed : O(N 2) instead of O(N 3) updating scheme DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 11 Least Squares & RLS Estimation 13 2.1 St andar d R L S It is observed that X uu(L + 1) = X uu (L ) + u L + 1u TL + 1 The matrix inversion lemma states that [X uu(L + 1)]− 1 = [X uu(L )]− 1 − [X uu (L )]− 1u L + 1u TL + 1[X uu (L )]− 1 1+ u TL + 1[X uu (L )]− 1u L+ 1 Result : w L S(L + 1) = w L S(L ) + [X uu(L + 1)]− 1u L + 1 · (dL + 1 − u TL + 1w L S(L )) Kalman gain a priori residual = st andar d r ecur sive least squar es (R L S) algor it hm R emar k : O(N 2) operations per time update R emar k : square-root algorithms with better numerical properties see below DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 12 Least Squares & RLS Estimation 14 2.2 Sliding W indow R L S Sliding window R L S J L S(w ) = L 2 e k= L − M + 1 k M = length of the data window w L S(L ) = [U(L )T U(L )]− 1 [U(L )T d(L )] [X uu (L )]− 1 X du (L ) with d(L ) = dL − M + 1 dL − M + 2 .. dL u TL − M + 1 U(L ) = u TL − M + 2 .. u TL . DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 13 Least Squares & RLS Estimation 15 2.2 Sliding W indow R L S It is observed that X uu(L + 1) = X uu (L ) − u L − M + 1u TL − M + 1 + u L + 1u TL + 1 L|L + 1 X uu • leads to : updating (cfr supra) + similar downdating • downdating is not well behaved numerically, hence to be avoided... • simple alternative : exponential weighting DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 14 Least Squares & RLS Estimation 16 2.3 Exponent ially W eight ed R L S Exp onent ially weight ed R L S J L S(w ) = L 2(L − k) 2 ek k= 1 λ 0 < λ < 1 is weighting factor or forget factor 1 1− λ is a ‘measure of the memory of the algorithm’ w L S(L ) = [U(L )T U(L )]− 1 [U(L )T d(L )] [X uu (L )]− 1 X du (L ) with d(L ) = λ L − 1d1 λ L − 2d2 .. λ 0dL λ L − 1u T1 U(L ) = λ L − 2u T2 .. λ 0u TL DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 15 Least Squares & RLS Estimation 17 2.3 Exponent ially W eight ed R L S It is observed that X uu(L + 1) = λ 2X uu (L ) + u L + 1u TL + 1 hence −1 [X uu(L + 1)] = 1 [X (L )]− 1u 1 T −1 uu L+ 1u L+ 1 λ 2 [X uu (L )] 1 −1 λ2 [X (L )] − λ 2 uu 1+ 12 u TL+ 1[X uu(L )]− 1u L + 1 λ w L S(L + 1) = w L S(L ) + [X uu(L + 1)]− 1u L + 1 · (dL + 1 − u TL + 1w L S(L )) . DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 16 Least Squares & RLS Estimation 18 3. Squar e-r oot A lgor it hms • Standard RLSexhibitsunstableroundoff error accumulation, hence not the algorithm of choice in practice • Alternativealgorithms(‘square-root algorithms’), which have been proved to be stable numerically, are based on orthogonal matrix decompositions, namely QR decomposition (+ QR updating, inverse QR updating, see below) DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 17 Least Squares & RLS Estimation 19 3.1 QR D -based R L S A lgor it hms QR D ecom posit ion for L S est im at ion least squares problem minw d − Uw 22 least squares solution w LS = [U T U]− 1 · U T d . avoid matrix squaring for numerical stability ! DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 18 Least Squares & RLS Estimation 20 3.1 QR D -based R L S A lgor it hms QR D ecom posit ion for L S est im at ion least squares problem minw d − Uw 22 ‘square-root algorithms’ based on QR decomposition (QRD) : é R ù UU= = Q . êQ · úR= Q . R LxN L × N LxLLë× N0 Nû× N Q(:,1:N ) NxN LxN QT · Q = I , Q is orthogonal R is upper triangular DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 19 Least Squares & RLS Estimation 21 Everything you need to know about QR decomposition Example : 1 2 3 4 U 6 10 7 − 11 8 12 9 − 13 = Q Q R 0.182 0.816 0.174 5.477 14.605 − 5.112 0.365 0.408 − 0.619 0 4.082 8.981 · 0.547 0 0.716 0 0 20.668 0.730 − 0.408 − 0.270 R emar k : QRD ≈ Gram-Schmidt R emar k : U T · U = R T · R R is Cholesky factor or square-root of U T · U → ‘square-root’ algorithms ! DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 20 Least Squares & RLS Estimation 22 3.1 QR D -based R L S A lgor it hms QR D for L S est imat ion if é R ù U == Q Q·.R ê ú U 0 LxN û LxL ë then LxN (**) T min d = 2 =Rmin z w Q (d -Uw) QT w· dU-Uw 2 2 2 = min w é z ù é R ù ê ú-ê úw ë * û ë 0 û 2 2 with this w LS = R − 1 · z (**) orthogonal transformation preserves norm DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 21 Least Squares & RLS Estimation 23 3.1 QR D -based R L S A lgor it hms QR -up dat ing for R L S est imat ion Assume we have computed the QRD at time k ˜ T · U(k) d(k) = R(k) z(k) Q(k) The corresponding LS solution is w LS(k) = [R(k)]− 1 · z(k) Our aim is to update the QRD into ˜ + 1)T · U(k + 1) d(k + 1) = Q(k R(k + 1) z(k + 1) and then compute w LS(k + 1) = [R(k + 1)]− 1 · z(k + 1) DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 22 Least Squares & RLS Estimation 24 3.1 QR D -based R L S A lgor it hms QR -up dat ing for R L S est imat ion It is proved that the relevant QRD-updating problem is R(k) z(k) R(k + 1) z(k + 1) ← Q(k + 1)T · T 0 u k+ 1 dk+ 1 PS: This is based on a QR-factorization as follows: é R(k) ù é ù ê ú = Q(k +1) .ê R(k +1) ú êë uTk+1 úû ( N+1)´( N+1) êë úû 0 ( N+1)´( N ) ( N+1)´( N ) DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 23 Least Squares & RLS Estimation 25 3.1 QR D -based R L S A lgor it hms QR -up dat ing for R L S est imat ion R(k) z(k) R(k + 1) z(k + 1) T ← Q(k + 1) · T 0 u k+ 1 dk+ 1 R(k + 1) · w LS(k + 1) = z(k + 1) . = square-root (information matrix) RLS R emar k . with exponential weighting λ R(k) λ z(k) R(k + 1) z(k + 1) T ← Q(k + 1) · 0 u Tk+ 1 dk+ 1 DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 24 Least Squares & RLS Estimation 26 3.1 QR D -based R L S A lgor it hms QR D updat ing R(k) z(k) R(k + 1) z(k + 1) ← Q(k + 1)T · T 0 u k+ 1 dk+ 1 basic tool is Givens r ot at ion def Gi ,j ,θ = i j ↓ ↓ I i− 1 0 0 0 0 0 cosθ 0 sin θ 0 0 0 I j − i− 1 0 0 0 − sin θ 0 cosθ 0 0 0 0 0 I m− j DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 ← i ← j p. 25 Least Squares & RLS Estimation 27 3.1 QR D -based R L S A lgor it hms QR D updat ing Givens rotation applied to a vector x˜ = Gi ,j ,θ · x : x˜ i = cosθ · x i + sin θ · x j x˜ j = − sin θ · x i + cosθ · x j x˜ l = x l for l = i , j xj x˜ j = 0 iff tan θ = x ! i DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 26 Least Squares & RLS Estimation 28 3.1 QR D -based R L S A lgor it hms QR D updat ing R(k) z(k) R(k + 1) z(k + 1) T ← Q(k + 1) · T 0 u k+ 1 dk+ 1 sequence of Givens t r ansfor mat ions x x x x x x x x x x x x x → x x x x x x x x x x x x → x x x x x x x x x x x DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 → x x x x x x x x x p. 27 29 Least Squares Estimation R(k + 1)&z(kRLS + 1) R(k) z(k) → Q(k + 1)T · 0 u(1) u(2) u(3) u(4) u Tk+ 1 dk+ 1 d rotation cell R11 R12 R13 R14 z11 R24 z21 R34 z31 0 R22 R23 a a’ b b’ 0 R33 memory cell (delay) 0 R44 z41 0 DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 28 Least Squares & RLS Estimation 30 3.1 QR D -based R L S A lgor it hms R esidual ext r act ion R(k) z(k) R(k + 1) z(k + 1) T = Q(k + 1) · T 0 u k+ 1 dk+ 1 From this it is proved that the ‘a posteriori residual’ is dk+ 1 − u Tk+ 1w LS(k + 1) = · N i = 1 cos(θi ) and the ‘a priori residual’ is dk+ 1 − u Tk+ 1w LS(k) = N i = 1 cos(θi ) DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 29 31 Least R esidual ext rSquares act ion (v1.0)&: u(1) 1 u(2) u(3) RLS Estimation u(4) d rotation cell cos 1 R11 R12 R13 R14 z11 0 R22 cos 2 R23 R24 z21 R33 R34 z31 cos 4 R44 z41 0 a a’ b b’ memory cell cos 3 (delay) 0 0 cos dk+ 1 − u Tk+ 1w LS(k + 1) = · N i = 1 cos(θi ) output DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 30 32 Least R esidual ext rSquares act ion (v1.1)&: u(1) u(2) u(3) u(4) RLS Estimation d 1 0 R11 R12 R13 R14 z11 0 0 R22 R23 R24 0 R34 a a’ b b’ z21 0 R33 rotation cell z31 memory cell (delay) 0 0 R44 z41 0 cos dk+ 1 − u Tk+ 1w LS(k + 1) = · N cos(θi ) i = 1output1 DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 31 Least Squares & RLS Estimation DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 32 Least Squares & RLS Estimation 34 Skip this slide 3.2 I nver se U pdat ing R L S A lgor it hms R L S wit h inver se updat ing bottom line = track inverse of compound triangular matrix R(k + 1) z(k + 1) 0 −1 −1 = = [R(k + 1)]− 1 [R(k + 1)]− 1 · z(k + 1) 0 −1 [R(k + 1)]− 1 w LS(k + 1) . 0 −1 = implicit backsubstitution DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 33 Least Squares & RLS Estimation 35 Skip this slide 3.2 I nver se U pdat ing R L S A lgor it hms First, it is proved that R(k + 1) [R(k + 1)]− T 0 ← Q(k + 1)T · R(k) [R(k)]− T u Tk+ 1 0 i.e. the Q(k + 1)T that is used to update the triangular factor R(k) with a new row u Tk+ 1, may also be used to update [R(k)]− T , provided an all-zero row is substituted for the u Tk+ 1. DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 34 36 Skip this slide Least Squares & RLS Estimation −T −T u(1) R(k + 1) [R(k + 1)] 0 ← Q(k + 1)T · u(2) 0 R11 u(3) R12 u(4) R13 R14 0 R(k) [R(k)] u Tk+ 1 0 0 0 a a’ b b’ S11 0 R22 R23 R24 S21 S22 0 1/ R33 R34 S31 S32 S33 R44 S41 S42 S43 0 S44 0 DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 35 Least Squares & RLS Estimation 37 Skip this slide 3.2 I nver se U pdat ing R L S A lgor it hms Squar e-r oot infor mat ion/ covar iance R L S R(k + 1) [R(k + 1)]− T 0 −T R(k) [R(k)] ← Q(k + 1)T · T u k+ 1 0 N ot e : numerical stability ! if S(k) is stored version of [R(k)]− T , and S(k)T · R(k) = I + δI . then the error ‘survives’ the update, i.e. S(k + 1)T · R(k + 1) = I + δI . → linear round-off error build-up !! DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 36 Least Squares & RLS Estimation 38 Skip this slide 3.2 I nver se U pdat ing R L S A lgor it hms Squar e-r oot covar iance R L S Aim : derive updating scheme that works without R It is proved that 0 [R(k + 1)]− T δ ← Q(k + 1)T · − [R(k)]− T · u k+ 1 [R(k)]− T 1 0 i.e. the Q(k + 1)T that is used to update the triangular factors R(k) and [R(k)]− T , may be computed from the matrix-vector product − [R(k)]− T · u k+ 1. DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 37 Least Squares & RLS Estimation 39 Skip this slide 0 [R(k + 1)]− T δ − [R(k)]− T · u k+ 1 [R(k)]− T ← Q(k + 1) · 1 0 T u1 1 0 0 v1 u2 rotation cell S11 u3 u4 0 multiply-add cell 0 0 v2 S21 a a’ b b’ b c S22 0 a a-b.c c b 1/ 0 0 v3 S31 S32 S33 0 0 0 1/( cos ) = v4 S41 -k1 . S42 -k2 . S43 -k3 . S44 0 -k4 . DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 38 Least Squares & RLS Estimation 40 Skip this slide 3.2 I nver se U pdat ing R L S A lgor it hms Squar e-r oot covar iance R L S 0 [R(k + 1)]− T δ It is proved that ← Q(k + 1)T · − [R(k)]− T · u k+ 1 [R(k)]− T 1 0 ≈ Kalman gain vector 0 [R(k + 1)]− T δ − δ · k(k + 1)T ← Q(k + 1)T · − [R(k)]− T · u k+ 1 [R(k)]− T 1 0 recall w LS(k + 1) = w LS(k) + k(k + 1) · (dk+ 1 − u Tk+ 1 · w LS(k)) ek+ 1 DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 39 Skip this slide Least Squares & RLS Estimation DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 40 Skip this slide Least Squares & RLS Estimation DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 41 Skip this slide Least Squares & RLS Estimation DSP-CIS / Chapter-12 : Least Squares & RLS Estimation / Version 2014-2015 p. 42