QBUS1040, Semester 2, 2023
Tutorial 10: Matrix inverses, least squares and least squares data fitting

Lecture recap

• Solving linear equations with back substitution:
  - To solve $Rx = y$, where
    \[
      R = \begin{bmatrix}
        R_{11} & R_{12} & \cdots & R_{1n} \\
        0      & R_{22} & \cdots & R_{2n} \\
        \vdots &        & \ddots & \vdots \\
        0      & 0      & \cdots & R_{nn}
      \end{bmatrix}
    \]
    is upper triangular with nonzero diagonal entries, we can perform back substitution.
  - For $i = n, n-1, \ldots, 1$:
    \[
      x_i = \frac{y_i - R_{i,i+1} x_{i+1} - \cdots - R_{in} x_n}{R_{ii}}.
    \]
• Solving $Ax = b$ via QR factorisation:
  - Assuming $A$ is invertible, we solve $Ax = b$, i.e., compute $x = A^{-1} b$.
  - With the QR factorisation $A = QR$, we have $A^{-1} = (QR)^{-1} = R^{-1} Q^T$.
  - Compute $x = R^{-1}(Q^T b)$ by back substitution, i.e., solve $Rx = Q^T b$.
• Least squares: the goal is to minimise $\|Ax - b\|^2$.
  - Assume that $A$ has linearly independent columns.
  - The unique solution is $\hat{x} = (A^T A)^{-1} A^T b = A^\dagger b$.
• Least squares data fitting:
  - We aim to model the relationship $y \approx f(x)$ with:
    * $x^{(1)}, \ldots, x^{(N)} \in \mathbf{R}^n$ being the feature vectors and $y^{(1)}, \ldots, y^{(N)}$ being the outcomes.
    * We aim to find a linear-in-parameters approximation $\hat{f}$ of $f$ with $\hat{f}(x) = \theta_1 f_1(x) + \cdots + \theta_p f_p(x)$, where the $f_i : \mathbf{R}^n \to \mathbf{R}$ are basis functions.
    * The $\theta_i$ are parameters we can choose.
  - Our prediction is $\hat{y}^{(i)} = \hat{f}(x^{(i)})$, and ideally we want $\hat{y}^{(i)}$ to be as close to $y^{(i)}$ as possible.
  - Define an $N \times p$ matrix $A$ such that $A_{ij} = f_j(x^{(i)})$, and $N$-vectors $y^{\mathrm{d}} = (y^{(1)}, \ldots, y^{(N)})$ and $\hat{y}^{\mathrm{d}} = (\hat{y}^{(1)}, \ldots, \hat{y}^{(N)})$; this implies that $\hat{y}^{\mathrm{d}} = A\theta$.
  - We want to solve the optimisation problem $\min_\theta \|\hat{y}^{\mathrm{d}} - y^{\mathrm{d}}\|^2 = \min_\theta \|A\theta - y^{\mathrm{d}}\|^2$, and the solution is $\hat{\theta} = (A^T A)^{-1} A^T y^{\mathrm{d}}$ by least squares (assuming the columns of $A$ are linearly independent).
  - Simple case: straight-line fit.
    * Two basis functions, $f_1(x) = 1$ and $f_2(x) = x$, where $x \in \mathbf{R}$, so
      \[
        A = \begin{bmatrix} 1 & x^{(1)} \\ \vdots & \vdots \\ 1 & x^{(N)} \end{bmatrix}.
      \]

Exercises

Solving systems of linear equations via QR factorisation

Exercise 1: Write a function called solve_via_back_sub which, given an invertible matrix $A$ and a vector $b$, returns a vector $x$ that solves $Ax = b$.
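A minimal sketch of one possible implementation in Python, assuming NumPy is available and using numpy.linalg.qr for the factorisation (the helper back_sub and the test values are illustrative, not prescribed by the exercise):

    import numpy as np

    def back_sub(R, y):
        # Solve Rx = y for an upper-triangular R with nonzero diagonal.
        n = R.shape[0]
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):
            # x_i = (y_i - R_{i,i+1} x_{i+1} - ... - R_{i,n} x_n) / R_{ii}
            x[i] = (y[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
        return x

    def solve_via_back_sub(A, b):
        # Factor A = QR, then solve Rx = Q^T b by back substitution.
        Q, R = np.linalg.qr(A)
        return back_sub(R, Q.T @ b)

    # Quick check on a small invertible system.
    A = np.array([[2.0, 1.0], [1.0, 3.0]])
    b = np.array([3.0, 5.0])
    print(np.allclose(A @ solve_via_back_sub(A, b), b))   # True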
Moore's law

Exercise 2: The figure and table below show the number of transistors $N$ in 13 microprocessors and the year of their introduction.

    Year   Transistors
    1971         2,250
    1972         2,500
    1974         5,000
    1978        29,000
    1982       120,000
    1985       275,000
    1989     1,180,000
    1993     3,100,000
    1997     7,500,000
    1999    24,000,000
    2000    42,000,000
    2002   220,000,000
    2003   410,000,000

[Figure: scatter plot of transistor count (vertical axis, logarithmic scale from $10^3$ to $10^8$) against year of introduction, 1970 to 2005.]

The plot gives the number of transistors on a logarithmic scale. Find the least squares straight-line fit of the data using the model
\[
  \log_{10} N \approx \theta_1 + \theta_2 (t - 1970),
\]
where $t$ is the year and $N$ is the number of transistors. Note that $\theta_1$ is the model's prediction of the log of the number of transistors in 1970, and $10^{\theta_2}$ gives the model's prediction of the fractional increase in the number of transistors per year.

(a) Find the coefficients $\theta_1$ and $\theta_2$ that minimise the RMS error on the data, and give the corresponding RMS error.

(b) Visualise your model.

(c) Use your model to predict the number of transistors in a microprocessor introduced in 2015. Compare the prediction to the IBM Z13 microprocessor, released in 2015, which has around $4 \times 10^9$ transistors.

(A NumPy sketch for this exercise is given at the end of the tutorial.)

Exercise 3: Weighted least squares

In least squares, the objective (to be minimised) is
\[
  \|Ax - b\|^2 = \sum_{i=1}^m (\tilde{a}_i^T x - b_i)^2,
\]
where the $\tilde{a}_i^T$ are the rows of $A$, and the $n$-vector $x$ is to be chosen. In the weighted least squares problem, we minimise the objective
\[
  \sum_{i=1}^m w_i (\tilde{a}_i^T x - b_i)^2,
\]
where the $w_i$ are given positive weights. The weights allow us to assign different importance to the different components of the residual vector. (The objective of the weighted least squares problem is the square of the weighted norm, $\|Ax - b\|_w^2$, as defined in exercise 3.28 of the textbook.)

(a) Show that the weighted least squares objective can be expressed as $\|D(Ax - b)\|^2$ for an appropriate diagonal matrix $D$. This allows us to solve the weighted least squares problem as a standard least squares problem, by minimising $\|Bx - d\|^2$, where $B = DA$ and $d = Db$.

Solution: First, recall the row expansion of the least squares objective:
\[
  \|Ax - b\|^2
  = \left\| \begin{bmatrix} \tilde{a}_1^T x - b_1 \\ \vdots \\ \tilde{a}_m^T x - b_m \end{bmatrix} \right\|^2
  = \sum_{i=1}^m (\tilde{a}_i^T x - b_i)^2.
\]
In the weighted least squares problem, we have
\[
  \sum_{i=1}^m w_i (\tilde{a}_i^T x - b_i)^2
  = \sum_{i=1}^m \left( \sqrt{w_i}\,(\tilde{a}_i^T x - b_i) \right)^2
  = \left\|
      \begin{bmatrix} \sqrt{w_1} & & \\ & \ddots & \\ & & \sqrt{w_m} \end{bmatrix}
      \begin{bmatrix} \tilde{a}_1^T x - b_1 \\ \vdots \\ \tilde{a}_m^T x - b_m \end{bmatrix}
    \right\|^2
  = \|D(Ax - b)\|^2 = \|Bx - d\|^2,
\]
where $B = DA$, $d = Db$ and $D = \mathbf{diag}(\sqrt{w_1}, \ldots, \sqrt{w_m})$. Note: $\sqrt{w}$ on its own does not make sense, as $w$ is a vector; the square roots are taken entry by entry.

(b) Show that when $A$ has linearly independent columns, so does the matrix $B$.

Solution: If $A$ has linearly independent columns, then the only solution to $Ax = 0$ is $x = 0$. Now, $Bx = DAx$, and $D$ is invertible, as it is a diagonal matrix with nonzero diagonal entries. Thus to solve $DAx = 0$ we can just solve $Ax = 0$ (multiply both sides on the left by $D^{-1}$). Since $A$ has linearly independent columns, the only solution is $x = 0$, which implies that $B$ has linearly independent columns.

(c) The least squares approximate solution is given by $\hat{x} = (A^T A)^{-1} A^T b$. Give a similar formula for the solution of the weighted least squares problem. You might want to use the matrix $W = \mathbf{diag}(w)$ in your formula.

Solution: The least squares solution of minimising $\|Ax - b\|^2$ is $\hat{x} = (A^T A)^{-1} A^T b$. Now, using our newly defined objective of minimising $\|Bx - d\|^2$, where $B = DA$, $d = Db$ and $D = \mathbf{diag}(\sqrt{w_1}, \ldots, \sqrt{w_m})$, we find a similar expression:
\[
  \hat{x} = (B^T B)^{-1} B^T d
  = ((DA)^T (DA))^{-1} (DA)^T (Db)
  = (A^T D^T D A)^{-1} A^T D^T D b
  = (A^T D^2 A)^{-1} A^T D^2 b,
\]
since $D^T = D$, as $D$ is a diagonal matrix. As $D^2 = \mathbf{diag}(\sqrt{w_1}^2, \ldots, \sqrt{w_m}^2) = \mathbf{diag}(w_1, \ldots, w_m) = \mathbf{diag}(w) = W$, we can write the solution to the weighted least squares problem as
\[
  \hat{x} = (A^T W A)^{-1} A^T W b.
\]

Exercise 4: Least squares and QR factorisation (after-class exercise)

Suppose $A$ is an $m \times n$ matrix with linearly independent columns and QR factorisation $A = QR$, and $b$ is an $m$-vector. The vector $A\hat{x}$ is the linear combination of the columns of $A$ that is closest to the vector $b$, i.e., it is the projection of $b$ onto the set of linear combinations of the columns of $A$.

(a) Show that $A\hat{x} = QQ^T b$. (The matrix $QQ^T$ is called the projection matrix.)

Solution: By least squares, we have $\hat{x} = (A^T A)^{-1} A^T b$, and with $A = QR$,
\[
  \hat{x} = (R^T Q^T Q R)^{-1} R^T Q^T b
  = (R^T R)^{-1} R^T Q^T b
  = R^{-1} R^{-T} R^T Q^T b
  = R^{-1} Q^T b,
\]
using $Q^T Q = I$ (the columns of $Q$ are orthonormal). Then
\[
  A\hat{x} = A R^{-1} Q^T b = Q R R^{-1} Q^T b = Q I Q^T b = Q Q^T b.
\]

(b) Show that $\|A\hat{x} - b\|^2 = \|b\|^2 - \|Q^T b\|^2$. (This is the square of the distance between $b$ and the closest linear combination of the columns of $A$.)

Solution:
\begin{align*}
  \|A\hat{x} - b\|^2 &= \|QQ^T b - b\|^2 \\
  &= (QQ^T b)^T (QQ^T b) - 2(QQ^T b)^T b + b^T b \\
  &= b^T Q Q^T Q Q^T b - 2 b^T Q Q^T b + \|b\|^2 \\
  &= b^T Q I Q^T b - 2 b^T Q Q^T b + \|b\|^2 \\
  &= \|b\|^2 - b^T Q Q^T b \\
  &= \|b\|^2 - (Q^T b)^T (Q^T b) \\
  &= \|b\|^2 - \|Q^T b\|^2.
\end{align*}
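The exercises above can also be checked numerically. For Exercise 2, a minimal NumPy sketch of the straight-line fit on the log scale (the data are copied from the table; the printed coefficients, RMS error and 2015 prediction are whatever the fit produces, left for you to compare against the IBM Z13 figure):

    import numpy as np

    # Data from the table in Exercise 2.
    year = np.array([1971, 1972, 1974, 1978, 1982, 1985, 1989,
                     1993, 1997, 1999, 2000, 2002, 2003])
    N = np.array([2250, 2500, 5000, 29000, 120000, 275000,
                  1180000, 3100000, 7500000, 24000000,
                  42000000, 220000000, 410000000])

    # Model: log10(N) ~ theta1 + theta2 * (t - 1970), a straight-line fit.
    A = np.column_stack([np.ones(len(year)), year - 1970])
    yd = np.log10(N)

    theta = np.linalg.lstsq(A, yd, rcond=None)[0]
    rms = np.sqrt(np.mean((A @ theta - yd) ** 2))
    print(theta, rms)                                    # part (a)

    # Part (c): predicted transistor count for a 2015 introduction.
    print(10 ** (theta[0] + theta[1] * (2015 - 1970)))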
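For Exercise 3, a sketch of the reduction from parts (a) and (c), checking that minimising $\|Bx - d\|^2$ agrees with the closed-form solution $(A^T W A)^{-1} A^T W b$ (the matrices here are random test data, an assumption for illustration only):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(8, 3))
    b = rng.normal(size=8)
    w = rng.uniform(0.5, 2.0, size=8)      # given positive weights

    # Reduction to ordinary least squares: B = DA, d = Db, D = diag(sqrt(w)).
    D = np.diag(np.sqrt(w))
    x_ls = np.linalg.lstsq(D @ A, D @ b, rcond=None)[0]

    # Closed-form weighted solution with W = diag(w).
    W = np.diag(w)
    x_w = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)

    print(np.allclose(x_ls, x_w))          # True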
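And for Exercise 4, a numerical check of both identities on random data (any tall matrix with linearly independent columns will do):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.normal(size=(10, 4))           # tall; columns independent almost surely
    b = rng.normal(size=10)

    Q, R = np.linalg.qr(A)
    xhat = np.linalg.lstsq(A, b, rcond=None)[0]

    # (a) A xhat = Q Q^T b.
    print(np.allclose(A @ xhat, Q @ (Q.T @ b)))
    # (b) ||A xhat - b||^2 = ||b||^2 - ||Q^T b||^2.
    lhs = np.linalg.norm(A @ xhat - b) ** 2
    rhs = np.linalg.norm(b) ** 2 - np.linalg.norm(Q.T @ b) ** 2
    print(np.allclose(lhs, rhs))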