

Semester 2, 2023
Tutorial 10
QBUS1040
Matrix inverses, least squares and least squares data fitting
Lecture recap
• Solving linear equations with back substitution:
- To solve Rx = y, where R is upper triangular with nonzero diagonal entries,

  R = \begin{bmatrix} R_{11} & R_{12} & \cdots & R_{1n} \\ 0 & R_{22} & \cdots & R_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & R_{nn} \end{bmatrix},

  we can perform back substitution.
- For i = n, n-1, ..., 1:
  x_i = (y_i - R_{i,i+1} x_{i+1} - \cdots - R_{i,n} x_n) / R_{ii}.
- Assuming A is invertible, let's solve Ax = b, i.e., compute x = A^{-1} b.
- With the QR factorisation A = QR, we have A^{-1} = (QR)^{-1} = R^{-1} Q^T.
- Compute x = R^{-1}(Q^T b) by back substitution.
• Least squares:
- The goal is to minimise \|Ax - b\|^2.
- Assume that A has linearly independent columns.
- The unique solution is \hat{x} = (A^T A)^{-1} A^T b = A^\dagger b.
• Least squares data fitting:
- Aim to model the relationship y \approx f(x) with:
  * x^{(1)}, ..., x^{(N)} \in R^n being the feature vectors and y^{(1)}, ..., y^{(N)} being the outcomes.
  * We aim to find a linear-in-parameters approximation of f with \hat{f}(x) = \theta_1 f_1(x) + \cdots + \theta_p f_p(x), where f_i : R^n \to R are basis functions.
  * \theta_i are parameters we can choose.
- Our prediction is given by \hat{y}^{(i)} = \hat{f}(x^{(i)}), and ideally we want \hat{y}^{(i)} to be as close to y^{(i)} as possible.
- Define an N x p matrix A such that A_{ij} = f_j(x^{(i)}), and N-vectors y^d = (y^{(1)}, ..., y^{(N)}) and \hat{y}^d = (\hat{y}^{(1)}, ..., \hat{y}^{(N)}). This implies that \hat{y}^d = A\theta.
- We want to solve the optimisation problem min_\theta \|\hat{y}^d - y^d\|^2 = min_\theta \|A\theta - y^d\|^2, and the solution is \hat{\theta} = (A^T A)^{-1} A^T y^d by least squares (assuming the columns of A are linearly independent).
- Simple case: straight-line fit.
  * Two basis functions, f_1(x) = 1 and f_2(x) = x, where x \in R, so

    A = \begin{bmatrix} 1 & x^{(1)} \\ \vdots & \vdots \\ 1 & x^{(N)} \end{bmatrix}.
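To make the recap concrete, here is a minimal NumPy sketch of least squares data fitting with basis functions. It is only an illustration under our own naming; fit_least_squares, the toy data and the use of np.linalg.lstsq are assumptions, not part of the tutorial.

import numpy as np

def fit_least_squares(basis_funcs, X, y):
    """Fit theta in f_hat(x) = theta_1 f_1(x) + ... + theta_p f_p(x).

    X holds the N feature vectors x^(i), y the N outcomes y^(i).
    A_ij = f_j(x^(i)); theta_hat solves min ||A theta - y||^2.
    """
    A = np.column_stack([np.array([f(x) for x in X]) for f in basis_funcs])
    theta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta_hat, A

# Straight-line fit: f_1(x) = 1 and f_2(x) = x with scalar x.
X = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 1.9, 3.2, 3.9])
theta_hat, A = fit_least_squares([lambda x: 1.0, lambda x: x], X, y)
print(theta_hat)  # [intercept, slope]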
Exercises
Solving systems of linear equations via QR factorisation
Exercise 1:
Write a function called solve_via_back_sub, which returns a vector x such that it solves Ax = b.
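One possible sketch of such a function in NumPy is shown below. The use of np.linalg.qr and the final check against np.linalg.solve are our own choices; treat it as a starting point rather than the required solution.

import numpy as np

def solve_via_back_sub(A, b):
    """Solve Ax = b for a square, invertible A via QR factorisation.

    A = QR gives x = R^{-1}(Q^T b), computed by back substitution on
    Rx = Q^T b, since R is upper triangular.
    """
    Q, R = np.linalg.qr(A)
    y = Q.T @ b
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):  # i = n, n-1, ..., 1 in the recap's notation
        x[i] = (y[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

# Quick check against NumPy's built-in solver.
A = np.array([[2.0, 1.0, 0.0], [0.5, 3.0, 1.0], [1.0, 0.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(np.allclose(solve_via_back_sub(A, b), np.linalg.solve(A, b)))  # True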
Moore's law
Exercise 2:
The figure and table below show the number of transistors N in 13 microprocessors and the year of their introduction.
Year    Transistors
1971    2,250
1972    2,500
1974    5,000
1978    29,000
1982    120,000
1985    275,000
1989    1,180,000
1993    3,100,000
1997    7,500,000
1999    24,000,000
2000    42,000,000
2002    220,000,000
2003    410,000,000
[Figure: scatter plot of the number of transistors versus year of introduction (1970 to 2005), with the vertical axis on a logarithmic scale from 10^3 to 10^8.]
The plot gives the number of transistors on a logarithmic scale. Find the least squares straight-line fit of the data using the model log_{10} N \approx \theta_1 + \theta_2 (t - 1970), where t is the year and N is the number of transistors. Note that \theta_1 is the model's prediction of the log of the number of transistors in 1970, and 10^{\theta_2} gives the model's prediction of the fractional increase in number of transistors per year.
(a) Find the coefficients \theta_1 and \theta_2 that minimise the RMS error on the data, and give the corresponding RMS error.
(b) Visualise your model.
(c) Use your model to predict the number of transistors in a microprocessor introduced in 2015.
Compare the prediction to the IBM Z13 microprocessor, released in 2015, which has around
4 \times 10^9 transistors.
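A minimal NumPy sketch for parts (a) and (c) follows; the variable names and the use of np.linalg.lstsq are our own, and part (b) can be done by plotting the fitted line over the scatter of data points with matplotlib.

import numpy as np

# Data from the table above.
years = np.array([1971, 1972, 1974, 1978, 1982, 1985, 1989, 1993,
                  1997, 1999, 2000, 2002, 2003])
transistors = np.array([2_250, 2_500, 5_000, 29_000, 120_000, 275_000,
                        1_180_000, 3_100_000, 7_500_000, 24_000_000,
                        42_000_000, 220_000_000, 410_000_000])

# Model: log10(N) ~ theta_1 + theta_2 * (t - 1970).
t = years - 1970
A = np.column_stack([np.ones(len(t)), t])
y = np.log10(transistors)

# (a) Coefficients and RMS error.
theta, *_ = np.linalg.lstsq(A, y, rcond=None)
rms_error = np.sqrt(np.mean((A @ theta - y) ** 2))
print("theta:", theta, "RMS error:", rms_error)

# (c) Prediction for 2015, to be compared with the IBM Z13 (about 4e9 transistors).
log_pred_2015 = theta[0] + theta[1] * (2015 - 1970)
print("predicted transistors in 2015:", 10 ** log_pred_2015)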
Exercise 3: Weighted least squares
In least squares, the objective (to be minimised) is

\|Ax - b\|^2 = \sum_{i=1}^{m} (a_i^T x - b_i)^2,

where a_i^T are the rows of A, and the n-vector x is to be chosen. In the weighted least squares problem, we minimise the objective

\sum_{i=1}^{m} w_i (a_i^T x - b_i)^2,

where w_i are given positive weights. The weights allow us to assign different weights to the different components of the residual vector. (The objective of the weighted least squares problem is the square of the weighted norm, \|Ax - b\|_w^2, as defined in exercise 3.28 of the textbook.)

(a) Show that the weighted least squares objective can be expressed as \|D(Ax - b)\|^2 for an appropriate diagonal matrix D. This allows us to solve the weighted least squares problem as a standard least squares problem, by minimising \|Bx - d\|^2, where B = DA and d = Db.
Solution: First, recall the row expansion of the least squares objective

\|Ax - b\|^2 = \left\| \begin{bmatrix} a_1^T x - b_1 \\ \vdots \\ a_m^T x - b_m \end{bmatrix} \right\|^2 = \sum_{i=1}^{m} (a_i^T x - b_i)^2.

In the weighted least squares problem, we have

\sum_{i=1}^{m} w_i (a_i^T x - b_i)^2 = \sum_{i=1}^{m} \left( \sqrt{w_i} (a_i^T x - b_i) \right)^2
= \left\| \begin{bmatrix} \sqrt{w_1} & & 0 \\ & \ddots & \\ 0 & & \sqrt{w_m} \end{bmatrix} \begin{bmatrix} a_1^T x - b_1 \\ \vdots \\ a_m^T x - b_m \end{bmatrix} \right\|^2
= \|D(Ax - b)\|^2
= \|Bx - d\|^2,

where B = DA, d = Db and D = diag(\sqrt{w_1}, ..., \sqrt{w_m}).

Note: \sqrt{w} does not make sense, as w is a vector.
(b) Show that when A has linearly independent columns, so does the matrix B.

Solution: If A has linearly independent columns, then the only solution to Ax = 0 is x = 0. Now, Bx = DAx, and D is invertible as it is a diagonal matrix with non-zero entries, so to solve DAx = 0 we can just solve Ax = 0 (multiply both sides on the left by D^{-1}). Since A has linearly independent columns, the only solution to this is x = 0, which implies that B has linearly independent columns.
(c) The least squares approximate solution is given by \hat{x} = (A^T A)^{-1} A^T b. Give a similar formula for the solution of the weighted least squares problem. You might want to use the matrix W = diag(w) in your formula.
Solution: The least squares solution minimising \|Ax - b\|^2 is given by \hat{x} = (A^T A)^{-1} A^T b. Now, using our newly defined objective of minimising \|Bx - d\|^2, where B = DA, d = Db and D = diag(\sqrt{w_1}, ..., \sqrt{w_m}), we aim to find a similar expression for the least squares solution:

\hat{x} = (B^T B)^{-1} B^T d
        = ((DA)^T (DA))^{-1} (DA)^T (Db)
        = (A^T D^T D A)^{-1} A^T D^T D b
        = (A^T D D A)^{-1} A^T D D b      (D^T = D as D is a diagonal matrix)
        = (A^T D^2 A)^{-1} A^T D^2 b.

As D^2 = diag((\sqrt{w_1})^2, ..., (\sqrt{w_m})^2) = diag(w_1, ..., w_m) = diag(w) = W, we can write the solution to the weighted least squares problem as \hat{x} = (A^T W A)^{-1} A^T W b.
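As a quick numerical sanity check of (a) and (c), here is a small NumPy sketch on randomly generated data (the data and names are our own, purely for illustration): it verifies that solving the scaled problem with B = DA, d = Db gives the same solution as the closed-form weighted formula.

import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
w = rng.uniform(0.5, 2.0, size=m)          # positive weights

# Weighted least squares via the scaled standard problem.
D = np.diag(np.sqrt(w))
B, d = D @ A, D @ b
x_scaled, *_ = np.linalg.lstsq(B, d, rcond=None)

# Closed-form solution x_hat = (A^T W A)^{-1} A^T W b.
W = np.diag(w)
x_weighted = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)

print(np.allclose(x_scaled, x_weighted))   # True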
Exercise 4: Least squares and QR factorisation (After class exercise)
Suppose A is an m x n matrix with linearly independent columns and QR factorisation A = QR, and b is an m-vector. The vector A\hat{x} is the linear combination of the columns of A that is closest to the vector b, i.e., it is the projection of b onto the set of linear combinations of the columns of A.
(a) Show that A\hat{x} = QQ^T b. (The matrix QQ^T is called the projection matrix.)
Solution: By least squares, we have \hat{x} = (A^T A)^{-1} A^T b, and with A = QR,

\hat{x} = (R^T Q^T Q R)^{-1} R^T Q^T b
        = (R^T R)^{-1} R^T Q^T b
        = R^{-1} R^{-T} R^T Q^T b
        = R^{-1} Q^T b.

Then

LHS = A\hat{x}
    = A R^{-1} Q^T b
    = Q R R^{-1} Q^T b
    = Q I Q^T b
    = Q Q^T b
    = RHS.
(b) Show that \|A\hat{x} - b\|^2 = \|b\|^2 - \|Q^T b\|^2. (This is the square of the distance between b and the closest linear combination of the columns of A.)
Solution:

LHS = \|A\hat{x} - b\|^2
    = \|Q Q^T b - b\|^2
    = (Q Q^T b)^T (Q Q^T b) - 2 (Q Q^T b)^T b + b^T b
    = \|b\|^2 + b^T (Q^T)^T Q^T Q Q^T b - 2 b^T (Q^T)^T Q^T b
    = \|b\|^2 + b^T Q Q^T Q Q^T b - 2 b^T Q Q^T b
    = \|b\|^2 + b^T Q I Q^T b - 2 b^T Q Q^T b
    = \|b\|^2 + b^T Q Q^T b - 2 b^T Q Q^T b
    = \|b\|^2 - b^T Q Q^T b
    = \|b\|^2 - (b^T Q) Q^T b
    = \|b\|^2 - (Q^T b)^T Q^T b
    = \|b\|^2 - \|Q^T b\|^2
    = RHS.
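A small numerical check of both identities on random data (a sketch only; the dimensions and seed are arbitrary choices of ours):

import numpy as np

rng = np.random.default_rng(1)
m, n = 10, 4
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

Q, R = np.linalg.qr(A)
x_hat = np.linalg.solve(R, Q.T @ b)          # x_hat = R^{-1} Q^T b

# (a) A x_hat equals the projection Q Q^T b.
print(np.allclose(A @ x_hat, Q @ Q.T @ b))   # True

# (b) ||A x_hat - b||^2 = ||b||^2 - ||Q^T b||^2.
lhs = np.linalg.norm(A @ x_hat - b) ** 2
rhs = np.linalg.norm(b) ** 2 - np.linalg.norm(Q.T @ b) ** 2
print(np.isclose(lhs, rhs))                  # True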