Numerical Linear Algebra
Solutions Manual for Instructors
Grégoire Allaire
Sidi Mahmoud Kaber
Contents
2
Exercises of chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3
3
Exercises of chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4
Exercises of chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5
Exercises of chapter 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
6
Exercises of chapter 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
7
Exercises of chapter 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
8
Exercises of chapter 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
9
Exercises of chapter 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
10
Exercises of chapter 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Preface
This is the solution manual for the exercises of our book, “Numerical linear
algebra”.
G.A., S.M.K.
Paris, July 13, 2007.
2 Exercises of chapter 2
Solution of Exercise 2.3 The determinant of A is rigorously equal to e.
However, when this number is very small, Matlab may find a wrong value
because of rounding errors. For instance with e = 10^{-20} we obtain
>> e=1.e-20;n=5;p=NonsingularMat(n);
>> A=p*diag([ones(n-1,1); e])*inv(p);
>> det(A)
ans =
-1.5086e-14
Recall that the constant eps contains the floating point relative accuracy of
Matlab
>> eps
ans =
2.2204e-16
Solution of Exercise 2.4 The rank is equal to 2. We note that multiplication of A by a nonsingular matrix does not change its rank. Assume that A has size m × n, and let C be a nonsingular matrix of size m × m. Since C is nonsingular, the vector spaces Im(A) and Im(CA) are in one-to-one correspondence and thus have the same dimension. Indeed, to each x ∈ Im(A) we map Cx ∈ Im(CA), and to each x ∈ Im(CA) we map C^{-1}x ∈ Im(A).
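This can be checked numerically with a small sketch (not part of the original solution; any nonsingular C would do, here a strictly diagonally dominant random matrix):
>> m=8;n=5;A=rand(m,n);C=rand(m,m)+m*eye(m,m);   % C is nonsingular
>> [rank(A) rank(C*A)]                           % the two ranks coincide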
Solution of Exercise 2.5 We build a matrix NR of size 2 × 10 containing
different values of n and the rank of the corresponding matrix A.
1. >> NR=[];
>> for n=1:10
A=rand(8,n)*rand(n,6);NR =[NR [n;rank(A)]];
end;
>> NR
NR =
     1     2     3     4     5     6     7     8     9    10
     1     2     3     4     5     6     6     6     6     6
We remark that the rank of A is equal to n for n ≤ 6, and equal to 6 for
n ≥ 6.
2. With matrix BinChanceMat, we obtain
>> NR
NR =
     1     2     3     4     5     6     7     8     9    10
     1     2     3     4     4     5     6     6     6     6
This time around, we observe that the rank of A is always less than or
equal to 6.
3. Let us show that rk (AB) ≤ min( rk (A), rk (B)). Assume the dimensions
of A and B are compatible for the product AB to make sense. Since
Im (AB) ⊂ Im (A), we have rk (AB) ≤ rk (A). Besides, according to the
rank theorem
dim Ker B + rk B = dim Ker (AB) + rk (AB).
Since Ker (B) ⊂ Ker (AB), we have the upper bound rk (AB) ≤ rk (B),
thereby proving the result.
Matrices of size m × n defined by the function rand have a high probability of being of maximal rank, that is, of rank equal to min(m, n). With the random function BinChanceMat, this probability is smaller, which explains the observed results.
Solution of Exercise 2.6 Finding the rank of Σ_{i=1}^n u_i u_i^t.
1. Case when A is defined using the function rand. Using the following instructions
>> n=5;A=zeros(n,n);r=5;
>> for i=1:r
u=rand(n,1);A=A+u*u’;
end;
>> rank(A)
ans =
5
we “almost always” obtain that the rank of A is equal to r.
2. Case when A is defined using the function BinChanceMat. Using the following instructions
>> A=zeros(n,n);
>> for i=1:r
u=BinChanceMat(n,1);A=A+u*u’;
end;
rank(A)
ans =
4
we merely obtain that the rank of A is less than or equal to r.
3. Explanation. For any x ∈ Rn , we have
Ax = Σ_{i=1}^r (u_i u_i^t) x = Σ_{i=1}^r ⟨x, u_i⟩ u_i.
So the vectors ui span the image of A and, consequently, the rank of A is
always less than or equal to r. If these vectors are linearly independent in
Rn (which is very likely with rand and more unlikely with BinChanceMat),
then they form a basis of the image of A. The dimension of this space is
therefore r.
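As a complementary check (a small sketch, not taken from the book), the rank of A coincides with the rank of the matrix whose columns are the vectors u_i:
>> n=5;r=3;U=rand(n,r);A=zeros(n,n);
>> for i=1:r, A=A+U(:,i)*U(:,i)'; end;
>> [rank(A) rank(U)]       % both ranks equal r (almost surely with rand)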
Solution of Exercise 2.8
1. >> A=[1:3;4:6;7:9;10:12]; det(A’*A)
ans =
0
>> B=[-1 2 3;4:6;7:9;10:12]; det(B’*B)
ans =
216
We remark that the matrix At A is singular, while B t B is not.
2. The rank of A is equal to 2 and the rank of B is equal to 3.
3. Let X be a matrix of size m × n, with m ≥ n. The rank theorem implies
that dim Ker X + rk X = n. The following alternative holds true:
• either the rank of X is maximal, that is, equal to n, which is equivalent
to say that X is injective ( Ker X = {0}) and thus the square matrix
X ∗ X is also injective, hence non singular since
X^*Xx = 0 ⟹ 0 = ⟨X^*Xx, x⟩ = ‖Xx‖² ⟹ Xx = 0 ⟹ x = 0,
• or rk (X) < n and thus X is not injective, and neither is the square
matrix X ∗ X.
4. This time, both matrices AAt and BB t are singular. Indeed, for a matrix
X of size m × n, with m < n, even if its rank is maximal, that is, equal
to m, we have dim Ker X = n − m and X is never injective. Thus the
square matrix X ∗ X is always singular.
Solution of Exercise 2.9 Rank of A + uut .
>> A=MatRank(20,20,17); Q=null(A’);
>> size(Q)
ans =
    20     3
>> u=Q(:,2); rank(A+u*u')
ans =
    18
We notice that the rank of the matrix A + uu^t is equal to r + 1. Indeed, for any x ∈ R^n, we have
(A + uu^t)x = Ax + uu^t x = Ax + ⟨x, u⟩u,
and since u ∈ Ker(A^t) = (Im A)^⊥, the vector ⟨x, u⟩u is orthogonal to Im A, and therefore the dimension of Im(A + uu^t) is equal to r + 1.
Solution of Exercise 2.13 The function triu returns the upper triangular part of a matrix. Similarly, the lower triangular part is obtained by the
function tril.
>> n=5;A=rand(n,n);A=triu(A)-diag(diag(A))
A =
         0    0.1942    0.0856    0.7285    0.2183
         0         0    0.9793    0.2508    0.9558
         0         0         0    0.1685    0.4082
         0         0         0         0    0.6033
         0         0         0         0         0
To get the successive powers of A, we use the variable ans as follows.
>> A*A
ans =
         0         0    0.1902    0.0631    0.6601
         0         0         0    0.1650    0.5511
         0         0         0         0    0.1017
         0         0         0         0         0
         0         0         0         0         0
>> ans*A
ans =
         0         0         0    0.0320    0.1157
         0         0         0         0    0.0996
         0         0         0         0         0
         0         0         0         0         0
         0         0         0         0         0
We see that An is equal to the zero matrix. Such a matrix is said to be
nilpotent. Explanation: since all the eigenvalues of A are zero, its characteristic
polynomial is pA (x) = (−1)n xn and by the Cayley–Hamilton theorem, we
have indeed pA (A) = 0.
Solution of Exercise 2.14 We give the results for m = n = 6.
1. >> H=hilb(6);eig(H)
ans =
    0.0000
    0.0000
    0.0006
    0.0163
    0.2424
    1.6189
2. >> [P,D]=eig(H); P
P =
   -0.0012   -0.0111    0.0622    0.2403   -0.6145    0.7487
    0.0356    0.1797   -0.4908   -0.6977    0.2111    0.4407
   -0.2407   -0.6042    0.5355   -0.2314    0.3659    0.3207
    0.6255    0.4436    0.4170    0.1329    0.3947    0.2543
   -0.6898    0.4415   -0.0470    0.3627    0.3882    0.2115
    0.2716   -0.4591   -0.5407    0.5028    0.3707    0.1814
3. >> i=3;u=P(:,i);l=D(i,i);norm(H*u-l*u)
ans =
   1.9525e-16
Solution of Exercise 2.15 The eigenvalues of A are those of D, that is, 2
(with multiplicity two), 3 and 4.
1. >> specA=eig(A)
specA =
4.0000
3.0000
2.0000
2.0000
2. >> n=3;eig(A^n)
ans =
64.0000
27.0000
8.0000 + 0.0000i
8.0000 - 0.0000i
Matlab makes an error by adding a (small) imaginary part to the double
eigenvalue 2
>> imag(ans)
ans =
1.0e-05 *
0
0
0.5384
-0.5384
For n = 10, we get
>> n=10;eig(A^n), imag(ans)
ans =
1.0e+06 *
1.0486
0.0590
0.0010 + 0.0000i
0.0010 - 0.0000i
ans =
0
0
0.0185
-0.0185
This time, the error is not negligible anymore.
3. >> [P1,D1]=eig(A);P1’*P1
ans =
    1.0000    0.6404   -0.9562   -0.9562
    0.6404    1.0000   -0.8319   -0.8319
   -0.9562   -0.8319    1.0000    1.0000
   -0.9562   -0.8319    1.0000    1.0000
The matrix is not diagonalizable in an orthonormal basis of eigenvectors.
Solution of Exercise 2.16 Spectra of A and At .
>> n=10;A=rand(n,n);[eig(A) eig(A’)]
ans =
   4.5713              4.5713
  -0.6828             -0.6828
  -0.0581 + 0.4900i   -0.0581 + 0.4900i
  -0.0581 - 0.4900i   -0.0581 - 0.4900i
   0.6038 + 0.4022i    0.6038 + 0.4022i
   0.6038 - 0.4022i    0.6038 - 0.4022i
   0.2680 + 0.4556i    0.2680 + 0.4556i
   0.2680 - 0.4556i    0.2680 - 0.4556i
   0.1435              0.5497
   0.5497              0.1435
We notice that for various examples, the matrices A and At have the same
spectrum. This is indeed the case since λ ∈ σ(A) is equivalent to say that
(A − λI) is singular, which is in turn equivalent to (A − λI)t being singular,
wherefrom the result is proved.
Solution of Exercise 2.17 We notice that for several calls
u=rand(n,1);v=rand(n,1);A=eye(n,n)+u*v’;eig(A)
(where n varies) the spectrum of the matrix In + uv t consists of a simple
eigenvalue and the eigenvalue 1 with multiplicity n − 1. Let us show that
the simple eigenvalue is indeed equal to 1 + ut v. If either u or v is zero, the
result is obvious. Assume that neither one is zero. The spectrum of In + uv t
consists on the one hand of the eigenvalue 1, corresponding to eigenvectors in the hyperplane V orthogonal to v, since for all w ∈ V we have
(I_n + uv^t)w = w + uv^t w = w + ⟨v, w⟩u = w,
and, on the other hand, of the eigenvalue 1 + u^t v corresponding to the eigenvector u, since
(I_n + uv^t)u = u + uv^t u = (1 + v^t u)u.
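This can be verified numerically (a small sketch; the dimension is arbitrary, and with rand we have u^t v > 0 so the simple eigenvalue is the largest one):
>> n=7;u=rand(n,1);v=rand(n,1);
>> lambda=eig(eye(n,n)+u*v');
>> [max(real(lambda)) 1+v'*u]       % the simple eigenvalue equals 1+u'*v
>> sum(abs(lambda-1)<1.e-10)        % the eigenvalue 1 has multiplicity n-1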
Solution of Exercise 2.19 We remark that AAt and At A have the same
nonzero eigenvalues. Explanation: assume that A is of size m × n and let
A = V Σ̃U ∗ be its SVD factorization. Since AA∗ = V Σ̃ Σ̃ ∗ V ∗ , the eigenvalues
of this matrix are those of the diagonal matrix Σ̃ Σ̃ ∗ , of size m × m. Similarly,
the eigenvalues of A∗ A are those of the diagonal matrix Σ̃ ∗ Σ̃, of size n × n.
It is clear that the eigenvalues of the largest matrix are those of the smallest
matrix plus zeros. More generally, the nonzero eigenvalues of the two matrices
AB and BA are the same (see Lemma 2.7.2 in the book).
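A quick numerical illustration (a small sketch with arbitrary sizes; up to rounding and ordering, the m eigenvalues of AB reappear among the n eigenvalues of BA, which has n − m additional zeros):
>> m=3;n=5;A=rand(m,n);B=rand(n,m);
>> sort(eig(A*B)).'
>> sort(eig(B*A)).'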
Solution of Exercise 2.21 We fix n = 100 in the following examples (the
observations are the same whatever the value of n).
1. >> A=SymmetricMat(n);[P,D]=eig(A);
>> X=zeros(n,n);for k=1:n, X=X+D(k,k)*P(:,k)*P(:,k)’; end;
>> norm(A-X)
ans =
1.5874e-15
We notice that the symmetric matrix A is equal to Σ_{i=1}^n λ_i u_i u_i^t.
2. >> D(1,2)=1;B=P*D*inv(P);[P,D]=eig(B);
>> X=zeros(n,n);for k=1:n, X=X+D(k,k)*P(:,k)*P(:,k)’; end;
>> norm(B-X)
ans =
8.0345
This time the matrices B and Σ_{i=1}^n µ_i v_i v_i^t do not coincide.
3. Explanation: the symmetric matrix A is diagonalizable in an orthonormal
basis of eigenvectors. In other words, the matrix P computed by eig is
orthogonal. We therefore have A = P DP t . Calling ui the columns of P ,
we have P D = [λ_1 u_1 | . . . | λ_n u_n], and A reads
A = [λ_1 u_1 | . . . | λ_n u_n] [u_1^t ; . . . ; u_n^t] = Σ_{i=1}^n λ_i u_i u_i^t.
Solution of Exercise 2.22 We remark that for various values of n the rank of A is equal to 1.
1. For any x ∈ R^n, we have Ax = vu^t x = ⟨x, u⟩v. Wherefrom we infer that if u and v are nonzero, the image of A is the line generated by v and thus the rank of A is equal to 1.
2. Conversely: let A ∈ Mn (R) be a matrix of rank r = 1. The SVD factorization of A reduces to A = µ1 v1 ut1 , thus the result is proved.
Solution of Exercise 2.24 For the matrix A1
>> A1=[1,2,3;3,2,1;4,2,1];   % A
>> A1*A1                     % A^2
>> A1*ans                    % A^3
>> A1*ans                    % ...
The limit is (seems to be!) a matrix with all entries equal to +∞. We may
also use the power function, for instance, compute A1^100.
For the matrix A2, the limit is
ans =
    0.5000         0    0.5000
         0    1.0000         0
    0.5000         0    0.5000
For the matrix A3, the limit is the zero matrix. For the matrix A4, we obtain two accumulation points, i.e., the 'limit' oscillates between
    0.5000         0    0.5000
         0    1.0000         0
    0.5000         0    0.5000
and
   -0.5000         0   -0.5000
         0    1.0000         0
   -0.5000         0   -0.5000
Each matrix A_i is diagonalizable and reads A_i = P_i D_i P_i^{-1}. The powers of A_i are therefore equal to
A_i^k = P_i D_i^k P_i^{-1} = P_i diag(λ_{i,1}^k, λ_{i,2}^k, λ_{i,3}^k) P_i^{-1}.
The behavior at infinity of the powers of these four matrices is accounted for by studying their eigenvalues.
• The spectral radius of A_1 is > 1: there exists an eigenvalue λ_{1,j} of modulus > 1, so that |λ_{1,j}^k| tends to +∞.
• The spectral radius of the matrix A_3 is < 1. For all j, |λ_{3,j}^k| tends to 0, and so does A_3^k.
• The eigenvalues of A_2 are 1/2, 1 and 1; we infer that D_2^k tends to diag(0, 1, 1).
• The eigenvalues of A_4 are −1, 1/2 and 1. The presence of the eigenvalue −1 accounts for the "oscillations" of the sequence A_4^k.
Solution of Exercise 2.25
>> % Unit circle
>> t=0:0.01:2*pi;x=cos(t)';y=sin(t)';
>> A=[-1.25 0.75; 0.75 -1.25];
>> % Image of the unit circle by A
>> for i=1:length(t)
     v=[x(i);y(i)]; w=A*v; Ax(i)=w(1); Ay(i)=w(2);
   end
>> % Plot of the circle and its image; see Figure 2.2
>> plot(x,y,Ax,Ay,'.','MarkerSize',10,'LineWidth',3)
>> axis([-2 2 -2 2]);grid on; axis equal;
>> set(gca,'XTick',-2:1:2,'YTick',-2:1:2,'FontSize',24);
We compute the singular value decomposition of the matrix
>> [V,S,U] = svd(A)
V =
   -0.7071    0.7071
    0.7071    0.7071
S =
    2.0000         0
         0    0.5000
U =
    0.7071   -0.7071
   -0.7071   -0.7071
% the singular values of the matrix are 2 and 1/2
Solution of Exercise 2.26 The matrix B may be defined by B=[zeros(m,m)
A;A’ zeros(n,n)]. Example of calculation:
>> n=5;m=3;
>> A=rand(m,n);B=[zeros(m,m) A;A’ zeros(n,n)];
>> [V S U]=svd(A);
>> diag(S)’, eig(B)’
ans =
1.7251 0.6388 0.4008
ans =
-1.7251 -0.6388 -0.4008 -0.0000 0.0000 0.4008 0.6388 1.7251
It seems that the nonzero eigenvalues of B are equal to plus and minus the singular values of A. Let us prove this is actually the case. Let λ be an eigenvalue of the symmetric matrix B. There exist two vectors x ∈ R^m and y ∈ R^n such that (x, y) ≠ (0_m, 0_n), Ay = λx and A^t x = λy. We infer that A^t A y = λ²y. We note that y = 0 ⟹ λ = 0, and thus if the eigenvalue λ is nonzero, y is an eigenvector of A^t A corresponding to the eigenvalue λ². This means that |λ| is a singular value of A. Reciprocally, if µ is a nonzero singular value of A, there exists a nonzero vector u ∈ R^n such that A^t A u = µ²u. We set y = u and x = (1/µ)Au:
B [x; y] = [Ay; A^t x] = µ [x; y],
which proves that µ is an eigenvalue of B.
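This construction can be checked numerically, continuing the session above (a small sketch; the standard MATLAB convention A = U0*S0*V0' is used, and the largest singular value is taken):
>> [U0,S0,V0]=svd(A);mu=S0(1,1);
>> y=V0(:,1);x=A*y/mu;
>> norm(B*[x;y]-mu*[x;y])      % (x,y) is an eigenvector of B for the eigenvalue mu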
Solution of Exercise 2.27 Pseudo-inverse matrix.
>> a =[1 -1 4; 2 -2 0; 3 -3 5;-1 -1 0];
>> p=pinv(a)
p =
   -0.0822    0.1925    0.0657   -0.5000
    0.0822   -0.1925   -0.0657   -0.5000
    0.1737   -0.1784    0.0610   -0.0000
>> p*a
ans =
    1.0000   -0.0000    0.0000
   -0.0000    1.0000    0.0000
    0.0000    0.0000    1.0000
>> a*p
ans =
    0.5305   -0.3286    0.3756   -0.0000
   -0.3286    0.7700    0.2629   -0.0000
    0.3756    0.2629    0.6995   -0.0000
    0.0000    0.0000   -0.0000    1.0000
>> a*p*a
ans =
    1.0000   -1.0000    4.0000
    2.0000   -2.0000    0.0000
    3.0000   -3.0000    5.0000
   -1.0000   -1.0000   -0.0000
>> p*a*p
ans =
   -0.0822    0.1925    0.0657   -0.5000
    0.0822   -0.1925   -0.0657   -0.5000
    0.1737   -0.1784    0.0610   -0.0000
We note that A† A = I3 , which makes sense since rk (A) = 3. We also notice
that A† AA† = A† and AA† A = A (Moore-Penrose conditions).
Solution of Exercise 2.28 We note that both quantities are equal. Let us show that this is actually the case. Let A = V Σ̃ U^* be the SVD factorization of the matrix A ∈ M_{m,n}(R), with
Σ̃ = [ Σ  0 ; 0  0 ]  and  Σ = diag(µ_1, . . . , µ_r).
By definition A^† = U Σ̃^† V^* with Σ̃^† = [ Σ^{-t}  0 ; 0  0 ]. Computing the trace of AA^†, we obtain
tr(AA^†) = tr(V Σ̃ Σ̃^† V^*) = tr(Σ̃ Σ̃^†) = tr(Σ Σ^{-t}),
wherefrom we get the result tr(Σ Σ^{-t}) = r.
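A quick numerical check of this identity (a small sketch; the sizes and the rank are arbitrary):
>> A=rand(7,3)*rand(3,5);              % a 7x5 matrix of rank 3 (almost surely)
>> [trace(A*pinv(A)) rank(A)]          % both values equal 3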
3 Exercises of chapter 3
Solution of Exercise 3.1 Numerical experiments show that each one of
these norms is larger than m = maxi,j |Ai,j |. Let us prove this result, which is
obvious for the Frobenius norm. Let i0 and j0 be indices such that m = |Ai0 ,j0 |.
The p norms being subordinate, we have
‖A‖_p = max_{x≠0} ‖Ax‖_p / ‖x‖_p ≥ ‖A e_{i0}‖_p / ‖e_{i0}‖_p = ‖A e_{i0}‖_p = ( Σ_{j=1}^n |A_{i0,j}|^p )^{1/p} ≥ |A_{i0,j0}|,
where e_i is the i-th vector of the canonical basis.
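The observation itself can be reproduced as follows (a small sketch with an arbitrary rectangular matrix):
>> A=rand(5,7);m=max(max(abs(A)));
>> [m norm(A,1) norm(A) norm(A,inf) norm(A,'fro')]   % every norm is at least m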
Solution of Exercise 3.2 Numerical experiments show that each one of these norms is larger than m = (min_i |T_{i,i}|)^{-1}. Let us prove this result: for each of the considered norms, we have
‖T^{-1}‖ ≥ max_{i,j} |(T^{-1})_{i,j}| ≥ max_i |(T^{-1})_{i,i}| = max_i 1/|T_{i,i}| = 1/(min_i |T_{i,i}|) = m,
because, for a triangular matrix, the diagonal entries of its inverse are the
inverse of its diagonal entries.
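For instance (a small sketch; the shift n*eye simply makes T nonsingular with diagonal entries of moderate size):
>> n=6;T=tril(rand(n,n))+n*eye(n,n);
>> m=1/min(abs(diag(T)));
>> [m norm(inv(T),1) norm(inv(T)) norm(inv(T),inf) norm(inv(T),'fro')]   % every norm of inv(T) is at least m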
Solution of Exercise 3.4
>> n=35;u=rand(n,1);v=rand(n,1);
>> fprintf(’norm of uvt = %f norm of u = %f norm of v = %f \n’,...
norm(u*v’),norm(u),norm(v))
norm of uvt = 10.413866 norm of u = 3.011554 norm of v = 3.457971
We note that ‖uv^t‖_2 = ‖u‖_2 ‖v‖_2. Justification:
‖uv^t‖_2 = max_{‖x‖_2=1} ‖uv^t x‖_2 = max_{‖x‖_2=1} ‖⟨x, v⟩u‖_2 = ‖u‖_2 max_{‖x‖_2=1} |⟨x, v⟩|.
We conclude by remarking that max_{‖x‖_2=1} |⟨x, v⟩| = ‖v‖_2, since by the Cauchy–Schwarz inequality we have |⟨x, v⟩| ≤ ‖v‖_2 ‖x‖_2, and the supremum is attained for the vectors x = αv, α ∈ R. We have the same relation for the Frobenius norm:
‖uv^t‖_F = √( Σ_i Σ_j |(uv^t)_{i,j}|² ) = √( Σ_i Σ_j |u_i|² |v_j|² ) = ‖u‖_2 ‖v‖_2 = ‖u‖_F ‖v‖_F.
The same formula does not hold true for the norms ‖.‖_1 and ‖.‖_∞, because for u = v = (1, . . . , 1)^t we have
• ‖uv^t‖_∞ = n and ‖u‖_∞ = ‖v‖_∞ = 1,
• ‖uv^t‖_1 = n and ‖u‖_1 = ‖v‖_1 = n.
Solution of Exercise 3.5 The vector v is an eigenvector of AAt associated
with the smallest singular value of A. Consider A = V ΣU ∗ the SVD factorization of matrix A. We know, on the one hand (see Remark 2.7.5) that the
columns of V are the eigenvectors of AAt , and on the other hand (see equality
(5.17)) that
‖A^{-1}‖_2 = ‖A^{-1} v_n‖.
Solution of Exercise 3.6
1. Define ϕA (x) = kAxk for x ∈ Rn . This function obviously satisfies the
triangular inequality and the homogeneity relation. It remains to check
that ϕA (x) = 0, implies x = 0. This is true if and only if A is injective.
function y=normA(A,u)
if size(A,2)~=size(u,1)
error(’normA::incompatibility of dimensions’);
else
y=norm(A*u)
end;
2. function y=normAs(A,x)
if size(A,2)~=size(x,1)
error(’normAs::incompatible dimensions’);
else
if size(x,2)~=1
error(’normAs:: x must be a vector ’);
end;
y=sqrt(x’*A*x)
end;
Define ϕ_A(x) = √⟨Ax, x⟩ for x ∈ R^n. This function is real valued if A
is assumed to be positive semi definite, i.e., for all x ∈ Rn , hAx, xi ≥ 0.
Under this minimal assumption let us check the three norm properties.
• The homogeneity is satisfied: ϕA (λx) = |λ|ϕA (x) for all λ ∈ R.
• Only the zero vector has zero norm: ϕA (x) = 0 =⇒ x = 0. This is true
if and only if A is positive definite.
• The triangular inequality ϕA (x + y) ≤ ϕA (x) + ϕA (y) is satisfied if
and only if
hAx, yi + hAy, xi ≤ 2ϕA (x)ϕA (y),
or, equivalently, if and only if
h(A + At )x, yi ≤ 2ϕA (x)ϕA (y).
(3.1)
Now, we distinguish two cases:
– A is symmetric and, then, condition (3.1) is equivalent to
hAx, yi ≤ ϕA (x)ϕA (y)
(3.2)
which is nothing but the Cauchy–Schwarz inequality for the bilinear form ⟨Ax, y⟩. To prove (3.2), we decompose x and y in an orthonormal basis (u_i) of eigenvectors of A: x = Σ_{i=1}^n x_i u_i and y = Σ_{i=1}^n y_i u_i. Then (3.2) is simply
Σ_{i=1}^n λ_i x_i y_i ≤ √(Σ_{i=1}^n λ_i x_i²) √(Σ_{i=1}^n λ_i y_i²),
which holds true by virtue of the discrete Cauchy–Schwarz inequality.
– If A is nonsymmetric, inequality (3.1) is nonetheless true. In fact, using (3.2) for the symmetric matrix A + A^t, we have
⟨(A + A^t)x, y⟩ ≤ √⟨(A + A^t)x, x⟩ √⟨(A + A^t)y, y⟩ = 2ϕ_A(x)ϕ_A(y),
since ⟨A^t x, x⟩ = ⟨x, Ax⟩ = ⟨Ax, x⟩.
Remark: when A is symmetric and positive definite, (x, y) 7−→ hAx, yi
defines a scalar product on Rn .
Solution of Exercise 3.9 Best k-rank approximation.
1. >> m=10;n=7;r=5;A=MatRank(m,n,r);
>> [V,S,U] = svd(A);S=diag(S);
>> for k=r-1:-1:1
>>    [v,s,u] = svds(A,k);
>>    fprintf('For k = %i:: error^2 = %f and S(%i)=%f\n',...
>>            k,norm(v*s*u'-A,'fro')^2,k,S(k+1))
>> end
For k = 4:: error^2 = 0.006658 and S(4)=0.081598
For k = 3:: error^2 = 0.099486 and S(3)=0.304677
For k = 2:: error^2 = 0.888564 and S(2)=0.888300
For k = 1:: error^2 = 1.890766 and S(1)=1.001101
We note that ‖A − A_k‖_F² = Σ_{i=k+1}^r µ_i².
2. In the proof of Proposition 3.2.1, we write A − A_k = V D U^* where D ∈ M_{m,n}(R) is the diagonal matrix diag(0, . . . , 0, µ_{k+1}, . . . , µ_r, 0, . . . , 0). The Frobenius norm being invariant by unitary transformation, we have
‖A − A_k‖_F = ‖D‖_F = √( Σ_{i=k+1}^r µ_i² ),
hence the result is proved.
Solution of Exercise 3.10
1. C is the diagonal matrix formed by entries ci .
2. We have
M ż(t) = [ M ẏ(t) ; M ÿ(t) ] = [ M ẏ(t) ; −C ẏ(t) − K y(t) ] = [ 0  M ; −K  −C ] [ y(t) ; ẏ(t) ].
Since the matrix M is nonsingular, we deduce (3.14) with
A = [ 0  I ; −M^{-1}K  −M^{-1}C ].
3. With the data of the problem, we have
M = [ 1 0 0 ; 0 2 0 ; 0 0 1 ],   K = [ 2 −1 0 ; −1 2 −1 ; 0 −1 1 ],   C = [ 1/2 0 0 ; 0 1/2 0 ; 0 0 1/2 ].
The script below produces Figures 3.1-3.2. We observe that the masses go back to their equilibrium positions and stay at rest (zero velocity).
>> K=laplacian1dD(3)/16; K(3,3)=1;M=diag([1 2 1]);
>> C=diag([.5 .5 .5]);
>> A=[zeros(size(K)) eye(size(K));-inv(M)*K -inv(M)*C];
>> pos0=[-1 0 1];               % Initial positions
>> Z0=[-.1 0 .1];               % perturbation of the positions
>> Z0=[Z0 -1 0 1]';             % perturbation of the velocities
>> velocities=[];positions=[];time=0:0.1:30;
>> for t=time;
     % Solution (positions and velocities) at instant t
     Z=expm(A*t)*Z0;
     positions=[positions;Z(1:3)'+pos0];
     velocities=[velocities;Z(4:6)'];
   end
>> plot(time, positions,'MarkerSize',10,'LineWidth',3)
>> plot(time, velocities,'MarkerSize',10,'LineWidth',3)
Fig. 3.1. Positions of the masses as functions of time.
Fig. 3.2. Velocities of the masses as functions of time.
4 Exercises of chapter 4
Solution of Exercise 4.3
1. If A is a band matrix of half-bandwidth p, there are at most 2p + 1 nonzero elements in a row of A. The scalar product of u and a row of A is carried out in at most 2p + 1 multiplications (we shall not consider boundary effects due to the first and last rows, which contain fewer than 2p + 1 nonzero entries each). The cost of a product Au is thus, asymptotically, of (2p + 1)n multiplications.
2. If A and B are two band matrices of half-bandwidth p, then AB is also a band matrix, of half-bandwidth 2p. Indeed, on the one hand we have
(AB)_{i,j} = Σ_{k=1}^n a_{i,k} b_{k,j} = Σ_{i−p ≤ k ≤ i+p, j−p ≤ k ≤ j+p} a_{i,k} b_{k,j},
and on the other hand
|i − j| > 2p ⟺ { i − j > 2p or i − j < −2p } ⟺ { i − p > j + p or i + p < j − p },
so (AB)i,j = 0 if i + p < j − p or i − p > j + p. The matrix AB having a
band structure, we do not have to determine n2 scalars but only (4p + 1)n
scalars, each one of them is equal to the scalar product of two vectors
having at most (2p+1) nonzero entries each, which is carried out in at most
(2p + 1) operations. The total cost is therefore less than (2p + 1)(4p + 1)n
operations. A more accurate computation of the scalar product of row i
of matrix A and column j of matrix B is done in
• 0 operation if j < i − p,
• 1 operation if j = i − p,
.
• ..
• 2p + 1 operations if j = i,
.
• ..
• 1 operation if j = i + p,
• 0 operation if j > i + p,
that is, overall (2p+1) + 2(1 + . . . + 2p) = (2p+1)² operations. The number of operations required by the calculation of AB is thereby equivalent to (2p + 1)² n.
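The half-bandwidth claim of item 2 can be checked numerically (a small sketch; tril/triu are used to cut a random matrix down to half-bandwidth p):
>> n=12;p=2;
>> A=triu(tril(rand(n,n),p),-p);B=triu(tril(rand(n,n),p),-p);
>> C=A*B;
>> nnz(triu(C,2*p+1))+nnz(tril(C,-2*p-1))   % no nonzero entry beyond half-bandwidth 2p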
Solution of Exercise 4.4 Matlab allowing for recursiveness (a function can
call itself), the Strassen algorithm is easily programmed.
function c=strassen(a,b)
% warning: matrices are square
% of size n × n with n = 2^k
n=size(a,1);
if n==1
c=a*b;
else
m=n/2;
% we decompose the matrix a in 4 blocks
a11=a(1:m,1:m);a12=a(1:m,m+1:n);
a21=a(m+1:n,1:m);a22=a(m+1:n,m+1:n);
% same for b
b11=b(1:m,1:m);b12=b(1:m,m+1:n);
b21=b(m+1:n,1:m);b22=b(m+1:n,m+1:n);
% the Strassen calculation rule
m1=strassen(a12-a22,b21+b22);m2=strassen(a11+a22,b11+b22);
m3=strassen(a11-a21,b11+b12);m4=strassen(a11+a12,b22);
m5=strassen(a11,b12-b22);m6=strassen(a22,b21-b11);
m7=strassen(a21+a22,b11);
% we define c=a*b by blocks
c=[m1+m2-m4+m6,m4+m5;m6+m7,m2-m3+m5-m7];
end
We check on some examples that the function is indeed working
>> k=5;n=2^k;a=rand(n,n);b=rand(n,n);c=a*b;d=strassen(a,b);
>> norm(c-d)
ans =
1.9112e-13
We compare with the computation done by MatMult.
>> k=5;n=2^k;a=rand(n,n);b=rand(n,n);
tic;c=strassen(a,b);t1=toc
t1 =
0.7815
>> tic;d=MatMult(a,b);t2=toc
t2 =
0.0050
We remark that function strassen is much slower than function MatMult.
This is due to the recursiveness using a lot of memory.
Solution of Exercise 4.5 We may compute X −1 defined by (4.4) in six
matrix multiplications and two matrix inversions
>> A=rand(n,n);B=rand(n,n);
% matrices of size n × n
>> C=rand(n,n);D=rand(n,n);
>> tic;
% initialization of a time counter
>> Am1=inv(A);
>> M=Am1*B;
>> Delta=D-C*M;
>> N=C*Am1;
>> Deltam1=inv(Delta);
>> P=Deltam1*N;
>> inverseX=[Am1+M*P, -M*Deltam1; -P, Deltam1];
>> t1=toc;
% time elapsed since the last tic call
>> X=[A B;C D];
>> tic;invX=inv(X);t2=toc;
>> n=100;norm(inverseX-invX, ’inf’) % we check the computations
ans =
2.4219e-10
For small values of n the standard Matlab inversion of X is faster.
>> fprintf(’computation by the Schur complement = %f \n’,t1)
computation by the Schur complement = 0.012005
>> fprintf(’computation by inversion of X = %f \n’,t2)
computation by inversion of X = 0.008171
5 Exercises of chapter 5
Solution of Exercise 5.1
1. eps is the floating point relative accuracy of Matlab in double precision
>> eps
ans =
2.2204e-16
>> a=eps;b=0.5*eps;X=[2, 1;2, 1];
>> A=[2, 1;2, 1+a];norm(A-X)
>> B=[2, 1;2, 1+b];norm(X-B)
>> ans =
2.2204e-16
>> ans =
0
In single precision the accuracy is
>> eps(’single’)
ans =
1.1921e-07
2. realmax is the largest floating point number, and realmin the smallest
one
>> rM=realmax, 1.0001*rM,
rM =
1.7977e+308
ans =
Inf
This produces an overflow.
>> rm=realmin, .0001*rm
rm =
2.2251e-308
26
CHAPTER 5.
EXERCISES OF CHAPTER 5
ans =
2.2251e-312
There is no underflow; according to the Matlab help, the number .0001*rm is a "denormal number". Consult the IEEE floating-point documentation.
3. Infinity and “Not a number”
>> A=[1 0 0 3]; B=[5 -1 1 1]./A
Warning: Divide by zero.
B =
    5.0000      -Inf       Inf    0.3333
>> isinf(B)
ans =
     0     1     1     0
>> C=A.*B
ans =
     5   NaN   NaN     1
>> isnan(C)
ans =
     0     1     1     0
4. >> A=[1 1; 1 1+eps];inv(A), rank(A)
Warning: Matrix is close to singular or badly scaled.
Results may be inaccurate. RCOND = 5.551115e-17.
ans =
   1.0e+15 *
    4.5036   -4.5036
   -4.5036    4.5036
ans =
     1
The matrix A is invertible, but very ill conditioned, hence the response of
Matlab is false (the real rank is 2).
>> B=[1 1; 1 1+.5*eps];inv(B), rank(B)
Warning: Matrix is singular to working precision.
ans =
   Inf   Inf
   Inf   Inf
ans =
     1
The matrix B is singular for Matlab.
Solution of Exercise 5.4 We store the upper triangular matrix A column
by column, starting with the first one, in a vector u. The mapping between
indices is Ai,j = ui+j(j−1)/2 . The program is quite similar to the previous one:
function aU=storeU()
fprintf(’Storage of an upper triangular matrix’)
fprintf(’the matrix is stored column by column’)
n=input(’enter the dimension n of the square matrix’)
for j=1:n
fprintf(’column %i \n’,j)
ii=j*(j-1)/2;
for i=1:j
fprintf(’enter element (%i,%i) of the matrix’,i,j)
aU(i+ii)=input(’ ’);
end;
end;
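A quick way to check the index mapping A_{i,j} = u_{i+j(j−1)/2} (a small sketch; the test matrix and the indices are arbitrary):
>> n=4;A=triu(magic(n));u=[];
>> for j=1:n, u=[u; A(1:j,j)]; end;     % column-by-column storage of the upper triangle
>> i=2;j=3;[A(i,j) u(i+j*(j-1)/2)]      % the two values coincide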
Solution of Exercise 5.5 Starting from the dimension s of the vector storing
a triangular square matrix A, one has to determine the size n of the latter.
Since n ≥ 1, we have
s = n(n + 1)/2 ⟺ n² + n − 2s = 0 ⟹ n = (−1 + √(1 + 8s)) / 2.
function c=storeLpU(a,b)
% product of two triangular matrices a and b
% a: stored by storeL
% b: stored by storeU
% c=ab: full matrix
m=length(a);s=length(b);
if m~=s, error('incompatible dimensions');end;
n=round((-1+sqrt(1+8*m))/2);
c=zeros(n,n);
for i=1:n
ii=i*(i-1)/2;
for j=1:n
jj=j*(j-1)/2;
s=0;
for k=1:min(i,j)
s=s+a(k+ii)*b(k+jj);
end;
c(i,j)=s;
end;
end;
Solution of Exercise 5.6
1. The matrix Q is obtained by
>> A=[1:5;5:9;10:14];Q=null(A’)
Q =
    0.4527
   -0.8148
    0.3621
2. a) >> b=[5; 9; 4];x=A\b
Warning: Rank deficient, rank = 2, tol =
1.9294e-14.
x =
-1.8443
0
0
0
1.6967
Matlab warns us of a problem: it can not compute the rank of A.
>> A*x-b
ans =
1.6393
-2.9508
1.3115
The vector x = A\b returned by Matlab is not a solution of equation
Ax = b.
>> Q’*b
ans =
-3.6214
b) For b=[1; 1; 1], x = A\b is indeed a solution of equation Ax = b
and Qt b = 0.
c) Justification. For any matrix A of size m × n and b ∈ R^m, we have (denoting by q_1, . . . , q_s the columns of Q):
Q^t b = 0 ⟺ ⟨q_i, b⟩ = 0 for all i = 1, . . . , s ⟺ b ∈ (Ker A^t)^⊥ ⟺ b ∈ Im A.
d) function y=InTheImage(A,b)
kerAt=null(A’);
if (size(kerAt,1)==size(b,1))
if norm(kerAt’*b) > 1.e-6
y=’no’;
else
y=’yes’;
end;
else
error(’Dimension problem’)
end;
Some trials:
>> A=[1 2 3; 4 5 6; 7 8 9];b=[1;1;1];InTheImage(A,b)
ans =
yes
>> b=[1;2;1];InTheImage(A,b)
ans =
no
Solution of Exercise 5.7 The computation of x by the Cramer formulas is
performed by the following function
function y=cramer(A,b)
n=size(A,1);d=det(A);
for i=1:n
sol(i)=det([A(:,1:i-1), b, A(:,i+1:n)])
end;
y=sol’/d;
We carry out the calculations for n = 20, 40, 60 and 80.
>> for n=20:20:80;
>> b=ones(n,1);c=1:n;A=c’*ones(size(c));A=A+A’;
>> s=norm(A,’inf’);
>> for i=1:n, A(i,i)=s;end;
>>
x=cramer(A,b);
>>
sol=A\b;
>>
norm(sol-x’);
>>
[n
norm(sol-x’)]
>> end;
ans =
   20.0000    0.0000
ans =
   40.0000    0.0000
ans =
   60.0000    0.0000
ans =
   80            NaN
Explanation: for n = 80, Matlab cannot execute the requested calculations anymore, the determinants being too large (NaN means 'not a number')
>> det(A)
ans =
Inf
Solution of Exercise 5.8 The matrix A is always invertible (det(A) = 1) and
A^{-1} = [ I_n  −B ; 0_n  I_n ].
We infer that
cond_F(A)² = ‖A‖_F² ‖A^{-1}‖_F² = (2n + ‖B‖_F²)²,
i.e., cond_F(A) = 2n + ‖B‖_F². The Frobenius norm and condition number of a
matrix X are given by norm(X,’fro’) and cond(X,’fro’).
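This formula can be checked numerically (a small sketch; the block size is arbitrary):
>> n=4;B=rand(n,n);
>> A=[eye(n,n) B; zeros(n,n) eye(n,n)];
>> [cond(A,'fro') 2*n+norm(B,'fro')^2]   % the two values coincide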
Solution of Exercise 5.9 We check that cond2 (Hn ) is very large even for
small values of n:
>> cond(hilb(5)), cond(hilb(10))
ans =
4.7661e+05
ans =
In Figure 5.1, we display the graph generated by the following instructions
>> n=10;x=(1:n)';for i=1:n,y(i)=cond(hilb(i)); end;
>> plot(x,log(y),'-+',x,3.45*x-4.1,'MarkerSize',10,'LineWidth',3)
>> grid on;
>> set(gca,'XTick',1:2:10,'YTick',-5:10:35,'FontSize',24);
The curve n ↦ ln cond₂(H_n) is almost a straight line of slope close to 3.45, wherefrom we deduce the approximation cond₂(H_n) ≈ e^{3.45 n}.
Fig. 5.1. Condition number of the Hilbert matrix, n ↦ ln cond₂(H_n).
Actually one can prove the equivalence cond₂(H_n) ≈ e^{7n/2}, which implies that Hilbert matrices are very much ill-conditioned.
Solution of Exercise 5.10
function y=Lnorm(A)
% computes the 2-norm of a lower triangular matrix A
if norm(A-tril(A))>1.e-10
error(’the matrix is not lower triangular.’)
end;
n=size(A,1);y=zeros(n,1);x=ones(n,1);
y(1)=A(1,1)*x(1);
for i=2:n
s=A(i,1:i-1)*x(1:i-1);
if abs(A(i,i)+s)<abs(A(i,i)-s)
x(i)=-1;
end;
y(i)=A(i,i)*x(i)+s;
end;
y=y/sqrt(n);
y=norm(y);
Some tests:
>> n=20;a=tril(rand(n,n));[Lnorm(a), norm(a)]
ans =
5.8862
6.6578
>> n=40;a=tril(rand(n,n));[Lnorm(a), norm(a)]
ans =
11.9732
13.1621
>> n=80;a=tril(rand(n,n));[Lnorm(a), norm(a)]
ans =
23.7755
26.2176
We notice a fairly good agreement of the two computations.
Solution of Exercise 5.11
function y=LnormAm1(A)
% computes the 2-norm of A−1
n=size(A,1);y=zeros(n,1);
y(1)=1/A(1,1);
for i=2:n
s=A(i,1:i-1)*y(1:i-1);
y(i)=-(sign(s)+s)/A(i,i);
end;
y=y/sqrt(n);
y=norm(y);
Some tests:
>> n=20;a=LowNonsingularMat(n);[LnormAm1(a), norm(inv(a))]
ans =
0.1056
0.1126
>> n=40;a=LowNonsingularMat(n);[LnormAm1(a), norm(inv(a))]
ans =
0.0525
0.0552
>> n=80;a=LowNonsingularMat(n);[LnormAm1(a), norm(inv(a))]
ans =
0.0252
0.0260
Here as well, we observe a fairly good agreement of both computations.
Solution of Exercise 5.12
function y=Lcond(A)
y=Lnorm(A)*LnormAm1(A);
Some tests:
>> n=20;a=LowNonsingularMat(n);[Lcond(a), cond(a)]
ans =
1.5809
1.6801
>> n=40;a=LowNonsingularMat(n);[Lcond(a), cond(a)]
ans =
1.5717
1.6543
>> n=80;a=LowNonsingularMat(n);[Lcond(a), cond(a)]
ans =
1.5438
1.5946
Solution of Exercise 5.14
1. We notice that for various values of n the spectrum of the matrix seems to be made up only of the values 1.618 and −0.618.
2. Instead of solving system Ax = b, we solve the equivalent system
M −1 Ax = M −1 b: this may be worthwhile at least when the latter system
is better conditioned than the original one.
3. a) We have
M^{-1}A [u; v] = λ [u; v], or equivalently A [u; v] = λ M [u; v], i.e.,
{ Cu + D^t v = λCu,  Du = λ D C^{-1} D^t v }  ⟺  { (1 − λ)Cu + D^t v = 0,  Du = λ D C^{-1} D^t v }.
We deduce that Du = λ D C^{-1} (λ − 1)Cu = λ(λ − 1)Du, so that (λ² − λ − 1)Du = 0.
b) The last relation shows that:
• either Du = 0 and thus D C^{-1} D^t v = 0, that is, v = 0. Then we deduce that Cu = λCu, so λ = 1, since the eigenvector (u, v)^t is nonzero and accordingly u ≠ 0;
• or λ² − λ − 1 = 0, that is, λ = (1 ± √5)/2.
Thus, the eigenvalues of M^{-1}A belong to the set {(1 − √5)/2, 1, (1 + √5)/2} independently of the dimension n.
c) If M^{-1}A is symmetric, we have
cond₂(M^{-1}A) = |λ_max| / |λ_min| = (√5 + 1)/(√5 − 1) ≈ 2.62.
Solution of Exercise 5.17 We put the data in an array
>> X=[1990 1991 1992 1993 1994 1995 1996 ...
1997 1998 1999 2000; ...
939 972 1006 1022 1016 1011 1038 ...
1047 1058 1071 1083];
Reproduction of Figure 1.2.
years=X(1,:);cost=X(2,:);
m=length(years);
t0=years;
A(1,1)=m;A(1,2)=sum(years);A(2,1)=A(1,2);A(2,2)=sum(years.*years);
b(1)=sum(cost);b(2)=years*cost’;
xx=A\b’;
pt=xx(1)+years*xx(2);
plot(years,cost,’-+’,years,pt,’MarkerSize’,10,’LineWidth’,3)
grid on
set(gca,’XTick’,1990:2:2000,’YTick’,900:50:1100,’FontSize’,24);
Reproduction of Figure 1.3. We now use the functions polyfit and polyval.
p = polyfit(years,cost,4);
f = polyval(p,years);
plot(years,cost,’-+’,years,f,’MarkerSize’,10,’LineWidth’,3)
grid on
set(gca,’XTick’,1990:2:2000,’YTick’,900:50:1100,’FontSize’,24);
Solution of Exercise 5.18
1. We define a function f
f=inline(’sin(x)-sin(2*x)’);
Let p(x) = a + bx + cx². The vector u = (a, b, c)^t is a solution of A^t A u = A^t b, with
A = [ 1 x_1 x_1² ; 1 x_2 x_2² ; . . . ; 1 x_n x_n² ],   b = (y_1, y_2, . . . , y_n)^t,  where y_i = f(x_i).
>> n=100;x=4.*rand(n,1);x=sort(x);y=f(x);
>> A1=[ones(n,1) x x.*x];
>> coef=(A1'*A1)\(A1'*y);
>> sol1=coef(1)+coef(2)*x+coef(3)*x.*x;
>> norm(sol1-y)
ans =
    4.7731
The same results are obtained with the instructions
>> n=100;x=4.*rand(n,1);x=sort(x);y=f(x);
>> p = polyfit(x,y,2);
>> fp = polyval(p,x);norm(fp-y)
2. We write p(x) = a + b cos(x) + c sin(x). The vector u = (a, b, c)^t is a solution of the system A^t A u = A^t b with
A = [ 1 cos(x_1) sin(x_1) ; 1 cos(x_2) sin(x_2) ; . . . ; 1 cos(x_n) sin(x_n) ].
>> cx=cos(x);sx=sin(x);
>> b=y;A=[ones(n,1) cx sx];
>> coef=(A’*A)\(A’*b);
>> sol2=coef(1)+coef(2)*cx+coef(3)*sx;
>> norm(sol2-y)
ans =
4.3472
The error is smaller. This was to be expected, since in view of the representation of function f at points xi , it seemed more suitable to seek a
combination of trigonometric functions than a combination of monomials.
Both approximations p and q as well as the cloud of points X,f(X) are
displayed in Figure 5.2.
Fig. 5.2. The two approximations p (+) and q (–) in Exercise 5.18.
6 Exercises of chapter 6
Solution of Exercise 6.1 We obtain
det(A)
ans =
- 9.517E-16
instead of 0, because of rounding errors.
Solution of Exercise 6.2 Comparison of the LU and Cholesky methods.
1. function [L,U]=LUfacto(A)
[m,n]=size(A);
if m~=n, error(’the matrix is not square’), end;
small=1.e-16;
for k=1:n-1
if abs(A(k,k))<small, error(’error: zero pivot’), end;
for i=k+1:n
A(i,k)=A(i,k)/A(k,k);
A(i,k+1:n)=A(i,k+1:n)-A(i,k)*A(k,k+1:n);
end;
end;
U=triu(A);L=A-U+diag(ones(n,1));
2. function A=Cholesky(A)
[m,n]=size(A);
if m~=n, error(’the matrix is not square’), end;
small=1.e-8;
if norm(A-A’,’inf’)>small
error(’nonsymmetric matrix’)
end;
for j=1:n
A(j,j)=A(j,j)-A(j,1:j-1)*A(j,1:j-1)’;
if A(j,j)< small, error(’nonpositive matrix’), end;
if abs(A(j,j))< small, error(’nondefinite matrix’), end;
A(j,j)=sqrt(A(j,j));
for i=j+1:n
A(i,j)=A(i,j)-A(j,1:j-1)*A(i,1:j-1)’;
A(i,j)=A(i,j)/A(j,j);
end;
end;
A=tril(A);
3. The instructions below produce Figure 6.1.
>> for i=1:50
     n=10*i;A=pdSMat(n);b=ones(n,1);
     tic;[L U]=LUfacto(A);xlu=BackSub(U,ForwSub(L,b));tlu(i)=toc;
     tic;B=Cholesky(A);xcho=BackSub(B',ForwSub(B,b));tcho(i)=toc;
   end;
>> x=10*(1:50)';   % we represent CPU(LU) and 2*CPU(Cholesky)
>> plot(x,tlu,x,2*tcho,'+','MarkerSize',10,'LineWidth',3)
>> grid on;
>> set(gca,'XTick',0:100:500,'YTick',0:1:4,'FontSize',24);
Fig. 6.1. Comparison of the LU and Cholesky methods computational time.
We remark that the curves are almost superimposed and therefore the
Cholesky method is twice as fast as the LU method.
Solution of Exercise 6.4 Influence of row permutations.
>> e=1.E-16;A=[e 1 1;1 -1 1; 1 0 1];b=[2 0 1]’;
>> [L U]=LUfacto(A);[w z p]=lu(A);[l u]=LUfacto(p*A);
>> x1=BackSub(U,ForwSub(L,b));
?? Error using ==> BackSub
singular matrix
>> det(U)
ans =
0
>> x2=BackSub(u,ForwSub(l,p*b))
x2 =
-0.0000
1.0000
1.0000
This striking difference (the first algorithm fails while the second one is successful) can be explained by the occurrence of too large entries in L and U
>> [norm(L) norm(U) norm(l) norm(u)]
ans =
   1.0e+16 *
    1.4142    1.4142    0.0000    0.0000
Indeed, the permutation matrix p exchanges the first and second rows of A.
This permutation avoids dividing by too small pivots in the computation of
the LU factorization; see the importance of pivoting in Exercise 6.3.
Solution of Exercise 6.7 2D Finite difference Laplacian.
1. Approximation of the Laplacian
a) Write the Taylor expansions.
b) Idem.
c) It is clear that
−∆u(x_i, y_j) = (−u_{i−1,j} + 2u_{i,j} − u_{i+1,j})/h² + (−u_{i,j−1} + 2u_{i,j} − u_{i,j+1})/k² + O(h²) + O(k²),
which yields the following second-order approximation of the Laplacian:
−∆u(x_i, y_j) ≈ (−u_{i−1,j} + 2u_{i,j} − u_{i+1,j})/h² + (−u_{i,j−1} + 2u_{i,j} − u_{i,j+1})/k².
2. See next question.
3. We find that B is the N × N tridiagonal matrix
B = [  4  −1   0  . . .  0
      −1   4  −1         ⋮
       0   ⋱   ⋱   ⋱    0
       ⋮      −1   4   −1
       0  . . .  0  −1   4 ].
The matrix of the complete system is
A = (1/h²) [  B   −I_N    0   . . .    0
            −I_N    B   −I_N           ⋮
              0     ⋱     ⋱     ⋱     0
              ⋮         −I_N    B   −I_N
              0   . . .   0   −I_N     B ].
We notice that this matrix is tridiagonal by blocks: each diagonal block is
tridiagonal and each out-of-diagonal block is diagonal (it is equal to minus
the identity).
a) The following function defines A by blocks.
function A=laplacian2dD(n)
% Computes the 2D Laplacian matrix
% defined on the unit square
% with Dirichlet boundary conditions
% n = number of internal points in x = number of internal points in y
I0=-eye(n,n);
B0=2*eye(n,n)-diag(ones(n-1,1),1);B0=B0+B0’;
O=zeros(n,n);
A=[B0 I0;I0 B0];
X=O;
for j=3:n
A=[A [X I0]’;X I0 B0];
X=[X O];
end;
A=A*(n+1)*(n+1);
The instruction spy(laplacian2dD(5)) gives Figure 6.2.
Fig. 6.2. Display of the 2D Laplacian matrix (N = 5).
We may also define A pointwise, without using the block structure
function A=laplacian2dDBis(n)
A=4*eye(n*n,n*n);
for i=1:n*n-1
A(i,i+1)=-1;
A(i+1,i)=-1;
end;
for i=1:n-1
A(n*i,n*i+1)=0;
A(n*i+1,n*i)=0;
end;
for i=1:(n-1)*(n-1)
A(i,i+n)=-1;
A(i+n,i)=-1;
end;
A=A*(n+1)*(n+1);
b) function b=laplacian2dDRHS(n,frhs)
% Computes the RHS of the 2D laplacian problem
xgrid=(1:n)/(n+1);ygrid=(1:n)/(n+1);
xx=xgrid’*ones(1,n); % each column of xx contains xgrid
yy=ones(n,1)*ygrid; % each row of yy contains ygrid
frhs=str2func(frhs);
b=frhs(xx,yy);
b=b(:);
4. Validation. Example of a right-hand side f and of a solution u to validate the scheme:
function fxy=RHS(x,y)
% f(x,y) = 2x(1-x) + 2y(1-y)
fxy=2*(x.*(1-x) + y.*(1-y));
function uxy=exactu(x,y);
% u(x,y) = x(1-x)y(1-y)
uxy=x.*(1-x).*y.*(1-y);
The following function gives an approximate solution of −∆u = f on the
unit square, with Dirichlet homogeneous boundary conditions. The reader
is invited to make the necessary changes to solve the same problem in a
square ]a, b[×]c, d[ with non-homogeneous Dirichlet boundary conditions.
function sol=Solve2dLaplacian(n,f2d)
% solves the Laplacian in 2D
% in the unit square
% with homogenous Dirichlet boundary conditions
% f is the LHS
A=laplacian2dD(n);
b=laplacian2dDRHS(n,f2d);
sol=A\b;
% sol is a vector
sol=reshape(sol,n,n);
% sol is a matrix
Call example (the result is displayed in Figure 6.3)
>> n=10;
>> sol=Solve2dLaplacian(n,'RHS');
>> x=(1:n)/(n+1);
>> surf(x,x,sol,'MarkerSize',10,'LineWidth',3)
>> set(gca,'XTick',0:.25:1,'YTick',0:.25:1,'FontSize',20);
>> xx=x’*ones(1,n);yy=ones(n,1)*x;
>> exact=exactu(xx,yy);
>> norm(exact-sol)
ans =
1.0488e-016
Fig. 6.3. Computation of the solution (for N = 10) of problem (6.7)-(6.8) with
f (x, y) = 2x(1 − x) + 2y(1 − y) and g = 0.
5. Convergence.
a) f(x, y) = 2π²u(x, y) − 2π[(y − 1) sin(πy) cos(πx) + (x − 1) sin(πx) cos(πy)]
b) The limit value N0 = 80 depends on the computer used.
c) The matrix is formed of N copies of the block B plus 2(N − 1) copies of the matrix −I_N, hence we have N_e = N(3N − 2) + 2(N − 1)N = 5N² − 4N. The following function builds A by exploiting the sparse structure of this matrix.
function A=laplacian2dDSparse(n)
% Computing the sparse 2D Laplacian matrix
% in the unit square
% Dirichlet boundary conditions
% n = number of internal points in x = number of internal points in y
% construction of block 1
i=1;
ii=[1,1,1];jj=[i,i+1,i+n];uu=[4,-1,-1];
for i=2:n-1
ii=[ii,i,i,i,i];jj=[jj,i-1,i,i+1,i+n];
uu=[uu,-1,4,-1,-1];
end;
i=n;
ii=[ii,i,i,i];jj=[jj,i-1,i,i+n];uu=[uu,-1,4,-1];
% construction of blocks from 2 to n-1
for k=2:n-1
I=(k-1)*n;
i=I+1;
ii=[ii,i,i,i,i];jj=[jj,i-n,i,i+1,i+n];
uu=[uu, -1,4,-1,-1];
for i=I+2:I+n-1
ii=[ii,i,i,i,i,i];jj=[jj,i-n,i-1,i,i+1,i+n];
uu=[uu,-1,-1,4,-1,-1];
end;
i=I+n;
ii=[ii,i,i,i,i];jj=[jj,i-n,i-1,i,i+n];
uu=[uu,-1,-1,4,-1];
end;
% construction of block n
i=(n-1)*n+1;
ii=[ii,i,i,i];jj=[jj,i-n,i,i+1];
uu=[uu,-1,4,-1];
for i=n*(n-1)+2:n*n-1
ii=[ii,i,i,i,i];jj=[jj,i-n,i-1,i,i+1];
uu=[uu,-1,-1,4,-1];
end;
i=n*n;
ii=[ii,i,i,i];jj=[jj,i-n,i-1,i];uu=[uu,-1,-1,4];
uu=uu*(n+1)*(n+1);
A=sparse(ii,jj,uu);
Analysis of the error: the result of the instructions below is displayed
in Figure 6.4. The curve representing the logarithm of the error in
terms of the logarithm of N turns out to be a straight line whose
slope is approximately equal to −1.86. Theory predicts a slope equal
to −2.
>> for k=1:10
>>
n=5*k;
>>
sol=Solve2dLaplacianSparse(n,’RHS2’);
>>
xgrid=(1:n)/(n+1);ygrid=(1:n)/(n+1);
>>
xx=xgrid’*ones(1,n);yy=ones(n,1)*ygrid;
>>
exact=exactu2(xx,yy);
>>
err(k)=norm(exact(:)-sol(:),’inf’);
>> end;
>> plot(log(5*(1:10)),log(err),’-+’,’MarkerSize’,10,’LineWidth’,3)
6. a) Computing the spectrum of A
>> n=20;A=laplacian2dD(n);[P,D]=eig(A);Spectrum=diag(D);
b) Sorting the spectrum
Fig. 6.4. Convergence of the finite difference method: log(error) in terms of log(n).
>> [ArrangedSpec, I]=sort(Spectrum);
>> Nbre=4;
>> % Computing the first eigenvalues
>> vp=ArrangedSpec(1:Nbre);
>> % and corresponding eigenvectors
>> VectorsP(:,1:Nbre)=P(:,I(1:Nbre));
To represent one of the eigenvectors, for instance, the first one,
>> j=1;X1=reshape(VectorsP(:,j),n,n);x=(1:n)/(n+1);
>> surfc(x,x,X1)
The reader can check that the n² eigenvalues of A are precisely
λ_{i,j} = (4/h²) [ sin²(iπh/2) + sin²(jπh/2) ],   1 ≤ i, j ≤ N
(a numerical check of this formula is given at the end of this solution).
In Figure 6.5, we display iso-contours of the first four eigenvectors
(we should say eigenfunctions) of the discretized Laplacian in 10 × 10
points. These eigenfunctions oscillate more and more.
c) The function ϕ(x, y) = sin(αx) sin(βy) satisfies the boundary condition if and only if there exist two integers i and j such that α = iπ and β = jπ. We deduce that the eigenvalues of the continuous problem are
λ_{i,j} = (i² + j²)π²,   (i, j) ∈ N²,
associated with eigenfunctions
ϕi,j (x, y) = sin(iπx) sin(jπy).
Hence, the first four eigenvalues are:
• a simple eigenvalue λ_{1,1} = 2π² associated with the eigenfunction ϕ_{1,1}(x, y) = sin(πx) sin(πy). The surface displayed in Figure 6.5 (top left) is proportional to ϕ_{1,1};
Fig. 6.5. The first four (from left to right and from top to bottom) eigenfunctions
of the discretized Laplacian on a 20 × 20 regular grid.
• a double eigenvalue λ2,1 = λ1,2 = 5π 2 . The corresponding eigen
subspace is spanned by ϕ1,2 (x, y) = sin(πx) sin(2πy) and ϕ2,1 (x, y) =
sin(2πx) sin(πy). The surfaces displayed on Figure 6.5 (top right
and bottom left) are proportional to ϕ1,2 + ϕ2,1 and ϕ1,2 − ϕ2,1 ;
• a simple eigenvalue λ2,2 = 8π 2 associated with eigenfunction
ϕ2,2 (x, y) = sin(2πx) sin(2πy). The surface displayed in Figure 6.5
(bottom right) is proportional to −ϕ2,2 .
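As announced above, the discrete eigenvalue formula can be checked against eig (a small sketch; n is kept moderate so that the full eigenvalue computation remains cheap):
>> n=20;h=1/(n+1);A=laplacian2dD(n);
>> [i,j]=meshgrid(1:n,1:n);
>> lam=4/h^2*(sin(i*pi*h/2).^2+sin(j*pi*h/2).^2);
>> norm(sort(eig(A))-sort(lam(:)))     % agreement up to rounding errors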
Solution of Exercise 6.8
>> n=5;A=laplacian2dD(n);
>> [L,U]=lu(A);
>> figure(1);spy(L);figure(2);spy(U);
We experimentally check on Figure 6.6 that the LU factorization preserves the
band structure of matrices (which is rigorously proved in Proposition 6.2.1),
but it “fills” this band. There are more nonzero entries in L and U than
in the original matrix A. We observe the same behavior with the Cholesky
factorization.
Fig. 6.6. Matrices L (left) and U (right) of the LU factorization for the discrete
Laplacian matrix of Figure 6.2.
Solution of Exercise 6.10 Incomplete LU preconditioning.
1. It suffices to add tests to program LUfacto
function [L,U]=ILUfacto(A,tolerance)
% Incomplete LU factorization
%nargin = number of input arguments of the function
if nargin==1, tolerance=1.E-4;end; % default tolerance
[m,n]=size(A);
if m~=n, error(’the matrix is not square’), end;
small=1.e-12;
for k=1:n-1
if abs(A(k,k))<small, error(’error: zero pivot’), end;
for i=k+1:n
if abs(A(i,k)) > tolerance
A(i,k)=A(i,k)/A(k,k);
end
for j=k+1:n
if abs(A(i,j)) > tolerance
A(i,j)=A(i,j)-A(i,k)*A(k,j);
end;
end;
end;
end;
U=triu(A);L=A-U+diag(ones(n,1));
2. Since M = L̃Ũ is an approximation of A, a preconditioning of A is M −1 =
Ũ −1 L̃−1 .
>> A=laplacian2dD(10);[LI,UI]=ILUfacto(A);
>> [cond(A), cond(inv(UI)*inv(LI)*A)]
ans =
   48.3742    5.1277
7 Exercises of chapter 7
Solution of Exercise 7.1
1. To compute the minimal norm solution of the least square problem we
use the pseudo-inverse of A as follows:
>> x1=pinv(A)*b1;x2=pinv(A)*b2;
>> norm(A*x1-b1),norm(A*x2-b2)
ans =
2.9873e-015
ans =
2.7255
This shows that b1 belongs to the image of A, but not b2 .
2. All solutions are of the form x_i + x, where x ∈ Ker A.
>> X=null(A);d=size(X,2)     % dimension of the null space of A
d =
     2
>> u=X(:,1);v=X(:,2);        % searching other solutions
The columns of X form a basis of the null space. The solutions of the problem min_{x∈R³} ‖Ax − b_i‖_2 are the vectors x = x_i + αu + βv with α and β real numbers.
Solution of Exercise 7.2
1. >> A=reshape(1:6,3,2);A=[A eye(size(A)); -eye(size(A)) -A];
>> b0=[2 4 3 -2 -4 -3]';x0=pinv(A)*b0;
>> e=1.E-2;b=b0+e*rand(6,1);x=pinv(A)*b;
>> xerror=norm(x-x0)/norm(x0)
xerror =
    0.0034
>> berror=norm(b-b0)/norm(b0)
berror =
    0.0022
>> eta=norm(A)*norm(x0)/norm(A*x0);
>> costheta=norm(A*x0)/norm(b0);
>> Cb=cond(A)/eta/costheta
Cb =
    5.3982
The amplification coefficient is moderate since b0 belongs to the image of
A:
>> norm(A*x0-b0)
ans =
3.8202e-015
2. >> b1=[3 0 -2 -3 0 2]';x1=pinv(A)*b1;
>> e=1.E-2;b=b1+e*rand(6,1);x=pinv(A)*b;
>> [x1 x]                            % display the solutions
ans =
   -0.0000    0.0032
    0.0000   -0.0004
   -0.0000    0.0030
    0.0000   -0.0027
>> xerror=norm(x-x1)/norm(x1)
xerror =
   5.1718e+011
>> costheta=norm(A*x1)/norm(b1)
costheta =
   1.5053e-015
>> eta=norm(A)*norm(x1)/norm(A*x1);
>> costheta=norm(A*x1)/norm(b1);
>> Cb=cond(A)/eta/costheta           % coefficient C_b is infinite
Cb =
   7.2400e+014
This is due to the fact that b1 is orthogonal to the image of A. Indeed,
>> A’*b1
ans =
0
0
0
0
we infer that b1 ∈ Ker (At ) = ( Im A)⊥ .
3. >> for i=1:100;
>>    b2=b0*i/100+b1*(1-i/100);
>>    x2=pinv(A)*b2;
>>    e=1.E-2;b=b2+e*rand(6,1);x=pinv(A)*b;
>>    eta=norm(A)*norm(x2)/norm(A*x2);
>>    costheta=norm(A*x2)/norm(b2);
>>    coefCb(i)=cond(A)/eta/costheta;
>> end
>> xx=(1:100)/100;
>> plot(xx, coefCb,'-.','MarkerSize',10,'LineWidth',3)
As can be checked on Figure 7.1 the coefficient Cb decreases as the entry
of the right-hand side b, in the “direction” of ( Im A)⊥ , gets smaller.
Fig. 7.1. Least squares fitting method: amplification coefficient Cb .
Solution of Exercise 7.3
1. We compute the partial derivatives
∂E/∂a_k (a_1, . . . , a_n) = −2 ∫_0^1 ϕ_k(x) ( f(x) − Σ_{i=1}^n a_i ϕ_i(x) ) dx.
The matrix and the right-hand side of the system are thus defined by
A_{i,j} = ∫_0^1 ϕ_i(x)ϕ_j(x) dx,   b_i = ∫_0^1 f(x)ϕ_i(x) dx.
2. For the canonical basis of P_{n−1}, we have
A_{i,j} = 1/(i + j − 1),   b_i = ∫_0^1 x^{i−1} f(x) dx.
We recognize the Hilbert matrix of size n × n. Since this matrix is very ill-conditioned (see Exercise 5.9), solving the system Aa = b is delicate.
3. We define, on the set X of functions such that ∫_0^1 f²(x) dx < ∞, the following scalar product
⟨f, g⟩ = ∫_0^1 f(x) g(x) dx     (7.1)
which makes X a Hilbert space. The sought basis is simply obtained by applying the Gram–Schmidt orthonormalization procedure (in the sense of the scalar product (7.1)) to the canonical basis (x^i)_{i=0}^{n−1}.
For an orthonormal basis, A is just the identity! The computation of a is explicit since a_i = b_i. However, the coefficients b_i = ∫_0^1 f(x)ϕ_i(x) dx are usually computed through an approximate formula (or quadrature).
Solution of Exercise 7.4 Note that the symmetric matrix At A is positive
definite, since Ker (A) = {0}.
1. We obtain the following computational times:
a) % solution of the normal equations by Cholesky
>> tic;Apb=A’*b;B=chol(A’*A);x1=B\(B’\Apb);t1=toc
t1 =
0.0017
b) % solution of the system by the QR method
>> tic;[n,p]=size(A);[Q,R]=qr(A);
>> c=Q'*b;x2=R(1:p,1:p)\c(1:p);t2=toc   % calculation of the solution
t2 =
    0.0204
c) We write
min_{x∈R^n} ‖Ax − b‖₂ = min_{x∈R^n} ‖V ΣU^* x − b‖₂ = min_{x∈R^n} ‖ΣU^* x − V^* b‖₂ = min_{y∈R^n} ‖Σy − c‖₂,
setting y = U^* x and c = V^* b.
% solution of the system by the SVD method
>> tic;[n,p]=size(A);
[U,S,V]=svd(A);
% A = U SV t factorization
u=U’*b;x3=V*(u(1:p)./diag(S));t3=toc
t3 =
0.0409
In view of these results, we note that the Cholesky method is the fastest
and the SVD method is the slowest. The three methods give the same
minimum (as they should)
% computation of the minimum
norm(A*x1-b)
% or norm(ApA*x1-Apb)
ans =
7.6876
norm(A*x2-b)
% or norm(c(p+1:n))
ans =
7.6876
norm(A*x3-b)
ans =
7.6876
We can also compare the solutions
norm(x1-x2),norm(x1-x3),norm(x2-x3)
ans =
0.0000017
ans =
0.0000017
ans =
3.468E-13
The solution obtained by the Cholesky method seems to be slightly different from the other solutions.
2. >> e=1.e-5;P=[1 1 0;0 1 -1; 1 0 -1];
>> A=P*diag([e,1,1/e])*inv(P);b=ones(3,1);
>> Apb=A’*b;B=chol(A’*A);x1=B\(B’\Apb);
>> [n,p]=size(A);[Q,R]=qr(A);c=Q’*b;
>> x2=R(1:p,1:p)\c(1:p);
>> [x1 x2]
ans =
   1.0e+04 *
    0.0004    5.0001
    0.0001    0.0001
    0.0003    5.0000
The two solutions are completely different. The correct solution is the one given by the QR method, as we now check:
>> [norm(A*x1-b) norm(A*x2-b)]
ans =
    0.5773    0.0000
Explanation: the matrix A is ill-conditioned cond(A)=1.5000e+10 and
At A is even more so: cond(A’*A)=2.0118e+16. However, this latter matrix is used in solving the normal equations by the Cholesky method.
Conclusion: it is better to use the QR method.
8 Exercises of chapter 8
Solution of Exercise 8.1
A=[1 2 2 1;-1 2 1 0;0 1 -2 2;1 2 1 2];b=ones(4,1);det(A)
ans =
- 4.
This matrix A is thus nonsingular.
1. We recognize the Jacobi method.
>> A=[1 2 2 1;-1 2 1 0;0 1 -2 2;1 2 1 2];b=ones(4,1);
>> M=diag(diag(A));N=M-A; B=inv(M)*N;c=inv(M)*b;
>> x0=ones(4,1);x=x0;
>> for i=1:300, x=B*x+c; if ~(mod(i,100)) , x’,end; end;
ans =
   1.0e+14 *
   -3.2751    0.6894    0.9632   -2.0558
ans =
   1.0e+28 *
   -2.1683   -1.9569   -1.4680   -0.0939
ans =
   1.0e+42 *
   -1.4909    3.7985    6.4370   -0.8940
The sequence seems to diverge. And indeed, it diverges since the spectral radius of B is larger than 1.
>> max(abs(eig(B)))
ans =
1.3846
2. We recognize the Gauss-Seidel method. Same observations.
3. >> M=2*tril(A);N=M-A; B=inv(M)*N;c=inv(M)*b;
>> x0=ones(4,1);x=x0;
>> for i=1:300, x=B*x+c; if ~(mod(i,100)), x’,end; end;
ans =
    0.5000    1.0000   -0.5000   -0.5000
ans =
    0.5000    1.0000   -0.5000   -0.5000
ans =
    0.5000    1.0000   -0.5000   -0.5000
The sequence converges
>> A*ans’
ans =
1.0000
1.0000
1.0000
1.0000
Explanation: this time the spectral radius of B is less than 1.
>> max(abs(eig(B)))
ans =
0.8895
Solution of Exercise 8.2 The convergence of the Jacobi method is checked
by the following program.
function cvg=JacobiCvg(A)
D=diag(A);
if min(abs(D)) <1.E-12
error(’Jacobi is not definite’);
end;
M=diag(D);N=M-A; B=inv(M)*N;
% B is also defined by
if max(abs(eig(B))) > 1
% B=eye(size(A))-inv(M)*A
cvg=0;
else
cvg=1;
end;
The convergence of the Gauss-Seidel method is checked by the following program.
function cvg=GaussSeidelCvg(A)
D=diag(A);
if min(abs(D)) <1.E-12
error(’Gauss-Seidel is not definite’);
end;
M=tril(A);N=M-A; B=inv(M)*N;
if max(abs(eig(B))) > 1
cvg=0;
else
cvg=1;
end;
For the matrix A1 the Gauss-Seidel method converges, but the Jacobi method
diverges:
>> A1=[1 2 3 4;4 5 6 7;4 3 2 0;0 2 3 4];
>> JacobiCvg(A1), GaussSeidelCvg(A1)
ans =
0
ans =
1
For the matrix A2 , it is the Jacobi method that converges and the Gauss-Seidel
method that diverges:
>> A2=[2 4 -4 1;2 2 2 0;2 2 1 0;2 0 0 2];
>> JacobiCvg(A2), GaussSeidelCvg(A2)
ans =
1.
ans =
0.
Conclusion: we cannot compare these two methods for an arbitrary matrix.
However, for tridiagonal matrices we have Theorem 8.3.1.
Solution of Exercise 8.3 Numerical experiments show that the Jacobi and Gauss-Seidel methods always converge for this type of matrix. Let us prove that these methods actually converge for any strictly diagonally dominant matrix (a numerical illustration is given after the proof).
• The characteristic polynomial of the Jacobi matrix is
p_J(λ) = det[D^{-1}(E + F) − λI] = −det(D^{-1}) det(λD − E − F).
If A = D − E − F is strictly diagonally dominant then, for all λ ∈ C such that |λ| > 1, the matrix λD − E − F is also strictly diagonally dominant and is thus nonsingular: det(λD − E − F) ≠ 0. Hence, there exists no λ ∈ C with |λ| > 1 such that p_J(λ) = 0, which proves that ρ(J) < 1.
• Apply the same reasoning to the Gauss-Seidel method, while noting that
p_{G1}(λ) = −det(D^{-1}) det(λD − λE − F).
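A small numerical sketch of this result (adding n to the diagonal of a random matrix makes it strictly diagonally dominant):
>> n=6;A=rand(n,n)+n*eye(n,n);
>> D=diag(diag(A));J=eye(n,n)-inv(D)*A;      % Jacobi iteration matrix
>> G=eye(n,n)-inv(tril(A))*A;                % Gauss-Seidel iteration matrix
>> [max(abs(eig(J))) max(abs(eig(G)))]       % both spectral radii are < 1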
Solution of Exercise 8.4
>> A=[5 1 1 1;0 4 -1 1;2 1 5 1; -2 1 0 4];b=[8 4 9 3]’;
>> sol=A\b;M1=[3 0 0 0;0 3 0 0;2 1 3 0;-2 1 0 4];
>> N=M1-A;B1=inv(M1)*N;c=inv(M1)*b;
>> x=zeros(4,1);for i=1:20, x=B1*x+c;end;norm(x-sol)
ans =
0.0522
>> M2=[4 0 0 0;0 4 0 0;2 1 4 0;-2 1 0 4];N=M2-A;B2=inv(M2)*N;
>> c=inv(M2)*b;x=zeros(4,1);for i=1:20, x=B2*x+c;end;norm(x-sol)
ans =
3.5548e-09
The second method converges faster to the solution. Explanation:
>> [max(abs(eig(B1))), max(abs(eig(B2)))]
ans =
0.8234
0.3560
The spectral radius of the matrix of the second method is smaller than that
of the matrix of the first method.
Solution of Exercise 8.5 Example of implementation of the Jacobi method.
function [x, iter]=Jacobi(A,b,tol,MaxIter,x)
% Computes by the Jacobi method the solution of Ax=b
% tol = ε of the termination criterion
% MaxIter = maximum number of iterations
% x = x0
[m,n]=size(A);
if m~=n, error(’the matrix is not square’), end;
if abs(det(A)) < 1.e-12
error(’the matrix is singular’)
end;
if ~(JacobiCvg(A)) , error(’Jacobi will not converge’); end;
% nargin = number of input arguments of the function
% default values of the arguments
if nargin==4 , x=zeros(size(b));end;
if nargin==3 , x=zeros(size(b));MaxIter=200;end;
if nargin==2 , x=zeros(size(b));MaxIter=200;tol=1.e-4;end;
M=diag(diag(A));
% Initialization
iter=0;r=b-A*x;
% Iterations
while (norm(r)>tol)&(iter<MaxIter)
y=M\r;
x=x+y;
r=r-A*y;
iter=iter+1;
end;
We obtain the following results
>> n=20;A=laplacian1dD(n);xx=(1:n)'/(n+1);b=xx.*sin(xx);sol=A\b;
>> tol =1.e-2;x=Jacobi(A,b,tol,1000);norm(x-sol), norm(inv(A))*tol
ans =
0.0010
ans =
0.0010
>> tol =1.e-3;x=Jacobi(A,b,tol,1000);norm(x-sol), norm(inv(A))*tol
ans =
1.0148e-04
ans =
1.0151e-04
>> tol =1.e-4;x=Jacobi(A,b,tol,1000);norm(x-sol), norm(inv(A))*tol
ans =
1.0148e-05
ans =
1.0151e-05
which show, on the one hand, that we can approximate the exact solution very
accurately, provided tol is small enough and MaxIter is large enough, and on
the other hand that the upper bound (8.6) is very sharp. Indeed, at termination
‖r‖ ≤ tol and x − sol = −A⁻¹r, so ‖x − sol‖ ≤ ‖A⁻¹‖ tol.
Solution of Exercise 8.7
1. We start by programming the method.
function [x, iter]=RelaxJacobi(A,b,w,tol,MaxIter,x)
[m,n]=size(A);
if m~=n, error('the matrix is not square'), end;
if abs(det(A)) < 1.e-12
error(’the matrix is singular’)
end;
if ~w, error(’omega=0’);end;
% nargin = number of input arguments of the function
% Default values of the arguments
if nargin==5 , x=zeros(size(b));end;
if nargin==4 , x=zeros(size(b));MaxIter=200;end;
if nargin==3 , x=zeros(size(b));MaxIter=200;tol=1.e-4;end;
if nargin==2 , x=zeros(size(b));MaxIter=200;tol=1.e-4;w=1;end;
D=diag(diag(A));
M=D/w;
% Initialization
iter=0;r=b-A*x;
% Iterations
while (norm(r)>tol)&(iter<MaxIter)
y=M\r;
x=x+y;
r=r-A*y;
iter=iter+1;
end;
if isnan(norm(r))
% this test is justified later
iter=MaxIter;
end
Figure 8.1 is obtained by the following script
>> n=10;A=laplacian1dD(n);xx=(1:n)'/(n+1);b=xx.*sin(xx);sol=A\b;
>> pas=0.1;
>> for i=1:20
>> omega(i)=i*pas;
>> [x, iter]=RelaxJacobi(A,b,omega(i),1.e-4,1000,zeros(size(b)));
>> itera(i)=iter;
>> end;
>> plot(omega,itera,'-+','MarkerSize',10,'LineWidth',3)
>> grid on
>> set(gca,'XTick',0:.4:2,'YTick',200:200:1000,'FontSize',24);
Fig. 8.1. Relaxation of the Jacobi method: number of iterations in terms of ω.
It seems that the optimal value is close to ω = 1, i.e., the Jacobi method
is optimal. For ω = 0.3 and ω = 1.7 we have
>> [x, iter]=RelaxJacobi(A,b,.3,1.e-4,1000,zeros(size(b)));
>> [iter, norm(x), norm(A*x-b)]
ans =
737.0000
0.0847
0.0001
>> [x, iter]=RelaxJacobi(A,b,1.7,1.e-4,1000,zeros(size(b)));
>> [iter, norm(x), norm(A*x-b)]
ans =
1000
NaN
NaN
For ω = 0.3 the computed solution is a good approximation of the exact
solution, but this is not the case for ω = 1.7: the algorithm has in fact diverged,
since it computes absurdly large values, of order 10^305. Note that in computing
‖Ax − b‖₂, Matlab returns the value NaN, which means “not a number”. This
justifies the test placed at the end of the function RelaxJacobi, which checks
that the computed values are real numbers and not NaN.
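A minimal illustration of this behaviour (a sketch):
>> big=1e200^2       % the square overflows to Inf
>> big-big           % Inf - Inf gives NaN ("not a number")
>> isnan(big-big)    % the test used in RelaxJacobi returns 1 (true)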
2. Theoretical analysis.
a) The Jacobi matrix is J = D⁻¹(E + F) and the relaxed Jacobi matrix is
Jω = (D/ω)⁻¹ [ ((1 − ω)/ω) D + E + F ] = D⁻¹ [ (1 − ω)D + ωE + ωF ].
They are linked by the relation
Jω = ωJ + (1 − ω)I.
If µi is an eigenvalue of J , then ωµi + (1 − ω) is an eigenvalue of Jω .
The tridiagonal matrix A being symmetric and positive definite, we
know that the Jacobi method converges and that the eigenvalues of
J are real. We denote these eigenvalues by µi :
−1 < µ1 ≤ µ2 ≤ . . . ≤ µn < 1.
Furthermore, if µi ∈ σ(J) then −µi ∈ σ(J). In particular, −µ1 = µn = ρ(J).
b) The relaxed Jacobi method converges for all values of ω such that
|ωµi + (1 − ω)| < 1,   ∀µi ∈ σ(J),
i.e.,
−1 < (1 − µi)ω − 1 < 1,   ∀µi ∈ σ(J).
From this we deduce that the method converges if ω ∈ ]0, 2/(1 − µi)[
for all µi ∈ σ(J), which is true if and only if
ω ∈ I = ]0, 2/(1 + ρ(J))[.
c) Figure 8.2 shows the curves ω ↦ |ωµi + (1 − ω)| for different values of µi.
We infer that
ρ(Jω) = (µn − 1)ω + 1  if ω ∈ ]0, ω̄],
ρ(Jω) = (1 − µ1)ω − 1  if ω ≥ ω̄,
where ω̄ is defined by the intersection of the two extremal curves
corresponding to µ1 and µn:
(1 − µ1)ω̄ − 1 = (µn − 1)ω̄ + 1,
that is,
ω̄ = 2/(2 − (µ1 + µn)).
But since µ1 = −µn, we have ω̄ = 1 and Jω̄ = J: we retrieve the Jacobi method.
Conclusion: there is no benefit in relaxing the Jacobi method (at least
for a symmetric positive definite tridiagonal matrix), contrary
to the Gauss-Seidel method.
Fig. 8.2. Relaxation of the Jacobi method: ω ↦ ρ(Jω).
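This can be checked numerically (a sketch, assuming as before that laplacian1dD returns the tridiagonal 1D Laplacian):
>> n=10;A=laplacian1dD(n);D=diag(diag(A));
>> J=eye(n)-D\A;                        % Jacobi matrix J = D^{-1}(E+F)
>> omega=0.1:0.1:1.9;rho=zeros(size(omega));
>> for i=1:length(omega)
>> rho(i)=max(abs(eig(omega(i)*J+(1-omega(i))*eye(n))));   % rho(J_omega)
>> end;
>> [rhomin,imin]=min(rho);omega(imin)   % the minimum over the grid is attained at omega = 1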
Solution of Exercise 8.8
1. If the sequence (xk)k is stationary, i.e., xk = x, it is natural to require that
the sequence (x′j)j be the same, since we cannot improve the convergence
in this case.
2. According to (8.14), we have
e′_{j+1} = ∑_{k=0}^{j} αjk (xk − x) = ∑_{k=0}^{j} αjk ek.
Since ek = B^k e0, this yields the result.
3. The matrix B being normal, there exists a unitary matrix U such that
B = U D U*, where D is a diagonal matrix made up of the eigenvalues
of B. Noting that B^k = U D^k U* and that the 2-norm is invariant under unitary
transformations, we have
‖e′_{j+1}‖₂ ≤ ‖pj(B)‖₂ ‖e0‖₂ = ‖U pj(D) U*‖₂ ‖e0‖₂ = ‖pj(D)‖₂ ‖e0‖₂.
Inequality (8.17) follows from the fact that the 2-norm of a diagonal matrix
is equal to its largest entry in modulus; see Exercise 3.3.
4. Since the iterative method generating the sequence xk converges, the spectral radius of B is strictly smaller than 1.
5. We can apply Proposition 9.5.3 with a = −b = −α and β = 1 ∉ [−α, α].
Thus, the unique solution of problem (8.18) is
pj(λ) = Tj(λ/α) / Tj(1/α),
where Tj denotes the Chebyshev polynomial of degree j.
6. We set βj = Tj(1/α). Let us remark right away that βj cannot vanish,
since 1/α > 1 and all zeros of Tj are included in [−1, 1]. According to
relation (9.12), we have
β_{j+1} p_{j+1}(λ) = (2βj/α) λ pj(λ) − β_{j−1} p_{j−1}(λ),
β_{j+1} p_{j+1}(B) = (2βj/α) B pj(B) − β_{j−1} p_{j−1}(B),
β_{j+1} (x′_{j+1} − x) = (2βj/α) B(x′_j − x) − β_{j−1} (x′_{j−1} − x).
Since Bx = x − c and β_{j+1} = −β_{j−1} + 2βj/α, we obtain the desired
recurrence:
β_{j+1} x′_{j+1} = (2βj/α) (B x′_j + c − x′_{j−1}) + β_{j+1} x′_{j−1},
which is nothing but relation (8.19) with
µj = 2βj/(α β_{j+1}) = 1 + β_{j−1}/β_{j+1}.    (8.1)
Note that µj ≠ 0 for all j.
7. Since T0(t) = 1, T1(t) = t and T2(t) = 2t² − 1, we have
µ0 = 2T0(1/α)/(α T1(1/α)) = 2,    µ1 = 2T1(1/α)/(α T2(1/α)) = 2/(2 − α²).
To get a recurrence relation, we write
1/µ_{j+1} = 1/(1 + βj/β_{j+2}) = β_{j+2}/(β_{j+2} + βj)
          = 1 − βj/(β_{j+2} + βj) = 1 − βj · α/(2β_{j+1})
          = 1 − (α²/4) µj.
We can therefore compute the sequence (µj)j by
µ_{j+1} = 1/(1 − α²µj/4).
Let us show by induction on j that µj ∈ ]1, 2[. It is obviously true for
j = 1. Assume that µj ∈ ]1, 2[. Then we have
−α²/2 < −(α²/4) µj < −α²/4,
and since α ∈ ]−1, 1], we deduce
−1/2 < −(α²/4) µj < 0,
and by the recurrence relation
1/2 < 1/µ_{j+1} < 1,
that is, µ_{j+1} ∈ ]1, 2[.
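A quick check of this recurrence against the definition µj = 2βj/(αβ_{j+1}), βj = Tj(1/α) (a sketch; the value α = 0.9 is arbitrary):
>> alpha=0.9; t=1/alpha;
>> T=zeros(1,11); T(1)=1; T(2)=t;            % T(k+1) stores T_k(1/alpha)
>> for k=2:10, T(k+1)=2*t*T(k)-T(k-1); end   % Chebyshev three-term recurrence
>> muDef=2*T(1:10)./(alpha*T(2:11));         % mu_0,...,mu_9 from the definition
>> muRec=zeros(1,10); muRec(1)=2;            % mu_0 = 2
>> for j=1:9, muRec(j+1)=1/(1-alpha^2*muRec(j)/4); end
>> norm(muDef-muRec)                         % should be at round-off level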
8. Here is a script for the accelerated Jacobi method; the results are displayed
on Figure 8.3.
>> n=10;alpha=.97;h=1/(n+1);
>> xx=h*(1:n)’;zz=cos(xx);
>> b=[];for i=1:n, b=[b;zz*sin(i*h)];end;
>> A=laplacian2dD(n);M=diag(diag(A));
>> N=M-A; B=inv(M)*N;c=inv(M)*b;sol=A\b;
% first two computations by Jacobi
>> x0=rand(size(b));x1=c;xJ=x1;Niter=50;
>> iter=1;a=alpha*alpha/4;para=2/(2-alpha*alpha);
>> while iter<=Niter
>> xJ=B*xJ+c;
>> para=1/(1-a*para);
>> xJT=para*(B*x1+c)+(1-para)*x0;
>> Jerror(iter)=norm(xJ-sol);
>> JTerror(iter)=norm(xJT-sol);
>> x0=x1;x1=xJT;iter=iter+1;
>> end;
>> z=(1:Niter)’;plot(z, log(Jerror),’-o’,z,log(JTerror),’-+’, ...
’MarkerSize’,10,’LineWidth’,3);
grid on
set(gca,’XTick’,0:10:50,’YTick’,-10:2:2,’FontSize’,24);
axis([0 50 -10 2])
text(35,-2,’Jacobi’,’FontSize’,24)
text(20,-8,’Accelerated Jacobi’,’FontSize’,24)
Fig. 8.3. Error of the Jacobi and accelerated Jacobi methods in terms of the iteration number.
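A rough quantitative reading of Figure 8.3 (a sketch): combining (8.17) with the optimal polynomial of question 5, the accelerated error after j iterations is bounded, up to the factor ‖e0‖₂ and the choice α = 0.97, by 1/Tj(1/α), to be compared with roughly α^j for plain Jacobi.
>> alpha=0.97; j=50;
>> [1/cosh(j*acosh(1/alpha)), alpha^j]   % T_j(1/alpha) = cosh(j*acosh(1/alpha)) for 1/alpha > 1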
9
Exercises of chapter 9
Solution of Exercise 9.1
1. function [x, iter]=GradientS(A,b,tol,alpha,MaxIter,x)
% Gradient method for solving system Ax = b
% tol = ε termination criterion
% MaxIter = maximal number of iterations
% x = x0
% nargin = number of input arguments of the function
% Default values of the arguments
if nargin==5 , x=zeros(size(b));end;
if nargin==4 , x=zeros(size(b));MaxIter=2000;end;
if nargin==3 , x=zeros(size(b));MaxIter=2000;alpha=1.e-4;end;
if nargin==2
x=zeros(size(b));MaxIter=2000;alpha=1.e-4;tol=1.e-4;
end;
% Initialization
iter=0;r=b-A*x;
% Iterations
while (norm(r)>tol)&(iter<MaxIter)
x=x+alpha*r;
r=b-A*x;
iter=iter+1;
end;
2. We run the previous function and check that its results are correct
>> n=10;A=laplacian1dD(n);xx=(1:n)’/(n+1);b=xx.*sin(xx);
>> [x, iter]=GradientS(A,b,1.e-4,1.e-4,10000);
>> norm(x-A\b), iter
ans =
1.0194e-05
iter =
9180
Fig. 9.1. Number of iterations in terms of α.
The algorithm converged in 9180 iterations.
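A rough sanity check of this iteration count (a sketch): with a fixed step α the residual satisfies r_{k+1} = (I − αA)r_k, so it shrinks at each step by a factor of at most ρ = max_i |1 − αλ_i(A)|, and about log(tol/‖r0‖)/log(ρ) iterations are needed to reach the tolerance.
>> spa=eig(A);rho=max(abs(1-1.e-4*spa));
>> ceil(log(1.e-4/norm(b))/log(rho))     % of the same order as the observed count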
3. The following script yields Figure 9.1.
>> pas=1.e-5;npas=100;
>> for i=1:npas;
>> alpha(i)=32*1.e-4+i*pas;
>> [x, iter]=GradientS(A,b,1.e-10,alpha(i),2000);
>> niter(i)=iter;
>> end;
>> plot(alpha,niter,'-+','MarkerSize',5,'LineWidth',3)
Zooming in on this figure shows that the optimal value of α is close to
415·10⁻⁵. The theoretical value given by Theorem 9.1.1 is
>> spa=eig(A);alphaOPT=2/(min(spa)+max(spa))
alphaOPT =
0.0041322
The two values agree well.
Solution of Exercise 9.2
1. By Lemma 9.3.1,
αk = ‖∇f(xk)‖² / ⟨A∇f(xk), ∇f(xk)⟩.
2. function [x, iter]=GradientV(A,b,tol,MaxIter,x)
% Computes by the variable step gradient method
% the solution of system Ax = b
% tol = ε termination criterion
% MaxIter = maximal number of iterations
% x = x0
% nargin = number of input arguments of the function
% Default values of the arguments
if nargin==4 , x=zeros(size(b));end;
if nargin==3 , x=zeros(size(b));MaxIter=2000;end;
if nargin==2 , x=zeros(size(b));MaxIter=2000;tol=1.e-4;end;
% Initialization
iter=0;r=b-A*x;tol2=tol*tol;normr2=r’*r;
% Iterations
while (normr2>tol2)&(iter<MaxIter)
alpha=normr2/(r’*A*r);
x=x+alpha*r;
r=b-A*x;
normr2=r’*r;
iter=iter+1;
end;
3. The variable step gradient method converges slightly faster than the fixed
step gradient method with the optimal step. The result is different when
we do not know the optimal step. For instance, changing the dimension n
and keeping the same parameter alphaOPT, we get
>> n=8;A=laplacian1dD(n);xx=(1:n)’/(n+1);b=xx.*sin(xx);
>> [xG, iterG]=GradientS(A,b,1.e-4,alphaOPT,10000);
>> [xGV, iterGV]=GradientV(A,b,1.e-4,10000);
>> [iterG, iterGV]
ans =
   216   144
Solution of Exercise 9.4
function [x, iter]=GradientCP(A,b,tol,MaxIter,x)
% Computes by the conjugate gradient method
% the solution of Ax = b
% Preconditioner: SSOR
% tol = ε termination criterion
% MaxIter = maximal number of iterations
% x = x0
% nargin = number of input arguments of the function
% Default values of the arguments
if nargin==4 , x=zeros(size(b));end;
if nargin==3 , x=zeros(size(b));MaxIter=2000;end;
if nargin==2 , x=zeros(size(b));MaxIter=2000;tol=1.e-4;end;
%preconditioning matrix
n=size(b,1);
omega=2*(1-pi/n);
D=diag(A);E=-tril(A)+diag(D);E=diag(D)/omega-E;D=diag((1)./D);
C=E*D*E’*omega/(2-omega);
% initialization
iter=0;r=b-A*x;tol2=tol*tol;normr2=r’*r;z=C\r;p=z;
% Iterations
while (normr2>tol2)&(iter<MaxIter)
Ap=A*p;
tmp=r’*z;
alpha=tmp/(p’*Ap);
x=x+alpha*p;
r=r-alpha*Ap;
z=C\r;
beta=r’*z/tmp;
p=z+beta*p;
normr2=r’*r;
iter=iter+1;
end;
We modify programs GradientC and GradientCP so that they return a vector
containing the norms of the errors during the iterations.
>> n=50;A=laplacian1dD(n);xx=(1:n)'/(n+1);b=xx.*sin(xx);
>> [xCG, CGiterations ]=ModifiedGradientC(A,b,1.e-10,50);
>> [xPCG,PCGiterations]=ModifiedGradientCP(A,b,1.e-10,50);
>> z=(1:50)'; plot(z,log(CGiterations),z, ...
log(PCGiterations),'+','MarkerSize',10,'LineWidth',3)
Fig. 9.2. Convergence of the conjugate gradient (continuous line) and the preconditioned
conjugate gradient (+) methods.
Figure 9.2 shows that the preconditioned conjugate gradient converges much
faster than the conjugate gradient. For instance, to achieve an error smaller
than 10⁻⁹, the preconditioned conjugate gradient takes only 4 iterations, while
45 iterations are necessary for the conjugate gradient.
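The speed-up can be related to the spectral condition number, which governs the convergence rate of the conjugate gradient method (a sketch, rebuilding the same SSOR matrix C as in GradientCP above for the matrix laplacian1dD(50)):
>> n=50;A=laplacian1dD(n);
>> omega=2*(1-pi/n);
>> D=diag(A);E=-tril(A)+diag(D);E=diag(D)/omega-E;D=diag((1)./D);
>> C=E*D*E'*omega/(2-omega);
>> lam=sort(real(eig(C\A)));                    % eigenvalues of the preconditioned matrix
>> [max(eig(A))/min(eig(A)), lam(end)/lam(1)]   % the second ratio is much smaller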
10
Exercises of chapter 10
Solution of Exercise 10.1 It is clear that the spectrum of W (n) consists
of integers 1, 2, . . . , n. We compare the spectra of both matrices
%Wilkinson matrix
>> n=20;u=[n:-1:1];v=n*ones(n-1,1);W=diag(u)+diag(v,1);
>> W(n,1)=1.e-10;    % perturbation
>> SW=eig(W);[(1:n)’ SW]
ans =
    1.0000    0.9958
    2.0000    2.1092
    3.0000    2.5749
    4.0000    3.9653 + 1.0877i
    5.0000    3.9653 - 1.0877i
    6.0000    5.8940 + 1.9485i
    7.0000    5.8940 - 1.9485i
    8.0000    8.1181 + 2.5292i
    9.0000    8.1181 - 2.5292i
   10.0000   10.5000 + 2.7334i
   11.0000   10.5000 - 2.7334i
   12.0000   12.8819 + 2.5292i
   13.0000   12.8819 - 2.5292i
   14.0000   15.1060 + 1.9485i
   15.0000   15.1060 - 1.9485i
   16.0000   20.0042
   17.0000   17.0347 + 1.0877i
   18.0000   17.0347 - 1.0877i
   19.0000   18.4251
   20.0000   18.8908
A small perturbation of the entries of the matrix produces large changes
in its eigenvalues; thus the matrix is ill-conditioned for the computation of
eigenvalues.
Solution of Exercise 10.2
>> A=[7.94 5.61 4.29;5.61 -3.28 -2.97;4.29 -2.97 -2.62];
>> b=[1;1;1];x=A\b;S=eig(A);
>> A1=A+0.01*triu(ones(3,3));x1=A1\b;S1=eig(A1);
>> [x x1]    % Solutions of the linear systems
ans =
    0.4438    0.3827
   -5.3607   -4.2646
    6.4219    5.0987
>> [S S1]    % Spectra of the matrices
ans =
   -8.8900   10.9257
    0.0200   -8.8808
   10.9100    0.0251
The spectrum has changed relatively little, whereas the solutions of the two linear
systems differ markedly. Explanation:
>> cond(A)
ans =
545.5000
and since A is symmetric, we have Γ2 (A) = 1 (see definition 10.2.1). The
matrix A is well-conditioned for computing eigenvalues, but not for solving a
linear system.
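This is consistent with the standard first-order perturbation bound for linear systems, ‖x1 − x‖/‖x‖ ≲ cond(A)·‖A1 − A‖/‖A‖ (a sketch, reusing the variables computed above):
>> [norm(x1-x)/norm(x), cond(A)*norm(A1-A)/norm(A)]   % observed relative change vs. the first-order bound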
Solution of Exercise 10.3
1. λ ∈ σ(A) ⇐⇒ A − λIn is a singular matrix. The adjoint matrix (A −
λIn)* = A* − λ̄In is then also singular, which proves the result.
2. Let y be an eigenvector of A∗ corresponding to λ̄: A∗ y = λ̄y. Taking the
adjoint in this equality yields y ∗ A = λy ∗ , which means that y is a left
eigenvector of A corresponding to the eigenvalue λ.
3. If A is hermitian, its eigenvalues are real and y ∗ A = λy ∗ ⇐⇒ Ay = λy.
4. Let x be an eigenvector associated with µ and y a left eigenvector associated with λ. We write
λy ∗ x = (λy ∗ )x = (y ∗ A)x = y ∗ (Ax) = µy ∗ x,
and if λ ≠ µ, we indeed have y*x = 0.
5. >> A =[1 -2 -2 -2; -4 0 -2 -4; 1 2 4 2; 3 1 1 5];
>> [X,D]=eig(A’); % eigenvectors of A∗ = left eigenvectors of A
>> X'*A-D*X'     % Checking
ans =
   1.0e-15 *
    0.1110   -0.3331   -0.4441   -0.1110
   -0.6661   -0.2220    0.4441    0.6661
   -0.4006    0.8882    0.6661   -0.4441
    0.8882    0.4441   -0.1899    0.9470
Solution of Exercise 10.4 We assume A is a diagonalizable matrix.
1. Let q1, . . . , qn be the columns of the matrix Q. From the relation Q*A =
diag(λ1, . . . , λn)Q* we infer that
qk* A = λk qk*,
that is, the vectors qk are left eigenvectors corresponding to the eigenvalues λk.
2. Let p1 , . . . , pn be the columns of matrix P . From relation Q∗ P = In , we
deduce equalities
qj∗ pk = δj,k .
If y and x are associated with the same simple eigenvalue λi , y ∗ x is proportional to qi∗ pi = 1.
Note: it is not necessary to assume that the matrix A is diagonalizable; it suffices
that the eigenvalue λi be simple for the vectors x and y not to be orthogonal.
Solution of Exercise 10.5
1. Computing eigenvalues of matrix A:
>> A=[-97 100 98; 1 2 -1; -100 100 101];
>> eig(A)’
ans =
    1.0000    2.0000    3.0000
For different perturbations we get the following spectra
>> res=eig(A)’;for i=1:5,B=A+0.01*rand(3,3);res=[res;eig(B)’];end;
>> res
res =
    1.0000    2.0000    3.0000
    0.2174    2.7958    3.0081
    0.3628    2.6614    2.9955
    0.2926    2.7306    2.9954
    0.4297    2.5768    3.0070
    0.4319    2.5848    2.9993
It is eigenvalue 3 that has been the least changed. The two other eigenvalues seem to be more sensitive to perturbations of the matrix.
2.
a) The mapping λ : ε ↦ λε is continuous, since the eigenvalues of a matrix
depend continuously on the entries of this matrix.
b) Taking the difference between the two equalities (A0 + εE)xε = λε xε
and A0 x0 = λ0 x0, we obtain the sought relation:
A0(δx) + εExε = (δλ)xε + λ0(δx).
c) Taking the scalar product of the above equality with y0 eliminates the
terms in δx, since by definition y0* A0 = λ0 y0*. We get
ε y0* E xε = (δλ) y0* xε,
i.e.,
((λε − λ0)/ε) y0* xε = y0* E xε.
Taking the limit as ε tends to 0,
λ′(0) = (y0* E x0)/(y0* x0).
d) In the neighbourhood of ε = 0, the slope of the curve ε ↦ λε is bounded
from above by
|λ′(0)| ≤ |y0* E x0| / |y0* x0| ≤ ‖y0‖₂ ‖E‖₂ ‖x0‖₂ / |y0* x0| = 1/|y0* x0|.
We deduce that the larger the possible variations of λ0, the larger the
quantity 1/|y0* x0|, which justifies the definition cond(A, λ0) = 1/|y0* x0|.
3. Computing the respective conditionings of the three eigenvalues:
>> [X,D]=eig(A);     % X = right eigenvectors
>> [Y,E]=eig(A');    % Y = left eigenvectors
>> [diag(D)';diag(E)']   % check that the eigenvalues are in the same order
ans =
    1.0000    2.0000    3.0000
    1.0000    2.0000    3.0000
>> fprintf('Conditioning of eigenvalues \n')
Conditioning of eigenvalues
>> for i=1:3
>> x=X(:,i)/norm(X(:,i));
>> y=Y(:,i)/norm(Y(:,i));
>> fprintf('ev = %f,   cond = %f\n',D(i,i),1/abs(y'*x))
>> end;
ev = 1.000000,   cond = 244.135208
ev = 2.000000,   cond = 244.955098
ev = 3.000000,   cond = 2.000000
Eigenvalue 3 is the best conditioned, which explains the observations of
question 1.
Solution of Exercise 10.7
function l=DefPower(A,ev)
% Power method and deflation technique
converge=0;eps=1.e-6;
iter=0;IterMax=100;evn=ev/norm(ev)/norm(ev);
n=size(A,1);x0=ones(n,1)/sqrt(n);
% beginning of iterations
while (iter<IterMax)&(~converge)
u=A*x0;
x=u/norm(u);
x=x-(x’*evn)*ev;
% we orthogonalize with respect to ev
converge=norm(x-x0)<eps;
x0=x;iter=iter+1;
end
l=norm(u);
We compute first the largest (in modulus) eigenvalue of A and a corresponding
eigenvector:
>> A=[2 1 2 2 2;1 2 1 2 2;2 1 2 1 2;2 2 1 2 1;2 2 2 1 2];
>> [l5,u5]=powerD(A);
then the second
>> l4=DefPower(A,u5);
Let us compare with the spectrum of A
>> eig(A)'
ans =
   -1.0000   -0.1388    1.0000    1.7091    8.4297
>> [l5 l4]
ans =
    8.4297    1.7091
Solution of Exercise 10.8
function [l,u]=powerI(A,x0)
% Computes by the inverse power method
% l = approximation of |λ1 |
% Initialization
n=size(A,1);
if nargin<2, x0=ones(n,1)/sqrt(n); end   % default initial vector x0
converge=0;eps=1.e-6;
iter=0;IterMax=3;
% beginning of the iterations
while (iter<IterMax)&(~converge)
u=A\x0;
x=u/norm(u);
converge=norm(x-x0)<eps;
x0=x;iter=iter+1;
end
l=1/norm(u);
Performing 3 iterations of function powerI, we get
>> A=laplacian1dD(100);[l,u]=powerI(A);l
l =
9.8689
Using Matlab function eig, we obtain
>> min(eig(A))
ans =
9.8688
Solution of Exercise 10.9 The function house, called hereafter, computes the
Householder matrix of a vector (see Exercise 7.5).
function T=householderTri(A)
% Changing matrix A into
% a tridiagonal matrix by
% the Householder algorithm
[m,n]=size(A);
T=A;
for k=1:m-2
v=T(k+1:n,k);w=v+norm(v)*[1;zeros(n-k-1,1)];
Hw=house(w);
H=[eye(k,k) zeros(k,n-k); zeros(n-k,k) Hw];
T=H’*T*H;
end;
• To check if a matrix is tridiagonal, we execute the following test:
function T=TridiagTest(A,e)
% nargin = number of input arguments of the function
if nargin==1, e=eps; end;
B=diag(diag(A))+diag(diag(A,1),1)+diag(diag(A,-1),-1);
if norm(A-B) > e
T=0;
else
T=1;
end;
• For different values of n, we compare the spectra of matrices A and T :
>> A=rand(n,n);A=A+A’;T=householderTri(A);
>> norm(sort(eig(A)+0.*i)-sort(eig(T)+0.*i))
ans =
4.4949e-15
Note: adding the complex term 0i ensures that the sorting is performed in the
same way for both vectors (see the description of the function sort).
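A one-line illustration of the difference (a sketch): for a real vector, sort orders by value, whereas for a complex vector it orders by modulus.
>> sort([-3 1 2]), sort([-3 1 2]+0.*i)   % should give [-3 1 2] and [1 2 -3] (sorted by modulus)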
Solution of Exercise 10.10 Here is a program executing the requested
computations.
function x=givens(T,i)
% Computes an approximation x of
% the i-th eigenvalue of the symmetric
% tridiagonal matrix T.
% λ1 ≤ · · · ≤ λn
if ~TridiagTest(T,1.E-12)
error('the matrix is not tridiagonal');
end;
n=size(T,1);
Tol=1.E-10;              % |λi − x| ≤ Tol
b=norm(T,'inf');         % we are sure of the upper bound
a=-b;x=(a+b)/2.;
while b-a > Tol
m=0;                     % number of sign changes
p0=1;p1=T(1,1)-x;
if p0*p1<0, m=m+1;end;
for k=2:n                % There may be a problem if p is close to 0
p=(T(k,k)-x)*p1-T(k-1,k)*T(k-1,k)*p0;
if p*p1<0 , m=m+1;end;
p0=p1;p1=p;
end;
if m>= i
b=x;
else
a=x;
end;
x=(a+b)/2.;
end;
Tests:
>> n=10;u=rand(n,1);v=rand(n-1,1);T=diag(u)+diag(v,1)+diag(v,-1);
>> specG=[];for i=1:n, specG=[specG;givens(T,i)]; end;
>> norm(eig(T)-specG)
ans =
6.6042e-11
Solution of Exercise 10.11
>> A=[2 1 2 2 2;1 2 1 2 2;2 1 2 1 2;2 2 1 2 1;2 2 2 1 2];
>> T=householderTri(A);
>> specG=[];for i=1:5, specG=[specG;givens(T,i)]; end;
>> [eig(T) specG]
ans =
    8.4297   -1.0000
   -1.0000   -0.1388
   -0.1388    1.0000
    1.0000    1.7091
    1.7091    8.4297
We find the same eigenvalues.
Solution of Exercise 10.12
1. >> A=[5 3 4 3 3;3 5 2 3 3;4 2 4 2 4;3 3 2 5 3; 3 3 4 3 5];
>> r0=(1:5)';
% Orthonormal basis of the Krylov space defined by A and r0
>> n=length(r0);v0=zeros(n,1);v1=r0/norm(r0);
>> V=[v1];              % shall contain the vectors vj
>> k=0;                 % shall contain k0
>> w=r0;free=1;
>> while (free & k<=n)
>> Av=A*v1;
>> w=Av-v1'*Av*v1-norm(w)*v0;
>> if norm(w)> 1.e-6
>> v0=v1;v1=w/norm(w);
>> k=k+1;V=[V v1];
>> else
>> free=0;
>> end;
>> end;
>> k0=k                 % Krylov dimension
k0 =
3
2. We define two new vectors x and y containing respectively the diagonal
of Tk and the nonzero superdiagonal of Tk .
>> v0=zeros(n,1);v1=r0/norm(r0);w=r0;
>> x=v1’*A*v1;y=[];
>> for j=1:k0
>> Av=A*v1;w=Av-v1'*Av*v1-norm(w)*v0;
>> v0=v1;v1=w/norm(w);
>> x=[x v1'*A*v1];y=[y norm(w)];
>> k=k+1;
>> end;
>> T=diag(x)+diag(y,1)+diag(y,-1)    % computing Tk
T =
   14.1091    5.7767         0         0
    5.7767    4.7197    0.3259         0
         0    0.3259    0.0964    1.0587
         0         0    1.0587    3.0748
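Before diagonalizing Tk, a quick consistency check (a sketch, reusing the orthonormal basis V built in question 1): the projected matrix V'*A*V should reproduce Tk up to round-off.
>> norm(V'*A*V-T)    % should be of the order of machine precision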
eigenvalues and eigenvectors of Tk:
>> [Q,D]=eig(T)
Q =
    0.0462   -0.4264   -0.0298   -0.9029
   -0.1152    0.8938    0.0552   -0.4299
    0.9462    0.0989    0.3080   -0.0084
   -0.2988   -0.0974    0.9493   -0.0006
D =
   -0.2776         0         0         0
         0    2.0000         0         0
         0         0    3.4183         0
         0         0         0   16.8594
Spectrum of A:
>> eig(A)'
ans =
   -0.2776    2.0000    2.0000    3.4183   16.8594
The four distinct eigenvalues of A are also eigenvalues of T.
3. For r0 = (1, 1, 1, 1, 1)t, we get
>> k0, eig(T)'
k0 =
    2
ans =
   -0.2776343    3.4182631   16.859371
And for r0 = (1, −1, 0, 1, −1)t, the Krylov dimension is k0 = 0 since r0 is
an eigenvector of A:
>> norm(A*r0-2*r0)
ans =
    0
Solution of Exercise 10.13
1. >> n=7;A=laplacian2dD(n);specA=eig(A);
% we denote the eigenvalues of A by the sign '+'
% the figure shows the distinct eigenvalues
>> plot(real(specA),imag(specA),’+’)
>> r0=ones(n*n,1); % choice of r0
>> v0=zeros(n*n,1);v1=r0/norm(r0);w=r0;
>> x=v1’*A*v1;y=[];
>> for j=1:n
>> Av=A*v1;
>> w=Av-v1'*Av*v1-norm(w)*v0;
>> v0=v1;v1=w/norm(w);
>> x=[x v1'*A*v1];y=[y norm(w)];
>> end;
>> T=diag(x)+diag(y,1)+diag(y,-1);
Fig. 10.1. Computation of the eigenvalues by the Lanczos method.
>> [P,D]=eig(T);
% we denote the approximate eigenvalues of A by an 'o'
>> plot(real(specA),imag(specA),'+',real(eig(D)),imag(eig(D)), ...
'o','MarkerSize',10,'LineWidth',3)
Some eigenvalues of A are computed accurately; see Figure 10.1 (left). These
eigenvalues are spread throughout the spectrum of A.
2. For the second initial vector, r0 = (1, 2, . . . , n²)t, see Figure 10.1 (right).
Varying r0 varies the matrix Tn. Since the eigenvalues of Tn are also
eigenvalues of A, we can thus recover a good share of the spectrum of
A (which is an n² × n² matrix) by computing the spectra of matrices of
smaller size, here n × n.