A friendly course on NUMERICAL METHODS
Maria-Magdalena Boureanu and Laurenţiu Temereancă
Contents

1 Introduction

2 Solving linear systems - direct methods
2.1 Gaussian elimination method (Gauss pivoting method)
2.1.1 Gauss method with partial pivoting at every step
2.1.2 Gauss method with total pivoting at every step
2.2 L − R Decomposition Method
2.2.1 The L − R Doolittle factorization method
2.2.2 The L − R Croût factorization method
2.2.3 The L − R factorization method for tridiagonal matrices
2.3 Chio pivotal condensation method

3 Solving linear systems and nonlinear equations
3.1 Jacobi’s method for solving linear systems
3.2 Seidel-Gauss method for solving linear systems
3.3 Solving nonlinear equations by the successive approximations method

4 Eigenvalues and eigenvectors
4.1 Krylov’s method to determine the characteristic polynomial

5 Polynomial interpolation
5.1 Lagrange form for the interpolation polynomials
Chapter 1
Introduction
The purpose of this course is to introduce to students the concept of ”Numerical Methods” and some basic techniques. We start by answering a set of questions that naturally appear in one’s mind.
• Q1: What do we understand by ”Numerical Methods”?
They usually represent the methods that are used when we need to obtain the solution of a numerical problem by means of a computer.
More exactly, this implies having a complete set of procedures, clearly stated, in order to obtain the solution of a mathematical problem together with its error estimates.
• Q2: Why do error estimates appear?
Sometimes we do not obtain an exact solution of a problem. Instead, for reasons that we are going to discuss later, we prefer to construct an approximation of it.
There are also errors that simply appear in the calculations when we deal with periodic decimal fractions. For example, if in a problem we arrive at the value $\frac{1}{3}$, this means $0.(3)$, but the computer is not able to store infinitely long sequences of decimals. So this is another reason for the errors to appear and, in this case, to also accumulate.
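As a small illustration of this rounding (our own C++ example, not from the course text), the value $\frac{1}{3}$ is stored with only finitely many binary digits, and such errors can indeed accumulate:

```cpp
#include <cstdio>

int main() {
    double third = 1.0 / 3.0;        // 0.(3) is stored with finitely many digits
    std::printf("%.17f\n", third);   // prints 0.33333333333333331 (already rounded)
    // Accumulation: ten steps of 0.1 do not add up to exactly 1.
    double sum = 0.0;
    for (int i = 0; i < 10; ++i) sum += 0.1;
    std::printf("%d\n", sum == 1.0); // prints 0
    return 0;
}
```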
• Q3: Why do we use the ”Numerical Methods”?
While the analytical methods studied in school have the role to help us understand the mechanism of a problem, the role of the numerical methods is to solve very heavy problems which would take too much time and energy to be solved without the involvement of software. For example, in high school we all solved linear systems of 3 or 4 equations. But how long will it take us to solve a linear system involving hundreds of equations by hand? This is a question that is better left unanswered.
• Q4: Where do we apply these ”Numerical Methods”?
You always have to keep in mind that mathematics appeared due to real-life necessities. We do not solve a linear system of hundreds of equations because it is fun, but because it is necessary in some specific situations, like when looking into space technology. Generally speaking, the numerical methods are quite useful in engineering, natural sciences, social sciences, entrepreneurship, medicine, etc.
Other questions, which will probably occur to students, we will try to answer during classes.
Chapter 2
Solving linear systems - direct methods
Definition 1. By a linear system of $m$ linear equations and $n$ unknown variables, with real coefficients, we understand the following set of equalities
$$\begin{cases} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n = b_1 \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n = b_2 \\ \qquad\qquad \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n = b_m, \end{cases} \tag{2.1}$$
where $a_{ij}, b_i \in \mathbb{R}$, for all $i \in \{1, 2, ..., m\}$ and all $j \in \{1, 2, ..., n\}$.
In addition, if b1 = b2 = ... = bm = 0, then the above system is called a homogeneous system of
linear equations (or a homogeneous linear system).
Remark 1. An equation is called linear because the polynomial function involved in it is of
first degree.


Definition 2. By a solution of system (2.1) we understand a vector $\tilde{x} = \begin{pmatrix} \tilde{x}_1 \\ \tilde{x}_2 \\ \vdots \\ \tilde{x}_n \end{pmatrix} \in \mathbb{R}^n$ such that when we replace each $x_i$ by $\tilde{x}_i$, for all $i \in \{1, 2, ..., n\}$, all the equalities in the system hold simultaneously.
Depending on the existence of at least one solution for system (2.1), we distinguish between incompatible systems, that is, systems with no solution, and compatible systems, that is, systems which admit at least one solution. If the solution of a system is unique, that system is called determined compatible. If the system admits infinitely many solutions, that system is called undetermined compatible.
To every system (2.1) we can associate the following matrix:
$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}.$$
The free terms on the right-hand side of the linear equations can be kept in a vector
$$b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}.$$
The above system (2.1) can be written in the compact form
$$\sum_{j=1}^{n} a_{ij} x_j = b_i, \qquad i = 1, 2, ..., m. \tag{2.2}$$
Moreover, the matrix form of the system (2.1) is the following
$$Ax = b, \tag{2.3}$$
where
$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$$
Notice that we denote by
$$\bar{A} = \left( \begin{array}{cccc|c} a_{11} & a_{12} & \dots & a_{1n} & b_1 \\ a_{21} & a_{22} & \dots & a_{2n} & b_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} & b_m \end{array} \right)$$
the extended matrix associated to system (2.1).
There are two types of methods that we can use to solve a system of linear equations via
numerical methods:
1. direct methods
• they are used for systems with less than 100 unknown variables and equations;
• these allow us to arrive at the solution of the system after a finite number of steps.
2. iterative methods
• they are used for systems with more than 100 unknown variables and equations;
• with these methods we obtain an approximation of the solution.
In this chapter we only focus on the direct methods used for solving linear systems and on a direct method to calculate the determinant of a matrix, since the calculation of determinants often appears when we solve linear systems. The iterative methods are discussed in the next chapter of this book.
2.1 Gaussian elimination method (Gauss pivoting method)
Gauss’s method is a general method introduced by Gauss in 1823. This method is based on a successive elimination scheme for the unknowns of the system, obtained by performing elementary transformations on the system’s matrix: the permutation of two rows (that is, horizontal lines) or columns (that is, vertical lines), the multiplication of a row by a scalar and the addition of the result to another row, etc. (see [7, Ch.2]).
Let us start by discussing the standard solving of a linear system of two equations and two unknown variables. We will go through a simple example
$$\begin{cases} x_1 - 3x_2 = -8 \\ 3x_1 + 2x_2 = 9. \end{cases}$$
We multiply the first line by $-3$ and we get
$$\begin{cases} -3x_1 + 9x_2 = 24 \\ 3x_1 + 2x_2 = 9. \end{cases}$$
We add the two lines of the system and we deduce that
$$11x_2 = 33,$$
hence $x_2 = 3$ and then we obtain $x_1 = 1$.
Obviously, this is a trivial exercise, but it will help us better understand how the Gaussian elimination method works. To this end, let us write the above system under the matrix form $Ax = b$, where
$$A = \begin{pmatrix} 1 & -3 \\ 3 & 2 \end{pmatrix} \quad \text{and} \quad b = \begin{pmatrix} -8 \\ 9 \end{pmatrix}.$$
Our action consists in adding to the second line the first line multiplied by $(-3)$. This can be written as follows:
$$\bar{A} = \left( \begin{array}{cc|c} 1 & -3 & -8 \\ 3 & 2 & 9 \end{array} \right) \overset{L_2 - 3L_1}{\sim} \left( \begin{array}{cc|c} 1 & -3 & -8 \\ 0 & 11 & 33 \end{array} \right), \tag{2.4}$$
where the vertical bar from the extended matrix only appears because it helps us separate more easily the column of the free terms, and $\sim$ denotes the fact that these two extended matrices correspond to equivalent systems. We recall that by equivalent systems we understand
that the systems have the same solution. Thus, even though we have worked with the lines of
the system, the solution did not change. And this occurs naturally, as we have seen from the
detailed solving of the system. Notice that in (2.4) the second extended matrix has kept the
first row as it was, while the second row was modified to correspond to the equation 11x2 = 33.
Passing to a more complicated example, let us solve the system
$$\begin{cases} x_1 + x_2 - 2x_3 = -8 \\ -4x_1 + x_2 - x_3 = 1 \\ 3x_1 - 3x_2 + 2x_3 = 2. \end{cases}$$
We intend to eliminate $x_1$ from the second and the third line of the system and to preserve the first line exactly as it is. How do we proceed? To the second line we add the first line multiplied by $4$, while to the third line we add the first line multiplied by $(-3)$. We have arrived at
$$\begin{cases} x_1 + x_2 - 2x_3 = -8 \\ 5x_2 - 9x_3 = -31 \\ -6x_2 + 8x_3 = 26. \end{cases}$$
What do we observe? That the second and the third line form a system of two equations and two unknowns. Next, we eliminate $x_2$ from the last equation. More precisely, we preserve the second line as it is and to the third line we add the second line multiplied by $\frac{6}{5}$ (this is exactly as multiplying the second line by $6$, the third line by $5$, and adding them). We have obtained
$$\begin{cases} x_1 + x_2 - 2x_3 = -8 \\ 5x_2 - 9x_3 = -31 \\ -\frac{14}{5}x_3 = -\frac{56}{5}, \end{cases} \tag{2.5}$$
consequently $x_3 = 4$. We substitute the value of $x_3$ in the second equation and we deduce that $x_2 = 1$. Finally, we substitute the values of $x_2$ and $x_3$ in the first equation and we deduce that $x_1 = -1$. The above method is actually the Gaussian elimination method. Only that we will work with the corresponding matrix form instead:

 
 

$$\bar{A} = \left( \begin{array}{ccc|c} 1 & 1 & -2 & -8 \\ -4 & 1 & -1 & 1 \\ 3 & -3 & 2 & 2 \end{array} \right) \sim \left( \begin{array}{ccc|c} 1 & 1 & -2 & -8 \\ 0 & 5 & -9 & -31 \\ 0 & -6 & 8 & 26 \end{array} \right) \sim \left( \begin{array}{ccc|c} 1 & 1 & -2 & -8 \\ 0 & 5 & -9 & -31 \\ 0 & 0 & -\frac{14}{5} & -\frac{56}{5} \end{array} \right).$$
We write the system corresponding to the last matrix, which is exactly system (2.5), and then
we solve it by the backward substitution method, as we already saw. A question remains
though: how can we arrive at system (2.5) by only using the matrix form? Can we formulate
a rule that allows us to jump from matrix A to an equivalent matrix and then to another
equivalent matrix?
Indeed, Gauss gave us the so-called ”rectangle rule”, which allows us to arrive at a matrix associated to an equivalent system without performing the previous calculation (although, in some sense, we are performing exactly the previous calculation, but in a different manner). Hence let us try to write the general rules of this method by observing the transformations on the above example.
First of all, it is clear that the goal is to obtain an upper-triangular matrix so that we can solve the corresponding system by back substitution. Here we should clarify the meaning of some notions.
• What is an upper-triangular matrix?
Definition 3. An upper-triangular matrix is a matrix that has only zeros under the main diagonal, that is, a matrix of the following form
$$C = \begin{pmatrix} c_{11} & c_{12} & c_{13} & \dots & c_{1,n-1} & c_{1n} \\ 0 & c_{22} & c_{23} & \dots & c_{2,n-1} & c_{2n} \\ 0 & 0 & c_{33} & \dots & c_{3,n-1} & c_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & 0 & c_{nn} \end{pmatrix}.$$
Note that this notion only applies to square matrices, that is, matrices in which the number of rows coincides with the number of columns. This is fine with us because in applications we are mostly interested in the situation when a system is a compatible determined system (that is, the system admits a unique solution). So we will focus on establishing whether this is the case or not, and, as a consequence, we will treat systems in which the number of unknown variables is the same as the number of equations. Correspondingly, the number of columns is the same as the number of rows and the matrix associated to the system is a square matrix.
Remark 2. Similarly, we can define a lower-triangular matrix. More specifically, a lower-triangular matrix is a square matrix that has only zeros above the main diagonal, that is, a matrix of the form
$$C = \begin{pmatrix} c_{11} & 0 & 0 & \dots & 0 & 0 \\ c_{21} & c_{22} & 0 & \dots & 0 & 0 \\ c_{31} & c_{32} & c_{33} & \dots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ c_{n-1,1} & c_{n-1,2} & c_{n-1,3} & \dots & c_{n-1,n-1} & 0 \\ c_{n1} & c_{n2} & c_{n3} & \dots & c_{n,n-1} & c_{nn} \end{pmatrix}.$$
Returning to our discussion, we identify another question that might ”pop” into one’s head:
• What do we mean by solving a system by back substitution?
Well, it means that we first find out the value of $x_n$ from the last equation, then we go back and we substitute the value of $x_n$ in the previous equation, which gives us the value of $x_{n-1}$. Then we go back and we substitute the values of $x_n$ and $x_{n-1}$ in the previous equation, which gives us the value of $x_{n-2}$, and so on, until we reach the first equation of the system and we
find out the value of $x_1$. At this point we have obtained the solution of the system
$$S = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n \end{pmatrix},$$
but be careful about the order of writing the values of the unknown variables into $S$ (it is not the order in which we have found them!). Obviously, the back substitution solving method works ”hand in hand” with an upper-triangular matrix associated to the system.
Remark 3. Similarly, when solving a system by direct substitution, we first find the value of
x1 from the first equation, then we substitute the value of x1 in the equation below and we obtain
x2 . Then we substitute the values of x1 and x2 in the equation below and we obtain the value of
x3 . We continue this procedure until we obtain the value of xn . This direct substitution method
is appropriate for the situation when the matrix associated to our system is a lower-triangular
matrix.
Now that we have clarified these notions, let us see how we can attain our goal, that is, to arrive at an upper-triangular matrix. We notice that we always perform the same actions:
1. We keep unchanged the first row or the row at which we are at that moment (and implicitly, the rows above it, if they exist).
2. Below the pivot we change all the entries in that column into zeros.
3. We apply the ”rectangle rule” to find out the rest of the elements of the extended matrix.
These are the three instructions that we must apply at every step of the algorithm until we
obtain an upper-triangular matrix. The immediate questions are the following: what is a pivot
and what is the ”rectangle rule”?
A pivot is an element of the matrix $A$ which is situated on the main diagonal (hence it is of the form $a_{kk}$), and it must always be nonzero. In the eventuality in which $a_{kk} = 0$, we will interchange row $k$ with another row situated below it. Why below? Because above we have already applied the three instructions and those elements are already the elements we need. Can we interchange two rows of matrix $A$ without any consequences? Yes, of course, because this is equivalent to interchanging the order of two equations in our system, and this will not have any effect on the solution. Be careful though: when interchanging the rows of matrix $A$ we necessarily have to interchange the corresponding elements from the column of free terms (actually we are interchanging two rows of the extended matrix $\bar{A}$).
Remark 4. The method that we are explaining now represents the ”basic Gauss”. Later we are going to study the ”partial pivoting Gauss method” and the ”total pivoting Gauss method”, and these more advantageous methods help us avoid the situation in which a pivot is null.
Finally, let us see what this ”rectangle rule” is about. At every step of the algorithm we change the values of the elements of the extended matrix $\bar{A}$ with new ones (usually).
In fact, we have seen that the elements on the row $k$ at which we are at some point (since we start our discussion with $a_{11}$ on row 1, then we move to $a_{22}$ which is on row 2 etc.) remain as they were at the previous step. In addition, under $a_{kk}$ we will put zero everywhere on the column. Without returning and modifying any of the elements that we already calculated (these are the elements above row $k$ and at the left of column $k$) we apply the ”rectangle rule” to all the other elements $a_{ij}$. To apply the ”rectangle rule” we imagine a rectangle in which the first diagonal starts at $a_{kk}$ and ends at $a_{ij}$. We can identify the other corners of this rectangle as being $a_{ik}$ and $a_{kj}$, and we can visualize the other diagonal. The rule is the following
$$\frac{a_{kk}\, a_{ij} - a_{ik}\, a_{kj}}{a_{kk}},$$
and now it is clear why the pivot can never be 0. We try to present an image of this rule in the matrix below, where the corners of the rectangle are boxed:
$$\left( \begin{array}{cccccccc|c} a_{11} & a_{12} & \cdots & a_{1k} & \cdots & a_{1j} & \cdots & a_{1n} & b_1 \\ \vdots & \vdots & & \vdots & & \vdots & & \vdots & \vdots \\ a_{k1} & a_{k2} & \cdots & \boxed{a_{kk}} & \cdots & \boxed{a_{kj}} & \cdots & a_{kn} & b_k \\ \vdots & \vdots & & \vdots & & \vdots & & \vdots & \vdots \\ a_{i1} & a_{i2} & \cdots & \boxed{a_{ik}} & \cdots & \boxed{a_{ij}} & \cdots & a_{in} & b_i \\ \vdots & \vdots & & \vdots & & \vdots & & \vdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nk} & \cdots & a_{nj} & \cdots & a_{nn} & b_n \end{array} \right).$$
We consider the linear system
$$Ax = b, \tag{2.6}$$
where $A = (a_{ij})_{1 \le i,j \le n} \in \mathcal{M}_{n \times n}(\mathbb{R})$ is a matrix with $n$ rows and $n$ columns, $b = (b_i)_{1 \le i \le n} \in \mathcal{M}_{n \times 1}(\mathbb{R})$ is a column vector, and $x = (x_i)_{1 \le i \le n} \in \mathbb{R}^n$ is the unknown vector.
First, we consider the extended matrix
$$\bar{A} = (a_{ij})_{1 \le i \le n,\; 1 \le j \le n+1},$$
where we denote $a_{i,n+1} = b_i$, $1 \le i \le n$.
The Gaussian elimination method consists of processing the extended matrix $\bar{A}$ by elementary transformations, such that, in $n-1$ steps, the matrix $\bar{A}$ becomes upper-triangular:
$$\left( \begin{array}{ccccc|c} a_{11}^{(n)} & a_{12}^{(n)} & \dots & a_{1,n-1}^{(n)} & a_{1,n}^{(n)} & a_{1,n+1}^{(n)} \\ 0 & a_{22}^{(n)} & \dots & a_{2,n-1}^{(n)} & a_{2,n}^{(n)} & a_{2,n+1}^{(n)} \\ \vdots & \vdots & \dots & \vdots & \vdots & \vdots \\ 0 & 0 & \dots & a_{n-1,n-1}^{(n)} & a_{n-1,n}^{(n)} & a_{n-1,n+1}^{(n)} \\ 0 & 0 & \dots & 0 & a_{n,n}^{(n)} & a_{n,n+1}^{(n)} \end{array} \right) \overset{\text{not.}}{=} \bar{A}^{(n)}, \quad \text{where } \bar{A}^{(1)} = \bar{A}. \tag{2.7}$$
Remark 5. The extended matrix $\bar{A}^{(n)}$ has $n$ rows and $n+1$ columns.
In what follows, $a_{ij}^{(k)}$, $\forall\, 1 \le i, k \le n$, $1 \le j \le n+1$, represents the element of the extended matrix $\bar{A}^{(k)}$ at step $k$, located on the row $i$ and column $j$.
We apply the following algorithm to obtain the matrix (2.7), assuming that $a_{kk}^{(k)} \neq 0$, $1 \le k \le n-1$, where the element $a_{kk}^{(k)}$ is called the pivot:
• we copy the first k rows;
• on column ”k”, under pivot, the elements will be null (zero);
• the remaining elements, under the row ”k” and at the right of the column ”k”, will be calculated with the ”rectangle rule”:
$$\begin{array}{cccccc} \text{row } k: & \cdots & a_{kk}^{(k)} & \cdots & a_{kj}^{(k)} & \cdots \\ \text{row } i: & \cdots & a_{ik}^{(k)} & \cdots & a_{ij}^{(k)} & \cdots \end{array} \quad\Rightarrow\quad a_{ij}^{(k+1)} = \frac{a_{kk}^{(k)}\, a_{ij}^{(k)} - a_{ik}^{(k)}\, a_{kj}^{(k)}}{a_{kk}^{(k)}}, \tag{2.8}$$
where $a_{kk}^{(k)}$ and $a_{ik}^{(k)}$ sit on column $k$, while $a_{kj}^{(k)}$ and $a_{ij}^{(k)}$ sit on column $j$.
Therefore, for $1 \le k \le n-1$, we obtain the following formulae:
$$a_{ij}^{(k+1)} = \begin{cases} a_{ij}^{(k)}, & 1 \le i \le k, \; i \le j \le n+1 \\ 0, & 1 \le j \le k, \; j+1 \le i \le n \\ a_{ij}^{(k)} - \dfrac{a_{ik}^{(k)}}{a_{kk}^{(k)}} \cdot a_{kj}^{(k)}, & k+1 \le i \le n, \; k+1 \le j \le n+1. \end{cases} \tag{2.9}$$
At the last step $k = n-1$, we obtain the upper-triangular system
$$\begin{cases} a_{11}^{(n)} x_1 + a_{12}^{(n)} x_2 + \dots + a_{1n}^{(n)} x_n = a_{1,n+1}^{(n)} \\ \qquad\qquad a_{22}^{(n)} x_2 + \dots + a_{2n}^{(n)} x_n = a_{2,n+1}^{(n)} \\ \qquad\qquad\qquad\qquad \vdots \\ \qquad\qquad\quad a_{ii}^{(n)} x_i + \dots + a_{in}^{(n)} x_n = a_{i,n+1}^{(n)} \\ \qquad\qquad\qquad\qquad \vdots \\ \qquad\qquad\qquad\qquad\quad\;\; a_{nn}^{(n)} x_n = a_{n,n+1}^{(n)}. \end{cases} \tag{2.10}$$
The above system (2.10) has the same solution as the initial one (2.6), but this system has a triangular form. The solution components of system (2.10) are directly obtained by back substitution.
From the last equation of system (2.10) we have the following cases:
• If $a_{nn}^{(n)} \neq 0$, then
$$x_n = a_{n,n+1}^{(n)} \big/ a_{nn}^{(n)}. \tag{2.11}$$
• If $a_{nn}^{(n)} = 0$, then the system (2.6) has no unique solution.
Moreover, from back substitution in (2.10) we obtain, for every $i = n-1, n-2, ..., 1$,
$$x_i = \left( a_{i,n+1}^{(n)} - \sum_{j=i+1}^{n} a_{ij}^{(n)} \cdot x_j \right) \bigg/ a_{ii}^{(n)}. \tag{2.12}$$
Remark 6. In this algorithm we suppose that the pivot $a_{kk}^{(k)} \neq 0$, for every $1 \le k \le n$.
If at a certain step we have $a_{kk}^{(k)} = 0$, we search, on column $k$ of the matrix $\bar{A}^{(k)}$, an element $a_{i_k k}^{(k)} \neq 0$, $k < i_k \le n$, we switch rows $i_k$ and $k$ in the matrix $\bar{A}^{(k)}$, and we apply the above formulae (2.9).
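Before moving to examples, here is a minimal C++ sketch of the whole algorithm (the course mentions C++ implementations in the laboratory; the function name and the storage of the extended matrix as a vector of rows are our own illustrative choices, not part of the course text). It implements the rectangle rule (2.9), the row switch from Remark 6, and the back substitution formulae (2.11)-(2.12):

```cpp
#include <stdexcept>
#include <utility>
#include <vector>

// Solves A x = b by basic Gaussian elimination with the "rectangle rule".
// ext is the extended matrix: n rows, n+1 columns; ext[i][j] = a_{ij} and
// ext[i][n] = b_i (0-based indices instead of the 1-based ones in the text).
std::vector<double> gaussSolve(std::vector<std::vector<double>> ext) {
    const int n = static_cast<int>(ext.size());
    for (int k = 0; k < n - 1; ++k) {
        // Remark 6: if the pivot is zero, switch row k with a row below it.
        if (ext[k][k] == 0.0) {
            int ik = k + 1;
            while (ik < n && ext[ik][k] == 0.0) ++ik;
            if (ik == n) throw std::runtime_error("no unique solution");
            std::swap(ext[k], ext[ik]);
        }
        // Rectangle rule (2.9): zeros under the pivot, updates to the right.
        for (int i = k + 1; i < n; ++i) {
            double factor = ext[i][k] / ext[k][k];
            ext[i][k] = 0.0;
            for (int j = k + 1; j <= n; ++j)
                ext[i][j] -= factor * ext[k][j];
        }
    }
    if (ext[n - 1][n - 1] == 0.0) throw std::runtime_error("no unique solution");
    // Back substitution, formulae (2.11) and (2.12).
    std::vector<double> x(n);
    for (int i = n - 1; i >= 0; --i) {
        double s = ext[i][n];
        for (int j = i + 1; j < n; ++j) s -= ext[i][j] * x[j];
        x[i] = s / ext[i][i];
    }
    return x;
}
```

Assuming no slips on our side, for Example 1 below this sketch reproduces the hand computation: `gaussSolve({{-1,3,1,-2},{2,4,-1,-4},{-5,-2,3,3}})` returns $(1, -1, 2)$.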
Example 1. Solve the following system using the Gaussian elimination method
$$\begin{cases} -x_1 + 3x_2 + x_3 = -2 \\ 2x_1 + 4x_2 - x_3 = -4 \\ -5x_1 - 2x_2 + 3x_3 = 3. \end{cases}$$
Proof. We have $n = 3$, and the corresponding matrix and the free term of the above system are
$$A = \begin{pmatrix} -1 & 3 & 1 \\ 2 & 4 & -1 \\ -5 & -2 & 3 \end{pmatrix}, \quad b = \begin{pmatrix} -2 \\ -4 \\ 3 \end{pmatrix}.$$
The extended matrix corresponding to the system is
$$\bar{A}^{(1)} = \bar{A} = \left( \begin{array}{ccc|c} -1 & 3 & 1 & -2 \\ 2 & 4 & -1 & -4 \\ -5 & -2 & 3 & 3 \end{array} \right).$$
Step 1. To obtain the matrix $\bar{A}^{(2)}$ we choose $a_{11}^{(1)} = -1 \neq 0$ as pivot. We keep row 1 from $\bar{A}^{(1)}$. In the first column, under the pivot $a_{11}^{(1)} = -1$, the elements will be zero and the other elements are calculated using the ”rectangle rule” (2.8):
$$a_{22}^{(2)} = \frac{a_{11}^{(1)} a_{22}^{(1)} - a_{21}^{(1)} a_{12}^{(1)}}{a_{11}^{(1)}} = \frac{-1 \cdot 4 - 2 \cdot 3}{-1} = 10, \qquad a_{23}^{(2)} = \frac{a_{11}^{(1)} a_{23}^{(1)} - a_{21}^{(1)} a_{13}^{(1)}}{a_{11}^{(1)}} = \frac{-1 \cdot (-1) - 2 \cdot 1}{-1} = 1,$$
$$a_{24}^{(2)} = \frac{a_{11}^{(1)} a_{24}^{(1)} - a_{21}^{(1)} a_{14}^{(1)}}{a_{11}^{(1)}} = \frac{-1 \cdot (-4) - 2 \cdot (-2)}{-1} = -8, \qquad a_{32}^{(2)} = \frac{a_{11}^{(1)} a_{32}^{(1)} - a_{31}^{(1)} a_{12}^{(1)}}{a_{11}^{(1)}} = \frac{-1 \cdot (-2) - (-5) \cdot 3}{-1} = -17,$$
$$a_{33}^{(2)} = \frac{a_{11}^{(1)} a_{33}^{(1)} - a_{31}^{(1)} a_{13}^{(1)}}{a_{11}^{(1)}} = \frac{-1 \cdot 3 - (-5) \cdot 1}{-1} = -2, \qquad a_{34}^{(2)} = \frac{a_{11}^{(1)} a_{34}^{(1)} - a_{31}^{(1)} a_{14}^{(1)}}{a_{11}^{(1)}} = \frac{-1 \cdot 3 - (-5) \cdot (-2)}{-1} = 13.$$
We obtain the matrix
$$\bar{A}^{(2)} = \left( \begin{array}{ccc|c} -1 & 3 & 1 & -2 \\ 0 & 10 & 1 & -8 \\ 0 & -17 & -2 & 13 \end{array} \right).$$
Step 2. We choose $a_{22}^{(2)} = 10 \neq 0$ as pivot and we keep rows 1 and 2 from $\bar{A}^{(2)}$. In the second column, under the pivot $a_{22}^{(2)} = 10$, the elements will be zero and the other elements are calculated using the ”rectangle rule”:
$$a_{33}^{(3)} = \frac{a_{22}^{(2)} a_{33}^{(2)} - a_{32}^{(2)} a_{23}^{(2)}}{a_{22}^{(2)}} = \frac{10 \cdot (-2) - (-17) \cdot 1}{10} = -\frac{3}{10}, \qquad a_{34}^{(3)} = \frac{a_{22}^{(2)} a_{34}^{(2)} - a_{32}^{(2)} a_{24}^{(2)}}{a_{22}^{(2)}} = \frac{10 \cdot 13 - (-17) \cdot (-8)}{10} = -\frac{6}{10}.$$
We get the matrix
$$\bar{A}^{(3)} = \left( \begin{array}{ccc|c} -1 & 3 & 1 & -2 \\ 0 & 10 & 1 & -8 \\ 0 & 0 & -\frac{3}{10} & -\frac{6}{10} \end{array} \right).$$
The system corresponding to the matrix $\bar{A}^{(3)}$ is
$$\begin{cases} -x_1 + 3x_2 + x_3 = -2 \\ 10x_2 + x_3 = -8 \\ -\frac{3}{10} x_3 = -\frac{6}{10}. \end{cases}$$
The above system has the same solution as the initial one, but this system has a triangular form. The solution components of the system are directly obtained by back substitution:
$$\begin{cases} x_3 = \left(-\frac{6}{10}\right) \big/ \left(-\frac{3}{10}\right) = 2 \\ x_2 = (-8 - x_3)/10 = (-8 - 2)/10 = -1 \\ x_1 = (-2 - x_3 - 3x_2)/(-1) = (-2 - 2 - 3 \cdot (-1))/(-1) = 1, \end{cases}$$
and therefore, the solution is
$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix}.$$
Example 2. Solve the following system by using the Gaussian elimination method
$$\begin{cases} x_1 + 3x_2 - 2x_3 - 4x_4 = -2 \\ 2x_1 + 6x_2 - 7x_3 - 10x_4 = -6 \\ -x_1 - x_2 + 5x_3 + 9x_4 = 9 \\ -3x_1 - 5x_2 + 15x_4 = 13. \end{cases} \tag{2.13}$$
Proof. We have $n = 4$, and the corresponding matrix and the free term of the above system are
$$A = \begin{pmatrix} 1 & 3 & -2 & -4 \\ 2 & 6 & -7 & -10 \\ -1 & -1 & 5 & 9 \\ -3 & -5 & 0 & 15 \end{pmatrix}, \quad b = \begin{pmatrix} -2 \\ -6 \\ 9 \\ 13 \end{pmatrix}.$$
The extended matrix corresponding to the system is
$$\bar{A}^{(1)} = \bar{A} = \left( \begin{array}{cccc|c} 1 & 3 & -2 & -4 & -2 \\ 2 & 6 & -7 & -10 & -6 \\ -1 & -1 & 5 & 9 & 9 \\ -3 & -5 & 0 & 15 & 13 \end{array} \right).$$
Step 1. To obtain the matrix $\bar{A}^{(2)}$ we choose $a_{11}^{(1)} = 1 \neq 0$ as pivot. We keep row 1 from $\bar{A}^{(1)}$. In the first column, under the pivot $a_{11}^{(1)} = 1$, the elements will be zero and the other elements are calculated using the ”rectangle rule” (2.8), that is, $a_{ij}^{(2)} = \bigl(a_{11}^{(1)} a_{ij}^{(1)} - a_{i1}^{(1)} a_{1j}^{(1)}\bigr)/a_{11}^{(1)}$:
$$a_{22}^{(2)} = \frac{1 \cdot 6 - 2 \cdot 3}{1} = 0, \quad a_{23}^{(2)} = \frac{1 \cdot (-7) - 2 \cdot (-2)}{1} = -3, \quad a_{24}^{(2)} = \frac{1 \cdot (-10) - 2 \cdot (-4)}{1} = -2, \quad a_{25}^{(2)} = \frac{1 \cdot (-6) - 2 \cdot (-2)}{1} = -2,$$
$$a_{32}^{(2)} = \frac{1 \cdot (-1) - (-1) \cdot 3}{1} = 2, \quad a_{33}^{(2)} = \frac{1 \cdot 5 - (-1) \cdot (-2)}{1} = 3, \quad a_{34}^{(2)} = \frac{1 \cdot 9 - (-1) \cdot (-4)}{1} = 5, \quad a_{35}^{(2)} = \frac{1 \cdot 9 - (-1) \cdot (-2)}{1} = 7,$$
$$a_{42}^{(2)} = \frac{1 \cdot (-5) - (-3) \cdot 3}{1} = 4, \quad a_{43}^{(2)} = \frac{1 \cdot 0 - (-3) \cdot (-2)}{1} = -6, \quad a_{44}^{(2)} = \frac{1 \cdot 15 - (-3) \cdot (-4)}{1} = 3, \quad a_{45}^{(2)} = \frac{1 \cdot 13 - (-3) \cdot (-2)}{1} = 7.$$
Therefore, we obtain the matrix
$$\bar{A}^{(2)} = \left( \begin{array}{cccc|c} 1 & 3 & -2 & -4 & -2 \\ 0 & 0 & -3 & -2 & -2 \\ 0 & 2 & 3 & 5 & 7 \\ 0 & 4 & -6 & 3 & 7 \end{array} \right).$$
Step 2. Since $a_{22}^{(2)} = 0$, we cannot apply the rectangle rule, so we switch rows 2 and 3 in the matrix $\bar{A}^{(2)}$ and we obtain
$$\bar{A}^{(2)} \overset{L_2 \leftrightarrow L_3}{=} \left( \begin{array}{cccc|c} 1 & 3 & -2 & -4 & -2 \\ 0 & 2 & 3 & 5 & 7 \\ 0 & 0 & -3 & -2 & -2 \\ 0 & 4 & -6 & 3 & 7 \end{array} \right).$$
We choose $a_{22}^{(2)} = 2 \neq 0$ as pivot and we keep rows 1 and 2 from $\bar{A}^{(2)}$. In the second column, under the pivot $a_{22}^{(2)} = 2$, the elements will be zero and the other elements are calculated using the ”rectangle rule” (2.8):
$$a_{33}^{(3)} = \frac{2 \cdot (-3) - 0 \cdot 3}{2} = -3, \quad a_{34}^{(3)} = \frac{2 \cdot (-2) - 0 \cdot 5}{2} = -2, \quad a_{35}^{(3)} = \frac{2 \cdot (-2) - 0 \cdot 7}{2} = -2,$$
$$a_{43}^{(3)} = \frac{2 \cdot (-6) - 4 \cdot 3}{2} = -12, \quad a_{44}^{(3)} = \frac{2 \cdot 3 - 4 \cdot 5}{2} = -7, \quad a_{45}^{(3)} = \frac{2 \cdot 7 - 4 \cdot 7}{2} = -7.$$
We get the matrix
$$\bar{A}^{(3)} = \left( \begin{array}{cccc|c} 1 & 3 & -2 & -4 & -2 \\ 0 & 2 & 3 & 5 & 7 \\ 0 & 0 & -3 & -2 & -2 \\ 0 & 0 & -12 & -7 & -7 \end{array} \right).$$
Step 3. We choose $a_{33}^{(3)} = -3 \neq 0$ as pivot and we keep rows 1, 2 and 3 from $\bar{A}^{(3)}$. In column 3, under the pivot $a_{33}^{(3)} = -3$, the elements will be zero and the other elements are calculated using the ”rectangle rule” (2.8):
$$a_{44}^{(4)} = \frac{a_{33}^{(3)} a_{44}^{(3)} - a_{43}^{(3)} a_{34}^{(3)}}{a_{33}^{(3)}} = \frac{-3 \cdot (-7) - (-12) \cdot (-2)}{-3} = 1, \qquad a_{45}^{(4)} = \frac{a_{33}^{(3)} a_{45}^{(3)} - a_{43}^{(3)} a_{35}^{(3)}}{a_{33}^{(3)}} = \frac{-3 \cdot (-7) - (-12) \cdot (-2)}{-3} = 1.$$
We obtain
$$\bar{A}^{(4)} = \left( \begin{array}{cccc|c} 1 & 3 & -2 & -4 & -2 \\ 0 & 2 & 3 & 5 & 7 \\ 0 & 0 & -3 & -2 & -2 \\ 0 & 0 & 0 & 1 & 1 \end{array} \right).$$
The system corresponding to the matrix $\bar{A}^{(4)}$ is
$$\begin{cases} x_1 + 3x_2 - 2x_3 - 4x_4 = -2 \\ 2x_2 + 3x_3 + 5x_4 = 7 \\ -3x_3 - 2x_4 = -2 \\ x_4 = 1. \end{cases}$$
The above system has the same solution as the initial one, but this system has a triangular form. The solution components of the system are directly obtained by back substitution:
$$\begin{cases} x_4 = 1/1 = 1 \\ x_3 = (-2 + 2x_4)/(-3) = (-2 + 2 \cdot 1)/(-3) = 0 \\ x_2 = (7 - 5x_4 - 3x_3)/2 = (7 - 5 \cdot 1 - 3 \cdot 0)/2 = 1 \\ x_1 = (-2 + 4x_4 + 2x_3 - 3x_2)/1 = (-2 + 4 \cdot 1 + 2 \cdot 0 - 3 \cdot 1)/1 = -1, \end{cases}$$
and therefore, the solution is
$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \\ 0 \\ 1 \end{pmatrix}.$$
Example 3. Solve the following system by using the Gaussian elimination method
$$\begin{cases} x_1 - x_2 - 3x_3 = 8 \\ 3x_1 - x_2 + x_3 = 4 \\ 2x_1 + 3x_2 + 19x_3 = 10. \end{cases}$$
Proof. The extended matrix corresponding to the system is
$$\bar{A}^{(1)} = \bar{A} = \left( \begin{array}{ccc|c} 1 & -1 & -3 & 8 \\ 3 & -1 & 1 & 4 \\ 2 & 3 & 19 & 10 \end{array} \right).$$
Step 1. We choose $a_{11}^{(1)} = 1 \neq 0$ as pivot, and we keep row 1 from $\bar{A}^{(1)}$. In the first column, under the pivot $a_{11}^{(1)} = 1$, the elements will be zero and the other elements are calculated using the ”rectangle rule”:
$$a_{22}^{(2)} = \frac{1 \cdot (-1) - 3 \cdot (-1)}{1} = 2, \quad a_{23}^{(2)} = \frac{1 \cdot 1 - 3 \cdot (-3)}{1} = 10, \quad a_{24}^{(2)} = \frac{1 \cdot 4 - 3 \cdot 8}{1} = -20,$$
$$a_{32}^{(2)} = \frac{1 \cdot 3 - 2 \cdot (-1)}{1} = 5, \quad a_{33}^{(2)} = \frac{1 \cdot 19 - 2 \cdot (-3)}{1} = 25, \quad a_{34}^{(2)} = \frac{1 \cdot 10 - 2 \cdot 8}{1} = -6.$$
We obtain the matrix
$$\bar{A}^{(2)} = \left( \begin{array}{ccc|c} 1 & -1 & -3 & 8 \\ 0 & 2 & 10 & -20 \\ 0 & 5 & 25 & -6 \end{array} \right).$$
Step 2. We choose $a_{22}^{(2)} = 2 \neq 0$ as pivot and we keep rows 1 and 2 from $\bar{A}^{(2)}$. In the second column, under the pivot $a_{22}^{(2)} = 2$, the elements will be zero and the other elements are calculated using the ”rectangle rule”:
$$a_{33}^{(3)} = \frac{2 \cdot 25 - 5 \cdot 10}{2} = 0, \qquad a_{34}^{(3)} = \frac{2 \cdot (-6) - 5 \cdot (-20)}{2} = 44.$$
We obtain the matrix
$$\bar{A}^{(3)} = \left( \begin{array}{ccc|c} 1 & -1 & -3 & 8 \\ 0 & 2 & 10 & -20 \\ 0 & 0 & 0 & 44 \end{array} \right).$$
The system corresponding to the matrix $\bar{A}^{(3)}$ is
$$\begin{cases} x_1 - x_2 - 3x_3 = 8 \\ 2x_2 + 10x_3 = -20 \\ 0 \cdot x_3 = 44, \end{cases}$$
and therefore, the system is incompatible.
Example 4. Solve the following systems by using the Gaussian elimination method
$$\text{a)} \begin{cases} x_1 + 2x_2 + 3x_3 + x_4 = 7 \\ 2x_1 + x_2 + 2x_3 + 3x_4 = 8 \\ 2x_1 - x_2 - 4x_3 + 4x_4 = 1 \\ 2x_1 + x_3 - 3x_4 = 0 \end{cases} \qquad \text{b)} \begin{cases} x_1 + 3x_2 - 2x_3 - 4x_4 = -2 \\ 2x_1 + 6x_2 - 7x_3 - 10x_4 = -6 \\ -x_1 - x_2 + 5x_3 + 9x_4 = 9 \\ -3x_1 - 5x_2 + 15x_4 = 13. \end{cases}$$
In a short review of what we did, we notice that, in order to continue with our method, when a pivot was zero we interchanged the row of the pivot with a row below it (since the rows above can be considered as being already arranged in a suitable manner for our method to work). Obviously, a necessity is that the new pivot be nonzero. But other than that, do we have additional criteria when deciding with which row to interchange the row of the pivot?
The answer is yes, if we pursue a better numerical stability. Hence we distinguish between two variants of the Gaussian elimination method:
1. Gauss method with partial pivoting at every step
2. Gauss method with total pivoting at every step
Let us address the first one.
2.1.1 Gauss method with partial pivoting at every step

Here, to choose the pivot, we must search for the element with the greatest absolute value among all the elements of the matrix which are situated underneath or at the position of the pivot, on the same column. More precisely, at every step $k$, for $1 \le k \le n-1$, the pivot is the element $a_{i_k k}^{(k)}$, $k \le i_k \le n$, with the property
$$\left| a_{i_k k}^{(k)} \right| = \max_{k \le i \le n} \left| a_{ik}^{(k)} \right|.$$
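As a minimal sketch (ours, not from the course text), this pivot search and row switch at step $k$ could replace the zero-pivot test in the `gaussSolve` sketch above:

```cpp
#include <cmath>
#include <stdexcept>
#include <utility>
#include <vector>

// Partial pivoting at step k: pick the row ik with the largest |a_{ik}|,
// k <= ik < n, and switch it with row k of the extended matrix ext
// (the free-term column moves too, since it is stored inside ext).
// Returns true when a row switch was performed, which is useful for
// the sign of the determinant (see Remark 8 below).
bool partialPivot(std::vector<std::vector<double>>& ext, int k) {
    const int n = static_cast<int>(ext.size());
    int ik = k;
    for (int i = k + 1; i < n; ++i)
        if (std::abs(ext[i][k]) > std::abs(ext[ik][k])) ik = i;
    if (ext[ik][k] == 0.0)
        throw std::runtime_error("no unique solution");  // see Remark 7 below
    if (ik == k) return false;
    std::swap(ext[k], ext[ik]);
    return true;
}
```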
Why use this method for solving systems? On the one hand, the partial pivoting ensures a better numerical stability because, when we choose elements that are far away from zero to be the pivot, we avoid dangerous numerical situations, like dividing by 0 or by some number close to zero. On the other hand, to justify the choice of a Gauss method we have to say that the cost of its algorithm is quite low, of order $\frac{2}{3}n^3$. Let us say what we mean by the cost of an algorithm.

Definition 4. The total number of elementary mathematical operations $(+, -, \cdot, :)$ of an algorithm is called the cost of the algorithm.

The cost of the algorithm for the Gaussian elimination method with partial pivoting is $\frac{4n^3 + 9n^2 - 7n}{6}$. Therefore, we can say that this method is of order $\frac{2}{3}n^3$, and we denote $O\left(\frac{2n^3}{3}\right)$. To make a comparison, the cost of the algorithm for Cramer’s method, which was the common method for solving systems in high school, is $n \cdot (n+1)!$. We denote by $c_G$ the cost of the Gaussian method and by $c_C$ the cost of Cramer’s method. For a better understanding of the cost differences, let us give some values to $n$.
• For $n = 2$: $c_G = 9$, $c_C = 12$;
• For $n = 3$: $c_G = 28$, $c_C = 72$;
• For $n = 4$: $c_G = 62$, $c_C = 480$;
• For $n = 5$: $c_G = 115$, $c_C = 3600$.
It is clear now that the larger $n$ becomes, the larger the difference between the two costs. And in these days, when ”time is money” and we want everything to go faster, of course we want our system to be solved as quickly as possible. Especially when we think of how big this difference between costs would be when $n = 70$, for example.
With the hope that we have convinced the reader of the utility of the Gaussian elimination
method with partial pivoting, we introduce an example of solving a system.
Example 5. Solve the following system by using the Gauss method with partial pivoting at every step
$$\begin{cases} x_1 + 3x_2 - 2x_3 - 4x_4 = 3 \\ -x_1 - x_2 + 5x_3 + 9x_4 = 14 \\ 2x_1 + 6x_2 - 7x_3 - 10x_4 = -2 \\ -3x_1 - 5x_2 + 15x_4 = -6. \end{cases}$$
Proof. The extended matrix corresponding to the system is
$$\bar{A}^{(1)} = \bar{A} = \left( \begin{array}{cccc|c} 1 & 3 & -2 & -4 & 3 \\ -1 & -1 & 5 & 9 & 14 \\ 2 & 6 & -7 & -10 & -2 \\ -3 & -5 & 0 & 15 & -6 \end{array} \right).$$
We search for the pivot to be put in the position of $a_{11}$ on the first column. More precisely, we search the element with the largest absolute value from the first column of the matrix $\bar{A}^{(1)}$, i.e.
$$\left| piv^{(1)} \right| = \max_{1 \le i \le 4} \left| a_{i1}^{(1)} \right| = \max \left\{ \left| a_{11}^{(1)} \right|, \left| a_{21}^{(1)} \right|, \left| a_{31}^{(1)} \right|, \left| a_{41}^{(1)} \right| \right\} = \max \{ |1|, |-1|, |2|, |-3| \} = 3 \;\Rightarrow\; piv^{(1)} = a_{41}^{(1)} = -3.$$
Be careful: although we search for the biggest absolute value, the pivot is not the absolute value, instead it is the element with this absolute value.
Since we found out that $a_{41}^{(1)}$ should be the pivot, we interchange row 1 with row 4 in $\bar{A}^{(1)}$ and we obtain
$$\bar{A}^{(1)} \overset{L_1 \leftrightarrow L_4}{=} \left( \begin{array}{cccc|c} -3 & -5 & 0 & 15 & -6 \\ -1 & -1 & 5 & 9 & 14 \\ 2 & 6 & -7 & -10 & -2 \\ 1 & 3 & -2 & -4 & 3 \end{array} \right).$$
We now proceed with the Gaussian elimination method as usual (see the solving of system (2.13)). We choose $a_{11}^{(1)} = -3 \neq 0$ as pivot and we keep row 1 from $\bar{A}^{(1)}$. In the first column, under the pivot, the elements will be zero and the other elements are calculated using the ”rectangle rule”, $a_{ij}^{(2)} = \bigl(a_{11}^{(1)} a_{ij}^{(1)} - a_{i1}^{(1)} a_{1j}^{(1)}\bigr)/a_{11}^{(1)}$:
$$a_{22}^{(2)} = \frac{-3 \cdot (-1) - (-1) \cdot (-5)}{-3} = \frac{2}{3}, \quad a_{23}^{(2)} = \frac{-3 \cdot 5 - (-1) \cdot 0}{-3} = 5, \quad a_{24}^{(2)} = \frac{-3 \cdot 9 - (-1) \cdot 15}{-3} = 4, \quad a_{25}^{(2)} = \frac{-3 \cdot 14 - (-1) \cdot (-6)}{-3} = 16,$$
$$a_{32}^{(2)} = \frac{-3 \cdot 6 - 2 \cdot (-5)}{-3} = \frac{8}{3}, \quad a_{33}^{(2)} = \frac{-3 \cdot (-7) - 2 \cdot 0}{-3} = -7, \quad a_{34}^{(2)} = \frac{-3 \cdot (-10) - 2 \cdot 15}{-3} = 0, \quad a_{35}^{(2)} = \frac{-3 \cdot (-2) - 2 \cdot (-6)}{-3} = -6,$$
$$a_{42}^{(2)} = \frac{-3 \cdot 3 - 1 \cdot (-5)}{-3} = \frac{4}{3}, \quad a_{43}^{(2)} = \frac{-3 \cdot (-2) - 1 \cdot 0}{-3} = -2, \quad a_{44}^{(2)} = \frac{-3 \cdot (-4) - 1 \cdot 15}{-3} = 1, \quad a_{45}^{(2)} = \frac{-3 \cdot 3 - 1 \cdot (-6)}{-3} = 1.$$
Therefore, we obtain the matrix
$$\bar{A}^{(2)} = \left( \begin{array}{cccc|c} -3 & -5 & 0 & 15 & -6 \\ 0 & 2/3 & 5 & 4 & 16 \\ 0 & 8/3 & -7 & 0 & -6 \\ 0 & 4/3 & -2 & 1 & 1 \end{array} \right).$$
We search
$$\left| piv^{(2)} \right| = \max_{2 \le i \le 4} \left| a_{i2}^{(2)} \right| = \max \left\{ \left| a_{22}^{(2)} \right|, \left| a_{32}^{(2)} \right|, \left| a_{42}^{(2)} \right| \right\} = \max \left\{ \frac{2}{3}, \frac{8}{3}, \frac{4}{3} \right\} = \frac{8}{3} \;\Rightarrow\; piv^{(2)} = a_{32}^{(2)} = \frac{8}{3}.$$
We interchange row 2 with row 3 in $\bar{A}^{(2)}$ and we get
$$\bar{A}^{(2)} \overset{L_2 \leftrightarrow L_3}{=} \left( \begin{array}{cccc|c} -3 & -5 & 0 & 15 & -6 \\ 0 & 8/3 & -7 & 0 & -6 \\ 0 & 2/3 & 5 & 4 & 16 \\ 0 & 4/3 & -2 & 1 & 1 \end{array} \right).$$
We choose $a_{22}^{(2)} = 8/3$ as pivot and we keep row 1 and row 2 from $\bar{A}^{(2)}$. In the second column, under the pivot, the elements will be zero and the other elements are calculated using the ”rectangle rule”:
$$a_{33}^{(3)} = \frac{\frac{8}{3} \cdot 5 - \frac{2}{3} \cdot (-7)}{\frac{8}{3}} = \frac{27}{4}, \quad a_{34}^{(3)} = \frac{\frac{8}{3} \cdot 4 - \frac{2}{3} \cdot 0}{\frac{8}{3}} = 4, \quad a_{35}^{(3)} = \frac{\frac{8}{3} \cdot 16 - \frac{2}{3} \cdot (-6)}{\frac{8}{3}} = \frac{35}{2},$$
$$a_{43}^{(3)} = \frac{\frac{8}{3} \cdot (-2) - \frac{4}{3} \cdot (-7)}{\frac{8}{3}} = \frac{3}{2}, \quad a_{44}^{(3)} = \frac{\frac{8}{3} \cdot 1 - \frac{4}{3} \cdot 0}{\frac{8}{3}} = 1, \quad a_{45}^{(3)} = \frac{\frac{8}{3} \cdot 1 - \frac{4}{3} \cdot (-6)}{\frac{8}{3}} = 4.$$
Therefore, we obtain the matrix
$$\bar{A}^{(3)} = \left( \begin{array}{cccc|c} -3 & -5 & 0 & 15 & -6 \\ 0 & 8/3 & -7 & 0 & -6 \\ 0 & 0 & 27/4 & 4 & 35/2 \\ 0 & 0 & 3/2 & 1 & 4 \end{array} \right).$$
We search
$$\left| piv^{(3)} \right| = \max_{3 \le i \le 4} \left| a_{i3}^{(3)} \right| = \max \left\{ \left| a_{33}^{(3)} \right|, \left| a_{43}^{(3)} \right| \right\} = \max \left\{ \frac{27}{4}, \frac{3}{2} \right\} = \frac{27}{4} \;\Rightarrow\; piv^{(3)} = a_{33}^{(3)} = \frac{27}{4}.$$
At this step we do not need to interchange any rows. We choose the pivot $a_{33}^{(3)} = \frac{27}{4}$, and we keep rows 1, 2 and 3 from $\bar{A}^{(3)}$. In column 3, under the pivot, the elements will be zero and the other elements are calculated using the ”rectangle rule”:
$$a_{44}^{(4)} = \frac{a_{33}^{(3)} a_{44}^{(3)} - a_{43}^{(3)} a_{34}^{(3)}}{a_{33}^{(3)}} = \frac{\frac{27}{4} \cdot 1 - \frac{3}{2} \cdot 4}{\frac{27}{4}} = \frac{1}{9}, \qquad a_{45}^{(4)} = \frac{a_{33}^{(3)} a_{45}^{(3)} - a_{43}^{(3)} a_{35}^{(3)}}{a_{33}^{(3)}} = \frac{\frac{27}{4} \cdot 4 - \frac{3}{2} \cdot \frac{35}{2}}{\frac{27}{4}} = \frac{1}{9}.$$
We obtain the matrix
$$\bar{A}^{(4)} = \left( \begin{array}{cccc|c} -3 & -5 & 0 & 15 & -6 \\ 0 & 8/3 & -7 & 0 & -6 \\ 0 & 0 & 27/4 & 4 & 35/2 \\ 0 & 0 & 0 & 1/9 & 1/9 \end{array} \right).$$
The system corresponding to the matrix $\bar{A}^{(4)}$ is
$$\begin{cases} -3x_1 - 5x_2 + 15x_4 = -6 \\ \frac{8}{3}x_2 - 7x_3 = -6 \\ \frac{27}{4}x_3 + 4x_4 = \frac{35}{2} \\ \frac{1}{9}x_4 = \frac{1}{9}. \end{cases}$$
The solution components of the system are directly obtained by back substitution:
$$\begin{cases} x_4 = \frac{1}{9} \big/ \frac{1}{9} = 1 \\ x_3 = \left( \frac{35}{2} - 4x_4 \right) \big/ \frac{27}{4} = \left( \frac{35}{2} - 4 \cdot 1 \right) \big/ \frac{27}{4} = 2 \\ x_2 = (-6 + 7x_3) \big/ \frac{8}{3} = (-6 + 7 \cdot 2) \big/ \frac{8}{3} = 3 \\ x_1 = (-6 - 15x_4 + 5x_2)/(-3) = (-6 - 15 \cdot 1 + 5 \cdot 3)/(-3) = 2, \end{cases}$$
and therefore, the solution is
$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \\ 2 \\ 1 \end{pmatrix}.$$
Remark 7. If, at any step $k$ of the algorithm, we notice that
$$\left| piv^{(k)} \right| = \max_{k \le i \le n} \left| a_{ik}^{(k)} \right| = 0,$$
then we deduce that the system cannot admit a unique solution (either it is incompatible, or it has infinitely many solutions).
Remark 8. The Gaussian elimination method can also provide the value of the determinant of the matrix $A$:
$$\det(A) = (-1)^{N_p} \prod_{i=1}^{n} a_{ii}^{(n)},$$
where $N_p$ represents the total number of permutations (of rows or columns, as we will see below when presenting the Gauss method with total pivoting at every step).
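Continuing the earlier sketches (our own illustration, not from the course text), this determinant drops out of the elimination almost for free, provided we counted the interchanges:

```cpp
#include <cstddef>
#include <vector>

// det(A) = (-1)^{Np} * product of the diagonal of the triangularized matrix,
// where numSwaps = Np is the total number of row/column interchanges performed.
double determinant(const std::vector<std::vector<double>>& ext, int numSwaps) {
    double det = (numSwaps % 2 == 0) ? 1.0 : -1.0;
    for (std::size_t i = 0; i < ext.size(); ++i) det *= ext[i][i];
    return det;
}
```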
2.1.2 Gauss method with total pivoting at every step

The major difference between the Gauss elimination method with partial pivoting at every step and the Gauss elimination method with total pivoting is that, in the second variant, the role of the pivot at step $k$ will be played by the element that has the greatest absolute value among the elements of the square submatrix that has $a_{kk}$ as its first element. More exactly, the pivot at step $k$ should verify
$$\left| piv^{(k)} \right| = \max_{k \le i,j \le n} \left| a_{ij}^{(k)} \right|.$$
As a consequence, sometimes we will interchange columns too, and we should be very careful because every time we interchange column $k$ with column $j$ we change the order of the variables $x_k$ and $x_j$ too. Thus, in order to obtain in the end the right solution, we should memorize the interchanges of the columns and then perform the corresponding interchanges of the elements that are found to be the values of the unknown variables. So, from the beginning we notice a disadvantage of this method: it needs more memory space to retain all these permutations. What is the advantage then? It offers total numerical stability to the algorithm. As for the cost of the algorithm, it remains the same. Also, the formula for finding the value of the determinant of $A$ holds true.
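As a minimal sketch (ours, not from the course text), the pivot search at step $k$ and the bookkeeping of the column interchanges could look as follows in C++; `colSwaps` records, for every column swap, which components of the solution must be interchanged back at the end:

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Total pivoting at step k: search the largest |a_{ij}|, k <= i,j < n,
// switch its row with row k and its column with column k, and remember
// the column switch so the solution components can be reordered later.
// ext is the n x (n+1) extended matrix; colSwaps collects (k, j) pairs.
void totalPivot(std::vector<std::vector<double>>& ext,
                std::vector<std::pair<int, int>>& colSwaps, int k) {
    const int n = static_cast<int>(ext.size());
    int pi = k, pj = k;
    for (int i = k; i < n; ++i)
        for (int j = k; j < n; ++j)
            if (std::abs(ext[i][j]) > std::abs(ext[pi][pj])) { pi = i; pj = j; }
    if (pi != k) std::swap(ext[k], ext[pi]);           // row interchange
    if (pj != k) {                                     // column interchange
        for (int i = 0; i < n; ++i) std::swap(ext[i][k], ext[i][pj]);
        colSwaps.push_back({k, pj});                   // x_k and x_{pj} traded places
    }
}

// After back substitution, undo the column swaps in reverse order.
void reorderSolution(std::vector<double>& x,
                     const std::vector<std::pair<int, int>>& colSwaps) {
    for (auto it = colSwaps.rbegin(); it != colSwaps.rend(); ++it)
        std::swap(x[it->first], x[it->second]);
}
```

In Example 6 below, `colSwaps` would hold $C_1 \leftrightarrow C_2$, $C_2 \leftrightarrow C_4$, $C_3 \leftrightarrow C_4$, and `reorderSolution` undoes them in reverse order, exactly as done by hand there.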
Example 6. a) Solve the following system by using the Gauss method with total pivoting at every step
$$\begin{cases} -2x_1 + x_3 = 1 \\ x_1 + 4x_2 + x_4 = -3 \\ 2x_1 - 3x_4 = -3 \\ -2x_1 + x_3 + x_4 = 2. \end{cases}$$
b) Find the value of the determinant of the matrix A, corresponding to the above system.
Proof. a) The extended matrix corresponding to the system is
$$\bar{A}^{(1)} = \bar{A} = \left( \begin{array}{cccc|c} -2 & 0 & 1 & 0 & 1 \\ 1 & 4 & 0 & 1 & -3 \\ 2 & 0 & 0 & -3 & -3 \\ -2 & 0 & 1 & 1 & 2 \end{array} \right).$$
We search the element with the property
$$\left| piv^{(1)} \right| = \max_{1 \le i,j \le 4} \left| a_{ij}^{(1)} \right| = \left| a_{22}^{(1)} \right| = 4,$$
so the pivot $piv^{(1)}$ should be $a_{22}^{(1)} = 4$.
First we interchange row 1 with row 2 (and we have seen that this does not have any impact on the solution of the system), and secondly, we interchange column 1 with column 2 in $\bar{A}^{(1)}$, but this also means that we interchange $x_1$ with $x_2$, so this should be remembered at the end! We obtain the matrix
$$\bar{A}^{(1)} \overset{L_1 \leftrightarrow L_2}{=} \left( \begin{array}{cccc|c} 1 & 4 & 0 & 1 & -3 \\ -2 & 0 & 1 & 0 & 1 \\ 2 & 0 & 0 & -3 & -3 \\ -2 & 0 & 1 & 1 & 2 \end{array} \right) \overset{C_1 \leftrightarrow C_2}{=} \left( \begin{array}{cccc|c} 4 & 1 & 0 & 1 & -3 \\ 0 & -2 & 1 & 0 & 1 \\ 0 & 2 & 0 & -3 & -3 \\ 0 & -2 & 1 & 1 & 2 \end{array} \right).$$
We choose $a_{11}^{(1)} = 4 \neq 0$ as pivot and, since all the elements under the pivot are already zero, we remark that
$$\bar{A}^{(2)} = \bar{A}^{(1)}.$$
We search the element with the property
$$\left| piv^{(2)} \right| = \max_{2 \le i,j \le 4} \left| a_{ij}^{(2)} \right| = \left| a_{34}^{(2)} \right| = |-3| = 3,$$
so the pivot should be $a_{34}^{(2)} = -3$.
We interchange row 2 with row 3, and we interchange column 2 with column 4 in $\bar{A}^{(2)}$, and we obtain
$$\bar{A}^{(2)} \overset{L_2 \leftrightarrow L_3}{=} \left( \begin{array}{cccc|c} 4 & 1 & 0 & 1 & -3 \\ 0 & 2 & 0 & -3 & -3 \\ 0 & -2 & 1 & 0 & 1 \\ 0 & -2 & 1 & 1 & 2 \end{array} \right) \overset{C_2 \leftrightarrow C_4}{=} \left( \begin{array}{cccc|c} 4 & 1 & 0 & 1 & -3 \\ 0 & -3 & 0 & 2 & -3 \\ 0 & 0 & 1 & -2 & 1 \\ 0 & 1 & 1 & -2 & 2 \end{array} \right).$$
We choose $a_{22}^{(2)} = -3 \neq 0$ as pivot and we keep rows 1 and 2 from $\bar{A}^{(2)}$. In the second column, under the pivot, the elements will be zero and the other elements are calculated using the ”rectangle rule”, and we obtain
$$\bar{A}^{(3)} = \left( \begin{array}{cccc|c} 4 & 1 & 0 & 1 & -3 \\ 0 & -3 & 0 & 2 & -3 \\ 0 & 0 & 1 & -2 & 1 \\ 0 & 0 & 1 & -\frac{4}{3} & 1 \end{array} \right).$$
We search the element with the property
$$\left| piv^{(3)} \right| = \max_{3 \le i,j \le 4} \left| a_{ij}^{(3)} \right| = \left| a_{34}^{(3)} \right| = |-2| = 2.$$
We interchange column 3 with column 4 in $\bar{A}^{(3)}$ and we obtain
$$\bar{A}^{(3)} \overset{C_3 \leftrightarrow C_4}{=} \left( \begin{array}{cccc|c} 4 & 1 & 1 & 0 & -3 \\ 0 & -3 & 2 & 0 & -3 \\ 0 & 0 & -2 & 1 & 1 \\ 0 & 0 & -\frac{4}{3} & 1 & 1 \end{array} \right).$$
We choose a33 = −2 6= 0 pivot and we get

4 1
1 0
 0 −3 2 0
A(4) = 
 0 0 −2 1
0 0
0 13

−3
−3 
.
1 
1
3
We deduce, by the backward substitution method, the intermediate solution
$$\begin{cases} x_4 = 1 \\ x_3 = 0 \\ x_2 = 1 \\ x_1 = -1, \end{cases}$$
and we interchange the components in the following order:
• component 3 ↔ component 4 (since $C_3 \leftrightarrow C_4$)
• component 2 ↔ component 4 (since $C_2 \leftrightarrow C_4$)
• component 1 ↔ component 2 (since $C_1 \leftrightarrow C_2$).
More precisely, we have
$$\begin{cases} x_1 = -1 \\ x_2 = 1 \\ x_3 = 0 \\ x_4 = 1 \end{cases} \overset{C_3 \leftrightarrow C_4}{\Rightarrow} \begin{cases} x_1 = -1 \\ x_2 = 1 \\ x_3 = 1 \\ x_4 = 0 \end{cases} \overset{C_2 \leftrightarrow C_4}{\Rightarrow} \begin{cases} x_1 = -1 \\ x_2 = 0 \\ x_3 = 1 \\ x_4 = 1 \end{cases} \overset{C_1 \leftrightarrow C_2}{\Rightarrow} \begin{cases} x_1 = 0 \\ x_2 = -1 \\ x_3 = 1 \\ x_4 = 1. \end{cases}$$
So the solution of the system is
$$\begin{cases} x_1 = 0 \\ x_2 = -1 \\ x_3 = 1 \\ x_4 = 1. \end{cases}$$
b) Since we performed 2 row interchanges and 3 column interchanges,
$$\det(A) = (-1)^{2+3}\, a_{11}^{(4)} \cdot a_{22}^{(4)} \cdot a_{33}^{(4)} \cdot a_{44}^{(4)} = -4 \cdot (-3) \cdot (-2) \cdot \frac{1}{3} = -8.$$
We present next another direct method for solving linear systems.
2.2 L − R Decomposition Method
This method is also called the L − U decomposition method, and instead of ”decomposition method” we sometimes say ”factorization method”. The general idea is to solve a linear system of the form
$$A \cdot x = b, \tag{2.14}$$
where $A \in \mathcal{M}_{n \times n}(\mathbb{R})$ and $b \in \mathcal{M}_{n \times 1}(\mathbb{R})$, by finding two matrices $L, R \in \mathcal{M}_{n \times n}(\mathbb{R})$ such that
$$A = L \cdot R,$$
where $L$ is a lower-triangular matrix, which means that all its nonzero elements are situated at the left of the main diagonal (hence the ”L” from the notation comes from ”left”), and $R$ is an upper-triangular matrix, which means that all its nonzero elements are situated at the right of the main diagonal (hence the ”R” from the notation comes from ”right”). On the other hand, some prefer to denote these matrices by $L$ from ”lower-triangular”, respectively by $U$ from ”upper-triangular”; thus, as we said, this method is also called the ”L − U Decomposition Method”.
Note that there is an additional property to be fulfilled: one of the matrices $L$ or $R$ should have only 1 on the main diagonal. Depending on which matrix has this property, we distinguish two variants of this method. This way, if




$$L = \begin{pmatrix} 1 & 0 & \dots & 0 \\ l_{21} & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \dots & 1 \end{pmatrix} \quad \text{and} \quad R = \begin{pmatrix} r_{11} & r_{12} & \dots & r_{1n} \\ 0 & r_{22} & \dots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & r_{nn} \end{pmatrix}, \tag{2.15}$$
then the method bears the name Doolittle factorization method.
Remark 9. A matrix $L$ taken as above is called a unit lower triangular matrix.
On the other hand, if
$$L = \begin{pmatrix} l_{11} & 0 & \dots & 0 \\ l_{21} & l_{22} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \dots & l_{nn} \end{pmatrix} \quad \text{and} \quad R = \begin{pmatrix} 1 & r_{12} & \dots & r_{1n} \\ 0 & 1 & \dots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix}, \tag{2.16}$$
then this method is called the Croût factorization method.
Remark 10. A matrix $R$ taken as above is called a unit upper triangular matrix.
Why is it better to work with a unit lower triangular matrix or a unit upper triangular matrix instead of a regular lower triangular matrix and a regular upper triangular matrix? (This question is addressed to the students, after they complete the laboratory implementation of the L − R factorization method.)
In what follows, we would like to clarify two aspects:
1. how do we use the decomposition A = L · R to solve the system (2.14).
2. how do we find L and R.
For the first aspect we notice that
A · x = b ⇔ (L · R)x = b ⇔ L · (Rx) = b.
We denote Rx = y and we solve the system Ly = b by using the direct substitution method,
due to the fact that L is a lower triangular matrix. After finding the solution y in this way, we
solve the system Rx = y by using the backward substitution method.
For the clarification of the second aspect, the answer depends on which method we use.
2.2.1 The L − R Doolittle factorization method
Let us multiply $L$ by $R$, where $L$ and $R$ are given by (2.15). Then, by making the identification of the elements so that $L \cdot R = A$, we obtain
$$\sum_{h=1}^{\min\{i,j\}} l_{ih} \cdot r_{hj} = a_{ij}, \quad \text{for all } 1 \le i, j \le n,$$
and we find the following formulae
$$\begin{cases} r_{1j} = a_{1j}, & \text{for all } 1 \le j \le n \\ l_{i1} = a_{i1}/r_{11}, & \text{for all } 2 \le i \le n \\ r_{kj} = a_{kj} - \displaystyle\sum_{h=1}^{k-1} l_{kh} \cdot r_{hj}, & \text{for all } 2 \le k \le n, \; k \le j \le n \\ l_{ik} = \left( a_{ik} - \displaystyle\sum_{h=1}^{k-1} l_{ih} \cdot r_{hk} \right) \Big/ r_{kk}, & \text{for all } 2 \le k \le n, \; k+1 \le i \le n. \end{cases} \tag{2.17}$$
We observe that, by these formulae, we first calculate a row from $R$, then a column from $L$, then another row from $R$, then another column from $L$, and so on until we find out all the elements of the matrices $L$ and $R$. More specifically, the first row of $R$ is in fact the first row of $A$. The first column of $L$, starting with its second element, is obtained by dividing the first column of $A$ by $r_{11}$. For the second row of $R$ we perform:
$$r_{22} = a_{22} - l_{21} \cdot r_{12}, \quad r_{23} = a_{23} - l_{21} \cdot r_{13}, \quad r_{24} = a_{24} - l_{21} \cdot r_{14}.$$
For the second column of $L$, we have to divide by $r_{22}$:
$$l_{32} = (a_{32} - l_{31} \cdot r_{12})/r_{22}.$$
We do not continue in this manner because we will solve in detail an example below. But first we want to make a very important remark.
Remark 11. It is obvious that, in order to be able to apply this method, it is necessary to have $r_{kk} \neq 0$ at every step $k$. This actually means that, before applying the L − R factorization method (be it Doolittle, be it Croût, since at the Croût method we have to divide by $l_{kk} \neq 0$ at every step $k$), we have to verify that all the principal minors of the matrix $A$ are nonzero.
Definition 5. Let $C \in \mathcal{M}_{n \times n}(\mathbb{R})$. By the principal minors of $C$ (also called diagonal minors of $C$) we understand the following determinants:
$$\Delta_1 = c_{11}, \; \text{the principal minor of first order},$$
$$\Delta_2 = \begin{vmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{vmatrix}, \; \text{the principal minor of second order},$$
$$\Delta_3 = \begin{vmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{vmatrix}, \; \text{the principal minor of third order},$$
$$\vdots$$
$$\Delta_{n-1} = \begin{vmatrix} c_{11} & c_{12} & \dots & c_{1,n-1} \\ c_{21} & c_{22} & \dots & c_{2,n-1} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n-1,1} & c_{n-1,2} & \dots & c_{n-1,n-1} \end{vmatrix}, \; \text{the principal minor of order } n-1,$$
$$\Delta_n = \det(C), \; \text{the principal minor of order } n.$$
Now that we have clarified what we understand by a principal (or diagonal) minor of a matrix,
let us ask ourselves: what happens if one of the principal minors of A is zero? Does it mean
that we cannot apply the L − R factorization method? Not at all, it only means that we have
to interchange two rows of the extended matrix: row k (when at step k) and a row below row
k. Then we verify again if all the principal minors of the new A are non-zero, and, if the answer
is positive, we apply the L − R decomposition method.
We give an explicit example of solving a linear system by applying the L − R Doolittle factorization method.
Example 7. Solve the following system by using the L − R Doolittle factorization method
$$\begin{cases} x_1 + 2x_2 - x_3 = 7 \\ 2x_1 - 4x_2 + 10x_3 = 18 \\ -x_1 - 2x_2 + 5x_3 = -7. \end{cases}$$
The extended matrix corresponding to the above system is
$$\bar{A} = \left( \begin{array}{ccc|c} 1 & 2 & -1 & 7 \\ 2 & -4 & 10 & 18 \\ -1 & -2 & 5 & -7 \end{array} \right).$$
Before applying the method, we make sure that it is going to work by calculating the diagonal minors of matrix $A$:
$$\Delta_1 = 1 \neq 0, \quad \Delta_2 = \begin{vmatrix} 1 & 2 \\ 2 & -4 \end{vmatrix} = -8 \neq 0, \quad \Delta_3 = \det(A) = \begin{vmatrix} 1 & 2 & -1 \\ 2 & -4 & 10 \\ -1 & -2 & 5 \end{vmatrix} = -32 \neq 0.$$
Since $\Delta_1, \Delta_2, \Delta_3 \neq 0$, the matrix $A$ admits an $LR$ factorization. More precisely, we search two matrices
$$L = \begin{pmatrix} 1 & 0 & 0 \\ l_{21} & 1 & 0 \\ l_{31} & l_{32} & 1 \end{pmatrix} \quad \text{and} \quad R = \begin{pmatrix} r_{11} & r_{12} & r_{13} \\ 0 & r_{22} & r_{23} \\ 0 & 0 & r_{33} \end{pmatrix},$$
such that $L \cdot R = A$.
such that L · R = A.
At the first step, we compute the elements of the first row of the matrix R using the first
formula from (2.17):
r11 = a11 = 1.
r12 = a12 = 2.
r13 = a13 = −1.
Next, we determine the elements of the first column of the matrix L using the second formula
from (2.17):
21
l21 = ar11
= 21 = 2.
31
l31 = ar11
= −1
= −1.
1
At the second step, we calculate the elements of second row of the matrix R:
r22 = a22 − l21 r12 = −2 − 2 · 2 = −8.
r23 = a23 − l21 r13 = 10 − 2 · (−1) = 12.
Then we calculate the unknown element of the second column of L, that is l32 :
l32 = (a32 − l31 r12 ) /r22 = (−2 − (−1) · 2) /(−8) = 0.
At our final, we determine the unknown element of row 3 of the matrix R:
r33 = a33 − l31 r13 − l32 r23 = 5 − (−1) · (−1) − 0 · 12 = 4.
So, we get


1 0 0
L= 2 1 0 
−1 0 1


1 2 −1
R =  0 −8 12 
0 0
4
Remark 12. While writing the algorithm of this method in C++ (or another programming language), we notice that the matrix $A$ goes through the following transformations:
$$A = \begin{pmatrix} 1 & 2 & -1 \\ 2 & -4 & 10 \\ -1 & -2 & 5 \end{pmatrix} \mapsto \begin{pmatrix} 1 & 2 & -1 \\ 2 & -4 & 10 \\ -1 & -2 & 5 \end{pmatrix} \mapsto \begin{pmatrix} 1 & 2 & -1 \\ 2 & -8 & 12 \\ -1 & 0 & 5 \end{pmatrix} \mapsto \begin{pmatrix} 1 & 2 & -1 \\ 2 & -8 & 12 \\ -1 & 0 & 4 \end{pmatrix}.$$
Exercise for students: Please explain the previous Remark.
Moving further with the solving of the system, we denote $Rx = y$, since $A = L \cdot R$ and $Ax = b$ is equivalent to $L(Rx) = b$. We solve the system $Ly = b$, that is,
$$\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 7 \\ 18 \\ -7 \end{pmatrix}, \tag{2.18}$$
by the direct substitution method, because (2.18) is the matrix form of the system
$$\begin{cases} y_1 = 7 \\ 2y_1 + y_2 = 18 \\ -y_1 + y_3 = -7. \end{cases}$$
We substitute $y_1 = 7$ in the second equation and we get $y_2 = 18 - 2y_1 = 4$. We substitute $y_1$ in the third equation and we get $y_3 = 0$.

Remark 13. Normally, we would substitute $y_1$ and $y_2$ in the third equation, but in this particular situation $y_2$ is missing from the third equation of the system.

We have obtained that $y = \begin{pmatrix} 7 \\ 4 \\ 0 \end{pmatrix}$.
Now we solve the system $Rx = y$, that is,
$$\begin{pmatrix} 1 & 2 & -1 \\ 0 & -8 & 12 \\ 0 & 0 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 7 \\ 4 \\ 0 \end{pmatrix},$$
which is equivalent to
$$\begin{cases} x_1 + 2x_2 - x_3 = 7 \\ -8x_2 + 12x_3 = 4 \\ 4x_3 = 0. \end{cases}$$
From the last equation we get that $x_3 = 0$, and we solve the system by the backward substitution method. So, we substitute $x_3 = 0$ in the second equation and we get $x_2 = -\frac{1}{2}$. Then we substitute the values of $x_3$ and $x_2$ into the first equation and we get $x_1 = 8$. Hence, we have determined the solution of the system,
$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 8 \\ -\frac{1}{2} \\ 0 \end{pmatrix}.$$
By looking closely at the above calculations, one can see that we have applied the following formulae to determine the intermediate solution of the system $L \cdot y = b$:
$$\begin{cases} y_1 = b_1, \\ y_i = b_i - \displaystyle\sum_{k=1}^{i-1} l_{ik} \cdot y_k, & i = 2, 3, ..., n. \end{cases} \tag{2.19}$$
Moreover, the system $R \cdot x = y$ has the following solution
$$\begin{cases} x_n = y_n/r_{nn}, \\ x_i = \left( y_i - \displaystyle\sum_{k=i+1}^{n} r_{ik} \cdot x_k \right) \Big/ r_{ii}, & i = n-1, n-2, ..., 1. \end{cases} \tag{2.20}$$
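A minimal C++ sketch of the whole Doolittle solve (our own illustration; names and storage layout are assumptions, not from the course text), combining the factorization formulae (2.17) with the substitutions (2.19) and (2.20):

```cpp
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Doolittle factorization (2.17) followed by the substitutions (2.19)-(2.20).
// Assumes all principal minors of A are nonzero (see Remark 11), so that
// no row interchanges are needed.
std::vector<double> doolittleSolve(const Matrix& A, std::vector<double> b) {
    const int n = static_cast<int>(A.size());
    Matrix L(n, std::vector<double>(n, 0.0)), R(n, std::vector<double>(n, 0.0));
    for (int k = 0; k < n; ++k) {
        L[k][k] = 1.0;                       // unit lower triangular
        for (int j = k; j < n; ++j) {        // row k of R
            double s = A[k][j];
            for (int h = 0; h < k; ++h) s -= L[k][h] * R[h][j];
            R[k][j] = s;
        }
        for (int i = k + 1; i < n; ++i) {    // column k of L
            double s = A[i][k];
            for (int h = 0; h < k; ++h) s -= L[i][h] * R[h][k];
            L[i][k] = s / R[k][k];
        }
    }
    std::vector<double> y(n), x(n);
    for (int i = 0; i < n; ++i) {            // L y = b, direct substitution (2.19)
        y[i] = b[i];
        for (int k = 0; k < i; ++k) y[i] -= L[i][k] * y[k];
    }
    for (int i = n - 1; i >= 0; --i) {       // R x = y, back substitution (2.20)
        x[i] = y[i];
        for (int k = i + 1; k < n; ++k) x[i] -= R[i][k] * x[k];
        x[i] /= R[i][i];
    }
    return x;
}
```

Assuming no slips on our side, for Example 7 above, `doolittleSolve({{1,2,-1},{2,-4,10},{-1,-2,5}}, {7,18,-7})` returns $(8, -\frac{1}{2}, 0)$.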
Example: Solve the following system with the $LR$ factorization
$$\begin{cases} -x_1 + 2x_2 + 3x_3 = -8 \\ x_1 - 2x_2 - x_3 = 4 \\ -2x_1 + 6x_2 + 6x_3 = -14. \end{cases}$$
Proof. The extended matrix is
$$\bar{A} = \left( \begin{array}{ccc|c} -1 & 2 & 3 & -8 \\ 1 & -2 & -1 & 4 \\ -2 & 6 & 6 & -14 \end{array} \right).$$
We check if the principal minors of matrix $A$ are nonzero:
$$\Delta_1 = -1 \neq 0, \qquad \Delta_2 = \begin{vmatrix} -1 & 2 \\ 1 & -2 \end{vmatrix} = 0.$$
Since the determinant $\Delta_2 = 0$, we interchange row 2 with row 3 in the extended matrix $\bar{A}$, and we obtain
$$\bar{A} \overset{L_2 \leftrightarrow L_3}{=} \left( \begin{array}{ccc|c} -1 & 2 & 3 & -8 \\ -2 & 6 & 6 & -14 \\ 1 & -2 & -1 & 4 \end{array} \right).$$
We have
$$\Delta_1 = -1 \neq 0, \quad \Delta_2 = \begin{vmatrix} -1 & 2 \\ -2 & 6 \end{vmatrix} = -2 \neq 0, \quad \Delta_3 = \det(A) = \begin{vmatrix} -1 & 2 & 3 \\ -2 & 6 & 6 \\ 1 & -2 & -1 \end{vmatrix} = -4 \neq 0.$$
Since $\Delta_1, \Delta_2, \Delta_3 \neq 0$, the matrix $A = \begin{pmatrix} -1 & 2 & 3 \\ -2 & 6 & 6 \\ 1 & -2 & -1 \end{pmatrix}$ admits an L − R factorization. More precisely, we search two matrices
$$L = \begin{pmatrix} 1 & 0 & 0 \\ l_{21} & 1 & 0 \\ l_{31} & l_{32} & 1 \end{pmatrix} \quad \text{and} \quad R = \begin{pmatrix} r_{11} & r_{12} & r_{13} \\ 0 & r_{22} & r_{23} \\ 0 & 0 & r_{33} \end{pmatrix},$$
such that $L \cdot R = A$.
In what follows, if we multiply the row $i$ of matrix $L$ with the column $j$ of matrix $R$, we will denote this by $L_i(L) \times C_j(R)$.
We determine the elements of the first row of the matrix $R$:
$$L_1(L) \times C_1(R) \Rightarrow r_{11} = a_{11} = -1, \quad L_1(L) \times C_2(R) \Rightarrow r_{12} = a_{12} = 2, \quad L_1(L) \times C_3(R) \Rightarrow r_{13} = a_{13} = 3.$$
We determine the elements of the first column of the matrix $L$:
$$L_2(L) \times C_1(R) \Rightarrow l_{21} r_{11} = a_{21} \Rightarrow l_{21} = \frac{a_{21}}{r_{11}} = \frac{-2}{-1} = 2,$$
$$L_3(L) \times C_1(R) \Rightarrow l_{31} r_{11} = a_{31} \Rightarrow l_{31} = \frac{a_{31}}{r_{11}} = \frac{1}{-1} = -1.$$
We determine the elements of the second row of the matrix $R$:
$$L_2(L) \times C_2(R) \Rightarrow l_{21} r_{12} + r_{22} = a_{22} \Rightarrow r_{22} = a_{22} - l_{21} r_{12} = 6 - 2 \cdot 2 = 2,$$
$$L_2(L) \times C_3(R) \Rightarrow l_{21} r_{13} + r_{23} = a_{23} \Rightarrow r_{23} = a_{23} - l_{21} r_{13} = 6 - 2 \cdot 3 = 0.$$
We determine the element of the second column of the matrix $L$:
$$L_3(L) \times C_2(R) \Rightarrow l_{31} r_{12} + l_{32} r_{22} = a_{32} \Rightarrow l_{32} = (a_{32} - l_{31} r_{12})/r_{22} = (-2 - (-1) \cdot 2)/2 = 0.$$
We determine the element of row 3 of the matrix $R$:
$$L_3(L) \times C_3(R) \Rightarrow l_{31} r_{13} + l_{32} r_{23} + r_{33} = a_{33} \Rightarrow r_{33} = a_{33} - l_{31} r_{13} - l_{32} r_{23} = -1 - (-1) \cdot 3 - 0 \cdot 0 = 2.$$
So, we get
$$L = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}, \qquad R = \begin{pmatrix} -1 & 2 & 3 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}.$$
Our system $Ax = b$ is equivalent to
$$L \cdot R \cdot x = b.$$
If we denote $R \cdot x = y$, where $y \in \mathbb{R}^3$, in order to obtain the solution $x \in \mathbb{R}^3$ we need to solve the following two triangular systems:
$$(S1)\; L \cdot y = b, \qquad (S2)\; R \cdot x = y.$$
The lower-triangular system (S1) is equivalent to
$$\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} -8 \\ -14 \\ 4 \end{pmatrix}.$$
We remark that the free term $b$ is taken from the extended matrix in which we interchanged the rows. The solution $y$ is obtained by direct substitution:
$$\begin{cases} y_1 = -8 \\ y_2 = -14 - 2y_1 = 2 \\ y_3 = 4 + y_1 - 0 \cdot y_2 = -4. \end{cases}$$
The upper-triangular system (S2) is equivalent to
$$\begin{pmatrix} -1 & 2 & 3 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -8 \\ 2 \\ -4 \end{pmatrix}.$$
The solution $x$ is obtained by back substitution:
$$\begin{cases} x_3 = -4/2 = -2 \\ x_2 = (2 - 0 \cdot x_3)/2 = 1 \\ x_1 = (-8 - 3x_3 - 2x_2)/(-1) = 4. \end{cases}$$
So, the solution of the system is
$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 4 \\ 1 \\ -2 \end{pmatrix}.$$
2.2.2 The L − R Croût factorization method

As the reader probably already understands, the L − R Croût factorization method is quite similar to the L − R Doolittle method. Thus, we multiply $L$ by $R$, where $L$ and $R$ are given by (2.16). Then, by identifying the equal elements from the two equal matrices $L \cdot R$ and $A$, we arrive at the following formulae:

$$\begin{cases} l_{i1} = a_{i1}, & \text{for all } 1 \le i \le n \\ r_{1j} = a_{1j}/l_{11}, & \text{for all } 2 \le j \le n \\ l_{ik} = a_{ik} - \displaystyle\sum_{h=1}^{k-1} l_{ih} \cdot r_{hk}, & \text{for all } 2 \le k \le n, \; k \le i \le n \\ r_{kj} = \left( a_{kj} - \displaystyle\sum_{h=1}^{k-1} l_{kh} \cdot r_{hj} \right) \Big/ l_{kk}, & \text{for all } 2 \le k \le n, \; k+1 \le j \le n. \end{cases} \tag{2.21}$$
Once we have found the matrices $L$ and $R$, we replace, as usual, $A$ by $L \cdot R$ in the matrix form of the system, so we have
$$L(Rx) = b,$$
and, by denoting $Rx = y$, we solve the system $Ly = b$ by the direct substitution method, applying the formulae
$$\begin{cases} y_1 = b_1/l_{11}, \\ y_i = \left( b_i - \displaystyle\sum_{k=1}^{i-1} l_{ik} \cdot y_k \right) \Big/ l_{ii}, & i = 2, 3, ..., n. \end{cases}$$
After finding the value of $y$, we use the backward substitution method to solve the system $R \cdot x = y$, applying the formulae
$$\begin{cases} x_n = y_n, \\ x_i = y_i - \displaystyle\sum_{k=i+1}^{n} r_{ik} \cdot x_k, & i = n-1, n-2, ..., 1. \end{cases}$$
Remark 14. Before starting to solve a linear system by the L − R Croût factorization method
we test the principal minors to see if they are non-zero, as explained before.
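For comparison with the Doolittle sketch above, a minimal Croût factorization in C++ could look as follows (again our own illustration; the helper name and layout are assumptions). Only the factorization differs from Doolittle; the two triangular solves then follow the formulae above:

```cpp
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Croût factorization (2.21): L has the general lower-triangular form and
// R is unit upper triangular. Assumes all principal minors of A are nonzero.
void croutFactorize(const Matrix& A, Matrix& L, Matrix& R) {
    const int n = static_cast<int>(A.size());
    L.assign(n, std::vector<double>(n, 0.0));
    R.assign(n, std::vector<double>(n, 0.0));
    for (int k = 0; k < n; ++k) {
        R[k][k] = 1.0;                       // unit upper triangular
        for (int i = k; i < n; ++i) {        // column k of L
            double s = A[i][k];
            for (int h = 0; h < k; ++h) s -= L[i][h] * R[h][k];
            L[i][k] = s;
        }
        for (int j = k + 1; j < n; ++j) {    // row k of R
            double s = A[k][j];
            for (int h = 0; h < k; ++h) s -= L[k][h] * R[h][j];
            R[k][j] = s / L[k][k];
        }
    }
}
```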
Exercise 1. Solve the previous system by applying the L − R Croût factorization method.
In addition to solving the linear system $Ax = b$, there are some other applications of the L − R factorization method.
Further applications:
1. We can find the value of the determinant of matrix $A$. Indeed, since $A = LR$, we have that $\det A = \det L \cdot \det R$, thus
$$\det A = \prod_{i=1}^{n} (l_{ii} \cdot r_{ii}).$$
2. We can find the inverse of matrix $A$, that is, $A^{-1}$. More exactly, we denote
$$A^{-1} = X = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1n} \\ x_{21} & x_{22} & \dots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \dots & x_{nn} \end{pmatrix},$$
and, by taking into consideration that
$$A \cdot A^{-1} = I_n = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix},$$
solving $A \cdot X = I_n$ reduces to solving $n$ systems:
$$A \cdot \begin{pmatrix} x_{11} \\ x_{21} \\ \vdots \\ x_{n1} \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad A \cdot \begin{pmatrix} x_{12} \\ x_{22} \\ \vdots \\ x_{n2} \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \dots, \quad A \cdot \begin{pmatrix} x_{1n} \\ x_{2n} \\ \vdots \\ x_{nn} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}.$$
To make a comparison between the L − R factorization method and the Gauss elimination method, we first notice that they both have an algorithm cost of order $\frac{2}{3}n^3$, and we have to say that, among all the factorization methods, the L − R method has the ”cheapest” cost. The advantage of the L − R factorization method is that it allows us to solve in an easier manner multiple systems of the form $Ax = b$, where the matrix $A$ stays the same and only $b$ changes (that was the case of the calculation of the inverse matrix $A^{-1}$ presented above). On the other hand, the Gauss elimination method with total pivoting has total numerical stability, while nothing guarantees the stability of the L − R factorization method unless we use a partial or total pivoting for this method too.
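A brief sketch of this multiple right-hand side advantage (ours, not from the course text; it assumes the `doolittleSolve` sketch from Subsection 2.2.1 is in scope): to invert $A$ we solve the $n$ systems $A \cdot x_j = e_j$.

```cpp
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Inverse of A via n triangular solves. For brevity, doolittleSolve refactors
// A on each call; in practice one would factor A once and only repeat the
// substitutions (2.19)-(2.20) for each right-hand side.
Matrix inverse(const Matrix& A) {
    const int n = static_cast<int>(A.size());
    Matrix X(n, std::vector<double>(n));
    for (int j = 0; j < n; ++j) {
        std::vector<double> e(n, 0.0);
        e[j] = 1.0;                                    // j-th column of I_n
        std::vector<double> col = doolittleSolve(A, e);
        for (int i = 0; i < n; ++i) X[i][j] = col[i];  // j-th column of A^{-1}
    }
    return X;
}
```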
Exercise 2. Solve the following linear system using first the L − R Doolittle factorization method and then the L − R Croût factorization method:
$$\begin{cases} x_1 + 2x_2 - x_3 + x_4 = 6 \\ 2x_1 - 2x_3 - x_4 = 0 \\ -3x_1 - 6x_2 + 3x_3 + 2x_4 = 2 \\ 3x_1 + 6x_2 - x_3 - 3x_4 = -8. \end{cases}$$
We discuss next a particular case of the L − R factorization method.
2.2.3 The L − R factorization method for tridiagonal matrices
Definition 6. By a tridiagonal matrix we understand a matrix that has nonzero elements only on the main diagonal, on the diagonal above the main diagonal, and on the diagonal below the main diagonal. More exactly, a matrix $A \in \mathcal{M}_{n \times n}(\mathbb{R})$ is called tridiagonal if it has the form
$$A = \begin{pmatrix} a_1 & b_1 & 0 & 0 & \dots & 0 & 0 \\ c_1 & a_2 & b_2 & 0 & \dots & 0 & 0 \\ 0 & c_2 & a_3 & b_3 & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \dots & a_{n-1} & b_{n-1} \\ 0 & 0 & 0 & 0 & \dots & c_{n-1} & a_n \end{pmatrix}.$$
38
CHAPTER 2. SOLVING LINEAR SYSTEMS - DIRECT METHODS
Obviously, a linear system of the form Ax = b, where A is a tridiagonal matrix, can be solved by the usual L − R factorization method described in the previous subsections. However, since there are so many zeros in a matrix of this particular form, we search for matrices L and R of a particular form too:

\[
L = \begin{pmatrix}
1 & 0 & 0 & \ldots & 0 & 0 \\
l_1 & 1 & 0 & \ldots & 0 & 0 \\
0 & l_2 & 1 & \ldots & 0 & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \ldots & l_{n-1} & 1
\end{pmatrix}
\quad\text{and}\quad
R = \begin{pmatrix}
r_1 & s_1 & 0 & \ldots & 0 & 0 \\
0 & r_2 & s_2 & \ldots & 0 & 0 \\
0 & 0 & r_3 & \ldots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & 0 & \ldots & r_{n-1} & s_{n-1} \\
0 & 0 & 0 & \ldots & 0 & r_n
\end{pmatrix}.
\]
The form of the matrices makes the calculus much simpler, since we only have to find the value
of the elements ri , where i ∈ {1, . . . , n}, and li , si , where i ∈ {1, . . . , n − 1}. By performing
the calculus L · R and by taking into consideration the fact that L · R = A, we arrive at the
following formulae:
(2.22)
\[
\begin{cases}
r_1 = a_1, \\
s_i = b_i, & 1 \le i \le n-1, \\
l_i = \dfrac{c_i}{r_i}, & 1 \le i \le n-1, \\
r_{i+1} = a_{i+1} - l_i s_i, & 1 \le i \le n-1.
\end{cases}
\]
As usual, after we determine the matrices L and R, instead of the system LRx = b we can
solve:
Ly = b
Rx = y.
We solve the system Ly = b by the direct substitution method, that is, by using the formulae:
\[
y_1 = b_1, \qquad y_{i+1} = b_{i+1} - l_i y_i, \quad 1 \le i \le n-1.
\]
We then solve the system Rx = y by the backward substitution method, that is, by using the formulae:
\[
x_n = \frac{y_n}{r_n}, \qquad x_i = \frac{y_i - s_i x_{i+1}}{r_i}, \quad i \in \{n-1, n-2, \ldots, 1\}.
\]
Remark 15. Again, in order to make sure that the elements that play the role of pivots, r_i, i ∈ {1, ..., n}, are non-zero, we test the principal minors of the matrix A before applying this method. But be careful: when interchanging two lines of a tridiagonal matrix, most probably the new matrix will not be tridiagonal anymore, so we will have to rely on the L − R factorization methods for usual matrices.
Remark 16. There is no point in occupying the memory of the computer with an n × n matrix when we only care about the three diagonals of the matrix A. Therefore, when we write the pseudocode of this numerical algorithm, we keep the three diagonals containing non-zero elements in three vectors:
\[
a = (a_1, a_2, \ldots, a_n), \quad b = (b_1, b_2, \ldots, b_{n-1}), \quad c = (c_1, c_2, \ldots, c_{n-1}).
\]
The same goes for the matrices L and R.
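Following Remark 16, a Python sketch of the whole procedure can work exclusively with these three vectors; the helper below (solve_tridiagonal is our own name) applies formulae (2.22) and then the two substitutions. With the data of Example 8 below, solve_tridiagonal([-1, 4, -1, 5], [2, 1, -1], [-1, 2, 1], [-3, -5, 4, -3]) returns the solution (4, 0.5, -3, 0).

def solve_tridiagonal(a, b, c, rhs):
    # a: main diagonal (length n), b: diagonal above (n-1), c: diagonal below (n-1)
    n = len(a)
    r = [0.0] * n
    l = [0.0] * (n - 1)
    r[0] = a[0]
    for i in range(n - 1):            # formulae (2.22); s_i = b_i, so b is used directly
        l[i] = c[i] / r[i]
        r[i + 1] = a[i + 1] - l[i] * b[i]
    y = [0.0] * n                     # direct substitution: L y = rhs
    y[0] = rhs[0]
    for i in range(n - 1):
        y[i + 1] = rhs[i + 1] - l[i] * y[i]
    x = [0.0] * n                     # backward substitution: R x = y
    x[n - 1] = y[n - 1] / r[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - b[i] * x[i + 1]) / r[i]
    return x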
Example 8. Solve the following linear system:
\[
\begin{cases}
-x_1 + 2x_2 = -3 \\
-x_1 + 4x_2 + x_3 = -5 \\
2x_2 - x_3 - x_4 = 4 \\
x_3 + 5x_4 = -3.
\end{cases}
\]
Proof. The matrix associated to this system is
\[
A = \begin{pmatrix}
-1 & 2 & 0 & 0 \\
-1 & 4 & 1 & 0 \\
0 & 2 & -1 & -1 \\
0 & 0 & 1 & 5
\end{pmatrix}.
\]
We notice that A is a tridiagonal matrix. Before starting to apply the L − R factorization method for tridiagonal matrices we check if all the principal minors are non-zero. If one of the principal minors proves to be 0, we will interchange the row corresponding to the order of the minor with a row below it. Be careful though: after we perform an interchange of rows, most probably the matrix that we obtain will not be a tridiagonal matrix anymore, and in this case we have to apply an L − R factorization method for arbitrary matrices (Doolittle or Croût) instead of the L − R factorization method for tridiagonal matrices.
\[
\Delta_1 = -1 \ne 0; \qquad
\Delta_2 = \begin{vmatrix} -1 & 2 \\ -1 & 4 \end{vmatrix} = -4 + 2 = -2 \ne 0; \qquad
\Delta_3 = \begin{vmatrix} -1 & 2 & 0 \\ -1 & 4 & 1 \\ 0 & 2 & -1 \end{vmatrix} = 6 - 2 = 4 \ne 0;
\]
\[
\Delta_4 = \begin{vmatrix} -1 & 2 & 0 & 0 \\ -1 & 4 & 1 & 0 \\ 0 & 2 & -1 & -1 \\ 0 & 0 & 1 & 5 \end{vmatrix}
\stackrel{C_2 + 2C_1}{=}
\begin{vmatrix} -1 & 0 & 0 & 0 \\ -1 & 2 & 1 & 0 \\ 0 & 2 & -1 & -1 \\ 0 & 0 & 1 & 5 \end{vmatrix}
= (-1)^{1+1} \cdot (-1) \begin{vmatrix} 2 & 1 & 0 \\ 2 & -1 & -1 \\ 0 & 1 & 5 \end{vmatrix} = (-1)(-18) = 18 \ne 0.
\]
Since all the principal minors prove to be non-zero, we proceed further with the L − R factorization method for tridiagonal matrices. So our aim is to find two matrices L and R such that A = LR, where L and R are of the specific form
\[
L = \begin{pmatrix} 1 & 0 & 0 & 0 \\ l_1 & 1 & 0 & 0 \\ 0 & l_2 & 1 & 0 \\ 0 & 0 & l_3 & 1 \end{pmatrix}, \qquad
R = \begin{pmatrix} r_1 & s_1 & 0 & 0 \\ 0 & r_2 & s_2 & 0 \\ 0 & 0 & r_3 & s_3 \\ 0 & 0 & 0 & r_4 \end{pmatrix}.
\]
We keep the three diagonals with non-zero elements from A in three vectors:
\[
a = (a_1, a_2, a_3, a_4) = (-1, 4, -1, 5), \quad b = (b_1, b_2, b_3) = (2, 1, -1), \quad c = (c_1, c_2, c_3) = (-1, 2, 1),
\]
where a is for the main diagonal, b is for the diagonal above a, and c is for the diagonal below a. We apply formulae (2.22) for n = 4 in order to determine the vectors l = (l_1, l_2, l_3), r = (r_1, r_2, r_3, r_4), s = (s_1, s_2, s_3), and to find this way the matrices L and R. We have r_1 = -1 and, by s_i = b_i for 1 ≤ i ≤ 3, we deduce s = (2, 1, -1). Then,
\[
l_1 = \frac{c_1}{r_1} = \frac{-1}{-1} = 1, \qquad r_2 = a_2 - l_1 s_1 = 4 - 2 = 2,
\]
\[
l_2 = \frac{c_2}{r_2} = \frac{2}{2} = 1, \qquad r_3 = a_3 - l_2 s_2 = -1 - 1 = -2,
\]
\[
l_3 = \frac{c_3}{r_3} = \frac{1}{-2} = -\frac{1}{2}, \qquad r_4 = a_4 - l_3 s_3 = 5 - \frac{1}{2} = \frac{9}{2}.
\]
We have obtained
\[
l = \left(1, 1, -\frac{1}{2}\right), \qquad r = \left(-1, 2, -2, \frac{9}{2}\right), \qquad s = (2, 1, -1),
\]
so
\[
L = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & -\frac{1}{2} & 1 \end{pmatrix}, \qquad
R = \begin{pmatrix} -1 & 2 & 0 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & -2 & -1 \\ 0 & 0 & 0 & \frac{9}{2} \end{pmatrix}.
\]
Since A = LR, solving the system written in its matrix form, Ax = b, where b = (-3, -5, 4, -3)^T, is equivalent to solving L(Rx) = b. We denote Rx = y and we notice that we actually have to solve Ly = b, then Rx = y.
The matrix equation Ly = b, that is,
\[
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & -\frac{1}{2} & 1 \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix} =
\begin{pmatrix} -3 \\ -5 \\ 4 \\ -3 \end{pmatrix},
\]
is equivalent to solving the system
\[
\begin{cases}
y_1 = -3, \\
y_1 + y_2 = -5, \\
y_2 + y_3 = 4, \\
-\frac{1}{2} y_3 + y_4 = -3.
\end{cases}
\]
We introduce y_1 = -3 in the second equation of the system and we obtain y_2 = -2. We introduce y_2 = -2 in the third equation of the system and we obtain y_3 = 6. We introduce y_3 = 6 in the fourth equation of the system and we obtain y_4 = 0. We have found y = (-3, -2, 6, 0)^T.
We solve now Rx = y, that is,
\[
\begin{pmatrix} -1 & 2 & 0 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & -2 & -1 \\ 0 & 0 & 0 & \frac{9}{2} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} =
\begin{pmatrix} -3 \\ -2 \\ 6 \\ 0 \end{pmatrix}.
\]
In fact, we solve
\[
\begin{cases}
-x_1 + 2x_2 = -3, \\
2x_2 + x_3 = -2, \\
-2x_3 - x_4 = 6, \\
\frac{9}{2} x_4 = 0.
\end{cases}
\]
From the last equation of the system, x_4 = 0. We introduce this in the above equation and we obtain x_3 = -3. We introduce this in the above equation and we obtain x_2 = 1/2. We introduce this in the above equation and we obtain x_1 = 4. Therefore the solution of the initial system is
\[
x = \begin{pmatrix} 4 \\ \frac{1}{2} \\ -3 \\ 0 \end{pmatrix}.
\]
Exercise 3. Solve the following systems:
\[
a)\ \begin{cases}
x_1 + x_2 = 4 \\
x_1 - x_2 + 2x_3 = -2 \\
x_2 + x_3 - x_4 = 5 \\
5x_3 + 2x_4 = -4;
\end{cases}
\qquad
b)\ \begin{cases}
2x_1 + x_2 = -1 \\
3x_1 - x_2 + x_3 = -10 \\
x_2 + x_3 - 2x_4 = -2 \\
-x_3 + x_4 = 3.
\end{cases}
\]
Exercise 4. Looking at the two variants of the usual L − R factorization method (Doolittle and
Croût), find another variant of the L − R factorization method for tridiagonal matrices. Solve
the systems from the previous exercise using this second variant of the method.
As we have seen above, before starting to solve a linear system using an L − R factorization method, we have to find the values of the principal minors to see if they are non-zero. Thus it would be nice to have a quicker method to calculate determinants, so we end this chapter by introducing such a method.
2.3 Chio pivotal condensation method
Chio pivotal condensation is a method for determining the value of an n × n determinant (that
is, a determinant of n rows and n columns, which is also called a determinant of order n). The
general idea is to reduce the calculus of an n×n determinant to the calculus of a (n−1)×(n−1)
determinant by calculating the values of (n−1)2 minors of order 2. Then we reduce the calculus
of the (n − 1) × (n − 1) determinant to the calculus of an (n − 2) × (n − 2) determinant by
calculating the values of (n − 2)2 minors of order 2. We repeat the procedure until it only
remains to calculate the value of a 2 × 2 determinant.
Hence, if we consider the matrix A = (a_ij)_{1≤i,j≤n} ∈ M_{n×n}(R), to find the value of its determinant we apply the formula
(2.23)
\[
\det(A) = \frac{1}{a_{11}^{\,n-2}}
\begin{vmatrix}
\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} &
\begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix} & \ldots &
\begin{vmatrix} a_{11} & a_{1n} \\ a_{21} & a_{2n} \end{vmatrix} \\[2mm]
\begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix} &
\begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} & \ldots &
\begin{vmatrix} a_{11} & a_{1n} \\ a_{31} & a_{3n} \end{vmatrix} \\
\vdots & \vdots & \ddots & \vdots \\
\begin{vmatrix} a_{11} & a_{12} \\ a_{n1} & a_{n2} \end{vmatrix} &
\begin{vmatrix} a_{11} & a_{13} \\ a_{n1} & a_{n3} \end{vmatrix} & \ldots &
\begin{vmatrix} a_{11} & a_{1n} \\ a_{n1} & a_{nn} \end{vmatrix}
\end{vmatrix},
\]
where a_11 ≠ 0, and we apply again the above formula for the determinants of order n−1, n−2, ... until we obtain a determinant of order 2.
Remarks:
1. If a_11 = 0 and there exists 2 ≤ i ≤ n for which a_i1 ≠ 0, then we switch the rows 1 and i in A, and we change the sign of det(A).
2. If a_i1 = 0 for all 1 ≤ i ≤ n, then det(A) = 0.
3. In addition to its utility for the L − R factorization method, the calculus of a determinant is also useful when we want to apply the Gauss elimination method, since a non-zero determinant of the matrix of the system ensures from the beginning that the system has a unique solution.
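Since each condensation step is just a table of 2 × 2 minors, the method is easy to program. The following Python sketch (chio_det is our own name) implements formula (2.23) together with the two remarks above; on the matrices A and B of the example below it returns 12 and −107, respectively.

def chio_det(A):
    # Chio pivotal condensation, formula (2.23); A is a list of rows
    A = [row[:] for row in A]
    n = len(A)
    sign, scale = 1, 1.0
    while n > 2:
        if A[0][0] == 0:                      # Remark 1: swap rows to get a non-zero pivot
            for i in range(1, n):
                if A[i][0] != 0:
                    A[0], A[i] = A[i], A[0]
                    sign = -sign
                    break
            else:
                return 0.0                    # Remark 2: whole first column is zero
        piv = A[0][0]
        # condensed matrix: every entry is a 2 x 2 minor built with the pivot row and column
        A = [[piv * A[i][j] - A[i][0] * A[0][j] for j in range(1, n)]
             for i in range(1, n)]
        scale *= piv ** (n - 2)
        n -= 1
    minor = A[0][0] * A[1][1] - A[0][1] * A[1][0] if n == 2 else A[0][0]
    return sign * minor / scale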
Example: Compute the determinants of the following matrices using Chio's method:
\[
A = \begin{pmatrix} 2 & 1 & 0 & 1 \\ 6 & 3 & 2 & -1 \\ 1 & 2 & 1 & 0 \\ 1 & 1 & -2 & 3 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & -2 & 1 & 0 \\ 5 & 1 & -1 & 3 \\ 4 & 2 & 2 & 5 \\ 6 & 1 & -3 & -1 \end{pmatrix}.
\]
Proof. We first make the calculus for det(A). We apply formula (2.23) for the determinant of the square matrix A of order n = 4:
\[
\det(A) = \frac{1}{2^{4-2}}
\begin{vmatrix}
\begin{vmatrix} 2 & 1 \\ 6 & 3 \end{vmatrix} & \begin{vmatrix} 2 & 0 \\ 6 & 2 \end{vmatrix} & \begin{vmatrix} 2 & 1 \\ 6 & -1 \end{vmatrix} \\[2mm]
\begin{vmatrix} 2 & 1 \\ 1 & 2 \end{vmatrix} & \begin{vmatrix} 2 & 0 \\ 1 & 1 \end{vmatrix} & \begin{vmatrix} 2 & 1 \\ 1 & 0 \end{vmatrix} \\[2mm]
\begin{vmatrix} 2 & 1 \\ 1 & 1 \end{vmatrix} & \begin{vmatrix} 2 & 0 \\ 1 & -2 \end{vmatrix} & \begin{vmatrix} 2 & 1 \\ 1 & 3 \end{vmatrix}
\end{vmatrix}
= \frac{1}{4} \begin{vmatrix} 0 & 4 & -8 \\ 3 & 2 & -1 \\ 1 & -4 & 5 \end{vmatrix}.
\]
By applying formula (2.23) and by performing the calculus of several determinants of order 2, we have arrived at the calculus of a determinant of order 3 instead of the initial determinant of order 4. At this point we notice that the first element a_11 of this determinant of order 3 is zero, so we interchange the first two rows of the determinant. Therefore,
\[
\det(A) = \frac{-1}{4} \begin{vmatrix} 3 & 2 & -1 \\ 0 & 4 & -8 \\ 1 & -4 & 5 \end{vmatrix}.
\]
Now we apply again formula (2.23):
\[
\det(A) = \frac{-1}{4 \cdot 3^{3-2}}
\begin{vmatrix}
\begin{vmatrix} 3 & 2 \\ 0 & 4 \end{vmatrix} & \begin{vmatrix} 3 & -1 \\ 0 & -8 \end{vmatrix} \\[2mm]
\begin{vmatrix} 3 & 2 \\ 1 & -4 \end{vmatrix} & \begin{vmatrix} 3 & -1 \\ 1 & 5 \end{vmatrix}
\end{vmatrix}
= \frac{-1}{12} \begin{vmatrix} 12 & -24 \\ -14 & 16 \end{vmatrix} = \frac{-1}{12}(192 - 336) = 12.
\]
We now pass to the calculus of det(B). We notice that the first element of the square matrix B is b_11 = 0, thus we interchange the first two rows of the determinant, without forgetting to change the sign of the determinant, and then we use again formula (2.23) for n = 4:
\[
\det(B) = - \begin{vmatrix} 5 & 1 & -1 & 3 \\ 0 & -2 & 1 & 0 \\ 4 & 2 & 2 & 5 \\ 6 & 1 & -3 & -1 \end{vmatrix}
= \frac{-1}{5^{4-2}}
\begin{vmatrix}
\begin{vmatrix} 5 & 1 \\ 0 & -2 \end{vmatrix} & \begin{vmatrix} 5 & -1 \\ 0 & 1 \end{vmatrix} & \begin{vmatrix} 5 & 3 \\ 0 & 0 \end{vmatrix} \\[2mm]
\begin{vmatrix} 5 & 1 \\ 4 & 2 \end{vmatrix} & \begin{vmatrix} 5 & -1 \\ 4 & 2 \end{vmatrix} & \begin{vmatrix} 5 & 3 \\ 4 & 5 \end{vmatrix} \\[2mm]
\begin{vmatrix} 5 & 1 \\ 6 & 1 \end{vmatrix} & \begin{vmatrix} 5 & -1 \\ 6 & -3 \end{vmatrix} & \begin{vmatrix} 5 & 3 \\ 6 & -1 \end{vmatrix}
\end{vmatrix}
= \frac{-1}{25} \begin{vmatrix} -10 & 5 & 0 \\ 6 & 14 & 13 \\ -1 & -9 & -23 \end{vmatrix}.
\]
We have seen once more that, from a determinant of order 4, we have arrived at the calculus of a determinant of order 3. We apply again (2.23) for this last determinant and we get
\[
\det(B) = \frac{-1}{25} \cdot \frac{1}{(-10)^{3-2}}
\begin{vmatrix}
\begin{vmatrix} -10 & 5 \\ 6 & 14 \end{vmatrix} & \begin{vmatrix} -10 & 0 \\ 6 & 13 \end{vmatrix} \\[2mm]
\begin{vmatrix} -10 & 5 \\ -1 & -9 \end{vmatrix} & \begin{vmatrix} -10 & 0 \\ -1 & -23 \end{vmatrix}
\end{vmatrix}
= \frac{1}{250} \begin{vmatrix} -170 & -130 \\ 95 & 230 \end{vmatrix}
= \frac{-39100 + 12350}{250} = -107.
\]
Exercise 5. Compute the determinants of the following matrices using Chio's method:
\[
A = \begin{pmatrix} 2 & 0 & 0 & 1 \\ 0 & 2 & 0 & 1 \\ 0 & 1 & 1 & 3 \\ 2 & 1 & -2 & -1 \end{pmatrix}, \quad
B = \begin{pmatrix} 1 & 1 & 0 & 1 & 2 \\ 6 & 3 & 2 & -1 & 0 \\ 1 & 2 & 1 & 0 & 0 \\ 1 & 1 & -2 & 3 & 0 \\ 0 & 0 & 0 & -1 & 0 \end{pmatrix}, \quad
C = \begin{pmatrix} 4 & 1 & 3 & 1 \\ 1 & 5 & 4 & 1 \\ 1 & 1 & 10 & 6 \\ 0 & 1 & 1 & 10 \end{pmatrix}.
\]
Chapter 3
Solving linear systems and nonlinear equations - iterative methods
The iterative methods are based, as their name already announces, on iterations. By an iteration we understand the repetition of a certain procedure of calculus, based on the fact that the result obtained at a previous step of the calculus is used in the next step of the calculus, and so on.
A decisive role in the iterative methods is played by the Contraction Principle (Banach's fixed point theorem) (see [2]).
Theorem 1. (Contraction Principle, or Banach's fixed point theorem)
Let T : X → X be a contraction, where X is a complete metric space. Then T admits a unique fixed point x∗ ∈ X.
Before giving its proof, let us fully understand the statement of this principle.
Definition 7. Let M ≠ ∅ be an arbitrary set and d : M × M → R. We say that d is a metric (distance) on M if and only if d fulfills the following properties:
(d1) d(x, y) = 0 ⇔ x = y (the identity of indiscernibles),
(d2) d(x, y) = d(y, x), for all x, y ∈ M (symmetry),
(d3) d(x, y) ≤ d(x, z) + d(z, y), for all x, y, z ∈ M (subadditivity or triangle inequality).
If d is a metric on M , we say that (M, d) is a metric space (and sometimes we simply write
that M is a metric space).
Remark 17. Given the axioms (d1)-(d3) from the above definition, we deduce the positivity
of d, that is,
d(x, y) ≥ 0, for all x, y ∈ M, with equality if and only if x = y.
Indeed, by (d3) we have that
d(x, y) + d(y, x) ≥ d(x, x),
and, by (d2) and (d1), this means that
2d(x, y) ≥ 0.
Since the space X from the Contraction Principle was a complete metric space, we introduce
another definition.
Definition 8. A metric space (M, d) is called complete if any Cauchy sequence (x_n)_n ⊂ M (that is, a sequence (x_n)_n such that, for every ε > 0, there exists N_ε ∈ N with the property that, for all n > N_ε and p ∈ N∗, we have d(x_{n+p}, x_n) < ε) is convergent to some x ∈ M.
To give an example of a complete metric space that we are going to use further, we refer to (R^N, d), where d : R^N × R^N → R is given by
\[
d(x, y) = \max_{i \in \{1, 2, \ldots, N\}} |x_i - y_i|.
\]
Exercise 6. Prove that the previous d satisfies axioms (d1)-(d3).
Remark 18. On a set M we can define multiple metrics. For example, d_E : R^N × R^N → R,
\[
d_E(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \ldots + (x_N - y_N)^2},
\]
represents another metric on R^N.
Another notion used in the statement of Theorem 1, and maybe the most important one, is
the notion of contraction.
Definition 9. Let (X, d) be a metric space. We say that T : X → X is a contraction if and
only if there exists k ∈ (0, 1) such that
d(T x, T y) ≤ k · d(x, y), for all x, y ∈ X.
Remark 19. It is obvious that any contraction is a continuous function.
And finally, we recall the following definition.
Definition 10. Let T : X → X and x∗ ∈ X. We say that x∗ is a fixed point of T if and only
if
T x∗ = x∗ .
We pass now to the proof of the Contraction Principle.
Proof of Theorem 1
The idea of the proof is to find x∗ as the limit of a convergent sequence (xn )n ⊂ X which is
constructed as follows.
Take an arbitrary x0 ∈ X. Then
\[
x_1 = T x_0, \quad x_2 = T x_1, \quad \ldots, \quad x_n = T x_{n-1}, \quad x_{n+1} = T x_n, \quad \ldots
\]
We want to prove that there exists \(\lim_{n\to\infty} x_n \in X\). Actually, given the fact that X is a complete metric space, it is enough to show that (x_n)_n is a Cauchy sequence. By the way we have defined the sequence (x_n)_n,
\[
d(x_{n+p}, x_n) = d(T x_{n+p-1}, T x_{n-1}).
\]
But T is a contraction, thus
\[
d(x_{n+p}, x_n) \le k\, d(x_{n+p-1}, x_{n-1}).
\]
In the same manner as above we deduce that
\[
d(x_{n+p}, x_n) \le k^2 d(x_{n+p-2}, x_{n-2}) \le \ldots \le k^n d(x_p, x_0).
\]
Since d is a metric on X, we use (d3) repeatedly and we arrive at
\[
d(x_{n+p}, x_n) \le k^n \left[ d(x_p, x_{p-1}) + d(x_{p-1}, x_{p-2}) + \ldots + d(x_1, x_0) \right].
\]
Using again the definition of the sequence (x_n)_n and the fact that T is a contraction, we get
\[
d(x_{n+p}, x_n) \le k^n \left( k^{p-1} + k^{p-2} + \ldots + k + 1 \right) d(x_1, x_0),
\]
hence
(3.1)
\[
d(x_{n+p}, x_n) \le k^n \cdot \frac{k^p - 1}{k - 1}\, d(x_1, x_0) \to 0 \ \text{ as } n \to \infty,
\]
because k ∈ (0, 1).
We have obtained that (x_n)_n is a Cauchy sequence in a complete metric space, which means that there exists x∗ such that (x_n)_n converges to x∗ in X. We show that this x∗ is a fixed point of T. Using the continuity of T and the definition of (x_n)_n,
\[
T(x^*) = T\Big(\lim_{n\to\infty} x_n\Big) = \lim_{n\to\infty} T x_n = \lim_{n\to\infty} x_{n+1} = x^*.
\]
All that remains to prove is that this fixed point is unique. Arguing by contradiction, we assume that there exists x ∈ X, x ≠ x∗, such that T x = x. We will show that d(x∗, x) = 0, and this, by Remark 17, means that x∗ = x, which is a contradiction. Using the fact that x and x∗ are fixed points of the contraction T,
\[
d(x^*, x) = d(T x^*, T x) \le k\, d(x^*, x),
\]
therefore
\[
(1 - k)\, d(x^*, x) \le 0.
\]
Due to the fact that k ∈ (0, 1), we have 1 − k ≠ 0, hence d(x∗, x) = 0, and the proof is complete.
Exercise 7. In the above proof, find another way to show that d(x∗ , x) = 0 using (d3).
Remark 20. If we let p → ∞ in (3.1) we obtain the estimate
\[
d(x^*, x_n) \le \frac{k^n}{1 - k}\, d(x_0, x_1),
\]
which gives us the convergence speed of the sequence (x_n)_n. We observe that the speed of the convergence increases if k is smaller.
Based on this contraction principle we can introduce iterative methods for solving linear systems. The iterative methods consist in the construction of a sequence (x^{(k)})_k that converges to the exact solution x of a linear system given by the matrix equation Ax = b.
These methods are used for solving "big" systems, because the cost of the algorithm for them is of order 2n², which is much smaller than the cost of the algorithm for the direct methods (we recall that this was of order \(\frac{2}{3}n^3\)).
Stopping the iterative process means truncating the sequence (x^{(k)})_k at a certain index s, determined during the calculus with respect to an imposed precision, such that the term x^{(s)} represents a satisfactory approximation of the solution x.
3.1 Jacobi's method for solving linear systems
This method is also called the simultaneous iterations method, and we will see immediately
why. For a better understanding of the reader, we start the discussion on a concrete example
and only after this we introduce the general formulae.
We are interested to approximate the solution of the following linear system
(3.2)
\[
\begin{cases}
10x_1 + 5x_2 - 3x_3 = 6 \\
-2x_1 + 10x_2 + 7x_3 = 1 \\
4x_1 - 10x_3 = 2,
\end{cases}
\]
allowing the error ε = 10⁻¹.
Proof. We extract x1 from the first equation of the system, x2 from the second equation and x3 from the third equation:
\[
\begin{cases}
x_1 = (6 - 5x_2 + 3x_3)/10 \\
x_2 = (1 + 2x_1 - 7x_3)/10 \\
x_3 = (2 - 4x_1)/(-10).
\end{cases}
\]
In fact, we will use the iterations
\[
\begin{cases}
x_1^{(k+1)} = \big(6 - 5x_2^{(k)} + 3x_3^{(k)}\big)/10 \\[1mm]
x_2^{(k+1)} = \big(1 + 2x_1^{(k)} - 7x_3^{(k)}\big)/10 \\[1mm]
x_3^{(k+1)} = \big(2 - 4x_1^{(k)}\big)/(-10),
\end{cases}
\]
which means that at step k+1 we determine the value of x_i, i ∈ {1, 2, 3} (denoted by x_i^{(k+1)}), with respect to an expression involving the values of the other unknowns obtained at the previous step, hence the name "simultaneous iterations method". But what is the very first value attributed to x1, x2, x3? Well, as explained when discussing the Contraction Principle (see Theorem 1), the beauty of these iterative methods is that it does not really matter, so x_1^{(0)}, x_2^{(0)} and x_3^{(0)} can be chosen arbitrarily. Here we choose x^{(0)} = (0, 0, 0)^T, because it is easier for us to perform the calculus by hand. But any other values would work just as well (please check this aspect at the laboratory section!).
We have
\[
\begin{cases}
x_1^{(1)} = \big(6 - 5x_2^{(0)} + 3x_3^{(0)}\big)/10 = 0.6 \\[1mm]
x_2^{(1)} = \big(1 + 2x_1^{(0)} - 7x_3^{(0)}\big)/10 = 0.1 \\[1mm]
x_3^{(1)} = \big(2 - 4x_1^{(0)}\big)/(-10) = -0.2
\end{cases}
\quad\Rightarrow\quad
x^{(1)} = \begin{pmatrix} 0.6 \\ 0.1 \\ -0.2 \end{pmatrix}.
\]
Now that we have x^{(1)}, we can find x^{(2)}, then x^{(3)} and so on. But when do we stop? How could we find out if we are close enough to the solution, if we do not know the value of the solution? (Because this is the point of this method.) We will stop at step s if and only if
\[
d\big(x^{(s-1)}, x^{(s)}\big) < \varepsilon,
\]
where, as announced below Definition 8,
\[
d\big(x^{(s-1)}, x^{(s)}\big) = \max_{i \in \{1, 2, \ldots, N\}} \big| x_i^{(s)} - x_i^{(s-1)} \big|.
\]
In our case ε = 10⁻¹ = 0.1 and N = 3, thus
\[
d\big(x^{(1)}, x^{(0)}\big) = \max_{1 \le i \le 3} \big| x_i^{(1)} - x_i^{(0)} \big| = \max\{|0.6 - 0|, |0.1 - 0|, |-0.2 - 0|\} = 0.6 > 0.1.
\]
We continue our iterations:
\[
\begin{cases}
x_1^{(2)} = \big(6 - 5x_2^{(1)} + 3x_3^{(1)}\big)/10 = (6 - 5 \cdot 0.1 + 3 \cdot (-0.2))/10 = 0.49 \\[1mm]
x_2^{(2)} = \big(1 + 2x_1^{(1)} - 7x_3^{(1)}\big)/10 = (1 + 2 \cdot 0.6 - 7 \cdot (-0.2))/10 = 0.36 \\[1mm]
x_3^{(2)} = \big(2 - 4x_1^{(1)}\big)/(-10) = (2 - 4 \cdot 0.6)/(-10) = 0.04,
\end{cases}
\]
thus
\[
d\big(x^{(2)}, x^{(1)}\big) = \max\{|0.49 - 0.6|, |0.36 - 0.1|, |0.04 - (-0.2)|\} = 0.26 > 0.1.
\]
We continue our iterations:
\[
\begin{cases}
x_1^{(3)} = (6 - 5 \cdot 0.36 + 3 \cdot 0.04)/10 = 0.432 \\[1mm]
x_2^{(3)} = (1 + 2 \cdot 0.49 - 7 \cdot 0.04)/10 = 0.17 \\[1mm]
x_3^{(3)} = (2 - 4 \cdot 0.49)/(-10) = -0.004,
\end{cases}
\]
thus
\[
d\big(x^{(3)}, x^{(2)}\big) = \max\{|0.432 - 0.49|, |0.17 - 0.36|, |-0.004 - 0.04|\} = 0.19 > 0.1.
\]
We continue our iterations:
\[
\begin{cases}
x_1^{(4)} = (6 - 5 \cdot 0.17 + 3 \cdot (-0.004))/10 = 0.5138 \\[1mm]
x_2^{(4)} = (1 + 2 \cdot 0.432 - 7 \cdot (-0.004))/10 = 0.1892 \\[1mm]
x_3^{(4)} = (2 - 4 \cdot 0.432)/(-10) = -0.0272,
\end{cases}
\]
thus
\[
d\big(x^{(4)}, x^{(3)}\big) = \max\{|0.5138 - 0.432|, |0.1892 - 0.17|, |-0.0272 + 0.004|\} = 0.0818 < 0.1.
\]
So, the solution with precision given by ε = 10⁻¹ is
\[
x_1^{(4)} = 0.5138, \quad x_2^{(4)} = 0.1892, \quad x_3^{(4)} = -0.0272.
\]
Now that we have seen how this method works, we present it with the general formulae. We consider the linear system
(3.3)
\[
A \cdot x = b,
\]
where A = (a_ij)_{1≤i,j≤n} ∈ M_{n×n}(R) is the matrix of system (3.3) and b = (b_i)_{1≤i≤n} ∈ M_{n×1}(R) is the free term of system (3.3). The target is to determine, if possible, an approximation of the unique solution x ∈ Rⁿ of system (3.3). We let x^{(0)} ∈ Rⁿ be the initial approximation of the solution of system (3.3), arbitrarily chosen (for example the null vector). We compute
\[
x_i^{(k+1)} = \Bigg( b_i - \sum_{\substack{j=1 \\ j \ne i}}^{n} a_{ij}\, x_j^{(k)} \Bigg) \Big/ a_{ii}, \quad 1 \le i \le n,\ k \ge 0,
\]
until
\[
\max_{1 \le i \le n} \big| x_i^{(k+1)} - x_i^{(k)} \big| \le \varepsilon,
\]
where ε is the error that allows us to set the precision with which we want to approximate the solution of system (3.3). Then x ≈ x^{(k+1)}.
Remark 21. To apply the above iteration formulae it is obvious that we need to have a_ii ≠ 0 for all i ∈ {1, 2, ..., n}.
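As a sketch of how the general formulae can be programmed (the function name jacobi and the safety cap max_iter are our own additions, not part of any prescribed interface), consider the following; called with the data of system (3.2) and ε = 0.1, it reproduces the vector x^{(4)} obtained by hand above.

def jacobi(A, b, eps, x0=None, max_iter=1000):
    # simultaneous iterations: every component of the new vector
    # is computed only from the vector of the previous step
    A, b = np.asarray(A, float), np.asarray(b, float)
    n = len(b)
    x = np.zeros(n) if x0 is None else np.asarray(x0, float)
    for _ in range(max_iter):
        x_new = np.empty(n)
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            x_new[i] = (b[i] - s) / A[i, i]
        if np.max(np.abs(x_new - x)) <= eps:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")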
Remark 22. The essence of this method is represented by the Contraction Principle (see Theorem 1), since we consider T : Rⁿ → Rⁿ defined by T x = y, where
\[
\begin{cases}
y_1 = \dfrac{1}{a_{11}}\left(b_1 - \displaystyle\sum_{j=2}^{n} a_{1j} x_j\right), \\[3mm]
y_i = \dfrac{1}{a_{ii}}\left(b_i - \displaystyle\sum_{j=1}^{i-1} a_{ij} y_j - \displaystyle\sum_{j=i+1}^{n} a_{ij} x_j\right), \\[3mm]
y_n = \dfrac{1}{a_{nn}}\left(b_n - \displaystyle\sum_{j=1}^{n-1} a_{nj} y_j\right).
\end{cases}
\]
If T is a contraction, then there exists x∗ ∈ Rⁿ such that T x∗ = x∗, that is,
\[
x_i^* = \frac{1}{a_{ii}}\Bigg(b_i - \sum_{\substack{j=1 \\ j \ne i}}^{n} a_{ij}\, x_j^*\Bigg), \quad 1 \le i \le n,
\]
meaning that Ax∗ = b, as we have seen.
It is quite important to notice that T defined above is a contraction if the matrix A (corresponding to the linear system) is strictly diagonally dominant on rows (or columns).
Definition 11. A square matrix A = (a_ij)_{1≤i,j≤n} ∈ M_{n×n}(R) is strictly diagonally dominant on rows if the absolute value of each element from the main diagonal is strictly greater than the sum of the absolute values of the other terms from its line, that is,
\[
|a_{ii}| > \sum_{\substack{j=1 \\ j \ne i}}^{n} |a_{ij}|, \quad \text{for all } 1 \le i \le n.
\]
Similarly, A is strictly diagonally dominant on columns if
\[
|a_{jj}| > \sum_{\substack{i=1 \\ i \ne j}}^{n} |a_{ij}|, \quad \text{for all } 1 \le j \le n.
\]
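Both conditions amount to a simple comparison of the diagonal against the off-diagonal sums; a small Python sketch (our own helper, not a standard routine):

def strictly_diagonally_dominant(A):
    # returns (dominant on rows, dominant on columns) as two booleans
    A = np.abs(np.asarray(A, float))
    d = np.diag(A)
    return bool(np.all(d > A.sum(axis=1) - d)), bool(np.all(d > A.sum(axis=0) - d))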
Conclusions:
1. In order to apply Jacobi's iterative method for solving a linear system Ax = b we have to verify that a_ii ≠ 0 for all i ∈ {1, 2, ..., n}. Otherwise, we interchange the lines of the system.
2. To be certain that Jacobi's method is working (meaning that it guarantees the approximation of the solution with the desired precision) we have to make sure that the matrix A is strictly diagonally dominant on rows or columns. However, even if A is not strictly diagonally dominant, it is still possible for this method to work.
We present next another example of a system solved by means of Jacobi's iterative method.
Example 9. Using Jacobi's method, solve the following system within the error 10⁻²:
\[
\begin{cases}
5x_1 - 3x_2 - x_3 = 5 \\
-2x_1 + 4x_2 + x_3 = 0 \\
2x_1 - 2x_2 - 5x_3 = -3.
\end{cases}
\]
Proof. We have that
\[
A = \begin{pmatrix} 5 & -3 & -1 \\ -2 & 4 & 1 \\ 2 & -2 & -5 \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} 5 \\ 0 \\ -3 \end{pmatrix}.
\]
We check if the matrix A is strictly diagonally dominant on rows:
\[
|a_{11}| = 5 > |a_{12}| + |a_{13}| = |-3| + |-1| = 4,
\]
\[
|a_{22}| = 4 > |a_{21}| + |a_{23}| = |-2| + |1| = 3,
\]
\[
|a_{33}| = 5 > |a_{31}| + |a_{32}| = |2| + |-2| = 4.
\]
So, the matrix A is strictly diagonally dominant on rows, and we have the certainty that Jacobi's method enables us to obtain an approximation of the solution with the desired precision. We remark that A is not strictly diagonally dominant on columns.
We write the initial system in an equivalent form:
\[
\begin{cases}
x_1 = (5 + 3x_2 + x_3)/5 \\
x_2 = (2x_1 - x_3)/4 \\
x_3 = (-3 - 2x_1 + 2x_2)/(-5).
\end{cases}
\]
We choose arbitrarily the initial approximation x^{(0)} = (0, 0, 0)^T and we consider the recurrence
\[
\begin{cases}
x_1^{(k+1)} = \big(5 + 3x_2^{(k)} + x_3^{(k)}\big)/5 \\[1mm]
x_2^{(k+1)} = \big(2x_1^{(k)} - x_3^{(k)}\big)/4 \\[1mm]
x_3^{(k+1)} = \big(-3 - 2x_1^{(k)} + 2x_2^{(k)}\big)/(-5).
\end{cases}
\]
For k = 0 we obtain
\[
\begin{cases}
x_1^{(1)} = (5 + 3 \cdot 0 + 0)/5 = 1 \\
x_2^{(1)} = (2 \cdot 0 - 0)/4 = 0 \\
x_3^{(1)} = (-3 - 2 \cdot 0 + 2 \cdot 0)/(-5) = 0.6.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(1)}, x^{(0)}\big) = \max_{1 \le i \le 3}\big| x_i^{(1)} - x_i^{(0)} \big| = \max\{|1 - 0|, |0 - 0|, |0.6 - 0|\} = 1 > \varepsilon = 0.01.
\]
For k = 1 we get
\[
\begin{cases}
x_1^{(2)} = (5 + 3 \cdot 0 + 0.6)/5 = 1.12 \\
x_2^{(2)} = (2 \cdot 1 - 0.6)/4 = 0.35 \\
x_3^{(2)} = (-3 - 2 \cdot 1 + 2 \cdot 0)/(-5) = 1.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(2)}, x^{(1)}\big) = \max\{|1.12 - 1|, |0.35 - 0|, |1 - 0.6|\} = 0.4 > \varepsilon = 0.01.
\]
For k = 2 we obtain
\[
\begin{cases}
x_1^{(3)} = (5 + 3 \cdot 0.35 + 1)/5 = 1.41 \\
x_2^{(3)} = (2 \cdot 1.12 - 1)/4 = 0.31 \\
x_3^{(3)} = (-3 - 2 \cdot 1.12 + 2 \cdot 0.35)/(-5) = 0.908.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(3)}, x^{(2)}\big) = \max\{|1.41 - 1.12|, |0.31 - 0.35|, |0.908 - 1|\} = 0.29 > \varepsilon = 0.01.
\]
⋮
Using the above reasoning we obtain the solution with precision given by ε = 0.01 at the step k = 14:
\[
x_1^{(14)} = 1.495639, \quad x_2^{(14)} = 0.503865, \quad x_3^{(14)} = 1.004191.
\]
The exact solution of the system is (x_1, x_2, x_3)^T = (1.5, 0.5, 1)^T.
3.2 Seidel-Gauss method for solving linear systems
This method resembles Jacobi's method for solving linear systems quite a lot. We will discuss it on a particular system with 4 equations and 4 unknowns for a better understanding, and then we will provide the general formulae.
Let us solve the following linear system via the Seidel-Gauss method:
(3.4)
\[
\begin{cases}
10x_1 + 3x_2 - 2x_3 - x_4 = -1 \\
x_1 + 10x_2 - x_3 + 2x_4 = -24 \\
x_1 - x_2 + 4x_3 + x_4 = 14 \\
x_1 + x_2 + x_3 + 20x_4 = -18.
\end{cases}
\]
We will construct a sequence (x^{(k)})_k to approximate the exact solution. As before, if the matrix A associated to this system is strictly diagonally dominant on rows or columns, we deduce that the sequence (x^{(k)})_k converges to the solution of the system. We have
\[
A = \begin{pmatrix} 10 & 3 & -2 & -1 \\ 1 & 10 & -1 & 2 \\ 1 & -1 & 4 & 1 \\ 1 & 1 & 1 & 20 \end{pmatrix}.
\]
We check if A is strictly diagonally dominant on rows:
|a11| > |a12| + |a13| + |a14| ⇔ 10 > 3 + 2 + 1 - holds true;
|a22| > |a21| + |a23| + |a24| ⇔ 10 > 1 + 1 + 2 - holds true;
|a33| > |a31| + |a32| + |a34| ⇔ 4 > 1 + 1 + 1 - holds true;
|a44| > |a41| + |a42| + |a43| ⇔ 20 > 1 + 1 + 1 - holds true.
We have seen, by these last inequalities, that A is strictly diagonally dominant on rows, hence the sequence constructed via this method will be convergent. We point out that it is sufficient for A to be strictly diagonally dominant on either rows or columns; it is not required to be both. In fact, this matrix is not strictly diagonally dominant on columns, because |a33| = |a13| + |a23| + |a43|.
We now start the construction of the convergent sequence (x^{(k)})_k with a random value attributed to x^{(0)}. We prefer to take x^{(0)} = (0, 0, 0, 0)^T only for the simplicity of the calculus made by hand. To determine the other terms of (x^{(k)})_k we proceed as in Jacobi's method, by extracting x1 from the first equation of the system, x2 from the second, x3 from the third and x4 from the fourth:
(3.5)
\[
\begin{cases}
x_1 = \frac{1}{10}(-1 - 3x_2 + 2x_3 + x_4) \\[1mm]
x_2 = \frac{1}{10}(-24 - x_1 + x_3 - 2x_4) \\[1mm]
x_3 = \frac{1}{4}(14 - x_1 + x_2 - x_4) \\[1mm]
x_4 = \frac{1}{20}(-18 - x_1 - x_2 - x_3).
\end{cases}
\]
To obtain x^{(1)}, x^{(2)}, ..., x^{(k)}, ... we will use the following formulae, which are slightly different from the ones used in Jacobi's method:
\[
\begin{cases}
x_1^{(k+1)} = \frac{1}{10}\big(-1 - 3x_2^{(k)} + 2x_3^{(k)} + x_4^{(k)}\big) \\[1mm]
x_2^{(k+1)} = \frac{1}{10}\big(-24 - x_1^{(k+1)} + x_3^{(k)} - 2x_4^{(k)}\big) \\[1mm]
x_3^{(k+1)} = \frac{1}{4}\big(14 - x_1^{(k+1)} + x_2^{(k+1)} - x_4^{(k)}\big) \\[1mm]
x_4^{(k+1)} = \frac{1}{20}\big(-18 - x_1^{(k+1)} - x_2^{(k+1)} - x_3^{(k+1)}\big).
\end{cases}
\]
Notice that, although these formulae preserve the structure of (3.5), as in Jacobi's method, here the values of x_i from step k+1 are calculated a bit differently. Indeed, the calculus of x_1^{(k+1)} in Seidel-Gauss is identical to the calculus of x_1^{(k+1)} in Jacobi, but for x_2^{(k+1)} Seidel-Gauss uses the newly calculated x_1^{(k+1)} instead of x_1^{(k)}. Then for x_3^{(k+1)} Seidel-Gauss uses x_1^{(k+1)} and x_2^{(k+1)} instead of x_1^{(k)} and x_2^{(k)}, respectively. And so on. More exactly, in our case we have
\[
\begin{cases}
x_1^{(1)} = \frac{1}{10}(-1 - 3 \cdot 0 + 2 \cdot 0 + 0) = -0.1 \\[1mm]
x_2^{(1)} = \frac{1}{10}(-24 + 0.1 + 0 - 2 \cdot 0) = -2.39 \\[1mm]
x_3^{(1)} = \frac{1}{4}(14 + 0.1 - 2.39 - 0) = 2.9275 \\[1mm]
x_4^{(1)} = \frac{1}{20}(-18 + 0.1 + 2.39 - 2.9275) = -0.921875.
\end{cases}
\]
In order to know when to stop calculating the terms of the sequence, we establish the desired precision. Thus, if we would like to obtain our solution within error 10⁻², we take ε = 10⁻² and we stop at the step s if
\[
d\big(x^{(s)}, x^{(s-1)}\big) < \varepsilon.
\]
In our case,
\[
d\big(x^{(1)}, x^{(0)}\big) = \max_{1 \le i \le 4}\big|x_i^{(1)} - x_i^{(0)}\big| = \max\{0.1, 2.39, 2.9275, 0.921875\} = 2.9275 > \varepsilon = 0.01.
\]
We continue:
\[
\begin{cases}
x_1^{(2)} = \frac{1}{10}\big(-1 - 3x_2^{(1)} + 2x_3^{(1)} + x_4^{(1)}\big) = 1.110312 \\[1mm]
x_2^{(2)} = \frac{1}{10}\big(-24 - x_1^{(2)} + x_3^{(1)} - 2x_4^{(1)}\big) = -2.033906 \\[1mm]
x_3^{(2)} = \frac{1}{4}\big(14 - x_1^{(2)} + x_2^{(2)} - x_4^{(1)}\big) = 2.944414 \\[1mm]
x_4^{(2)} = \frac{1}{20}\big(-18 - x_1^{(2)} - x_2^{(2)} - x_3^{(2)}\big) = -1.001041.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(2)}, x^{(1)}\big) = \max\{|1.110312 + 0.1|, |-2.033906 + 2.39|, |2.944414 - 2.9275|, |-1.001041 + 0.921875|\} = 1.210312 > \varepsilon = 0.01.
\]
We continue our iterations:
\[
\begin{cases}
x_1^{(3)} = \frac{1}{10}\big(-1 - 3x_2^{(2)} + 2x_3^{(2)} + x_4^{(2)}\big) = 0.998951 \\[1mm]
x_2^{(3)} = \frac{1}{10}\big(-24 - x_1^{(3)} + x_3^{(2)} - 2x_4^{(2)}\big) = -2.005245 \\[1mm]
x_3^{(3)} = \frac{1}{4}\big(14 - x_1^{(3)} + x_2^{(3)} - x_4^{(2)}\big) = 2.999211 \\[1mm]
x_4^{(3)} = \frac{1}{20}\big(-18 - x_1^{(3)} - x_2^{(3)} - x_3^{(3)}\big) = -0.999646,
\end{cases}
\]
thus
\[
d\big(x^{(3)}, x^{(2)}\big) = \max\{|0.998951 - 1.110312|, |-2.005245 + 2.033906|, |2.999211 - 2.944414|, |-0.999646 + 1.001041|\} = 0.111361 > \varepsilon = 0.01.
\]
We continue our iterations:
\[
\begin{cases}
x_1^{(4)} = \frac{1}{10}\big(-1 - 3x_2^{(3)} + 2x_3^{(3)} + x_4^{(3)}\big) = 1.001451 \\[1mm]
x_2^{(4)} = \frac{1}{10}\big(-24 - x_1^{(4)} + x_3^{(3)} - 2x_4^{(3)}\big) = -2.000295 \\[1mm]
x_3^{(4)} = \frac{1}{4}\big(14 - x_1^{(4)} + x_2^{(4)} - x_4^{(3)}\big) = 2.999475 \\[1mm]
x_4^{(4)} = \frac{1}{20}\big(-18 - x_1^{(4)} - x_2^{(4)} - x_3^{(4)}\big) = -1.000032,
\end{cases}
\]
thus
\[
d\big(x^{(4)}, x^{(3)}\big) = \max\{|1.001451 - 0.998951|, |-2.000295 + 2.005245|, |2.999475 - 2.999211|, |-1.000032 + 0.999646|\} = 0.00495 < \varepsilon = 0.01.
\]
So, the solution with precision ε = 10⁻² is the following:
\[
x_1^{(4)} = 1.001451, \quad x_2^{(4)} = -2.000295, \quad x_3^{(4)} = 2.999475, \quad x_4^{(4)} = -1.000032.
\]
We are now able to give the general formulae. We consider the linear system
(3.6)
\[
A \cdot x = b,
\]
where A = (a_ij)_{1≤i,j≤n} ∈ M_{n×n}(R) is the matrix of system (3.6) and b = (b_i)_{1≤i≤n} ∈ M_{n×1}(R) is the free term of system (3.6).
For x^{(0)} ∈ Rⁿ the initial approximation of the solution of system (3.6), arbitrarily chosen (for example the null vector), we compute
\[
x_i^{(k+1)} = \Bigg( b_i - \sum_{j=1}^{i-1} a_{ij}\, x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij}\, x_j^{(k)} \Bigg) \Big/ a_{ii}, \quad 1 \le i \le n,\ k \ge 0,
\]
until
\[
d\big(x^{(k+1)}, x^{(k)}\big) = \max_{1 \le i \le n}\big| x_i^{(k+1)} - x_i^{(k)} \big| \le \varepsilon,
\]
where ε gives us the precision with which we want to approximate the solution of system (3.6). Then x ≈ x^{(k+1)}.
Remark 23. To apply the above iteration formulae it is obvious that we need to have a_ii ≠ 0 for all i ∈ {1, 2, ..., n}. If that is not the case, we interchange the lines of the system among them, as appropriate.
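The only change with respect to the Jacobi sketch given earlier is that the freshly computed components are reused at once, so the iteration can update the vector in place. A minimal Python version (gauss_seidel is our own name) follows; on system (3.4) with ε = 10⁻² it stops at the same fourth step obtained by hand.

def gauss_seidel(A, b, eps, x0=None, max_iter=1000):
    A, b = np.asarray(A, float), np.asarray(b, float)
    n = len(b)
    x = np.zeros(n) if x0 is None else np.asarray(x0, float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # x[:i] already holds step k+1 values, x[i+1:] still holds step k values
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            x[i] = (b[i] - s) / A[i, i]
        if np.max(np.abs(x - x_old)) <= eps:
            return x
    raise RuntimeError("no convergence within max_iter iterations")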
Remark 24. As in the case of Jacobi's method, a sufficient condition for the sequence constructed by the Seidel-Gauss method to be convergent to the solution of system (3.6) is that the matrix A be strictly diagonally dominant on rows or columns. However, this is not a necessary condition.
For the simplicity of writing, we introduce the following definition.
Definition 12. We say that Jacobi's iterative method is convergent if and only if the sequence (x^{(k)})_k constructed by Jacobi's method is convergent to the solution of the linear system. Similarly, we say that the Seidel-Gauss iterative method is convergent if and only if the sequence (x^{(k)})_k constructed by the Seidel-Gauss method is convergent to the solution of the linear system.
Further comments. There are more relaxed conditions than the one mentioned in Remark 24 that can be imposed in order to ensure the convergence of these two iterative methods. In fact, the sequence (x^{(k)})_k constructed via Jacobi's method or via the Seidel-Gauss method is convergent to the unique solution of the linear system (3.6) if and only if the spectral radius of the matrix B is subunitary, where B is the matrix that appears when we write system (3.6) under the equivalent form
(3.7)
\[
x = Bx + c.
\]
This is just to give a very general idea; for the definition of the spectral radius of a matrix, see Definition 17. For additional comments and proofs, we send the reader to [7, Sec. 4, Th. 1].
Closing this parenthesis of comments, we would like to present another example of linear
system approached by means of the Seidel-Gauss method, in addition to solving system (3.4).
Actually, to make a comparison between the efficiency of the two iterative methods, Jacobi
and Seidel-Gauss, we decided to solve via the Seidel-Gauss method the system from Example
9 which was previously solved by Jacobi’s method.
Example 10. Using the Seidel-Gauss method, solve the following system with precision given by ε = 10⁻²:
\[
\begin{cases}
5x_1 - 3x_2 - x_3 = 5 \\
-2x_1 + 4x_2 + x_3 = 0 \\
2x_1 - 2x_2 - 5x_3 = -3.
\end{cases}
\]
Proof. We have
\[
A = \begin{pmatrix} 5 & -3 & -1 \\ -2 & 4 & 1 \\ 2 & -2 & -5 \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} 5 \\ 0 \\ -3 \end{pmatrix}.
\]
From Example 9 we already know that the matrix A is strictly diagonally dominant on rows, hence the Seidel-Gauss method works.
We write the initial system into the following form:
\[
\begin{cases}
x_1 = (5 + 3x_2 + x_3)/5 \\
x_2 = (2x_1 - x_3)/4 \\
x_3 = (-3 - 2x_1 + 2x_2)/(-5).
\end{cases}
\]
We choose an arbitrary x^{(0)} = (0, 0, 0)^T as an initial approximation of the solution and we consider the recurrence
\[
\begin{cases}
x_1^{(k+1)} = \big(5 + 3x_2^{(k)} + x_3^{(k)}\big)/5 \\[1mm]
x_2^{(k+1)} = \big(2x_1^{(k+1)} - x_3^{(k)}\big)/4 \\[1mm]
x_3^{(k+1)} = \big(-3 - 2x_1^{(k+1)} + 2x_2^{(k+1)}\big)/(-5).
\end{cases}
\]
For k = 0 we obtain
\[
\begin{cases}
x_1^{(1)} = (5 + 3 \cdot 0 + 0)/5 = 1 \\
x_2^{(1)} = (2 \cdot 1 - 0)/4 = 0.5 \\
x_3^{(1)} = (-3 - 2 \cdot 1 + 2 \cdot 0.5)/(-5) = 0.8.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(1)}, x^{(0)}\big) = \max\{|1 - 0|, |0.5 - 0|, |0.8 - 0|\} = 1 > \varepsilon = 0.01.
\]
For k = 1 we obtain
\[
\begin{cases}
x_1^{(2)} = (5 + 3 \cdot 0.5 + 0.8)/5 = 1.46 \\
x_2^{(2)} = (2 \cdot 1.46 - 0.8)/4 = 0.53 \\
x_3^{(2)} = (-3 - 2 \cdot 1.46 + 2 \cdot 0.53)/(-5) = 0.972.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(2)}, x^{(1)}\big) = \max\{|1.46 - 1|, |0.53 - 0.5|, |0.972 - 0.8|\} = 0.46 > \varepsilon = 0.01.
\]
For k = 2 we obtain
\[
\begin{cases}
x_1^{(3)} = (5 + 3 \cdot 0.53 + 0.972)/5 = 1.5124 \\
x_2^{(3)} = (2 \cdot 1.5124 - 0.972)/4 = 0.5132 \\
x_3^{(3)} = (-3 - 2 \cdot 1.5124 + 2 \cdot 0.5132)/(-5) = 0.99968.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(3)}, x^{(2)}\big) = \max\{|1.5124 - 1.46|, |0.5132 - 0.53|, |0.99968 - 0.972|\} = 0.0524 > \varepsilon = 0.01.
\]
For k = 3 we obtain
\[
\begin{cases}
x_1^{(4)} = (5 + 3 \cdot 0.5132 + 0.99968)/5 = 1.507856 \\
x_2^{(4)} = (2 \cdot 1.507856 - 0.99968)/4 = 0.504008 \\
x_3^{(4)} = (-3 - 2 \cdot 1.507856 + 2 \cdot 0.504008)/(-5) = 1.001539.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(4)}, x^{(3)}\big) = \max\{|1.507856 - 1.5124|, |0.504008 - 0.5132|, |1.001539 - 0.99968|\} = 0.009192 < \varepsilon = 0.01.
\]
Thus an approximation of the solution of the system with precision given by ε = 10⁻² is the following:
\[
x_1^{(4)} = 1.507856, \quad x_2^{(4)} = 0.504008, \quad x_3^{(4)} = 1.001539.
\]
Notice that, by using the Seidel-Gauss method, we have obtained the solution of this system within the error 10⁻² at step 4, as opposed to Jacobi's method, where we needed 14 steps.
Exercise 8. Solve the system (3.2) using the Seidel-Gauss method. What do you notice concerning the convergence of the sequence?
If both methods, Jacobi and Seidel-Gauss, are convergent, then we usually notice that the Seidel-Gauss method converges more rapidly to the exact solution of the linear system. But this is not necessarily true in all cases. Furthermore, there are situations when Jacobi's method converges, but the Seidel-Gauss method does not. At the same time, there are situations when the Seidel-Gauss method converges, but Jacobi's method does not. Thus we cannot really say which of these two iterative methods is "better", so we study both.
In what follows we present an example for which Jacobi's method converges, but the Seidel-Gauss method does not.
Example 11. We consider the system
(3.8)
\[
\begin{cases}
x_1 - x_2 + x_3 = 0 \\
-2x_1 + x_2 + x_3 = -4 \\
-4x_1 + 2x_2 + x_3 = -7.
\end{cases}
\]
a) Using Jacobi's method, try to solve the above system within the error 10⁻⁴.
b) Using the Seidel-Gauss method, try to solve the above system within the error 10⁻⁴.
Proof. We have that
\[
A = \begin{pmatrix} 1 & -1 & 1 \\ -2 & 1 & 1 \\ -4 & 2 & 1 \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} 0 \\ -4 \\ -7 \end{pmatrix}.
\]
We check if the matrix A is strictly diagonally dominant on rows or columns. We have
\[
|a_{11}| = 1 < |a_{12}| + |a_{13}| = |-1| + |1| = 2
\quad\text{and}\quad
|a_{11}| = 1 < |a_{21}| + |a_{31}| = |-2| + |-4| = 6.
\]
So, the matrix A is not strictly diagonally dominant on rows or columns, but it is still possible for the two iterative methods to converge even in this situation. Thus we proceed with the methods and we see what happens.
We write the initial system in an equivalent form:
\[
\begin{cases}
x_1 = x_2 - x_3 \\
x_2 = -4 + 2x_1 - x_3 \\
x_3 = -7 + 4x_1 - 2x_2.
\end{cases}
\]
a) We choose arbitrarily the initial approximation x^{(0)} = (0, 0, 0)^T and we consider the recurrence
\[
\begin{cases}
x_1^{(k+1)} = x_2^{(k)} - x_3^{(k)} \\
x_2^{(k+1)} = -4 + 2x_1^{(k)} - x_3^{(k)} \\
x_3^{(k+1)} = -7 + 4x_1^{(k)} - 2x_2^{(k)}.
\end{cases}
\]
For k = 0 we obtain
\[
\begin{cases}
x_1^{(1)} = x_2^{(0)} - x_3^{(0)} = 0 \\
x_2^{(1)} = -4 + 2x_1^{(0)} - x_3^{(0)} = -4 \\
x_3^{(1)} = -7 + 4x_1^{(0)} - 2x_2^{(0)} = -7.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(1)}, x^{(0)}\big) = \max\{|0 - 0|, |-4 - 0|, |-7 - 0|\} = 7 > \varepsilon = 0.0001.
\]
For k = 1 we get
\[
\begin{cases}
x_1^{(2)} = x_2^{(1)} - x_3^{(1)} = -4 + 7 = 3 \\
x_2^{(2)} = -4 + 2 \cdot 0 + 7 = 3 \\
x_3^{(2)} = -7 + 4 \cdot 0 - 2 \cdot (-4) = 1.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(2)}, x^{(1)}\big) = \max\{|3 - 0|, |3 + 4|, |1 + 7|\} = 8 > \varepsilon = 0.0001.
\]
For k = 2 we obtain
\[
\begin{cases}
x_1^{(3)} = x_2^{(2)} - x_3^{(2)} = 3 - 1 = 2 \\
x_2^{(3)} = -4 + 2 \cdot 3 - 1 = 1 \\
x_3^{(3)} = -7 + 4 \cdot 3 - 2 \cdot 3 = -1.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(3)}, x^{(2)}\big) = \max\{|2 - 3|, |1 - 3|, |-1 - 1|\} = 2 > \varepsilon = 0.0001.
\]
For k = 3 we obtain
\[
\begin{cases}
x_1^{(4)} = x_2^{(3)} - x_3^{(3)} = 1 + 1 = 2 \\
x_2^{(4)} = -4 + 2 \cdot 2 + 1 = 1 \\
x_3^{(4)} = -7 + 4 \cdot 2 - 2 \cdot 1 = -1.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(4)}, x^{(3)}\big) = \max\{|2 - 2|, |1 - 1|, |-1 + 1|\} = 0 < \varepsilon = 0.0001.
\]
So, the solution of the system within the error ε = 0.0001 is x^{(4)} = (2, 1, -1)^T. In fact, if we substitute this solution into the initial system (3.8), we notice that this is not just an approximation of the solution; we have arrived at the exact solution of the system.
b) We choose arbitrarily the initial approximation x^{(0)} = (0, 0, 0)^T and we consider the recurrence
\[
\begin{cases}
x_1^{(k+1)} = x_2^{(k)} - x_3^{(k)} \\
x_2^{(k+1)} = -4 + 2x_1^{(k+1)} - x_3^{(k)} \\
x_3^{(k+1)} = -7 + 4x_1^{(k+1)} - 2x_2^{(k+1)}.
\end{cases}
\]
For k = 0 we obtain
\[
\begin{cases}
x_1^{(1)} = x_2^{(0)} - x_3^{(0)} = 0 \\
x_2^{(1)} = -4 + 2x_1^{(1)} - x_3^{(0)} = -4 \\
x_3^{(1)} = -7 + 4x_1^{(1)} - 2x_2^{(1)} = 1.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(1)}, x^{(0)}\big) = \max\{|0 - 0|, |-4 - 0|, |1 - 0|\} = 4 > \varepsilon = 0.0001.
\]
For k = 1 we get
\[
\begin{cases}
x_1^{(2)} = x_2^{(1)} - x_3^{(1)} = -5 \\
x_2^{(2)} = -4 + 2x_1^{(2)} - x_3^{(1)} = -15 \\
x_3^{(2)} = -7 + 4x_1^{(2)} - 2x_2^{(2)} = 3.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(2)}, x^{(1)}\big) = \max\{|-5 - 0|, |-15 + 4|, |3 - 1|\} = 11 > \varepsilon = 0.0001.
\]
For k = 2 we obtain
\[
\begin{cases}
x_1^{(3)} = x_2^{(2)} - x_3^{(2)} = -15 - 3 = -18 \\
x_2^{(3)} = -4 + 2x_1^{(3)} - x_3^{(2)} = -43 \\
x_3^{(3)} = -7 + 4x_1^{(3)} - 2x_2^{(3)} = 7.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(3)}, x^{(2)}\big) = \max\{|-18 + 5|, |-43 + 15|, |7 - 3|\} = 28 > \varepsilon = 0.0001.
\]
For k = 3 we obtain
\[
\begin{cases}
x_1^{(4)} = x_2^{(3)} - x_3^{(3)} = -50 \\
x_2^{(4)} = -4 + 2x_1^{(4)} - x_3^{(3)} = -111 \\
x_3^{(4)} = -7 + 4x_1^{(4)} - 2x_2^{(4)} = 15.
\end{cases}
\]
We check the stop condition:
\[
d\big(x^{(4)}, x^{(3)}\big) = \max\{|-50 + 18|, |-111 + 43|, |15 - 7|\} = 68 > \varepsilon = 0.0001.
\]
For k = 4 we obtain
\[
\begin{cases}
x_1^{(5)} = x_2^{(4)} - x_3^{(4)} = -126 \\
x_2^{(5)} = -4 + 2x_1^{(5)} - x_3^{(4)} = -271 \\
x_3^{(5)} = -7 + 4x_1^{(5)} - 2x_2^{(5)} = 31.
\end{cases}
\]
By continuing to iterate over and over again with the help of a computer, we remark that the Seidel-Gauss method does not converge, because the distance between two consecutive terms of the sequence grows bigger and bigger instead of getting smaller.
We have seen in the above example a system for which the Seidel-Gauss method does not converge, while Jacobi's method converges quite rapidly, allowing us to obtain the exact solution of the system at the fourth step.
We exemplify next the opposite situation, in which Jacobi's method does not converge, while the Seidel-Gauss method does.
Exercise 9. We consider the system
\[
\begin{cases}
x_1 - \frac{3}{4}x_2 + \frac{1}{4}x_3 = 1 \\[1mm]
-\frac{3}{4}x_1 + x_2 - \frac{1}{2}x_3 = -\frac{3}{2} \\[1mm]
\frac{1}{4}x_1 - \frac{1}{2}x_2 + x_3 = \frac{3}{2}.
\end{cases}
\]
a) Using Jacobi's method, try to solve the above system within the error 10⁻².
b) Using the Seidel-Gauss method, try to solve the above system within the error 10⁻².
Hint: a) Jacobi's method does not converge, because the distance between two consecutive terms of the sequence that we construct via this method becomes bigger and bigger.
b) x_1^{(6)} = 0.014892, x_2^{(6)} = -0.986449, x_3^{(6)} = 1.003052.
Notice that, in order to make sure that the Seidel-Gauss method converges, we based the construction of the above system on the following property, which holds true only for the Seidel-Gauss method.
Proposition 1. If the matrix A associated to the linear system (3.6) is symmetric and positive-definite, then the sequence constructed by the Seidel-Gauss method converges to the solution of system (3.6).
Remark 25. The previous condition is not sufficient for Jacobi's method to converge; see Exercise 9, where the matrix associated to the linear system is symmetric and positive-definite, but the sequence constructed by Jacobi's method does not converge to the solution of the system.
We saw that Proposition 1 gives us a new criterion to establish whether the Seidel-Gauss method is convergent when the matrix associated to the linear system is not strictly diagonally dominant on rows or columns. For the convenience of the reader, we recall the definitions of the notions involved in it.
Definition 13. Let A = (a_ij)_{1≤i,j≤n} ∈ M_{n×n}(R). The matrix A is called a symmetric matrix if and only if A = Aᵀ, that is, a_ij = a_ji for all 1 ≤ i, j ≤ n.
Definition 14. A symmetric matrix A ∈ M_{n×n}(R) is called positive-definite if the associated quadratic form f(x) = xᵀAx has a positive value for every nonzero vector x ∈ Rⁿ.
Here are some properties that help us determine when a symmetric matrix is positive-definite.
Remark 26. A symmetric matrix is positive-definite if and only if all its eigenvalues are positive.
Theorem 2. (Sylvester's Criterion) A real symmetric matrix A = (a_ij)_{1≤i,j≤n} ∈ M_{n×n}(R) is positive-definite if and only if all its principal minors are positive, that is,
\[
\Delta_1 = a_{11} > 0, \quad
\Delta_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} > 0, \quad
\Delta_3 = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} > 0, \quad \ldots, \quad
\Delta_n = \begin{vmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{vmatrix} > 0.
\]
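Sylvester's criterion is straightforward to check numerically: compute the n leading principal minors and test their signs. A short Python sketch (our own helper; it uses numpy's determinant routine, although chio_det from Section 2.3 would work just as well):

def is_positive_definite(A):
    # Sylvester's criterion: all leading principal minors must be positive
    A = np.asarray(A, float)
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))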
3.3 Solving nonlinear equations by the successive approximations method
After discussing the solvability of linear systems by direct and iterative methods, it is time to discuss the solvability of nonlinear equations as well. Here we use an iterative method, called the successive approximations method, which is also based on the Contraction Principle (see Theorem 1), as was the case for the previous two iterative methods presented in this chapter.
Let us consider the equation
(3.9)
\[
f(x) = 0,
\]
where f : [a, b] → R. The target is to approximate the solution of equation (3.9), x∗ ∈ [a, b], by a sequence (x_n)_{n≥0} that converges to x∗.
The first step is to write the equation (3.9) in an equivalent form
(3.10)
\[
x = g(x).
\]
We construct the sequence (x_n)_{n≥0} as follows:
(3.11)
\[
x_{n+1} = g(x_n), \quad n \ge 0,
\]
where x0 ∈ [a, b] is arbitrarily chosen and represents the initial approximation of the solution x∗.
In fact, equation (3.9) could have multiple solutions on an interval [a, b]. As we have seen at the beginning of Chapter 3, the value of x0 does not really matter, because the sequence constructed by (3.11) converges to the unique solution of (3.10), as long as (3.10) has a unique solution. What happens if the solution of (3.10) is not unique? Then the choice of the initial value x0 of the sequence (x_n)_{n≥0} is relevant, since it will send us to one solution or another. So, the idea is to know to which specific interval each solution belongs, and to start with x0 in a specific interval (as tight as possible) in order to determine that specific solution.
How do we know where the solutions are? There are two approaches to determine the intervals of the solutions. One is analytical, and it is based on different methods studied in Mathematical Analysis, like the sequence of Rolle. The other is based on computer programs that can draw the graphs of functions, where we follow the intersection with the Ox axis. Now, we may ask ourselves: why do we need this successive approximations method if we can already see where the solutions are? We need it for precision, because, as seen before, we can establish from the beginning what is the error ε that we consider acceptable.
Due to all the above, in what follows we will consider that equation (3.9) admits a unique solution x∗ on [a, b], and our goal is to approximate this solution within a given error ε. We write equation (3.9) in its equivalent form (3.10) and we check if the hypothesis of Theorem 1 is fulfilled by the function g, that is, we check if g is a contraction. We recall that g is a contraction on [a, b] if and only if there exists q ∈ (0, 1) such that
(3.12)
\[
d(g(x), g(y)) \le q \cdot d(x, y) \quad \text{for all } x, y \in [a, b],
\]
where d represents a metric (distance) on R. We take
(3.13)
\[
d(\alpha, \beta) = |\alpha - \beta|,
\]
because this is the natural metric on R.
Exercise 10. Show that d : R × R → R by (3.13) is a distance (metric) on R.
Thus, condition (3.12) becomes
\[
|g(x) - g(y)| \le q\,|x - y| \quad \text{for all } x, y \in [a, b],
\]
and the stop criterion when approximating a solution of (3.10) within error ε is
(3.14)
\[
|x_{k+1} - x_k| < \varepsilon.
\]
If (3.14) takes place, we stop and write that x_{k+1} ≈ x∗ within error ε. If not, we calculate x_{k+2} and we test whether |x_{k+2} − x_{k+1}| < ε, etc. Notice that the smaller the contraction coefficient q is, the quicker (x_n)_n converges to x∗.
Remark 27. Taking into consideration the definition of the derivative of a function, we can deduce that, if g is derivable (or, equivalently, differentiable) on [a, b], then it is sufficient to prove that
\[
|g'(x)| \le q \in (0, 1) \quad \text{for all } x \in [a, b],
\]
in order to establish that g is a contraction on [a, b].
Hence we indicate the algorithm to follow when solving equation (3.9) by the successive approximations method:
Step 1: Write (3.9) in the equivalent form (3.10) and check if g is a contraction (use Remark 27 when g is differentiable). If g is a contraction, go to the next step. If g is not a contraction, find another g such that (3.9) can be written as (3.10).
Step 2: Choose x0 ∈ [a, b], calculate x1 = g(x0) and check if |x1 − x0| < ε. If so, then x1 ≈ x∗. If not, go to the next step.
Step 3: Calculate x2 = g(x1) and check if |x2 − x1| < ε. If so, write x2 ≈ x∗. If not, go to the next step, etc.
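The three steps collapse into a single loop in code. Here is a minimal Python sketch (the name successive_approximations and the cap max_iter are our own additions); for instance, with g from (3.17) below, successive_approximations(lambda x: -(x**3 + 1) / 5, 0.0, 3e-5) reproduces the value x4 obtained by hand in Example 12.

def successive_approximations(g, x0, eps, max_iter=1000):
    # fixed-point iteration x_{n+1} = g(x_n), stopped by criterion (3.14)
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < eps:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")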
Example 12. Using the method of successive approximations, determine the subunitary solution of the equation
(3.15)
\[
x^3 + 5x + 1 = 0,
\]
within error ε = 3 · 10⁻⁵.
Proof. Since we are looking for a subunitary solution of (3.15), we work on the interval [−1, 1] and we try to find a contraction g : [−1, 1] → R such that equation (3.15) can be written in the equivalent form g(x) = x.
An easy way to put (3.15) in the above form is to add x to both members of the equality:
\[
x^3 + 6x + 1 = x.
\]
Then g : [−1, 1] → R is defined by
(3.16)
\[
g(x) = x^3 + 6x + 1,
\]
and, since g is derivable on [−1, 1], we use Remark 27. We have
\[
|g'(x)| = |3x^2 + 6| \ge 6 \quad \text{when } x \in [-1, 1],
\]
so this g given by (3.16) is not a contraction.
We try another way of choosing g:
\[
x^3 + 5x + 1 = 0 \iff 5x = -x^3 - 1 \iff x = -\frac{x^3 + 1}{5}.
\]
We take g : [−1, 1] → R defined by
(3.17)
\[
g(x) = -\frac{x^3 + 1}{5}.
\]
This g is also derivable on [−1, 1], so we use again Remark 27 to test if it is a contraction. We have
\[
|g'(x)| = \frac{3}{5}\, x^2 \le \frac{3}{5} \in (0, 1) \quad \text{for all } x \in [-1, 1].
\]
Then g defined by (3.17) is a contraction of coefficient q = 3/5 ∈ (0, 1) and we pass to the next step of the algorithm.
Let us take x0 = 0 ∈ [−1, 1]. Then
\[
x_1 = g(x_0) = g(0) = -\frac{1}{5}.
\]
We test
\[
|x_1 - x_0| < \varepsilon \iff \left| -\frac{1}{5} - 0 \right| < 3 \cdot 10^{-5} \iff \frac{1}{5} < \frac{3}{100000},
\]
which is false, hence we pass to the next step. We calculate
\[
x_2 = g(x_1) = g\left(-\frac{1}{5}\right) = -\frac{\left(-\frac{1}{5}\right)^3 + 1}{5} = -\frac{124}{625} = -0.198400.
\]
We test
\[
|x_2 - x_1| < \varepsilon \iff \left| -\frac{124}{625} + \frac{1}{5} \right| < \frac{3}{100000} \iff \frac{1}{625} < \frac{3}{100000},
\]
which is false, hence we pass to the next step. We calculate
\[
x_3 = g(x_2) = g\left(-\frac{124}{625}\right) = -\frac{\left(-\frac{124}{625}\right)^3 + 1}{5} = -\frac{242234001}{1220703125} = -0.1984380936192.
\]
We test
\[
|x_3 - x_2| < \varepsilon \iff |-0.1984380936192 + 0.198400| < \frac{3}{100000} \iff 0.0000380936192 < 0.00003,
\]
which is false, hence we pass to the next step. We calculate
\[
x_4 = g(x_3) = g\left(-\frac{242234001}{1220703125}\right) \approx -0.19843719376902433.
\]
We test
\[
|x_4 - x_3| < \varepsilon \iff |-0.19843719376902433 + 0.1984380936192| < 0.00003,
\]
which is true, so we write that x4 ≈ −0.19843719376902433 approximates the solution of (3.15) within error ε = 3 · 10⁻⁵, and we stop.
Remark 28. There are other possibilities to find a contraction g satisfying an equation of the form (3.10) which is equivalent to (3.15). For example,
\[
x^3 + 5x + 1 = 0 \iff x(x^2 + 5) = -1 \iff x = -\frac{1}{x^2 + 5}.
\]
We take g : [−1, 1] → R given by
(3.18)
\[
g(x) = -\frac{1}{x^2 + 5}.
\]
Since g is derivable on [−1, 1], we use Remark 27. We have
\[
|g'(x)| = \frac{2|x|}{(x^2 + 5)^2} \le \frac{2}{36} \quad \text{for all } x \in [-1, 1],
\]
since g′ is increasing on [−1, 1]. Therefore g defined by (3.18) is a contraction, and the contraction coefficient is q = 1/18 ∈ (0, 1).
Moreover, since q = 1/18 < 3/5, where 3/5 was the contraction coefficient of the previous g, the one given by (3.17), we expect that the sequence that we construct now will converge more rapidly. Let us see.
We take x0 = 0. Then x1 = g(x0) = −1/(0² + 5) = −1/5. As before, |x1 − x0| > ε and we continue the iterations:
\[
x_2 = g(x_1) = g\left(-\frac{1}{5}\right) = -\frac{1}{\left(-\frac{1}{5}\right)^2 + 5} = -\frac{25}{126}.
\]
We test |x2 − x1| < ε ⇔ |−25/126 + 1/5| < 3/100000, which is false, hence we pass to the next step. We calculate
\[
x_3 = g(x_2) = g\left(-\frac{25}{126}\right) = -\frac{1}{\left(-\frac{25}{126}\right)^2 + 5} = -\frac{15876}{80005}.
\]
We test
\[
|x_3 - x_2| < \varepsilon \iff \left| -\frac{15876}{80005} + \frac{25}{126} \right| < \frac{3}{100000} \iff \frac{251}{10080630} < \frac{3}{100000},
\]
which is true, hence x3 = −15876/80005 ≈ −0.1984375977 approximates the solution of equation (3.15) within error ε = 3 · 10⁻⁵ and we stop (one step earlier than before).
We have seen above two possibilities to choose g such that it is a contraction. However, there are situations when it is not so easy to identify a contraction g. If that is the case, the next result could come in handy.
Proposition 2. If the function g : [a, b] → [a, b] is derivable, invertible and
\[
|g'(x)| > 1 \quad \text{for all } x \in [a, b],
\]
then the inverse function g⁻¹ is a contraction on [a, b].
Proof. For every x ∈ [a, b] we have
\[
g(g^{-1}(x)) = x \ \Rightarrow\ g'(g^{-1}(x)) \cdot (g^{-1})'(x) = 1 \ \Rightarrow\ (g^{-1})'(x) = \frac{1}{g'(g^{-1}(x))},
\]
and so
\[
\left| (g^{-1})'(x) \right| = \frac{1}{\left| g'(g^{-1}(x)) \right|} < 1.
\]
Let us see an example that relies on Proposition 2.
Example 13. Using the method of successive approximations, determine the root x∗ ∈ [1, 2] of the equation
\[
x^3 + 3x^2 = 16,
\]
with precision given by ε = 6 · 10⁻².
Proof. Let
\[
f(x) = x^3 + 3x^2 - 16, \quad x \in [1, 2].
\]
We search for a convenient g in order to write the previous equation in the equivalent form x = g(x). We have
\[
x^3 + 3x^2 = 16 \iff x^2(x + 3) = 16 \iff x + 3 = \frac{16}{x^2} \iff x = \underbrace{\frac{16}{x^2} - 3}_{=g(x)},
\]
where g(x) = 16/x² − 3.
We check if the function g is a contraction on [1, 2], i.e. |g′(x)| < 1 for every x ∈ [1, 2]. We have
\[
|g'(x)| = \frac{32}{x^3} \ge \frac{32}{2^3} = 4 > 1 \quad \text{for every } x \in [1, 2],
\]
since the function 32/x³ is decreasing on the interval [1, 2]. So, the function g is not a contraction on [1, 2]. We try to use Proposition 2.
In this case we write the equation in the equivalent form x = h(x), where h(x) = g⁻¹(x) is the inverse function of g. More precisely, we take out x on the right side of the equation:
\[
x = g(x) \iff x = \frac{16}{x^2} - 3 \iff x + 3 = \frac{16}{x^2} \iff x^2 = \frac{16}{x + 3} \iff x = \frac{4}{\sqrt{x + 3}} \iff x = h(x),
\]
where h(x) = 4/√(x + 3), x ∈ [1, 2]. We note that h([1, 2]) = [4/√5, 2] ⊆ [1, 2].
We check if the function h(x) = 4(x + 3)^{-1/2} is a contraction on [1, 2], i.e. |h′(x)| < 1 for every x ∈ [1, 2]. We have
\[
|h'(x)| = \left| 4 \cdot \left(-\frac{1}{2}\right)(x + 3)^{-3/2} \right| = \frac{2}{\sqrt{(x + 3)^3}} \le \frac{2}{\sqrt{(1 + 3)^3}} < 1 \quad \text{for every } x \in [1, 2].
\]
Therefore, the function h is a contraction on [1, 2] and we can move to the next step of the algorithm.
We arbitrarily choose x0 = 1 from [1, 2] and, as usual, we consider the recurrence
\[
x_{n+1} = h(x_n) \iff x_{n+1} = \frac{4}{\sqrt{x_n + 3}}, \quad n \ge 0.
\]
For n = 0, we have
\[
x_1 = h(x_0) = \frac{4}{\sqrt{x_0 + 3}} = \frac{4}{\sqrt{1 + 3}} = 2.
\]
We check the stop condition:
\[
|x_1 - x_0| = |2 - 1| = 1 > \varepsilon = \frac{6}{100}.
\]
For n = 1, we have
\[
x_2 = h(x_1) = \frac{4}{\sqrt{x_1 + 3}} = \frac{4}{\sqrt{2 + 3}} = \frac{4}{\sqrt{5}} \approx 1.788854382.
\]
We check the stop condition:
\[
|x_2 - x_1| = \left| \frac{4}{\sqrt{5}} - 2 \right| \approx 0.2111456 > \varepsilon = 0.06.
\]
For n = 2, we have
\[
x_3 = h(x_2) = \frac{4}{\sqrt{x_2 + 3}} = \frac{4}{\sqrt{\frac{4}{\sqrt{5}} + 3}} = \frac{4\sqrt[4]{5}}{\sqrt{4 + 3\sqrt{5}}} \approx 1.8278652467.
\]
We check the stop condition:
\[
|x_3 - x_2| = \left| \frac{4\sqrt[4]{5}}{\sqrt{4 + 3\sqrt{5}}} - \frac{4}{\sqrt{5}} \right| \approx 0.0390108647 < \varepsilon = 0.06.
\]
So,
\[
x_3 = \frac{4\sqrt[4]{5}}{\sqrt{4 + 3\sqrt{5}}}
\]
is the solution of the equation within error ε = 6 · 10⁻².
Finally, to help the reader better understand this method, we present another solved exercise.
Example 14. Using the method of successive approximations, determine the root x∗ ∈ [1, 2] of the equation
\[
x = \sqrt[4]{x + 2},
\]
with precision given by ε = 10⁻².
Proof. We consider the function
\[
f(x) = x - \sqrt[4]{x + 2}, \quad x \in [1, 2].
\]
We search for a contraction g in order to write the previous equation in an equivalent form x = g(x).
72
CHAPTER 3. SOLVING LINEAR SYSTEMS AND NONLINEAR EQUATIONS
This time the choice of g is obvious: g(x) = ∜(x + 2), x ∈ [1, 2].
We prove that g is a contraction on the interval [1, 2], i.e. |g′(x)| < 1 for every x ∈ [1, 2]. We have
|g′(x)| = (1/4) · (x + 2)^(−3/4) = 1/(4·∜(x + 2)³) ≤ 1/(4·∜(1 + 2)³) < 1, for every x ∈ [1, 2],
since the function 1/(4·∜(x + 2)³) is decreasing on [1, 2]. Therefore, g is a contraction on [1, 2].
We arbitrarily choose x0 = 2 from [1, 2] and we consider the recurrence
xn+1 = g(xn) ⇔ xn+1 = ∜(xn + 2), n ≥ 0.
For n = 0, we get
x1 = g(x0) = ∜(x0 + 2) = ∜4 = √2.
We check the stop condition
|x1 − x0| = |√2 − 2| ≈ 0.585786 > ε = 0.01.
For n = 1, we have
x2 = g(x1) = ∜(x1 + 2) = ∜(√2 + 2) ≈ 1.3593230172.
We check the stop condition
|x2 − x1| = |∜(√2 + 2) − √2| ≈ 0.054891 > ε = 0.01.
For n = 2, we have
x3 = g(x2) = ∜(x2 + 2) = ∜(∜(√2 + 2) + 2) ≈ 1.3538262837.
We check the stop condition
|x3 − x2| ≈ 0.005497 < ε = 0.01.
So,
x3 = ∜(∜(√2 + 2) + 2)
approximates the solution of the equation within error ε = 10⁻².
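The computations above follow the same mechanical pattern, so they are easy to automate. Below is a minimal Python sketch of the method of successive approximations (the helper name fixed_point and the iteration cap max_iter are our own choices, not part of the method); it reproduces the approximations obtained in Examples 13 and 14, and can also be used to check the answers to Exercise 11 below.

```python
import math

def fixed_point(g, x0, eps, max_iter=100):
    """Successive approximations x_{n+1} = g(x_n).

    Stops at the first n with |x_{n+1} - x_n| < eps and returns x_{n+1}.
    """
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < eps:
            return x_new
        x = x_new
    raise RuntimeError("the stop condition was not met within max_iter steps")

# Example 13: h(x) = 4/sqrt(x + 3) on [1, 2], x0 = 1, eps = 6e-2
print(fixed_point(lambda x: 4 / math.sqrt(x + 3), 1.0, 6e-2))   # ~ 1.8278652467
# Example 14: g(x) = (x + 2)^(1/4) on [1, 2], x0 = 2, eps = 1e-2
print(fixed_point(lambda x: (x + 2) ** 0.25, 2.0, 1e-2))        # ~ 1.3538262837
```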
Exercise 11. Approximate the solution of each of the following equations within error ε on the specified interval:
a) x = 4/(x³ − 6), x ∈ [3/2, 5/2], ε = 10⁻³;
b) x³ + 10x = 1, x ∈ [0, 1], ε = 3 · 10⁻³;
c) ∛(8x − 3) − 2x = 0, x ∈ [0, 1], ε = 2 · 10⁻²;
d) x³ − 6x + 2 = 0, x ∈ [−1, 1], ε = 3 · 10⁻⁴;
e) x⁵ + 8x + 2 = 0, x ∈ [−1/2, 1/2], ε = 10⁻²;
f) x³ + 6x² = 4, x ∈ [0, 1], ε = 6 · 10⁻³;
g) x = (x − 2)³/3, x ∈ [4, 5], ε = 5 · 10⁻³.
Chapter 4
Eigenvalues and eigenvectors
Definition 15. A nonzero vector v ∈ Mn×1(R) is called an eigenvector of the matrix A ∈ Mn×n(R) if there exists λ ∈ C such that
Av = λv.
The scalar λ is called an eigenvalue of the matrix A corresponding to the eigenvector v.
Example 15. Let the matrix A = [[1, −1], [−4, 1]] (rows listed), the vector v1 = (1, 2)ᵀ and λ1 = −1. We have
Av1 = [[1, −1], [−4, 1]] · (1, 2)ᵀ = (−1, −2)ᵀ = λ1 v1.
So, λ1 = −1 is an eigenvalue of the matrix A and v1 is an eigenvector corresponding to λ1 = −1.
Exercise 12. Show that λ2 = 3 is an eigenvalue of the matrix A and that v2 = (1, −2)ᵀ is an eigenvector corresponding to λ2 = 3.
Definition 16. Let A ∈ Mn×n(R). The polynomial defined by
pA(λ) = det(A − λIn)
is called the characteristic polynomial of the matrix A.
Remark 29. The polynomial pA(λ) is a polynomial of degree n, with real coefficients:
(4.1) pA(λ) = λⁿ + c1 λⁿ⁻¹ + c2 λⁿ⁻² + ... + cn−1 λ + cn.
Very importantly, the roots λ1, λ2, ..., λn of the characteristic polynomial pA are the eigenvalues associated to A.
Indeed, Av = λv is equivalent to (A − λIn)v = 0 and this homogeneous linear system has nonzero solutions if and only if det(A − λIn) = 0, meaning that pA(λ) = 0.
Definition 17. Let A ∈ Mn×n(R). We say that
σ(A) = {λ : λ is an eigenvalue of A}
is the spectrum of A and
ρ(A) = max{|λ| : λ ∈ σ(A)}
is the spectral radius of A.
Remark 30. If ρ(B) < 1, where B is given in (3.7), then the iterative methods for solving linear systems, Jacobi and Seidel-Gauss, are convergent.
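Computationally, once the eigenvalues are available, the spectral radius is immediate; a minimal Python sketch (assuming NumPy; the test matrix is our own toy example):

```python
import numpy as np

def spectral_radius(M):
    """rho(M): the largest modulus among the eigenvalues of M."""
    return max(abs(lam) for lam in np.linalg.eigvals(M))

# A matrix with rho < 1, which by Remark 30 would give convergent iterations.
print(spectral_radius(np.array([[0.0, 0.5], [0.25, 0.0]])))   # ~ 0.3536
```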
Finding the eigenvalues associated to a matrix A plays a decisive role in several branches of mathematics and physics. From a mathematical point of view, see for example Proposition 1 and Remark 26 from this study. Also, the signs of the eigenvalues of the Hessian matrix associated to a C² function can help us establish whether that function admits local minimum or maximum points. And the examples that illustrate the importance of the eigenvalues could go on. From a physical point of view, eigenvalue problems are related to the analysis of the dynamic behavior of mechanical, electrical, fluid, thermal, and structural systems. In addition, they are also relevant in the analysis of control systems. Hence, in what follows we focus on the computation of the characteristic polynomial associated to a square matrix A, since the roots of this polynomial are in fact the eigenvalues associated to A.
4.1  Krylov's method to determine the characteristic polynomial
We consider a matrix A ∈ Mn×n(R). To determine the characteristic polynomial associated to A, that is, the polynomial pA given by (4.1), we have to determine the coefficients ci ∈ R, i ∈ {1, 2, ..., n}. To this purpose we apply the Cayley-Hamilton Theorem, which states that
pA(A) = On.
This means that
Aⁿ + c1 Aⁿ⁻¹ + c2 Aⁿ⁻² + ... + cn−1 A + cn In = On.
We choose an arbitrary y(0) ∈ Mn×1(R) \ {0n} and we multiply to the right by y(0), that is,
(4.2) Aⁿ y(0) + c1 Aⁿ⁻¹ y(0) + c2 Aⁿ⁻² y(0) + ... + cn−1 A y(0) + cn y(0) = 0n.
For simplicity, we denote
A · y(0) = y(1), A² · y(0) = y(2), ..., Aᵏ · y(0) = y(k), ..., Aⁿ · y(0) = y(n),
and we notice that
y(k) = A y(k−1) for all k ∈ {1, 2, ..., n}.
Then relation (4.2) becomes
c1 y(n−1) + c2 y(n−2) + ... + cn−1 y(1) + cn y(0) = −y(n),
or, equivalently,
(4.3) B · (c1, c2, ..., cn)ᵀ = −y(n),
where
(4.4) B = [y(n−1) y(n−2) ... y(1) y(0)],
meaning that the columns of B are exactly y(n−1), y(n−2), ..., y(0). We have arrived at a linear system (4.3) which can be treated using the methods that we studied in the earlier chapters of our work. Hence, to find out whether system (4.3) has a unique solution, we can apply Chio's method to calculate the determinant of the matrix B.
If det B = 0, we realize that (4.3) does not admit a unique solution, so we go back and choose another y(0) ∈ Mn×1(R) \ {0n}.
If det B ≠ 0, we use the direct methods (the Gauss elimination method or the L − R factorization method) to determine the unique solution of (4.3), that is,
C = (c1, c2, ..., cn)ᵀ.
By finding c1, c2, ..., cn we have arrived at the characteristic polynomial associated to A, pA(λ) = λⁿ + c1 λⁿ⁻¹ + ... + cn. Moreover, Krylov's approach allows us to further determine the inverse of A. Indeed, if cn ≠ 0,
(4.5) A⁻¹ = −(1/cn) · (Aⁿ⁻¹ + c1 Aⁿ⁻² + ... + cn−1 In).
If, on the other hand, cn = 0, then A is not invertible.
Now we concentrate on computing the eigenvalues λ1, λ2, ..., λn and, if these eigenvalues are real and λi ≠ λj whenever i ≠ j, we can also determine the corresponding eigenvectors v1, v2, ..., vn associated to A. To solve the characteristic equation pA(λ) = 0 we can use the methods that were taught in high school for solving nonlinear equations. Once we have determined the eigenvalues, if λi ∈ R∗ and λi ≠ λj for i ≠ j, i, j ∈ {1, 2, ..., n}, then we compute
qi(λ) = pA(λ)/(λ − λi) = λⁿ⁻¹ + q1,i λⁿ⁻² + q2,i λⁿ⁻³ + ... + qn−2,i λ + qn−1,i   (1 ≤ i ≤ n),
and we determine the eigenvectors as
vi = y(n−1) + q1,i y(n−2) + q2,i y(n−3) + ... + qn−2,i y(1) + qn−1,i y(0)   (1 ≤ i ≤ n).
To synthesize, we present the algorithm for finding the characteristic polynomial, the eigenvalues and the eigenvectors of the matrix A.
Step 1. We arbitrarily choose y(0) ∈ Mn×1(R) \ {0n}.
Step 2. We compute y(1), y(2), ..., y(n−1) using the formula
y(k) = A y(k−1), 1 ≤ k ≤ n − 1.
Step 3. We test if det(B) = 0, where B is defined by (4.4).
If det(B) = 0, then we go back to Step 1.
If det(B) ≠ 0, then we move to Step 4.
Step 4. We compute y(n) = A y(n−1) and then we solve the linear system
(4.6) B · (c1, c2, ..., cn)ᵀ = −y(n).
Step 5. We write down the characteristic polynomial
pA(λ) = λⁿ + c1 λⁿ⁻¹ + ... + cn−1 λ + cn,
and we solve the equation pA(λ) = 0 in order to determine the eigenvalues λ1, λ2, ..., λn associated to A.
Step 6. If the eigenvalues are real and distinct numbers, then Krylov's method allows us to compute the eigenvectors. More precisely, we compute
qi(λ) = pA(λ)/(λ − λi) = λⁿ⁻¹ + q1,i λⁿ⁻² + q2,i λⁿ⁻³ + ... + qn−2,i λ + qn−1,i   (1 ≤ i ≤ n),
and an eigenvector corresponding to λi is
y(n−1) + q1,i y(n−2) + q2,i y(n−3) + ... + qn−2,i y(1) + qn−1,i y(0).
Step 7. Moreover, if the free coefficient cn of the characteristic polynomial is nonzero, the matrix A is invertible and
A⁻¹ = −(1/cn) · (Aⁿ⁻¹ + c1 Aⁿ⁻² + ... + cn−1 In),
where In is the unit matrix.
If the free coefficient cn = 0, then the matrix A is not invertible.
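To make the algorithm concrete, here is a minimal Python sketch of it (assuming NumPy; the function name krylov, the default starting vector y(0) = (1, 0, ..., 0)ᵀ and the tolerance used to test det(B) are our own choices). The function carries out Steps 1–4; Step 5 (finding the roots of pA) and Step 7 (the inverse) are shown in the usage below, on the matrix of Example 17 that follows. If det(B) vanishes, the function raises an error and one simply calls it again with another y(0), exactly as in Step 3.

```python
import numpy as np

def krylov(A, y0=None, tol=1e-12):
    """Coefficients c1, ..., cn of p_A via Krylov's method (Steps 1-4).

    Returns c with p_A(lam) = lam^n + c[0] lam^(n-1) + ... + c[n-1].
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if y0 is None:                      # Step 1: arbitrary nonzero y^(0)
        y0 = np.zeros(n)
        y0[0] = 1.0
    ys = [np.asarray(y0, dtype=float)]  # Step 2: y^(k) = A y^(k-1)
    for _ in range(n - 1):
        ys.append(A @ ys[-1])
    B = np.column_stack(ys[::-1])       # columns y^(n-1), ..., y^(0), see (4.4)
    if abs(np.linalg.det(B)) < tol:     # Step 3: restart with another y^(0)
        raise ValueError("det(B) = 0: choose another starting vector y^(0)")
    yn = A @ ys[-1]                     # Step 4: y^(n), then solve (4.6)
    return np.linalg.solve(B, -yn)

A = np.array([[3.0, 3.0, -1.0],
              [1.0, 3.0, -1.0],
              [2.0, 3.0, 0.0]])
c = krylov(A)
print(np.round(c, 10))                       # [-6. 11. -6.]: p_A = lam^3 - 6 lam^2 + 11 lam - 6
print(np.roots(np.concatenate(([1.0], c))))  # Step 5: eigenvalues 3, 2, 1
# Step 7: A^(-1) = -(A^2 + c1 A + c2 I) / c3, valid here since c3 != 0.
A_inv = -(A @ A + c[0] * A + c[1] * np.eye(3)) / c[2]
print(np.round(6 * A_inv))                   # 6 A^(-1) = [[3,-3,0],[-2,2,2],[-3,-3,6]]
```

Step 6 (the eigenvectors) can be added along the same lines, by dividing pA by λ − λi and combining the Krylov vectors with the resulting coefficients.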
In what follows we present some solved exercises for the convenience of the reader.
Example 16. Using Krylov's method, determine the eigenvalues and eigenvectors corresponding to the matrix
A = [[1, −1], [−4, 1]].
Is the matrix A invertible? If so, determine the inverse of A, using Krylov's method.
Proof. Step 1. We choose arbitrarily y(0) = (1, 0)ᵀ ≠ (0, 0)ᵀ, and we consider the vectorial recurrence
y(k+1) = A · y(k), 0 ≤ k ≤ 1.
Step 2. We compute y(1):
y(1) = A · y(0) = [[1, −1], [−4, 1]] · (1, 0)ᵀ = (1, −4)ᵀ.
Step 3. We test if det(B) ≠ 0, where B = [y(1) y(0)] = [[1, 1], [−4, 0]]. We notice that det(B) = 4 ≠ 0 and we move further.
Step 4. We compute y(2):
y(2) = A · y(1) = [[1, −1], [−4, 1]] · (1, −4)ᵀ = (5, −8)ᵀ.
We solve the linear system
[[1, 1], [−4, 0]] · (c1, c2)ᵀ = −y(2),
which is equivalent to
c1 + c2 = −5, −4c1 = 8 ⇒ c1 = −2, c2 = −3.
Step 5. The characteristic polynomial corresponding to the matrix A is
pA(λ) = λ² + c1 λ + c2 = λ² − 2λ − 3.
To determine the eigenvalues, we solve
pA(λ) = 0 ⇔ λ² − 2λ − 3 = 0.
This equation is equivalent to
(λ + 1)(λ − 3) = 0,
so we deduce that λ1 = −1 and λ2 = 3. Alternatively, we could have calculated ∆ = 16 to find λ1,2 = (2 ± √∆)/2.
Step 6. Since the eigenvalues are distinct real numbers, Krylov's method allows us to compute the eigenvectors.
For the eigenvalue λ1 = −1, we compute
q1(λ) = pA(λ)/(λ − λ1) = λ − 3.
An eigenvector corresponding to the eigenvalue λ1 = −1 is
y(1) − 3y(0) = (1, −4)ᵀ − 3 · (1, 0)ᵀ = (−2, −4)ᵀ.
For the eigenvalue λ2 = 3, we compute
q2(λ) = pA(λ)/(λ − λ2) = λ + 1.
An eigenvector corresponding to the eigenvalue λ2 = 3 is
y(1) + y(0) = (1, −4)ᵀ + (1, 0)ᵀ = (2, −4)ᵀ.
Step 7. Since the free term of the characteristic polynomial, c2 = −3, is nonzero, the matrix A is invertible and its inverse is given by the following formula:
A⁻¹ = −(1/c2) · (A + c1 I2) = (1/3) · (A − 2I2) = (1/3) · ([[1, −1], [−4, 1]] − 2 · [[1, 0], [0, 1]]) = (1/3) · [[−1, −1], [−4, −1]].
Example 17. Using Krylov's method, determine the eigenvalues and eigenvectors corresponding to the matrix
A = [[3, 3, −1], [1, 3, −1], [2, 3, 0]].
Is the matrix A invertible? If so, determine the inverse of A, using Krylov's method.
Proof. Step 1. We choose arbitrarily y(0) = (1, 0, 0)ᵀ ≠ (0, 0, 0)ᵀ, and we consider the vectorial recurrence
y(k+1) = A · y(k), 0 ≤ k ≤ 2.
Step 2. We compute y(1) and y(2):
y(1) = A · y(0) = (3, 1, 2)ᵀ,
y(2) = A · y(1) = (10, 4, 9)ᵀ.
Step 3. We test if det(B) ≠ 0, where B = [y(2) y(1) y(0)] = [[10, 3, 1], [4, 1, 0], [9, 2, 0]]. We notice that det(B) = −1 ≠ 0 and we move further.
Step 4. We compute y(3):
y(3) = A · y(2) = (33, 13, 32)ᵀ.
We solve the linear system
B · (c1, c2, c3)ᵀ = −y(3),
which is equivalent to
(4.7) [[10, 3, 1], [4, 1, 0], [9, 2, 0]] · (c1, c2, c3)ᵀ = −(33, 13, 32)ᵀ.
We note that (c1, c2, c3)ᵀ is the vector which contains the coefficients of the characteristic polynomial.
We write system (4.7) in the equivalent form
10c1 + 3c2 + c3 = −33, 4c1 + c2 = −13, 9c1 + 2c2 = −32 ⇒ c1 = −6, c2 = 11, c3 = −6.
Step 5. The characteristic polynomial corresponding to the matrix A is
pA(λ) = λ³ + c1 λ² + c2 λ + c3 = λ³ − 6λ² + 11λ − 6.
To determine the eigenvalues, we solve the following equation:
pA(λ) = 0 ⇔ λ³ − 6λ² + 11λ − 6 = 0.
We notice that this is a polynomial equation with integer coefficients. Thus, if pA admits rational roots, these must be in the set D6 = {±1, ±2, ±3, ±6}. We test these values to see if any of them is a root of pA.
Since
pA(1) = 1 − 6 + 11 − 6 = 0,
we deduce that λ1 = 1. Next we could test the other values to see if any of them is an eigenvalue as well, we could divide pA(λ) by (λ − 1), or we could apply Horner's rule for polynomials. For generality, we usually apply Horner's rule or we divide the polynomial by (λ − λ1). We get
pA(λ) = (λ − 1)(λ² − 5λ + 6).
Solving λ² − 5λ + 6 = 0 we obtain λ2 = 2 and λ3 = 3.
Step 6. Since the eigenvalues are distinct real numbers, Krylov's method allows us to compute the eigenvectors.
For the eigenvalue λ1 = 1, we compute
q1(λ) = pA(λ)/(λ − λ1) = (λ − 2)(λ − 3) = λ² − 5λ + 6.
An eigenvector corresponding to the eigenvalue λ1 = 1 is
y(2) − 5y(1) + 6y(0) = (10, 4, 9)ᵀ − 5 · (3, 1, 2)ᵀ + 6 · (1, 0, 0)ᵀ = (1, −1, −1)ᵀ.
For the eigenvalue λ2 = 2, we compute
q2(λ) = pA(λ)/(λ − λ2) = (λ − 1)(λ − 3) = λ² − 4λ + 3.
An eigenvector corresponding to the eigenvalue λ2 = 2 is
y(2) − 4y(1) + 3y(0) = (10, 4, 9)ᵀ − 4 · (3, 1, 2)ᵀ + 3 · (1, 0, 0)ᵀ = (1, 0, 1)ᵀ.
For the eigenvalue λ3 = 3, we compute
q3(λ) = pA(λ)/(λ − λ3) = (λ − 1)(λ − 2) = λ² − 3λ + 2.
An eigenvector corresponding to the eigenvalue λ3 = 3 is
y(2) − 3y(1) + 2y(0) = (10, 4, 9)ᵀ − 3 · (3, 1, 2)ᵀ + 2 · (1, 0, 0)ᵀ = (3, 1, 3)ᵀ.
Step 7. Moreover, since the free term of the characteristic polynomial, c3 = −6, is nonzero, the matrix A is invertible and its inverse is given by the formula
A⁻¹ = −(1/c3) · (A² + c1 A + c2 I3) = (1/6) · (A² − 6A + 11 I3) =
= (1/6) · ([[10, 15, −6], [4, 9, −4], [9, 15, −5]] − 6 · [[3, 3, −1], [1, 3, −1], [2, 3, 0]] + 11 · [[1, 0, 0], [0, 1, 0], [0, 0, 1]]) = (1/6) · [[3, −3, 0], [−2, 2, 2], [−3, −3, 6]].
Remark 31. In the above examples, we chose y(0) to have as many zeros as possible, just to ease our calculus. The method works for any y(0) provided that det(B) ≠ 0, where B is given by (4.4).
Example 18. Using Krylov's method, determine the eigenvalues and eigenvectors corresponding to the matrix
A = [[−2, 1, −1], [1, −1, 2], [−1, 2, −1]].
Is the matrix A invertible? If so, determine the inverse of A, using Krylov's method.
Proof. Step 1. We choose arbitrarily y(0) = (1, 0, 0)ᵀ ≠ (0, 0, 0)ᵀ, and we consider the vectorial recurrence
y(k+1) = A · y(k), 0 ≤ k ≤ 2.
Step 2. We compute y(1) and y(2):
y(1) = A · y(0) = (−2, 1, −1)ᵀ,
y(2) = A · y(1) = (6, −5, 5)ᵀ.
Step 3. We test if det(B) ≠ 0, where B = [y(2) y(1) y(0)] = [[6, −2, 1], [−5, 1, 0], [5, −1, 0]]. We notice that det(B) = 0, so we return to Step 1.
Step 1'. We choose another
y(0) = (0, 1, 0)ᵀ ≠ (0, 0, 0)ᵀ.
Step 2'. We compute y(1) and y(2):
y(1) = A · y(0) = (1, −1, 2)ᵀ,
y(2) = A · y(1) = (−5, 6, −5)ᵀ.
Step 3'. We test if det(B) ≠ 0, where B = [y(2) y(1) y(0)] = [[−5, 1, 0], [6, −1, 1], [−5, 2, 0]]. We notice that det(B) = 5 ≠ 0 and we move further.
Step 4'. We compute y(3):
y(3) = A · y(2) = (21, −21, 22)ᵀ.
We solve the linear system
B · (c1, c2, c3)ᵀ = −y(3),
which is equivalent to
(4.8) [[−5, 1, 0], [6, −1, 1], [−5, 2, 0]] · (c1, c2, c3)ᵀ = −(21, −21, 22)ᵀ.
We note that (c1, c2, c3)ᵀ is the vector which contains the coefficients of the characteristic polynomial.
We write system (4.8) in the equivalent form
−5c1 + c2 = −21, 6c1 − c2 + c3 = 21, −5c1 + 2c2 = −22 ⇒ c1 = 4, c2 = −1, c3 = −4.
Step 5'. The characteristic polynomial corresponding to the matrix A is
pA(λ) = λ³ + c1 λ² + c2 λ + c3 = λ³ + 4λ² − λ − 4.
To determine the eigenvalues, we solve the following equation:
pA(λ) = 0 ⇔ λ³ + 4λ² − λ − 4 = 0.
We notice that this is a polynomial equation with integer coefficients. Thus, if pA admits rational roots, these must be in the set D4 = {±1, ±2, ±4}. We test these values to see if any of them is a root of pA.
Since
pA(1) = 1 + 4 − 1 − 4 = 0,
we deduce that λ1 = 1. We apply here Horner's rule (or we divide the polynomial pA by λ − λ1) and we deduce that
pA(λ) = (λ − 1)(λ² + 5λ + 4).
Solving λ² + 5λ + 4 = 0 we obtain λ2 = −1 and λ3 = −4.
Step 6'. Since the eigenvalues are distinct real numbers, Krylov's method allows us to compute the eigenvectors.
For the eigenvalue λ1 = 1, we compute
q1(λ) = pA(λ)/(λ − λ1) = (λ + 1)(λ + 4) = λ² + 5λ + 4.
An eigenvector corresponding to the eigenvalue λ1 = 1 is
y(2) + 5y(1) + 4y(0) = (−5, 6, −5)ᵀ + 5 · (1, −1, 2)ᵀ + 4 · (0, 1, 0)ᵀ = (0, 5, 5)ᵀ.
For the eigenvalue λ2 = −1, we compute
q2(λ) = pA(λ)/(λ − λ2) = (λ − 1)(λ + 4) = λ² + 3λ − 4.
An eigenvector corresponding to the eigenvalue λ2 = −1 is
y(2) + 3y(1) − 4y(0) = (−5, 6, −5)ᵀ + 3 · (1, −1, 2)ᵀ − 4 · (0, 1, 0)ᵀ = (−2, −1, 1)ᵀ.
For the eigenvalue λ3 = −4, we compute
q3(λ) = pA(λ)/(λ − λ3) = (λ − 1)(λ + 1) = λ² − 1.
An eigenvector corresponding to the eigenvalue λ3 = −4 is
y(2) − y(0) = (−5, 6, −5)ᵀ − (0, 1, 0)ᵀ = (−5, 5, −5)ᵀ.
Step 7. Moreover, since the free term of the characteristic polynomial, c3 = −4, is nonzero, the matrix A is invertible and its inverse is given by the formula
A⁻¹ = −(1/c3) · (A² + c1 A + c2 I3) = (1/4) · (A² + 4A − I3) =
= (1/4) · ([[6, −5, 5], [−5, 6, −5], [5, −5, 6]] + 4 · [[−2, 1, −1], [1, −1, 2], [−1, 2, −1]] − [[1, 0, 0], [0, 1, 0], [0, 0, 1]]) = (1/4) · [[−3, −1, 1], [−1, 1, 3], [1, 3, 1]].
Remark 32. In the above examples solving the linear systems B · c = −y(n) was quite easy and it was not necessary to use the Gauss elimination method or the L − R factorization methods.
Exercise 13. Using Krylov's method, determine the eigenvalues and the eigenvectors of the matrices
a) A = [[0, 2], [1, 1]],   b) B = [[−1, 2], [0, −3]],   c) C = [[−5, 0, 0], [2, 1, 0], [−1, 2, −2]],
d) D = [[1, −2, 0], [−1, 1, 1], [0, 0, −3]],   e) E = [[−3, 1, −3], [0, 3, 0], [1, −1, 2]].
When possible, find the inverse of each matrix.
Hint: a) λ1 = −1, λ2 = 2.
b) λ1 = −3, λ2 = −1.
c) λ1 = 1, λ2 = −2, λ3 = −5.
d) To determine the characteristic polynomial of D, if we take y(0) = (1, 0, 0)ᵀ or y(0) = (0, 1, 0)ᵀ, we are sent back from Step 3 to Step 1. Thus we take y(0) = (0, 0, 1)ᵀ. Then λ1,2 = 1 ± √2, λ3 = −3.
e) To determine the characteristic polynomial of E, if we take y(0) = (1, 0, 0)ᵀ, we are sent back from Step 3 to Step 1. Thus we take y(0) = (0, 1, 0)ᵀ. Then λ1 = 3, λ2,3 = (−1 ± √13)/2.
Chapter 5
Polynomial interpolation
There are situations that often appear in everyday life, as well as in engineering and science, when, by a certain procedure, like using measurement tools, we are aware of the values of a function (temperature, pressure, velocity, magnetism, electrical tension, electrical intensity, etc.) at certain points only. In this case we say that the data under consideration is only known at discrete points, without being known as a continuous function. However, starting from these discrete points where we know the values of a continuous function, we can approximate this function using a polynomial, see the theorem below.
Theorem 3. (Weierstrass Approximation Theorem). Assume that f ∈ C[a, b]. Then, for any
ε > 0, there exists a polynomial P such that
|f (x) − P (x)| < ε
for all x ∈ [a, b].
Why the interest in approximating a function on an interval? Because we may need to know (at least approximately) the values of the function at points other than the ones at which we already know them. This procedure is called interpolation.
Of course, we may approximate a function f using many types of functions other than polynomials. Why our interest in polynomial interpolation, then? Because the derivative or the integral of the function f may be needed, and one big advantage is that the derivative and the indefinite integral of a polynomial are quite easy to compute, and they are polynomials too. In addition, the roots of polynomial equations are not difficult to locate at all. This is why the approximation of functions via polynomials is heavily used in numerical methods.
We introduce a proper definition of the interpolation polynomial.
Definition 18. Let f : [a, b] → R, f ∈ C[a, b]. Assume that we have a set of n + 1 points a = x0 < x1 < ... < xn = b for which we know the corresponding values f0, f1, ..., fn−1, fn of f, meaning that f(xi) = fi for all i ∈ {0, 1, ..., n}. Then the points xi, i ∈ {0, 1, ..., n}, are called nodes and the k-degree polynomial P (with k ≤ n) that agrees with f at all nodes xi, i ∈ {0, 1, ..., n} (that is, P(xi) = fi for all i ∈ {0, 1, ..., n}), is called the interpolation polynomial.
Proposition 3. The interpolation polynomial P exists and it is unique.
Proof. We search for a polynomial P such that
P(x) = an xⁿ + an−1 xⁿ⁻¹ + ... + a1 x + a0
and
(5.1) P(xi) = fi for all i ∈ {0, 1, ..., n}.
Hence we have to determine a0, a1, ..., an ∈ R such that relations (5.1) are fulfilled, that is, we have to solve the following linear system:
(5.2) an x0ⁿ + an−1 x0ⁿ⁻¹ + ... + a1 x0 + a0 = f0,
      an x1ⁿ + an−1 x1ⁿ⁻¹ + ... + a1 x1 + a0 = f1,
      ...
      an xnⁿ + an−1 xnⁿ⁻¹ + ... + a1 xn + a0 = fn.
The matrix associated to this system is
A = [[x0ⁿ, x0ⁿ⁻¹, ..., x0, 1], [x1ⁿ, x1ⁿ⁻¹, ..., x1, 1], ..., [xnⁿ, xnⁿ⁻¹, ..., xn, 1]].
As xi ≠ xj for all i ≠ j, i, j ∈ {0, 1, ..., n}, we infer that det(A) ≠ 0 (it is called a Vandermonde determinant) and we deduce that system (5.2) admits a unique solution. This means that P exists and it is unique.
Remark 33. As a consequence of the uniqueness of the interpolation polynomial, the different names associated to this polynomial, such as the Lagrange interpolation polynomial or the Newton interpolation polynomial, refer to the same polynomial. What varies is the formula used to obtain it.
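For a small number of nodes, the system (5.2) from the proof above can also be solved directly on a computer; here is a minimal Python sketch (assuming NumPy, whose np.vander builds exactly the matrix A of the proof; the test data anticipates Example 19 of the next section):

```python
import numpy as np

# Nodes and values (the data of Example 19 in the next section).
x = np.array([-1.0, 0.0, 0.5, 2.0])
f = np.array([0.0, -2.0, -15/8, 12.0])

A = np.vander(x)           # rows (x_i^n, ..., x_i, 1): the matrix of system (5.2)
a = np.linalg.solve(A, f)  # coefficients (a_n, ..., a_1, a_0)
print(a)                   # [ 1.  2. -1. -2.]  ->  P(x) = x^3 + 2x^2 - x - 2
```

In floating-point arithmetic this direct approach becomes badly conditioned as the number of nodes grows, which is one practical reason to prefer the Lagrange form presented in the next section.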
Our interest in the interpolation polynomials lies in their capacity to approximate an unknown continuous function f which is given only by its values at some nodes. Another way to look at this is to estimate the error that appears when we approximate f.
Definition 19. The pointwise error between a function f and some polynomial approximation
to it, Pn , is defined by
Rn = f − Pn .
For the interpolation polynomials we can obtain a representation of this pointwise error Rn .
Theorem 4. Let f : [a, b] → R be a function in Cⁿ⁺¹([a, b]), and Pn(x) the interpolation polynomial for f(x) with respect to n + 1 points xi, i ∈ {0, 1, ..., n}, in the interval [a, b]. Then for each x ∈ [a, b] there exists a point ξ = ξ(x) in the open interval
min{x0, x1, ..., xn} < ξ < max{x0, x1, ..., xn},
such that
Rn(x) = (x − x0)(x − x1)...(x − xn)/(n + 1)! · f⁽ⁿ⁺¹⁾(ξ).
For a proof of this theorem see for example [7, Th.1,p.190].
Remark 34. When we try to approximate the value of f at a point x using an interpolation polynomial P, we have to check whether x ∈ [a, b] or not. If x ∉ [a, b], P(x) may not approximate f(x).
In what follows we present Lagrange’s method to arrive at the interpolation polynomial.
5.1  Lagrange form for the interpolation polynomials
Let us first approximate a continuous function f by two nodes, x0 and x1, on [x0, x1]. We denote the interpolation polynomial used in this case by L1, where L stands for "Lagrange" and the index 1 comes from x1. Then the degree of L1 is 1 or less than 1, thus the general form of L1 is
L1(x) = a1 x + a0, where a1, a0 ∈ R.
To determine L1 we need to determine a0 and a1. We know that
L1(x0) = f(x0) = f0, L1(x1) = f(x1) = f1.
This is equivalent to
a1 x0 + a0 = f0, a1 x1 + a0 = f1.
We take the difference of these two equations and we deduce that
a1 (x0 − x1) = f0 − f1 ⇔ a1 = (f0 − f1)/(x0 − x1).
We substitute a1 in the first equation and we get that
a0 = f0 − (f0 − f1)/(x0 − x1) · x0.
Then
L1(x) = a1 x + a0 ⇔ L1(x) = (f0 − f1)/(x0 − x1) · x + f0 − (f0 − f1)/(x0 − x1) · x0 ⇔
(5.3) L1(x) = (x − x1)/(x0 − x1) · f0 + (x − x0)/(x1 − x0) · f1,
and this is the Lagrange form of the interpolation polynomial determined by 2 nodes.
If instead of 2 nodes we know the values of f at 3 nodes, x0 < x1 < x2, then the Lagrange interpolation polynomial is of the general form
L2(x) = a2 x² + a1 x + a0,
with a2, a1, a0 ∈ R to be determined knowing that L2(x0) = f0, L2(x1) = f1 and L2(x2) = f2. By performing a calculation similar to the one above, we compute a0, a1, a2 and we arrive at
(5.4) L2(x) = (x − x1)(x − x2)/((x0 − x1)(x0 − x2)) · f0 + (x − x0)(x − x2)/((x1 − x0)(x1 − x2)) · f1 + (x − x0)(x − x1)/((x2 − x0)(x2 − x1)) · f2.
In the same manner we can obtain the formula for L3, L4, etc. In fact, when we have n + 1 nodes, x0 < x1 < ... < xn, we denote
(5.5) li(x) = [(x − x0)(x − x1)...(x − xi−1)(x − xi+1)...(x − xn)] / [(xi − x0)(xi − x1)...(xi − xi−1)(xi − xi+1)...(xi − xn)], for i ∈ {0, 1, ..., n},
and the Lagrange form of the interpolation polynomial is the following:
(5.6) Ln(x) = f0 l0(x) + f1 l1(x) + ... + fn ln(x) = Σ_{i=0}^{n} fi li(x).
One can see immediately that formula (5.6) is the generalization of formulas (5.3) and (5.4). Note that the more nodes we take, the better we can approximate the function, but, at the same time, the calculation becomes more complicated.
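Formula (5.6) translates almost verbatim into code; here is a minimal Python sketch (the function name lagrange is our own choice), which can be checked against Example 19 below.

```python
def lagrange(nodes, values, x):
    """Evaluate the Lagrange interpolation polynomial (5.6) at the point x."""
    total = 0.0
    for i, xi in enumerate(nodes):
        li = 1.0                       # fundamental polynomial l_i(x), formula (5.5)
        for j, xj in enumerate(nodes):
            if j != i:
                li *= (x - xj) / (xi - xj)
        total += values[i] * li
    return total

# The data of Example 19 below: f(-1) = 0, f(0) = -2, f(1/2) = -15/8, f(2) = 12.
print(lagrange([-1, 0, 0.5, 2], [0, -2, -15/8, 12], 1))   # L3(1) = 0.0 (up to rounding)
```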
Let us illustrate the use of this formula in the following solved exercises.
Example 19. Let f : R → R be an unknown continuous function such that f(−1) = 0, f(0) = −2, f(1/2) = −15/8 and f(2) = 12. Approximate the value of f in 1 using the Lagrange form of the interpolation polynomial. Can you also approximate f(3)?
Proof. We have that x0 = −1, x1 = 0, x2 = 1/2, x3 = 2, and f0 = 0, f1 = −2, f2 = −15/8, f3 = 12, hence we can approximate f on [−1, 2] using a Lagrange interpolation polynomial of degree 3 or less. We use the formula
(5.7) L3(x) = f0 l0(x) + f1 l1(x) + f2 l2(x) + f3 l3(x),
where the li are given by (5.5) for n = 3. Since f0 = 0 it is pointless to calculate l0. We determine l1:
l1(x) = [(x − x0)(x − x2)(x − x3)] / [(x1 − x0)(x1 − x2)(x1 − x3)] = [(x + 1)(x − 1/2)(x − 2)] / [(0 + 1)(0 − 1/2)(0 − 2)] = (x + 1)(x − 1/2)(x − 2).
Also,
l2(x) = [(x − x0)(x − x1)(x − x3)] / [(x2 − x0)(x2 − x1)(x2 − x3)] = [(x + 1)(x − 0)(x − 2)] / [(1/2 + 1)(1/2 − 0)(1/2 − 2)] = −(8/9) · x(x + 1)(x − 2).
Finally,
l3(x) = [(x − x0)(x − x1)(x − x2)] / [(x3 − x0)(x3 − x1)(x3 − x2)] = [(x + 1)(x − 0)(x − 1/2)] / [(2 + 1)(2 − 0)(2 − 1/2)] = (1/9) · x(x + 1)(x − 1/2).
By (5.7) we obtain that the polynomial
L3(x) = −2(x + 1)(x − 1/2)(x − 2) + (5/3) · x(x + 1)(x − 2) + (4/3) · x(x + 1)(x − 1/2) = x³ + 2x² − x − 2
approximates f on [−1, 2]. Consequently, since 1 ∈ [−1, 2],
f(1) ≈ L3(1) = 0.
On the other hand, 3 ∉ [−1, 2], thus we cannot say that L3(3) approximates the value of f(3).
Example 20. Let be the table
xi:  −1,  −2/3,  0,  2/3,  1
fi:   0,   1/2,  1,  1/2,  0
a) Determine the Lagrange interpolation polynomial which interpolates the above data;
b) Evaluate f(−1/2), f(1/3), f(0) and f(2).
Proof. We remark that the above data correspond to the function f(x) = cos(πx/2), x ∈ [−1, 1].
We have n = 4 and
x0 = −1, x1 = −2/3, x2 = 0, x3 = 2/3, x4 = 1,
f0 = 0, f1 = 1/2, f2 = 1, f3 = 1/2, f4 = 0.
The Lagrange interpolation polynomial is
L(x) = f0 l0(x) + f1 l1(x) + f2 l2(x) + f3 l3(x) + f4 l4(x), x ∈ [−1, 1],
(5.8) ⇔ L(x) = 0 · l0(x) + (1/2) · l1(x) + 1 · l2(x) + (1/2) · l3(x) + 0 · l4(x), x ∈ [−1, 1],
where the Lagrange fundamental polynomials lk(x), 0 ≤ k ≤ 4, are determined using the formula
lk(x) = Π_{i=0, i≠k}^{4} (x − xi)/(xk − xi), 0 ≤ k ≤ 4.
Since f0 = 0 and f4 = 0, it is not necessary to compute l0(x) and l4(x). In what follows we determine the Lagrange fundamental polynomials l1(x), l2(x) and l3(x). We have
l1(x) = [(x − x0)(x − x2)(x − x3)(x − x4)] / [(x1 − x0)(x1 − x2)(x1 − x3)(x1 − x4)] = [(x + 1) x (x − 2/3)(x − 1)] / [(−2/3 + 1)(−2/3 − 0)(−2/3 − 2/3)(−2/3 − 1)] = −(27/40) · x(x + 1)(x − 1)(3x − 2),
l2(x) = [(x − x0)(x − x1)(x − x3)(x − x4)] / [(x2 − x0)(x2 − x1)(x2 − x3)(x2 − x4)] = [(x + 1)(x + 2/3)(x − 2/3)(x − 1)] / [(0 + 1)(0 + 2/3)(0 − 2/3)(0 − 1)] = (1/4) · (x + 1)(3x + 2)(3x − 2)(x − 1),
l3(x) = [(x − x0)(x − x1)(x − x2)(x − x4)] / [(x3 − x0)(x3 − x1)(x3 − x2)(x3 − x4)] = [(x + 1)(x + 2/3) x (x − 1)] / [(2/3 + 1)(2/3 + 2/3)(2/3 − 0)(2/3 − 1)] = −(27/40) · x(x + 1)(3x + 2)(x − 1).
From relation (5.8) we get
L(x) = (1/2) · (−27/40) · x(x + 1)(x − 1)(3x − 2) + (1/4) · (x + 1)(3x + 2)(3x − 2)(x − 1) + (1/2) · (−27/40) · x(x + 1)(3x + 2)(x − 1) =
= (x² − 1) · ((9/40)x² − 1) = (9/40)x⁴ − (49/40)x² + 1.
b) From a) we have
f(−1/2) ≈ L(−1/2) = (9/40) · (−1/2)⁴ − (49/40) · (−1/2)² + 1 = 453/640 = 0.7078125,
f(1/3) ≈ L(1/3) = (9/40) · (1/3)⁴ − (49/40) · (1/3)² + 1 = 13/15 ≈ 0.866666,
f(0) = f(x2) = f2 = 1,
f(2) cannot be evaluated, since 2 ∉ [−1, 1].
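Since the data of Example 20 comes from f(x) = cos(πx/2), we can also check numerically how well L approximates f between the nodes (a small Python sketch; the sample points are our own choice):

```python
import math

L = lambda x: (9 / 40) * x**4 - (49 / 40) * x**2 + 1   # the polynomial of Example 20
for x in (-0.5, 1/3, 0.0):
    print(x, L(x), math.cos(math.pi * x / 2))
# -0.5: L = 0.7078125   vs cos(-pi/4) ~ 0.7071068
#  1/3: L ~ 0.8666667   vs cos(pi/6)  ~ 0.8660254
#  0.0: L = 1.0         vs cos(0)     = 1.0
```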
Bibliography
[1] C. de Boor, A Practical Guide to Splines, 2nd ed., Springer, New York, 2000.
[2] H. Brezis, Functional Analysis, Sobolev Spaces and Partial Differential Equations, Springer, New York, 2010.
[3] R. L. Burden and J. D. Faires, Numerical Analysis, Brooks Cole, 2004.
[4] P. G. Ciarlet, Introduction à l'Analyse Numérique et l'Optimisation, Masson, Paris, 1990.
[5] F. Chatelin, Spectral Approximation of Linear Operators, Academic Press, New York, 1983.
[6] B. Demidovich and I. Maron, Éléments de Calcul Numérique, Éditions Mir, Moscow, 1973.
[7] E. Isaacson and H. B. Keller, Analysis of Numerical Methods, New York, 1993.
[8] R. Militaru, Méthodes Numériques: Théorie et Applications, Editura Sitech, Craiova, 2008.
[9] M. Popa and R. Militaru, Metode numerice în pseudocod – aplicaţii, Editura Sitech, Craiova, 2012.