GEO4060 project assignment: A simple library for sparse matrices



This note describes a project assignment for the course GEO4060. Although the project is inspired by needs in mathematical computations, and the text contains some (simple) mathematics, there is no need for a thorough understanding of concepts like matrices and vectors to do the project.

The tasks to be implemented are defined in terms of simple summation expressions and array traversals. The challenges lie in manipulating a data structure that represents the nonzero elements of a potentially very large two-dimensional array.

1 Introduction to linear systems

Solving systems of linear equations is a fundamental computational task in almost all branches of science and technology. To give a very simple example of what a linear system is, let us imagine an oil reservoir that only contains oil and water. Suppose $x_1$ denotes the percentage of volume occupied by oil, and $x_2$ denotes the volume percentage of water. Since there only exist oil and water in the reservoir, we know that $x_1 + x_2 = 1$, which by itself is not sufficient for determining the exact values of $x_1$ and $x_2$. However, if we also know that the volume occupied by oil is three times that of water, i.e., $x_1 = 3x_2$, the following system of two equations can help us to find the values of $x_1$ and $x_2$:

$$x_1 + x_2 = 1, \qquad x_1 - 3x_2 = 0. \eqno{(1)}$$

Of course, we need no heavy mathematical machinery to find $x_1 = 0.75$, $x_2 = 0.25$ as the solution to the above system. The situation will, however, be highly complicated if a large number of unknowns ($x_1, x_2, \ldots, x_n$) constitute a similar system. Such large systems of linear equations arise in countless applications of science and technology, such as oil reservoir simulation, weather prediction, ocean wave modeling, and simulating the electrical activity in the human heart, to mention just a few.

In everyday words, we can say that a system of linear equations describes how a set of variables ($x_1, x_2, \ldots, x_n$) interact with each other. In the mathematical language, it can be expressed as

$$Ax = b, \eqno{(2)}$$

where $A$ is an $n \times n$ matrix ($n$ rows and $n$ columns):

$$A = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{pmatrix}, \eqno{(3)}$$

while $x = (x_1, x_2, \ldots, x_n)$ is a vector containing the unknown variables and $b = (b_1, b_2, \ldots, b_n)$ is a given vector.

Equation (2) is written in a compact form. The left-hand side is a matrix-vector product $Ax$, resulting in a vector, which is supposed to be equal to the other vector $b$ on the right-hand side. To "program" the equation, we need to express the matrix-vector product and the equality in terms of the individual elements in $A$, $x$, and $b$. Rules from mathematics (linear algebra) give the following alternative form of (2):

$$\sum_{j=1}^{n} a_{i,j} x_j = b_i \eqno{(4)}$$

for $i = 1, \ldots, n$. In our simple example, where $n = 2$, the coefficients are

$$a_{1,1} = 1, \quad a_{1,2} = 1, \quad a_{2,1} = 1, \quad a_{2,2} = -3, \quad b_1 = 1, \quad b_2 = 0. \eqno{(5)}$$

Written out, (4) then reads $x_1 + x_2 = 1$ for $i = 1$ and $x_1 - 3x_2 = 0$ for $i = 2$, which is exactly the system (1).
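Though not part of the assignment itself, the sum in (4) translates directly into two nested loops. Below is a minimal sketch for a dense matrix; the routine name dense_matvec and its argument list are illustrative only:

SUBROUTINE dense_matvec(n, A, x, b)
  !// Compute b = A*x for a dense n-by-n matrix, i.e., formula (4)
  IMPLICIT NONE
  INTEGER, INTENT(IN)       :: n
  REAL(KIND=8), INTENT(IN)  :: A(n,n), x(n)
  REAL(KIND=8), INTENT(OUT) :: b(n)
  INTEGER :: i, j

  DO i = 1, n
    b(i) = 0.0D0
    DO j = 1, n
      b(i) = b(i) + A(i,j)*x(j)   !// accumulate sum_j a(i,j)*x(j)
    END DO
  END DO
END SUBROUTINE dense_matvec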

2 Simple iterative methods

To solve $Ax = b$ on a computer (when it becomes an impossible task for paper and pencil), scientists have devised many solution methods; among them, iterative methods are the most suitable choices when $n$ is large.

Roughly speaking, an iterative method for solving $Ax = b$ starts with an initial guess $x^0 = (x^0_1, x^0_2, \ldots, x^0_n)$ and tries to improve the solution in an iterative fashion, i.e., $x^1, x^2, \ldots, x^k$. The hope is that $x^k$ will be close enough to the true solution $x$. (Exactly how many iterations are needed depends on the properties of $A$ and the chosen iterative method.)

Among all the iterative methods, the so-called simple iterative methods have the simplest mathematical formulation, which can be translated into straightforward loops in a computer language. These methods will therefore be the subject for implementation in this project assignment.

2.1 Jacobi iterations

The first simple iterative method uses Jacobi iterations, where the $k$th iterate $x^k = (x^k_1, x^k_2, \ldots, x^k_n)$ is found by the following algorithm, on the basis of the preceding iterate $x^{k-1} = (x^{k-1}_1, x^{k-1}_2, \ldots, x^{k-1}_n)$:

$$x^k_i = \frac{1}{a_{i,i}} \Bigl( b_i - \sum_{j=1}^{i-1} a_{i,j} x^{k-1}_j - \sum_{j=i+1}^{n} a_{i,j} x^{k-1}_j \Bigr) = \frac{1}{a_{i,i}} \Bigl( b_i - a_{i,1} x^{k-1}_1 - \cdots - a_{i,i-1} x^{k-1}_{i-1} - a_{i,i+1} x^{k-1}_{i+1} - \cdots - a_{i,n} x^{k-1}_n \Bigr) \eqno{(6)}$$

for $i = 1, \ldots, n$.

A word of assurance here. It is beyond the scope of this project to understand how one derives the formula (6). It suffices to say that implementing (6) requires no deep mathematical understanding. The above formula simply loops over the entries of $A$, $b$, and $x^{k-1}$, and carries out basic arithmetic operations such as multiplication, subtraction, and division.
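As a hint of how (6) maps onto code, here is a minimal sketch of a single Jacobi sweep for a dense matrix, where xold plays the role of $x^{k-1}$ and xnew of $x^k$; the names are illustrative and not the required interface:

SUBROUTINE jacobi_sweep(n, A, b, xold, xnew)
  !// One Jacobi sweep, formula (6), for a dense matrix
  IMPLICIT NONE
  INTEGER, INTENT(IN)       :: n
  REAL(KIND=8), INTENT(IN)  :: A(n,n), b(n), xold(n)
  REAL(KIND=8), INTENT(OUT) :: xnew(n)
  INTEGER :: i, j
  REAL(KIND=8) :: s

  DO i = 1, n
    s = b(i)
    DO j = 1, n
      IF (j /= i) s = s - A(i,j)*xold(j)   !// subtract a(i,j)*x_j^{k-1}, j /= i
    END DO
    xnew(i) = s/A(i,i)                     !// divide by the diagonal entry
  END DO
END SUBROUTINE jacobi_sweep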

2.2 Gauss-Seidel iterations

A modified version of the Jacobi iterations consists of using new values on the right-hand side as soon as they are computed:

$$x^k_i = \frac{1}{a_{i,i}} \Bigl( b_i - \sum_{j=1}^{i-1} a_{i,j} x^k_j - \sum_{j=i+1}^{n} a_{i,j} x^{k-1}_j \Bigr) \eqno{(7)}$$

for $i = 1, \ldots, n$. This simple iterative method is called Gauss-Seidel iterations. Note that the only difference between Gauss-Seidel iterations and Jacobi iterations is that $x^k_1, \ldots, x^k_{i-1}$ appear in (7), instead of $x^{k-1}_1, \ldots, x^{k-1}_{i-1}$ that appear in (6). This means that once $x^k_i$ is found, it immediately participates in updating $x^k_{i+1}$, $x^k_{i+2}$, and so on. This also means that only one solution array is needed: $x^k_i$ can overwrite $x^{k-1}_i$.
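The single, overwritten solution array can be sketched as follows for a dense matrix (again with illustrative names); since x is updated in place, x(j) already holds the new value $x^k_j$ for $j < i$ when row $i$ is processed:

SUBROUTINE gauss_seidel_sweep(n, A, b, x)
  !// One Gauss-Seidel sweep, formula (7): x is overwritten in place
  IMPLICIT NONE
  INTEGER, INTENT(IN)         :: n
  REAL(KIND=8), INTENT(IN)    :: A(n,n), b(n)
  REAL(KIND=8), INTENT(INOUT) :: x(n)
  INTEGER :: i, j
  REAL(KIND=8) :: s

  DO i = 1, n
    s = b(i)
    DO j = 1, n
      IF (j /= i) s = s - A(i,j)*x(j)   !// new values for j < i, old for j > i
    END DO
    x(i) = s/A(i,i)
  END DO
END SUBROUTINE gauss_seidel_sweep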


2.3 Successive over-relaxation

To inject more flexibility into the above Gauss-Seidel iterations, an idea is to use

$$x^k = \omega\, x^k_{\text{Gauss-Seidel}} + (1 - \omega)\, x^{k-1} \eqno{(8)}$$

as the final result of iteration number $k$. Here, $\omega$ is a scalar constant which in a mathematical sense "relaxes" the $k$th iterate of the Gauss-Seidel method, giving the method its name: successive over-relaxation (SOR). Note that the Gauss-Seidel method is a special case of SOR with $\omega = 1$.

Once again we remark that the above three simple iterative methods have the same feature with regard to implementation, because they simply involve basic arithmetic operations and loops over the entries of $A$, $b$, and $x^{k-1}$ (and $x^k$).
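A sketch of one SOR sweep for a dense matrix, under the same illustrative conventions as above; note that the update reduces to Gauss-Seidel for omega = 1:

SUBROUTINE sor_sweep(n, A, b, x, omega)
  !// One SOR sweep: compute the Gauss-Seidel value for row i and
  !// blend it with the previous iterate as in formula (8)
  IMPLICIT NONE
  INTEGER, INTENT(IN)         :: n
  REAL(KIND=8), INTENT(IN)    :: A(n,n), b(n), omega
  REAL(KIND=8), INTENT(INOUT) :: x(n)
  INTEGER :: i, j
  REAL(KIND=8) :: s

  DO i = 1, n
    s = b(i)
    DO j = 1, n
      IF (j /= i) s = s - A(i,j)*x(j)
    END DO
    x(i) = omega*(s/A(i,i)) + (1.0D0 - omega)*x(i)   !// formula (8)
  END DO
END SUBROUTINE sor_sweep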

3 Sparse matrices

In many applications where one needs to solve $Ax = b$, the matrix $A$ has only a small fraction of nonzero entries. That is, most $a_{i,j}$ entries are known to be zero in advance. (Very often this stems from the fact that when setting up equation no. $i$, only a few unknowns are explicitly coupled to $x_i$ through that equation.) The common situation is that the number of nonzero entries in each row of $A$ is roughly a fixed number, independent of the size $n$ of the matrix.

To give an idea about the percentage of nonzero entries in A , let us suppose that the average number of nonzero entries per row is 20 (which, by the way, is a quite accurate estimate for many complicated applications).

Consequently, the number of nonzero entries in $A$ is $20n$, while the total number of entries in $A$ is $n^2$. This means that the percentage of nonzero entries in $A$ is

$$\frac{20n}{n^2} = \frac{20}{n}. \eqno{(9)}$$

Therefore, the more rows of A , the smaller the percentage of its nonzero entries. Matrices having only a small percentage of nonzero entries are commonly called sparse matrices .

Although the mathematical formulations of the simple iterative methods remain the same no matter how many entries of $A$ are zero (as long as we do not divide by zero, i.e., $a_{i,i} \neq 0$ for $i = 1, \ldots, n$), a computer implementation may suffer in two respects due to the zero entries. First, multiplications with the zero entries of $A$ in (6)-(8) are a waste of time, and should thus be avoided. Second, storing the unnecessary zero entries of $A$ is a huge waste of memory. Nowadays, applications in science and technology easily involve $n \geq 10^6$ unknown variables, requiring a matrix $A$ with $n \geq 10^6$ rows and $n \geq 10^6$ columns. For $n = 10^6$, storing all the entries of $A$ as double-precision floating-point numbers requires $10^6 \times 10^6 \times 8$ bytes, approximately 8000 GB of memory, which is impossible to achieve on any regular computer. On the other hand, storing only the nonzero entries requires $10^6 \times 20 \times 8$ bytes, about 160 MB of memory (which is affordable on any modern computer).
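Spelled out, the two storage estimates are

$$10^6 \times 10^6 \times 8 \text{ bytes} = 8 \times 10^{12} \text{ bytes} \approx 8000 \text{ GB}, \qquad 10^6 \times 20 \times 8 \text{ bytes} = 1.6 \times 10^8 \text{ bytes} \approx 160 \text{ MB},$$

a reduction by a factor of $n/20 = 50\,000$.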

The lesson learned here is that a computer implementation of the simple iterative methods for sparse matrices should only store the nonzero entries.

To this end, a popular data storage scheme is the so-called compact row storage, which extracts all the nonzero entries of $A$, row by row, and packs them into a one-dimensional array A of floating-point numbers. In addition, a one-dimensional integer array jcol , of the same length as A , records the column number of each corresponding nonzero entry. To be able to map the entries in A back to their corresponding rows in $A$, another one-dimensional integer array irow is used. The length of irow is $n+1$, where the value of irow(1) is 1 and the value of irow(n+1) is nnz+1 , where nnz denotes the total number of nonzero entries in $A$. Moreover, the value of irow(i) is the position in A storing the first nonzero entry on row $i$ of $A$. (The last nonzero entry on row $i$ resides at position irow(i+1)-1 of A .)

A simple example of compact row storage can be shown for the following sparse matrix:

$$A = \begin{pmatrix} a_{1,1} & 0 & 0 & a_{1,4} & 0 \\ 0 & a_{2,2} & a_{2,3} & 0 & a_{2,5} \\ 0 & a_{3,2} & a_{3,3} & 0 & 0 \\ a_{4,1} & 0 & 0 & a_{4,4} & a_{4,5} \\ 0 & a_{5,2} & 0 & a_{5,4} & a_{5,5} \end{pmatrix}, \eqno{(10)}$$

for which the arrays A , irow , and jcol are

A    = ( a_{1,1}, a_{1,4}, a_{2,2}, a_{2,3}, a_{2,5}, a_{3,2}, a_{3,3}, a_{4,1}, a_{4,4}, a_{4,5}, a_{5,2}, a_{5,4}, a_{5,5} ),
irow = ( 1, 3, 6, 8, 11, 14 ),
jcol = ( 1, 4, 2, 3, 5, 2, 3, 1, 4, 5, 2, 4, 5 ).
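To illustrate how the three arrays are traversed, here is a minimal sketch of a matrix-vector product in compact row storage; the routine name sparse_matvec and its argument list are illustrative only, since in the project the arrays will live inside the matrix object:

SUBROUTINE sparse_matvec(n, nnz, A, irow, jcol, x, y)
  !// Compute y = A*x in compact row storage: the nonzero entries of
  !// row i occupy positions irow(i) .. irow(i+1)-1 of A, and jcol(k)
  !// gives the column number of the entry stored in A(k)
  IMPLICIT NONE
  INTEGER, INTENT(IN)       :: n, nnz
  INTEGER, INTENT(IN)       :: irow(n+1), jcol(nnz)
  REAL(KIND=8), INTENT(IN)  :: A(nnz), x(n)
  REAL(KIND=8), INTENT(OUT) :: y(n)
  INTEGER :: i, k

  DO i = 1, n
    y(i) = 0.0D0
    DO k = irow(i), irow(i+1) - 1
      y(i) = y(i) + A(k)*x(jcol(k))
    END DO
  END DO
END SUBROUTINE sparse_matvec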

4 The project assignment

The purpose of the project assignment is to create an object containing a data structure for sparse matrices, two solver routines for the Jacobi and SOR iteration methods, a routine for reading the sparse-matrix A-vector and the b-vector from files in ASCII format, and a routine for writing the resulting x-vector to a file. Then a main program with a sample application should be implemented. This application must test that all functionality of the object works as intended. A makefile or a simple Unix shell script for compiling and linking the application must also be provided. The project delivery shall contain a written report explaining how the problem was solved, together with the complete source code.

4.1 Fortran 2003 implementation

In Fortran 2003 it is natural to represent sparse matrices by an object MatSparse :

MODULE MatSparse
  IMPLICIT NONE

  TYPE matrix
    REAL(KIND=8), POINTER :: A(:)
    REAL(KIND=8), POINTER :: b(:)
    INTEGER, POINTER      :: irow(:)
    INTEGER, POINTER      :: jcol(:)
    INTEGER               :: n
    INTEGER               :: nnz
    CHARACTER(LEN=80)     :: Avector
    CHARACTER(LEN=80)     :: bvector
    !// More variables here for the x-vectors and other
    !// variables you need
  CONTAINS
    PROCEDURE :: scan => scanfromfile
    PROCEDURE :: dump => dumptofile
    PROCEDURE :: Jacobi1_it => Jacobi1iteration
    PROCEDURE :: SOR1_it => SOR1iteration
    PROCEDURE :: Jacobi => Jacobi_it
    PROCEDURE :: SOR => SOR_it
  END TYPE matrix

  !// More global variables here

CONTAINS

  SUBROUTINE scanfromfile(this, binary)
    !// Read A and b vectors from files. Note that each type-bound
    !// procedure receives the object itself as its first argument.
    IMPLICIT NONE
    CLASS(matrix), INTENT(INOUT) :: this
    LOGICAL, INTENT(IN)          :: binary
  END SUBROUTINE scanfromfile

  !// More subroutine declarations here

END MODULE MatSparse

The subroutines should perform the following actions:

• scanfromfile : read the A-vector and b-vector from the two files. The files are in ASCII format. The layout of the A-vector file can be as follows:

value of n: 312
value of nnz: 1874
a(1,1)=xxx
a(1,4)=xxx
a(2,2)=xxx
a(2,3)=xxx
a(2,5)=xxx
...

and the layout for the b-vector can be like this:

value of n: 312
b(1) = -0.000741122
b(2) = -0.00131356

• dumptofile : write the x-vector to a file. The data layout in the file must of course match what is explained for scanfromfile above.

• Jacobi1iteration : perform a single iteration with the Jacobi method.

• SOR1iteration : as Jacobi1iteration , but a single iteration with the SOR method.

• Jacobi : solution of a linear system with Jacobi's method. The iteration loop is stopped if the difference between xnew and xold is less than tol , or if the number of iterations exceeds maxit . The number of iterations is returned. The difference between xnew and xold can be computed by the formula

$$\sqrt{\sum_{i=1}^{n} \bigl( x^k_i - x^{k-1}_i \bigr)^2},$$

where $x^{k-1}$ is the previous iterate (see the sketch after this list).

• SOR : as Jacobi , but for the SOR method.
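A minimal sketch of this stopping test, assuming a helper function named diff_norm (the name is illustrative, not part of the required interface):

FUNCTION diff_norm(n, xnew, xold) RESULT(d)
  !// The 2-norm of the difference between two consecutive iterates,
  !// sqrt(sum_i (xnew(i)-xold(i))**2), used in the stopping criterion
  IMPLICIT NONE
  INTEGER, INTENT(IN)      :: n
  REAL(KIND=8), INTENT(IN) :: xnew(n), xold(n)
  REAL(KIND=8) :: d
  INTEGER :: i

  d = 0.0D0
  DO i = 1, n
    d = d + (xnew(i) - xold(i))**2
  END DO
  d = SQRT(d)
END FUNCTION diff_norm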

A main program main.f90 should define an object of the matrix type and retrieve the filenames either through a user dialog or from command-line arguments. Then call the object's scan subroutine to get the number of entries in the A-vector and the b-vector, allocate space for the vectors, and load the A-vector and b-vector from the files. Thereafter, a start vector ( x-vector ) for the iterations is defined and initialized, and the Jacobi iteration is called. The tol and maxit parameters can be given on the command line. Compare xnew computed by Jacobi with the exact solution x (use the "difference" formula for two vectors as given above). Repeat the computational steps for the SOR iteration. The omega and maxit parameters can be given on the command line.

The Jacobi and SOR methods have the signatures

SUBROUTINE Jacobi_it(this, numit)
SUBROUTINE SOR_it(this, numit)

where this is the passed matrix object and numit returns the number of iterations used.
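A possible skeleton for main.f90 , assuming the MatSparse module above; the type-bound calls follow the module, while the command-line handling and the omitted steps are only a sketch:

PROGRAM main
  USE MatSparse
  IMPLICIT NONE
  TYPE(matrix)      :: M
  CHARACTER(LEN=80) :: Afile, bfile
  INTEGER           :: numit

  !// Retrieve the two filenames from the command line (Fortran 2003)
  CALL GET_COMMAND_ARGUMENT(1, Afile)
  CALL GET_COMMAND_ARGUMENT(2, bfile)
  M%Avector = Afile
  M%bvector = bfile

  CALL M%scan(.FALSE.)   !// read the A-vector and b-vector (ASCII files)
  !// ... allocate and initialize the start x-vector here ...
  CALL M%Jacobi(numit)   !// solve with Jacobi; numit = iterations used
  !// ... compare xnew with the exact solution, then repeat with SOR ...
  CALL M%SOR(numit)
  !// ... write the resulting x-vector to file with M%dump ...
END PROGRAM main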

The source code goes into the src directory and the object code into the obj directory. Use a Makefile to compile the source code and link it into an executable, which you use to test with the accompanying input files for the A-vector and b-vector. The clean.sh file is a shell script for traversing the whole directory tree and removing all binary files (which can easily be regenerated). The relevant Unix command for clean.sh is

find . \( -name '*.o' -o -name '*.a' -o -name '*.so' \) \
  -print -exec rm -rf {} \;

Remember to run clean.sh before you pack the tar file and transfer it to the Fronter delivery space.

5 The final time of delivery

The final time of delivery is Friday 23.05.2014 at 23:59.

