Computer Project: The Matrix Market and Sparse Matrices

Purpose: To learn how to get and use matrices from the Matrix Market
and the University of Florida Sparse Matrix Collection. Also to learn about
Matlab utilities for solving linear systems with sparse matrices and for displaying the matrices.
Prerequisite: Knowledge of using Gaussian elimination to solve Ax = b
(for example Sections 1.3 and 1.4 of Spence, Insel and Friedberg).
Optionally, the LU factorization (for example Section 2.5 in SIF).
Built-in Matlab functions used: size, nnz, spy, tic, toc, norm, lu, \, full, prod, subplot, condest and colmmd.
Files that need to be downloaded: mmread.m, loadmatrix.m (Matlab files), matrixmarket_project.doc
(this file), matrixmarket_worksheet_1.doc, matrixmarket_worksheet_2.doc and possibly 7za.exf. You
can find these at my web site www.math.sjsu.edu/~foster/math129a_projects.html
Background: The Matrix Market (http://math.nist.gov/MatrixMarket/ ) and the University of Florida
(UF) Sparse Matrix Collection ( http://www.cise.ufl.edu/research/sparse/matrices ) are marvelous
collections of matrices that come from real world applications that, in many cases, lead to large
matrices. Many of the matrices are bigger than 1000 by 1000 and indeed one matrix in the UF
collection is over 5,000,000 by 5,000,000. Even though the matrices are large, linear systems Ax = b
involving many of them can be solved on your home computer using Matlab. You will do so as part of this assignment.
A key reason that such large problems can be quickly solved is that most of the matrices on the Matrix
Market are sparse. A sparse matrix is a matrix with a high percentage of its elements equal to zero.
Such matrices show up frequently in practice, especially for large problems. A matrix often
characterizes interactions between various parts of a problem. In many settings one part of a problem
only directly interacts with other nearby parts. The lack of direct interaction with distant parts of the
problem leads to zeros in the matrix. For problems that are large there are many more distant parts of a
problem than nearby parts and the matrix will have many zero entries. Such problems lead to sparse
matrices.
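As a small illustration of how nearby-only interactions produce zeros, the following sketch builds the matrix for a hypothetical chain of n parts in which each part interacts only with itself and its two neighbors. The chain model and the values 2 and -1 are assumptions chosen for illustration; they are not taken from the Matrix Market.

```matlab
% Hypothetical example: a chain of n parts, each interacting only with
% itself and its two nearest neighbors (the values 2 and -1 are illustrative).
n = 1000;
e = ones(n, 1);
A = spdiags([-e 2*e -e], -1:1, n, n);    % sparse tridiagonal matrix

[m, ncols] = size(A);                    % number of rows and columns
nonzeros_in_A = nnz(A);                  % 3*n - 2 nonzero entries
percent_zero = 100 * (1 - nonzeros_in_A / (m * ncols));
fprintf('%d by %d, %d nonzeros, %.2f%% of the entries are zero\n', ...
        m, ncols, nonzeros_in_A, percent_zero);
```

Doubling n roughly doubles the number of nonzeros but quadruples the number of entries, so the percentage of zeros grows with the size of the problem.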
Matlab has tools that try to solve problems while maintaining as much of the zero structure as possible.
This can reduce the work required by a huge amount since the portion of a calculation that refers to a
zero portion of a matrix can often be skipped. (For example, 0 + x = x does not require an actual
addition.) In particular, when a matrix is identified as a “sparse” matrix, Matlab will use a
different algorithm to solve Ax = b than if the matrix is “full” (that is, not sparse: every entry is stored, zeros included). The different
algorithm will take advantage of the zeros in A, if possible. Fortunately the user of Matlab does not need
to use different notation to solve Ax = b: x = A \ b still works!
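The sketch below shows the same \ notation applied to one matrix stored in both ways. The tridiagonal test matrix is an assumption made for illustration (it is not a Matrix Market matrix), and the timings will vary from computer to computer.

```matlab
% Solve Ax = b with A stored as sparse and again with A stored as full.
% The test matrix is illustrative only.
n = 2000;
e = ones(n, 1);
A_sparse = spdiags([-e 4*e -e], -1:1, n, n);   % stored as a sparse matrix
A_full = full(A_sparse);                       % same entries, stored as full
b = A_sparse * ones(n, 1);

tic; x_sparse = A_sparse \ b; t_sparse = toc;  % Matlab picks a sparse solver
tic; x_full = A_full \ b;     t_full = toc;    % Matlab picks a full solver

fprintf('sparse solve: %g seconds, full solve: %g seconds\n', t_sparse, t_full);
fprintf('norm of the difference between the solutions: %g\n', ...
        norm(x_sparse - x_full));
```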
In addition to the \ operator, here are some of the other Matlab tools that you will use in this assignment:
• size(A) returns the number of rows and columns of a matrix
• nnz(A) returns the number of nonzero entries in a matrix
• spy(A) draws a picture of the location of the nonzero entries in the matrix A; the nonzeros are
drawn as dots and the zeros are left blank
• tic and toc can be used to determine how long it takes to do a calculation
• norm(v) calculates the length or norm of a vector v. This is the square root of the sum of the
squares of the components of v
• [L,U] = lu(A) determines L and U in an LU factorization of a matrix A. If A is stored as a sparse
matrix then the L and U determined will also be sparse
• full(A) will force Matlab to store a matrix as a full (that is, not sparse) matrix
• prod(v) multiplies all the entries in a vector v
• colmmd determines an order of the columns of a matrix that will tend to reduce “fill-in” (zero
entries that become nonzero during the factorization)
• subplot allows plotting several subplots in the same figure
• condest estimates the condition number of a matrix
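Several of these tools can be tried together on a small stand-in matrix, as in the sketch below. Two assumptions are worth flagging: the grid matrix is built here for illustration and is not a Matrix Market file, and colamd is used in place of colmmd because colmmd may not be available in recent Matlab releases. The three-output form of lu is used so that the row permutation P is returned explicitly.

```matlab
% Illustrative sparse matrix from a 20 by 20 grid (400 by 400).
% This stand-in is an assumption, not a downloaded matrix.
n = 20;
e = ones(n, 1);
T = spdiags([-e 2*e -e], -1:1, n, n);
A = kron(T, speye(n)) + kron(speye(n), T);

[L, U, P] = lu(A);               % P*A = L*U, original column order
q = colamd(A);                   % column ordering (stand-in for colmmd)
[Lq, Uq, Pq] = lu(A(:, q));      % Pq*A(:,q) = Lq*Uq, reordered columns

fprintf('nnz(A) = %d\n', nnz(A));
fprintf('nnz(L) + nnz(U) = %d (original column order)\n', nnz(L) + nnz(U));
fprintf('nnz(Lq) + nnz(Uq) = %d (reordered columns)\n', nnz(Lq) + nnz(Uq));

% Show the nonzero patterns side by side.
subplot(1, 3, 1); spy(A);       title('A');
subplot(1, 3, 2); spy(L + U);   title('L + U');
subplot(1, 3, 3); spy(Lq + Uq); title('L + U after reordering');
```

Comparing the two printed fill counts shows how much the column ordering changes the number of nonzeros in the factors for this particular matrix.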
To do this assignment you need to be able to (1) find matrices in the Matrix Market or the UF Sparse
Matrix Collection, (2) download files describing the matrices to your computer, (3) load the matrix
into Matlab and (4) use the Matlab commands listed above to work with the matrices. We describe
these steps in the remainder of this file. As additional help, in case you find it useful, my
web site has streaming videos that describe each of the above steps in greater detail and also associated
PowerPoint presentations.
To download files you should go to the Matrix Market site (http://math.nist.gov/MatrixMarket/ ) or to
the UF Sparse Matrix Collection (http://www.cise.ufl.edu/research/sparse/matrices ) and find the matrix
that you want. At the Matrix Market site right click on the file name (for example gr_30_30.mtx.gz) in
the “Download as Compressed Matrix Market file: (some file name).mtx.gz” section and save the file to
your computer by choosing “save link target as” or “save target as” in your web browser. At the UF
Sparse Matrix Collection right click on the “MM” entry in the download column and save the file to
your computer by choosing “save link target as” or “save target as.” You will need to select the folder
where you will save the file but do not change the name of the file. All files related to this project
should be saved in the same folder on your computer.
I want you to download and collect some information (described later) on at least two different matrices.
The matrices should be square and at least 900 by 900. However, if you have a relatively new
computer (less than a few years old) you should choose larger matrices (2000 by 2000 or more). In
addition:
• One of the matrices should be a matrix which includes a description that is at least a paragraph
long of the application where the matrix arises. The “NEP” collection in the Matrix Market
seems to have more detailed descriptions than some of the other collections.
• A second matrix can be any square matrix, 900 by 900 or larger, that seems of interest to you.
• At least one of your matrices (ideally all your matrices but this is not required) must lead to an
accurate solution, which we will define as a solution with norm(x - xtrue) / norm(xtrue) < 10^-6,
say. One key to choosing matrices that have accurate solutions is to choose a matrix whose
“condition number” is not too large. Calculations in Matlab start with errors of around 10^-16 and
if the matrix has a condition number equal to condA then it turns out that the value of
norm(x - xtrue) / norm(xtrue) should be about 10^-16 × condA or less. We won’t present the reasons
here. The Matrix Market (but not the UF collection) often lists the condition number of matrices
in the collection. Also, unless the matrix is quite large, one can calculate (approximately) the
condition number of a matrix using condest(A).
• Finally, as extra credit you can, if you wish, try to solve the largest system that is practical to
solve on your computer. I will let you decide what “practical to solve” means. This will depend
on how long you wish to wait for a solution and also on how much memory your computer has.
I expect that some of you will be able to solve 10000 by 10000 or even much larger systems on
the supercomputer (as the name was used 10 or 15 years ago) that is in your backpack or on your
desktop. The person(s) in the class who solves the largest system will receive double extra credit.
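The accuracy test in the third item can be checked along the following lines. This is a sketch under stated assumptions: the matrix is a stand-in built in the code, and xtrue is taken to be a vector of ones. With a downloaded matrix, A, xtrue and b would come from loadmatrix instead.

```matlab
% Stand-in 900 by 900 sparse system (an assumption for illustration).
n = 30;
e = ones(n, 1);
T = spdiags([-e 2*e -e], -1:1, n, n);
A = kron(T, speye(n)) + kron(speye(n), T);
xtrue = ones(size(A, 1), 1);             % assumed "true" solution
b = A * xtrue;

x = A \ b;                               % computed solution
relerr = norm(x - xtrue) / norm(xtrue);  % relative error in x
condA = condest(A);                      % estimated condition number
predicted = 1e-16 * condA;               % rough size suggested by the rule above

fprintf('relative error = %g\n', relerr);
fprintf('condest(A) = %g, 1e-16 * condA = %g\n', condA, predicted);
if relerr < 1e-6
    disp('The solution is accurate by the 10^-6 criterion.');
else
    disp('The solution is NOT accurate by the 10^-6 criterion.');
end
```

The rule gives only a rough order of magnitude, so the relative error and the predicted value should be compared by their exponents rather than digit by digit.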
From either site the files that you download will be in compressed format. If you know how to
decompress files using programs such as 7-zip, winzip, powerdesk, or winrar you can decompress the
file yourself. If you decompress the file yourself make sure that the decompressed file has a “.mtx”
extension (for example gr_30_30.mtx). However you do not need to decompress the files yourself -- it
is easier to decompress the file using the loadmatrix.m program. Loadmatrix is a Matlab program that
will automatically decompress the file (if necessary) and also convert the matrix to Matlab format. It is
designed to work with files downloaded from either the Matrix Market site or those downloaded from
the UF Sparse Matrix Collection site. To use the loadmatrix.m program you need to download three
files (loadmatrix.m, 7za.exf and mmread.m) from
www.math.sjsu.edu/~foster/math129a_projects.html. You should store these three files and also all the
files that you download from the Matrix Market and UF sites in the same folder on your computer. To
use the loadmatrix program you need to open up Matlab and move to the folder where you are storing all
your files for the project. In Matlab 5.3 one way to move to this folder is to select the “file” menu, the
“set path” submenu, click on the “browse” button, navigate to the desired folder and click on “ok.”
In Matlab 6 or 7 to move to the appropriate folder click on the button with three dots (called “browse for
folder”) next to “current directory:” in the toolbar menu, navigate to the appropriate folder and then
click “ok.” You can confirm that you are in the desired folder by typing “pwd” (print working
directory) from the Matlab prompt. You can confirm that all the desired files are in the current folder by
typing “dir” at the Matlab prompt.
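The folder check can also be done with a few commands, as in this sketch. The list of file names follows this handout; the variable names needed and missing are hypothetical and chosen only for illustration.

```matlab
% Check that the current folder holds the files that loadmatrix relies on.
disp(pwd);                                  % show the current folder
needed = {'loadmatrix.m', 'mmread.m', '7za.exf'};
missing = {};
for k = 1:numel(needed)
    % exist(...,'file') returns 2 when the name is a file
    if exist(fullfile(pwd, needed{k}), 'file') ~= 2
        missing{end + 1} = needed{k};       % remember files not found here
    end
end
if isempty(missing)
    disp('All required files are in the current folder.');
else
    fprintf('Missing from the current folder: %s\n', missing{:});
end
```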
Now to decompress the file and load the matrix in Matlab, after you get to the proper folder in Matlab,
you should type at the Matlab prompt “matrixname = 'the name of the matrix'” and then type
“loadmatrix”. For example if you downloaded the matrix named gr_30_30 then the name of the file
should be “gr_30_30.mtx.gz” (if you downloaded the file from the Matrix Market) or “gr_30_30.tar.gz”
(if you downloaded the file from the UF site) or “gr_30_30.mtx” (for example if you have
decompressed the file yourself). Then at the Matlab prompt type
>> matrixname = 'gr_30_30'
>> loadmatrix
Do not include any extensions in the matrixname variable. Loadmatrix will store the matrix as a
sparse matrix named A in Matlab. Also loadmatrix will construct a “true” solution xtrue and a
corresponding right hand side b = A*xtrue. This true solution can be used to test the accuracy of
Matlab’s solver x = A \ b. Of course, in the real world one would not know the true solution (why solve
the system if you did?) but we are using a true solution to test Matlab’s solver. Finally, you now have the
Matrix Market matrix available in Matlab!
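For readers curious about what happens behind the scenes, here is a rough sketch of loading a Matrix Market file by hand. It is not the code in loadmatrix.m. It assumes that mmread.m accepts a file name and returns the matrix as its first output, it uses Matlab's built-in gunzip rather than 7za, it handles only the “.mtx.gz” form, and it takes xtrue to be a vector of ones, which may differ from the choice made by loadmatrix. If the file is not present, a stand-in matrix is built so that the sketch still runs.

```matlab
% Sketch of loading a matrix by hand; loadmatrix does this for you.
matrixname = 'gr_30_30';
mtxfile = [matrixname '.mtx'];
gzfile = [mtxfile '.gz'];

if exist(mtxfile, 'file') ~= 2 && exist(gzfile, 'file') == 2
    gunzip(gzfile);                  % decompress gr_30_30.mtx.gz
end

if exist(mtxfile, 'file') == 2 && exist('mmread', 'file') == 2
    A = mmread(mtxfile);             % read the Matrix Market file
else
    % Fallback so the sketch runs without the download: a stand-in
    % 900 by 900 sparse matrix (NOT the real gr_30_30).
    e = ones(900, 1);
    A = spdiags([-e 4*e -e], -1:1, 900, 900);
end

xtrue = ones(size(A, 1), 1);         % assumed choice of "true" solution
b = A * xtrue;                       % right hand side b = A*xtrue
fprintf('A is %d by %d with %d nonzeros\n', size(A, 1), size(A, 2), nnz(A));
```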
Now you are ready to begin the mathematical part of the assignment. To do this fill out
matrixmarket_worksheet_1 for one of the matrices that you selected and matrixmarket_worksheet_2 for
the second. Worksheet_2 assumes that you have studied the LU factorization. If your class has not
done this you can fill out worksheet_1 for both examples. If you do the extra credit fill out another copy
of worksheet_1. The worksheet contains instructions and the questions that need to be answered.
In working through the worksheets you may get pictures like the ones shown below.
Every matrix will produce a different picture and so yours won’t be exactly like these.