Leslie Matrix Manipulation in C with MPI By Jesse A. Hanley In association with “Time after Time: Age-Structured Models” By Angela B. Shiflet and George W. Shiflet Introduction This document includes the specifications on how the Leslie Matrix module runs, as well as the requirements needed to operate the module. The code is written in C with MPI. Function Overview In this section, we will go through the various functions within the program, describing each. void print_usage(); This function will print information regarding the proper usage of the program, including information about command line arguments. The following is the output: Command Line Arguments: -o <filename> : -i <filename> : -g <#> : -d <filename> : -h : void Supply a file for output to be displayed Supply a file for input matrix to be read Supply the number of generations to be run Supply a file for the age distribution Get help about this program, including input format malloc_double_list( double **ptrList, int iEntries ); This function, when given a pointer to a list of doubles and a number of entries, allocates the list in memory with the given number of entries. void malloc_int_list( int **ptrList, int iEntries ); This function, when given a pointer to a list of integers and a number of entries, allocates the list in memory with the given number of entries. void free_double_list( double **ptrList ); This function frees a list of doubles from memory. void free_int_list( int **ptrList ); This function frees a list of integers from memory. void append( int **ptrList, int *iSize, int iItem ); This function takes a pointer to a list of integers, along with the size of the list and a specified integer as arguments. A new list is allocated in memory, one item longer. Then, the values are copied over, and the specified integer is put in the last place of the new array. The original list is freed from memory, and the pointers are swapped so that ptrList now points to the new array. The value of iSize is also incremented. int count_entries( FILE *ptrFile ); This function returns the number of doubles in a given file. Leslie Matrix Manipulation in C with MPI void 2 fill_value_array( int iEntries, double **ptrValueArray, double **ptrInputArray ); This function takes the values pointed to by ptrInputArray and manipulates them such that the values pointed to by ptrValueArray will hold the values of the corresponding Leslie matrix. The input file only stores the Leslie matrix values that are potentially nonzero. For instance, if the input file holds the values 0, 5, 4, 0.15, and 0.5 and consequently ptrInputArray points to an array with the values 0, 5, 4, 0.15, and 0.5, after execution of the function ptrValueArray would point to an array containing 0, 5, 4, 0.15, 0, 0, 0, 0.5, 0. void read_values( FILE *ptrFile, int iEntries, double **ptrValueArray ); Given a file pointer and a pointer to an allocated double list, this function reads doubles from the file and inserts them into the list. This function acts as a helper function for load_values(…). void load_values( double **ptrInputArray, int *ptrSize, int *ptrWidth, char *sFileName ); This function attempts to open the file with the given sFileName, then it allocates two lists: one from ptrInputArray and a private list called ptrValueArray. The function calls other functions to find the number of values in the input file, to read the values into the internal array, and then to create the proper Leslie matrix form from these values and store them in ptrInputArray. Finally, load_values closes the input file. void parse_arguments( int argc, char **argv, char **sFileInput, char **sFileDist, char **sFileOutput, int *iGenerations ); This function takes the standard argc and argv from the main(…) function call and parses them with the getopt function from <getopt.h>. For the 3 character array values, the function allocates room for each in memory and then uses strcpy to copy the optarg into the appropriate space. For iGenerations, the character is converted to an integer and stored in the appropriate memory spot. void query_distribution( double **ptrInputArray, int iArrayWidth, char *sFileDist ); This function attempts to open the file with name sFileDist; then it allocates space in memory for ptrInputArray and loads the values in the file into the list. void mat_product_vector( double **ptrArray, double **ptrVector, double **ptrResults, int iWidth ); This function performs the basic matrix operation of matrix-vector multiplication. The results are stored in ptrResults. void mat_product_mat( double **ptrArray1, double **ptrArray2, double **ptrResults, int iWidth, int iRank, int iSize ); This function performs the basic matrix operation of matrix-matrix multiplication. The results are stored in ptrResults. The majority of the parallelization takes place here, where the technique of domain decomposition allows for each process to manipulate the matrix and then the function collectively bring the results together. void array_to_power( double **inputArray, double **ptrResultArray, int iWidth, int iPower, int iRank, int iSize ); This function handles the mathematics involved to raise inputArray to the iPower, and storing the values in ptrResultArray. Since the math involved with raising a matrix to a power is simply multiplying the matrix repeatedly, only one reference to inputArray is needed. Leslie Matrix Manipulation in C with MPI void 3 print_results( char *ptrCharTitle, double **ptrValues, int iWidth, int iGenerations, char *sInputFile, char *sDistFile, char *sOutputFile, double dEigenValue ); Given the appropriate parameters, this function handles the output of the file: opening it, and printing the results in an appealing fashion. void list_copy( double **ptrList1, double **ptrList2, int iWidth ); This function takes pointers to two equally sized lists, and copies the contents of ptrList1 into ptrList2. Command Line Arguments and Input/Output This program utilizes command line arguments to help reduce the input/output performance drag. There are 5 different command line arguments, 4 of which take arguments. The arguments -i, -o, and -d are expected to include a filename afterwards. They represent the input file for population growth information, the output file, and the file containing the distribution of the target animal populations, respectively. The -g option also expects an argument, which we use to set the number of generations the simulation will run. The final option is -h, which will print some useful information about the file to standard error. This option will override other options and will make the program exit. In summary, there are 5 command line options: -i, -o, -d, -g, and -h. The first three expect filenames as arguments, -g expects an integer that will act as the number of generations, and -h will return information about the program. Notes about input files: In order to save space and to minimize the amount of time in input/output and the disk operations involved in such, the input files are assumed to follow the following format: The file referenced as the argument for the -i option should only include the important values of 0 5 4 the initial population. For example: to represent the Leslie matrix 0 1.5 0 , the input file simply 0 0 0.5 contains: 0 5 4 0.15 0.5. Also, the file referenced as the argument for the -d contains the values associated with the population distribution, which in this case could be 3000, 440, and 350. The file just needs to have these three values separated by spaces. Parallelization in Matrix Operations To best parallelize a program, efforts must be made to find the part of the program that does the most computation. Then, we can suitable apply parallelism to the problem. Often, by far the most computationally heavy section code is in raising the matrix of values to a power. In the current model, the problem becomes a great candidate for domain decomposition. Each process performs the set of matrix multiplication operations over a distinct set of numbers. After Leslie Matrix Manipulation in C with MPI these are done, an MPI_Allreduce is called, which sums all the values in ptrResults list for each process. This process is repeated until the appropriate number of generations has run. Communication between processes is often extremely time consuming relatively. This program overcomes some of that problem by packing several integers together and making one communication. Although in a small example, the MPI_Allreduce is a heavy operation compared to the matrix multiplication, with a much larger initial Leslie matrix and distribution set. However, the communication can overcome the time penalty, and the parallel solution can actually show speedup and offer better performance. 4