Data Structure
Sang Yong Han
http://ec.cse.cau.ac.kr/
Chung-Ang University, Spring 2011

Arrays

Array: a set of pairs (index and value)
  data structure: for each index, there is a value associated with that index.
  representation (possible): implemented by using consecutive memory.

The Array ADT

Objects: A set of pairs <index, value> where for each value of index there is a value from the set item. Index is a finite ordered set of one or more dimensions, for example, {0, ..., n-1} for one dimension, {(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)} for two dimensions, etc.

Methods: for all A ∈ Array, i ∈ index, x ∈ item, j, size ∈ integer

  Array Create(j, list) ::= return an array of j dimensions where list is a j-tuple whose kth element is the size of the kth dimension. Items are undefined.
  Item Retrieve(A, i) ::= if (i ∈ index) return the item associated with index value i in array A
                          else return error
  Array Store(A, i, x) ::= if (i ∈ index) return an array that is identical to array A except the new pair <i, x> has been inserted
                           else return error

Arrays in C

  int list[5], *plist[5];

  list[5]: five integers list[0], list[1], list[2], list[3], list[4]
  *plist[5]: five pointers to integers plist[0], plist[1], plist[2], plist[3], plist[4]

Implementation of 1-D array:

  Variable   Memory address
  list[0]    base address = α
  list[1]    α + sizeof(int)
  list[2]    α + 2*sizeof(int)
  list[3]    α + 3*sizeof(int)
  list[4]    α + 4*sizeof(int)

Arrays in C (cont'd)

Compare int *list1 and int list2[5] in C.
  Same: both list1 and list2 can be used as pointers to int (in most expressions list2 is converted to a pointer to list2[0]).
  Difference: list2 reserves storage for five ints; list1 reserves none.
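The address table for list[0] to list[4] can be checked directly in C. The sketch below is only an illustration: the helper names offset_by_formula and offset_by_address are hypothetical and do not come from the slides. It computes the byte offset of list[i] from the base address in two ways, once from the formula i*sizeof(int) and once from the actual addresses, so the two results can be compared.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of list[i] from the base address, as predicted by the
   formula on the slide: i * sizeof(int). */
size_t offset_by_formula(size_t i)
{
    return i * sizeof(int);
}

/* Byte offset of list[i] from the base address, measured from the real
   addresses. Both pointers are viewed as char pointers so that the
   subtraction counts bytes rather than ints. */
size_t offset_by_address(const int *list, size_t i)
{
    const char *base = (const char *)list;
    const char *elem = (const char *)&list[i];
    assert(elem >= base);
    return (size_t)(elem - base);
}
```

For an array declared as int list[5], the two functions should agree for every i from 0 to 4, whatever sizeof(int) is on the machine at hand.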
Notations (for int list2[5]):
  list2         - a pointer to list2[0]
  (list2 + i)   - a pointer to list2[i], i.e. &list2[i]
  *(list2 + i)  - list2[i]

Example

  #include <stdio.h>

  int one[] = {0, 1, 2, 3, 4};
  // Goal: print out address and value
  void print1(int *ptr, int rows)
  {
      int i;
      printf("Address Contents\n");
      for (i = 0; i < rows; i++)
          /* %p with a (void *) cast is the portable way to print an address */
          printf("%p%5d\n", (void *)(ptr + i), *(ptr + i));
      printf("\n");
  }

Example address/contents pairs (actual addresses and their printed form depend on the machine; these are 2 apart, so they correspond to 2-byte ints):

  Address  Contents
  1228     0
  1230     1
  1232     2
  1234     3
  1236     4

2D Arrays

The elements of a 2-dimensional array a declared as:
  int a[3][4];
may be shown as a table

  a[0][0] a[0][1] a[0][2] a[0][3]
  a[1][0] a[1][1] a[1][2] a[1][3]
  a[2][0] a[2][1] a[2][2] a[2][3]

Rows Of A 2D Array

  a[0][0] a[0][1] a[0][2] a[0][3]   row 0
  a[1][0] a[1][1] a[1][2] a[1][3]   row 1
  a[2][0] a[2][1] a[2][2] a[2][3]   row 2

Columns Of A 2D Array

  a[0][0]   a[0][1]   a[0][2]   a[0][3]
  a[1][0]   a[1][1]   a[1][2]   a[1][3]
  a[2][0]   a[2][1]   a[2][2]   a[2][3]
  column 0  column 1  column 2  column 3

2D Array Representation In C

2-dimensional array x

  a, b, c, d
  e, f, g, h
  i, j, k, l

view 2D array as a 1D array of rows
  x = [row 0, row 1, row 2]
  row 0 = [a, b, c, d]
  row 1 = [e, f, g, h]
  row 2 = [i, j, k, l]
and store the rows as 1D arrays

Array Representation In C

  x[] holds one entry per row, each referring to a 1D array:
    a b c d
    e f g h
    i j k l

This representation is called the array-of-arrays representation.
Contiguous space required for this representation?

Other Data Structures Based on Arrays

• Arrays:
  • Basic data structure
  • May store any type of elements
• Polynomials: defined by a list of coefficients and exponents
  - degree of polynomial = the largest exponent in a polynomial

  $p(x) = a_1 x^{e_1} + \cdots + a_n x^{e_n}$

Polynomials

  $A(X) = 3X^{20} + 2X^{5} + 4$
  $B(X) = X^{4} + 10X^{3} + 3X^{2} + 1$

Polynomial ADT

Objects: a set of ordered pairs of <e_i, a_i> where a_i in Coefficients and e_i in Exponents, e_i are integers >= 0

Methods: for all poly, poly1, poly2 ∈ Polynomial, coef ∈ Coefficients, expon ∈ Exponents

  Polynomial Zero( ) ::= return the polynomial p(x) = 0
  Boolean IsZero(poly) ::= if (poly) return FALSE
                           else return TRUE
  Coefficient Coef(poly, expon) ::= if (expon ∈ poly) return its coefficient
                                    else return Zero
  Exponent Lead_Exp(poly) ::= return the largest exponent in poly
  Polynomial Attach(poly, coef, expon) ::= if (expon ∈ poly) return error
                                           else return the polynomial poly with the term <coef, expon> inserted

Polynomial ADT (cont'd)

  Polynomial Remove(poly, expon) ::= if (expon ∈ poly) return the polynomial poly with the term whose exponent is expon deleted
                                     else return error
  Polynomial SingleMult(poly, coef, expon) ::= return the polynomial poly • coef • x^expon
  Polynomial Add(poly1, poly2) ::= return the polynomial poly1 + poly2
  Polynomial Mult(poly1, poly2) ::= return the polynomial poly1 • poly2

Polynomial Addition (1)

  #define MAX_DEGREE 101
  typedef struct {
      int degree;
      float coef[MAX_DEGREE];
  } polynomial;

  Addition(polynomial *a, polynomial *b, polynomial *c)
  {
      …
  }

Running time?

advantage: easy implementation
disadvantage: wastes space when the polynomial is sparse

Polynomial Addition (2)

Use one global array to store all polynomials

  $A(X) = 2X^{1000} + 1$
  $B(X) = X^{4} + 10X^{3} + 3X^{2} + 1$

  index   0       1        2       3    4    5        6
  coef    2       1        1       10   3    1
  exp     1000    0        4       3    2    0
          starta  finisha  startb            finishb  avail

Polynomial Addition (2) (cont'd)

  #define MAX_TERMS 100
  typedef struct {
      int exp;
      float coef;
  } polynomial;
  polynomial terms[MAX_TERMS];

  Addition(int starta, int enda, int startb, int endb, int startc, int endc)
  {
      …
  }

Running time?

advantage: less space
disadvantage: longer code

Time Complexity ?
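The body of Addition for the global-array representation is elided on the slide. One possible way to fill it in is sketched below, in the style of the usual textbook merge of two term lists. It is an assumption, not the slide's own code: the names padd and attach, the avail variable, and the use of pointer parameters for the result range (the slide passes startc and endc as plain ints) are all hypothetical. The sketch assumes that each polynomial's terms are stored in decreasing order of exponent.

```c
#include <assert.h>

#define MAX_TERMS 100

typedef struct {
    int exp;
    float coef;
} polynomial;

polynomial terms[MAX_TERMS];  /* one global array for all polynomials */
int avail = 0;                /* next free slot in terms[] (assumed name) */

/* Append one term at terms[avail] (hypothetical helper). */
static void attach(float coef, int exp)
{
    assert(avail < MAX_TERMS);
    terms[avail].coef = coef;
    terms[avail].exp = exp;
    avail++;
}

/* Add A = terms[starta..finisha] and B = terms[startb..finishb].
   The sum is appended to terms[] and its range is returned through
   *startc and *finishc. */
void padd(int starta, int finisha, int startb, int finishb,
          int *startc, int *finishc)
{
    *startc = avail;
    while (starta <= finisha && startb <= finishb) {
        if (terms[starta].exp > terms[startb].exp) {
            attach(terms[starta].coef, terms[starta].exp);
            starta++;
        } else if (terms[starta].exp < terms[startb].exp) {
            attach(terms[startb].coef, terms[startb].exp);
            startb++;
        } else {
            /* equal exponents: add coefficients, drop a zero result */
            float sum = terms[starta].coef + terms[startb].coef;
            if (sum != 0.0f)
                attach(sum, terms[starta].exp);
            starta++;
            startb++;
        }
    }
    /* copy whatever remains of A, then of B */
    for (; starta <= finisha; starta++)
        attach(terms[starta].coef, terms[starta].exp);
    for (; startb <= finishb; startb++)
        attach(terms[startb].coef, terms[startb].exp);
    *finishc = avail - 1;
}
```

Each loop iteration consumes at least one term of A or B, so under these assumptions the running time is O(m + n) for polynomials with m and n nonzero terms, independent of the degree.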
Sparse Matrices

Dense matrix: 5*3, 15/15 elements nonzero.

Sparse matrix: 6*6, 8/36 elements nonzero:

        col1  col2  col3  col4  col5  col6
  row0    15     0     0    22     0   -15
  row1     0    11     3     0     0     0
  row2     0     0     0    -6     0     0
  row3     0     0     0     0     0     0
  row4    91     0     0     0     0     0
  row5     0     0    28     0     0     0

sparse matrix data structure?

Sparse Matrix - Airline flight

Airline flight matrix.
  airports are numbered 1 through n
  flight(i,j) = list of nonstop flights from airport i to airport j
  n = 1000 (say)
  n x n array of list pointers => 4 million bytes (at 4 bytes per pointer)
  total number of nonempty flight lists = 20,000 (say)

Sparse Matrix - Web page matrix

Web page matrix.
  web pages are numbered 1 through n
  web(i,j) = number of links from page i to page j

Web analysis.
  authority page … page that has many links to it
  hub page … links to many authority pages

Web Page Matrix

  n = 2 billion (and growing by 1 million a day)
  n x n array of ints => 16 * 10^18 bytes (16 * 10^9 GB)
  each page links to 10 (say) other pages on average
  on average there are 10 nonzero entries per row
  space needed for nonzero elements is approximately 2 billion x 10 x 4 bytes = 80 billion bytes (80 GB)

Sparse Matrix ADT

Objects: a set of triples, <row, column, value>, where row and column are integers and form a unique combination, and value comes from the set item.

Methods: for all a, b ∈ Sparse_Matrix, x ∈ item, i, j, max_col, max_row ∈ index

  Sparse_Matrix Create(max_row, max_col) ::= return a Sparse_Matrix that can hold up to max_items = max_row × max_col and whose maximum row size is max_row and whose maximum column size is max_col.

Sparse Matrix ADT (cont'd)

  Sparse_Matrix Transpose(a) ::= return the matrix produced by interchanging the row and column value of every triple.
  Sparse_Matrix Add(a, b) ::= if the dimensions of a and b are the same return the matrix produced by adding corresponding items, namely those with identical row and column values.
                              else return error
  Sparse_Matrix Multiply(a, b) ::= if number of columns in a equals number of rows in b return the matrix d produced by multiplying a by b according to the formula $d[i][j] = \sum_{k} a[i][k] \cdot b[k][j]$, where d(i, j) is the (i,j)th element
                                   else return error.

Sparse Matrix Representation

(1) Represented by a two-dimensional array. Sparse matrix wastes space.
(2) Each element is characterized by <row, col, value>.

  Sparse_matrix Create(max_row, max_col) ::=

  #define MAX_TERMS 101 /* maximum number of terms + 1 */
  typedef struct {
      int col;
      int row;
      int value;
  } term;
  term A[MAX_TERMS];

  The terms in A should be ordered based on <row, col>.

Matrix Transpose

  Original (4 x 5)      Transpose (5 x 4)
  0 0 3 0 4             0 0 0 0
  0 0 5 7 0             0 0 0 2
  0 0 0 0 0             3 5 0 6
  0 2 6 0 0             0 7 0 0
                        4 0 0 0

Transpose of a Sparse Matrix in 2D array representation

  for (j = 0; j < columns; j++)
      for (i = 0; i < rows; i++)
          b[j][i] = a[i][j];

Time and Space Complexity ?

Sparse Matrix Operations

Transpose of a sparse matrix. Entry [0] of each list holds the number of rows, the number of columns and the number of nonzero terms.

         row  col  value                  row  col  value
  a[0]     6    6      8           b[0]     6    6      8
   [1]     0    0     15            [1]     0    0     15
   [2]     0    3     22            [2]     0    4     91
   [3]     0    5    -15            [3]     1    1     11
   [4]     1    1     11  transpose [4]     2    1      3
   [5]     1    2      3            [5]     2    5     28
   [6]     2    3     -6            [6]     3    0     22
   [7]     4    0     91            [7]     3    2     -6
   [8]     5    2     28            [8]     5    0    -15

Transpose a Sparse Matrix

(1) for each row i
      take element <i, j, value> and store it in element <j, i, value> of the transpose.
    difficulty: where to put <j, i, value>?
      (0, 0, 15)  ====> (0, 0, 15)
      (0, 3, 22)  ====> (3, 0, 22)
      (0, 5, -15) ====> (5, 0, -15)
      (1, 1, 11)  ====> (1, 1, 11)
    To keep the transpose ordered by <row, col>, elements have to be moved down very often.
(2) For all elements in column j,
      place element <i, j, value> in element <j, i, value>

Transpose of a Sparse Matrix

  void transpose(term a[], term b[])
  /* b is set to the transpose of a */
  {
      int n, i, j, currentb;
      n = a[0].value;        /* total number of elements */
      b[0].row = a[0].col;   /* rows in b = columns in a */
      b[0].col = a[0].row;   /* columns in b = rows in a */
      b[0].value = n;
      if (n > 0) {           /* non zero matrix */
          currentb = 1;
          for (i = 0; i < a[0].col; i++)   /* transpose by columns in a */
              for (j = 1; j <= n; j++)     /* find elements from the current column */
                  if (a[j].col == i) {
                      /* element is in current column, add it to b */
                      b[currentb].row = a[j].col;
                      b[currentb].col = a[j].row;
                      b[currentb].value = a[j].value;
                      currentb++;
                  }
      }
  }

Time Complexity Analysis of Matrix Transpose

  The outer loop runs once per column of a and the inner loop scans all n terms each time, so the running time is O(columns • elements).

Fast Matrix Transpose

  Step 1: #nonzero in each row of transpose
          = #nonzero in each column of original matrix
          = [2, 1, 2, 2, 0, 1]
  Step 2: Calculate starting position of each row of transpose
          = [1, 3, 4, 6, 8, 8]
  Step 3: Move elements from original list to transpose list.

Time Complexity Analysis of Fast Matrix Transpose

  m x n original matrix
  t nonzero elements
  Step 1: O(n+t)
  Step 2: O(n)
  Step 3: O(t)
  Overall O(n+t)

Runtime Performance - Transpose

  500 x 500 matrix with 1994 nonzero elements
  Run time measured on a 300MHz Pentium II PC

  2D array       210 ms
  SparseMatrix     6 ms

Runtime Performance - Addition

  Matrix Addition. 500 x 500 matrices with 1994 and 999 nonzero elements

  2D array       880 ms
  SparseMatrix    18 ms
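The three steps of the fast transpose can be sketched in C as follows. This is a minimal sketch under stated assumptions rather than the code behind the measurements above: the function name fast_transpose, the arrays row_terms and starting_pos, and the bound MAX_COL are hypothetical names introduced for illustration. The term layout follows the slides, with entry [0] holding the row count, the column count and the number of nonzero terms.

```c
#include <assert.h>

#define MAX_TERMS 101  /* maximum number of terms + 1 */
#define MAX_COL 50     /* assumed upper bound on the number of columns */

typedef struct {
    int col;
    int row;
    int value;
} term;

/* b is set to the transpose of a in O(columns + terms) time. */
void fast_transpose(const term a[], term b[])
{
    int row_terms[MAX_COL];     /* step 1: #nonzero per row of the transpose */
    int starting_pos[MAX_COL];  /* step 2: first index in b for each row */
    int i, j;
    int num_cols = a[0].col;
    int num_terms = a[0].value;

    assert(num_cols <= MAX_COL);
    b[0].row = num_cols;   /* rows in b = columns in a */
    b[0].col = a[0].row;   /* columns in b = rows in a */
    b[0].value = num_terms;
    if (num_terms == 0)
        return;

    /* Step 1: count the nonzero terms in each column of a. */
    for (i = 0; i < num_cols; i++)
        row_terms[i] = 0;
    for (i = 1; i <= num_terms; i++)
        row_terms[a[i].col]++;

    /* Step 2: running sum gives the starting position of each row of b. */
    starting_pos[0] = 1;
    for (i = 1; i < num_cols; i++)
        starting_pos[i] = starting_pos[i - 1] + row_terms[i - 1];

    /* Step 3: move each term of a straight to its final place in b. */
    for (i = 1; i <= num_terms; i++) {
        j = starting_pos[a[i].col]++;
        b[j].row = a[i].col;
        b[j].col = a[i].row;
        b[j].value = a[i].value;
    }
}
```

For the 6 x 6 example with 8 nonzero terms, step 1 yields [2, 1, 2, 2, 0, 1] and step 2 yields [1, 3, 4, 6, 8, 8], matching the values listed on the slide. The two extra arrays cost O(n) space, which is the price paid for removing the repeated scans of the simple column-by-column transpose.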