ARRAYS
CS1201 Data Structures & Algorithms
Prof. Dr. Wajid Aziz
kh.wajid@ajku.edu.pk

Motivation
• You want to store 4 numbers in a program.
  o No problem. You define four float variables:
      float num1, num2, num3, num4;
  o Easy enough, right?
  o But what if you want to store 2000 numbers?
• Are you really going to make 2000 separate variables?
      float num1, num2, ..., num1998, num1999, num2000;
• That would be CRAZY!
• So, what is the solution?
  o A data structure! Specifically, an array!
• An array is one of the most common data structures.

Basic Concepts
• Array name (data)
• Index/subscript (0...9)
• The slots are numbered sequentially starting at zero (Java, C++).
• If there are N slots in an array, the indices run from 0 through N-1.
• Array length = N = 10
• Array size = N × size of an element = 10 × 4 = 40

Using Arrays
• An element is accessed as Array_name[index].
• For example, in C++:
  o cout << data[4];
    • will display 0
  o data[3] = 99;
    • will replace -3 with 99

Using Arrays
• data[-1]
  o illegal (-1 < lower bound)
• data[10]
  o illegal (10 > upper bound)
• data[1.5]
  o illegal (an index must be an integer)
• data[0]
  o OK
• data[9]
  o OK
• What will be the output of the following?
      data[5] + 10
      data[3] = data[3] + 10

Array As a Data Structure
• In general, the length of an array A can be obtained from its index set by the formula
      Length = UB - LB + 1
  where LB is the lower bound (smallest index) and UB is the upper bound (largest index).
• Example: An automobile company uses an array AUTO to record the number of automobiles sold each year from 1932 to 1984.
  o Rather than beginning the index set with 0, it is more convenient to begin with 1932.
  o Then LB = 1932 and UB = 1984, hence the length of the array is
      Length = UB - LB + 1 = 1984 - 1932 + 1 = 53

Array As a Data Structure
• The simplest type of data structure is the linear array.
  o It is a collection of multiple values of the same type.
  o Examples:
    • An array of student grades
    • An array of student names
    • An array of objects (OOP perspective!)
• If we choose the name A for the array, then the array elements are denoted by A[1], A[2], A[3], …, A[N].
• The K in A[K] is called a subscript, and A[K] is called a subscripted variable.
• The number N is called the length or size of the array.

Representation of Arrays in Memory
• Let LA be a linear array in the memory of the computer. Then
      LOC(LA[K]) = address of the element LA[K] of array LA
• The computer does not need to keep track of the address of every element of LA.
  o It only needs to keep track of the address of the first element of LA, called the base address Base(LA).
• The address of any element of LA can be computed using the formula
      LOC(LA[K]) = Base(LA) + W(K - LB)
• Here W is the number of words per memory cell of LA and LB is the lower bound of its index set.
[Figure: computer memory, cells at addresses 1000 to 1005]

Representation of Arrays in Memory
• Consider the AUTO array example, which records the number of automobiles sold each year from 1932 to 1984.
• Suppose AUTO appears in memory as pictured in the figure, with Base(AUTO) = 200 and W = 4 words per cell. Then
      LOC(AUTO[1932]) = 200, LOC(AUTO[1933]) = 204, …
• The address of the array element for the year 1965 is
      LOC(AUTO[1965]) = Base(AUTO) + W(1965 - LB)
                      = 200 + 4(1965 - 1932)
                      = 200 + 4(33)
                      = 200 + 132
                      = 332

Array Characteristics
• Homogeneity
  o All elements in an array must have the same data type.
• Contiguous memory
  o Array elements (or their references) are stored in contiguous (consecutive) memory locations.
• Direct access to an element
  o The index is used to access an element directly.
• Static data structure
  o An array cannot grow or shrink during program execution; its size is fixed.
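The length formula and the address formula from the slides above can be checked with a few lines of C++. This is only a minimal sketch: the function names arrayLength and elementAddress are made up for illustration, and addresses are treated as plain integers rather than real pointers.

```cpp
#include <cassert>

// Length = UB - LB + 1 for an index set running from lb to ub inclusive.
long arrayLength(long lb, long ub)
{
    assert(ub >= lb);              // the index set must not be empty
    return ub - lb + 1;
}

// LOC(LA[K]) = Base(LA) + W * (K - LB)
// base: address of the first element, w: words per memory cell,
// lb: lower bound of the index set, k: subscript of the wanted element.
long elementAddress(long base, long w, long lb, long k)
{
    assert(k >= lb);               // k must not lie below the lower bound
    return base + w * (k - lb);
}
```

With the AUTO values from the slides, arrayLength(1932, 1984) gives 53 and elementAddress(200, 4, 1932, 1965) gives 332. Only the base address is stored; every other address is computed on demand, which is what makes direct access possible.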
ARRAY Operations

Array Operations
• Traversing: accessing and processing each element of an array exactly once
  o Display all contents of an array
  o Count the number of elements of an array with a given property
      /* (Traversing a linear array) Here LA is a linear array with
         lower bound LB and upper bound UB. This algorithm traverses LA
         by applying an operation PROCESS to each element of LA. */
      for (K = LB; K <= UB; K++)
          PROCESS(LA[K]);

Array Operations (Traversing)
• Consider the AUTO array example, which records the number of automobiles sold each year from 1932 to 1984. Find the number NUM of years during which more than 300 cars were sold.
      NUM = 0;
      for (K = LB; K <= UB; K++)
          if (AUTO[K] > 300)
              NUM = NUM + 1;
• Print each year and the number of automobiles sold in that year.
      for (K = LB; K <= UB; K++)
          cout << K << "\t" << AUTO[K] << endl;

Array Operations
• Insertion: add an element at a certain index
  o What if we want to add an element at the beginning?
    • This would be a very slow operation! Why?
    • Because we would have to shift ALL other elements over by one position.
  o What if we add an element at the end?
    • It would be FAST. Why? No need to shift.

Array Operations: Insertion
• Algorithm: The algorithm inserts a data element ITEM into the Kth position of a linear array LA containing N elements.
• The first four steps create space in LA by moving each element from the Kth position onward down by one location.
      INSERT (LA, N, K, ITEM)
      1. Set J = N
      2. Repeat steps 3 and 4 while J >= K
      3.     Set LA[J+1] = LA[J]
      4.     Set J = J - 1
      5. Set LA[K] = ITEM
      6. Set N = N + 1
      7. Exit
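Returning to traversal for a moment: the AUTO loops shown earlier in this section index the array by year, which C++ arrays do not support directly. The sketch below is one possible way to write the same traversal with 0-based storage. The names countYearsAbove and slotForYear are hypothetical, and any sales figures passed in are placeholders, not data from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Maps a year to its 0-based slot: slot = year - LB.
std::size_t slotForYear(int year, int lb)
{
    assert(year >= lb);            // the year must lie inside the index set
    return static_cast<std::size_t>(year - lb);
}

// Traverses the whole array once and counts the entries that are
// strictly greater than threshold (the NUM of the slides).
int countYearsAbove(const std::vector<int>& sales, int threshold)
{
    int num = 0;
    for (std::size_t i = 0; i < sales.size(); ++i)
        if (sales[i] > threshold)
            ++num;
    return num;
}
```

For example, with the made-up figures {250, 310, 300, 420} and a threshold of 300, countYearsAbove returns 2, because 300 itself is not "more than 300". slotForYear(1965, 1932) returns 33, the same offset K - LB used in the address formula.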
Array Operations: Insertion
• C++ Code Segment to insert an item in an array: The code segment inserts a data element ITEM at index K (counting from 0) of a linear array LA containing N elements. N is passed by reference so that the caller sees the new length, and LA must have room for at least N+1 elements.
      void INSERT(int LA[], int &N, int K, int ITEM)
      {
          // Shift LA[K..N-1] one slot to the right, starting from the end
          for (int J = N; J > K; J--)
              LA[J] = LA[J - 1];
          LA[K] = ITEM;   // place the new element in the freed slot
          N = N + 1;      // the array now holds one more element
      }

Array Operations
• Deletion: remove an element at a certain index
  o Remove an element at the beginning of the array
    • Performance is again very slow,
    • because ALL remaining elements need to shift one position backwards.
  o Remove an element at the end of the array
    • Very fast, because no shifting is needed.

Array Operations: Deletion
• Algorithm: The algorithm deletes the Kth element from a linear array LA and assigns it to a variable ITEM.
      DELETE (LA, N, K, ITEM)
      1. Set ITEM = LA[K]
      2. Repeat for J = K to N-1
             Set LA[J] = LA[J+1]
      3. Set N = N - 1
      4. Exit

Array Operations: Deletion
• C++ Code Segment: The code deletes the element at index K (counting from 0) from a linear array LA and assigns it to the variable ITEM. N and ITEM are passed by reference so that the caller sees the results.
      void DELETE(int LA[], int &N, int K, int &ITEM)
      {
          ITEM = LA[K];                     // save the deleted element
          // Shift LA[K+1..N-1] one slot to the left
          for (int J = K; J < N - 1; J++)
              LA[J] = LA[J + 1];
          N = N - 1;                        // one element fewer
      }

Array Operations
• Searching refers to the operation of finding the location LOC of ITEM in an array.
• Searching through the array:
  o Depends on the algorithm
  o Some algorithms are faster than others
• Two algorithms are covered next:
  o Linear Search
  o Binary Search

Linear Search

Array Operations
• Linear Search: a sequential searching algorithm in which we
  o start from the leftmost element of the array and compare ITEM with each element of the array, one by one;
  o if ITEM matches an element, return its index;
  o if ITEM doesn't match any of the elements, return -1.
• It is the simplest searching algorithm.
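The cost claims made earlier for insertion and deletion (slow at the beginning, fast at the end) can be checked by counting element shifts. The sketch below is an illustration only. It assumes std::vector as the storage so that capacity is handled automatically, which differs from the fixed-size arrays of the slides, and the names insertAt and deleteAt are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inserts item at 0-based index k and returns the number of elements
// that had to be shifted one slot to the right.
int insertAt(std::vector<int>& a, std::size_t k, int item)
{
    assert(k <= a.size());
    a.push_back(0);                          // make room for one more element
    int shifts = 0;
    for (std::size_t j = a.size() - 1; j > k; --j) {
        a[j] = a[j - 1];                     // move element one slot right
        ++shifts;
    }
    a[k] = item;
    return shifts;
}

// Deletes the element at 0-based index k, stores it in item, and returns
// the number of elements that had to be shifted one slot to the left.
int deleteAt(std::vector<int>& a, std::size_t k, int& item)
{
    assert(k < a.size());
    item = a[k];
    int shifts = 0;
    for (std::size_t j = k; j + 1 < a.size(); ++j) {
        a[j] = a[j + 1];                     // move element one slot left
        ++shifts;
    }
    a.pop_back();                            // drop the now-duplicated last slot
    return shifts;
}
```

Starting from {10, 20, 30, 40}, inserting at index 0 shifts all 4 elements, while inserting at the end shifts none. Deletion behaves the same way: removing the first of N elements shifts N-1 of them, removing the last shifts none.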
Array Operations
• Linear Search Code Segment: The code returns the location of ITEM.
  o If ITEM is found, it returns the index of ITEM.
  o Otherwise, it returns -1.
      int search(int LA[], int N, int ITEM)
      {
          int J;
          for (J = 0; J < N; J++)
              if (LA[J] == ITEM)
                  return J;
          return -1;
      }

Analysis of Linear Search
• Applicable to all data, i.e. sorted or unsorted
  o The basic operation is "comparison".
  o The ONLY way to be sure that a value isn't in the array is to look at every single spot of the array.
  o If we have 100 elements, then we have to make 100 comparisons to be sure about the value.
  o Therefore, for n elements the number of comparisons in the worst case is n, i.e. O(n).

Binary Search

Array Operations
• Binary Search: a searching algorithm for finding an element's position in a sorted array.
• In this approach, the element is always searched for in the middle of a portion of the array.
• Example: the Number Guessing Game from childhood
  o I have a secret number between 1 and 100.
  o Make a guess and I'll tell you whether your guess is too high or too low.
  o Then you guess again, and the process continues until you guess the correct number.
  o Your job is to MINIMIZE the number of guesses you make.

Binary Search (Introduction)
• Number Guessing Game from childhood
  o What is the first guess of most people?
    • 50.
  o Why?
    • No matter the response (too high or too low), the largest number of possible values left for your remaining search is 50 (either 1-49 or 51-100).
    • Any other first guess carries the risk that the number of possible remaining values is greater than 50.
  o Example: you guess 75.
  o I respond: too high.
  o So now you have to guess between 1 and 74.
    • 74 values to guess from instead of 50
Binary Search (Introduction)
• Applicable only to a sorted array

      index:  0   1   2   3   4   5   6   7   8
      value:  2   6   19  27  33  37  38  41  118

• We are searching for the value 19.
• So where is halfway between?
  o One guess would be to look at 2 and 118 and take their average (60).
  o But 60 isn't even in the list.
  o And if we look at the number closest to 60,
    • it is almost at the end of the array.

Binary Search (Introduction)
• We quickly realize that if we want to adapt the number guessing game strategy to searching an array, we MUST search at the middle INDEX of the array.
• In this case:
  o The lowest index is 0.
  o The highest index is 8.
  o So the middle index is 4.

Binary Search (Introduction)
• We would ask, "Is the number I am searching for, 19, greater or less than the number stored at index 4?"
  o Index 4 stores 33.
• The answer would be "less than".
• So we would narrow our search range to the indices 0 through 3.
  o Note that index 4 is no longer in the search space.
• We then continue this process.
  o The second index we'd look at is index 1, since (0+3)/2 = 1 with integer division.
  o Then we'd finally get to index 2, since (2+3)/2 = 2.
  o And at index 2 we would find the value 19 in the array.

Array Operation
• Binary Search Code Segment: The code returns the location of ITEM in the sorted array LA, searching between the indices Low and High.
  o If ITEM is found, it returns the index of ITEM.
  o Otherwise, it returns -1.
      int search(int LA[], int Low, int High, int ITEM)
      {
          int Mid;
          // Keep halving the range until it becomes empty
          while (Low <= High) {
              Mid = (Low + High) / 2;
              if (ITEM == LA[Mid])
                  return Mid;
              else if (ITEM > LA[Mid])
                  Low = Mid + 1;    // continue in the right half
              else
                  High = Mid - 1;   // continue in the left half
          }
          return -1;                // ITEM is not in the array
      }
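The walk-through for the value 19 (look at index 4, then index 1, then index 2) can be reproduced by recording every middle index that binary search probes. This is a sketch for illustration: tracedBinarySearch and its probes parameter are hypothetical additions, not part of the slides' code.

```cpp
#include <cassert>
#include <vector>

// Binary search over the sorted vector a. Every middle index that is
// examined is appended to probes, so the search path can be inspected.
// Returns the index of item, or -1 if item is not present.
int tracedBinarySearch(const std::vector<int>& a, int item,
                       std::vector<int>& probes)
{
    int low = 0;
    int high = static_cast<int>(a.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   // same value as (low + high) / 2
        probes.push_back(mid);
        if (a[mid] == item)
            return mid;
        else if (item > a[mid])
            low = mid + 1;                  // discard the left half
        else
            high = mid - 1;                 // discard the right half
    }
    return -1;
}
```

On the nine-element array from the slides, searching for 19 probes the indices 4, 1, 2 and returns 2, matching the walk-through. Searching for 60, which is not in the array, returns -1 after 4 probes.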
Analysis of Binary Search
• Let's analyze how many comparisons (guesses) are necessary when running this algorithm on an array of n items.
• First, let's try n = 128:
  o After 1 guess, we have 64 items left.
  o After 2 guesses, we have 32 items left.
  o After 3 guesses, we have 16 items left.
  o After 4 guesses, we have 8 items left.
  o After 5 guesses, we have 4 items left.
  o After 6 guesses, we have 2 items left.
  o After 7 guesses, we have 1 item left.
  o After 8 guesses, we have 0 items left.

Analysis of Binary Search
• General case for n items:
  o After 1 guess, we have $n/2^1$ items left.
  o After 2 guesses, we have $n/2^2$ items left.
  o After 3 guesses, we have $n/2^3$ items left.
  o After 4 guesses, we have $n/2^4$ items left.
  o …
  o After 10 guesses, we have $n/2^{10}$ items left.
  o After k guesses, we have $n/2^k$ items left.
  o We stop when we are left with 1 item.

Analysis of Binary Search
• So we will stop once
      $\frac{n}{2^k} = 1 \;\Rightarrow\; n = 2^k \;\Rightarrow\; k = \log_2 n$
• This means that a binary search takes roughly $\log_2 n$ comparisons when searching a sorted array of n items.
• The efficiency of binary search is $O(\log_2 n)$.

Linear Search vs Binary Search
• Linear search: $O(n)$
• Binary search: $O(\log_2 n)$
• Binary search is more efficient:

      n             log2 n
      8             3
      1024          10
      65536         16
      1048576       20
      33554432      25
      1073741824    30

Multi-Dimensional Array
• Most programming languages allow arrays with more than one subscript:
  o Two-dimensional arrays, which have 2 subscripts
  o Three-dimensional arrays, which have 3 subscripts
• Some programming languages allow as many as 7 dimensions for an array.
• A multi-dimensional array can be termed an array of arrays that stores homogeneous data in tabular form.

Two-Dimensional Array
• A two-dimensional m×n array A is a collection of m·n data elements such that each element is specified by a pair of integers (J, K), called subscripts, with the property 1 ≤ J ≤ m and 1 ≤ K ≤ n.
• The element of A with first subscript J and second subscript K is denoted by A[J, K] or A[J][K].

Two-Dimensional Array
• A two-dimensional m×n array A is represented in memory by a block of m·n sequential memory locations.
• Specifically, the programming language will store the array in either
  o column-major order (column by column), or
  o row-major order (row by row).

Two-Dimensional Array
• As with a one-dimensional array, the computer keeps track of Base(A), the address of the first element A[1][1].
• To compute the address LOC(A[J][K]) in column-major order, use the formula
      LOC(A[J][K]) = Base(A) + w(M(K-1) + (J-1))
• To compute the address LOC(A[J][K]) in row-major order, use the formula
      LOC(A[J][K]) = Base(A) + w(N(J-1) + (K-1))
• Here M is the number of rows, N is the number of columns, and w is the number of words per memory cell.

Two-Dimensional Array
• Table: scores of 25 students on 4 tests (the array Score, with M = 25 rows and N = 4 columns)

      Student   Score1   Score2   Score3   Score4
      1         95       88       100      85
      2         72       77       66       72
      3         78       70       80       96
      ...       ...      ...      ...      ...
      25        84       88       73       81

Two-Dimensional Array
• Suppose Base(Score) = 100 and w = 4 words per memory cell.
• LOC(Score[10][3]) using row-major order is
      LOC(Score[J][K])  = Base(Score) + w(N(J-1) + (K-1))
      LOC(Score[10][3]) = 100 + 4(4(10-1) + (3-1)) = 100 + 4(36+2) = 252
• LOC(Score[10][3]) using column-major order is
      LOC(Score[J][K])  = Base(Score) + w(M(K-1) + (J-1))
      LOC(Score[10][3]) = 100 + 4(25(3-1) + (10-1)) = 100 + 4(50+9) = 336
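The two address formulas for a two-dimensional array can be written as small C++ functions and checked against the Score computation above. This is a minimal sketch under the slides' assumptions: subscripts start at 1, and addresses are plain integers. The names rowMajorAddress and columnMajorAddress are hypothetical.

```cpp
#include <cassert>

// Row-major order: LOC(A[J][K]) = Base(A) + w * (N*(J-1) + (K-1))
// n is the number of columns; j and k are 1-based subscripts.
long rowMajorAddress(long base, long w, long n, long j, long k)
{
    assert(j >= 1 && k >= 1 && k <= n);
    return base + w * (n * (j - 1) + (k - 1));
}

// Column-major order: LOC(A[J][K]) = Base(A) + w * (M*(K-1) + (J-1))
// m is the number of rows; j and k are 1-based subscripts.
long columnMajorAddress(long base, long w, long m, long j, long k)
{
    assert(j >= 1 && k >= 1 && j <= m);
    return base + w * (m * (k - 1) + (j - 1));
}
```

For Score[10][3] with Base = 100 and w = 4, rowMajorAddress(100, 4, 4, 10, 3) gives 252 and columnMajorAddress(100, 4, 25, 10, 3) gives 336. Note that the row-major formula needs only the number of columns and the column-major formula needs only the number of rows.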
Two-Dimensional Array
• In the Score example, 25 students in a class are each given 4 tests, so the scores form a 25×4 array.
• Assume the students are numbered from 1 to 25.

Two-Dimensional Array
• Two-dimensional arrays are called matrices in mathematics and tables in business applications.
• A two-dimensional 3×3 array A in C++ is shown below:

      A[0][0]   A[0][1]   A[0][2]
      A[1][0]   A[1][1]   A[1][2]
      A[2][0]   A[2][1]   A[2][2]

• The number of elements in the 3×3 array A is 9.

General Multi-Dimensional Arrays
• The n-dimensional m1×m2×m3×…×mn array B is a collection of m1·m2·m3·…·mn data elements in which each element is specified by a list of n integers K1, K2, K3, …, Kn, called subscripts, with the property that
  o 1 ≤ K1 ≤ m1, 1 ≤ K2 ≤ m2, …, 1 ≤ Kn ≤ mn
• The element of B with subscripts K1, K2, …, Kn is denoted by
  o B[K1][K2]…[Kn]
• Specifically, the programming language will store the array B in either
  o column-major order, or
  o row-major order.

General Multi-Dimensional Arrays
• Suppose B is a three-dimensional 2×4×3 array.
• B has 2·4·3 = 24 data elements.
• These 24 elements of B appear in three layers called pages.
  o Each page consists of a 2×4 rectangular array of elements with the same third subscript.
  o The three subscripts of an element of a 3-dimensional array are called row, column and page.

General Multi-Dimensional Arrays
• The two ways of storing a 3-dimensional array are shown in the following figure.
[Figure: column-major and row-major storage of a 3-dimensional array]

General Multi-Dimensional Arrays
• For a dimension of length Li, the effective index Ei of a given subscript Ki is the number of indices preceding Ki in the index set. It is calculated as
      Ei = Ki - LBi
  where LBi is the lower bound of that dimension.
• The address LOC(C[K1][K2]…[KN]) of an arbitrary element of an N-dimensional array C can be obtained using column-major order as
      $\mathrm{Base}(C) + w\big((\cdots((E_N L_{N-1} + E_{N-1})L_{N-2} + \cdots + E_3)L_2 + E_2)L_1 + E_1\big)$
• and using row-major order as
      $\mathrm{Base}(C) + w\big((\cdots((E_1 L_2 + E_2)L_3 + E_3)L_4 + \cdots + E_{N-1})L_N + E_N\big)$

General Multi-Dimensional Arrays
• Suppose a three-dimensional array M is declared using M(2:7, -4:1, 5:9).
• The lengths of the three dimensions of M are
  o L1 = 7 - 2 + 1 = 6
  o L2 = 1 - (-4) + 1 = 6
  o L3 = 9 - 5 + 1 = 5
• The number of elements of M is
  o L1·L2·L3 = 6·6·5 = 180
• The effective indices of M[5][-1][7] are
  o E1 = 5 - 2 = 3
  o E2 = -1 - (-4) = 3
  o E3 = 7 - 5 = 2

General Multi-Dimensional Arrays
• The location of M[5][-1][7] using row-major order is
      LOC(M[5][-1][7]) = Base(M) + w((E1·L2 + E2)·L3 + E3)
      E1·L2 = 3·6 = 18
      E1·L2 + E2 = 18 + 3 = 21
      (E1·L2 + E2)·L3 = 21·5 = 105
      (E1·L2 + E2)·L3 + E3 = 105 + 2 = 107
• Let Base(M) = 200 and w = 4. Then
      LOC(M[5][-1][7]) = 200 + 4(107) = 200 + 428 = 628

General Multi-Dimensional Arrays
• The location of M[5][-1][7] using column-major order is
      LOC(M[5][-1][7]) = Base(M) + w((E3·L2 + E2)·L1 + E1)
      E3·L2 = 2·6 = 12
      E3·L2 + E2 = 12 + 3 = 15
      (E3·L2 + E2)·L1 = 15·6 = 90
      (E3·L2 + E2)·L1 + E1 = 90 + 3 = 93
• Let Base(M) = 200 and w = 4. Then
      LOC(M[5][-1][7]) = 200 + 4(93) = 200 + 372 = 572
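The nested general formulas can be evaluated with a simple loop: start with an offset of 0 and, for each dimension in turn, multiply by that dimension's length and add its effective index. The sketch below illustrates this for arbitrary lower bounds. The struct Dimension and the function names are hypothetical, and addresses are again plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One dimension of an array, given by its lower and upper bound.
struct Dimension { long lb; long ub; };

// Li = UBi - LBi + 1
long dimLength(const Dimension& d) { return d.ub - d.lb + 1; }

// Row-major order: (...((E1*L2 + E2)*L3 + E3)...)*LN + EN,
// built up from the first subscript to the last.
long rowMajorAddressND(long base, long w,
                       const std::vector<Dimension>& dims,
                       const std::vector<long>& k)
{
    assert(!dims.empty() && dims.size() == k.size());
    long offset = 0;
    for (std::size_t i = 0; i < dims.size(); ++i) {
        assert(k[i] >= dims[i].lb && k[i] <= dims[i].ub);
        offset = offset * dimLength(dims[i]) + (k[i] - dims[i].lb);
    }
    return base + w * offset;
}

// Column-major order: (...((EN*L(N-1) + E(N-1))*L(N-2) + ...)*L1 + E1,
// built up from the last subscript to the first.
long columnMajorAddressND(long base, long w,
                          const std::vector<Dimension>& dims,
                          const std::vector<long>& k)
{
    assert(!dims.empty() && dims.size() == k.size());
    long offset = 0;
    for (std::size_t i = dims.size(); i-- > 0; ) {
        assert(k[i] >= dims[i].lb && k[i] <= dims[i].ub);
        offset = offset * dimLength(dims[i]) + (k[i] - dims[i].lb);
    }
    return base + w * offset;
}
```

For M(2:7, -4:1, 5:9) with Base(M) = 200 and w = 4, the element M[5][-1][7] is located at 628 in row-major order and at 572 in column-major order, in agreement with the hand computation. In the very first step the multiplication acts on an offset of 0, so the length of the first dimension processed never contributes, exactly as in the written formulas.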