E2: ALGORITHM & LOGIC DEVELOPMENT

UNIT II

Algorithm Design and Data Structure

Introduction to Data Structure

The study of data structures helps you understand the basic concepts involved in organizing and storing data, as well as the relationships among data sets. This in turn helps determine how information is stored, retrieved, and modified in a computer's memory.

Basic Concept

Data structure is a branch of computer science. The study of data structures helps you understand how data is organized and how data flow is managed to increase the efficiency of a process or program. A data structure is the structural representation of the logical relationships between data elements. This means that a data structure organizes data items based on the relationships between them.

Example: A house can be identified by its name, location, number of floors and so on. This structured set of variables works together to identify the exact house. Similarly, a data structure is a structured set of variables that are linked to each other and that forms a basic component of a system.

Terminology

Data structures are the building blocks of any program or software. Choosing the appropriate data structure for a program is one of the most difficult tasks for a programmer. The following terminology is used when discussing data structures.

Data: Data can be defined as an elementary value or a collection of values. For example, a student's name and ID are data about the student.

Group Items: Data items that have subordinate data items are called group items. For example, the name of a student can consist of a first name and a last name.

Record: A record is a collection of various data items. For example, for the student entity, the name, address, course and marks can be grouped together to form the record for the student.
File: A file is a collection of records of one type of entity. For example, if there are 60 employees in an organization, then there will be 60 records in the related file, where each record contains the data about one employee.

Attribute and Entity: An entity represents a class of certain objects. It contains various attributes, and each attribute represents a particular property of that entity.

Field: A field is a single elementary unit of information representing an attribute of an entity.

BCA 1st Sem BCADS-117 Page | 23

Goals of Data Structure

A data structure serves two complementary goals.

Correctness: A data structure is designed to operate correctly for all kinds of input within its domain of interest. Correctness is the primary goal of a data structure, and what counts as correct always depends on the specific problem the data structure is intended to solve.

Efficiency: A data structure also needs to be efficient. It should process data quickly without using too many computer resources such as memory space. In a real-time setting, the efficiency of a data structure is an important factor that determines the success or failure of the process.

Features of Data Structure

Some of the important features of data structures are:

Robustness: Programmers generally wish to produce software that generates correct output for every possible input and executes efficiently on all hardware platforms. Robust software of this kind must be able to handle both valid and invalid inputs.

Adaptability: Software projects such as word processors, Web browsers and Internet search engines involve large software systems that must work correctly and efficiently for many years. Moreover, software evolves because of changing market conditions and emerging technologies.

Reusability: Reusability and adaptability go hand-in-hand.
Developing software requires many resources, which makes it an expensive enterprise. However, if software is developed in a reusable and adaptable way, it can be used again in many future applications. Thus, by implementing quality data structures, it is possible to develop reusable software, which tends to be cost-effective and time-saving.

Classification of Data Structure

A data structure provides a structured set of variables that are associated with each other in different ways. It forms the basis of a programming tool that represents the relationships between data elements and helps programmers process data easily.

Data structures can be classified into two categories:

Primitive data structure
Non-primitive data structure

Primitive Data Structure

Primitive data structures consist of the numbers and characters that are built into programming languages. They can be manipulated directly by machine-level instructions. Basic data types such as integer, real, character, and Boolean are primitive data structures. These data types are also known as simple data types because their values cannot be divided further.

Non-primitive Data Structure

Non-primitive data structures are derived from primitive data structures. They cannot be manipulated directly by machine-level instructions. They group a set of data elements that is either homogeneous (same data type) or heterogeneous (different data types). They are further divided into linear and non-linear data structures based on the structure and arrangement of the data.

Linear Data Structure

A data structure that maintains a linear relationship among its elements is called a linear data structure. Here, the data is arranged logically in a linear fashion, although the arrangement in memory may not be sequential.
Examples: arrays, linked lists, stacks, and queues.

Non-linear Data Structure

A non-linear data structure is a data structure in which the data elements are not arranged in sequential order. There is typically a hierarchical relationship between individual data items. Here, insertion and deletion of data are not possible in a linear fashion. Trees and graphs are examples of non-linear data structures.

Algorithm

An algorithm is a step-by-step procedure that defines a set of instructions to be executed in a certain order to get the desired output. Algorithms are generally created independently of the underlying language, i.e. an algorithm can be implemented in more than one programming language.

From the data structure point of view, the following are some important categories of algorithms:

Search - Algorithm to search for an item in a data structure.
Sort - Algorithm to sort items in a certain order.
Insert - Algorithm to insert an item into a data structure.
Update - Algorithm to update an existing item in a data structure.
Delete - Algorithm to delete an existing item from a data structure.

Characteristics of an Algorithm

Not all procedures can be called algorithms. An algorithm should have the following characteristics.

Clear and Unambiguous: The algorithm should be clear and unambiguous. Each of its steps should be clear in all aspects and must have only one meaning.

Well-Defined Inputs: If an algorithm takes inputs, those inputs should be well defined.

Well-Defined Outputs: The algorithm must clearly define what output it produces, and that output should be well defined as well.

Finiteness: The algorithm must be finite, i.e. it must terminate and not end up in an infinite loop or similar.

Feasible: The algorithm must be simple, generic and practical, such that it can be executed with the available resources. It must not depend on technology that does not yet exist.

Language Independent: The algorithm must be language-independent, i.e. it must consist of plain instructions that can be implemented in any language and still produce the same expected output.

Advantages and Disadvantages of Algorithm

Advantages of Algorithms:

It is easy to understand.
An algorithm is a step-wise representation of a solution to a given problem.
In an algorithm the problem is broken down into smaller pieces or steps, so it is easier for the programmer to convert it into an actual program.

Disadvantages of Algorithms:

Writing an algorithm is time-consuming.
Branching and looping statements are difficult to show in algorithms.

Different approaches to designing an algorithm

Top-Down Approach: A top-down approach starts by identifying the major components of a system or program, decomposing them into their lower-level components, and iterating until the desired level of module complexity is achieved. We start with the topmost module and incrementally add the modules that it calls.

Bottom-Up Approach: A bottom-up approach starts by designing the most basic or primitive components and proceeds to higher-level components. Starting from the very bottom, the operations that provide a layer of abstraction are implemented.

Building Blocks of Algorithms, Complexity of Algorithms

Suppose X is an algorithm and n is the size of the input data. The time and space used by the algorithm X are the two main factors that decide the efficiency of X.

Time Factor - Time is measured by counting the number of key operations, such as comparisons in a sorting algorithm.
Space Factor - Space is measured by counting the maximum memory space required by the algorithm.

The complexity of an algorithm, f(n), gives the running time and/or the storage space required by the algorithm in terms of n, the size of the input data.

Space Complexity

The space complexity of an algorithm represents the amount of memory space required by the algorithm during its life cycle.
The space required by an algorithm is equal to the sum of the following two components:

A fixed part, which is the space required to store data and variables that are independent of the size of the problem. Examples: simple variables and constants used, program size, etc.

A variable part, which is the space required by variables whose size depends on the size of the problem. Examples: dynamic memory allocation, recursion stack space, etc.

The space complexity S(P) of any algorithm P is S(P) = C + S_P(I), where C is the fixed part and S_P(I) is the variable part of the algorithm, which depends on the instance characteristic I. The following simple example illustrates the concept.

Algorithm: SUM(A, B)
Step 1 - START
Step 2 - C ← A + B + 10
Step 3 - Stop

Here we have three variables (A, B, and C) and one constant. Hence S(P) = 1 + 3. The actual space depends on the data types of the variables and the constant, so each count is multiplied by the size of the corresponding type.

Time Complexity

The time complexity of an algorithm represents the amount of time required by the algorithm to run to completion. Time requirements can be defined as a numerical function T(n), where T(n) is measured as the number of steps, provided each step takes constant time.

For example, the addition of two n-bit integers takes n steps. Consequently, the total computational time is T(n) = c * n, where c is the time taken to add two bits. Here, we observe that T(n) grows linearly as the input size increases.

Notations for the Growth Rates of Functions

When analyzing the efficiency of algorithms, it is common to express the growth rates of functions using asymptotic notation. The three most common notations are Big-O, Omega, and Theta. These notations describe different types of behavior of functions as their input sizes approach infinity.

1. Big-O Notation (O):
- Denoted as O(f(n)), where f(n) is a function.
- Represents an upper bound on the growth rate of a function.
- If a function g(n) is O(f(n)), it means that, for sufficiently large n, g(n) is at most a constant times f(n).
Example: If an algorithm has a time complexity of O(n^2), its running time grows at most quadratically with the input size.

2. Omega Notation (Ω):
- Denoted as Ω(f(n)), where f(n) is a function.
- Represents a lower bound on the growth rate of a function.
- If a function g(n) is Ω(f(n)), it means that, for sufficiently large n, g(n) is at least a constant times f(n).
Example: If an algorithm has a time complexity of Ω(n), its running time grows at least linearly with the input size.

3. Theta Notation (Θ):
- Denoted as Θ(f(n)), where f(n) is a function.
- Represents both an upper and a lower bound on the growth rate of a function.
- If a function g(n) is Θ(f(n)), it means that, for sufficiently large n, g(n) lies between two constant multiples of f(n), i.e. it grows at the same rate as f(n).
Example: If an algorithm has a time complexity of Θ(n log n), its running time grows in the same order of magnitude as n log n.

These notations are useful for expressing the efficiency of algorithms in terms of how they scale with input size, without getting bogged down in the exact details of the function. They help algorithm designers and analysts communicate the performance characteristics of algorithms concisely.

Procedure and Recursion

A procedure is a sequence of steps or actions to be executed in a specific order, while recursion is a programming concept in which a function calls itself.

Procedure:

In the context of data structures and algorithms, a procedure usually refers to a set of steps or operations that perform a specific task.
It is a series of well-defined steps that must be carried out in a particular order to achieve a specific goal. Procedures are often used in algorithms, where each step contributes to solving a larger problem.

For example, consider a procedure to sort an array using the bubble sort algorithm. The steps of the procedure involve comparing and swapping adjacent elements until the entire array is sorted.

Recursion:

Recursion is a programming concept in which a function calls itself in its own definition. This can be a powerful and elegant way to solve certain problems. In the context of data structures, recursion is often used with tree-like structures.

For example, consider a binary tree. A recursive algorithm for traversing the tree calls the same algorithm on the left and right subtrees. The base case of the recursion is typically reached at an empty subtree, that is, just below a leaf node.

    #include <stdio.h>

    int sum(int k);

    int main() {
        int result = sum(10);
        printf("%d", result);
        return 0;
    }

    /* Returns k + (k - 1) + ... + 1, or 0 when k is not positive. */
    int sum(int k) {
        if (k > 0) {
            return k + sum(k - 1);
        } else {
            return 0;
        }
    }

Example Explained

When the sum() function is called, it adds the parameter k to the sum of all positive integers smaller than k and returns the result. When k becomes 0, the function just returns 0. When running, the program follows these steps:

    10 + sum(9)
    10 + ( 9 + sum(8) )
    10 + ( 9 + ( 8 + sum(7) ) )
    ...
    10 + 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 + sum(0)
    10 + 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 + 0

Output

    55

Array

An array is a collection of data items of the same type stored at contiguous memory locations. It is one of the simplest data structures, and each data element can be accessed directly (random access) by using its index number. In C programming, arrays are derived data types that can store primitive types of data such as int, char, double, float, etc.
For example, if we want to store the marks of a student in 6 subjects, we don't need to define a separate variable for the marks in each subject. Instead, we can define an array that stores the marks of all subjects at contiguous memory locations.

Properties of array

Some of the properties of an array are listed as follows:

o Each element in an array is of the same data type and therefore has the same size (for example, 4 bytes for an int on most modern platforms).
o Elements of the array are stored at contiguous memory locations, with the first element stored at the smallest memory address.
o Elements of the array can be randomly accessed, since the address of each element can be calculated from the base address and the size of a data element.

Representation of an array

An array can be represented in various ways in different programming languages. As an illustration, consider the declaration of an array in the C language.

[Figure: declaration of an array in C]

Based on the illustration, the following points are important:

o The index starts at 0.
o The array's length is 10, which means it can store 10 elements.
o Each element in the array can be accessed via its index.

Why are arrays required?

Arrays are useful because:

o Sorting and searching for a value in an array is easier.
o Arrays are well suited to processing multiple values quickly and easily.
o Arrays are good for storing multiple values in a single variable. In computer programming, many cases require storing a large number of data items of a similar type. Without arrays, we would need to define a large number of variables, and it would be very difficult to remember all of their names while writing the program. Instead of giving every value its own variable name, it is better to define an array and store all the elements in it.
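The marks example above can be sketched in C as follows. This is only an illustration: the mark values are made up, and the names SUBJECTS and total_marks are hypothetical helpers introduced here, not part of the original notes.

```c
#include <assert.h>
#include <stdio.h>

#define SUBJECTS 6 /* hypothetical constant: number of subjects */

/* Hypothetical helper: adds up the first `count` marks in the array.
   A single array replaces six separately named variables. */
int total_marks(const int marks[], int count) {
    assert(count >= 0); /* a negative count would make no sense */
    int total = 0;
    for (int i = 0; i < count; i++) {
        total += marks[i]; /* each element is reached through its index */
    }
    return total;
}

/* Hypothetical helper: prints every mark with its zero-based index. */
void print_marks(const int marks[], int count) {
    for (int i = 0; i < count; i++) {
        printf("marks[%d] = %d\n", i, marks[i]);
    }
}
```

For instance, with int marks[SUBJECTS] = {56, 78, 88, 76, 56, 89}; the call total_marks(marks, SUBJECTS) returns 443. One loop handles all six values, which is the main convenience the list above describes.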
Memory allocation of an array

As stated above, all the data elements of an array are stored at contiguous locations in main memory. The name of the array represents the base address, that is, the address of the first element in main memory. Each element of the array is identified by its index.

The indexing of an array can be defined in the following ways:

1. 0 (zero-based indexing): The first element of the array is arr[0].
2. 1 (one-based indexing): The first element of the array is arr[1].
3. n (n-based indexing): The first element of the array can reside at any chosen index number.

[Figure: memory allocation of an array arr of size 5]

The figure shows the memory allocation of an array arr of size 5. The array follows 0-based indexing. The base address of the array is 100, which is the address of arr[0]. Here, the size of the data type used is 4 bytes; therefore, each element takes 4 bytes in memory, and arr[1] is stored at address 104.

How to access an element from the array?

We require the following information to access any element of the array:

o Base address of the array.
o Size of an element in bytes.
o Type of indexing the array follows.

The formula to calculate the address of an array element is:

Byte address of element A[i] = base address + size * (i - first index)

Here, size is the memory taken by one element of the array's data type. In C these sizes depend on the implementation: for instance, int takes 2 bytes on some older 16-bit compilers and 4 bytes on most modern platforms, while float typically takes 4 bytes.

We can understand the formula with the help of an example. Suppose an array A[-10 ..... +2] has base address (BA) = 999 and element size = 2 bytes. Find the location of A[-1].

L(A[-1]) = 999 + 2 x [(-1) - (-10)]
         = 999 + 2 x 9
         = 999 + 18
         = 1017

Basic operations

Now, let's discuss the basic operations supported by an array:

o Traversal - This operation visits each element of the array in turn, for example to print the elements.
o Insertion - It is used to add an element at a particular index.
o Deletion - It is used to delete an element from a particular index.
o Search - It is used to search for an element by its index or by its value.
o Update - It updates an element at a particular index.

2D Array

A 2D array can be defined as an array of arrays. It is organized as a matrix, which can be represented as a collection of rows and columns. 2D arrays are often used to implement table-like data structures that resemble a relational database table. They make it easy to hold a bulk of data at once, which can then be passed to any number of functions wherever required.

How to declare 2D Array

The syntax for declaring a two-dimensional array is very similar to that of a one-dimensional array:

    int arr[max_rows][max_columns];

It produces a data structure that looks like the following.

[Figure: a two-dimensional array with its elements organized in rows and columns]

The figure shows a two-dimensional array whose elements are organized in rows and columns. The first element of the first row is represented by a[0][0], where the first index is the row number and the second index is the column number.

How do we access data in a 2D array

The elements of a 2D array can be randomly accessed. As with one-dimensional arrays, we can access an individual cell of a 2D array by using its indices. Two indices are attached to each cell: one is its row number and the other is its column number.

We can store the value of any particular cell of a 2D array in a variable x by using the following syntax:

    int x = a[i][j];

where i and j are the row and column numbers of the cell, respectively.

We can set each cell of an n x n 2D array a to 0 by using the following code:

    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < n; j++)
        {
            a[i][j] = 0;
        }
    }

Initializing 2D Arrays

When we declare and initialize a one-dimensional array in C at the same time, we don't need to specify the size of the array. However, this does not work with 2D arrays: we have to specify at least the second dimension of the array.

The syntax to declare and initialize a 2D array is as follows:

    int arr[2][2] = {0,1,2,3};

The number of elements that can be present in a 2D array is always equal to (number of rows * number of columns).
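The declaration, access and initialization rules above can be put together in a short C sketch. It assumes a fixed 2 x 2 array; the names ROWS, COLS, fill and sum_cells are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 2 /* hypothetical constants for a 2 x 2 array */
#define COLS 2

/* Hypothetical helper: sets every cell to `value`, visiting the array
   row by row. The second dimension (COLS) must be part of the type. */
void fill(int a[ROWS][COLS], int value) {
    assert(a != NULL);
    for (int i = 0; i < ROWS; i++) {
        for (int j = 0; j < COLS; j++) {
            a[i][j] = value; /* i is the row number, j the column number */
        }
    }
}

/* Hypothetical helper: adds up all ROWS * COLS cells. */
int sum_cells(int a[ROWS][COLS]) {
    assert(a != NULL);
    int total = 0;
    for (int i = 0; i < ROWS; i++) {
        for (int j = 0; j < COLS; j++) {
            total += a[i][j];
        }
    }
    return total;
}
```

With int arr[ROWS][COLS] = {{0, 1}, {2, 3}}; (the nested braces store the same values as the flat initializer {0,1,2,3}), arr[1][0] holds 2 and sum_cells(arr) returns 6. The expression sizeof(arr) / sizeof(arr[0][0]) evaluates to 4, which matches number of rows * number of columns.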