Data Structures
Chapter 1: Introduction to Data Structures
Asmelash Girmay
Department of Information Technology

The Need for Data Structures
• A data structure helps us understand how one data element relates to another and how to organize the elements in memory.
• Example: months of a year
  • Each month can be linked to the one that follows it.
  • Thus, knowing the first and last month names helps to find the other months.
• A specific department of a university can be represented as a tree.
  • Example: IT

2018-10-30 IT3201 Data Structures

Data Representation
• Various methods are used to represent data in computers.
• Example: hierarchical data representation
  • Bit: the basic unit of data representation
  • Byte: a combination of 8 bits
  • In ASCII code representation, 1 byte represents a character.
  • One or more characters form a string.
• String: a data structure that emerges through several layers of data structures.

Integer Representation
• Int: $-2^{n-1}$ to $2^{n-1} - 1$ using n bits
• A non-negative int is represented using the binary number system.
• Each bit position represents a power of 2, where the rightmost bit represents $2^0 = 1$.
• Example: 00100110 represents $2^1 + 2^2 + 2^5 = 2 + 4 + 32 = 38$.
• A negative int is represented using one's complement or two's complement.
• One's complement inverts each bit, e.g., -38 = 11011001.
• Two's complement adds 1 to the one's complement of a number, e.g., -38 = 11011010.

Real Number Representation
• Float: $m \times n^r$, ranging from $-3.4 \times 10^{38}$ to $3.4 \times 10^{38}$
• A real number is represented by a mantissa and an exponent. E.g.,
• 43.56 can be represented as $4356 \times 10^{-2}$, where 4356 is the mantissa and -2 is the exponent.
• Both the mantissa and the exponent are stored as two's-complement binary integers. E.g.,
• 4356 = 0001 0001 0000 0100, and -2 = 1111 1110 (8-bit representation)
• 43.56 = 0001 0001 0000 0100 1111 1110

Character Representation
• Char: [0-9, a-z, A-Z] + other symbols = $2^n$ characters for n bits
• Codes used to represent characters include BCD, EBCDIC, and ASCII.
• ASCII is commonly used.
• If 8 bits are used to represent a character, $2^8 = 256$ characters can be represented.
• E.g., if the character 'A' is represented by 0100 0001 and 'B' by 0100 0010, then the string "AB" can be represented by 0100 0001 0100 0010.

Abstract Data Types (ADTs)
• An ADT is a mathematical model of data objects.
• It specifies the logical and mathematical properties of a data type and its operations.
• An ADT is useful as a guideline for implementing a data type.
• An ADT contains no implementation.
• Two steps in specifying an ADT:
  1. Describe the way in which the components are related to each other.
  2. State the operations that can be performed on that data type.

Abstract Data Types (ADTs)...
• Example: to implement the integer data type in C, INTEGER_ADT defines:
  • The range of numbers that will be represented
  • The formats for storing the integer numbers
  • The operations on it, such as addition, subtraction, division, multiplication, and modulo

Data Type
• A method of interpreting a bit pattern.
• It is the implementation of the mathematical model specified by an ADT.
• It is an internal representation of data in memory.
• Once the ADT specification is done, a data type can be implemented in:
  • Hardware: circuitry that is part of the computer
  • Software: using the existing hardwired instructions

Data Structure Types

Review on C
This course uses C to implement data structures.

Function
• A section of a program written separately from the main program for:
  • Reusability, readability, and/or modularity of the software.
• A function is a way of modularizing and organizing a program.
• Functions are central to procedure-oriented languages such as C.
• A function in C can be defined in 4 ways:
  • With a return type and argument(s) [WRWA]
  • With a return type but no arguments [WRNA]
  • With no return type (void) but with arguments [NRWA]
  • With no return type (void) and no arguments [NRNA]
KEY: R: return type, A: argument list, W: with, N: no

Function…

Array
• An array is a derived data type, built from basic data types such as int, float, and char.
• It is used to store more than one data item of the same type at a time. E.g.,
  • float StudentsGrade[57];
• It stores data in indexed form, where indexing starts at 0 (zero).
• It is also a built-in data structure that stores a fixed-size sequential collection of elements of the same type.
• There are one-dimensional, two-dimensional, and three-or-higher-dimensional arrays.

Structures
• A structure is a user-defined derived data type that lets you combine data items of different types.
• It is used to store or represent a record, as in a database table. E.g.,
• Attributes of a book:
  • Title,
  • Author,
  • Subject,
  • BookId

Structures…
• Exercise: Given a record with attributes, Students {name, gender, department, idNumber}.
Construct a structure type for students and use it in your main program.
• A self-referential structure is a special type of structure that points to itself (it contains a pointer to a structure of its own type).
• It is important in creating linked lists, dynamic stacks and queues, trees, and other data structures.

Pointers
• A pointer is a type that stores the location (address) of another variable of any data type.
• In C, some tasks are easier to perform with pointers.
• Tasks like dynamic memory management cannot be performed without pointers.
• Example: int A; int *ptrA = &A;

Memory Management
• Memory space is limited and should be managed dynamically.
• In C, memory management is done using
  • Pointers, and
  • Standard library functions such as calloc, free, malloc, and realloc

Conditional statements
• In C, conditional statements define conditional actions, e.g., if there is no class, work on your homework; otherwise, go to class.
• Conditions are defined using
  • if
  • if...else
  • if...else if...else
  • switch

Loops
• In C, loops are used to program repetitive actions.
• Loops are defined using one of the following; let's say: int i = 0, n = 10;
• For loop:
  • for (i = 0; i < n; i++) { /* statements */ }
• While loop:
  • while (i < n) { /* statements */ i++; }
• Do...while loop:
  • do { /* statements */ i++; } while (i < n);

Analysis of Algorithm
An algorithm can be analyzed by tracing all of its instructions step by step.

Algorithm
• An algorithm should be checked for
  • Correctness
  • Simplicity
  • Performance
• Algorithm performance analysis and measurement depend on:
  • Space complexity
  • Time complexity

Algorithm: Space Complexity
• The space complexity of an algorithm is the amount of memory it needs to run to completion.
• Space needed by a program:
  • Instruction space: to store the executable program.
It is fixed in size.
  • Data space: to store all constants and variable values:
    • For constants and simple variables: fixed
    • For fixed-size structured variables such as arrays and structures
    • Dynamically allocated space

Algorithm: Space Complexity…
  • Environment stack space: to store information about suspended functions
• When a function is invoked, the following data is stored:
  • Return address
  • Values of all local variables and formal parameters of the function being invoked
• The amount of space used by recursive functions is called the recursion stack space.
• This space depends on:
  • The size of the local variables
  • The depth of the recursion

Algorithm: Time Complexity
• The time complexity is the amount of time a program needs to run to completion.
• The exact time depends on:
  • Implementation of the algorithm
  • Programming language
  • Compiler used
  • CPU speed
  • Other hardware characteristics
• Counting all operations performed in the algorithm helps in estimating the time.

Algorithm: Time Complexity…
• Time complexity is expressed as a function of the input size.
• The analysis of an algorithm depends on
  • The input data
• Three cases of an algorithm
  • Best case: best possible input data
  • Average case: typical input data
  • Worst case: worst input configuration

Algorithm: Big “OH” Notation
• A characterization scheme that measures properties of an algorithm: complexity, performance, and memory requirements.
• Complexity can be measured by eliminating constant factors.
• Thus, the complexity function f(n) of an algorithm increases as n increases.
• E.g., consider a sequential search algorithm
• Suppose an array contains n elements.
• Worst case: the search compares every element with the target, thus f(n) = n
• Average case: the target element is found halfway, thus f(n) = n/2
• Best case: the target element is found in the first position, thus f(n) = 1

Algorithm: Big “OH” Notation...
• f(n) = O(n) is read as "f of n is big Oh of n" or "f(n) is of the order of n".
• The total running time (or time complexity) includes the initializations and the statements executed on each iteration of the loop.
• Based on the time complexity representation of the big Oh notation, an algorithm can be:

Limitation of Big “OH” Notation
• It makes no effort to improve the programming methodology. Big Oh notation does not discuss ways and means to improve the efficiency of a program, but it helps to analyze and calculate that efficiency (by finding the time complexity).
• It does not reveal the effect of the constants. For example, suppose one algorithm takes $1000n^2$ time to execute and another takes $n^3$. The first algorithm is $O(n^2)$, which implies that it will take less time than the other algorithm, which is $O(n^3)$. However, in actual execution the second algorithm will be faster for n < 1000, since $n^3 < 1000n^2$ exactly when n < 1000.

Recursion
"A recursion routine is one whose design includes a call to itself"

Recursion
• Recursion is a powerful technique for defining an algorithm.
• A procedure is recursive if it is defined in terms of itself.
• Example:
  • Factorial: f(n) = n * f(n-1) for n > 0, where f(0) = 1.
  • Fibonacci numbers: f(n) = f(n-1) + f(n-2) for n >= 2, where f(0) = 0 and f(1) = 1.
Recursion…
• Factorial:
  • Factorial(n)
  • if n == 0 or n == 1, return 1;
  • else return n * Factorial(n-1);
• Fibonacci:
  • Fibonacci(n)
  • if n <= 1, return n;
  • else return Fibonacci(n-1) + Fibonacci(n-2);

Principle of Recursion
• When designing an algorithm with recursion, the following basic principles should be considered.
  1. Find the key step: then find out whether the remaining problem can be solved the same way
  2. Find a stopping rule: once a substantial part of the algorithm is done, the execution should stop
  3. Outline the algorithm: combine principles 1 and 2 with if...else statements and recursion to design the algorithm
  4. Check termination: the recursion should terminate after a finite number of steps
  5. Draw a recursion tree: to analyze a recursive algorithm, draw its recursion tree

Recursion vs Loop
• A loop is used when we want to execute a part of the program or a block of statements several times.
• A recursive function is a function that calls itself from its own body, repeatedly, until a stopping rule is reached.