01. Introduction to Data Structures

Data Structures
Chapter – 1 Introduction to Data Structures
Asmelash Girmay
Department of Information Technology
The Need of Data Structure
• Helps to understand the relationship between one data element and another, and to organize the data within memory. Example:
• Months of a year: each month can be linked to the one after it
• Thus, knowing the first and last month names is enough to reach all the other months
• A specific department of a university can be represented as a tree
• Example: IT
2018-10-30
IT3201 Data Structures
2
Data Representation
• Various methods are used to represent data in Computers
• Example: hierarchical data representation
• Bit – basic unit of data representation
• Byte – combination of 8 bits
• In ASCII code representation 1 byte represents a character
• One or more characters are used to form a string
• String: a data structure that emerges through several layers of data structures.
Integer Representation
• Int: $-2^{n-1}$ to $2^{n-1} - 1$ using n bits
• Non-negative integers are represented using the binary number system
• Each bit position represents a power of 2, where the rightmost bit represents $2^0 = 1$
• Example: 00100110 represents $2^1 + 2^2 + 2^5 = 2 + 4 + 32 = 38$.
• Negative integers are represented using one’s complement or two’s complement
• One’s complement inverts each bit, e.g. -38 = 11011001
• Two’s complement adds 1 to the one’s complement of a number, e.g. -38 = 11011010
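The 8-bit example for 38 and -38 can be checked with a short C sketch. The function names `ones_complement`, `twos_complement`, and `to_binary8` are illustrative choices, not part of the slides, and the width is fixed at 8 bits to match the example.

```c
#include <stdint.h>

/* One's complement: flip every bit of an 8-bit value. */
uint8_t ones_complement(uint8_t x) {
    return (uint8_t)~x;
}

/* Two's complement: one's complement plus 1 (wraps around at 8 bits). */
uint8_t twos_complement(uint8_t x) {
    return (uint8_t)(ones_complement(x) + 1u);
}

/* Write the 8 bits of x, most significant bit first, into out.
   out must have room for 9 characters (8 digits plus the terminator). */
void to_binary8(uint8_t x, char out[9]) {
    for (int i = 0; i < 8; i++) {
        out[i] = (x & (0x80u >> i)) ? '1' : '0';
    }
    out[8] = '\0';
}
```

Calling `to_binary8(twos_complement(38), buf)` fills `buf` with the same bit pattern the slide gives for -38 in two's complement.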
Real Number Representation
• Float: $m \times n^r$, with a range of about $-3.4 \times 10^{38}$ to $3.4 \times 10^{38}$
• A real number is represented by a mantissa and an exponent. E.g.,
• 43.56 can be represented as $4356 \times 10^{-2}$, where 4356 is the mantissa and -2 is the exponent
• Both the mantissa and the exponent are stored as two’s complement binary integers. E.g.,
• 4356 = 0001 0001 0000 0100 (16-bit representation), and -2 = 1111 1110 (8-bit representation)
• 43.56 = 0001 0001 0000 0100 1111 1110
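The mantissa/exponent scheme above can be sketched in C as follows. The `DecimalReal` type and both function names are hypothetical, and the base-10 layout mirrors the slide's teaching example only. C's own `float` uses a binary format (IEEE 754 on most platforms), so this is not how a compiler stores 43.56.

```c
#include <stdint.h>

/* Hypothetical record: value = mantissa * 10^exponent. */
typedef struct {
    int16_t mantissa;  /* e.g. 4356, a 16-bit two's complement integer */
    int8_t  exponent;  /* e.g. -2, an 8-bit two's complement integer */
} DecimalReal;

/* Rebuild the real value by scaling the mantissa by 10^exponent. */
double decimal_real_value(DecimalReal r) {
    double value = r.mantissa;
    for (int e = r.exponent; e > 0; e--) value *= 10.0;
    for (int e = r.exponent; e < 0; e++) value /= 10.0;
    return value;
}

/* Concatenate the 16 mantissa bits and the 8 exponent bits into 24 bits. */
uint32_t pack_decimal_real(DecimalReal r) {
    return ((uint32_t)(uint16_t)r.mantissa << 8) | (uint8_t)r.exponent;
}
```

For the mantissa 4356 and exponent -2, `pack_decimal_real` yields 0x1104FE, which is the 24-bit pattern 0001 0001 0000 0100 1111 1110.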
Character Representation
• Char: [0-9, a-z, A-Z] + other symbols = $2^n$ characters, for n bits
• Codes used to represent characters include BCD, EBCDIC, and ASCII
• ASCII is commonly used
• If 8 bits are used to represent a character, $2^8 = 256$ characters can be represented
• E.g., if the character ‘A’ is represented with 0100 0001 and ‘B’ with 0100 0010, then “AB” can be represented with 0100 0001 0100 0010, which is a string
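To see the bit patterns behind a string, a small C sketch can print each character as 8 bits. The helper name `string_to_bits` is an illustrative choice, and the result matches the slide only on a system whose character set is ASCII-compatible, which is assumed here.

```c
#include <stddef.h>

/* Write the 8-bit pattern of every character in s into out, with one
   space between characters. out must hold at least 9 * strlen(s) + 1
   characters. Assumes 8-bit chars. */
void string_to_bits(const char *s, char *out) {
    size_t k = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        if (i > 0) {
            out[k++] = ' ';
        }
        for (int b = 7; b >= 0; b--) {
            out[k++] = ((c >> b) & 1u) ? '1' : '0';
        }
    }
    out[k] = '\0';
}
```

On an ASCII system, `string_to_bits("AB", buf)` produces `01000001 01000010`.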
Abstract Data Types (ADTs)
• An ADT is a mathematical model of data objects
• It specifies the logical and mathematical properties of a data type and its operations
• An ADT is useful as a guideline for implementing a data type.
• An ADT contains no implementation
• Two steps in defining an ADT
1. Description of the way in which the components are related to each other.
2. Statement of the operations that can be performed on that data type.
Abstract Data Types (ADTs)...
• Example: in C, to implement the integer data type, INTEGER_ADT defines:
• Range of numbers that will be represented
• Formats for storing the integer numbers
• Operations on it, such as addition, subtraction, division, multiplication, modulo
Data Type
• A method of interpreting a bit pattern.
• It’s the implementation of the mathematical model specified by an ADT.
• It is an internal representation of data in the memory.
• Once ADT specification is done, a data type can then be implemented in:
• Hardware – circuitry as part of a computer
• Software – using the existing hardwired instructions
Data Structure Types
Review on C
This course uses C to implement data structures
Function
• A section of a program, written separately from the main program, for:
• Reusability, readability, and/or modularity of the software.
• A function is a way of modularizing and organizing a program.
• Functions are central to procedure-oriented programming languages such as C.
• We can define a function in C in 4 ways.
• A function – with return type and argument/s [WRWA]
• A function – with return type but no arguments [WRNA]
• A function – with no return type but with arguments [NRWA]
• A function – with no return type and no arguments [NRNA]
KEY: R – return type, A – argument list, W – with, N – no
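The four forms can be illustrated with one small function each. The function names and bodies below are made up for illustration; only the combinations of return type and arguments come from the list above.

```c
#include <stdio.h>

/* WRWA: with return type, with arguments */
int add(int a, int b) {
    return a + b;
}

/* WRNA: with return type, no arguments */
int default_size(void) {
    return 10;
}

/* NRWA: no return type, with arguments */
void print_sum(int a, int b) {
    printf("%d\n", a + b);
}

/* NRNA: no return type, no arguments */
void print_banner(void) {
    printf("IT3201 Data Structures\n");
}
```

In C, a function with "no return type" is declared `void`, and an empty argument list is best written as `(void)` so the compiler rejects calls that pass arguments.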
Function…
Array
• An array is a derived data type, built from basic data types such as int, float, char, etc.
• It is used to store more than one data item of the same type at a time. E.g.,
• float StudentsGrade[57];
• It stores data in indexed form, where indexing starts at 0 (zero).
• It is also a data structure, but a built-in one that stores a fixed-size sequential collection of elements of the same type.
• Arrays can be one-dimensional, two-dimensional, or of three or more dimensions.
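A brief sketch of one-dimensional and two-dimensional array use, assuming illustrative function names (`average_grade`, `sum_matrix`) and a fixed 2 x 3 shape for the second example:

```c
#include <stddef.h>

/* Average of the first count grades; returns 0 for an empty array. */
float average_grade(const float grades[], size_t count) {
    float sum = 0.0f;
    for (size_t i = 0; i < count; i++) {  /* valid indexes are 0 .. count-1 */
        sum += grades[i];
    }
    return count > 0 ? sum / (float)count : 0.0f;
}

/* Sum of every element of a 2 x 3 two-dimensional array. */
int sum_matrix(int m[2][3]) {
    int total = 0;
    for (int row = 0; row < 2; row++) {
        for (int col = 0; col < 3; col++) {
            total += m[row][col];
        }
    }
    return total;
}
```

Because C arrays do not carry their length, the element count is passed alongside the one-dimensional array.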
Structures
• A structure is a user-defined derived data type which allows you to
combine data items of different kinds/types.
• Used to store/represent a record as in a database table. E.g.,
• Attributes of a book:
• Title,
• Author,
• Subject,
• BookId
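The book record above might be declared as shown in this sketch. The field sizes and the helper `make_book` are assumptions for illustration; the slides only name the four attributes.

```c
#include <stdio.h>

struct Book {
    char title[50];     /* field sizes are arbitrary choices */
    char author[50];
    char subject[100];
    int  bookId;
};

/* Build a Book record; strings longer than a field are truncated. */
struct Book make_book(const char *title, const char *author,
                      const char *subject, int bookId) {
    struct Book b;
    snprintf(b.title, sizeof b.title, "%s", title);
    snprintf(b.author, sizeof b.author, "%s", author);
    snprintf(b.subject, sizeof b.subject, "%s", subject);
    b.bookId = bookId;
    return b;
}
```

Members are accessed with the dot operator, for example `b.title` or `b.bookId`.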
Structures…
• Exercise: Given a record Students with the attributes {name, gender, department, idNumber}, construct a structure type for students and use it in your main program.
• A self-referential structure is a special type of structure that points to itself.
• It’s important in creating a linked list, dynamic stack and queue, tree, and other data structures.
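A structure that points to itself is usually written with a member that is a pointer to the same structure type. The sketch below uses it to build a singly linked list; the names `Node`, `push_front`, `list_length`, and `free_list` are illustrative.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;  /* pointer to another struct Node */
};

/* Allocate a node and place it at the front of the list.
   Returns the new head, or the unchanged head if allocation fails. */
struct Node *push_front(struct Node *head, int value) {
    struct Node *node = malloc(sizeof *node);
    if (node == NULL) {
        return head;
    }
    node->data = value;
    node->next = head;
    return node;
}

/* Count the nodes by following next pointers until NULL. */
int list_length(const struct Node *head) {
    int length = 0;
    for (; head != NULL; head = head->next) {
        length++;
    }
    return length;
}

/* Release every node in the list. */
void free_list(struct Node *head) {
    while (head != NULL) {
        struct Node *next = head->next;
        free(head);
        head = next;
    }
}
```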
Pointers
• A pointer is a type that stores the location (memory address) of another variable of any data type.
• In C, some tasks are easier to perform with pointers;
• Tasks like dynamic memory management cannot be performed without using pointers.
• Example: int A; int *ptrA = &A;
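A short sketch of what a pointer allows: a function can change a variable that belongs to its caller. The function names `add_to` and `swap_ints` are illustrative.

```c
/* Add delta to the int that p points to. */
void add_to(int *p, int delta) {
    *p += delta;  /* *p dereferences the pointer */
}

/* Exchange the values of two ints through their addresses. */
void swap_ints(int *x, int *y) {
    int tmp = *x;
    *x = *y;
    *y = tmp;
}
```

With `int A = 5; int *ptrA = &A;`, the call `add_to(ptrA, 2)` leaves `A` equal to 7, because `ptrA` holds the address of `A`.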
Memory Management
• Memory space is limited and should be dynamically managed.
• In C, memory management is done using
• Pointers, and
• Built-in functions such as calloc, free, malloc and realloc
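The four standard library functions can be sketched together as follows. The wrapper names (`make_squares`, `make_zeroed`, `resize_array`) are hypothetical; `malloc`, `calloc`, `realloc`, and `free` are declared in `<stdlib.h>`.

```c
#include <stdlib.h>

/* malloc: allocate an array holding 0*0, 1*1, ..., (n-1)*(n-1).
   Returns NULL on failure. The caller must free the result. */
int *make_squares(size_t n) {
    int *a = malloc(n * sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        a[i] = (int)(i * i);
    }
    return a;
}

/* calloc: allocate n ints, all initialized to zero. */
int *make_zeroed(size_t n) {
    return calloc(n, sizeof(int));
}

/* realloc: grow or shrink an array to new_n elements, keeping the old
   contents. On failure the old block stays valid and NULL is returned. */
int *resize_array(int *a, size_t new_n) {
    return realloc(a, new_n * sizeof *a);
}
```

Every block obtained from these functions should eventually be passed to `free` exactly once.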
Conditional statements
• In C, conditional statements are used to define conditional actions,
e.g., if no class, work on your homework. Otherwise, go to class.
• Conditions are defined using
• if…
• if...else
• if...else if…else
• switch
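The forms listed above might look like this in C. The function names, the grade thresholds, and the remark strings are invented for illustration; the first function encodes the class/homework example.

```c
/* if...else: the class/homework decision */
const char *plan(int has_class) {
    if (!has_class) {
        return "work on your homework";
    } else {
        return "go to class";
    }
}

/* if...else if...else: map a score to a letter (thresholds are made up) */
char letter_grade(int score) {
    if (score >= 90) {
        return 'A';
    } else if (score >= 80) {
        return 'B';
    } else {
        return 'C';
    }
}

/* switch: choose a branch by comparing one value against constants */
const char *grade_remark(char grade) {
    switch (grade) {
    case 'A':
        return "excellent";
    case 'B':
        return "good";
    default:
        return "keep practicing";
    }
}
```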
Loops
• In C, loops are used to program repetitive actions
• Loops defined using one of the following, let’s say: int i, n = 10;
• For loop:
• for (i = 0; i < n; i++) { /* statements */ }
• While loop:
• while (i < n) { /* statements */ i++; }
• Do…while loop:
• do { /* statements */ i++; } while (i < n);
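One way to compare the three loop forms is to count how many times each body runs. The function names below are illustrative; each returns the number of iterations for a given n.

```c
/* for loop: the condition is tested before every iteration */
int count_for(int n) {
    int i, runs = 0;
    for (i = 0; i < n; i++) {
        runs++;
    }
    return runs;
}

/* while loop: also tests the condition first */
int count_while(int n) {
    int i = 0, runs = 0;
    while (i < n) {
        runs++;
        i++;
    }
    return runs;
}

/* do...while loop: the body runs once before the condition is tested */
int count_do_while(int n) {
    int i = 0, runs = 0;
    do {
        runs++;
        i++;
    } while (i < n);
    return runs;
}
```

For n = 10 all three run 10 times. For n = 0 the `for` and `while` bodies never run, while the `do...while` body still runs once.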
Analysis of Algorithm
The algorithm can be analyzed by tracing all step-by-step instructions.
Algorithm
• Algorithm should be checked for
• Correctness
• Simplicity
• Performance
• Algorithm performance analysis and measurement depend on:
• Space complexity
• Time complexity
Algorithm: Space Complexity
• Space complexity of an algorithm is the amount of memory it needs
to run to completion.
• Space needed by a program:
• Instruction space: to run an executable program. It is fixed
• Data space: to store all constants and variable values…
• For constants and simple variables. Fixed
• For fixed structural variables such as array and structure
• Dynamically allocated space.
Algorithm: Space Complexity…
• Environment stack space: to store information of suspended functions
• When a function is invoked, the following data is stored:
• Return address
• Values of all local variables and formal parameters of the function being invoked
• The amount of space used by recursive functions is called recursive stack space.
• This space depends on:
• The size of each local variable
• The depth of the recursion
Algorithm: Time Complexity
• The amount of time a program needs to run to completion
• Exact time depends on:
• Implementation of the algorithm
• Programming language
• Compiler used
• CPU speed
• Other hardware characteristics
• Counting all operations performed in the algorithm will help in calculating time
Algorithm: Time Complexity…
• Time complexity has to be expressed in the form of functions
• Analysis of an algorithm depends on:
• The input data
• Three cases of an algorithm
• Best case – best possible input data
• Average case – typical input data
• Worst case – worst input configuration
Algorithm: Big “OH” Notation
• A characterization scheme that measures properties of an algorithm: complexity, performance, and memory requirements.
• Complexity can be measured by eliminating constant factors.
• Thus, the complexity function f(n) of an algorithm increases as ‘n’ increases.
• E.g., let’s consider a sequential search algorithm
• Suppose an array contains n elements.
• Worst case – the search compares every element with the target, thus f(n) = n
• Average case – the target element is found halfway through, thus f(n) = n/2
• Best case – the target element is found in the first position, thus f(n) = 1
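The three cases can be observed by counting comparisons in a sequential search. This is a minimal sketch; the function name and the `comparisons` output parameter are additions for illustration.

```c
#include <stddef.h>

/* Sequential search: returns the index of target in a, or -1 if absent.
   *comparisons receives the number of element comparisons made. */
int sequential_search(const int a[], size_t n, int target,
                      size_t *comparisons) {
    *comparisons = 0;
    for (size_t i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == target) {
            return (int)i;
        }
    }
    return -1;
}
```

Searching for the first element makes 1 comparison (best case). Searching for the last element, or for a value that is not present, makes n comparisons (worst case).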
Algorithm: Big “OH” Notation...
• f(n) = O(n) is read as “f of n is big Oh of n” or “f(n) is of the order of n”
• The total running time (or time complexity) includes the initializations and the other statements executed iteratively in the loop.
• Based on the time complexity representation of the big Oh notation, an algorithm can
be:
Limitation of Big “OH” Notation
• It contains no effort to improve the programming methodology. Big Oh notation does not discuss ways and means of improving the efficiency of a program, but it helps to analyze and calculate the efficiency (by finding the time complexity) of the program.
• It does not exhibit the potential of the constants. For example, suppose one algorithm takes $1000n^2$ time to execute and another takes $n^3$ time. The first algorithm is $O(n^2)$, which implies that it will take less time than the other algorithm, which is $O(n^3)$. However, in actual execution the second algorithm will be faster for n < 1000.
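The effect of the constant factor can be checked by evaluating both cost functions directly. The sketch below treats the two expressions from the example as abstract step counts; the function names are illustrative.

```c
/* Step count of the first algorithm: 1000 * n^2 */
unsigned long long cost_quadratic(unsigned long long n) {
    return 1000ULL * n * n;
}

/* Step count of the second algorithm: n^3 */
unsigned long long cost_cubic(unsigned long long n) {
    return n * n * n;
}

/* Smallest n in 1..limit at which the cubic cost exceeds the quadratic
   cost, or 0 if that never happens within the limit. */
unsigned long long first_n_cubic_slower(unsigned long long limit) {
    for (unsigned long long n = 1; n <= limit; n++) {
        if (cost_cubic(n) > cost_quadratic(n)) {
            return n;
        }
    }
    return 0;
}
```

The two costs are equal at n = 1000, so the cubic algorithm only becomes the slower one from n = 1001 onward, even though its Big Oh class is worse.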
Recursion
“A recursive routine is one whose design includes a call to itself”
Recursion
• Recursion is a powerful technique for defining an algorithm
• A procedure is recursive if it is defined in terms of itself.
• Example:
• Factorial: f(n) = n * f(n-1) for n > 0, where f(0) = 1.
• Fibonacci numbers: f(n) = f(n-1) + f(n-2) for n > 1, where f(0) = 0 and f(1) = 1.
Recursion…
• Factorial:
• Factorial(n)
• if n == 0 or n == 1 return 1;
• else return n * Factorial(n-1);
• Fibonacci:
• Fibonacci(n)
• if n <= 1 return n;
• else return Fibonacci(n-1) + Fibonacci(n-2);
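The pseudocode above translates almost line for line into C. This is a sketch under the assumption that `unsigned long long` is 64 bits wide, in which case `factorial` overflows for n greater than 20.

```c
/* n! computed recursively; the base case covers n == 0 and n == 1. */
unsigned long long factorial(unsigned int n) {
    if (n <= 1) {
        return 1;
    }
    return n * factorial(n - 1);
}

/* n-th Fibonacci number, with fibonacci(0) == 0 and fibonacci(1) == 1. */
unsigned long long fibonacci(unsigned int n) {
    if (n <= 1) {
        return n;
    }
    return fibonacci(n - 1) + fibonacci(n - 2);
}
```

The recursive `fibonacci` recomputes the same subproblems many times, so its running time grows exponentially with n. It is shown here for clarity, not efficiency.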
Principle of Recursion
• When designing an algorithm with recursion, the following basic principles should be considered.
1. Find the key step – then find out whether the remaining problem can be solved in the same way
2. Find a stopping rule – after a substantial part of the algorithm is done, the execution should stop
3. Outline the algorithm – combining the above two principles (1 and 2) with if…else statements and recursion, an algorithm should be designed
4. Check termination – the recursion should terminate after a finite number of steps
5. Draw a recursion tree – to analyze the recursive algorithm, one has to draw a recursion tree
Recursion vs Loop
• A loop is used when we want to execute a part of the program or a block of statements several times
• A recursive function is a function that calls itself from within its own body again and again.
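The same computation can often be written either way. As a minimal sketch, both functions below add the integers from 1 to n; the names `sum_loop` and `sum_recursive` are illustrative.

```c
/* Loop version: repeats a block of statements n times. */
unsigned int sum_loop(unsigned int n) {
    unsigned int total = 0;
    for (unsigned int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* Recursive version: the function calls itself with a smaller n
   until it reaches the stopping rule n == 0. */
unsigned int sum_recursive(unsigned int n) {
    if (n == 0) {
        return 0;
    }
    return n + sum_recursive(n - 1);
}
```

The loop uses a fixed amount of memory, while each recursive call adds a frame to the stack, so the recursive version needs space proportional to n.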
The End ☺