
DATA STRUCTURE AND ALGORITHM SEM II

MODULE 1
Introduction to Data Structures
Module Description
Data Structure is a way of collecting and organising data so that operations can be performed on it efficiently. Data Structures are about arranging data elements in terms of some relationship, for better organisation and storage. For example, consider a player's name "Shane" and age 26. Here "Shane" is of String data type and 26 is of integer data type. We can organise this data as a Player record, and then collect and store players' records in a file or database as a data structure. For example: "Mitchelle" 30, "Steve" 31, "David" 33.
In simple language, Data Structures are structures programmed to store ordered data, so that various operations can be performed on it easily.
Chapter 1.1
Basics of Data Structure
Chapter 1.2
Pointers and Recursion
Chapter Table of Contents
Chapter 1.1
Basics of Data Structure
Aim ......................................................................................................................................................... 1
Instructional Objectives....................................................................................................................... 1
Learning Outcomes .............................................................................................................................. 1
1.1.1 Introduction to Data Structures ............................................................................................... 2
Self-assessment Questions ......................................................................................................... 3
1.1.2 Classification of data structures ............................................................................................... 4
(i) Primitive data structure ........................................................................................................ 4
(ii) Non-primitive data structure.............................................................................................. 5
Self-assessment Questions ......................................................................................................... 7
1.1.3 Elementary Data Organization ................................................................................................. 8
1.1.4 Time and Space Complexity ................................................................................................... 10
(i) Asymptotic Notation .......................................................................................................... 15
Self-assessment Questions ....................................................................................................... 20
1.1.5 String Processing ...................................................................................................................... 21
Self-assessment Questions ....................................................................................................... 23
1.1.6 Memory Allocation .................................................................................................................. 24
(i) Static memory allocation .................................................................................................... 24
(ii) Dynamic memory allocation ............................................................................................ 24
Self-assessment Questions ....................................................................................................... 25
1.1.7 Accessing the address of a variable: Address of (&) operator ............................................ 26
Self-assessment Questions ....................................................................................................... 27
Summary ............................................................................................................................................. 28
Terminal Questions............................................................................................................................ 29
Answer Keys........................................................................................................................................ 30
Activity................................................................................................................................................. 31
Case Study: Exponentiation .............................................................................................................. 31
Bibliography ........................................................................................................................................ 33
e-References ........................................................................................................................................ 33
External Resources ............................................................................................................................. 33
Video Links ......................................................................................................................................... 33
Aim
To equip the students with the basic skills of using Data Structures in programs
Instructional Objectives
After completing this chapter, you should be able to:
• Describe data structures and their types
• Explain the items included in elementary data organisation
• Summarize the role of algorithms in programming
• Explain the procedure to calculate time and space complexities
• Explain string processing with its functions
• Demonstrate memory allocation and accessing the address of a variable
Learning Outcomes
At the end of this chapter, you are expected to:
• Outline the different types of data structures
• Elaborate asymptotic notations with examples
• Calculate the time and space complexities of any sorting algorithm
• List the string processing functions
• Summarize the contents of elementary data organisation
• Differentiate static and dynamic memory allocation
1.1.1 Introduction to Data Structures
Computers can store, retrieve and process vast amounts of data within their storage media. In order to work with a large amount of data, it is very important to organize it properly; if it is not, the data becomes difficult to access. Conversely, if the data is organized efficiently, any operation can be performed on it easily and quickly, which provides a faster response to the user.
This organization of data can be done with the help of data structures. Data structures enable
a programmer to properly structure large amounts of data into conceptually manageable
relationships. If we use a data structure to store the data, it becomes very easy to retrieve and
process them.
A data structure can be defined as a particular method of organizing a large amount of data so
that it can be used efficiently. A data structure can be used to store data in the form of a stack,
queue etc.
Any Data structure will follow these 4 rules:
1. It should be an agreement on how to store data in memory,
For example, data can be stored in an array, queue, linked list etc.
2. It should specify the operations we can perform on that data,
For example, we can specify add, delete, search operations on any data structure
3. It should specify the algorithms for those operations,
For example, efficient algorithms for searching element in array.
4. The algorithms used must be time and space efficient.
We have many primitive data types, such as integer, character and string, each of which stores a specific kind of data. Data structures allow us to perform operations on groups of data, such as adding an item to a list or searching for a particular element in a list. When a data structure provides such operations, we can call it an abstract data type (abbreviated as ADT).
A data structure is a form of abstract data type having its own set of data elements along with functions to perform operations on that data. Data structures allow us to manage large amounts of data efficiently so that it can be stored in large databases. They can also be used to design efficient sorting algorithms.
Every data structure has advantages and disadvantages, and each suits a specific problem domain depending on the type of operations and the data arrangement.
For example, an array data structure is well suited to read operations. We can use an array to store n elements in contiguous memory locations and read or add elements as and when required. Other data structures include linked lists, queues, stacks etc.
Self-assessment Questions
1) __________________ is NOT a component of a data structure.
a) Operations
b) Storage Structures
c) Algorithms
d) None of the above
2) Which of the following are true about the characteristics of abstract data types?
a) It exports a type
b) It exports a set of operations
c) It exports a set of elements
d) It exports a set of arrays
3) Which of the following need an array declaration not give, implicitly or explicitly?
a) The name of the array
b) The data type of the array
c) The first data from the set to be stored
d) The index set of the array
1.1.2 Classification of data structures
A data structure is the portion of memory allotted for a model, in which the required data can be arranged in an organised fashion.
There are certain linear data structures (e.g., stacks and queues) that permit the insertion and
deletion operations only at the beginning or at end of the list, not in the middle. Such data
structures have significant importance in systems processes such as compilation and program
control.
Types of data structures:
A data structure can be broadly classified into:
1. Primitive data structure
2. Non-primitive data structure
(i) Primitive data structure
Primitive data structures are those that are directly operated upon by machine-level instructions, i.e., the fundamental data types such as int, float and double in the case of C.
Primitive data types are used to represent single values:
• Integer: used to represent a number without a decimal point.
For example, 22, 80
• Float and Double: used to represent a number with a decimal point.
For example, 54.1, 57.8
• Character: used to represent a single character.
For example, 'L', 'g'
• String: used to represent a group of characters.
For example, "Hospital Management"
• Boolean: used to represent a logical value, either true or false.
(ii) Non-primitive data structure
A non-primitive data type is something else, such as an array, structure or class. The data types that are derived from primary data types are known as non-primitive data types. These data types are used to store groups of values.
The non-primitive data types are:
• Arrays
• Structures
• Unions
• Linked lists
• Stacks
• Queues etc.
Non-primitive data types are not defined by the programming language, but are instead created by the programmer. They are sometimes called "reference variables" or "object references", since they reference a memory location which stores the data.
There are two types of non-primitive data structures:
1. Linear data structures
2. Non-linear data structures
1. Linear Data Structure:
A list that shows the relationship of adjacency between elements is said to be a linear data structure. The simplest linear data structure is a 1-D array, but because of its deficiencies, a list is frequently used for different kinds of data.
A list is an ordered collection of data items connected by means of a link, or pointer. This type of list is also called a linked list. A linked list may be a singly linked list or a doubly linked list.
• Singly linked list: used to traverse the nodes in one direction.
• Doubly linked list: used to traverse the nodes in both directions.
A linked list is normally used to represent data in word-processing applications, and is also applied in different DBMS packages.
A list has two subsets. They are:
• Stack: also called a last-in-first-out (LIFO) system. It is a linear list in which insertion and deletion take place only at one end. It is used to evaluate different expressions.
• Queue: also called a first-in-first-out (FIFO) system. It is a linear list in which insertion takes place at one end and deletion takes place at the other end. It is generally used to schedule jobs in operating systems and networks.
2. Non-Linear data structure:
A list that does not show the relationship of adjacency between elements is said to be a non-linear data structure.
The frequently used non-linear data structures are:
• Trees: maintain a hierarchical relationship between various elements.
• Graphs: maintain a random, or point-to-point, relationship between various elements.
The Figure 1.1.1 shows the classification of data structures.
Figure 1.1.1: Types of data structures
Self-assessment Questions
4) Which of the following data structure is linear type?
a) Graph
b) Trees
c) Binary tree
d) Stack
5) Which of the following data structure is non-linear type?
a) Strings
b) Lists
c) Stacks
d) Graph
6) Which of the following data structures can't store non-homogeneous data
elements?
a) Arrays
b) Records
c) Pointers
d) Stacks
1.1.3 Elementary Data Organization
There are some basic terminologies related to Data Structures. They are detailed below:
• Data:
The term data means a value or set of values. These values may represent an observation, like
roll number of student, marks of student, name of the employee, address of person, phone
number, etc. In programming languages we generally express data in the form of variables with
variable name as per the type of data like integer, floating point, character, etc.
For example, figures obtained during exit polls, roll number of student, marks of student, name
of the employee, address of person, phone number etc.
• Data item:
A data item is a set of characters which are used to represent a specific data element. It refers
to a single unit of values. It is also called as Field.
For example, name of student in class is represented by data item say std_name.
The data item can be classified into two data types depending on usage:
1. Elementary data type: These data items can’t be further subdivided.
For example, roll number
2. Group data type: These data items can be further sub divided into elementary data
items. For example, Date can be divided into days, months, years
• Entity:
An entity is something that has a distinct, separate existence, though it need not be a material
existence. An entity has certain attributes, or properties, which may be assigned values. Values
assigned may be either numeric or non-numeric.
For example, Student is an entity. The possible attributes for student can be roll number, name
and date of birth, gender, and class. Possible values for these attributes can be 32, Alex,
24/09/2000, M, 11. In C language we usually use structures to represent an entity.
• Entity Set:
An entity set is a group of or set of similar entities.
For example, Consider a situation where we have multiple entities with same attributes, say,
students in B.Tech second year.
Here each student will have their respective roll numbers, names, marks obtained etc. All these
students can represent an entity set like an array of students.
• Information:
When data is processed by applying certain rules, the newly processed data is called information. Data by itself is not useful for decision making, whereas information is useful for decision making.
For example, the fact that a student has scored the maximum marks in a subject is information, as it has been processed and conveys a meaning.
• Record:
Record is a collection of field values of a given entity. Set of multiple records with same fields
form a file. Now a row in this file can be termed as a record. Every record has one or more than
one fields associated with it. Generally each record has at least one unique identifier field.
For example, roll number, name, address etc. of a particular student.
• File:
File is a collection of records of the entities in a given entity set.
For example, file containing records of students of a particular class.
Figure 1.1.2 shows the student file structure.
Figure 1.1.2: Structure of a File
• Key:
A key is one or more field(s) in a record that take(s) unique values and can be used to
distinguish one record from the others.
For example, in the above snapshot, ID is the key to identify particular record.
1.1.4 Time and Space Complexity
The running time of an algorithm depends on how long it takes a computer to run the lines of
code of the algorithm.
It depends on:
i. The speed of the computer
ii. The programming language
iii. The compiler that translates the programming language into code which runs directly on the computer
The complexity of an algorithm is a function describing the efficiency of the algorithm in terms
of the amount of data the algorithm must process. Usually there are natural units for the
domain and range of this function.
There are two main complexity measures of the efficiency of an algorithm:
• Time complexity is a function describing the amount of time an algorithm takes in terms of the amount of input to the algorithm. "Time" can mean:
i. The number of memory accesses performed
ii. The number of comparisons between integers
iii. The number of times some inner loop is executed
iv. Some other natural unit related to the amount of real time the algorithm will take
This idea of time is always kept separate from "wall clock" time, since many factors unrelated to the algorithm itself can affect the real time, such as:
i. The language used
ii. The type of computing hardware
iii. The proficiency of the programmer
iv. Optimization in the compiler
It turns out that, if the units are chosen wisely, all the other things do not matter, and thus an independent measure of the efficiency of the algorithm can be obtained.
• Space complexity is a function describing the amount of memory (space) an algorithm takes in terms of the amount of input to the algorithm. The requirement of "extra" memory is often determined by not counting the memory needed to store the input itself.
We use natural, but fixed-length, units to measure space complexity. We can use bytes, but it is easier to use units such as the number of integers or the number of fixed-size structures. In the end, the function we come up with will be independent of the actual number of bytes needed to represent the unit. Space complexity is sometimes ignored because the space used is minimal and/or obvious, but sometimes it becomes as important an issue as time.
For example, "This algorithm takes n² time", where n is the number of items in the input, or "This algorithm takes constant extra space", because the amount of extra memory needed does not vary with the number of items processed.
An array of n floating point numbers is to be put into ascending numerical order. This task is called sorting. One simple algorithm for sorting is selection sort: let an index i go from 0 to n-2, exchanging the i-th element of the array with the minimum of the elements from index i to n-1. Here are the iterations of selection sort carried out on the sequence {4 3 9 6 1 7 0}:
Index    |  0  1  2  3  4  5  6  | comments
initial  |  4  3  9  6  1  7  0  |
i=0      |  0  3  9  6  1  7  4  | swap 0, 4
i=1      |  0  1  9  6  3  7  4  | swap 1, 3
i=2      |  0  1  3  6  9  7  4  | swap 3, 9
i=3      |  0  1  3  4  9  7  6  | swap 6, 4
i=4      |  0  1  3  4  6  7  9  | swap 9, 6
i=5      |  0  1  3  4  6  7  9  | (done)
Here is a simple implementation in C:

int find_min_index (float [], int, int);
void swap (float [], int, int);

/* selection sort on array v of n floats */
void selection_sort (float v[], int n) {
    int i;
    /* for i from 0 to n-2, swap v[i] with the minimum
     * of the i'th to the (n-1)'th array elements
     */
    for (i = 0; i < n-1; i++)
        swap (v, i, find_min_index (v, i, n));
}

/* find the index of the minimum element of float array v from
 * indices start to end
 */
int find_min_index (float v[], int start, int end) {
    int i, mini;
    mini = start;
    for (i = start+1; i < end; i++)
        if (v[i] < v[mini]) mini = i;
    return mini;
}

/* swap i'th with j'th elements of float array v */
void swap (float v[], int i, int j) {
    float t;
    t = v[i];
    v[i] = v[j];
    v[j] = t;
}
Now the performance of the algorithm can be quantified: the amount of time and space taken in terms of n. It is interesting to note how the time and space requirements change as n grows large; sorting 10 items is trivial for almost any reasonable algorithm, but what about 1,000, 10,000, 1,000,000 or more items?
It is clear from this example that the amount of space needed is dominated by the memory consumed by the array. If we can store the array, we can sort it; that is, the algorithm takes constant extra space.
The main interest is in the amount of time the algorithm takes. One approach is to count the number of array accesses made during the execution of the algorithm; since each array access takes a certain (small) amount of time related to the hardware, this count is proportional to the time the algorithm takes.
We thus end up with a function of n that gives the number of array accesses for the algorithm. This function is called T(n), for Time.
T(n) is the total number of accesses made from the beginning of selection_sort until the end. selection_sort itself simply calls swap and find_min_index as i goes from 0 to n-2, so

T(n) = Σ (i = 0 to n-2) [time for swap + time for find_min_index(v, i, n)]

(The upper limit is n-2 because the for loop goes from 0 up to, but not including, n-1.) (Note: for those not familiar with sigma notation, the formula above just means "the sum, as we let i go from 0 to n-2, of the time for swap plus the time for find_min_index(v, i, n)".) The swap function makes four accesses to the array, so the function is now

T(n) = Σ (i = 0 to n-2) [4 + time for find_min_index(v, i, n)]

With respect to find_min_index, it makes two array accesses for each iteration through its for loop, and that loop runs n - i - 1 times:

T(n) = Σ (i = 0 to n-2) [4 + 2(n - i - 1)]

With some mathematical manipulation, this can be broken up into:

T(n) = 4(n-1) + 2n(n-1) - 2(n-1) - 2·Σ (i = 0 to n-2) i

(Everything is multiplied by n-1 because i runs from 0 to n-2, i.e., n-1 times.) Remembering that the sum of i as i goes from 0 to n is n(n+1)/2, then substituting n-2 for n and cancelling out the 2's:

T(n) = 4(n-1) + 2n(n-1) - 2(n-1) - (n-2)(n-1)

And to make a long story short,

T(n) = n² + 3n - 4

So this function gives the number of array accesses selection sort makes for a given array size, and thus an idea of the amount of time it takes. There are other factors affecting the performance, for instance the loop overhead, other processes running on the system, and the fact that access time to memory is not really constant. But this kind of analysis gives a good idea of the amount of time one will spend waiting, and allows these algorithms to be compared with other algorithms that have been analysed in a similar way.
(i) Asymptotic Notation
The function T(n) = n² + 3n - 4 (refer to the earlier section) describes precisely the number of array accesses made in the algorithm. In a sense, it is a little too precise; all we really need to say is n², since the lower order terms contribute almost nothing to the sum when n is large. We would like a way to justify ignoring those lower order terms and to make comparisons between algorithms easy, so asymptotic notation is used.
The worst-case complexity of the algorithm is the function defined by the maximum number
of steps taken on any instance of size n. It represents the curve passing through the highest
point of each column.
The best-case complexity of the algorithm is the function defined by the minimum number of
steps taken on any instance of size n. It represents the curve passing through the lowest point
of each column.
Finally, the average-case complexity of the algorithm is the function defined by the average
number of steps taken on any instance of size n.
• Lower Bound: A non-empty set A and its subset B are given with the relation ≤. An element a is called a lower bound of B if a ≤ x for all x ∈ B (read as: a is less than or equal to x for all x belonging to set B). For example, let A = {1,2,3,4,5,6} and B = {2,3}. The lower bounds of B are 1 and 2, as 1 and 2 in the set A are less than or equal to every element of B.
• Upper Bound: An element a is called an upper bound of B if x ≤ a for all x ∈ B. For example, with A = {1,2,3,4,5,6} and B = {2,3}, the upper bounds of B are 3, 4, 5 and 6, as each of them in the set A is greater than or equal to every element of B.
• Tight Bound: A bound (upper or lower) is said to be tight if the inequality is less than or equal to (≤).
Theta (Θ) Notation
It provides both upper and lower bounds for a given function.
Θ (Theta) notation means 'order exactly'. Order exactly implies a function is bounded both above and below. This notation provides both a minimum and a maximum value for a function: an algorithm will take at least the minimum and at most the maximum time for any input.
Let g(n) be a given function. Θ(g(n)) is the set of functions f(n) defined as
Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0}
It can be written as f(n) = Θ(g(n)) or f(n) ∈ Θ(g(n)); here f(n) is bounded both above and below by some positive constant multiples of g(n) for all large values of n. It is described in the following figure.
Figure 1.1.3 shows the Graphical representation of Theta (Θ) Notation.
Figure 1.1.3 Theta (Θ) Notation Graph
In the above figure, function f(n) is bounded below by a constant c1 times g(n) and above by a constant c2 times g(n). We can explain this with the following example.
For example, to show that 3n+3 = Θ(n), i.e., 3n+3 ∈ Θ(n), we verify whether f(n) ∈ Θ(g(n)) with the help of the definition, i.e.,
Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0}
In the given problem f(n) = 3n+3 and g(n) = n. To prove f(n) ∈ Θ(g(n)) we have to find c1, c2 and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.
=> To verify f(n) ≤ c2·g(n):
We can write f(n) = 3n+3 ≤ 3n+3n (writing f(n) in terms of g(n) so that the inequality holds mathematically)
= 6n for all n ≥ 1
So c2 = 6 for all n ≥ 1, i.e., n0 = 1.
=> To verify 0 ≤ c1·g(n) ≤ f(n):
We can write f(n) = 3n+3 ≥ 3n (again writing f(n) in terms of g(n) so that the inequality holds)
So c1 = 3 for all n ≥ 1, i.e., n0 = 1.
=> 3n ≤ 3n+3 ≤ 6n for all n ≥ n0, with n0 = 1.
That is, we are able to find c1 = 3, c2 = 6 and n0 = 1 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0. So f(n) = Θ(g(n)) for all n ≥ 1.
Big O Notation
This notation provides an upper bound for a given function. O (Big Oh) notation means 'order at most', i.e., bounded above; it gives the maximum time required to run the algorithm. For a function having only an asymptotic upper bound, the Big Oh 'O' notation is used. For a given function g(n), O(g(n)) is the set of functions f(n) defined as
O(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0}
f(n) = O(g(n)) or f(n) ∈ O(g(n)): f(n) is bounded above by some positive constant multiple of g(n) for all large values of n.
The definition is illustrated with the help of figure 1.1.4.
Figure 1.1.4: Big O Notation Graph
In this figure, function f(n) is bounded above by a constant c times g(n). We can explain this with the following example.
For example, to show 3n² + 4n + 6 = O(n²), we verify whether f(n) ∈ O(g(n)) with the help of the definition, i.e.,
O(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0}
In the given problem:
f(n) = 3n² + 4n + 6
g(n) = n²
To show 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0:
f(n) = 3n² + 4n + 6 ≤ 3n² + n² (since 4n + 6 ≤ n² for n ≥ 6)
= 4n²
So c = 4 for all n ≥ n0, with n0 = 6.
That is, we can identify c = 4 and n0 = 6.
So f(n) = O(n²) for n ≥ 6.
Properties of Big O
The definition of big O is difficult to work with all the time, rather like the "limit" definition of a derivative in calculus.
Here are some helpful theorems that can be used to simplify big O calculations:
• Any kth degree polynomial is O(n^k).
• a·n^k = O(n^k) for any a > 0.
• Big O is transitive. That is, if f(n) = O(g(n)) and g(n) is O(h(n)), then f(n) = O(h(n)).
Big-Ω (Big-Omega) notation
Sometimes we want to say that an algorithm takes at least a certain amount of time, without providing an upper bound. For this we use big-Ω notation; Ω is the Greek letter "omega".
If a running time is Ω(f(n)), then for large enough n, the running time is at least k·f(n) for some constant k. Figure 1.1.5 shows the graph for big-Ω (Big-Omega) notation.
Figure 1.1.5: Big-Ω (Big-Omega) notation graph
We say that the running time is "big-Ω of f(n)". We use big-Ω notation for asymptotic lower bounds, since it bounds the growth of the running time from below for large enough input sizes.
Just as Θ(f(n)) automatically implies O(f(n)), it also automatically implies Ω(f(n)). So it can be said that the worst-case running time of binary search is Ω(lg n). One can also make correct, but imprecise, statements using big-Ω notation. For example, just as, if you really do have a million dollars in your pocket, you can truthfully say "I have an amount of money in my pocket, and it's at least 10 dollars", you can also say that the worst-case running time of binary search is Ω(1), because it takes at least constant time.
Self-assessment Questions
7) When determining the efficiency of an algorithm, how is the space factor measured?
a) Counting the maximum memory needed by the algorithm
b) Counting the minimum memory needed by the algorithm
c) Counting the average memory needed by the algorithm
d) Counting the maximum disk space needed by the algorithm
8) The complexity of the Bubble sort algorithm is:
a) O(n)
b) O(log n)
c) O(n²)
d) O(n log n)
9) The average case occurs in the linear search algorithm:
a) When Item is somewhere in the middle of the array
b) When Item is not in the array at all
c) When Item is the last element in the array
d) When Item is the last element in the array or is not there at all
1.1.5 String Processing
In C, textual data is represented using arrays of characters called strings. The end of a string is marked with a special character, the null character, which is simply the character with the value 0. The null, or string-terminating, character is represented by the character escape sequence \0.
In fact, C's only truly built-in string-handling is that it allows us to use string constants (also
called string literals) in our code. Whenever we write a string, enclosed in double quotes, C
automatically creates an array of characters for us, containing that string, terminated by the \0
character. For example, we can declare and define an array of characters, and initialize it with
a string constant:
char string[] = "Hello, world!";
In this case, we can leave out the dimension of the array, since the compiler can compute it for
us based on the size of the initializer (14, including the terminating \0). This is the only case
where the compiler sizes a string array for us, however; in other cases, it will be necessary that
we decide how big the arrays and other data structures we use to hold strings are.
We must call functions to perform operations on strings like copying and comparing them,
breaking strings up into parts, joining them etc. We will look at some of the basic string
functions here.
1. strlen: Returns the length of the string; (i.e., the number of characters in it), not
including the \0 character
char string7[] = "abc";
int len = strlen(string7);
printf("%d\n", len);
Output is 3
2. strcpy: This function copies one string to another
char string1[] = "Hello, world!";
char string2[20];
strcpy(string2, string1);
Here value of string1 i.e., “Hello world!” will be copied into string2.
3. strcat: This function concatenates two strings. It appends one string onto the end of
another.
char string5[20] = "Hello, ";
char string6[] = "world!";
printf("%s\n", string5);
strcat(string5, string6);
printf("%s\n", string5);
The first call to printf prints “Hello, ”, and the second one prints “Hello, world!”, indicating
that the contents of string6 have been tacked onto the end of string5.
4. strcmp: The standard library's strcmp function compares two strings, and returns 0 if
they are identical, or a negative number if the first string is alphabetically “less than” the
second string, or a positive number if the first string is “greater.”
char string3[] = "this is";
char string4[] = "a test";
if(strcmp(string3, string4) == 0)
printf("strings are equal\n");
else
printf("strings are different\n");
This code fragment will print “strings are different”. Notice that strcmp does not return a
simple Boolean (true/false, zero/nonzero) answer; the sign of its result conveys the ordering.
The table 1.1.1 below lists some more commonly used string functions.
Table 1.1.1: Commonly used String Functions

Function     Description
strcmpi()    Compares two strings with case insensitivity
strrev()     Reverses a string
strlwr()     Converts uppercase string letters to lowercase
strupr()     Converts lowercase string letters to uppercase
strchr()     Finds the first occurrence of a given character in a string
strrchr()    Finds the last occurrence of a given character in a string
strset()     Sets all characters in a string to a given character
strnset()    Sets the specified number of characters in a string to a given character
strdup()     Duplicates a string
Self-assessment Questions
10) Which of the following functions compares two strings case-insensitively?
a) Strcmp(s, t)
b) strcmpcase(s, t)
c) Strcasecmp(s, t)
d) strchr(s, t)
11) How will you print \n on the screen?
a) printf("\n");
b) printf('\n');
c) echo "\n";
d) printf("\\n");
12) Strcat function adds null character.
a) Only if there is space
b) Always
c) Depends on the standard
d) Depends on the compiler
23
1.1.6 Memory Allocation
Memory allocation is primarily a computer hardware operation, but it is managed through the
operating system and software applications. The memory allocation process is quite similar in
physical and virtual memory management. Programs and services are assigned specific
memory as per their requirements when they are executed. Once a program has finished its
operation or is idle, its memory is released and can be allocated to another program or merged
back into primary memory.
Memory allocation has two core types:
• Static Memory Allocation: The program is allocated memory at compile time.
• Dynamic Memory Allocation: Memory is allocated as required at run time.
(i) Static memory allocation
The compiler allocates the required memory space for a declared variable. By using the address
of operator, the reserved address is obtained and this address may be assigned to a pointer
variable. Since most of the declared variables have static memory, this way of assigning pointer
value to a pointer variable is known as static memory allocation.
For example, a local variable in a function exists only until the function finishes.
void func() {
    int i; /* `i` only exists during `func` */
}
(ii) Dynamic memory allocation
Dynamic memory allocation is when an executing program requests a block of main memory
from the operating system. Dynamic allocation is a distinctive feature of C amongst high-level
languages. It enables us to create data types and structures of any size and length to suit our
program's needs, from within the program.
The program then uses this memory for some purpose. Usually the purpose is to add a node to
a data structure. In object oriented languages, dynamic memory allocation is used to get the
memory for a new object. The memory comes from above the static part of the data segment.
Programs may request memory and may also return previously dynamically allocated memory.
Memory may be returned whenever it is no longer needed. Memory can be returned in any
order without any relation to the order in which it was allocated.
24
When memory is returned out of order, gaps of free memory, called holes, develop between the
blocks still in use. A new dynamic request for memory might be satisfied out of one of these
holes. But it might not use up the whole hole, so further dynamic requests might be satisfied
out of the remainder of the original hole. If too many small holes develop, memory is wasted:
the total memory taken up by the holes may be large, yet no single hole is big enough to satisfy
a dynamic request. This situation is called memory fragmentation. Keeping track of allocated
and deallocated memory is complicated; a modern operating system does all of this tracking.
int* func() {
    int* mem = malloc(1024);
    return mem;
}

int* mem = func(); /* still accessible */
In the above example, the allocated memory is still valid and accessible, even though the
function terminated. When you are done with the memory, you have to free it:
free(mem);
Self-assessment Questions
13) In static memory allocation, the compiler allocates the required memory using the
dereference operator.
a) True
b) False
14) Memory is dynamically allocated once the program is compiled.
a) True
b) False
15) Memory Fragmentation occurs when small holes of memory are formed which
cannot be used to fulfil the dynamic requests.
a) True
b) False
25
1.1.7 Accessing the address of a variable: Address
of (&) operator
The address of a variable can be obtained by preceding the name of a variable with an
ampersand sign (&), known as address-of operator.
For example, foo = &myvar;
This would assign the address of variable myvar to foo; by preceding the name of the variable
myvar with the address-of operator (&), we are no longer assigning the content of the variable
itself to foo, but its address. The actual address of a variable in memory cannot be known before
runtime, but let's assume, in order to help clarify some concepts, that myvar is placed during
runtime in the memory address 1776.
In this case, consider the following code fragment:
myvar = 25;
foo = &myvar;
bar = myvar;
The values contained in each variable after the execution of this are shown in the following
diagram:
Figure 1.1.6: Values of myvar, foo and bar after executing the assignments
First, we have assigned the value 25 to myvar (a variable whose address in memory we assumed
to be 1776). The second statement assigns foo variable, the address of myvar, which we have
assumed to be 1776. Finally, the third statement, assigns the value contained in myvar to bar.
This is a standard assignment operation. The main difference between the second and third
statements is the appearance of the address-of operator (&). The variable that stores the address
of another variable (like foo in the earlier example) is what in C is called a pointer. Pointers are
a very powerful feature of the language that has many uses in lower level programming.
26
Did you Know?
In order to use these string functions you must include string.h file in your C program
using #include <string.h>.
Self-assessment Questions
16) Choose the right answer.
Prior to using a pointer variable.
a) It should be declared
b) It should be initialized
c) It should be declared and initialized
d) It should be neither declared nor initialized
17) The address operator &, cannot act on ________________ and ____________.
18) The operators > and < are meaningful when used with pointers, if,
a) The pointers point to data of similar type
b) The pointers point to structures of similar data type
c) The pointers point to elements of the same array
d) The pointers point to elements of another array
27
Summary
o Data Structure is a way of collecting and organising data in such a way that we can
perform operations on these data in an effective way.
o The address of a variable can be obtained by preceding the name of a variable with
the address-of operator (&).
o Data structures are categorized into two types: linear and nonlinear. Linear data
structures are the ones in which elements are arranged in a sequence; nonlinear data
structures are the ones in which elements are not arranged sequentially.
o The complexity of an algorithm is a function describing the efficiency of the
algorithm in terms of the amount of data the algorithm must process.
o The basic terminologies in the concept of data structures are Data, Data item, Entity,
Entity set, Information, Record, file, key etc.
o String functions like strcmp, strcat, strlen are used for string processing in C.
o Static and Dynamic memory allocation are the core types of memory allocation.
28
Terminal Questions
1. How is a primitive data structure different from that of a non-primitive data
structure?
2. With an example explain upper bound, lower bound and tight bound.
3. Explain the concept of string processing in C with some basic string functions.
4. Draw a comparison between static memory allocation and dynamic memory
allocation.
29
Answer Keys
Self-assessment Questions
Question No.
Answer
1
d
2
a and c
3
c
4
d
5
d
6
a
7
a
8
c
9
a
10
c
11
d
12
b
13
b
14
b
15
a
16
c
17
r-values and arithmetic expressions
18
c
30
Activity
1. Activity Type: Online
Duration: 15 Minutes
Description:
a. Divide the students into two groups.
b. Give selection sort algorithm and insertion sort algorithm to each of the groups.
c. Students should calculate the time complexities for both these algorithms.
Case Study: Exponentiation
Let us look at the implementation of exponentiation using recursion and iteration. Illustrated
is a very basic algorithm for raising a base b to a non-negative integer exponent e.
# iterative version
def exp_iter(b, e):
    result = 1
    for i in range(e):
        result = result * b
    return result

# recursive version
def exp_rec(b, e):
    if e == 0:
        return 1
    else:
        return b * exp_rec(b, e - 1)
They both have a time complexity of Θ(e). It does not really matter whether the algorithm is
implemented using recursion or iteration; in either case, b is multiplied together e times.
To construct a more efficient algorithm, we can apply the principle of divide and conquer: a
problem of size n can be divided into two problems of size n/2, and then those solutions can
be combined to solve the problem of interest. If the time to solve a problem of size n/2 is less
than half of the time to solve a problem of n, then this is a better way to go.
31
Here is an exponentiation procedure that wins by doing divide-and-conquer:
def fastExp(b, e):
    if e == 0:
        return 1
    elif e % 2 == 1:               # odd exponent
        return b * fastExp(b, e - 1)
    else:
        half = fastExp(b, e // 2)  # integer division for the even case
        return half * half         # i.e., square(half)
It has to handle two cases slightly differently: if e is odd, it does a single multiplication of b
with the result of a recursive call with exponent e-1; if e is even, it computes the result for
exponent e/2 and squares it. Thus the time taken to compute the result is less.
What is the time complexity of this algorithm? Let us start by considering the case when e is a
power of 2. Then we will always hit the last case of the function until we get down to e = 1, so
computing the result takes about log2 e recursive calls. (Note that here e is our variable, not
the base of the natural logarithm.) Further, if we start with a number whose binary
representation looks like 1111111, then we will alternately hit the odd case and the even case,
and it will take about 2 log2 e recursive calls. Each recursive call costs a constant amount of
work, so in the end the algorithm has time complexity Θ(log e).
32
Bibliography
e-References
•
cs.utexas.edu, (2016). Complexity Analysis. Retrieved on 19 April 2016, from
https://www.cs.utexas.edu/users/djimenez/utsa/cs1723/lecture2.html
•
compsci.hunter.cuny.edu, (2016). C Strings and Pointers. Retrieved on 19 April
2016, from http://www.compsci.hunter.cuny.edu/~sweiss/resources/cstrings.pdf
External Resources
•
Kruse, R. (2006). Data Structures and program designing using ‘C’ (2nd ed.). Pearson
Education.
•
Srivastava, S. K., & Srivastava, D. (2004). Data Structures Through C in Depth (2nd
ed.). BPB Publications.
•
Weiss, M. A. (2001). Data Structures and Algorithm Analysis in C (2nd ed.). Pearson
Education
Video Links
Topic
Link
Data Structures Introduction
https://www.youtube.com/watch?v=92S4zgXN17o
Types of Data Structures
https://www.youtube.com/watch?v=VeEneWqC5a4
Asymptotic Notation
https://www.youtube.com/watch?v=6Ol2JbwoJp0
Memory Allocation
https://www.youtube.com/watch?v=Dml54J3Kwm4
Variables and Addresses
https://www.youtube.com/watch?v=2RsAt8RQ194
33
Notes:
34
Chapter Table of Contents
Chapter 1.2
Pointers and Recursion
Aim ....................................................................................................................................................... 35
Instructional Objectives..................................................................................................................... 35
Learning Outcomes ............................................................................................................................ 35
1.2.1 Introduction to Pointers and Recursive functions............................................................... 36
1.2.2 Declaring and Initializing Pointers ........................................................................................ 36
Self-assessment Questions ....................................................................................................... 39
1.2.3 Accessing a variable through its pointer ............................................................................... 40
Self-assessment Questions ....................................................................................................... 49
1.2.4 Memory allocation functions ................................................................................................. 50
(i) malloc()................................................................................................................................. 50
(ii) calloc() ................................................................................................................................. 51
(iii) free() ................................................................................................................................... 51
(iv) Realloc().............................................................................................................................. 51
Self-assessment Questions ....................................................................................................... 54
1.2.5 Recursion................................................................................................................................... 55
(i) Definition ............................................................................................................................. 55
(ii) Advantages .......................................................................................................................... 57
(iii) Recursive programs .......................................................................................................... 58
Self-assessment Questions ....................................................................................................... 64
Summary ............................................................................................................................................. 66
Terminal Questions............................................................................................................................ 66
Answer Keys........................................................................................................................................ 67
Activity................................................................................................................................................. 68
Bibliography ........................................................................................................................................ 69
e-References ........................................................................................................................................ 69
External Resources ............................................................................................................................. 69
Video Links ......................................................................................................................................... 69
Aim
To provide the students with the knowledge of Pointers and Recursion
Instructional Objectives
After completing this chapter, you should be able to:
•
Demonstrate the role of Pointers in data structures
•
Explain memory allocation functions
•
Explain Recursion and its advantages
•
Describe how variables are accessed through pointers
Learning Outcomes
At the end of this chapter, you are expected to:
•
Outline the steps to declare and initialise pointers
•
List the advantages of Recursion
•
Differentiate malloc(), calloc() and realloc()
•
Write programs for binomial coefficient and Fibonacci using recursion
35
1.2.1 Introduction to Pointers and Recursive
functions
In any programming language, creating and accessing variables is very important. So far we
have seen how to access variables using their variable names. In this chapter we introduce the
concept of indirect access of objects using their address. This chapter describes the use of
Pointers in accessing the variables using their memory addresses. As memory is a resource of
a computer, it should be allocated and deallocated properly. This chapter also describes the
different memory allocation techniques.
Further we introduce the concept of Recursion where a function calls itself again and again to
complete a task. We will also study some Recursive functions and their program
implementations. We will also look at the application of Recursion to various problems like
factorial, GCD, Fibonacci series etc.
1.2.2 Declaring and Initializing Pointers
Introduction to pointers and its need in programming
In all the previous programs, we referred to a variable with its variable name. Hence, the
program did not care about the physical address of those variables. So, whenever we need to
use a variable, we access them using the identifier which describes that variable.
Computer memory is divided into cells or locations, each of which has a unique address. Every
variable that we declare in our program has an address associated with it. Thus a variable can
also be accessed using its address. This can be achieved by using pointers.
Pointers can be defined as special variables which have the capability to store the address of any
variable. They are used in C programs to access memory and manipulate data using
addresses.
Pointers are a very important feature of C programming, as they allow data to be accessed
through memory addresses rather than directly through variable names. Pointers do not have
much significance when simple primitive data types like integer, character or float are used.
But as data structures become more complex, pointers play a very vital role in accessing
the data.
36
For example, consider an integer variable “a”. This variable has three things associated with
it: its name, its value and its memory address. Assume that variable “a” has the value “5” and
the address “1000”. We can then access variable “a” through a pointer variable which stores
the address of “a”. Thus we can manipulate the value of any variable using a pointer variable.
Pointers are used for dynamic memory allocation so as to handle a huge amount of data. It
would have been very difficult to allocate memory globally or through functions, without
pointers.
Declaration and Initialization of Pointer Variables
Like any other variable in C, pointer variables should also be declared before they are used
for storing addresses.
In this chapter we are going to study two operators, known as pointer operators:
1. & (address of) Operator: This operator gives the address of any variable.
For example, if “max” is an integer variable, then &max will give memory address of
variable max.
2. * (dereference) Operator: This operator returns the value stored at a memory address.
Thus, the argument to this operator must be a pointer. It is called a dereference
operator because it works in the opposite manner to the & operator.
For example, if “ptr” is a pointer variable which stores the address of variable “a”, then
*ptr will return the value located at the memory address pointed to by “ptr”.
A pointer variable can be declared as follows:
Syntax: Data_type * pointer_variable_name;
We need to specify the data type followed by * symbol and finally the name of pointer
variable terminated by a semicolon.
For example, int *ptr;
This declaration tells the compiler that “ptr” is a pointer variable of type integer.
37
A Pointer variable can be initialized as given below:
Syntax: pointer_variable_name=& variable_name;
For example, int a;
//variable a is declared as an integer variable
int *ptr;
//declare ptr as pointer variable
ptr=&a;
//ptr is a pointer variable which stores address of variable a
We can combine declaration and initialisation in one step also:
Syntax: data_type *pointer_variable_name=&variable_name
For example, int *ptr= &a;
It means ptr is a pointer variable storing the address of variable a.
Note: pointer variable should always point to address of variable of same data type:
For example, char a;
int *ptr;
ptr=&a;
//Invalid as a is char type and ptr is integer type.
Dereferencing a pointer
As it is already discussed, we can use * (dereference) operator to access the value stored at the
address pointed by the pointer variable. This * operator is called as “value at operator” or
“indirection operator”.
For example, /*pointer variable declaration and initialization*/.
#include<stdio.h>
int main()
{
int a=5;
//a is a integer variable
int *ptr;
//ptr is a pointer variable declared
ptr=&a;
//pointer ptr stores the address of variable a
printf("Address of variable a is: %d\n", &a);
//prints the address of variable a
printf("Address of variable a is %d\n", ptr);
//prints address of var a as it is stored in ptr
printf("Value of variable a is: %d\n", *ptr);
//prints the value of variable a
printf("Address of pointer ptr is:%d\n", &ptr);
//prints address of pointer variable ptr
return 0;
}
Output:
Self-assessment Questions
1) Pointer is special kind of variable which is used to store __________ of the
variable.
a) Data Variable
b) Variable Name
c) Value
d) Address
2) Pointer variable is declared using preceding _________ sign.
a) *
b) %
c) &
d) ^
3) Consider a 32-bit compiler. We need to store the address of an integer variable in an
integer pointer. What will be the size of the integer pointer?
a) 2 Bytes
b) 6 Bytes
c) 10 Bytes
d) 4 Bytes
39
1.2.3 Accessing a variable through its pointer
Pointers are a special kind of variable which can hold the address of another variable. Once a
pointer has been assigned the address of a variable, we can use the value of that variable
and manipulate it as per the requirement.
We know that a pointer can store the address of any variable using the & (address of)
operator. Once the address is stored in the pointer, we can apply the * (dereferencing/
indirection) operator to the pointer to access the value of that variable.
For example,
int a, b;
a=60;
int *ptr;
ptr= &a;
b= *ptr;
Considering the above section of the program, we declare two integer variables a and b.
We have also declared an integer pointer variable “ptr”. In the next statement we assign the
address of variable a to the pointer “ptr”.
The statement
b=*ptr;
will assign the value at the address pointed by “ptr” to the variable “b”. Thus variable “b”
becomes same as variable “a”.
This is equivalent to the statement
b= a;
40
The below figure 1.2.1 shows how a variable can be accessed using a pointer.
Figure 1.2.1: Accessing a variable through its pointer
Program:
/*Accessing a variable through pointer*/
#include <stdio.h>
int main()
{
int a, b;
a=60;
b=0;
int *ptr;
ptr= &a;
printf("Value of variable a=%d\n", a);
printf("Value of variable b=%d\n", b);
b=*ptr;
//assign value at address pointed by ptr to variable b
printf("Value of pointer variable ptr=%d\n", ptr);
printf("Value of variable b=%d\n", b);
return 0;
}
Output:
41
Variable “a” is assigned a value 60. Pointer variable “ptr” will store the address of variable “a”
say 1005. When we say “b=*ptr”, b will be assigned a value 60 pointed by the pointer “ptr”.
Thus after execution of above program, a=60 and b will also have value 60.
Different Types of Pointer variables and their use:
In the following program, we have created four pointers: one integer pointer “iptr”, one float
pointer “fptr”, one double pointer “cptr” and one character pointer “chptr”.
Each type of pointer variable stores the address of respective type of variable. For example,
Character pointer variable will store the address of variable “ch” which is of type character.
The indirection operator i.e., * accesses an object of a specified data type at an address.
Accessing any variable by its memory address is called indirect access. In the below given
example *iptr indirectly accesses the variable that iptr points to i.e., variable a.
Similarly pointer variable *fptr indirectly accesses the variable that fptr points to i.e. variable
b.
Program:
/*different types of pointer variables */
#include <stdio.h>
int main()
{
int a, *iptr;
float b, *fptr;
double c, *cptr;
char ch, *chptr;
iptr=&a;
//iptr stores address of integer variable a
fptr=&b;
//fptr stores address of float variable b
cptr=&c;
//cptr stores address of double variable c
chptr=&ch; //chptr stores address of character variable ch
a = 10;
b = 2.5;
c = 12.36;
ch = 'C';
printf("Address of variable a is %u \n", iptr);
printf("Address of variable b is %u \n", fptr);
printf("Address of variable c is %u \n", cptr);
printf("Address of variable ch is %u \n\n", chptr);
printf("Value of variable a is %d \n", *iptr);
printf("Value of variable b is %f \n", *fptr);
printf("Value of variable c is %f \n", *cptr);
printf("Value of variable ch is %c \n", *chptr);
return 0;
}
Output:
When the following pointer variable declarations are encountered, memory spaces are
allocated for these variables at some addresses.
int a, *iptr;
float b, *fptr;
double c, *cptr;
char ch, *chptr;
43
The memory layout during declaration phase is shown in Figure 1.2.2.
Figure 1.2.2: Declaration of Pointer variables
But when we assign the addresses of variables to the respective pointer variables, the memory
layout will look the way shown in below figure 1.2.3
Figure 1.2.3: Effect of indirect Access and Assignments of Pointers
These initialized pointers may now be used to indirectly access the variables they point
to.
44
Pointer arithmetic
As we perform arithmetic operations on regular integer variables, we can also perform
arithmetic operations on pointer variables. Only addition and subtraction operations can be
performed on pointer types.
But behaviour of addition and subtraction on pointer variables is slightly different. The
operations behave differently according to the data type they are pointing to. The sizes of
basic data types like integer, char, float etc. are already defined.
Suppose we define 3 pointer variables as given below:
char *cptr;
short *sptr;
long *lptr;
Let us assume that they point to memory locations 4000, 5000 and 6000 respectively.
If we write the increment statement
++cptr ;
it will increment the address contained in cptr, so its value becomes 4001.
This is because cptr is a character pointer and a character occupies 1 byte. Thus, incrementing
a character pointer adds 1 to the memory address.
Similarly, the statement
++sptr ;
will increment the address contained in sptr by 2 bytes, as a short is 2 bytes in size, and
++lptr ;
will increment the address contained in lptr by 4 bytes, as a long is 4 bytes in size.
Thus, when we increment a pointer, the pointer is made to point to the following element of
the same type. Hence, the size in bytes of the type it points to is added to the pointer after
incrementing it.
45
Same rules will follow for addition as well as subtraction operation. The below given
statements give the same result as that of increment operator.
cptr = cptr + 1;
sptr = sptr + 1;
lptr = lptr + 1;
These increment (++) and decrement (--) operators can be used as either prefix or postfix
operator in any expression. So in case of pointers, these operators can be used in similar way
but with slight difference.
In case of prefix operator, the value is incremented first and then the expression is evaluated.
In case of postfix operator, the expression is evaluated first and then the value is incremented.
Same rules follow for incrementing and decrementing pointers.
As per the operator precedence rules, postfix operators, such as increment and decrement,
have higher precedence than prefix operators, such as the dereference operator (*). Thus, the
following expression:
*ptr++;
is the same as *(ptr++). Because ++ is used here as a postfix operator, the expression evaluates
to the value originally pointed to by ptr, and the pointer is then incremented.
There are four possible combinations of the dereference operator with the prefix and
postfix increment operators.
1. *ptr++ //equivalent to *(ptr++)
//dereference the unincremented address, then increment the pointer ptr
2. *++ptr //equivalent to *(++ptr)
//increment the pointer ptr, then dereference the incremented address
3. ++*ptr //equivalent to ++(*ptr)
//dereference the pointer and pre-increment the value stored there
4. (*ptr)++
//dereference the pointer and post-increment the value stored there
46
If we consider the following statement,
*ptr++ = *qtr++;
since postfix ++ has a higher precedence than *, it is the pointers ptr and qtr that are
incremented, not the values they point to. The statement copies the value pointed to by qtr
into the location pointed to by ptr, and then advances both pointers.
Program:
/*Pointer Arithmetic*/
#include<stdio.h>
int main()
{
int ivar = 5, *iptr;
char cvar = 'C', *cptr;
float fval = 4.45, *fptr;
iptr = &ivar;
cptr = &cvar;
fptr = &fval;
printf("Address of integer variable ivar = %u\n", iptr);
printf("Address of character variable cvar = %u\n", cptr);
printf("Address of floating point varibale fvar = %u\n\n", fptr);
/* Increment*/
iptr++;
cptr++;
fptr++;
printf("After increment address in iptr = %u\n", iptr);
printf("After increment address in cptr = %u\n", cptr);
printf("After increment address in fptr = %u\n\n", fptr);
/* increment by 2*/
iptr = iptr + 2;
cptr = cptr + 2;
fptr = fptr + 2;
printf("After +2 address in iptr = %u\n", iptr);
printf("After +2 address in cptr = %u\n", cptr);
printf("After +2 address in fptr = %u\n\n", fptr);
/* Decrement*/
iptr--;
cptr--;
fptr--;
printf("After decrement address in iptr = %u\n", iptr);
printf("After decrement address in cptr = %u\n", cptr);
printf("After decrement address in fptr = %u\n\n", fptr);
return 0;
}
Output:
48
Self-assessment Questions
4) Comment on following pointer declarations int *ptr, p;.
a) ptr is a pointer to integers , p Is not
b) ptr and p both are pointers to integer
c) ptr is pointer to integer, p may or may not be
d) ptr and p both are not pointers to integer
5) What will be the output?
main()
{
char *p;
p = "Hello";
printf("%c\n",*&*p);
}
a) Hello
b) H
c) 1005 (memory address of variable p)
d) 1008(memory address of character H)
6) The statement int **a;,
a) Is illegal
b) Is legal but meaningless
c) Is syntactically and semantically correct
d) Stacks
7) Comment on the following,
const int *ptr;
a) We cannot change the value pointed by ptr.
b) We cannot change the pointer ptr itself.
c) Is illegal
d) We can change the pointer as well as the value pointed by it
49
1.2.4 Memory allocation functions
Memory is a resource of computer system and it needs to be allocated properly for any kind
of data structures used in programs. Dynamic memory allocation is a process of allocating
memory to the data during program execution.
Normally, when we are dealing with simple arrays or strings, we allocate the required amount
of memory at compile time itself. We cannot extend the allocated memory during runtime;
hence, in such cases, we need to allocate a sufficient amount of memory at compile time. But
with compile-time memory management, sometimes the allocated memory may not be used,
thereby wasting memory space.
Thus, we can make use of Dynamic memory allocation technique to allocate and de-allocate
memory at runtime. Dynamic memory allocation helps us to increase or decrease the
memory when the program is under execution.
The following are the dynamic memory allocation functions in C:
1. malloc()
It allocates the requested number of bytes and returns a pointer to the first byte of the
allocated space.
2. calloc()
It allocates space for an array of elements, initializes the bytes to zero and then returns a
pointer to the memory.
3. realloc()
It changes the size of previously allocated space.
4. free()
It deallocates the previously allocated space.
(i) malloc()
malloc, as the name indicates, stands for memory allocation. This function reserves a block of
memory of the specified size and returns a pointer of type void, which can be cast to a pointer
of any type.
Syntax of malloc()
ptr=(cast-type*)malloc(byte-size)
Here, ptr is a pointer of cast-type. The malloc() function returns a pointer to an area of
memory of byte-size bytes. If sufficient space is not available, allocation fails and malloc()
returns a NULL pointer.
ptr=(int*)malloc(100*sizeof(int));
This statement will allocate either 200 or 400 bytes, according to whether an int occupies 2 or
4 bytes on the machine, and the pointer points to the address of the first byte of the allocated
memory.
(ii) calloc()
calloc stands for "contiguous allocation". The difference between malloc() and calloc() is that
malloc() allocates a single block of memory, whereas calloc() allocates space for multiple
blocks of the same size and sets all of the allocated bytes to zero.
Syntax of calloc()
ptr=(cast-type*)calloc(n,element-size);
This statement will allocate contiguous space in memory for an array of n elements.
For example:
ptr=(float*)calloc(25,sizeof(float));
This statement allocates contiguous space in memory for an array of 25 elements each of size
of float, i.e., 4 bytes.
(iii) free()
This function is used to explicitly release the memory allocated by the malloc() and calloc()
functions. It releases only the block of memory that its pointer argument refers to, making
that block available for reuse.
free(ptr);
(iv) realloc()
Sometimes a programmer requires extra memory, or the allocated memory turns out to be
more than sufficient. In these cases, the programmer can change the size of previously
allocated memory using realloc(). Unless the pointer passed to realloc() is NULL, it must have
been returned by an earlier call to malloc(), calloc() or realloc().
Syntax of realloc()
ptr=realloc(ptr,newsize);
Here, ptr is reallocated with size of newsize.
For example:
#include<stdio.h>
#include<stdlib.h>
int main()
{
int *ptr,i,n1,n2;
printf("Enter size of array: ");
scanf("%d",&n1);
ptr=(int*)malloc(n1*sizeof(int));
printf("Address of previously allocated memory: ");
for(i=0;i<n1;++i)
printf("%u\t",ptr+i);
printf("\nEnter new size of array: ");
scanf("%d",&n2);
ptr=realloc(ptr,n2*sizeof(int));
printf("Address of newly allocated memory: ");
for(i=0;i<n2;++i)
printf("%u\t",ptr+i);
free(ptr);
return 0;
}
Output:
Example showing use of malloc(), calloc() and free()
Program:
#include<stdio.h>
#include<stdlib.h>
int main()
{
int n,i,*ptr,sum=0;
printf("Enter number of elements: ");
scanf("%d",&n);
ptr=(int*)calloc(n,sizeof(int)); //memory allocated using calloc
if(ptr==NULL)
{
printf("Error! memory not allocated.");
exit(0);
}
printf("Enter elements of array: ");
for(i=0;i<n;++i)
{
scanf("%d",ptr+i);
sum+=*(ptr+i);
}
printf("Sum=%d",sum);
free(ptr);
return 0;
}
Output:
Self-assessment Questions
8) What function should be used to free the memory allocated by calloc()?
a) dealloc();
b) malloc(variable_name, 0)
c) free();
d) memalloc(variable_name, 0)
9) Which header file should be included to use functions like malloc() and calloc()?
a) memory.h
b) stdlib.h
c) string.h
d) dos.h
10) How will you free the memory allocated by the following program?
#include<stdio.h>
#include<stdlib.h>
#define MAXROW 3
#define MAXCOL 4
int main()
{
int **p, i, j;
p = (int **) malloc(MAXROW * sizeof(int*));
return 0;
}
a) The name of array
b) The data type of array
c) The first data from the set to be stored
d) the index set of the array
11) Specify the 2 library functions to dynamically allocate memory?
a) malloc() and memalloc()
b) alloc() and memalloc()
c) malloc() and calloc()
d) memalloc() and faralloc()
1.2.5 Recursion
Recursion is considered to be one of the most powerful tools in a programming language. But
sometimes recursion is also considered the most tricky and threatening concept by a lot of
programmers, because of the uncertainty of the conditions specified by the user.
In short, something referring to itself is called a recursive definition.
(i) Definition
Recursion can be defined as defining anything in terms of itself. It can be also defined as
repeating items in a self-similar way.
In programming, if one function calls itself to accomplish some task then it is said to be a
recursive function. Recursion concept is used in solving those problems where iterative
multiple executions are involved.
Thus, to make any function execute repeatedly until we obtain the desired output, we can
make use of Recursion.
Example of Recursion:
The best example in mathematics is the factorial function.
n! = 1.2.3.........(n-1).n
If n=6, then factorial of 6 is calculated as,
6! = 6(5)(4)(3)(2)(1)= 720
Consider calculating the factorial of any given number using a simple loop. If we have to
calculate the factorial of 6, then after taking out the factor 6, what remains is the calculation
of 5!
In general we can say
n ! = n (n-1)!
(i.e., 6! = 6 (5!))
it means we need to execute same factorial code again and again which is nothing but
Recursion.
Thus, the recursive definition for factorial is:
f(n) = 1               if n = 0
f(n) = n * f(n-1)      otherwise
The above recursive definition says that the factorial of n = 0 is 1, and the factorial of any
other number n is defined to be the product of that number n and the factorial of one less
than that number.
For example, consider n=4
As n is not equal to 0, the first case will not satisfy.
Thus, applying second case we get
4! = 4(4-1)! = 4(3!)
To find 3!, again we have to apply the same definition.
4! = 4(3!)=4[(3)(2!)]
Now, we have to calculate 2!, which requires 1!, which requires 0!.
As 0! is 1 by definition, we reach the end of it. Now we have to substitute the calculated
values one by one in reverse order.
4!=4(3!)= 4(3)(2!)=4(3)(2)(1!)= 4(3)(2)(1)(0!)= 4(3)(2)(1)(1)= 24
Thus, 4!= 24
From the above solution it is clear that each time we need to calculate the factorial of a
value one less than the original one. Thus we reach the value 0, where we stop applying the
factorial function.
Any recursive definition will have some properties. They are:
• There are one or more base cases for which recursion is not needed.
• All chains of recursion stop at one of the base cases.
We should make sure that each recursive call always occurs on a smaller version of the
original problem.
In C Programming a recursive factorial function will look like:
int factorial(int n)
{
if (n==0)
//Base Case
return 1;
else
return n*factorial (n-1);
//Recursive Case
}
The above program is for calculating factorial of any number n. First when we call this
factorial function, it checks for the base case. It checks if value of n equals 0. If n equals 0, then
by definition it returns 1.
Otherwise, it means that the base case has not yet been satisfied. Hence, it returns the
product of n and the factorial of n-1.
Thus, it calls the factorial function once again to find factorial of n-1. Thus forming recursive
calls until base case is met.
Figure 1.2.4 shows the series of recursive calls involved in the calculation of 5!. The values of n
are stored on the way down the recursive chain and then used while returning from function
calls.
Figure 1.2.4: Recursive computation of 5!
(ii) Advantages
An important advantage of recursion is that it saves the programmer's time to a large extent.
Even though problems like factorial, power or Fibonacci can be solved using loops, their
recursive solutions are shorter and easier to understand. And there are algorithms that are
quite easy to implement recursively but much more challenging to implement using loops.
Advantages of Recursion:
• Reduces unnecessary calling of functions.
• Makes solving a problem easier when its iterative solution is very big and complex.
• Extremely useful when the same solution is applied repeatedly to smaller instances of
the problem.
(iii) Recursive programs
1. Fibonacci series
One of the well-known problems is generating a Fibonacci series using Recursion.
A Fibonacci series looks like 0, 1, 1, 2, 3, 5, 8, 13, 21 and so on.
Working: Each next number is equal to the sum of the previous two numbers. The first two
numbers of a Fibonacci series are always 0 and 1. The third number becomes the sum of the
first two numbers, i.e., 0 + 1 = 1. Similarly, the fourth number is the sum of the 3rd and 2nd
numbers, i.e., 1 + 1 = 2, and so on.
Thus, the recursive definition for Fibonacci is:
F(n) = 0                   if n = 0
F(n) = 1                   if n = 1
F(n) = F(n-1) + F(n-2)     otherwise
In C Programming a recursive fibonacci function will look like:
int fib(int n)
{
if (n <= 1)
return n;
else
return fib(n - 1) + fib(n - 2);
}
If n is less than or equal to 1, the function returns n itself. Otherwise it returns the sum of the
previous two terms in the series by calling the fib function twice: once for fib(n-1) and once
for fib(n-2). This combines results from two separate recursive calls, a pattern sometimes
known as tree recursion. The figure 1.2.5 below demonstrates the working of the recursive
algorithm for the Fibonacci series.
Figure 1.2.5: Recursive Algorithm
For example, the call to fib(4) repeats the calculation of fib(3) (see the circled regions of the
tree). In general, when n increases by 1, we roughly double the work; that makes about 2^n
calls.
Following is the c program for implementation of Fibonacci series:
Program:
#include<stdio.h>
int fib(int n)
{
if ( n == 0 )
return 0;
else if ( n == 1 )
return 1;
else
return ( fib(n-1) + fib(n-2) );
}
int main()
{
int n, j = 0, i;
printf("Fibonacci series implementation\n");
printf("How many terms in series: ");
scanf("%d",&n);
printf("Fibonacci series\n");
for ( i = 1 ; i <= n ; i++ )
{
printf("%d\n",fib(j));
j++;
}
return 0;
}
Output:
The above program uses the recursion concept to print the Fibonacci series. The program
first asks the total number of terms to be displayed as output. Then it makes recursive calls to
the function fib() and finds the next term in the series by adding previous two values in the
series.
2. Binomial Coefficient
Binomial coefficient C (n, k) counts the number of ways to form an unordered collection of k
items selected from a collection of n distinct items
For example, if you wanted to make a group of two from a group of four people, the number
of ways to do this is C (4, 2).
Where, n=4 i.e., 4 people and k=2 i.e. group of 2 people
There are total 6 ways to group them in an unordered manner.
Let us assume 4 people as A, B, C and D
So the 2 letter groups are: AB, AC, AD, BC, BD, and CD
Hence, C (n, k) = C (4, 2) = 6.
In general, binomial coefficients can be defined as:
• A binomial coefficient C(n, k) is the coefficient of X^k in the expansion of (1 + X)^n.
• A binomial coefficient C(n, k) also gives the number of ways, regardless of order, that
k items can be chosen from among n items.
Problem:
This Problem of Binomial Coefficients can be implemented using Recursion. We need to
write a function that takes two parameters n and k and returns the value of Binomial
Coefficient C (n, k).
Recursive function:
The value of C(n, k) can be recursively calculated using the following standard formula for
binomial coefficients.
C(n, k) = C(n-1, k-1) + C(n-1, k)
C(n, 0) = C(n, n) = 1
Below given program implements the calculation of Binomial Coefficients in a Recursive
Manner.
Program:
//Recursive implementation of Binomial Coefficient C(n, k)
#include<stdio.h>
int binomial(int n, int k)
{
if (k==0 || k==n)
// Base Cases
return 1;
else
return binomial(n-1, k-1) + binomial(n-1, k);
}
int main()
{
int n, k;
printf("Enter the value of n:");
scanf("%d",&n);
printf("\nEnter the value of k:");
scanf("%d",&k);
printf("\nValue of C(%d, %d) is %d ", n, k, binomial(n, k));
return 0;
}
Output:
It should be noted that in the above program, the binomial function is called again and again
until the base cases are satisfied.
Below given figure 1.2.6 is the Recursive tree for n=5 and k=2.
Figure 1.2.6: Example of DP and Recursion
3. GCD (Greatest Common Divisor)
The Greatest Common divisor of two or more integers is the largest positive integer that
divides the numbers without a remainder.
For example, the GCD of 8 and 12 is 4.
Problem Definition: Given any nonnegative integers a and b, considering both are not equal
to 0, calculate gcd(a, b).
Recursive Definition:
For a, b ≥ 0,
gcd(a, b) = a                     if b = 0
gcd(a, b) = gcd(b, a mod b)       otherwise
Input: Any Nonnegative integers a and b, both not equal to zero.
Output: The greatest common divisor of a and b.
For example:
Consider a=54 and b=24. We need to find GCD (54, 24)
Thus, the divisors of 54 are: 1, 2, 3, 6, 9, 18, 27, and 54
Similarly, the divisors of 24 are: 1, 2, 3, 4, 6, 8, 12, and 24
Thus, 1,2,3,6 are the common divisors of both 54 and 24:
The greatest number of these common divisors is 6.
That is, the GCD (greatest common divisor) of 54 and 24 is 6.
The following program demonstrates computation of GCD using recursion:
Program:
/*GCD of Numbers using Recursion*/
#include <stdio.h>
int gcd(int a, int b)
{
if (b == 0)
/* base case */
return a;
else
/* Euclid's algorithm: gcd(a, b) = gcd(b, a mod b) */
return gcd(b, a % b);
}
int main()
{
int a, b, ans;
printf("Enter the value of a and b: ");
scanf("%d%d", &a, &b);
ans = gcd(a, b);
printf("GCD(Greatest common divisor) of %d and %d is %d.\n", a, b,
ans);
}
Output:
Did you Know?
One critical requirement of recursive functions is a termination point, or base case. Every
recursive program must have a base case to make sure that the function will terminate.
A missing base case results in unexpected behaviour, typically infinite recursion and a
stack overflow.
Self-assessment Questions
12) Which Data Structure is used to perform Recursion?
a) Queue
b) Stack
c) Linked List
d) Tree
13) What is the output of the following code?
int doSomething(int a, int b)
{ if (b==1)
return a;
else
return a + doSomething(a,b-1);
}
doSomething(2,3);
a) 4
b) 2
c) 3
d) 6
14) Determine output of,
int rec(int num){
return (num) ? num%10 + rec(num/10):0;
}
main(){
printf("%d",rec(4567));
}
a) 4
b) 12
c) 22
d) 21
15) What will be the below code output?
int something(int number)
{
if(number <= 0)
return 1;
else
return number * something(number-1);
}
something(4);
a) 12
b) 24
c) 1
d) 0
16) Consider the function:
void print(int n)
{
if (n == 0)
return;
printf("%d", n%2);
print(n/2);
}
What will be the output of print(12)?
a) 0011
b) 1100
c) 1001
d) 1000
Summary
o A pointer is a value that designates the address (i.e., the location in memory), of
some value. Pointers are variables that hold a memory location.
o ‘&’ - address of variable is used to assign address of any variable to pointer variable.
o ‘*’ indirection operator is used to access the value contained in a particular pointer.
o Pointers store the address of a variable assigned using the & operator. We can access
the value of that variable by applying the * operator to the pointer name.
o Memory allocation functions:
• malloc() - Allocates the requested number of bytes and returns a pointer to the
first byte of the allocated space
• calloc() - Allocates space for an array of elements, initializes the bytes to zero
and then returns a pointer to the memory
• free() - Deallocates previously allocated space
• realloc() - Changes the size of previously allocated space
o Recursion is the process of repeating items in a self-similar way. In Programs, if a
function makes a call to itself then it is called a recursive function. Recursion is
more general than iteration. Choosing between recursion and looping involves the
considerations of efficiency and elegance.
Terminal Questions
1. Explain the role of pointers in data structures.
2. What are the memory allocation functions? Explain in detail.
3. Define Recursive functions.
4. Write a note on indirection operator.
Answer Keys
Self-assessment Questions
Question No. | Answer
1  | d
2  | a
3  | a
4  | a
5  | b
6  | c
7  | a
8  | c
9  | b
10 | d
11 | c
12 | b
13 | d
14 | c
15 | b
16 | a
Activity
1. Activity Type: Offline
Description:
Ask all the students to get the output of the below question:
#include<stdio.h>
int main(){
int i = 3;
int *j;
int **k;
j=&i;
k=&j;
printf("%u %u %d ",k,*k,**k);
return 0;
}
Prepare a presentation on pointers and dynamic memory allocation.
Duration: 10 Minutes
Bibliography
e-References
• cslibrary.stanford.edu, (2016). Stanford CS Education Library. Retrieved on 19
April 2016, from http://cslibrary.stanford.edu/106/
• doc.ic.ac.uk, (2016). Recursion. Retrieved on 19 April 2016, from
http://www.doc.ic.ac.uk/~wjk/c++Intro/RobMillerL8.html
External Resources
• Kruse, R. (2006). Data Structures and Program Designing Using 'C' (2nd ed.).
Pearson Education.
• Srivastava, S. K., & Srivastava, D. (2004). Data Structures Through C in Depth (2nd
ed.). BPB Publications.
• Weiss, M. A. (2001). Data Structures and Algorithm Analysis in C (2nd ed.).
Pearson Education.
Video Links
Topic: Introduction to pointers, declaring and initializing pointers and accessing variables
through pointers
Link: https://www.youtube.com/watch?v=fAPt0Upy3ho
Topic: Memory allocation functions
Link: https://www.youtube.com/watch?v=s4io0ir2kas
Topic: Recursion
Link: https://www.youtube.com/watch?v=AuTjrMu-2F0
MODULE - II
Searching and
Sorting
MODULE 2
Searching and Sorting
Module Description
This module introduces the problem of searching a list to find a particular entry. The
discussion centres on two well-known algorithms: sequential search and binary search.
Most of this chapter assumes that the entire sort can be done in main memory, so that the
number of elements is relatively small (less than a million).
Sorts that cannot be performed in main memory and must be done on disk or tape are also
quite important. This type of sorting is known as external sorting and will be discussed in the
second chapter of this module. It is assumed in our examples that the array contains only
integers to simplify matters. At the same time, we have to understand that more complicated
structures are possible.
Chapter 2.1
Searching Techniques
Chapter 2.2
Sorting Techniques
Chapter Table of Contents
Chapter 2.1
Searching Techniques
Aim ....................................................................................................................................................... 71
Instructional Objectives..................................................................................................................... 71
Learning Outcomes ............................................................................................................................ 71
2.1.1 Introduction to Searching ....................................................................................................... 72
(i) Types of Searching .............................................................................................................. 73
Self-assessment Questions ....................................................................................................... 77
2.1.2 Basic Sequential Searching ...................................................................................................... 77
Self-assessment Questions ....................................................................................................... 82
2.1.3 Binary search ............................................................................................................................ 82
(i) Iterative implementation .................................................................................................... 83
(ii) Recursive implementation ................................................................................................ 85
Self-assessment Questions ....................................................................................................... 87
2.1.4 Comparison between sequential and binary search ............................................................ 88
Self-assessment Questions ....................................................................................................... 89
Summary ............................................................................................................................................. 90
Terminal Questions............................................................................................................................ 91
Answer Keys........................................................................................................................................ 92
Activity................................................................................................................................................. 93
Case Study: Alphabetizing Papers .................................................................................................... 93
Bibliography ........................................................................................................................................ 95
e-References ........................................................................................................................................ 95
External Resources ............................................................................................................................. 95
Video Links ......................................................................................................................................... 95
Aim
To educate the students in searching and sorting techniques
Instructional Objectives
After completing this chapter, you should be able to:
• Explain searching and its types with its code snippet
• Describe sequential search using iterative and recursive implementations
• Explain binary search
• Compare linear search and binary search
Learning Outcomes
At the end of this chapter, you are expected to:
• Elaborate searching techniques with examples
• Compute time complexities for binary search and sequential search algorithms
• Outline the applications of linear and binary search
• Write code for iterative and recursive implementations of both searching techniques
2.1.1 Introduction to Searching
This chapter focuses on how searching plays an important role in the concept of data
structures. Searching helps to find if a particular element is part of a given list or not. In this
chapter, we will focus on two types of searching techniques namely sequential or linear search
and binary search. We will also come across iterative and recursive implementation for the
two types of above mentioned searching techniques in this chapter.
Searching is a technique of determining whether a given element is present in a list of
elements. We are given the names of people and are asked for an associated telephone listing.
We are given an employee's name or code and are asked for the personnel records of the
employee. In these examples, we are given a small piece of data or information, which we call
a key, and we are asked to find a record that has other information associated with the key.
We shall allow both the possibility that there is more than one record with the same key and
that there is no record at all with a given key. See Figure 2.1.1.
Figure 2.1.1: Sample records of employees
If the element we are searching for is present in the list, then the searching technique should
return the index where that element is present in the list. If the search element is not present
in the list, then the searching technique should return NULL (or a sentinel value such as -1)
indicating that the search element is not present in the list. Like sorting, there are a number of
searching techniques available in the literature, and they vary based on the purpose suited to
the application.
Searching techniques can also be classified based on the data structures used to store the list.
Searching techniques vary between linear and non-linear data structures. Even among linear
data structures such as arrays, linked lists, stacks and queues, the searching techniques vary:
searching an array calls for a different technique than searching a linked list. Likewise,
non-linear data structures such as trees have their own searching techniques. In this chapter
we will introduce different types of searching algorithms and their implementations.
(i) Types of Searching
There are many different searching techniques and modern research is focused on advanced
searching techniques using graphs like breadth first search (BFS) and depth first search (DFS)
which will be discussed in the later chapters. In most common practice and very well-known
there are two types of searching techniques, namely linear or sequential search and binary
search algorithms.
1. Linear Search or Sequential Search
The simplest way to do a search in a given list is to begin at one end of the list and scan down
it until the desired key is found or the other end is reached. This is our first method of
searching which we call linear or sequential search.
Let A = [10 15 6 23 8 96 55 44 66 11 2 30 69 96] and searching element e = 11. Consider a
pointer i, to begin with the process initialize the pointer i = 1. Compare the value pointed by
the pointer with the searching element e= 11. As A (1) = 10 and it is not equal to element e
increment the pointer i by i+1. Compare the value pointed by pointer i.e., A (2) = 15 and it is
also not equal to element e. Continue the process until the search element is found or the
pointer i reaches the end of the list.
Working of linear or sequential search is shown in the below figure 2.1.2
Figure 2.1.2: Pictorial representation of solution for sequential search
Characteristics and applications
In case of linear search, the searching happens sequentially. If the element is present at the
end of the list, or is not present at all, we get the worst case scenario: for N elements in a list,
we require N comparisons. This gives linear search a time complexity of O(N) in Big O
notation; its running time is directly proportional to the number of elements in the list. We
should also note that, for linear search, the list need not be in a sorted order. In some cases,
we might place frequently searched items or elements at the start of the list, which results in
faster retrieval, thereby improving performance irrespective of the size of the list.
Despite its worst case scenario, which limits the performance of the technique, linear search
is widely used in many applications. Built-in functions in programming languages and
libraries, such as Ruby's find_index or jQuery's index, rely on linear search internally.
2. Binary search algorithm
Linear search is easy and efficient for short lists, but a poor choice for long ones. Just imagine trying
to find the name “Carmel Fernandes” in a large directory by reading one name at a time
starting at the front of the book! To find any record in a long list, there are far more efficient
methods, provided that the keys in the list are already sorted into order.
A better method for a list with keys in order is first to compare the key with the one in the centre
of the list and then restrict the search to only the first or second half of the list, depending on
whether the key comes before or after the central one. With one comparison of keys we thus
reduce the list to half its original size. Repeating this exercise, at each step, we reduce the
length of the list to be searched by half. With only 20 search iterations, this method locates
any required key in a list containing more than a million keys.
This method is called binary search. This approach requires that the keys in the list be of a
scalar or other type that can be regarded as having an order, and that the list is already
completely in order.
Working of binary search algorithm
Consider an array A = [11, 14, 15, 25, 32, 36, 39, 45, 52, 55, 59, 63, 77, 83, 99] and the
searching element e= 83. Let low be an integer containing first index of the array i.e., 0 and
high be the integer containing highest index of the array which is 14 in this case. Here we first
compute mid by finding midpoint of high and low. In our case the midpoint is 7 (midpoint=
[high (14) + low (0)]/2).
First we check if the search element is at midpoint. In our case search element is 83 and
element at the midpoint is 45 which is not true.
The next step is to check whether the value at the midpoint is greater or smaller than the
search element. If the value at the midpoint is greater than the search element, then our
search element lies in the left part of the array, and the search must be restricted to the first
half; so we set high to just below the midpoint and leave low unchanged. If the value at the
midpoint is smaller than the search element, then our search element lies in the right half of
the array, and the search must be restricted to the right side of the midpoint; so we set low to
just above the midpoint and leave high unchanged.
We repeat this process until the search element is found. The solution of the example taken
above is solved below pictorially.
Initially low = 0, high = 14, midpoint = 7, and we have the element to be searched, e = 83.
Since A[mid] is 45, which is smaller than 83, we set low = mid + 1 = 8 for our next iteration
and calculate the next midpoint using the new low and high: mid = (8 + 14)/2 = 11.
Iteration 2
Again the element is not present at the midpoint, and the midpoint element A[11] = 63 is
smaller than 83; hence we again set low = mid + 1 = 12 and leave high unchanged. Our new
midpoint will be mid = (12 + 14)/2 = 13.
Iteration 3
Now, first we check if the value at the midpoint is our search element; A[13] = 83, so it is.
Hence, our search is complete. The algorithm will return the index of the midpoint, which is
13 in our case.
Characteristics and applications
In case of binary search technique, the given list of elements will be divided into two parts and
one part gets eliminated in each iteration which is not so in case of linear search technique.
This feature makes binary search more efficient and more powerful compared to that of
linear search irrespective of the number of elements present in a list. Binary search technique
is implemented either iteratively or recursively.
For binary search, the elements need to be in sorted order, which is not required for linear
search; the list must already be sorted before we start searching. Although binary search is
well suited to large sets of elements, its performance advantage degrades if the list is
frequently updated, since the updated list must be re-sorted each time.
Self-assessment Questions
1) Searching is important function because ________________.
a) Information retrieval is most important part of the computer system
b) Data in the computer system is unorganized
c) It allows validating data present in the computer’s memory
d) It allows computer to validate information
2) What factor degrades the performance of binary search technique?
a) Number of iterations
b) Re-sorting
c) Size of the element
d) Size of the list
3) ____________ search algorithm begins at one end of the list and scans down it
until the desired key is found or the other end is reached.
a) Sequential
b) Binary
c) BFS
d) DFS
2.1.2 Basic Sequential Searching
As we have already discussed the working of the sequential search algorithm, let us now focus on building a logical algorithm and implementing a program for it. There are two ways of implementing this algorithm: the iterative method and the recursive method. Figure 2.1.3 below shows a flowchart for the basic sequential search algorithm.
Figure 2.1.3: Flowchart for linear search algorithm
Types of implementation:
A search can be implemented by two methods:
1. Iterative implementation
2. Recursive implementation
1. Iterative Implementation
The program code below demonstrates the iterative implementation of the sequential search algorithm.
#include <stdio.h>

int main(void)
{
    int arr[20];
    int x, n, key, flag = 0;

    printf("Enter the number of elements: \n");
    scanf("%d", &n);
    printf("Enter the elements: \n");
    for (x = 0; x < n; x++)
        scanf("%d", &arr[x]);

    printf("Entered elements of the array are:\n");
    for (x = 0; x < n; x++)
        printf("%d ", arr[x]);

    printf("\nEnter the element you want to search: \n");
    scanf("%d", &key);

    /* Linear search logic: scan until the key is found or the end is reached */
    for (x = 0; x < n; x++)
    {
        if (key == arr[x])
        {
            flag = 1;
            break;
        }
    }

    if (flag == 1)
        printf("Element %d found in the array\n", key);
    else
        printf("Element %d not found in the array\n", key);

    return 0;
}
Output:
2. Recursive Implementation
The program code below demonstrates the recursive implementation.
#include <stdio.h>

int line_search(int[], int, int);

int main(void)
{
    int arr[100], n, x, key;

    printf("Enter the number of elements: ");
    scanf("%d", &n);
    printf("Enter %d elements: ", n);
    for (x = 0; x < n; x++)
        scanf("%d", &arr[x]);

    printf("Entered elements of the array are:\n");
    for (x = 0; x < n; x++)
        printf("%d ", arr[x]);

    printf("\nEnter the element you want to search: \n");
    scanf("%d", &key);

    x = line_search(arr, n - 1, key);
    if (x != -1)
        printf("Element %d found in the array at %d location\n", key, x + 1);
    else
        printf("Element %d is not found in the array\n", key);

    return 0;
}

/* Recursively compare the key with the element at index n, then recurse on
   n - 1. Note: the base case must be n < 0 (not n <= 0); otherwise the
   element at index 0 would never be examined. */
int line_search(int s[100], int n, int key)
{
    if (n < 0)
        return -1;
    if (s[n] == key)
        return n;
    else
        return line_search(s, n - 1, key);
}
Output:
Analysis of linear search algorithm
Worst Case Analysis (Usually Done)
In worst-case analysis, we calculate an upper bound on the running time of an algorithm by identifying the case that causes the maximum number of operations to be executed. For sequential (linear) search, the worst case is when the element to be searched is not present in the array: the algorithm then compares it with every element of the array one by one. Therefore, the worst-case time complexity of linear search is Θ(n).
Average Case Analysis (Sometimes done)
In average-case analysis, we take every possible input, calculate the computing time for each, sum these values, and divide by the total number of inputs, which tells us how the cases are distributed. For the linear search problem, assume that all cases, including the worst (unsuccessful) case, are equally likely; we therefore sum the costs of all (n + 1) cases and divide by (n + 1), which gives an average of roughly n/2 comparisons, i.e., O(n).
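The worked example that followed here can be reconstructed as a standard derivation (our reconstruction, under the assumption that the n successful positions and the one unsuccessful case are equally likely): finding the element at position i costs i comparisons, and an unsuccessful search costs n.

```latex
C_{avg} = \frac{1}{n+1}\left(\sum_{i=1}^{n} i + n\right)
        = \frac{\frac{n(n+1)}{2} + n}{n+1}
        = \frac{n}{2} + \frac{n}{n+1}
        \approx \frac{n}{2} \in O(n)
```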
Best Case Analysis (Ideal)
Linear search performs best when the element turns out to be the first element in the list, so the time required for searching is minimal. Hence, the best (ideal) case of linear search is Θ(1).
Self-assessment Questions
4) Which is the worst case for linear search algorithm?
a) The element to be searched is present at the first position in the array
b) The element to be searched is present at the last position in the array
c) The element to be searched is present at the middle position in the array
d) The element to be searched is not present in the array
5) Best or ideal case complexity for linear search algorithm is ______.
a) O(log n)
b) O(n)
c) O(1)
d) O(n log n)
6) The average case complexity of linear search algorithm is __________.
a) O(log n)
b) O(n)
c) O(1)
d) O(n log n)
2.1.3 Binary search
Like linear search, binary search can be implemented either iteratively or recursively. First, consider the flowchart in figure 2.1.4 below to understand the working of the algorithm. The flowchart follows the same logic as discussed in the previous section.
The flowchart begins by reading the key (search element) from the user. It then finds the midpoint of the list from the low and high indices. The key is compared with the element at the midpoint; if they match, the search is successful. If not, another comparison checks whether the key is less than the element at the midpoint: if so, the left sub-array is considered next, otherwise the right sub-array. The whole process repeats until the key matches an element of the list, or until we can conclude that the key is not present in the list.
Figure 2.1.4: Flowchart for binary search algorithm
(i) Iterative implementation
Iterative implementation of binary search follows the flowchart above. Its termination condition for an unsuccessful search is worth noting: when the search fails, low crosses over to the right of high (low > high), and this terminates the while loop.
The program below implements the iterative method of binary search.
#include <stdio.h>

int main(void)
{
    int x, low, high, mid, n, key, arr[50];

    printf("Enter the number of elements: ");
    scanf("%d", &n);
    printf("Enter the %d elements: ", n);
    for (x = 0; x < n; x++)
        scanf("%d", &arr[x]);

    printf("\nEnter the element you want to search: \n");
    scanf("%d", &key);

    low = 0;
    high = n - 1;
    mid = (low + high) / 2;

    while (low <= high) {
        if (arr[mid] < key)
            low = mid + 1;
        else if (arr[mid] == key) {
            printf("Element %d found in the array at %d location\n", key, mid + 1);
            break;
        }
        else
            high = mid - 1;
        mid = (low + high) / 2;
    }

    if (low > high)
        printf("Element %d is not found in the array\n", key);

    return 0;
}
Output:
(ii) Recursive implementation
Recursive implementation of binary search handles the unsuccessful case seen in the iterative method with an explicit base case: if (low > high), the function returns -1. This plays the same role as the while condition in the previous method and terminates the recursion. Otherwise, the function calls itself with new values for its parameters.
The algorithm below implements the recursive method for performing binary search.
#include <stdio.h>
#include <stdlib.h>

int bin_rsearch(int[], int, int, int);

int main(void)
{
    int n, x, key, pos;
    int low, high, arr[20];

    printf("Enter the number of elements: ");
    scanf("%d", &n);
    printf("Enter the %d elements: ", n);
    for (x = 0; x < n; x++)
        scanf("%d", &arr[x]);

    low = 0;
    high = n - 1;

    printf("\nEnter the element you want to search: \n");
    scanf("%d", &key);

    pos = bin_rsearch(arr, key, low, high);
    if (pos != -1)
        printf("Element %d found in the list at %d location\n", key, pos + 1);
    else
        printf("Element %d is not found in the array\n", key);

    return 0;
}

/* Binary search function: each call either finds the key at the midpoint or
   recurses into the half that could still contain it. Note the return before
   each recursive call, so the found index propagates back to main(). */
int bin_rsearch(int s[], int i, int low, int high)
{
    int mid;

    if (low > high)
        return -1;
    mid = (low + high) / 2;
    if (i == s[mid])
        return mid;
    else if (i < s[mid])
        return bin_rsearch(s, i, low, mid - 1);
    else
        return bin_rsearch(s, i, mid + 1, high);
}
Output:
Analysis of Binary search algorithm
Worst Case Analysis (Usually Done)
Similar to linear search, the worst case of binary search occurs when the element to be searched is not present in the array. When the number is not present, each iteration of the algorithm halves the size of the remaining array, and this halving can happen up to log n times. Therefore, the worst-case time complexity of binary search is O(log n) for both the recursive and the iterative versions; the iterative version, however, needs only O(1) auxiliary space, whereas the recursive version uses O(log n) stack space.
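The O(log n) bound follows from a one-line calculation: after k halvings, the portion of the array still under consideration has size n/2^k, and the search must stop once only one element remains.

```latex
\frac{n}{2^k} = 1 \;\Longrightarrow\; 2^k = n \;\Longrightarrow\; k = \log_2 n
```

So at most about log2(n) + 1 comparisons are made in the worst case.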
Average Case Analysis (Sometimes done)
To calculate the average-case complexity of binary search, we sum, over all the elements, the product of the number of comparisons required to find each element and the probability of searching for that element. For simplicity of analysis, assume that every search is for an item that is present in the array and that each element is equally likely to be searched. Under these assumptions, the average-case time complexity of binary search is also O(log n).
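As a sketch of this computation (our illustration, assuming for simplicity a full array of n = 2^m − 1 elements), exactly 2^(k−1) elements are found after k comparisons, so the expected number of comparisons is:

```latex
C_{avg} = \frac{1}{n}\sum_{k=1}^{m} k\,2^{k-1}
        = \frac{(m-1)2^{m} + 1}{2^{m} - 1}
        \approx m - 1 = \log_2(n+1) - 1 \in O(\log n)
```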
Best Case Analysis (Ideal)
In binary search, the best case occurs when the element to be searched is at the middle of the array. The number of operations in the best case is constant, i.e., it is independent of n, so the best-case time complexity is O(1).
Did you Know?
The difference between O(log N) and O(N) is extremely significant when the size N of the array is large: for any practical problem it is crucial that we avoid O(N) searches.
Self-assessment Questions
7) To calculate midpoint in binary search algorithm we ___________.
a) Divide lowest index by highest index
b) First add lowest index and highest index and then divide the sum by 2
c) First subtract highest index from lowest index and divide the result by 2
d) Just subtract highest index and lowest index
8) Best case for binary search algorithm is when the element to be searched is,
a) In the beginning of the array
b) At the end of the array
c) At the middle of the array
d) At any position in the array
9) Average case complexity for binary search algorithm is O(log n),
a) True
b) False
2.1.4 Comparison between sequential and binary
search
As discussed in the previous sections of this chapter, binary search and sequential search differ in several ways. In this section we compare the two algorithms on various parameters.
1. Implementation requirements
First and foremost, the most obvious difference between the two algorithms lies in their input requirements. Sequential search can be performed even on an unsorted array, since the comparison proceeds element by element. For binary search to work, however, the input array must be sorted, because the algorithm fundamentally relies on the ordering of elements across array indices.
2. Efficiency
Linear search works well for small arrays, but its performance degrades as the array size grows. Binary search, in contrast, scales well to any array size, because the search range, and with it the number of comparisons, is halved in every iteration or recursion.
Efficiency also depends on the location of the search element. For linear search, an element at the first position is the best case and an element at the last position is the worst case. Binary search is most efficient when the element is at the middle of the array, and for any other location its efficiency is not affected much.
3. Complexities
Linear search has an average-case complexity of O(n), which makes it slow and inefficient for large arrays. Binary search has an average-case complexity of O(log n), which makes it a better search algorithm even for large array sizes.
4. Data structure
Binary search works best on arrays but not on linked lists, because arrays provide regular indexing and contiguous memory allocation, unlike linked lists. Sequential search, on the other hand, works well for both arrays and linked lists.
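The gap between the two techniques can be made concrete with a small sketch (the counting helpers below are our illustration, not part of the text) that counts the key comparisons each technique performs on the same sorted array:

```c
/* Count the key comparisons made by linear search. */
int linear_comparisons(const int a[], int n, int key)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        count++;                 /* one comparison per element visited */
        if (a[i] == key)
            break;
    }
    return count;
}

/* Count the key comparisons made by binary search (array must be sorted). */
int binary_comparisons(const int a[], int n, int key)
{
    int low = 0, high = n - 1, count = 0;
    while (low <= high) {
        int mid = (low + high) / 2;
        count++;                 /* one three-way comparison per halving step */
        if (a[mid] == key)
            break;
        else if (a[mid] < key)
            low = mid + 1;
        else
            high = mid - 1;
    }
    return count;
}
```

On the 15-element array from terminal question 2 with key 99 (the last element), linear search makes 15 comparisons while binary search makes only 4.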
Self-assessment Questions
10) Linear search algorithm requires array to be sorted before it start searching.
a) True
b) False
11) Efficiency of linear search algorithm does not depend on the position of the
element to be searched.
a) True
b) False
12) In general, binary search algorithm is best when the array size is big.
a) True
b) False
Summary
o Searching is one of the primary functions of a computer system, as information retrieval is increasingly important.
o Although there are many searching algorithms, two very common and important ones are sequential search and binary search.
o Linear search is the simpler of the two: it scans elements from one end of the array until the search element is found. As the array grows, this algorithm becomes inefficient because it consumes a lot of time.
o Binary search overcomes this disadvantage by reducing the number of iterations needed to find the search element: it halves the array size, and hence the number of comparisons, in each iteration.
o Unlike linear search, binary search requires the input array to be sorted, because its working fundamentally depends on the ordering of elements between the lower and upper ends of the array.
o In the general case, the time complexity of sequential search is O(n), which makes it slower and less efficient than binary search, whose time complexity is O(log n).
Terminal Questions
1. Explain in brief different types of searching algorithms.
2. Consider the following array A = {23, 26, 32, 35, 39, 42, 44, 47, 50, 55, 58, 62, 66,
88, 99} and search for element e=26 using binary search technique. (Solution
needs to be demonstrated pictorially with solution for each iteration).
3. Provide an algorithm for recursive implementation of linear search.
4. Draw a neat flowchart for binary search algorithm.
5. Explain in brief the difference between linear and binary search algorithm.
Answer Keys
Self-assessment Questions

Question No.    Answer
1               a
2               b
3               a
4               d
5               c
6               b
7               b
8               c
9               a
10              b
11              b
12              a
Activity
1. Activity Type: Offline
Duration: 20 Minutes
Description:
Stack 10 reference books in ascending order of their titles.
Ask the students to write a program for binary search to search books by their title.
Case Study: Alphabetizing Papers
Consider the example of a human alphabetizing a couple dozen papers. If we think about it for a while, this is basically a sorting algorithm. To understand the process or working of this algorithm, the following questions need to be asked.
1. How are the papers alphabetized?
2. How are the papers arranged?
Basically, all papers whose names start with A are put in a pile named 'A'; similarly, names starting with B are put in a pile named 'B', and so on. The range of the groups (piles) varies based on a number of other factors chosen for convenience. Once the grouping is done, each pile or group is scanned letter by letter and a new algorithm is used for further work. In 90 per cent of the cases, humans unknowingly use the insertion sort algorithm.
It is well known that quicksort is among the best and fastest ways to sort. The question, then, is why humans don't use quicksort. The human brain doesn't do all comparisons equally; it is simply "easier" for our brains to quickly apply insertion sort, and splitting into letter groups makes each smaller problem more manageable.
In reality, humans use an algorithm called bucket sort. A bucket sort followed by individual insertion sorts (exactly what humans tend to do) is a linear-time sorting algorithm. When we have some notion of the distribution of the items to be sorted, we can break through the O(n log n) boundary of comparison-based sorting and sort in linear time. The requirement for linear-time sorting is that the input must follow some known distribution. This is the reason why humans instinctively
break the piles into various groupings. If there are many papers, the group ranges need to be reduced. Furthermore, the ideal bucket setup would distribute the papers roughly evenly: the letter S might need its own bucket, while all the letters up through F can share one. Humans have a great deal of experience with both the general problem and their specific problem (for example, the peculiarities of a particular class' name distribution), and so they optimize the algorithm for the known distribution, setting the parameters of the linear-time sort (number of buckets, bucket ranges, etc.) exactly as they should to minimize the sort time.
The main disadvantage of these linear-time sorting algorithms is that they require a lot of extra memory compared with comparison-based sorting: an auxiliary bookkeeping array on the order of the original problem size is needed. This isn't a problem in real life, where we just need a large table on which to arrange the papers. In a very real sense, this supposedly "naive" algorithm that humans use is among the very best possible.
Questions:
1. Explain the process followed by humans for sorting papers, as described in above case
study. What is the method called technically and what is the supported sorting
algorithm?
2. Why do you think humans cannot think of sorting using quicksort?
3. Why is it not advised to use bucket sort for implementing computer-based sorting algorithms?
4. Do you agree with the author of the case study that the process followed by humans for sorting applications in real life is the fastest?
Bibliography
e-References
• interactivepython.org, (2016). Problem Solving in Data Structures: The Binary Search. Retrieved on 19 April 2016, from http://interactivepython.org/runestone/static/pythonds/SortSearch/TheBinarySearch.html
• pages.cs.wisc.edu, (2016). Searching and Sorting. Retrieved on 19 April 2016, from http://pages.cs.wisc.edu/~bobh/367/SORTING.html
External Resources
• Kruse, R. (2006). Data Structures and Program Designing Using 'C' (2nd ed.). Pearson Education.
• Srivastava, S. K., & Srivastava, D. (2004). Data Structures Through C in Depth (2nd ed.). BPB Publications.
• Weiss, M. A. (2001). Data Structures and Algorithm Analysis in C (2nd ed.). Pearson Education.
Video Links
Topic: Introduction to searching, and types of searching techniques
Link: https://www.youtube.com/watch?v=mqixr2wdLqg
Topic: Implementation of linear search (iterative method)
Link: https://www.youtube.com/watch?v=AqjVd6FVFbE
Topic: Implementation of binary search (iterative method)
Link: https://www.youtube.com/watch?v=g9BKw_TobpI
Topic: Implementation of binary search (recursive method)
Link: https://www.youtube.com/watch?v=-bQ4UzUmWe8
Topic: Comparison of linear and binary search
Link: https://www.youtube.com/watch?v=u3v-vh2t9FE
Chapter Table of Contents
Chapter 2.2
Sorting Techniques
Aim ....................................................................................................................................................... 97
Instructional Objectives..................................................................................................................... 97
Learning Outcomes ............................................................................................................................ 97
2.2.1 Introduction.............................................................................................................................. 98
2.2.2 Basics of Sorting ....................................................................................................................... 99
Self-assessment Questions ..................................................................................................... 105
2.2.3 Sorting Techniques ................................................................................................................ 106
(i) The Bubble Sort ................................................................................................................. 106
(ii) Insertion Sort .................................................................................................................... 110
(iii) Selection Sort ................................................................................................................... 113
(iv) Merge Sort ........................................................................................................................ 117
(v) Quick Sort.......................................................................................................................... 123
Self-assessment Questions ............................................................................................................... 132
Summary ........................................................................................................................................... 134
Terminal Questions.......................................................................................................................... 135
Answer Keys...................................................................................................................................... 135
Activity............................................................................................................................................... 136
Bibliography ...................................................................................................................................... 137
e-References ...................................................................................................................................... 137
External Resources ........................................................................................................................... 137
Video Links ....................................................................................................................................... 137
Aim
To educate the students in searching and sorting techniques
Instructional Objectives
After completing this chapter, you should be able to:
• Explain the need of sorting
• Demonstrate bubble and insertion sort algorithms with example
• Discuss the time and space complexities of merge and quick sort algorithms
Learning Outcomes
At the end of this chapter, you are expected to:
• Calculate the complexities of all sorting algorithms
• Identify the efficient algorithm
• Outline the steps to sort the unsorted numbers using quick sort
2.2.1 Introduction
Sorting is a technique for arranging data in a specific order. Data can be sorted either in ascending or descending order, which may be numerical, lexicographical, or any user-defined order. Sorting is closely related to searching of data. Here the data considered consists only of integers, but it could be anything, such as strings or records.
In our real life, we need to search for many things, like a particular record in a database, students' marks in the result database, a particular person's telephone number, any student's name in a list, etc. The sorting process arranges the data in a particular sequence, making it easier to search whenever needed. Thus data searching can be optimized to a great extent by using sorting techniques.
Every single record to be sorted will contain one key based on which the record will be sorted.
For example, suppose we have a record of students, every such record will have data like Roll
number, name and percentage.
In the above record set, we can sort the records in ascending or descending order based on the key, i.e., roll number. If we wish to search for a student with roll no. 54, we don't need to search the complete record set; we simply search among the students with roll nos. 50 to 60, thus saving a lot of time.
Some examples of sorting in real-life scenarios are as follows:
1. Telephone Directory: A telephone directory keeps people's telephone numbers sorted by their names, so that names can be searched very easily.
2. Dictionary: A dictionary contains words in alphabetical order, so that searching for any word becomes easy.
Before studying any sorting algorithms, it is necessary to know about the two main operations involved:
1. Comparison: Two values are compared with each other according to the sorting criteria.
2. Exchange or Swapping: After two values are compared, they are exchanged with each other if required.
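The two operations can be sketched in C (the helper names below are ours, for illustration):

```c
/* Exchange (swapping): swap two integers through pointers. */
void swap(int *a, int *b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

/* Comparison followed by a conditional exchange -- the basic step of most
   simple sorting algorithms. Returns 1 if a swap was performed. */
int compare_and_swap(int *a, int *b)
{
    if (*a > *b) {     /* comparison against the ascending-order criterion */
        swap(a, b);    /* exchange when the pair is out of order */
        return 1;
    }
    return 0;
}
```

For example, with x = 45 and y = 2, calling compare_and_swap(&x, &y) leaves x = 2 and y = 45.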
A sorting algorithm arranges the elements in a particular order. An efficient sorting algorithm is important for optimizing the use of other algorithms (such as search and merge algorithms) that require sorted lists to work correctly.
More importantly, the output must satisfy two conditions:
1. The output is in non-decreasing order (each element is not smaller than the previous
element according to the desired total order).
For example, consider the following set of elements to be sorted: 45, 76, 2, 56, 89, 4
As per the first condition, the output of a sorting algorithm must be in non-decreasing
order i.e., 2, 4, 45, 56, 76, 89
2. The output is a permutation, or reordering, of the input.
The output must always be a reordering of the same elements to be sorted; you cannot add, delete or replace any element of the set.
As per this second condition the sorted list will be: 2, 4, 45, 56, 76, 89
Right from the beginning, the sorting problem has attracted a great deal of research, perhaps because of the difficulty of solving it efficiently despite its simple, familiar statement. Although many consider it a solved problem, useful new sorting algorithms are still being invented (for example, library sort was first published in 2004). Sorting algorithms are prevalent in introductory computer science classes, where the abundance of algorithms for the problem provides a gentle introduction to a variety of core algorithmic concepts: big-O notation, data structures, divide-and-conquer algorithms, randomized algorithms, best-, worst- and average-case analysis, time-space trade-offs, and lower bounds.
In this chapter, we will look at basics of sorting and sorting techniques like bubble sort,
selection sort, insertion sort, merge sort and quick sort in detail.
2.2.2 Basics of Sorting
Data can be sorted either in ascending (increasing) or in descending (decreasing) order. If the order is not mentioned, ascending order is assumed. In this chapter sorting is done in ascending order; the algorithms can be made to work for descending order with simple modifications.
Sort Stability
Sort stability comes into the picture when the key on which the data is being sorted is not unique for each record, i.e., when two or more records have identical keys. For example, consider a list of records where each record contains the name and age of a person. Take the name as the sort key and sort all the records by name, as shown in table 2.1.1.
Table 2.1.1: Unsorted List

    Name      Age
    Vineet    25
    Amit      37
    Deepa     67
    Shriya    45
    Deepa     20
    Kiran     18
    Deepa     56

Table 2.1.2: Sorted-Unstable List

    Name      Age
    Amit      37
    Deepa     56
    Deepa     67
    Deepa     20
    Kiran     18
    Shriya    45
    Vineet    25
Table 2.1.3: Sorted-Unstable List

    Name      Age
    Amit      37
    Deepa     20
    Deepa     56
    Deepa     67
    Kiran     18
    Shriya    45
    Vineet    25

Table 2.1.4: Sorted-Stable List

    Name      Age
    Amit      37
    Deepa     67
    Deepa     20
    Deepa     56
    Kiran     18
    Shriya    45
    Vineet    25
Any sorting algorithm would place (Amit, 37) in the 1st position, (Kiran, 18) in the 5th position, (Shriya, 45) in the 6th position and (Vineet, 25) in the 7th position. The records with identical keys (names) are (Deepa, 67), (Deepa, 20) and (Deepa, 56); any sorting algorithm would place them in adjacent locations, i.e., the 2nd, 3rd and 4th positions, but not necessarily in the same relative order.
A sorting algorithm is said to be stable if it maintains the relative order of duplicate keys in the sorted output, i.e., if the keys are equal then their relative order in the sorted output is the same as in the input. For example, suppose the records Ri and Rj have equal keys and Ri precedes Rj in the input data; if the sort is stable, then Ri should also precede Rj in the sorted output. If the sort is not stable, Ri and Rj may appear in either order, so duplicate keys may occur in any order in the sorted output.
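Stability can be demonstrated concretely with a sketch (the struct and function names below are ours): an insertion sort by name on the records of table 2.1.1. Because the inner loop shifts only records whose key is strictly greater than the key being inserted, equal keys never move past each other, so the sort is stable.

```c
#include <string.h>

struct Record {
    const char *name;   /* sort key */
    int age;
};

/* Insertion sort by name. The strict > comparison means records with equal
   names are never reordered relative to each other, so the sort is stable. */
void sort_by_name(struct Record r[], int n)
{
    for (int i = 1; i < n; i++) {
        struct Record key = r[i];
        int j = i - 1;
        while (j >= 0 && strcmp(r[j].name, key.name) > 0) {
            r[j + 1] = r[j];    /* shift strictly greater keys right */
            j--;
        }
        r[j + 1] = key;
    }
}
```

Sorting the records of table 2.1.1 with this function yields exactly the order of table 2.1.4: the three "Deepa" entries keep their original relative order of ages 67, 20, 56.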
Sort efficiency
Sorting is an important and frequent operation in many applications, so the aim is not only to get sorted data but to get it in the most efficient manner. Many algorithms have therefore been developed for sorting, and to decide which one to use we need to compare them using some parameters.
The choice is made using these three parameters:
1. Coding time: Coding time is the time taken to write the program that implements a particular sorting algorithm, and it depends on the algorithm chosen. For example, a simple sorting program like bubble sort requires little coding time, whereas heap sort consumes more.
2. Space requirement: This is the space required to store the executable program, constants, variables, etc. For example, the program below is a simple sort whose space requirement can be inspected; on Unix-like systems, running the size command on the compiled executable displays the space required for the text, data and other segments.
/* Program to sort a small array and report CPU and wall-clock time;
   run the size command on the compiled executable to see its space usage. */
#include <stdio.h>
#include <time.h>

int main(int argc, char *argv[])
{
    time_t start, stop;
    clock_t ticks;

    time(&start);

    int array[4] = {3, 67, 2, 64}, c, d, temp;
    /* Bubble sort */
    for (c = 0; c < (4 - 1); c++)
    {
        for (d = 0; d < 4 - c - 1; d++)
        {
            if (array[d] > array[d + 1])
            {
                temp = array[d];
                array[d] = array[d + 1];
                array[d + 1] = temp;
            }
        }
    }

    printf("Sorted list in ascending order:\n");
    for (c = 0; c < 4; c++)
        printf("%d\n", array[c]);

    int i = 0;
    while (i < 50000)
    {
        i++;
        ticks = clock();
    }

    time(&stop);
    printf("Used %0.2f seconds of CPU time. \n", (double)ticks / CLOCKS_PER_SEC);
    printf("Finished in about %.0f seconds. \n", difftime(stop, start));
    return 0;
}
Output:
3. Run time or execution time: This is the time taken to successfully execute a sorting algorithm and obtain a sorted list of elements. For example, the program below demonstrates how to calculate the total execution time of a sorting program using the header file <time.h>.
/* Program to find the execution time of a simple sort */
#include <stdio.h>
#include <time.h>

int main(int argc, char *argv[])
{
    time_t start, stop;
    clock_t ticks;

    time(&start);

    int array[4] = {2, 67, 12, 23}, i, j, temp;
    /* Bubble sort */
    for (i = 0; i < 4; i++)
    {
        for (j = 0; j < 4 - i - 1; j++)
        {
            if (array[j] > array[j + 1])
            {
                temp = array[j];
                array[j] = array[j + 1];
                array[j + 1] = temp;
            }
        }
    }

    printf("Sorted array is:\n");
    for (i = 0; i < 4; i++)
        printf("%d\n", array[i]);

    int k = 0;
    while (k < 50000)
    {
        k++;
        ticks = clock();
    }

    time(&stop);
    printf("Used %0.2f seconds of CPU time. \n", (double)ticks / CLOCKS_PER_SEC);
    printf("Finished in about %.0f seconds. \n", difftime(stop, start));
    return 0;
}
Output:
If the amount of data is small and sorting is needed only occasionally, then any simple sorting technique can be used: in such cases a simple or less efficient technique performs at par with the complex techniques developed to minimize run time and space requirements, so it is pointless to search for and apply a complex algorithm.
Running time can be defined as the total time taken by the sorting program to run to completion, and it is one of the most important factors in the implementation of algorithms. If the amount of data to be sorted is large, it is crucial to minimize the run time by choosing an efficient technique.
The two basic operations in sorting are comparing and moving records. The record moves and other operations are generally a constant factor of the number of comparisons, and record moves can often be reduced considerably, so run time is measured by considering only the comparisons. Since calculating the exact number of comparisons is not always possible, an approximation is given in big-O notation. Thus the run-time efficiency of each algorithm is expressed in O notation; the efficiency of most sorting algorithms lies between O(n log n) and O(n^2).
Self-assessment Questions
1) The technique used for arranging data elements in a specific order is called ____________.
a) Arranging
b) Filtering
c) Sorting
d) Distributing
2) The time required to complete the execution of a sorting program is called ____________.
a) Coding Time
b) Average Time
c) Running Time
d) Total Time
3) A sorting technique is called stable if it _______.
a) Takes O(nlogn) times
b) Maintains the relative order of occurrence of non-distinct elements
c) Uses divide-and-conquer paradigm
d) Takes O(n) space
2.2.3 Sorting Techniques
Sorting techniques depend on two important parameters. The first parameter is the execution time of the program, meaning the time taken to run it. The second parameter is the space, meaning the memory used by the program. The algorithm you choose should be efficient in terms of both execution time and space usage.
There are many techniques for sorting. For example, Bubble sort, Selection sort, merge sort
etc. The choice of sorting algorithm depends upon the particular situation.
In-place sorting and Not-in-place sorting
Sorting algorithms may require some extra space for comparison of elements and temporary storage of a few data elements.
A sorting algorithm that does not require any extra space, where the sorting happens within the original array, is called in-place sorting. Bubble sort is an example of in-place sorting; other in-place sorting algorithms include selection sort, insertion sort, heap sort, and Shell sort.
Some sorting algorithms, however, require space greater than or equal to the number of elements being sorted. Sorting that uses equal or more space for temporary storage is called not-in-place sorting. Such algorithms sometimes require arrays to be separated and sorted. Merge sort is an example of not-in-place sorting.
To understand the more complex and efficient sorting algorithms, it is important to first understand the simpler but slower algorithms. This topic deals with bubble sort, insertion sort, selection sort, merge sort and quick sort. Any of these sorting algorithms is good enough for most small tasks.
(i) The Bubble Sort
Bubble Sort is an algorithm used to sort N elements held in memory, for example an array with N elements. Bubble sort compares the elements one by one and sorts them based on their values.
The bubble sort makes multiple passes through a list. It compares adjacent items and
exchanges those that are out of order. Each pass through the list places the next largest value
in its proper place. In essence, each item “bubbles” up to the location where it belongs.
Sorting takes place by stepping through all the data items one-by-one in pairs and comparing
adjacent data items and swapping each pair that is out of order.
Fig 2.2.2 shows the first pass of a bubble sort. The shaded items are being compared to see if
they are out of order. If there are n items in the list, then there are n−1 pairs of items that
need to be compared on the first pass. It is important to note that once the largest value in the
list is part of a pair, it will continually be moved along until the pass is complete.
Figure 2.2.2: First pass of Bubble sort
At the start of the second pass as shown in the below Figure 2.2.3, the largest value is now in
place. There are n−1 items left to sort, meaning that there will be n−2 pairs. Since each pass
places the next largest value in place, the total number of passes necessary will be n−1. After
completing the n−1 passes, the smallest item must be in the correct position with no further
processing required.
The exchange operation is sometimes called a “swap”. Typically, swapping two elements in a
list requires a temporary storage location (an additional memory location).
A code fragment such as:
temp = alist[i]
alist[i] = alist[j]
alist[j] = temp
will exchange the ith and jth items in the list. Without the temporary storage, one of the values
would be overwritten.
Figure 2.2.3: Second pass of Bubble sort
Below is the code to implement Bubble sort.
/* Implementation of Bubble sort Algorithm */
#include <stdio.h>
int main()
{
    int arr[300], n, i, j, swap;
    printf("Enter number of elements:\n");
    scanf("%d", &n);
    printf("Enter those %d elements\n", n);
    for (i = 0; i < n; i++)
        scanf("%d", &arr[i]);
    for (i = 0; i < n - 1; i++)
    {
        for (j = 0; j < n - i - 1; j++)
        {
            if (arr[j] > arr[j + 1])
            {
                swap = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = swap;
            }
        }
    }
    printf("After sorting using bubble sort, the elements are:\n");
    for (i = 0; i < n; i++)
        printf("%d\n", arr[i]);
    return 0;
}
Output:
Complexity Analysis of Bubble Sorting
Worst case Time Complexity:
The bubble sort algorithm will sort an array of n elements as given below.
1st Pass: n-1 Comparisons and n-1 swaps
2nd Pass: n-2 Comparisons and n-2 swaps
....
(n-1)th Pass: 1 comparison and 1 swap.
Altogether: c((n-1) + (n-2) + ... + 1), where c is the time required to do one comparison and one swap,
i.e., (n-1) + (n-2) + (n-3) + ... + 3 + 2 + 1.
Sum of the above series = n(n-1)/2
Sum = O(n2)
Hence the worst time complexity of Bubble Sort is O(n2).
Space complexity:
Bubble Sort has Space complexity of O(1), because only one additional memory space is
required for temp variable.
Best-case Time Complexity:
The best-case time complexity is O(n), when the given list of elements is already sorted.
Bubble sort is considered one of the most inefficient sorting methods: it exchanges elements before the final location of an element is known, spending extra time on these exchange operations. However, because bubble sort passes through the entire unsorted portion of the list, it can detect an already-sorted list early: if a pass makes no exchanges, the sort can stop.
(ii) Insertion Sort
Consider a contiguous list. In this case, it is necessary to move entries in the list to make room
for the insertion. To find the position where the insertion is to be made, we must search. One
method for performing ordered insertion into a contiguous list is first to do a binary search to
find the correct location, and then move the entries as required and insert the new entry.
Since so much time is needed to move entries no matter how the search is done, it turns out
in many cases to be just as fast to use sequential search as binary search. By doing sequential
search from the end of the list, the search and the movement of entries can be combined in a
single loop, thereby reducing the overhead required in the function.
Following are some of the important characteristics of Insertion Sort:
• It has one of the simplest implementations.
• It is efficient for smaller data sets, but very inefficient for larger lists.
• It is adaptive: it reduces its total number of steps when given a partially sorted list, which increases its efficiency.
• It is better than the Selection Sort and Bubble Sort algorithms.
• Its space complexity is low, like bubble sort: insertion sort requires only a single additional memory location.
• It is stable, as it does not change the relative order of elements with equal keys.
The procedure for insertion sort when elements are equal is demonstrated in Figure 2.2.4.
Figure 2.2.4: Insertion sort with equal elements
The working of the insertion sort algorithm with an example is depicted in Figure 2.2.5.
How Insertion Sorting Works
Figure 2.2.5: Working of Insertion Sort Algorithm
Pseudocode:
void insertionSort(int arr[], int length) {
    int i, j, tmp;
    for (i = 1; i < length; i++) {
        j = i;
        while (j > 0 && arr[j - 1] > arr[j]) {
            tmp = arr[j];
            arr[j] = arr[j - 1];
            arr[j - 1] = tmp;
            j--;
        }
    }
}
Program:
/* Implementation of Insertion sort Algorithm */
#include <stdio.h>
int main()
{
    int arr[500], n, i, j, temp;
    printf("Enter the total number of elements:");
    scanf("%d", &n);
    printf("Enter those %d Elements : \n", n);
    for (i = 0; i < n; i++)
    {
        scanf("%d", &arr[i]);
    }
    for (i = 1; i < n; i++)
    {
        temp = arr[i];
        j = i - 1;
        /* test j >= 0 first so arr[j] is never read out of bounds */
        while ((j >= 0) && (temp < arr[j]))
        {
            arr[j + 1] = arr[j];
            j--;
        }
        arr[j + 1] = temp;
    }
    printf("After sorting using insertion sort, the elements are: \n");
    for (i = 0; i < n; i++)
    {
        printf("%d\n", arr[i]);
    }
    return 0;
}
Output:
Complexity Analysis of Insertion Sort
The analysis is the same as for bubble sort.
Worst Case Time Complexity: O(n2)
Best Case Time Complexity: O(n)
Average Time Complexity: O(n2)
Space Complexity: O(1)
(iii) Selection Sort
Insertion sort has one major disadvantage. Even after most entries have been sorted properly
into the first part of the list, the insertion of a later entry may require that many of them be
moved. All the moves made by insertion sort are moves of only one position at a time. Thus
to move an entry 20 positions up the list requires 20 separate moves. If the entries are small,
perhaps a key alone, or if the entries are in linked storage, then the many moves may not
require excessive time. In case the entries are very large, records containing hundreds of
components like personnel files or student transcripts, and the records must be kept in
contiguous storage. In such cases it would be far more efficient if, when it is necessary to
move an entry, it could be moved immediately to its final position. Selection sort method
accomplishes this goal.
How Selection Sorting Works
Consider an array of n elements. The selection sort algorithm starts by comparing the first two elements of the array and swaps them if required.
For example, to sort the elements in ascending order: if the first element is greater than the second element, the two are swapped; if the first element is smaller than the second element, they are left as they are.
Then the first and third elements are compared and swapped if required. This process continues until the first and last elements of the array have been compared, completing the first pass of selection sort.
In this algorithm, after the first pass the required element is already placed at its final position. The second pass therefore starts from the second element of the array, and the procedure is repeated n-1 times in total.
For sorting in ascending order, the smallest element ends up first; for sorting in descending order, the largest element ends up first. This process continues until all elements in the array are sorted.
Below figure 2.2.6 demonstrates how selection sort algorithm works:
Figure 2.2.6: Working of selection sort with example
In the first pass, 2 is found to be the smallest. Hence, it is placed in the first position. In the
second pass 10 is found to be the smallest and placed at 2nd position and so on until the full
list is sorted.
Sorting using Selection Sort Algorithm
void selectionSort(int a[], int size)
{
    int i, j, min, temp;
    for (i = 0; i < size - 1; i++)
    {
        min = i;                 /* setting min as i */
        for (j = i + 1; j < size; j++)
        {
            if (a[j] < a[min])   /* if element at j is less than element at min position */
            {
                min = j;         /* then set min as j */
            }
        }
        temp = a[i];
        a[i] = a[min];
        a[min] = temp;
    }
}
Selection sort algorithm implementation in C
/* Implementation of Selection sort Algorithm */
#include <stdio.h>
int main()
{
    int arr[200], i, j, n, t, min, pos;
    printf("Enter the total number of elements:");
    scanf("%d", &n);
    printf("Enter those %d elements:\n", n);
    for (i = 0; i < n; i++)
        scanf("%d", &arr[i]);
    for (i = 0; i < n - 1; i++)
    {
        min = arr[i];
        pos = i;
        for (j = i + 1; j < n; j++)
        {
            if (min > arr[j])   /* compare values */
            {
                min = arr[j];
                pos = j;
            }
        }
        t = arr[i];             /* swap the values */
        arr[i] = arr[pos];
        arr[pos] = t;
    }
    printf("\nAfter sorting using selection sort, the elements are:\n");
    for (i = 0; i < n; i++)
        printf("%d \n", arr[i]);
    return 0;
}
Output:
Complexity Analysis of Selection Sorting
Worst Case Time Complexity: O(n2)
Best Case Time Complexity: O(n2)
Average Time Complexity: O(n2)
Space Complexity: O(1)
Did you Know?
The worst-case time complexity of bubble sort, selection sort and insertion sort is O(n2).
(iv) Merge Sort
Merge sort is a fine example of a recursive algorithm.
The fundamental operation in this algorithm is merging two sorted lists. This can be done in
one pass through the input, if the output is put in a third list because the lists are sorted.
Merge sort is a sorting technique based on divide and conquer technique. Merge sort first
divides the array into equal halves and then combines them in a sorted manner.
The basic merging algorithm takes two input arrays A and B, an output array C, and three
counters, aptr, bptr, and cptr, which are initially set to the beginning of their respective
arrays. The smaller of A[aptr] and B[bptr] is copied to the next entry in C, and the
appropriate counters are advanced. When either input list is exhausted, the remainder of the
other list is copied to C. An example of how the merge routine works is provided for the
following input.
If array A contains 1, 13, 24, 26, and B contains 2, 15, 27, 38, then the algorithm proceeds as follows: first, a comparison is done between 1 and 2; 1 is added to C, and then 13 and 2 are compared.
2 is added to C, and then 13 and 15 are compared.
13 is added to C, and then 24 and 15 are compared. This proceeds until 26 and 27 are
compared.
26 is added to C, and the A array is exhausted.
The remainder of the B array is then copied to C.
The time to merge two sorted lists is clearly linear, because at most n - 1 comparisons are made, where n is the total number of elements. Note that every comparison adds an element to C, except the last comparison, which adds at least two.
The merge sort algorithm is therefore easy to describe. If n = 1, there is only one element to
sort, and the answer is at hand. Otherwise, recursively merge sort the first half and the second
half. This gives two sorted halves, which can then be merged together using the merging
algorithm described above. For instance, to sort the eight-element array 24, 13, 26, 1, 2, 27, 38,
15, recursively sort the first four and last four elements, obtaining 1, 13, 24, 26, 2, 15, 27, 38.
Then merge the two halves as above, obtaining the final list 1, 2, 13, 15, 24, 26, 27, 38. This
algorithm is a classic divide-and-conquer strategy. The problem is divided into smaller
problems and solved recursively. The conquering phase consists of patching together the
answers. Divide-and-conquer is a very powerful use of recursion that will be seen many times.
Algorithm
Merge sort keeps on dividing the list into equal halves until it can no more be divided. By
definition, if it is only one element in the list, it is sorted. Then merge sort combines smaller
sorted lists keeping the new list sorted too.
Step 1 − if it is only one element in the list, it is already sorted, return.
Step 2 − divide the list recursively into two halves until it can no more be divided.
Step 3 − merge the smaller lists into new list in sorted order.
Pseudocode
We shall now see the pseudocode for the merge-sort functions. As the algorithm above points out, there are two main functions: divide and merge. Merge sort works with recursion, and our implementation follows the same pattern.
procedure mergesort( var a as array )
    if ( n == 1 ) return a

    var l1 as array = a[0] ... a[n/2]
    var l2 as array = a[n/2+1] ... a[n]

    l1 = mergesort( l1 )
    l2 = mergesort( l2 )

    return merge( l1, l2 )
end procedure
procedure merge( var a as array, var b as array )
    var c as array

    while ( a and b have elements )
        if ( a[0] > b[0] )
            add b[0] to the end of c
            remove b[0] from b
        else
            add a[0] to the end of c
            remove a[0] from a
        end if
    end while

    while ( a has elements )
        add a[0] to the end of c
        remove a[0] from a
    end while

    while ( b has elements )
        add b[0] to the end of c
        remove b[0] from b
    end while

    return c
end procedure
/* Implementation of Merge sort Algorithm */
#include <stdio.h>
int arr[20], i, n, b[20];

void merge(int arr[], int low, int m, int high)
{
    int h, i, j, k;
    h = low;
    i = low;
    j = m + 1;
    while (h <= m && j <= high)
    {
        if (arr[h] <= arr[j])
            b[i] = arr[h++];
        else
            b[i] = arr[j++];
        i++;
    }
    if (h > m)
        for (k = j; k <= high; k++)
            b[i++] = arr[k];
    else
        for (k = h; k <= m; k++)
            b[i++] = arr[k];
    for (k = low; k <= high; k++)
    {
        arr[k] = b[k];
    }
}

void mergesort(int arr[], int i, int j)
{
    int m;
    if (i < j)
    {
        m = (i + j) / 2;
        mergesort(arr, i, m);
        mergesort(arr, m + 1, j);
        merge(arr, i, m, j);
    }
}

int main()
{
    printf("\nEnter the number of elements:");
    scanf("%d", &n);
    printf("Enter those %d elements:", n);
    for (i = 0; i < n; i++)
        scanf("%d", &arr[i]);
    mergesort(arr, 0, n - 1);
    printf("\nAfter sorting using merge sort, the elements are: ");
    for (i = 0; i < n; i++)
        printf("%d\n", arr[i]);
    return 0;
}
Output:
The output of the program should be as follows:
Analysis of Merge Sort
Merge sort algorithm uses a divide and conquer strategy. This is a recursive algorithm that
continuously divides the list of elements into 2 parts.
Case 1: If the List is empty or if it has only single item, then the list is already sorted. This will
be considered as the Best Case.
Case 2: If the List contains N number of elements, the algorithm will divide the list into half
and perform merge sorting individually on both halves. Once both parts are sorted, a merge
operation is performed to combine the already sorted smaller parts.
Worst Case Time Complexity:
In the worst case, a comparison is required at every step, because in every merge step one value remains in the opposing list, so the algorithm must keep comparing elements of the two opposing lists.
The complexity of worst-case merge sort is:
T(N) = 2T(N/2) + N - 1    (Equation 1)
where T(N) is the total number of comparisons between the elements in a list and N refers to the total number of elements in the list. 2T(N/2) shows that merge sort is performed on the two halves of the list during the divide stage, and N - 1 represents the total comparisons in the merge stage.
The merge sort procedure is recursive, so the recurrence is expanded by substitution:
T(N) = 2[2T(N/4) + N/2 - 1] + N - 1    (Equation 2)
T(N) = 4[2T(N/8) + N/4 - 1] + 2N - 3    (Equation 3)
These equations represent the substitutions made during the recursion. Expanding Equation 3 gives
T(N) = 8T(N/8) + N + N + N - 4 - 2 - 1    (Equation 4)
which is the state at the third recursive call.
Let k denote the depth of the recursion. Recursion stops when the list contains only one element. In general we get
T(N) = 2^k T(N/2^k) + kN - (2^k - 1)    (Equation 5)
The dividing continues until the list contains a single element, and a list with a single element is already sorted, so
T(1) = 0    (Equation 6)
The recursion bottoms out when
2^k = N    (Equation 7)
k = log2 N    (Equation 8)
Substituting Equations 6 to 8 into Equation 5 gives
T(N) = N log2 N - N + 1    (Equation 9)
Hence, the worst-case time complexity of the merge sort algorithm is O(N log N).
Best Case Time Complexity:
The best case occurs when, for every merge step, the largest element of one sorted part is smaller than the first element of its opposing part. Then only one element from the opposing list is compared, reducing the number of comparisons in each merge step to about N/2.
Hence the best-case time complexity is also O(N log N), because the merging is always linear. The same holds for the average-case time complexity.
(v) Quick Sort
Even though the time complexity of the merge sort algorithm is O(n log n), it is sometimes undesirable because it consumes more space: it needs extra space to merge the array partitions. Quick sort is one of the fastest sorting algorithms.
The quick sort algorithm also uses the divide-and-conquer rule, but sorts the elements without using additional storage.
A quick sort algorithm first selects a value, which is called the pivot value. This algorithm will
partition all elements based on whether they are smaller than or greater than the pivot
element.
Thus we get two partitions: One partition having elements larger than the pivot element and
another partition having elements smaller than the pivot element. The selected pivot element
ends up in its final sorted position.
Thus, the elements to the right and left of the pivot element can be sorted independently, so we can again implement a recursive algorithm using the divide-and-conquer approach. All the partitioned elements remain in the same array, saving the space in which they would otherwise have to be combined.
For sorting the elements we have to use a recursive function. We have to pass both the
partitions of array along with the pivot element to this function as parameters.
Our prior sorting functions, however, have no parameters, so for consistency of notation the recursion is done in a function recursive_quick_sort that is invoked by the parameterless method quick_sort.
Quick sort, as the name suggests, sorts a list very quickly. It is not a stable sort, but it is very fast and requires very little additional space. It is based on the divide-and-conquer rule, and is also called partition-exchange sort.
This algorithm divides the list into three main parts:
1. Elements less than the Pivot element
2. Pivot element
3. Elements greater than the pivot element
In the list of elements, mentioned in below example, we have taken 25 as pivot. So after the
first pass, the list will be changed like this.
6 8 17 14 25 63 37 52
Hence, after the first pass, pivot will be set at its position, with all the elements smaller to it on
its left and all the elements larger than it on the right. Now 6 8 17 14 and 63 37 52 are
considered as two separate lists, and same logic is applied on them, and we keep doing this
until the complete list is sorted.
The working of Quick sort algorithm is shown in figure 2.2.7.
How Quick Sorting Works
Figure 2.2.7: Divide and Conquer-Quick Sort
QuickSort Pivot Algorithm
Based on our understanding of partitioning in quicksort, we should now try to write an
algorithm for it here.
Step 1 − Choose the highest index value as the pivot
Step 2 − Take two variables to point left and right of the list, excluding the pivot
Step 3 − left points to the low index
Step 4 − right points to the high index
Step 5 − while the value at left is less than the pivot, move right
Step 6 − while the value at right is greater than the pivot, move left
Step 7 − if neither step 5 nor step 6 matches, swap left and right
Step 8 − if left ≥ right, the point where they meet is the pivot's new position
QuickSort Pivot Pseudocode
The pseudocode for the above algorithm can be derived as −
function partitionFunc(left, right, pivot)
    leftPointer = left - 1
    rightPointer = right

    while True do
        while A[++leftPointer] < pivot do
            // do nothing
        end while

        while rightPointer > 0 && A[--rightPointer] > pivot do
            // do nothing
        end while

        if leftPointer >= rightPointer
            break
        else
            swap leftPointer, rightPointer
        end if
    end while

    swap leftPointer, right
    return leftPointer
end function
QuickSort Algorithm
Using the pivot algorithm recursively, we end up with the smallest possible partitions. Each partition is then processed for quick sort. We define the recursive algorithm for quicksort as below −
Step 1 − Make the right-most index value pivot
Step 2 − partition the array using pivot value
Step 3 − quicksort left partition recursively
Step 4 − quicksort right partition recursively
QuickSort Pseudocode
To get more into it, let see the pseudocode for quick sort algorithm −
procedure quickSort(left, right)
    if right - left <= 0
        return
    else
        pivot = A[right]
        partition = partitionFunc(left, right, pivot)
        quickSort(left, partition - 1)
        quickSort(partition + 1, right)
    end if
end procedure
Sorting using Quick Sort Algorithm
/* a[] is the array, p is the starting index (0),
   and r is the last index of the array. */
void quicksort(int a[], int p, int r)
{
    if (p < r)
    {
        int q;
        q = partition(a, p, r);
        quicksort(a, p, q);
        quicksort(a, q + 1, r);
    }
}

/* Hoare partition: i and j always advance past elements on the
   wrong side before each test, so equal elements cannot cause
   an infinite swap loop. */
int partition(int a[], int p, int r)
{
    int i, j, pivot, temp;
    pivot = a[p];
    i = p - 1;
    j = r + 1;
    while (1)
    {
        do {
            i++;
        } while (a[i] < pivot);
        do {
            j--;
        } while (a[j] > pivot);
        if (i < j)
        {
            temp = a[i];
            a[i] = a[j];
            a[j] = temp;
        }
        else
        {
            return j;
        }
    }
}
/* Implementation of Quick sort Algorithm */
#include <stdio.h>
#include <stdbool.h>
#define MAX 7

int intArray[MAX] = {14, 6, 23, 12, 76, 49, 57};

void printline(int count)
{
    int i;
    for (i = 0; i < count - 1; i++)
    {
        printf("=");
    }
    printf("=\n");
}

void display()
{
    int i;
    printf("[");
    // navigate through all items
    for (i = 0; i < MAX; i++)
    {
        printf("%d ", intArray[i]);
    }
    printf("]\n");
}

void swap(int num1, int num2)
{
    int temp = intArray[num1];
    intArray[num1] = intArray[num2];
    intArray[num2] = temp;
}

int partition(int left, int right, int pivot)
{
    int leftPointer = left - 1;
    int rightPointer = right;
    while (true)
    {
        while (intArray[++leftPointer] < pivot)
        {
            // do nothing
        }
        while (rightPointer > 0 && intArray[--rightPointer] > pivot)
        {
            // do nothing
        }
        if (leftPointer >= rightPointer)
        {
            break;
        }
        else
        {
            printf(" item swapped :%d,%d\n",
                   intArray[leftPointer], intArray[rightPointer]);
            swap(leftPointer, rightPointer);
        }
    }
    printf(" pivot swapped :%d,%d\n", intArray[leftPointer], intArray[right]);
    swap(leftPointer, right);
    printf("Updated Array: ");
    display();
    return leftPointer;
}

void quickSort(int left, int right)
{
    if (right - left <= 0)
    {
        return;
    }
    else
    {
        int pivot = intArray[right];
        int partitionPoint = partition(left, right, pivot);
        quickSort(left, partitionPoint - 1);
        quickSort(partitionPoint + 1, right);
    }
}

int main()
{
    printf("\nBefore Sorting: ");
    display();
    printline(50);
    quickSort(0, MAX - 1);
    printf("\nAfter sorting using quick sort, the elements are: ");
    display();
    printline(50);
    return 0;
}
Output:
Complexity Analysis of Quick Sort
Worst Case Time Complexity: O(n2)
Best Case Time Complexity: O(n log n)
Average Time Complexity: O(n log n)
Space Complexity: O(log n) on average
• The space required by quick sort is very small: on average only O(log n) additional space for the recursion stack (O(n) in the worst case).
• Quick sort is not a stable sorting technique, so it might change the relative order of two equal elements in the list while sorting.
Analysis of Quick sort
To analyse the running time of quick sort, we use the same approach as we did for merge sort (as is common for many recursive algorithms, unless they are completely obvious).
Let T(n) represent the worst-case running time of the quick sort algorithm on an array of size n. To get hold of T(n), we look at the algorithm line by line. The call to partition takes time Θ(n), because it runs one linear scan through the array, plus some constant time. Then there are two recursive calls to quick sort.
Let k = m − 1 − l denote the size of the left subarray. Then the first recursive call takes time T(k), because it is a call on an array of size k. The second recursive call takes time T(n − 1 − k), because the size of the right subarray is n − 1 − k. Therefore, the total running time of quick sort satisfies the recurrence
T(n) = Θ(n) + T(k) + T(n − 1 − k),
T(1) = Θ(1).
This is quite a bit messier-looking than the recurrence for merge sort, and since we know nothing about k, solving this recurrence in general is not feasible. We can, however, explore different possible values of k.
1. For k = n/2, the recurrence becomes much simpler: T(n) = Θ(n) + T(n/2) + T(n/2 − 1), which, as we discussed in the context of merge sort, we can simplify to T(n) = Θ(n) + 2T(n/2). That is exactly the recurrence already solved for merge sort, so the running time of quick sort would be Θ(n log n).
2. At the other extreme is k = 0 (or, similarly, k = n − 1). Then we get T(n) = Θ(n) + T(0) + T(n − 1), and since T(0) = Θ(1), this recurrence becomes T(n) = Θ(n) + T(n − 1). It unrolls as T(n) = Θ(n) + Θ(n − 1) + Θ(n − 2) + ... + Θ(1), so T(n) = Θ(n2).
The running time for k = 0 or k = n − 1 is thus just as bad as for the simple algorithms, and in fact, for k = 0, quick sort is essentially the same as selection sort. Of course, this quadratic running time would not be a problem if the cases k = 0 and k = n − 1 did not appear in practice. But in fact they do: with the pivot choice we implemented, these cases happen whenever the array is already sorted (increasingly or decreasingly), which should actually be an easy case. They also happen when the array is nearly sorted.
This is quite likely in practice, for instance because the array may have been sorted earlier and then perturbed slightly by some new insertions.
Did you Know?
Quicksort (sometimes called partition-exchange sort) is an efficient sorting algorithm,
serving as a systematic method for placing the elements of an array in order. Developed by
Tony Hoare in 1959, with his work published in 1961, it is still a commonly used
algorithm for sorting. When implemented well, it can be about two or three times faster
than its main competitors, merge sort and heapsort.
Self-assessment Questions
4) Which of the following is an example of not in-place sorting algorithm?
a) Bubble Sort
b) Merge Sort
c) Selection Sort
d) Heap Sort
5) Sorting Algorithm that does not require any extra space for sorting is known as
________________.
a) In-Place Sorting
b) Out-Place Sorting
c) Not in-Place Sorting
d) Not Out-Place Sorting
6) Which of the following is not a stable sorting algorithm?
a) Insertion sort
b) Selection sort
c) Bubble sort
d) Merge sort
7) Running merge sort on an array of size n which is already sorted is,
a) O(nlogn)
b) O(n)
c) O(n2)
d) O(n3)
8) Merge sort uses,
a) Divide-and-conquer
b) Backtracking
c) Heuristic approach
d) Greedy approach
9) For merging two sorted lists of size m and n into sorted list of size m+n, we
require comparisons of:
a) O(m)
b) O(n)
c) O(m+n)
d) O(logm + logn)
10) Quick sort is also known as _____________.
a) Merge sort
b) Tree sort
c) Shell sort
d) Partition and exchange sort
Summary
o Bubble sort is a simple sorting algorithm. It compares the first two elements, and if
the first is greater than the second, then it swaps them. It continues doing this for
each pair of adjacent elements to the end of the data set, repeating until no swaps
have occurred on the last pass.
o Selection sort is an in-place comparison sort. It has O(n2) complexity, making it
inefficient on large lists, and generally performs worse than the similar insertion
sort.
o Insertion sort is a simple sorting algorithm that is relatively efficient for small lists
and mostly sorted lists, and often is used as part of more sophisticated algorithms. It
works by taking elements from the list one by one and inserting them in their
correct position into a new sorted list.
o Merge sort takes advantage of the ease of merging already sorted lists into a new
sorted list. It starts by comparing every two elements (i.e., 1 with 2, then 3 with 4...)
and swapping them if the first should come after the second. It then merges each of
the resulting lists.
o Quicksort is a divide and conquer algorithm which relies on a partition operation: to partition an array, an element called a pivot is selected. All elements smaller than the pivot are moved before it and all greater elements are moved after it.
Terminal Questions
1. Explain different types of Sorting Algorithms.
2. Write down the procedure for Bubble sort.
3. Explain the sorting technique based on divide and conquer policy and find its
time complexity.
4. Explain merge sort algorithm and find its time complexity.
Answer Keys
Self-assessment Questions
Question No. | Answer
1 | c
2 | c
3 | b
4 | b
5 | a
6 | b
7 | a
8 | a
9 | c
10 | d
Activity
1. Activity Type: Offline
Description:
1. Divide the class into 5 groups.
2. Assign an algorithm and list of numbers to each group
3. Students should sort the list using assigned algorithm.
Duration: 10 Minutes
Bibliography
e-Reference
• pages.cs.wisc.edu, (2016). Computer Sciences User Pages. Retrieved on 19 April 2016, from http://pages.cs.wisc.edu/~bobh/367/SORTING.html
External Resources
• Kruse, R. (2006). Data Structures and Program Design Using 'C' (2nd ed.). Pearson Education.
• Srivastava, S. K., & Srivastava, D. (2004). Data Structures Through C in Depth (2nd ed.). BPB Publications.
• Weiss, M. A. (2001). Data Structures and Algorithm Analysis in C (2nd ed.). Pearson Education.
Video Links
Introduction to Basics of sorting techniques: https://www.youtube.com/watch?v=pkkFqlG0Hds
Selection Sort: https://www.youtube.com/watch?v=LeNbr2ftWIo
Merge Sort: https://www.youtube.com/watch?v=TzeBrDU-JaY
MODULE - III
Stacks and Queues
MODULE 3
Stacks and Queues
Module Description
This module introduces two closely related data types for manipulating large collections of objects: the stack and the queue. Each of them is basically defined by two simple operations: insert (add a new item) and remove an item. When we add a data item, the intention is clear. When we remove an item, however, we must decide which one to choose. The rule used in the case of a queue is to always remove the item that has been in the queue the longest; this policy is known as first-in-first-out, or FIFO. The rule used in the case of a stack is to always remove the element that has been in the stack the least amount of time; this policy is known as last-in-first-out, or LIFO.
Chapter 3.1
Stacks
Chapter 3.2
Queue
Chapter Table of Contents
Chapter 3.1
Stacks
Aim ..................................................................................................................................................... 139
Instructional Objectives................................................................................................................... 139
Learning Outcomes .......................................................................................................................... 139
3.1.1 Introduction to Stack .............................................................................................................. 140
(i) Definition of a Stack........................................................................................................... 141
(ii) Array Representation of Stack ......................................................................................... 142
Self-assessment Questions ...................................................................................................... 144
3.1.2 Operations on Stack ................................................................................................................ 144
Self-assessment Questions ...................................................................................................... 149
3.1.3 Polish Notations ...................................................................................................................... 149
(i) Infix Notation ..................................................................................................................... 150
(ii) Prefix Notation .................................................................................................................. 151
(iii) Postfix Notation ............................................................................................................... 152
Self-assessment Questions ...................................................................................................... 155
3.1.4 Conversion of Arithmetic Expression from Infix to Postfix ............................................. 155
Self-assessment Questions ...................................................................................................... 159
3.1.5 Applications of Stack .............................................................................................................. 160
(i) Balancing Symbol ............................................................................................................... 160
(ii) Recursion............................................................................................................................ 161
(iii) Evaluation of Postfix Expression.................................................................................... 163
(iv) String Reversal .................................................................................................................. 163
Self-assessment Questions ...................................................................................................... 165
Summary ........................................................................................................................................... 166
Terminal Questions.......................................................................................................................... 167
Answer Keys...................................................................................................................................... 168
Activity............................................................................................................................................... 169
Case Study ......................................................................................................................................... 170
Bibliography ...................................................................................................................................... 171
e-References ...................................................................................................................................... 171
External Resources ........................................................................................................................... 171
Video Links ....................................................................................................................................... 171
Aim
To educate and equip the students with the skills and techniques of stacks

Instructional Objectives
After completing this chapter, you should be able to:
• Outline the basic features of stack
• Describe the array representation of stack
• Explain the Polish notations with examples
• Discuss the evaluation of postfix expression using stack
• Explain the steps to convert an infix expression to a postfix expression and vice versa
• Outline the applications of stacks

Learning Outcomes
At the end of this chapter, you are expected to:
• Explain operations on stack
• Convert given infix expressions to prefix and postfix expressions
• Explain string reversal and recursion applications of stack
• Compute a given postfix expression using a stack
• Convert any given infix expression to its prefix expression
3.1.1 Introduction to Stack
In this chapter we introduce the stack as a limited-access data structure used for manipulating arbitrarily large collections of data. A stack is a data structure in which objects are maintained in a particular order. This chapter also explains how to operate on a stack: it demonstrates the operations for creating a stack, adding elements to a stack, deleting an element from a stack, and so on.
Some problems have solutions that require the associated data to be arranged or organized as a linear list of data elements in which operations are permitted at only one end of the list. The best and simplest examples are a set of books kept one on top of another, a deck of playing cards, a pile of pancakes, stacked laundry, plates stacked one above another, and so on. Here we group things together by placing one thing on top of another, and we must then remove things from top to bottom, one at a time. Figure 3.1.1 below shows a set of books represented as a stack.
Figure 3.1.1: Picture Representing a Stack
It is interesting that something so simple is a critical part of nearly every program ever written. Nested function calls in a running program, conversion of the infix form of an expression to an equivalent postfix or prefix form, computing the factorial of a number, and so on can all be accurately formulated using this simple technique. In all of these cases, it is clear that the item which entered the list most recently is the first one to be operated on. The solution to these types of problems is based on the principle Last-In-First-Out (LIFO), or equivalently First-In-Last-Out (FILO). A logical structure that organizes data and performs operations on the LIFO (FILO) principle is termed a stack.
(i) Definition of a Stack
A stack is an ordered list of similar data items in which operations such as insertion and deletion are permitted at only one end, called the top of the stack. It is a linear data structure in which operations on data objects follow the Last-In-First-Out (or First-In-Last-Out) principle.
More formally, a stack can be defined as an abstract data type with a domain of data objects and a set of functions that can be performed on those objects, guided by a list of axioms.
Some of the important functions used while operating on stacks are listed below:
1. Create-Stack() - Used for allocating memory for a new, empty stack
2. Isempty(S) - Used for checking if the stack is empty; returns a Boolean
3. Isfull(S) - Used for checking if the stack is full; also returns a Boolean
4. Push(S, e) - Used to add an element e on top of the stack
5. Pop(S) - Used to remove the element at the top of the stack
6. Top(S) - Used to read the topmost element of the stack
Also, some axioms need to be known while we operate on stacks. Following is a list of axioms which a programmer must know:
• Isempty(Create-Stack()): always returns true
• Isfull(Create-Stack()): always returns false
• Isempty(Push(S, e)): always returns false
• Isfull(Pop(S)): always returns false
• Top(Push(S, e)): the element e is returned
• Pop(Push(S, e)): the element e is removed from the stack
The detailed explanation and algorithms for implementing the above operations will be covered in the forthcoming sections. Figure 3.1.2 demonstrates push() and pop() operations performed on a stack.
As shown in the figure, initially the stack contains element 1. To push element 2, the stack pointer is incremented and then element 2 is stored, so the stack now contains two elements, 1 and 2.
In the second step, we push element 3 onto the stack. It is placed on top of 2, as the top of the stack points one location above 2. Similarly, elements 4, 5 and 6 are pushed onto the stack; after pushing element 6 the stack contains six elements in total.
In the second part of the figure, pop instructions are executed. The first element read out is 6, as it is on the top of the stack, and the stack pointer is decremented. The next pop instruction removes element 5, and so on, until the last element, 1, is removed.
Figure 3.1.2: Push() and Pop() Operations
(ii) Array Representation of Stack
As we know, a stack is a data structure designed to store a collection of data where elements can be added and removed from only one end. Since an array is a collection of similar kinds of elements, we can implement a stack using a simple one-dimensional array very easily.
For example, we can declare an array named stack[] to store all the data elements of a stack. Normally, elements in a linear array can be accessed in any random order by using the array name and an index. But a stack operates from only one end. Thus, when a stack is implemented as an array, we must allow insertion and deletion of elements from only one end of the array.
A variable named "top" keeps track of the position of the topmost element in the stack. This variable is also called the stack pointer.
Initially the value of "top" is set to -1, as the stack is empty. When we push an element onto the stack, we increment the stack pointer by one and then insert the element at the position the stack pointer indicates. For every push operation we must check whether the stack pointer has reached the maximum size of the array stack[].
Similarly, when we perform a pop operation, the stack pointer is decremented by 1. We must also check whether the stack array is empty before popping.
Figure 3.1.3 demonstrates the array representation of a stack.
Figure 3.1.3: Array Representation of Stack
As shown in the above figure, an array named S of size 7 is declared to act as a stack, so we can store a total of 7 elements in it.
In part (a) of the figure, after adding elements 15, 6, 2, and 9, the stack pointer points to location 4.
In part (b), we have pushed two more elements, 17 and 3, making the stack pointer point to location 6.
In part (c), a pop operation is carried out on the array, causing the last element, 3, to be popped out. Thus the stack pointer is decremented by 1 after the pop operation.
Self-assessment Questions
1) A stack works on the principle of ___________
a) First in first out (FIFO)
b) Last in last out (LILO)
c) First in last out (FILO)
d) Cyclical data structures
2) The difference between a linear array and a stack is that in a linear array any element can be accessed randomly.
a) True
b) False
3) Top(S) returns ________________
a) Stack bottom
b) Stack top
c) Stack mid
d) Any random element
3.1.2 Operations on Stack
The operations mentioned in the topics above are explained in detail in this section. Basically, stack operations include initializing a stack, using it to store data in different applications, and de-initializing it. Apart from these basics, a stack supports the following two primary operations:
1. Push() – storing a data item onto the stack
2. Pop() – deleting a data item from the stack
Consider the operation of pushing data onto the stack. In order to use a stack most efficiently, we need to be aware of the status of the stack. For this purpose, the following functions are important:
1. stacktop() – returns the topmost element of the stack
2. isFull() – checks whether the stack is already full
3. isEmpty() – checks whether the stack is empty
Throughout, we must maintain a pointer to the most recently pushed data on the stack. This pointer always represents the top of the stack and hence is named top.
Before we implement the push() operation, we must first learn the procedure for these support functions.
Algorithm for stacktop() function
begin procedure stacktop
return stack[top]
end procedure
Implementation in C programming
int stacktop()
{
return stack[top];
}
Algorithm for isFull() function
begin procedure isfull
   if top equals MAXSIZE - 1
      return true
   else
      return false
   endif
end procedure
Implementation in C programming
bool isfull()
{
   if(top == MAXSIZE - 1)
      return true;
   else
      return false;
}
Algorithm for isempty() function
begin procedure isempty
   if top less than 0
      return true
   else
      return false
   endif
end procedure
Implementation in C programming
bool isempty()
{
   if(top == -1)
      return true;
   else
      return false;
}
Now, to get back to the push operation, we must first understand how the push() function works. The following steps are involved:
• Step 1: Check if the stack is full.
• Step 2: If the stack is full, display an error and exit.
• Step 3: If the stack is not full, increment top to point to the next empty space.
• Step 4: Add the element to the stack at the location top points to.
• Step 5: Return.
Figure 3.1.4: Push Operation on Stack
Note: If a linked list is used for the stack implementation, then memory space needs to be allocated in step 3.
Following is the algorithm for the push operation:
begin procedure push: stack, data
   if stack is full
      return null
   endif
   top ← top + 1
   stack[top] ← data
end procedure
And the corresponding C function is shown below:
void push(int data)
{
   if(!isfull())
   {
      top = top + 1;
      stack[top] = data;
   }
   else
   {
      printf("Could not insert data, Stack is full.\n");
   }
}
Now we move on to the pop operation. Accessing the data element at the top while removing it from the stack is called a pop operation. Following are the steps involved in popping an element from the stack:
• Step 1: Check if the stack is empty.
• Step 2: If the stack is empty, produce an error and exit.
• Step 3: If the stack is not empty, access the data element at which top is pointing.
• Step 4: Decrease the value of top by 1.
• Step 5: Return the element.
Figure 3.1.5: Pop Operation on Stack
Algorithm for implementation of the pop operation:
begin procedure pop: stack
   if stack is empty
      return null
   endif
   data ← stack[top]
   top ← top - 1
   return data
end procedure
The corresponding C function:
int pop()
{
   int data;
   if(!isempty())
   {
      data = stack[top];
      top = top - 1;
      return data;
   }
   else
   {
      printf("Could not retrieve data, Stack is empty.\n");
      return -1;   /* sentinel value for an empty stack */
   }
}
Self-assessment Questions
4) Match the following
1. stacktop()    A. Used for checking if stack is empty
2. Isfull()      B. Used for displaying topmost element
3. Isempty()     C. Used for checking if stack is full
5) What does push(x) do to a stack?
a) Removes x from stack
b) Add x to topmost element
c) Add x to all the elements
d) Add x to top of stack
6) What does pop() do to a stack?
a) Removes x from stack
b) Add x to topmost element
c) Add x to all the elements
d) Add x to top of stack
3.1.3 Polish Notations
First we need to understand arithmetic expressions. An arithmetic expression is an expression which, when evaluated, results in a numeric value. The method of writing an arithmetic expression is known as a notation. The same arithmetic expression can be written in different ways without changing its essence or meaning.
Consider the expression:
(5 - 6) * 7
It can be written in its prefix form as "* (- 5 6) 7". Since all the arithmetic operators here are binary in nature, bracketing is not necessary, and the expression can also be written as
* - 5 6 7
Consider the expression "1 + 2", which adds the values 1 and 2. In prefix notation the operator precedes the operands, so it is written "+ 1 2".
In our earlier example, the product calculation depends upon the availability of two operands, i.e., (5 - 6) and 7. Normally the innermost expressions are evaluated first, but in prefix notation operators are written ahead of their operands.
An infix notation with parentheses looks like
5 - (6 * 7)
and without parentheses it is
5 - 6 * 7
Whether dropping the parentheses changes the semantics or meaning of an expression is governed by the precedence rule. Similarly, the Polish notation of
5 - (6 * 7)
is
- 5 * 6 7
Polish notation
Polish notation, also called Polish prefix notation or simply prefix notation, is a symbolic logic invented by the Polish mathematician Jan Lukasiewicz. It is a form of notation for logic, arithmetic, and algebra. In prefix notation, operators are placed to the left of their operands. If each operator has a fixed arity, the result is a syntax lacking parentheses or other brackets that can still be parsed without ambiguity. The term Polish notation also covers Polish postfix notation, or reverse Polish notation, in which operators are placed after the operands.
(i) Infix Notation
As discussed in the previous section, infix notation is the most common and simplest notation, in which an operator is placed between two operands. It is also known as the general form of an arithmetic expression. For example, the arithmetic expression for adding two operands can be written in infix form as
A + B
In this example A and B are the two operands and + is the operator.
Another example of an infix expression is
A + B * C + (E – G)
Such expressions follow the normal arithmetic precedence rules. For example, to evaluate the above expression, the first precedence is given to the multiplication, so the product of B and C is calculated first. The next precedence goes to the parentheses, so the result of E – G is calculated. Then A, the product of B and C, and the difference of E and G are added together.
(ii) Prefix Notation
This is also called the Polish method. In this method the operator precedes the operands, i.e., the instruction precedes the data. Here the order of operators and operands alone determines the result, making parentheses unnecessary. As an example, consider the infix expression 3 * (4 + 5). It can be expressed as
* 3 + 4 5
This is in contrast with the traditional algebraic methodology of performing mathematical operations, the order of operations. In the expression 3 * (4 + 5), we first work inside the parentheses to add four and five and then multiply the result by three.
Did you know?
In the early days of the calculator, the end user would write down the result of every step when using the algebraic order of operations. Not only did this slow things down, it also gave the end user opportunities to make errors, sometimes defeating the purpose of using a calculating machine. In the 1960s, engineers at Hewlett-Packard decided that it would be easier for end users to learn Jan Lukasiewicz's logic system than to try to use the order of operations on a calculator. They modified Jan Lukasiewicz's system for a calculator keyboard by placing the instructions (operators) after the data. In homage to Jan Lukasiewicz's Polish logic system, the Hewlett-Packard engineers called their modification reverse Polish notation (RPN).
(iii) Postfix Notation
Postfix notation is just the opposite of prefix notation: the operands precede the operator, or equivalently the operator is placed after the operands, hence the name. It is also called reverse Polish notation. The infix expression A+B is written in postfix as AB+.
Below are some examples of expressions represented in all three notations:
Infix           Prefix      Postfix
A+B             +AB         AB+
A+B*C           +A*BC       ABC*+
(A+B)*(C-D)     *+AB-CD     AB+CD-*
Algorithm for evaluation of a postfix expression
Consider a string holding a postfix arithmetic expression of operands and operators. The steps given below should be followed to evaluate it:
• Step 1: Scan the string from left to right.
• Step 2: Skip over operands and values.
• Step 3: When an operator is found, perform the operation on the preceding two operands.
• Step 4: Replace these three tokens (two operands and an operator) with one operand, i.e., the result of the operation.
• Step 5: Continue the process until a single value remains, which is the result of the expression.
Program:
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#include<math.h>
#define MAX 50
int stack[MAX];
char post[MAX];
int top=-1;
void pushstack(int tmp);
void calculator(char c);
int main()
{
    int i;
    printf("Insert a postfix notation :: ");
    gets(post);
    for(i=0;i<strlen(post);i++)
    {
        if(post[i]>='0' && post[i]<='9')
        {
            pushstack(i);
        }
        if(post[i]=='+' || post[i]=='-' || post[i]=='*' ||
           post[i]=='/' || post[i]=='^')
        {
            calculator(post[i]);
        }
    }
    printf("\n\nResult :: %d",stack[top]);
    return 0;
}
void pushstack(int tmp)
{
    top++;
    stack[top]=(int)(post[tmp]-48);   /* convert ASCII digit to its value */
}
void calculator(char c)
{
    int a,b,ans;
    a=stack[top];   /* right operand */
    top--;
    b=stack[top];   /* left operand */
    top--;
    switch(c)
    {
        case '+':
            ans=b+a;
            break;
        case '-':
            ans=b-a;
            break;
        case '*':
            ans=b*a;
            break;
        case '/':
            ans=b/a;
            break;
        case '^':
            ans=(int)pow(b,a);   /* exponentiation; note b^a in C is bitwise XOR */
            break;
        default:
            ans=0;
    }
    top++;
    stack[top]=ans;
}
Output:
Self-assessment Questions
7) In prefix notation the ________ precedes the operands. (fill in the blank)
8) A+B is an infix expression
a) True
b) False
9) Postfix notation of A+B is
a) +AB
b) A+B
c) AB+
d) ++A
3.1.4 Conversion of Arithmetic Expression from Infix to Postfix
Let X be an arithmetic expression in its infix form, containing operators, operands, parentheses, etc. We have 5 basic operators in mathematics, namely addition, subtraction, multiplication, division and exponentiation.
The order of precedence is:
• Exponentiation (highest precedence)
• Multiplication / division
• Addition / subtraction (lowest precedence)
Operators on the same precedence level are performed from left to right unless indicated otherwise by parentheses.
The algorithm given below transforms any infix expression X into its equivalent postfix expression Y. We use a stack data structure to store the operators and parentheses.
Algorithm:
1. Read tokens from left to right in the given infix expression X; the postfix expression Y is generated as output.
2. The input infix expression may have the following tokens:
a) Any alphabet from A-Z or a-z
b) Any digit from 0-9
c) Any operator
d) Opening and closing brackets ( , )
3. If the token read is an alphabet:
a) Print that alphabet as output.
4. If the token read is a digit:
a) Print that digit as output.
5. If the token read is an opening bracket "(":
a) Push the opening bracket "(" onto the stack.
b) If any operator appears before ")", push it onto the stack.
c) When the corresponding ")" bracket appears, pop elements from the stack until "(" is popped out.
6. If the token read is an operator:
a) Check if any operator is already present on the stack.
b) If the stack is empty, push the operator onto the stack.
c) If an operator is present, check whether the priority of the incoming operator is greater than the priority of the topmost stack operator.
d) If the priority of the incoming operator is greater, push the incoming operator onto the stack.
e) Else pop the operator from the stack, print it, and repeat step 6.
Example of converting an expression from infix to postfix
Infix expression:
A*B+C
The order in which the operators appear is not reversed: when the '+' is read, it has lower precedence than the '*', so the '*' must be printed first.
We show this in a table with three columns. The first shows the symbol currently being read, the second shows what is on the stack, and the third shows the current contents of the postfix string. The stack is written from left to right with the 'bottom' of the stack to the left.
Step    Current Symbol    Stack    Postfix expression
1       A                          A
2       *                 *        A
3       B                 *        AB
4       +                 +        AB*
5       C                 +        AB*C
6       (end)                      AB*C+
Step 1: The first input token is the alphabet "A"; it is printed as an output character of the postfix notation.
Step 2: The next token in the infix expression is the operator "*". Since the stack is empty, it is pushed onto the top of the stack.
Step 3: The third token in the infix expression is the alphabet "B"; it is printed as an output character of the postfix notation.
Step 4: The fourth input token is again an operator, "+". The operator on the top of the stack, "*", has higher precedence than "+", so "*" is popped from the top of the stack and printed to the postfix output. The operator "+" is then pushed onto the top of the stack.
Step 5: The next input character is the alphabet "C"; it is printed as an output character of the postfix notation.
Step 6: Now we are at the end of the infix expression, so we pop the remaining operators from the stack one by one and print them. Thus the operator "+" is printed as the last character of the postfix notation.
Thus the Postfix expression is AB*C+
Program:
#include<stdio.h>
#include<ctype.h>
char stack[20];
int top = -1;
void push(char x)
{
    stack[++top] = x;
}
char pop()
{
    if(top == -1)
        return -1;
    else
        return stack[top--];
}
int priority(char x)
{
    if(x == '(')
        return 0;
    if(x == '+' || x == '-')
        return 1;
    if(x == '*' || x == '/')
        return 2;
    return 0;   /* any other character */
}
int main()
{
    char exp[20];
    char *e, x;
    printf("Enter the expression :: ");
    scanf("%s",exp);
    e = exp;
    while(*e != '\0')
    {
        if(isalnum(*e))
            printf("%c",*e);
        else if(*e == '(')
            push(*e);
        else if(*e == ')')
        {
            while((x = pop()) != '(')
                printf("%c", x);
        }
        else
        {
            /* pop operators of equal or higher priority first */
            while(top != -1 && priority(stack[top]) >= priority(*e))
                printf("%c",pop());
            push(*e);
        }
        e++;
    }
    while(top != -1)
    {
        printf("%c",pop());
    }
    return 0;
}
Output:
Self-assessment Questions
10) As per the algorithm to convert infix to postfix expression, we must ignore parenthesis
present in infix expression
a) True
b) False
11) While converting an infix expression to postfix expression, if an operator is
encountered, the operators are ___________.
a) Pushed on to the stack
b) Popped out of stack
c) Left without doing anything
d) Checked for precedence level
12) When the string scanning ends, next operation is ____________.
a) Popping out all operators from stack and adding them to postfix string
b) Exit and print result
c) Push all the operands on to the stack
d) Do nothing
3.1.5 Applications of Stack
Stacks have many useful applications in computer science. They form a base for many compilers of programming languages and are also a core part of low-level execution environments such as virtual machines and assembly-language programs. Some of the basic and most frequently used applications are described in the sections below.
(i) Balancing Symbol
We often make syntax mistakes while typing programs, and it is the compiler's duty to check the program for all syntax errors. Most of the time, the mistakes involve brackets, parentheses or operators; a single missing symbol may cause multiple errors in the program, so the real error remains unidentified.
Hence a stack can be used to check whether the symbols in a program are balanced: every right bracket, parenthesis or brace must match a corresponding left counterpart. For example, the sequence [()] is correct, whereas [(]) is invalid. For now, consider a problem that checks only the balancing of parentheses, brackets, and braces and ignores other characters. A stack can be used to balance symbols in a program. Following are the steps to do so:
1. Create an empty stack s[].
2. Scan the program file character by character until the end of the file.
3. Upon identifying any opening symbol (parenthesis, brace, bracket, etc.), push it onto the stack.
4. If the scanned character is a closing bracket, brace or parenthesis and the stack is empty, print an error message.
5. Else pop an element from the stack.
6. If the popped element is not the corresponding opening symbol, print an error message.
7. If the stack is not empty at the end of the file, print an error message.
This is clearly linear and makes only one pass through the input. It is thus on-line and quite fast.
(ii) Recursion
Recursion is considered one of the most powerful tools in a programming language, but it is also sometimes considered the trickiest and most intimidating concept for a lot of programmers, because of the uncertainty of the conditions specified by the user.
In short, something referring to itself is called a recursive definition. Recursion can be defined as defining anything in terms of itself; it can also be defined as repeating items in a self-similar way.
In programming, if a function calls itself to accomplish some task, it is said to be a recursive function. The recursion concept is used in solving problems that involve repeated executions of the same steps. Thus, to make a function execute repeatedly until we obtain the desired output, we can make use of recursion.
Example of recursion:
The best example in mathematics is the factorial function:
n! = 1 × 2 × 3 × … × (n-1) × n
If n = 6, then the factorial of 6 is calculated as
6! = 6 × 5 × 4 × 3 × 2 × 1 = 720
Suppose we are calculating the factorial of a given number. If we have to calculate the factorial of 6, then once the factor 6 is accounted for, what remains is the calculation of 5!. In general we can say
n! = n × (n-1)!
(i.e., 6! = 6 × (5!))
This means we need to execute the same factorial code again and again, which is nothing but recursion.
Thus the recursive definition of factorial is:
f(n) = 1              if n = 0
f(n) = n * f(n-1)     otherwise
The above recursive definition says that the factorial of n = 0 is 1, and the factorial of any other number n is defined to be the product of that number n and the factorial of one less than that number.
Any recursive definition has the following properties:
1. There are one or more base cases for which recursion is not needed.
2. All chains of recursion stop at one of the base cases.
We should make sure that each recursive call always occurs on a smaller version of the original problem.
In C programming, a recursive factorial function looks like:
int factorial(int n)
{
    if (n == 0)                        /* base case */
        return 1;
    else
        return n * factorial(n-1);     /* recursive case */
}
The above function calculates the factorial of any number n. When we call the factorial function, it first checks the base case: whether n equals 0. If n equals 0, then by definition it returns 1.
Otherwise, the base case has not yet been satisfied, so it returns the product of n and the factorial of n-1. It thus calls the factorial function once again to find the factorial of n-1, forming recursive calls until the base case is met.
(iii) Evaluation of Postfix Expression
Stacks are the natural tool for evaluating postfix expressions: scanning left to right, operands are pushed onto the stack, and each operator pops its two operands, applies the operation, and pushes the result back. A complete program for this was given in section 3.1.3.
(iv) String Reversal
Since the stack is a LIFO data structure, it is an obvious choice in applications that require reversing a string or checking whether a string is a palindrome. The simplest way to reverse a string is to scan it from left to right and push every character onto the stack until we reach the end of the string. Once we reach the end, start popping elements off the stack and build a new string from the popped characters. Repeat the popping until the stack becomes empty.
/* Program to reverse a string using a stack */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define MAX 20

int top = -1;
char stack[MAX];

char pop();
void push(char);

int main()
{
    char str[20];
    unsigned int i;

    printf("Enter the string : ");
    if (fgets(str, sizeof(str), stdin) == NULL)  /* fgets is safer than gets(), which C11 removed */
        return 1;
    str[strcspn(str, "\n")] = '\0';              /* strip the trailing newline */

    /* Push the characters of str onto the stack */
    for (i = 0; i < strlen(str); i++)
        push(str[i]);

    /* Pop characters from the stack back into str */
    for (i = 0; i < strlen(str); i++)
        str[i] = pop();

    printf("Reversed string is : ");
    puts(str);
    return 0;
} /* End of main() */

void push(char item)
{
    if (top == (MAX - 1))
    {
        printf("Stack Overflow\n");
        return;
    }
    stack[++top] = item;
} /* End of push() */

char pop()
{
    if (top == -1)
    {
        printf("Stack Underflow\n");
        exit(1);
    }
    return stack[top--];
} /* End of pop() */
Output:
Self-assessment Questions
13) Balancing symbol is useful for _____________.
a) Compiler optimization
b) Inserting symbols in program code
c) Inserting comments
d) Checking precedence of operator
14) In recursion, the stack winding phase involves popping instructions off the stack.
a) True
b) False
15) What is the result of evaluating the postfix expression 4 6 + 7 - ?
a) 7
b) 4
c) 3
d) 8
Summary
o Stacks are Last-In-First-Out (LIFO) data structures in which the most recent
element inserted in the stack is the first one to be removed.
o The stack can be implemented using an array by creating a stack pointer variable
for keeping track of top position.
o push() and pop() are the primary operations on stacks, for insertion and
deletion of elements, along with some supporting functions.
o Polish notation, also called prefix notation, is a form of notation for logic,
arithmetic, and algebra.
o Infix notation is the most common and simplest notation in which an operator is
placed between two operands.
o In prefix notation, the operator precedes its operands; in postfix notation, the
operands precede the operator.
o Stacks can be used for evaluating a postfix expression, also in Recursion, String
reversal etc.
Terminal Questions
1. Explain stack and its basic operations.
2. Explain the algorithm for push and pop operations.
3. Write a C program for converting infix expression to postfix expression.
4. Explain applications of stack in brief.
Answer Keys
Self-assessment Questions
Question No.    Answer
1               c
2               b
3               b
4               1-b, 2-c, 3-a
5               b
6               a
7               Operator
8               a
9               c
10              a
11              a
12              a
13              a
14              b
15              c
Activity
Activity Type: Offline
Duration: 15 Minutes
Description:
Divide the students into 4 groups.
Below are 4 infix expressions; assign an expression to each group.
Each group should convert the given expression to postfix and prefix form using a stack.
a) 3+4*5/6
b) 6 * (77 + 8 *15) + 20
c) (300+23)*(43-21)/(84+7)
d) (4+8)*(6-5)/((3-2)*(2+2))
Case Study
Stack based memory allocation
Stacks in computing architectures are regions of memory where data is added or removed in a
last-in-first-out (LIFO) manner.
In most modern computer systems, each thread has a reserved region of memory referred to as
its stack. When a function executes, it may add some of its state data to the top of the stack;
when the function exits it is responsible for removing that data from the stack. At a minimum,
a thread's stack is used to store the location of function calls in order to allow return statements
to return to the correct location, but programmers may further choose to explicitly use the
stack. If a region of memory lies on the thread's stack, that memory is said to have been allocated
on the stack.
Because the data is added and removed in a last-in-first-out manner, stack-based memory
allocation is very simple and typically faster than heap-based memory allocation (also known
as dynamic memory allocation). Another feature is that memory on the stack is automatically,
and very efficiently, reclaimed when the function exits, which can be convenient for the
programmer if the data is no longer required. If however, the data needs to be kept in some
form, then it must be copied from the stack before the function exits. Therefore, stack based
allocation is suitable for temporary data or data which is no longer required after the creating
function exits.
A thread's assigned stack size can be as small as only a few bytes on some small CPUs.
Allocating more memory on the stack than is available can result in a crash due to stack
overflow.
Some processor families, such as the x86, have special instructions for manipulating the stack
of the currently executing thread. Other processor families, including PowerPC and MIPS, do
not have explicit stack support, but instead rely on convention and delegate stack management
to the operating system's application binary interface (ABI).
Questions:
1. Explain how stack based memory allocation works.
2. What are the advantages of stack based memory allocation?
Bibliography
e-Reference
•  bowdoin.edu, (2016). Computer Science 210: Data Structures. Retrieved on 19 April 2016, from http://www.bowdoin.edu/~ltoma/teaching/cs210/fall10/Slides/StacksAndQueues.pdf
External Resources
•  Kruse, R. (2006). Data Structures and Program Design in C (2nd ed.). Pearson Education.
•  Srivastava, S. K., & Srivastava, D. (2004). Data Structures Through C in Depth (2nd ed.). BPB Publications.
•  Weiss, M. A. (2001). Data Structures and Algorithm Analysis in C (2nd ed.). Pearson Education.
Video Links
Topic                                          Link
Introduction and definition of stacks          https://www.youtube.com/watch?v=FNZ5o9S9prU
Recursion                                      https://www.youtube.com/watch?v=k0bb7UYy0pY
Evaluation of postfix expression using stack   https://www.youtube.com/watch?v=_EP4gpG-4kQ
Chapter Table of Contents
Chapter 3.2
Queues
Aim ..................................................................................................................................................... 173
Instructional Objectives................................................................................................................... 173
Learning Outcomes .......................................................................................................................... 173
3.2.1 Introduction to Queue............................................................................................................ 174
(i) Definition of a Queue ........................................................................................................ 174
(ii) Array Representation of Queue....................................................................................... 175
Self-assessment Questions ...................................................................................................... 180
3.2.2 Types of Queue ........................................................................................................................ 181
(i) Simple Queue ...................................................................................................................... 181
(ii) Circular Queue .................................................................................................................. 182
(iii) Double Ended Queue ...................................................................................................... 188
(iv) Priority Queue .................................................................................................................. 194
Self-assessment Questions ...................................................................................................... 196
3.2.3 Operations on Queue ............................................................................................................. 196
(i) Insertion .............................................................................................................................. 197
(ii) Deletion in Queue ............................................................................................................. 198
(iii) Qempty Operation ........................................................................................................... 199
(iv) Qfull Operation ................................................................................................................ 200
(v) Display Operation ............................................................................................................. 200
Self-assessment Questions ...................................................................................................... 201
3.2.4 Application of Queue ............................................................................................................. 202
Self-assessment Questions ...................................................................................................... 203
Summary ........................................................................................................................................... 204
Terminal Questions.......................................................................................................................... 205
Answer Keys...................................................................................................................................... 205
Activity............................................................................................................................................... 206
Bibliography ...................................................................................................................................... 207
e-References ...................................................................................................................................... 207
External Resources ........................................................................................................................... 207
Video Links ....................................................................................................................................... 207
Aim
To provide students with basic knowledge of queues, their types, and the operations
performed on queues
Instructional Objectives
After completing this chapter, you should be able to:
•  Explain queue and its operations
•  Describe the array representation of queue
•  Discuss different types of queue with examples
•  Illustrate the creation, insertion, deletion and search operations on various types of queue
Learning Outcomes
At the end of this chapter, you are expected to:
•  Demonstrate queue with its operations
•  Implement double ended queue using linked list
•  Identify the requirement of priority queue
3.2.1 Introduction to Queue
In simple language, a queue is a waiting line that keeps growing as we add elements at its end
and keeps shrinking as we remove elements from its front. Compared with a stack, a queue
reflects the more familiar real-world maxim "first come, first served". Long waiting lines at
food counters, supermarkets and banks are common examples of queues.
For computer applications, we define a queue as a list in which all additions are made at one
end and all deletions are made at the other end. Applications of queues are, if anything, even
more common than applications of stacks, since in performing tasks by computer, as in all
parts of life, it is often necessary to wait one's turn before having access to something. Within
a computer there can be queues of tasks waiting for the printer, for access to disk storage, or,
with multitasking, for the CPU. Within a single program, there may be multiple requests to be
kept in a queue, or one task may create other tasks, which must be done in turn by keeping
them in a queue.
A queue is a data structure in which elements are added at the back and removed from the
front. In that way a queue is like "waiting in line": the first element added to the queue will
be the first one removed from it.
Queues are common in many applications. For example, when we read the words of a book from
a file, it is natural to store them in a queue so that when reading is complete the words are
in the same order as they appear in the book. Another common example is a buffer for network
communication, which temporarily stores packets of data arriving on a network port. Generally
speaking, elements are processed in the order in which they arrive.
(i) Definition of a Queue
More formally, a queue can be defined as a list or data structure in which data items are
added at one end (generally referred to as the rear) and deleted from the front. The element
to be deleted is the one that has spent the longest time in the queue. Because of this
property, a queue is also referred to as a first-in-first-out (FIFO) data structure.
Figure 3.2.1 below shows a pictorial representation of a queue.
Figure 3.2.1: Representation of a Queue in Computer’s Memory
A queue can also be viewed as a container of objects (in other words, a linear collection)
that are added and removed according to the first-in-first-out (FIFO) principle. A good
example of a queue is a line of students at the ice-cream counter of the college canteen.
Newly arriving students join the line at the back of the queue, while serving (and removal)
happens at the front. The queue allows only two operations: enqueue, which inserts an item,
and dequeue, which removes an item.
The difference between a stack and a queue lies only in the deletion of items: a stack removes
the most recently added item, while a queue removes the least recently added item first.
In spite of its simplicity, the queue is a very important concept with many applications in
the simulation of real-life events, such as lines of customers at a cash register or cars
waiting at an intersection, and in programming (such as printer jobs waiting to be processed).
Many Smalltalk applications use a queue, but instead of implementing it as a new class they
use an OrderedCollection, because it performs all the required functions.
Dequeuing, or removing an item from a queue, is only possible on non-empty queues, which
requires a contract in the interface. This interface can be written without committing to an
implementation of queues. This is important so that different implementations of the functions
in this interface can choose different representations.
(ii) Array Representation of Queue
An array implementation of the queue needs two variables (indices), called front and rear,
that point to the first and last elements of the queue. Figure 3.2.2 shows the array
implementation of a queue.
Figure 3.2.2: Array Implementation of Queue
Initially:
q->rear = -1;
q->front = -1;
For every enqueue operation we increment rear by one, and for every dequeue operation we
increment front by one. Even though the enqueue and dequeue operations are simple to
implement, this set-up has a disadvantage: the array required can be huge, because the number
of slots keeps increasing as long as items are being added to the list, irrespective of how
many items are deleted, since the two are independent operations.
Problems with this representation
Although there is free space in the initial blocks of the queue shown below, we may not be
able to add a new item; an attempt will cause an overflow.
Figure 3.2.3: Queue Overflow Situation
It is possible to have an empty queue into which no new item can be inserted (when front
moves up to rear and the last item is deleted).
Figure 3.2.4: Overflow Situation in an Empty Queue
The program below shows the implementation of a queue using an array.
Program:
/*
 * C Program to Implement a Queue using an Array
 */
#include <stdio.h>
#include <stdlib.h>
#define SIZE 50

int queue_arr[SIZE];
int rear = -1;
int front = -1;

void insert(void);
void delete(void);
void display(void);

int main()
{
    int ch;
    while (1)
    {
        printf("1.Insert element to the queue \n");
        printf("2.Delete element from the queue \n");
        printf("3.Display all elements of the queue \n");
        printf("4.Quit \n");
        printf("Enter your choice : ");
        scanf("%d", &ch);
        switch (ch)
        {
            case 1:
                insert();
                break;
            case 2:
                delete();
                break;
            case 3:
                display();
                break;
            case 4:
                exit(1);
            default:
                printf("Invalid Input \n");
        } /* End of switch */
    } /* End of while */
} /* End of main() */

void insert(void)
{
    int add_item;
    if (rear == SIZE - 1)
        printf("Queue Overflow \n");
    else
    {
        if (front == -1)   /* If queue is initially empty */
            front = 0;
        printf("Insert the element in the queue : ");
        scanf("%d", &add_item);
        rear = rear + 1;
        queue_arr[rear] = add_item;
    }
} /* End of insert() */

void delete(void)
{
    if (front == -1 || front > rear)
    {
        printf("Queue Underflow \n");
        return;
    }
    else
    {
        printf("Element deleted from the queue is : %d\n", queue_arr[front]);
        front = front + 1;
    }
} /* End of delete() */

void display(void)
{
    int i;
    if (front == -1)
        printf("Queue is empty \n");
    else
    {
        printf("The Queue elements are : \n");
        for (i = front; i <= rear; i++)
            printf("%d ", queue_arr[i]);
        printf("\n");
    }
} /* End of display() */
Output:
Did you know?
Though the simple queue appears to be a very basic FIFO model, it is an important model used
in many applications. Early implementations of most operating systems, from generic,
programmer-friendly systems such as Linux to TinyOS, the operating system used in wireless
sensor networks, used a simple FIFO queue for scheduling tasks because of its ease of
implementation and the limited resources available.
Self-assessment Questions
1) Which one of the following is an application of the queue data structure?
a) When a resource is shared among multiple devices
b) Printer jobs waiting to be processed
c) Buffers used in network communication to store data packets
d) All of the above
2) For every enqueue operation, we __________ by one, and for every dequeue operation,
we __________ by one.
a) Decrement rear, decrement front
b) Increment rear, increment front
c) Increment front, increment rear
d) Decrement front, decrement rear
3) For queue implementation, we need two pointers, namely front and rear. These pointers
are initialized as:
a) front=1 and rear=-1
b) front=-1 and rear=-1
c) front=-1 and rear=1
d) front=1 and rear=1
3.2.2 Types of Queue
A queue represents a basket of items. Enqueue is an operation that adds an item to this
basket, and dequeue is an operation that chooses an item to be removed from the queue (if the
queue is not empty). As with human queues, queue variants differ in the rule used to choose
the item to be removed. It is usual to give the basic operations different names according to
what they do, which helps avoid confusion. However, using a general signature for the
different kinds of queue will make our code more modular later, when algorithms based on them
are discussed.
(i) Simple Queue
Like stacks, queues can be implemented using linked lists or arrays. Both implementations have
a running time of O(1) for every operation. This section covers the array implementation of
queues.
For every queue data structure we keep an array, QUEUE[], and two positions, q_front and
q_rear, which represent the front and the rear of the queue respectively; q_size keeps track
of the number of elements in the queue.
All of this information is part of one structure, and except for the queue functions
themselves, no function should access these fields directly. Figure 3.2.5 shows a queue in
some intermediate state. The blank cells have undefined values in them; in particular, the
elements in the first two cells have spent the maximum time in the queue.
Figure 3.2.5: Basic Queue example
To enqueue an element x, we first increment q_rear and q_size, then set QUEUE[q_rear] = x. To
dequeue an element, we take the return value from QUEUE[q_front], decrement q_size, and then
increment q_front. Other strategies are possible (discussed later).
Using an array, a simple queue can be declared as:
#define MAX 10
int queue[MAX], rear=0, front=0;
(ii) Circular Queue
The biggest problem with the simple queue is its implementation. After 10 enqueues (in the
situation discussed above), the queue appears full, since q_rear is now at the last array
index, and the next enqueue would be at a non-existent position. Yet there may be free
positions in the queue, since several elements may already have been dequeued. Queues, like
stacks, frequently stay small even in the presence of a lot of operations.
The solution to this problem is that whenever q_rear or q_front reaches the end of the array,
it is wrapped around to the beginning. The following figure shows the queue during some
operations.
Figure 3.2.6: Circular Queue
Only minimal extra code is needed to implement the wraparound (although it slightly increases
the running time). If incrementing either q_front or q_rear causes it to go past the end of
the array, the value is reset to the first position in the array.
However, two things must be taken care of in the circular array implementation of queues.
First and foremost, we must check that the queue is not empty, because a dequeue operation on
an empty queue returns an undefined value.
Secondly, programmers sometimes represent the front and rear of a queue differently. For
example, some do not keep an entry that tracks the size of the queue; instead they rely on the
convention that the queue is empty when q_front = q_rear - 1, and the size is computed
implicitly by comparing q_front and q_rear. This is tricky, since there are special cases, so
one must be very careful when modifying code written this way. If the size is not part of the
structure and the size of the array is A_SIZE, the queue is considered full when it contains
A_SIZE - 1 elements, since only A_SIZE different sizes can be distinguished, and one of these
is 0.
In applications where it is certain that the number of enqueues never exceeds the size of the
queue, the wraparound is obviously unnecessary. As with stacks, dequeues are rarely performed
unless the calling routine is certain that the queue is not empty, so error checks are
frequently skipped for this operation, except in critical code. This is generally not
justifiable, because the time savings are minimal.
To overcome this inefficient use of space, we can think of the array as a circle rather than a
straight line, as depicted in figure 3.2.7. As entries are added and removed from the queue,
the head continually chases the tail around the array, so that the snake can keep crawling
indefinitely while staying in a confined circuit. At different times the queue occupies
different parts of the array, but there is no need to worry about running out of space unless
the array is completely occupied, in which case there is a true overflow.
Figure 3.2.7: Queue in a Circular Array
Implementation of Circular Arrays
To implement the circular queue in a linear array, consider the positions around the circular
arrangement as numbered from 0 to max-1, where max is the total number of elements in the
circular array. We use the same numbered entries of a linear array to implement the circular
array. The logic then reduces to simple modular arithmetic: whenever an index crosses max-1,
we start again from 0. This is just like arithmetic on a circular clock face, where the hours
are numbered 1 to 12: adding four hours to ten o'clock gives two o'clock.
Program:
//Program for Circular Queue implementation through Array
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#define SIZE 5

int circleq[SIZE];
int front, rear;

int main()
{
    void insert(int, int);
    void delete(int);
    int ch = 1, n;
    front = -1;
    rear = -1;
    while (1)
    {
        printf("\nMAIN MENU\n1.INSERTION\n2.DELETION\n3.EXIT");
        printf("\nENTER YOUR CHOICE : ");
        scanf("%d", &ch);
        switch (ch)
        {
            case 1:
                printf("\nEnter the element of the queue: ");
                scanf("%d", &n);
                insert(n, SIZE);
                break;
            case 2:
                delete(SIZE);
                break;
            case 3:
                exit(0);
            default:
                printf("\nInvalid input. ");
        }
    } //end of while
} //end of main

void insert(int item, int MAX)
{
    if (front == (rear + 1) % MAX)
    {
        printf("\nCircular queue overflow\n");
    }
    else
    {
        if (front == -1)
            front = rear = 0;
        else
            rear = (rear + 1) % MAX;
        circleq[rear] = item;
        printf("\nRear = %d Front = %d ", rear, front);
    }
}

void delete(int MAX)
{
    int del;
    if (front == -1)
    {
        printf("\nCircular queue underflow\n");
    }
    else
    {
        del = circleq[front];
        if (front == rear)
            front = rear = -1;
        else
            front = (front + 1) % MAX;
        printf("\nDeleted element from the queue is: %d ", del);
        printf("\nRear = %d Front = %d ", rear, front);
    }
}
Output:
(iii) Double Ended Queue
A double-ended queue is also known as a deque. It is an ordered collection of items similar to
the queue. A deque has two ends, a front and a rear, and the items remain positioned in the
collection. What is special about the deque is the unrestrictive nature of adding and removing
items: an item can be added at either the front or the rear, and likewise existing items can
be removed from either end. This makes the deque a hybrid linear structure that provides all
the capabilities of stacks and queues in a single data structure. Figure 3.2.8 shows a deque.
It is important to note that even though the deque can take on many of the characteristics of
stacks and queues, it does not enforce the LIFO and FIFO orderings that those data structures
require. The addition and removal operations must be used consistently.
Figure 3.2.8: Dequeue
A double-ended queue (often abbreviated to deque, pronounced "deck") is an abstract data
structure that implements a queue in which elements can be added to or removed from either the
front (head) or the back (tail). It is also often called a head-tail linked list.
The deque is a special type of data structure in which insertion and deletion can be done at
either the rear end or the front end of the queue. The operations that can be performed on
deques are:
•  Insertion of an item at the front end
•  Insertion of an item at the rear end
•  Deletion of an item from the front end
•  Deletion of an item from the rear end
•  Displaying the contents of the queue
Applications of the deque
•  A nice application of the deque is storing a web browser's history. Recently visited URLs are added to the front of the deque, and the URL at the back of the deque is removed after some specified number of insertions at the front.
•  Another common application of the deque is storing a software application's list of undo operations.
•  One example where a deque can be used is the A-Steal job scheduling algorithm. This algorithm implements task scheduling for several processors. A separate deque of threads to be executed is maintained for each processor. To execute the next thread, the processor gets the first element from its deque (using the "remove first element" operation). If the current thread forks, it is put back at the front of the deque ("insert element at front") and a new thread is executed. When one of the processors finishes executing its own threads (i.e., its deque is empty), it can "steal" a thread from another processor: it takes the last element from the other processor's deque ("remove last element") and executes it.
•  A real-world analogy is a ticket-purchasing line. It behaves like a queue, but sometimes a person who has already bought a ticket comes back to the front of the line to ask a further question; having already been served, they have the privilege of rejoining at the front. In such a scenario we need a data structure in which items can also be added at the front; in the same scenario, a person may also leave the line from the rear.
Program:
/* Implementation of a deque using arrays */
#include <stdio.h>
#include <stdlib.h>
#define MAX 10

typedef struct dequeue
{
    int front, rear;
    int arr[MAX];
} dq;

/* If flag is zero, insertion is done at the beginning;
   if flag is one, insertion is done at the end. */
void enqueue(dq *q, int x, int flag)
{
    int i;
    if (q->rear == MAX - 1)
    {
        printf("\nQueue overflow!");
        exit(1);
    }
    if (flag == 0)
    {
        for (i = q->rear; i >= q->front; i--)
            q->arr[i + 1] = q->arr[i];
        q->arr[q->front] = x;
        q->rear++;
    }
    else if (flag == 1)
    {
        q->arr[++q->rear] = x;
    }
    else
    {
        printf("\nInvalid flag value");
        return;
    }
}

void dequeue(dq *q, int flag)
{
    int i;
    /* front is initialized to zero, so rear = -1
       indicates underflow */
    if (q->rear < q->front)
    {
        printf("\nQueue Underflow");
        exit(1);
    }
    if (flag == 0) /* deletion at beginning */
    {
        for (i = q->front; i < q->rear; i++)
            q->arr[i] = q->arr[i + 1];
        q->arr[q->rear] = 0;
        q->rear--;
    }
    else if (flag == 1)
    {
        q->arr[q->rear--] = 0;
    }
    else
    {
        printf("\nInvalid flag value");
        return;
    }
}

void display(dq *q)
{
    int i;
    for (i = q->front; i <= q->rear; i++)
        printf("%d ", q->arr[i]);
}

int main()
{
    dq q;
    int ch, n;
    q.front = 0;
    q.rear = -1;
    while (1)
    {
        printf("\nMenu-Double Ended Queue");
        printf("\n1. Enqueue - Begin");
        printf("\n2. Enqueue - End");
        printf("\n3. Dequeue - Begin");
        printf("\n4. Dequeue - End");
        printf("\n5. Display");
        printf("\n6. Exit");
        printf("\nEnter your choice: ");
        scanf("%d", &ch);
        switch (ch)
        {
            case 1:
                printf("\nEnter the number: ");
                scanf("%d", &n);
                enqueue(&q, n, 0);
                break;
            case 2:
                printf("\nEnter the number: ");
                scanf("%d", &n);
                enqueue(&q, n, 1);
                break;
            case 3:
                printf("\nDeleting element from beginning");
                dequeue(&q, 0);
                break;
            case 4:
                printf("\nDeleting element from end");
                dequeue(&q, 1);
                break;
            case 5:
                display(&q);
                break;
            case 6:
                exit(0);
            default:
                printf("\nInvalid Choice");
        }
    }
}
Output:
(iv) Priority Queue
Consider jobs sent to a line printer. Although these jobs are placed in a queue and served by
the printer on a FIFO basis, this may not always be the best approach. One job waiting in the
queue might be particularly important, so it should be allowed to run as soon as the printer
becomes available. Conversely, if when the printer becomes available there are several
one-page jobs and one hundred-page job in the queue, it may be reasonable to let the long job
run last, even if it was not the last job submitted. (Unfortunately, most systems do not do
this, which can be particularly annoying at times.)
Similarly, in a multi-tasking, multi-user environment, the operating system scheduler must
decide which task or user to allocate the processor to. In general, a process is allowed to
execute in time slots or time frames. A simple scheduling algorithm uses a queue and processes
jobs on a FIFO basis: whenever a new job arrives, it is placed at the end of the queue, and the
scheduler serves jobs first come, first served until the queue is empty. This algorithm may
not be appropriate, since jobs that require only a short time slot may appear to take a long
time because of the wait involved. Generally, it is important that jobs requiring a short time
slot finish as fast as possible, so these jobs should have priority over jobs that have
already been running. Furthermore, some jobs that are not short may still be very important
and should also be given priority.
A priority queue is a data structure that allows at least the following two operations:
insert, which does the obvious thing, and delete_min, which finds, returns and removes the
minimum element in the queue. The insert operation is the equivalent of enqueue, and
delete_min is the priority-queue equivalent of the queue's dequeue operation. The delete_min
function also alters its input.
Figure 3.2.9: Basic Model of a Priority Queue
As with most data structures, at times it is possible to add other operations too, but these are
extra operations are not part of the basic model depicted in Figure 3.2.8.
Besides operating systems, priority queues have many applications. They are used for external
sorting. Priority queues are also important in the implementation of greedy algorithms, which
operate by repeatedly finding a minimum.
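As a sketch of this basic model, the two operations can be implemented over an unsorted array: insert appends in O(1) and delete_min scans for the minimum in O(n). The names pq, pq_size and CAP are illustrative assumptions, not from the text.

```c
#define CAP 100

int pq[CAP];      /* keys currently in the priority queue */
int pq_size = 0;  /* number of keys stored */

/* insert: append at the end of the unsorted array, O(1) */
void pq_insert(int key)
{
    pq[pq_size++] = key;
}

/* delete_min: scan for the minimum, return it and remove it, O(n) */
int pq_delete_min(void)
{
    int i, min_i = 0;
    for (i = 1; i < pq_size; i++)
        if (pq[i] < pq[min_i])
            min_i = i;
    int min = pq[min_i];
    pq[min_i] = pq[--pq_size];  /* fill the hole with the last key */
    return min;
}
```

Note that delete_min alters the queue, as the text says; a binary heap would bring both operations down to O(log n).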
Did you know?
Circular queue is very famous in computer networks because of its very own circular structure.
The simplest application of circular queues for network engineers is in implementation of
round robin algorithm which is used for token passing and also it is used in FIFO buffering
systems.
Self-assessment Questions
4) A circular queue is implemented using an array of size 10. The array index starts with
0, front is 6, and rear is 9. The insertion of the next element takes place at array index:
a) 0
b) 7
c) 9
d) 10
5) If MAX_SIZE is the size of the array used in the implementation of a circular queue,
the array index starts with 0, front points to the first element in the queue, and rear points to
the last element in the queue, which of the following conditions specifies that the circular
queue is EMPTY?
a) Front=rear=0
b) Front= rear=-1
c) Front=rear+1
d) Front= (rear+1)%MAX_SIZE
6) A normal queue, if implemented using an array of size MAX_SIZE, gets full when
a) Rear=MAX_SIZE-1
b) Front= (rear+1)mod MAX_SIZE
c) Front=rear+1
d) Rear=front
3.2.3 Operations on Queue
Similar to the operations performed on stacks, all such operations can be performed on queues
too. The basic operations are insertion (enqueue) and deletion (dequeue). The other
supporting functions are Qfull and Qempty. Other queue operations involve initialising
or defining a queue, using it and then completely erasing it from the computer's memory. This
section covers all the functions on queues in detail.
• enqueue() – insert a data element into the queue.
• dequeue() – remove a data element from the queue.
Additional functions are required to make the above-mentioned queue operations efficient. These
are −
• Qfull() − checks if the queue is full; returns Boolean
• Qempty() − checks if the queue is empty; returns Boolean
While performing the dequeue operation, data is accessed from the front pointer, and while
performing the enqueue operation, data is added at the rear pointer.
(i) Insertion
As already discussed in the previous sections, insertion into a queue is also called the enqueue
operation. The following are the steps to be followed while performing the enqueue operation −
• Step 1 − Check if the queue is full.
• Step 2 − If the queue is full, display an overflow error and exit.
• Step 3 − If the queue is not full, increment the rear pointer to point to the next empty array space.
• Step 4 − Add the data element to the queue location where rear is pointing.
• Step 5 − Return success.
Figure 3.2.10: Enqueue Operation
Following is an algorithm for enqueue operation
The above algorithm implemented in C programming is shown below:
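The enqueue steps above can be sketched in C as follows (a minimal array-based sketch; the names queue, front, rear and MAXSIZE are illustrative assumptions, not the book's listing):

```c
#include <stdio.h>

#define MAXSIZE 5

int queue[MAXSIZE];
int front = -1, rear = -1;   /* -1 means the queue is not yet in use */

/* Returns 1 on success, 0 on overflow */
int enqueue(int item)
{
    if (rear == MAXSIZE - 1) {      /* Steps 1-2: queue full -> overflow */
        printf("Queue Overflow\n");
        return 0;
    }
    if (front == -1)                /* first insertion initialises front */
        front = 0;
    rear = rear + 1;                /* Step 3: advance rear to the next empty slot */
    queue[rear] = item;             /* Step 4: store the element at rear */
    return 1;                       /* Step 5: success */
}
```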
(ii) Deletion in Queue
Deletion from the queue is also called the dequeue operation. The following are the
steps to be followed while performing the dequeue operation −
• Step 1 − Check if the queue is empty.
• Step 2 − If the queue is empty, display an underflow error and exit.
• Step 3 − If the queue is not empty, access the data where front is pointing.
• Step 4 − Increment the front pointer to point to the next available data element.
• Step 5 − Return success.
Figure 3.2.11: Dequeue Operation
Following is an algorithm used for performing dequeue operation
The above algorithm implemented in C programming is shown below:
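The dequeue steps above can be sketched in C as follows (a minimal sketch over the same assumed array layout; the small enqueue helper is included only to make the sketch self-contained):

```c
#include <stdio.h>

#define MAXSIZE 5

int queue[MAXSIZE];
int front = -1, rear = -1;

/* Helper so the sketch is self-contained */
int enqueue(int item)
{
    if (rear == MAXSIZE - 1) return 0;
    if (front == -1) front = 0;
    queue[++rear] = item;
    return 1;
}

/* Returns the removed element, or -1 on underflow */
int dequeue(void)
{
    int item;
    if (front == -1 || front > rear) {  /* Steps 1-2: empty -> underflow */
        printf("Queue Underflow\n");
        return -1;
    }
    item = queue[front];                /* Step 3: read the element at front */
    front = front + 1;                  /* Step 4: advance front */
    return item;                        /* Step 5: success */
}
```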
(iii) Qempty Operation
In order to delete an element from the queue, first check whether the queue is empty or not. If
the queue is empty, then do not delete an element from the queue. This condition is known as
"underflow".
If the queue is not in underflow, then we can delete an element from the queue. After deleting an
element from the queue we must update the values of rear and front as per the position of elements
in the queue.
Algorithm of Qempty() function −
The above algorithm implemented in C programming is shown below:
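Following the condition above, a Qempty check can be sketched in C (front and rear are the assumed linear-queue pointers, initialised to -1 when the queue is unused):

```c
int front = -1, rear = -1;   /* linear-queue pointers, -1 when unused */

/* Returns 1 (true) when the queue holds no elements, 0 otherwise */
int qempty(void)
{
    return front == -1 || front > rear;
}
```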
(iv) Qfull Operation
Since we are using a single-dimensional array for the implementation of the queue, the best way to
know whether the queue is full is to check if the rear pointer has reached MAXSIZE-1, which means
no space is available in the array for additional elements and hence the queue is full.
Algorithm of Qfull () function –
The above algorithm implemented in C programming is shown below:
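The check described above can be sketched as follows (MAXSIZE and rear are illustrative assumptions matching the description in the text):

```c
#define MAXSIZE 3

int rear = -1;   /* index of the last stored element, -1 when unused */

/* Returns 1 (true) when no array slot remains, 0 otherwise */
int qfull(void)
{
    return rear == MAXSIZE - 1;
}
```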
(v) Display Operation
The queue can be displayed by simply traversing from the front pointer until it reaches the rear
pointer. The only condition that should be considered is the Qempty condition, in which case a
"Queue Empty" message should be displayed.
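The display operation can be sketched over the same assumed array layout; returning the count of printed elements is an illustrative choice, made here only so the sketch is easy to check:

```c
#include <stdio.h>

#define MAXSIZE 5

int queue[MAXSIZE];
int front = -1, rear = -1;

/* Prints the elements from front to rear; returns how many were printed */
int display(void)
{
    int i, count = 0;
    if (front == -1 || front > rear) {  /* the Qempty condition */
        printf("Queue Empty\n");
        return 0;
    }
    for (i = front; i <= rear; i++) {   /* walk from front towards rear */
        printf("%d ", queue[i]);
        count++;
    }
    printf("\n");
    return count;
}
```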
Self-assessment Questions
7) If the elements “A”, “B”, “C” and “D” are placed in a queue and are deleted one at a
time, in what order will they be removed?
a) ABCD
b) DCBA
c) DCAB
d) BADC
8) Deletion operation is done using __________in a queue.
a) Front
b) Rear
c) Top
d) Bottom
9) An array of size MAX_SIZE is used to implement a circular queue. Front, Rear, and
count are tracked. Suppose front is 0 and rear is MAX_SIZE -1. How many elements
are present in the queue?
a) Zero
b) One
c) MAX_SIZE-1
d) MAX_SIZE
3.2.4 Application of Queue
As per the very nature of a queue, it can be used in all applications requiring the first
come, first served property. Following are some of the very common applications in computer
science where the use of a queue makes things easy.
1. As already discussed in the previous chapters, the queue plays a key role in scheduling for
computer resource-sharing applications. The simplest example is the printer
queue, where printing jobs are added to the scheduling queue and the printer serves the
requests on a FIFO basis.
2. Similarly, queues also play a very important role in CPU scheduling. All the requests for
using processors are stored in a queue by the CPU scheduler program. The requests are
then serviced on a FIFO basis.
3. Another common application of queues is routing calls in call centres. All the calls
made by clients are stored in a waiting queue and allotted to different executives who
attend the calls. When all the executives are busy handling customers, the call which
was made first out of all the waiting calls is connected to an executive as soon as one
becomes available.
4. Another important application of queues in computer systems is interrupt
handling. A computer system is connected to many input and output devices. These
devices keep sending requests to the processor by creating interrupts repeatedly. These
interrupts are handled by the interrupt handler program, which puts them in a queue
as and when they arrive. It then services these interrupts as per the availability of the CPU.
5. M/M/1 queue. The M/M/1 queue is a fundamental queueing model in operations
research and probability theory. Tasks arrive according to a Poisson process at a certain
rate λ, meaning that on average λ customers arrive per hour. More specifically, the
inter-arrival times follow an exponential distribution with mean 1/λ, and the probability of
k arrivals between time 0 and t is (λt)^k e^(-λt) / k!. Tasks are serviced in FIFO order according
to a Poisson process with rate μ. The two M's stand for Markov: they mean that the system
is memoryless: the times between arrivals are independent, and the times between
departures are independent.
Self-assessment Questions
10) Which data structure allows deleting data elements from front and inserting at rear?
a) Stacks
b) Queues
c) Dequeues
d) Binary search tree
11) The push and enqueue operations are essentially the same operations, push is used for
Stacks and enqueue is used for Queues.
a) True
b) False
12) In order to input a list of values and output them in order, you could use a Queue. In
order to input a list of values and output them in opposite order, you could use a Stack.
a) True
b) False
Summary
o Similar to stacks, queues are data structures usually used to simplify certain
programming operations.
o In these data structures, only one data item can be immediately accessed.
o A queue, in general, allows access to the first item that was inserted.
o The important queue operations are inserting an item at the rear of the queue and
removing the item from the front of the queue.
o A queue can be implemented as a circular queue, which is based on an array in
which the indices wrap around from the end of the array to the beginning.
o A priority queue allows access to the smallest (or sometimes the largest) item in the
queue.
o The important priority queue operations are inserting an item in sorted order and
removing the item with the smallest key.
o A few important operations performed on the queue are insertion, also called
enqueue; deletion, also called dequeue; Qempty, which checks if the queue is empty;
Qfull, which checks if the queue is full; and display, which is used to display all the
elements in the queue.
o Queues find applications in implementing job scheduling algorithms, page
replacement algorithms, interrupt handling mechanisms, etc. in the design of
operating systems.
Terminal Questions
1. Explain the basic operations of a queue.
2. Discuss the functioning of a circular queue.
3. Mention the limitations of a linear queue with a suitable example.
4. Discuss the applications of queues.
Answer Keys
Self-assessment Questions
Question No.    Answer
1               d
2               b
3               b
4               a
5               b
6               a
7               a
8               a
9               d
10              b
11              a
12              a
Activity
Activity Type: Offline
Duration: 15 Minutes
Description:
Fill in the following table to give the running times of the priority queue operations for the
given two implementations, using O() notation. You should assume that the
implementation is reasonably well done, for example, not performing expensive
computations when a value can be stored in an instance variable and used as needed.
A priority queue is a data structure that supports storing a set of values, each of which has
an associated key. Each key-value pair is an entry in the priority queue. The basic operations
on a priority queue are:
• insert(k, v) – insert value v with key k into the priority queue
• removeMin() – return and remove from the priority queue the entry with the smallest key
Other operations on the priority queue include size(), which returns the number of entries
in the queue, and isEmpty(), which returns true if the queue is empty and false otherwise.
Two simple implementations of a priority queue are an unsorted list, where new entries are
added at the end of the list, and a sorted list, where entries in the list are sorted by their key
values.
Operation        size(), isEmpty()    insert    removeMin()
Unsorted list    _________            ______    ___________
Sorted list      _________            ______    ___________
Bibliography
e-Reference
• bowdoin.edu, (2016). Computer Science 210: Data Structures. Retrieved on 19 April 2016, from http://www.bowdoin.edu/~ltoma/teaching/cs210/fall10/Slides/StacksAndQueues.pdf
External Resources
• Kruse, R. (2006). Data Structures and Program Designing Using 'C' (2nd ed.). Pearson Education.
• Srivastava, S. K., & Srivastava, D. (2004). Data Structures through C in Depth (2nd ed.). BPB Publications.
• Weiss, M. A. (2001). Data Structures and Algorithm Analysis in C (2nd ed.). Pearson Education.
Video Links
Topic                  Link
Circular queue         https://www.youtube.com/watch?v=g9su-lnW2Ks
Priority queue         https://www.youtube.com/watch?v=gJc-J7K_P_w
Double ended queue     https://www.youtube.com/watch?v=4xLh68qokxQ
MODULE - IV
Linked List
MODULE 4
Linked List
Module Description
Until now, the implementation of data structures was done only with arrays, which use
contiguous memory allocation. This module covers a new way of representing and
implementing data structures: the linked list. Chapter 4.1 covers the advantages and
disadvantages of the array implementation and introduces linked lists. The specification of
linked lists, which includes their definition, components and representation, also forms part of
the chapter. Moreover, the chapter also covers different types of linked lists. Chapter 4.2 focuses
mainly on the different operations which can be performed on linked lists and their implementation.
Chapter 4.1
Introduction to Linked List
Chapter 4.2
Operations on Linked List
Chapter Table of Contents
Chapter 4.1
Introduction to Linked List
Aim ..................................................................................................................................................... 209
Instructional Objectives................................................................................................................... 209
Learning Outcomes .......................................................................................................................... 209
Introduction to Linked Lists ........................................................................................................... 210
4.1.1 Linked List Specifications....................................................................................................... 211
(i) Definition ............................................................................................................................ 212
(ii) Components....................................................................................................................... 212
(iii) Representation.................................................................................................................. 213
(iv) Advantages and Disadvantages ...................................................................................... 214
Self-assessment Questions ...................................................................................................... 216
4.1.2 Types of Linked Lists .............................................................................................................. 217
(i) Singly Linked Lists ............................................................................................................. 217
(ii) Doubly Linked Lists .......................................................................................................... 220
(iii) Circular Linked Lists ....................................................................................................... 222
Self-assessment Questions ...................................................................................................... 224
Summary ........................................................................................................................................... 225
Terminal Questions.......................................................................................................................... 225
Answer Keys...................................................................................................................................... 226
Activity............................................................................................................................................... 227
Case Study: ........................................................................................................................................ 227
Bibliography ...................................................................................................................................... 229
e-References ...................................................................................................................................... 229
External Resources ........................................................................................................................... 229
Video Links ....................................................................................................................................... 229
Aim
To provide students with the knowledge of linked lists and enable them to write
programs using linked lists in C
Instructional Objectives
After completing this chapter, you should be able to:
• Explain the components and representation of a linked list
• Outline the advantages and disadvantages of linked lists
• Differentiate singly, doubly and circular linked lists
• Discuss the applications of linked lists
Learning Outcomes
At the end of this chapter, you are expected to:
• Construct a table of advantages and disadvantages of linked lists
• Discuss various types of linked lists
• Identify the components of all the different types of linked lists
Introduction to Linked Lists
In this chapter, we will introduce the concept of the linked list data structure. The chapter focuses
on how linked lists can be used to overcome the limitations of array data structures. The use of
linked lists helps to achieve high flexibility in programming. In this chapter, we will come across
the basic structure of a linked list and its various representations, the types of linked lists, the
advantages and disadvantages associated with linked lists, and their applications.
A linked list is a linear arrangement of interconnected nodes, each of which consists of two
components, namely data and link. The data component holds the value stored at that node, and
the link holds the address of the next interconnected node.
As already covered in previous chapters, linear data structures such as stacks and queues
can be implemented using arrays, which allocate memory sequentially. The contiguous
memory allocation of arrays provides several advantages for implementing stacks and queues,
as given below:
• Faster data access: Arrays operate on computed addresses. Therefore, direct access to data is possible, which reduces the data access time.
• Simple to understand and use: Arrays are very simple to understand. Declaring, accessing, displaying, etc. of data from arrays is very simple. This makes the implementation of stacks and queues very simple.
• Adjacency of data: In arrays, data are both physically and logically adjacent. Therefore the loss of a data element does not affect the other part of the list.
There are also drawbacks associated with the sequential allocation of data. The disadvantages of
the array implementation are given below:
• Static memory allocation: While using arrays, the compiler allocates a fixed amount of memory for the program before execution begins, and this allocated memory cannot be changed during execution. It is difficult, and sometimes not possible, to predict the amount of memory that may be required by an application in advance. If more than the required memory is allocated to a program and the application does not utilize it, the result is wastage of memory space. In the same way, if less memory is allocated to a program and the application demands more, then the additional amount of memory cannot be allocated during execution.
• Requirement of contiguous memory space: For the implementation of linear data structures using arrays, a sufficient amount of contiguous memory is required. Sometimes, even though there is space available in memory, it is not contiguous, which makes it of no use.
• Insertion and deletion operations on arrays are time-consuming and sometimes tedious tasks.
In order to overcome the drawbacks of contiguous memory allocation, linear data
structures like stacks and queues can be implemented using the linked allocation technique. The
structure in the form of linked lists can be used to implement linear data structures like stacks
and queues efficiently. This mechanism can be used for many different purposes and data
storage applications.
4.1.1 Linked List Specifications
Linked lists are used to implement a list of data items in some order. The linked list structure
uses memory that grows and shrinks as per the requirement. Figure 4.1.1 below helps to better
understand the linked list structure.
Figure 4.1.1: Linked List
Every component comprising "data" and "next" in the above figure 4.1.1 is called a node, and
the arrow represents the link to the next node. Head represents the start of the linked list.
The size of the data item can vary as per requirement. Linked lists can grow as long as there is
memory space available.
(i) Definition
A linked list is a linked representation of a data structure consisting of a collection of nodes. An
individual node is divided into two fields named data and link. The data field contains the
information or the data to be stored by the node. The link field contains the address of the
next node. Figure 4.1.2(a) below demonstrates the general structure of a node in a linked list.
Here, the field that holds data is termed Info and the field that holds the address is termed
Link. Figure 4.1.2(b) demonstrates an instance of a node. Here the info field contains the
integer number 90 and the address field contains the address 2468 of the next node in the list.
Figure 4.1.3 shows an example of a linked list of integer numbers.
Figure 4.1.2: (a) Structure of Node (b) Instance of Node
Did you know?
Linked lists were developed in 1955–1956 by Allen Newell, Cliff Shaw and Herbert A.
Simon at RAND Corporation as the primary data structure for their Information
Processing Language. IPL was used by the authors to develop several early artificial
intelligence programs, including the Logic Theory Machine, the General Problem Solver,
and a computer chess program.
(ii) Components
As already discussed in the previous section, a linked list is basically a collection of nodes.
These nodes individually have two components: data and link. The data component holds the
actual data that the node should hold, and the link contains the address of the node that is next
to it. In this way the linked list grows like a chain. Along with this, a linked list also has a head
(alternatively called start) pointer which points to the beginning or start of the linked list. The
link part of the last node of the linked list contains null, which represents the end of the list.
Figure 4.1.3 shows a linked list containing integer data elements.
Figure 4.1.3: Linked List Containing Integer Values
(iii) Representation
One way to represent a linked list is by using two linear arrays for memory representation.
Assume there are two arrays, INFO and LINK. In these two arrays, INFO[K] contains the data
part and LINK[K] the pointer field of node K. Assume START is a variable storing the starting
address of the list and NULL is a pointer value indicating the end of the list.
The pictorial representation of the linked list as discussed above is shown in figure 4.1.4 and
figure 4.1.5.
Figure 4.1.4: Representation of Linked List
Figure 4.1.5: INFO and LINK Arrays
Here,
START = 9   => INFO[9] = H is the first character
LINK[9] = 4 => INFO[4] = E is the second character
LINK[4] = 6 => INFO[6] = L is the third character
LINK[6] = 2 => INFO[2] = L is the fourth character
LINK[2] = 8 => INFO[8] = O is the fifth character
LINK[8] = 0 => NULL value, so the list ends
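The walkthrough above can be reproduced directly in C. The array contents below mirror the figure's example (index 0 is left unused so that 0 can serve as the NULL link); traverse is an illustrative helper name, not from the text:

```c
/* Index 0 is unused so that 0 can serve as the NULL link. */
char INFO[10] = { 0, 0, 'L', 0, 'E', 0, 'L', 0, 'O', 'H' };
int  LINK[10] = { 0, 0, 8,   0, 6,   0, 2,   0, 0,   4 };
int  START = 9;

/* Follows the links from START, writing each character into out;
   returns the number of nodes visited. */
int traverse(char *out)
{
    int k = START, n = 0;
    while (k != 0) {      /* link value 0 plays the role of NULL */
        out[n++] = INFO[k];
        k = LINK[k];
    }
    out[n] = '\0';
    return n;
}
```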
Generally a linked list node is implemented in C programming using a structure and
allocating dynamic memory to the structure. The syntax shown below demonstrates the
implementation of the node using a structure.
struct test_struct
{
int val;
struct test_struct *next;
};
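Such a node is usually created with dynamic memory allocation. Below is a small hedged helper (create_node is an illustrative name, not from the text):

```c
#include <stdlib.h>

struct test_struct
{
    int val;
    struct test_struct *next;
};

/* Allocates a node on the heap and initialises both fields */
struct test_struct *create_node(int val)
{
    struct test_struct *p = malloc(sizeof(struct test_struct));
    if (p == NULL)
        return NULL;      /* allocation failed */
    p->val = val;
    p->next = NULL;       /* not linked to anything yet */
    return p;
}
```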
(iv) Advantages and Disadvantages
The advantages and disadvantages of linked lists are given below:
Advantages of linked lists
• Number of elements: The first and foremost advantage of linked lists over arrays is that we need not know in advance how many elements will form part of the list. Therefore we need not allocate memory for the linked list in advance.
• Insertion and deletion operations: While using linked lists, insertion and deletion operations can be performed without fixing the size of the memory in advance.
• Memory allocation: One of the most important advantages of linked lists over arrays is that a linked list utilizes only the exact amount of memory required to store its data, and it can expand to acquire more memory locations if needed.
• Non-contiguous memory: Unlike arrays, a linked list does not require contiguous memory allocation. We do not require elements to be stored in consecutive memory locations. That means even if a contiguous memory block is unavailable, we can still store data.
Disadvantages of linked lists
• Pointer memory: The use of linked lists requires extra memory space, as pointers are stored along with the information or data. This makes the implementation expensive in terms of memory requirement.
• No random access: Since we have to access nodes or elements sequentially, we do not have direct access to the element at a particular node. Sequential access also makes data access time-consuming, depending on the location of the element.
• Traversal: In singly linked lists, traversal from the end of the list to the beginning of the list is not possible.
• Sorting: Sorting elements in a linked list is not as easy as the sorting operations on arrays.
The differences between static and linked allocation are mentioned below.
Table 4.1.1: Differences between Static and Linked Allocation

Static Allocation Technique                                              | Linked Allocation Technique
Memory is allocated during compile time.                                 | Memory is allocated during execution time.
The size of the memory allocated is fixed.                               | The size of the memory allocated may vary.
Suitable for applications where data size is fixed and known in advance. | Suitable for applications where data size is unpredictable.
Execution is faster.                                                     | Execution is slower.
Insertion and deletion operations are not strictly defined; they can be done conceptually, but in an inefficient way. | Insertion and deletion operations are defined and can be done more efficiently.
Self-assessment Questions
1) Individual node in the linked list consists of _________ number of fields.
a) One
b) Two
c) Three
d) Four
2) The info or data field of linked list contains _______ and link field contains _______.
a) Data to be stored; link to the previous node
b) Data to be deleted; link to the next node
c) Data to be stored; link to the next node
d) Data to be stored; link to the previous node.
3) The linked list allocates memory for its data elements in _____________ order.
a) Even
b) Contiguous
c) Consecutive
d) Non-contiguous
4) Linked lists do not require additional memory space for holding pointers.
a) True
b) False
4.1.2 Types of Linked Lists
Based on the access to the list or traversal linked lists, the different types of linked lists are:
1. Singly linked lists
2. Doubly linked lists
3. Circular linked lists
(i) Singly Linked Lists
This is the simplest representation of the linked list. The structure of linked lists discussed till now
may be called a singly linked list. Each individual data element in the list is called a 'node'.
The elements in the list may or may not be present in consecutive memory locations; therefore
pointers are used to maintain the order of the list. Every individual node is divided into two
parts called INFO and LINK. INFO is used to store data and LINK is used to store the address
of the next node. The pointer START is used to indicate the start of the linked list and NULL is
used to represent the end of the list.
The following figure 4.1.6 shows the representation of a singly linked list.
Figure 4.1.6: Singly linked lists
Representation using C programming
struct test_struct
{
int val;
struct test_struct *next;
};
Program to Create Singly Linked List.
#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
void main()
{
struct node
{
int n;
struct node *ptr;
};
typedef struct node NODE;
NODE *head, *first, *temp = 0;
int cnt = 0;
int ch = 1;
first = 0;
while (ch)
{
head = (NODE *)malloc(sizeof(NODE));
printf("Enter the data item:\n");
scanf("%d", &head-> n);
if (first != 0)
{
temp->ptr = head;
temp = head;
}
else
{
first = temp = head;
}
fflush(stdin); /* non-standard; clears leftover input on some compilers */
printf("Do you wish to continue(Press 0 or 1)?\n");
scanf("%d", &ch);
}
temp->ptr = 0;
/* reset temp to the beginning */
temp = first;
printf("\nThe linked lists elements are:\n");
while (temp != 0)
{
printf("%d=>", temp->n);
cnt++;
temp = temp -> ptr;
}
printf("NULL\n");
}
Advantages of Singly Linked Lists
• The most obvious advantage of singly linked lists is their easy representation and simple implementation.
• Secondly, we only need to keep track of one pointer, i.e., the forward pointer, without having to bother about information on the previous node.
• It is a persistent data structure: a simple list of objects formed by each carrying a reference to the next in the list. This is persistent because we can take a tail of the list, meaning the last k items for some k, and add new nodes onto the front of it. The tail will not be duplicated, instead becoming shared between both the old list and the new list. As long as the contents of the tail are immutable, this sharing will be invisible to the program.
Disadvantages of Singly Linked Lists
• The very advantage of singly linked lists, forward sequential access, becomes a drawback when we need to traverse back to previous nodes. Since the singly linked list structure does not keep any information about the previous node, traversing backward is impossible.
• If we want to delete an element in the linked list and the element to be deleted is present at the end of the list, this becomes the worst case for a singly linked list, as we have to check all the nodes from the beginning to the end. The worst case for this operation is O(n).
Did you know?
Several operating systems developed by Technical Systems Consultants (originally of
West Lafayette Indiana, and later of Chapel Hill, North Carolina) used singly linked lists
as file structures. A directory entry pointed to the first sector of a file, and succeeding
portions of the file were located by traversing pointers. Systems using this technique
included Flex (for the Motorola 6800 CPU), mini-Flex (same CPU), and Flex9 (for the
Motorola 6809 CPU). A variant developed by TSC for and marketed by Smoke Signal
Broadcasting in California, used doubly linked lists in the same manner.
(ii) Doubly Linked Lists
Doubly linked lists, also referred to as D-lists, are data structures where each individual node
consists of three fields. One is the usual INFO field, which holds the data element. The other two
are called NEXT and PREV, which hold links to the next and previous nodes respectively.
The use of D-lists provides us access to both directions of the list. We can traverse forward using
the NEXT link and backward using PREV. Figure 4.1.7 below shows the pictorial
representation of a doubly linked list.
Figure 4.1.7: Doubly Linked List
Representation using a C program
struct test_struct
{
int val;
struct test_struct *next;
struct test_struct *prev;
};
Advantages of Doubly Linked Lists
• This structure overcomes the disadvantage of the singly linked list of traversing only forward, by having two pointers, NEXT and PREV. Using NEXT we can move forward and using PREV we can traverse backward.
• This structure also provides us the flexibility to move to any node from any other node by moving to and fro throughout the list, which is not possible in a singly linked list.
• A node in a doubly linked list may be deleted with little trouble, since we have pointers to the previous and next nodes. A node in a singly linked list cannot be removed unless we have the pointer to its predecessor.
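The easy deletion noted in the last point can be sketched as follows, given a pointer to the node itself (struct dnode and delete_node are illustrative names; a sketch under those assumptions, not the book's listing):

```c
#include <stdlib.h>

struct dnode
{
    int val;
    struct dnode *next;
    struct dnode *prev;
};

/* Unlinks node from the list and frees it; returns the (possibly new) head.
   Runs in O(1), since prev and next are both at hand. */
struct dnode *delete_node(struct dnode *head, struct dnode *node)
{
    if (node->prev != NULL)
        node->prev->next = node->next;
    else
        head = node->next;           /* deleting the first node */
    if (node->next != NULL)
        node->next->prev = node->prev;
    free(node);
    return head;
}
```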
Disadvantages of Doubly Linked Lists
• A doubly linked list implementation requires a higher amount of memory as compared to a singly linked list. This is because of its very structure of having two links, for moving forward and backward. The extra pointer in each node increases the need for memory.
• The deletion and insertion operations on doubly linked lists involve more work because of the use of two pointers. Updating these pointers after each operation takes time and requires extra coding, which increases complexity.
(iii) Circular Linked Lists
The third type of linked list is the circular linked list. Here we have START as usual; however, the
link of the last node is not NULL. The link part of the last node points back to START. This
connection to START lets us traverse back to the beginning, making a round-robin structure. It
overcomes the drawback of the singly linked list, where one cannot move back to the first node.
Figure 4.1.8 shows the structure of a circular linked list.
Figure 4.1.8: Circular Linked List
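Traversal of such a list must stop when the walk arrives back at START, or it will loop forever. A minimal sketch in C (struct cnode and count_nodes are illustrative names, not from the text):

```c
#include <stddef.h>

struct cnode
{
    int val;
    struct cnode *next;
};

/* Visits every node exactly once by stopping when the walk
   returns to start; returns the number of nodes visited. */
int count_nodes(struct cnode *start)
{
    int n = 0;
    struct cnode *p = start;
    if (start == NULL)
        return 0;
    do {                    /* do-while, since p equals start at entry */
        n++;
        p = p->next;
    } while (p != start);   /* the stop condition guards against an infinite loop */
    return n;
}
```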
Advantages of Circular Linked Lists
• A circular linked list enables traversal back to the start of the list, because the last node is connected back to the first. This removes the drawback of the singly linked list, where the starting elements cannot be reached again once the pointer has moved past them.
• A circular linked list also avoids the two pointers per node of a doubly linked list while still letting us traverse the whole list. This saves the memory of one extra pointer in every node.
• Handling pointers in a circular linked list is as easy as in a singly linked list, since we only have to track one pointer that moves forward.
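The single-forward-pointer traversal just described can be sketched as follows. This is a minimal illustration with hypothetical names (cnode, count_nodes); a do-while loop is used because the stopping test "back at START" must only be applied after at least one step.

```c
#include <stdlib.h>

struct cnode
{
    int data;
    struct cnode *next;
};

/* Walk the circle exactly once and return the number of nodes.
   There is no NULL terminator; we stop when the moving pointer
   comes back around to the node we started from. */
int count_nodes(struct cnode *start)
{
    int n = 0;
    struct cnode *ptr = start;
    if (start == NULL)
        return 0;               /* empty list: nothing to count */
    do
    {
        n++;
        ptr = ptr->next;
    } while (ptr != start);     /* full circle reached: stop */
    return n;
}
```

The same loop shape works for printing, summing, or searching; only the body changes.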
Disadvantages of Circular Linked Lists
• Although we can travel all the way around the list, reaching the previous node is very time consuming: there is no pointer that keeps track of the previous node, so we must traverse almost the entire list again to reach the node just behind the current one.
• If a proper exception handling mechanism is absent, an implementation of a circular list can be dangerous, as a traversal may fall into an infinite loop (there is no NULL link to stop it).
• Reversing the list is difficult.
Applications of circular linked lists
• Implementation of waiting and context-switch queues in operating systems. When multiple processes run on an operating system under a mechanism that gives each process a limited time slot, the waiting processes can form a circular linked list. The task at the head of the list is given CPU time; once its allocated time finishes, the task is taken out, added again at the end of the list, and the cycle continues.
• A circular linked list is useful for implementing a queue with lists. With an ordinary list we need two pointers, one to the head and one to the end of the list, because in a queue addition happens at the end and removal at the head. With a circular list, both can be done using only one pointer.
• A real-life application of the circular linked list is in our personal computers, where multiple applications are running. The running applications are kept in a circular linked list and the OS gives each a fixed time slot. The operating system keeps iterating over the list until all the applications are completed.
• Multiplayer games: all the players are kept in a circular linked list, and the pointer moves forward as each player's turn ends.
• A circular linked list can also be used to build a circular queue. An ordinary queue must keep two pointers, FRONT and REAR, in memory at all times, whereas with a circular linked list only one pointer is required.
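The last point, a queue kept with only one pointer, can be sketched like this. It is a hedged sketch with hypothetical names (qnode, enqueue, dequeue): the single pointer tracks the REAR node, and because the list is circular the FRONT is then always rear->next, so both operations run in O(1).

```c
#include <stdlib.h>

struct qnode
{
    int data;
    struct qnode *next;
};

void enqueue(struct qnode **rear, int data)
{
    struct qnode *n = (struct qnode *)malloc(sizeof(struct qnode));
    n->data = data;
    if (*rear == NULL)
        n->next = n;              /* single node: it points to itself */
    else
    {
        n->next = (*rear)->next;  /* new node links round to the front */
        (*rear)->next = n;
    }
    *rear = n;                    /* the new node is the new rear */
}

/* Caller must ensure the queue is not empty before calling. */
int dequeue(struct qnode **rear)
{
    struct qnode *front = (*rear)->next;
    int data = front->data;
    if (front == *rear)
        *rear = NULL;             /* last node removed: queue is empty */
    else
        (*rear)->next = front->next;
    free(front);
    return data;
}
```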
Self-assessment Questions
5) In ___________ type of linked lists we can traverse in both the directions.
a) Singly linked list
b) Circular linked list
c) One dimensional linked list
d) Doubly linked list
6) In circular linked list, the link part of the last element in the list holds the address of
_______.
a) Random node
b) NULL
c) START
d) Previous Node
7) It is possible to traverse across the list using circular lists.
a) True
b) False
8) In singly linked list it is possible to traverse back to the START.
a) True
b) False
Summary
o A linked list is a linked representation of data: a collection of nodes.
o Each node is divided into two fields, named data and link.
o The data field contains the information or the data to be stored by the node. The link field contains the address of the next node.
o Based on how the list is accessed or traversed, there are three types of linked lists: singly linked lists, doubly linked lists and circular linked lists.
o Whether a linked list is suitable depends on the application, as some applications demand sequential (contiguous) allocation, where linked lists cannot be used.
Terminal Questions
1. Explain the advantages and disadvantages of static memory allocation.
2. Explain the advantages and disadvantages of linked lists.
3. Compare singly linked lists and doubly linked lists.
4. Compare singly linked lists and circular linked lists.
Answer Keys
Self-assessment Questions

Question No.    Answer
1               b
2               c
3               d
4               b
5               d
6               c
7               a
8               b
Activity
Activity Type: Offline
Duration: 30 Minutes
Description:
Students should demonstrate the operations on a linked list. Here a few students can act as
nodes and another student can act as a pointer. The students should perform operations like
insertion and deletion on these nodes.
Students should act as nodes of a singly linked list, a doubly linked list and a circular linked
list, and the operations should be performed on each type of linked list.
Case Study:
Stack Implementation through Linked List
We can avoid the size limitation of a stack implemented with an array by using a linked list to
hold the stack elements. As with the array, we have to decide where to insert elements in the
list and where to delete them so that push and pop run as fast as possible.
Primarily, a stack has two operations, push() and pop(), and it carries LIFO behaviour, i.e.,
last in, first out. Recall that when implementing the stack with an array, we pushed and popped
elements at the end of the array to achieve LIFO behaviour. Pushing and popping elements at
the beginning of the array instead would carry the overhead of shifting elements towards the
right to push an element at the start, and shifting elements towards the left to pop an element
from the start. To avoid this overhead of shifting left and right, we decided to push and pop
elements at the end of the array.
Q.1) Now, if we use linked list to implement the stack, where will we push the element inside
the list and from where will we pop the element?
Hint: There are a few facts to consider before we make any decision: insertion and removal in
a stack take constant time, and a singly linked list can serve the purpose.
The figure above has two parts. On the left-hand side there is the stack implemented using an
array. The elements present inside this stack are 1, 7, 5 and 2. The most recent element of the
stack is 1; it will be removed if pop() is called at this point. On the right side there is the stack
implemented using a linked list. This stack has four nodes, linked in such a fashion that the
very first node, pointed to by the head pointer, contains the value 1. This first node points to
the node with value 7, the node with value 7 points to the node with value 5, and the node with
value 5 points to the last node, with value 2. To make a stack data structure using a linked list,
we have inserted new nodes at the start of the linked list.
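One possible sketch of this head-insertion scheme in C is shown below. The names (push, pop, top) are illustrative assumptions, and this is only one valid shape for an answer to the case-study questions, not the definitive one.

```c
#include <stdlib.h>

struct node
{
    int data;
    struct node *next;
};

/* push: insert at the head of the list -- O(1), no shifting. */
void push(struct node **top, int data)
{
    struct node *n = (struct node *)malloc(sizeof(struct node));
    n->data = data;
    n->next = *top;   /* new node points to the old top */
    *top = n;         /* new node becomes the top        */
}

/* pop: remove from the head of the list -- O(1).
   Caller must ensure the stack is not empty. */
int pop(struct node **top)
{
    struct node *n = *top;
    int data = n->data;
    *top = n->next;   /* the next node becomes the top */
    free(n);
    return data;
}
```

Inserting and removing at the head gives LIFO order with constant-time operations, which is exactly why the nodes in the figure were inserted at the start of the list.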
Q.2) Write a pseudo-code to carry out insertion and deletion operations of stack with the help
of linked list.
Q.3) Will the stack implementation using linked list be cost effective?
Bibliography
e-Reference
• cs.cmu.edu, (2016). Linked Lists. Retrieved on 19 April 2016, from
https://www.cs.cmu.edu/~adamchik/15121/lectures/Linked%20Lists/linked%20lists.html
External Resources
• Kruse, R. (2006). Data Structures and Program Designing Using 'C' (2nd ed.). Pearson Education.
• Srivastava, S. K., & Srivastava, D. (2004). Data Structures Through C in Depth (2nd ed.). BPB Publications.
• Weiss, M. A. (2001). Data Structures and Algorithm Analysis in C (2nd ed.). Pearson Education.
Video Links
Topic                     Link
Doubly Linked List        https://www.youtube.com/watch?v=k0pjD12bzP0
Linked Lists              https://www.youtube.com/watch?v=LOHBGyK3Hbs
Circular Linked List      https://www.youtube.com/watch?v=I4tVBFBoNSA
Chapter Table of Contents
Chapter 4.2
Operations on Linked List
Aim ..................................................................................................................................................... 231
Instructional Objectives................................................................................................................... 231
Learning Outcomes .......................................................................................................................... 231
Introduction ...................................................................................................................................... 232
4.2.1 Operations on Singly Linked List ............................................................................ 232
(i) Creating a Linked List ........................................................................................................ 233
(ii) Insertion of a Node in Linked List .................................................................................. 234
(iii) Deletion of a Node from the Linked List ...................................................................... 238
(iv) Searching and Displaying Elements from the Linked List .......................................... 242
Self-assessment Questions ............................................................................................................... 245
Summary ........................................................................................................................................... 246
Terminal Questions.......................................................................................................................... 247
Answer Keys...................................................................................................................................... 248
Activity............................................................................................................................................... 248
Bibliography ...................................................................................................................................... 249
e-References ...................................................................................................................................... 249
External Resources ........................................................................................................................... 249
Video Links ....................................................................................................................................... 249
Aim
To educate the students on the importance of using Linked List data structure in
computer science
Instructional Objectives
After completing this chapter, you should be able to:
• Demonstrate traversing in the linked list
• Illustrate insertion and deletion operations at the end and beginning of a linked list
• Demonstrate the approach to search for and display a value in the linked list
Learning Outcomes
At the end of this chapter, you are expected to:
• Identify the traversal path of the linked list
• Discuss insertion and deletion operations in singly linked lists
• Determine the position of the value to be searched in the linked list
Introduction
In the previous chapter, the basic concepts and fundamentals of linked lists were covered.
Linked lists are very robust and dynamic data structures: nodes can be added, deleted or
updated at minimal cost. Moreover, a linked list needs no large contiguous block of memory
at compile time; it acquires memory at runtime as required. These properties make linked
lists a primary choice of programmers for many practical applications.
This chapter focuses on the most important and fundamental operations on linked lists:
creating a linked list, inserting a node, deleting a node from the list, searching for a node
element and displaying the elements of the list.
4.2.1 Operations on Singly Linked List
Deleting or adding an element in an array requires shifting array elements to create or fill
holes (empty spaces), while updating an array element only requires accessing the value at
that particular index and overwriting or replacing it. Moreover, one of the distinguishing
features of arrays is random access, which allows any index to be reached directly: given a
particular index, we can easily find the value stored there.
In the case of linked lists, the entry point is the head of the list. The head or start is not itself
a node but a reference to the first node in the linked list; that is, the head holds a value (an
address). For an empty linked list, the value of the head is NULL. A linked list always ends
with a NULL pointer (except for a circular linked list, where the last node is connected to the
start or head node). Great care must be taken while manipulating linked lists, as any wrong
link in the middle makes the rest of the list inaccessible. That is because the only way to
traverse a list is by following the reference to the next node from the current node. This
concept of linking wrong nodes and losing references is called "memory leaks". Once a
memory reference is lost, the entire list from that point becomes inaccessible.
Some of the basic operations on linked lists discussed in this chapter are as follows:
• Creation of linked lists
• Insertion of a node into the linked list
• Deletion of a node from the linked list
• Searching for an element in the list
• Displaying all linked list elements
Did you know?
Many programming languages such as Lisp and Scheme have singly linked lists built in.
In many functional languages, these lists are constructed from nodes, each called a cons
or cons cell. The cons has two fields: the car, a reference to the data for that node, and the
cdr, a reference to the next node. Although cons cells can be used to build other data
structures, this is their primary purpose.
In languages that support abstract data types or templates, linked list ADTs or templates
are available for building linked lists. In other languages, linked lists are typically built
using references together with records.
(i) Creating a Linked List
The simplest of the linked list operations is creating a linked list. It involves getting a free
node and copying a data item into the "data" field of the node. The next step is to update the
links of the node: for an entirely new list, the start pointer and the NULL terminator must be
set, and when merging lists, the references must be updated accordingly.
Below is a C programming code segment for declaring a linked list.
struct node
{
int data;
struct node *next;
}*start=NULL;
Below is a C programming function for creating a new node
void create()
{
    char c;
    struct node *new_node, *current = NULL;
    do
    {
        new_node = (struct node *)malloc(sizeof(struct node));
        printf("\nEnter the data : ");
        scanf("%d", &new_node->data);
        new_node->next = NULL;
        if(start == NULL)            /* first node becomes the start */
            start = new_node;
        else                         /* otherwise append after the last node */
            current->next = new_node;
        current = new_node;
        printf("\nDo you want to create another node (y/n) : ");
        c = getch();
    } while(c != 'n');
}
(ii) Insertion of a Node in Linked List
Insertion is an important operation when working with data structures; it adds data to the
structure. There are notably three different cases:
• Insertion at the front of the list
• Insertion at any given location within the list
• Insertion at the end of the list
The general procedure for inserting a node in the linked list is detailed below:
Step 1: Input the value of the new node and the position where it is supposed to be inserted.
Step 2: Check whether free memory is available for a new node. If NO, then display an error
message "Overflow". Else continue.
Step 3: Create a new node and Insert the data value in the data field of new node.
Step 4: Add the new node to the desired location in the linked list.
Below is C programming function for inserting a node in the linked list.
void insert(node *ptr, int data)
{
    /* Iterate through the list till we encounter the last node. */
    while(ptr->next != NULL)
    {
        ptr = ptr->next;
    }
    /* Allocate memory for the new node and put data in it. */
    ptr->next = (node *)malloc(sizeof(node));
    ptr = ptr->next;
    ptr->data = data;
    ptr->next = NULL;
}
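Note that insert() dereferences ptr immediately, so it assumes the list already contains at least one node. A self-contained usage sketch might first create that head node by hand; make_head here is our own illustrative helper, not a function from the text.

```c
#include <stdlib.h>

typedef struct node
{
    int data;
    struct node *next;
} node;

/* Append at the end, as in the text: walk to the last node first. */
void insert(node *ptr, int data)
{
    while (ptr->next != NULL)
        ptr = ptr->next;
    ptr->next = (node *)malloc(sizeof(node));
    ptr = ptr->next;
    ptr->data = data;
    ptr->next = NULL;
}

/* Hypothetical helper: the head node must exist before insert()
   can be called on the list. */
node *make_head(int data)
{
    node *h = (node *)malloc(sizeof(node));
    h->data = data;
    h->next = NULL;
    return h;
}
```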
Insertion of node at the starting on the linked list
Figure 4.2.1 below shows the case for insertion of node in starting of the linked list.
Figure 4.2.1: Insertion of Node in Starting of the Linked List
Algorithm below shows insertion of node at the starting of the linked list.
Step 1. Create a new node and assign its address to a pointer, say PTR.
Step 2. [OVERFLOW?] IF (PTR = NULL)
        Write : OVERFLOW and EXIT.
Step 3. ASSIGN INFO[PTR] = ITEM
Step 4. IF (START = NULL)
        ASSIGN NEXT[PTR] = NULL
        ELSE
        ASSIGN NEXT[PTR] = START
Step 5. ASSIGN START = PTR
Step 6. EXIT
Following is the C programming function for implementing above algorithm.
void insertion(void)
{
    struct node *new1;
    new1 = (struct node *)malloc(sizeof(struct node));
    printf("\n Input the first node value: ");
    scanf("%d", &new1->data);
    new1->next = start;   /* new node points to the old first node */
    start = new1;         /* new node becomes the start of the list */
}
Insertion of node at any given position in the linked list
Figure 4.2.2 below shows the case for insertion of new node at any given location.
Figure 4.2.2: Insertion of Node at any Desired Location
The following algorithm shows insertion of node at any given location within the list.
InsertAtLoc(info, next, start, loc, size)
1. Set nloc = loc - 1, n = 1
2. Create a new node and assign its address to ptr
3. [Overflow?] If (ptr = NULL)
   Write : overflow and exit
4. Set info[ptr] = item
5. If (start = NULL)
      Set next[ptr] = NULL
      Set start = ptr
   Else if (nloc <= size)
      Repeat steps (a) and (b) while (n != nloc)
         (a) Set loc = next[loc]
         (b) Set n = n + 1
      [End while]
      Set next[ptr] = next[loc]
      Set next[loc] = ptr
   Else
      Set last = start
      Repeat step (a) while (next[last] != NULL)
         (a) Set last = next[last]
      [End while]
      Set next[last] = ptr
   [End if]
6. Exit
Below is the C programming function for implementing the above algorithm.
void insertAtloc(node **start,int item , int i,int k )
{
node *ptr,*loc,*last;
int n=1 ;
i=i-1;
ptr=(node*)malloc(sizeof(node));
ptr->info=item;
loc = *start ;
if(*start==NULL)
{
ptr->next = NULL ;
*start = ptr ;
}
else if(i<=k)
{
while(n != i)
{
loc=loc->next;
n++;
}
ptr->next = loc->next ;
loc->next = ptr ;
}
else
{
    ptr->next = NULL ;   /* appending at the end: ptr becomes the last node */
    last = *start;
    while(last->next != NULL)
    {
        last = last->next;
    }
    last->next = ptr ;
}
}
Insertion of node at end of the linked list
Figure 4.2.3 below shows the case for insertion of node at end of the list.
Figure 4.2.3: Insertion of Node at the End of the List
(iii) Deletion of a Node from the Linked List
The deletion of a node from a linked list is similar to insertion. Again, there are three cases:
• Deletion of the front node
• Deletion of any intermediate node
• Deletion of the last node
The general algorithm for deletion of any given node is given below:
Step 1: Search for the appropriate node to be deleted
Step 2: Remove the node
Step 3: Reconnect the linked list
Step 4: Update all the links.
Deleting the front node
Figure 4.2.4 shows deletion of node from the front of the linked list.
Figure 4.2.4: Deleting the Front Node
Following algorithm shows deletion of the front node from the linked list.
DELETE AT BEG(INFO, NEXT, START)
1. IF (START = NULL)
   Write : UNDERFLOW and EXIT
2. ASSIGN PTR = START
3. ASSIGN TEMP = INFO[PTR]
4. ASSIGN START = NEXT[PTR]
5. FREE(PTR)
6. RETURN(TEMP)
Below is the C programming function for implementing above algorithm.
void deleteatbeg(node **start)
{
node *ptr;
int temp;
ptr = *start ;
temp = ptr->info;
*start = ptr->next ;
free(ptr);
printf("\nDeleted item is %d : \n",temp);
}
Deletion of Intermediate node
Figure 4.2.5 shows deletion of intermediate node from the linked list.
Figure 4.2.5: Deletion of Intermediate Node from the Linked List
Following is the algorithm for deleting an intermediate node from the linked list.
Step 1: Traverse the linked list towards the node to be deleted.
Step 2: Examine each node, using the node pointers to move from node to node, until the
correct node is identified.
Step 3: Copy the next-pointer of the node to be removed into temporary memory.
Step 4: Remove the node from the list and mark the memory it was using as free once more.
Step 5: Update the previous node's next-pointer with the address held in temporary memory.
Below is the C programming function for implementation of above algorithm.
//Function to delete any node from linked list.
void delete_any()
{
int key;
if(header->link == NULL)
{
printf("\nEmpty Linked List. Deletion not possible.\n");
}
else
{
printf("\nEnter the data of the node to be deleted: ");
scanf("%d", &key);
ptr = header;
while((ptr->link != NULL) && (ptr->data != key))
{
    ptr1 = ptr;
    ptr = ptr->link;
}
if(ptr->data == key)
{
ptr1->link = ptr->link;
free(ptr);
printf("\nNode with data %d deleted.\n", key);
}
else
{
printf("\nValue %d not found. Deletion not possible.\n",
key);
}
}
}
Deletion of end node from the linked list
Figure 4.2.6 below shows deletion of the end node from the linked list.
Figure 4.2.6: Deletion of Last Node from the Linked List
Below is the algorithm for deletion of last node from the linked list.
Delete End(info, next, start)
1. If (start = NULL)
   Print Underflow and Exit
2. If (next[start] = NULL)
      Set ptr = start and start = NULL
      Set temp = info[ptr]
   Else
      Set cptr = start and ptr = next[start]
      Repeat steps (a) and (b) while (next[ptr] != NULL)
         (a) Set cptr = ptr
         (b) Set ptr = next[ptr]
      [End while]
      Set next[cptr] = NULL
      Set temp = info[ptr]
   [End if]
3. Free(ptr)
4. Return temp
5. Exit
Following is the C programming function for implementation of the above algorithm.
void deleteatlast(node **start)
{
node *ptr,*cptr;
int temp;
if(*start == NULL)            /* underflow: nothing to delete */
{
    printf("\nUnderflow : the list is empty\n");
    return;
}
if((*start)->next == NULL)
{
ptr = *start ;
*start = NULL;
temp = ptr->info;
}
else
{
cptr = *start ;
ptr =(*start)->next;
while(ptr->next != NULL)
{
cptr = ptr;
ptr = ptr->next;
}
cptr->next = NULL;
temp = ptr->info;
}
free(ptr);
printf("\nDeleted item is %d : \n",temp);
}
(iv) Searching and Displaying Elements from the Linked List
Searching and displaying elements of a linked list are very similar and simple operations, as
both involve a plain traversal across the list.
Searching: For a linked list this is a simple linear search, which begins at the start pointer and
continues until the element is found or the NULL pointer is reached.
The algorithm for searching an element in the linked list is as follows.
Step 1: Input the element to be searched, KEY.
Step 2: Initialise the current pointer to the beginning of the list.
Step 3: Compare KEY with the value in the data field of the current node.
Step 4: If they match, the search is complete; quit.
Step 5: Otherwise, move the current pointer to the next node in the list and go to Step 3,
until the end of the list is reached.
int search(int item)
{
    int count = 1;
    struct node *nw = start;
    while(nw != NULL)
    {
        if(nw->data == item)
            return count;   /* position of the element, counting from 1 */
        count++;
        nw = nw->next;
    }
    return -1;              /* element not present in the list */
}
The above code fragment shows how searching for an element takes place in a singly linked
list.
Searching in a singly linked list follows the linear search algorithm. Based on the search key
given by the user, the pointer starts at the first node of the list and compares the search key
with the data found at each node. If a match is found, the search is complete. If not, the
pointer moves forward in the linked list and the process continues till the last node of the
list.
Did you know?
Finding a specific element in a linked list, even if it is sorted, normally requires O(n) time. This
is one of the primary disadvantages of linked lists over other data structures. One way to
improve search time is the move-to-front heuristic, which simply moves an element to the
beginning of the list once it is found. This scheme, handy for creating simple caches, ensures
that the most recently used items are also the quickest to find again.
Another common approach is to "index" a linked list using a more efficient external data
structure. For example, one can build a red-black tree or hash table whose elements are
references to the linked list nodes. Multiple such indexes can be built on a single list. The
disadvantage is that these indexes may need to be updated each time a node is added or removed
(or at least, before that index is used again).
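The move-to-front heuristic mentioned above can be sketched on a singly linked list as follows. This is an illustrative sketch; mnode and search_mtf are hypothetical names, not part of the chapter's code.

```c
#include <stdlib.h>

struct mnode
{
    int data;
    struct mnode *next;
};

/* Search with the move-to-front heuristic: when the key is found
   anywhere past the head, unlink that node and re-insert it at the
   front, so repeated lookups of hot items get faster over time.
   Returns the (possibly new) head; *found is 1 on a hit, 0 on a miss. */
struct mnode *search_mtf(struct mnode *head, int key, int *found)
{
    struct mnode *prev = NULL, *cur = head;
    *found = 0;
    while (cur != NULL)
    {
        if (cur->data == key)
        {
            *found = 1;
            if (prev != NULL)           /* not already at the front */
            {
                prev->next = cur->next; /* unlink the found node    */
                cur->next = head;       /* re-insert it at the head */
                head = cur;
            }
            break;
        }
        prev = cur;
        cur = cur->next;
    }
    return head;
}
```

A miss leaves the list untouched; only successful searches reorder it.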
Displaying the linked list content
Similar to search function, display function begins by traversing from the starting of the list to
display the value in the data field of each and every node till it reaches the last node.
Below is the algorithm for displaying the contents of the linked list.
Step 1: Initialise the current pointer to the beginning of the list.
Step 2: Display the value in the data field of the current node.
Step 3: Advance the current pointer to point to the next node.
Step 4: Repeat Steps 2 and 3 until the end of the list.
Following is the C function implementing the above algorithm.
void display()
{
    struct node *temp;
    temp = start;
    while(temp != NULL)
    {
        printf("%d ", temp->data);   /* a space separates the values */
        temp = temp->next;
    }
}
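Putting appending and searching together, a minimal self-contained sketch might look like the following. The helper names (append, position) are our own assumptions; the chapter's functions use a global start instead of passing the head around.

```c
#include <stdlib.h>

struct node
{
    int data;
    struct node *next;
};

/* Append a value at the end, returning the (possibly new) head. */
struct node *append(struct node *head, int data)
{
    struct node *n = malloc(sizeof *n), *p;
    n->data = data;
    n->next = NULL;
    if (head == NULL)
        return n;                       /* list was empty: n is the head */
    for (p = head; p->next != NULL; p = p->next)
        ;                               /* walk to the last node */
    p->next = n;
    return head;
}

/* Linear search: 1-based position of item, or -1 if absent. */
int position(struct node *head, int item)
{
    int count = 1;
    for (; head != NULL; head = head->next, count++)
        if (head->data == item)
            return count;
    return -1;
}
```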
Self-assessment Questions
1) While inserting a node at the first position in the linked list the link part of the node to
be inserted should point to ____________.
a) Any random node in the list
b) NULL pointer
c) Start pointer
d) Midpoint of all the nodes
2) Trying to insert a node where the linked list has reached its maximum capacity is called
_________
a) Underflow
b) Overflow
c) Mid flow
d) Overcover
3) The concept of linking wrong nodes is called as _____________
a) Memory leaks
b) Memory holes
c) Memory pits
d) Memory wastage
4) Trying to delete a node from an empty linked list is called _______________
a) Underflow
b) Overflow
c) Mid flow
d) Overcover
Summary
o A linked list is a collection of nodes, each consisting of two fields, data and link. The
data field contains the information to be stored by the node; the link field contains
the address of the next node.
o The basic operations on linked lists are creation of a linked list, insertion of a node
into the list, deletion of a node from the list, searching for an element in the list and
displaying all list elements.
o The insertion operation adds a node to a linked list, either at the start of the list, at
any given position within it, or at the end.
o Similarly, the deletion operation removes a node from a linked list, either at the start
of the list, at any given position within it, or at the end.
o Searching in a singly linked list is based on a simple linear search technique which
traverses from the start towards the end of the list until the desired element is found.
o The display operation traverses the whole list and prints the data component of each
node in linear order.
Terminal Questions
1. Write a C program for inserting a node in the linked list.
2. Write and explain the algorithm for deleting a node from the linked list.
3. Write a C program for searching for a node in the linked list.
4. Explain the algorithm for displaying all the nodes in the linked list.
Answer Keys
Self-assessment Questions
Question No.    Answer
1               c
2               b
3               a
4               a
Activity
Activity Type: Offline
Duration: 20 Minutes
Description:
Write a function to sort an existing linked list of integers using insertion sort.
Bibliography
e-Reference
• cs.cmu.edu, (2016). Lecture 10: Linked List Operations, Concept of a Linked List
Revisited. Retrieved on 19 April 2016, from http://www.cs.cmu.edu/~ab/15123S09/lectures/Lecture%2010%20%20%20Linked%20List%20Operations.pdf
External Resources
• Kruse, R. (2006). Data Structures and Program Designing Using 'C' (2nd ed.). Pearson Education.
• Srivastava, S. K., & Srivastava, D. (2004). Data Structures Through C in Depth (2nd ed.). BPB Publications.
• Weiss, M. A. (2001). Data Structures and Algorithm Analysis in C (2nd ed.). Pearson Education.
Video Links
Topic                                      Link
Operations on singly linked list           https://www.youtube.com/watch?v=McgL6JuWUpM
Singly Linked List-Deletion of Last Node   https://www.youtube.com/watch?v=Hn8Hs9sVSCM
Linked Lists in 10 minutes                 https://www.youtube.com/watch?v=LOHBGyK3Hbs
MODULE - V
Trees, Graphs and
Their Applications
MODULE 5
Trees, Graphs and Their Applications
Module Description
Trees and graphs are typical examples of non-linear data structures, as discussed in Module 1.
A non-linear data structure, unlike a linear one, is a structure in which an element is permitted
to have any number of adjacent elements.
Trees are non-linear data structures which are very useful for representing hierarchical
relationships among data elements. For example, to represent the relationships between
members of a family, we can use a non-linear data structure like a tree. Organising data in
hierarchical forms or structures is essential for many applications that involve searching for
data elements. Trees are among the most important and useful data structures in computer
science, in areas such as parsing, compiler design, expression evaluation and storage
management.
Similar to trees, a graph is a powerful tool for representing a physical problem in
mathematical form. One famous problem where graphs are used is finding the optimum
shortest path for a travelling salesman moving from one city to another, so as to minimise the
cost. A graph may also have unconnected nodes, and there may be more than one path
between two nodes. Graphs and directed graphs are powerful tools in computer science for
many real-world applications, for example in building compilers and in modelling physical
communication networks. A graph is an abstract notion of a set of nodes (vertices or points)
and connection relations (edges or arcs) between them.
Chapter 5.1
Tree Fundamentals
Chapter 5.2
Graph Fundamentals
Chapter Table of Contents
Chapter 5.1
Tree Fundamentals
Aim ..................................................................................................................................................... 251
Instructional Objectives................................................................................................................... 251
Learning Outcomes .......................................................................................................................... 251
Introduction ...................................................................................................................................... 252
5.1.1 Definition of Tree.................................................................................................................... 252
Self-assessment Questions ..................................................................................................... 255
5.1.2 Tree Terminologies ................................................................................................................. 255
(i) Root ..................................................................................................................................... 255
(ii) Node................................................................................................................................... 256
(iii) Degree of a Node or a Tree ............................................................................................ 256
(iv) Terminal Node................................................................................................................. 257
(v) Non-terminal Nodes ........................................................................................................ 257
(vi) Siblings .............................................................................................................................. 258
(vii) Level ................................................................................................................................. 258
(viii) Edge ................................................................................................................................ 259
(ix) Path ................................................................................................................................... 260
(x) Depth.................................................................................................................................. 260
(xi) Parent Node ..................................................................................................................... 261
(xii) Ancestor of a Node ........................................................................................................ 261
Self-assessment Questions ..................................................................................................... 262
5.1.3 Types of Trees .......................................................................................................................... 262
(i) Binary Trees ....................................................................................................................... 262
(ii) Binary Search Trees ......................................................................................................... 264
(iii) Complete Binary Tree .................................................................................................... 269
Self-assessment Questions ..................................................................................................... 270
5.1.4 Heap .......................................................................................................................................... 271
(i) Heap Order Property ........................................................................................................ 272
(ii) Heap Sort........................................................................................................................... 277
Self-assessment Questions ..................................................................................................... 280
5.1.5 Binary Tree............................................................................................................................... 280
(i) Array Representation ........................................................................................................ 280
(ii) Creation of a Binary Tree ................................................................................................ 283
Self-assessment Questions ..................................................................................................... 287
5.1.6 Traversal of Binary Tree ......................................................................................................... 288
(i) Preorder Traversal............................................................................................................. 288
(ii) Inorder Traversal.............................................................................................................. 289
(iii) Postorder Traversal......................................................................................................... 290
Self-assessment Questions ..................................................................................................... 290
Summary ........................................................................................................................................... 291
Terminal Questions.......................................................................................................................... 291
Answer Keys...................................................................................................................................... 292
Activity............................................................................................................................................... 292
Bibliography ...................................................................................................................................... 293
e-References ...................................................................................................................................... 293
External Resources ........................................................................................................................... 293
Video Links ....................................................................................................................................... 293
Aim
To equip students with tree techniques so that they can be used in programs to
search and sort the elements in a list
Instructional Objectives
After completing this chapter, you should be able to:
•	Explain tree and its different types
•	Outline various tree terminologies with example
•	Illustrate max heap and min heap technique
•	Demonstrate binary search tree with an example
•	Illustrate preorder, postorder and inorder
Learning Outcomes
At the end of this chapter, you are expected to:
•	Discuss tree and various types with example
•	Write a code to demonstrate max heap and min heap
•	Explain array representation of binary tree
•	Outline the steps to traverse inorder, preorder and postorder
•	Construct binary tree from inorder, preorder and postorder
Introduction
So far in the previous chapters, we have come across linear data structures such as stacks,
queues and linked lists. This chapter focuses on non-linear data structures such as trees. In this
chapter, we will cover the basic structure of trees, their types and the associated terminology.
We will then focus on heaps and their types, followed by the concept of binary search trees and
the various ways of traversing them.
5.1.1 Definition of Tree
Tree can be defined as a non-linear data structure consisting of a root node and other nodes
present at different levels forming a hierarchy. A tree essentially has one node called its root
and one or more nodes adjacent below, connected to it. A tree with no nodes is called an empty
tree.
Therefore, a tree is a finite set of one or more nodes such that:
•	There is a specially designated node called the root.
•	The remaining nodes are partitioned into n >= 0 disjoint sets T1, T2, ..., Tn, where
each of these sets is a tree. T1, T2, ..., Tn are called sub-trees of the root.
A node in the definition of a tree depicts an item of information, and the links between the
nodes are called its branches, which represent an association between these items of
information.
The figure 5.1.1 below shows pictorial representation of a Tree.
Figure 5.1.1: Pictorial Representation of a Tree
In the above figure, node 1 is the root of the tree; nodes 2, 3, 4 and 9 are called intermediate
nodes, and nodes 5, 6, 7, 8, 10, 11 and 12 are its leaf nodes. It is important to note that a tree
emphasizes two aspects: (i) connectedness and (ii) the absence of loops or cycles. Starting from
the root, the tree structure allows connectivity from the root to each node in the tree.
Generally, any node can be reached from any part of the tree. Moreover, with all the branches
providing links between the nodes, the tree structure guarantees that no set of nodes forms a
closed loop or cycle.
Tree data structures are widely used in the field of computer science, for example:
•	Folder/directory structures in operating systems such as Windows and Linux
•	Network routing
•	Syntax trees used in compilers
Example:
The figure 5.1.2 below shows directory structure in Windows OS.
Figure 5.1.2: Directory Structure in Windows OS.
The figure 5.1.3 below shows directory structure in Linux OS.
Figure 5.1.3: Directory Structure in Linux OS.
Advantages of trees
•	Trees reflect structural relationships in the data
•	Trees are used to represent hierarchies
•	Trees provide efficient insertion and searching
•	Trees are very flexible, allowing sub-trees to be moved around with minimum effort
Did you know?
Portable Document Format (PDF) is a tree based format. It has a root node followed by a catalog
node (these are often the same) followed by a pages node which has several child page nodes.
Producers/consumers often use a balanced tree implementation to store a document in
memory.
Self-assessment Questions
1) A Tree is _____________ type of data structure
a) Linear
b) Advanced
c) Non-Linear
d) Data Driven
2) The tree with no nodes is called an empty tree
a) True
b) False
3) Nodes in a tree can be infinite.
a) True
b) False
5.1.2 Tree Terminologies
Terminology gives names to the parts of a data structure. This section therefore covers the
various terms that are used to define and identify the components of a tree. This will make it
easier for us to analyze and solve various complex problems.
Some of the most important terms are root, node, degree of a node or a tree, terminal
nodes, siblings, levels, edge, path, depth, parent node and ancestor of a node.
(i) Root
Definition: In trees, the origin node or the first node is called a root.
This is the node from which all other nodes evolve; it is sometimes referred to as the
seed node. Any given tree has exactly one root node, and the entire structure of the tree
is built on this node.
The figure 5.1.4 shows the root node in a tree data structure.
Figure 5.1.4: Root Node Representation
(ii) Node
Definition: Node is a data point which forms a unit in a tree data structure.
In real-time applications, a node can be a computer connected to a network, which is in turn
connected to a server. Different nodes connect together in a specific structure to form
a tree data structure.
(iii) Degree of a Node or a Tree
Definition: Degree of a node is the total number of child node connected to a particular node.
The highest degree of a node among all the nodes is called ‘Degree of a Tree’.
The degree of a node varies, as a node may at times be connected to more than two nodes.
In the case of a binary tree, however, the degree of a node is at most 2.
Figure 5.1.5: Degree of a Node
(iv) Terminal Node
Definition: A node having no child node is called a terminal node or a leaf node.
These nodes are also called external nodes. They have no further
connectivity and form the leaves of the tree. Figure 5.1.6 below shows leaf nodes.
Figure 5.1.6: Leaf Nodes
(v) Non-terminal Nodes
Definition: An intermediate node which has at least one child node is called a non-terminal node or an internal node.
All intermediate nodes are non-terminal nodes. They have degree
greater than zero. Figure 5.1.7 below shows internal or non-terminal nodes.
Figure 5.1.7: Internal or Non-Terminal Nodes.
(vi) Siblings
Definition: Nodes which belong to the same parent are called siblings. In simple words, the
children of a common parent are sibling nodes.
The following Figure 5.1.8 shows sibling nodes in a tree.
Figure 5.1.8: Sibling Nodes
Here B & C are siblings; D, E & F are siblings; G & H are siblings; and I & J are siblings.
(vii) Level
In a tree data structure, the root node is said to be at Level 0, the children of the root node are
at Level 1, the children of the Level 1 nodes are at Level 2, and so on. In simple words, each
step from top to bottom in a tree is called a Level; the level count starts at '0' and is
incremented by one at each step.
Figure 5.1.9 below shows levels in a tree.
Figure 5.1.9: Levels in the Tree
(viii) Edge
Definition: Connecting link between two nodes is called an Edge.
An edge represents a link that shows which node is connected to which other node in
the tree. It also helps us determine whether a node is a leaf node. A tree
with N nodes has exactly N-1 edges. Figure 5.1.10 below
shows the representation of an edge.
Figure 5.1.10: Representation of an Edge
(ix) Path
In a tree data structure, the sequence of nodes and edges from one node to another is
called the path between those two nodes. The length of a path is the total number of nodes in that
path. In the example below, the path A - B - E - J has length 4. Figure 5.1.11 shows a path in the tree.
Figure 5.1.11: Path in the Tree.
(x) Depth
In a tree data structure, the total number of edges from the root node to a particular node is
called the depth of that node. The total number of edges on the longest path from the root node
to a leaf node is the depth of the tree; in simple words, the largest depth of any leaf node
in a tree is the depth of that tree. The depth of the root node is '0'. Figure 5.1.12
below shows paths in a tree.
Figure 5.1.12: Paths in a Tree
In any tree, a ‘Path’ is a sequence of nodes and edges between two nodes.
Here, the path between A and J is A – B – E – J, and the path between C and K is C – G – K.
(xi) Parent Node
Definition: A parent is the converse notion of a child: the node directly above a given node is its parent.
In figure 5.1.13 below 2 is parent node of 7 and 5.
Figure 5.1.13: General Tree Architecture
(xii) Ancestor of a Node
Definition: A node reachable by repeatedly proceeding from child to parent is called an
ancestor node.
In figure 5.1.13, 2 is an ancestor node of 7, 5, 12, 6, 9, 15, 11 and 4
Self-assessment Questions
4) In trees, the origin node or the first node is
a) Intermediate node
b) Root node
c) Leaf node
d) Ancestor node
5) A node having no child node is called ___________
a) Intermediate node
b) Root node
c) Leaf node
d) Ancestor node
6) A node reachable by repeatedly proceeding from the child to the parent is called
__________
a) Intermediate node
b) Root node
c) Leaf node
d) Ancestor node
5.1.3 Types of Trees
Based on their structure, properties, permitted operations and applications, trees can be of
various types. Some of the most important types of tree data structures are discussed in this
section. The following are the three most common and important types of tree data structures,
which are widely used in different applications of computer science.
1. Binary trees
2. Binary search trees
3. Complete binary trees
(i) Binary Trees
A binary tree is a type of tree in which each node can have at most two child nodes. The figure
5.1.14 below shows a binary tree consisting of a root node and two child subtrees, Tl and Tr.
Either of these two subtrees can also be empty.
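The node structure described above can be sketched as a C structure with a data field and two child links; the names `BTNode` and `bt_new` are illustrative choices, not from the text:

```c
#include <stdlib.h>

/* A binary tree node: one data field and links to at most two children. */
struct BTNode {
    int data;
    struct BTNode *left;   /* left sub-tree Tl (NULL means empty)  */
    struct BTNode *right;  /* right sub-tree Tr (NULL means empty) */
};

/* Allocate a node holding the given data, with two empty sub-trees. */
struct BTNode *bt_new(int data) {
    struct BTNode *n = malloc(sizeof *n);  /* assume allocation succeeds */
    n->data = data;
    n->left = NULL;
    n->right = NULL;
    return n;
}
```

A tree is then just a pointer to its root node; an empty tree is a NULL pointer.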
Figure 5.1.14: Generic Binary Tree
One of the important properties of binary trees is that the depth of an average binary tree is
considerably smaller than n. Analysis shows that the average depth is O(√n), and that for a special
type of binary tree called the binary search tree, the average depth is O(log n). In fact, the depth
can be as large as n - 1, as the example in Fig. 5.1.15 shows.
Here, the depth of the tree is 4 and height of the tree is also 4
Figure 5.1.15 Worst-case Binary Tree
Most of the rules that apply to linked lists also apply to trees. In particular, when
performing an insertion operation, a node must be created by allocating memory with the
malloc function; memory allocated to a node can be freed after its deletion by calling the free
function.
We could draw binary trees using the rectangular boxes that are customary for linked lists,
but trees are generally drawn as circles connected by lines, because they are in fact graphs. We
also do not explicitly draw NULL pointers when referring to trees, because every binary tree
with n nodes would require n + 1 NULL pointers.
Terminologies of Binary tree
•	The depth of a node is the number of edges from the root to the node.
•	The height of a node is the number of edges from the node to the deepest leaf.
•	The height of a tree is the height of the root.
•	A full binary tree is a binary tree in which each node has exactly zero or two children.
•	A complete binary tree is a binary tree which is completely filled, with the possible
exception of the bottom level, which is filled from left to right.
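The height definition above translates directly into a recursive C function; this is a sketch in which an empty tree is given height -1 so that a lone leaf has height 0 (the type and function names are illustrative):

```c
#include <stdlib.h>

struct BTNode { int data; struct BTNode *left, *right; };

/* Height = number of edges on the longest path from this node down
   to a leaf. An empty tree has height -1, so a lone leaf has height 0. */
int bt_height(const struct BTNode *t) {
    if (t == NULL)
        return -1;
    int lh = bt_height(t->left);
    int rh = bt_height(t->right);
    return 1 + (lh > rh ? lh : rh);  /* one edge down to the taller child */
}
```

The height of the whole tree is simply `bt_height(root)`, matching the third bullet above.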
(ii) Binary Search Trees
Binary search trees (BST) are basically binary trees which are aimed at providing an efficient
way of searching, sorting and retrieving data.
A binary search tree can be defined as a binary tree that is either empty or in which every node
has a key (within its data entry) and satisfies the following conditions:
•	The key of the root (if it exists) is greater than the key in any node in the left sub-tree of
the root.
•	The key of the root (if it exists) is less than the key in any node in the right sub-tree of
the root.
•	The left and right sub-trees of the root are again binary search trees.
In the above definition, properties 1 and 2 describe the ordering relative to the key of the root
node, and property 3 extends the definition to all nodes of the tree; the binary search tree
therefore has a naturally recursive structure. After examining the root of the tree, we move to
either the left or the right sub-tree, which is in turn a binary search tree, so the same method
can be applied again on this smaller tree.
Under this definition, no two entries in a binary search tree can have equal keys, since the keys
of the left sub-tree are always smaller than that of the root and those of the right sub-tree are
always greater. It is possible to change the definition to allow entries with equal keys; however,
doing so makes the algorithms somewhat more complicated. We therefore always assume: no
two entries in a binary search tree may have equal keys.
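The recursive descent described above is easiest to see in a search routine: one key comparison per level decides whether to stop or to recurse left or right. This is a sketch in C assuming the usual node layout (type and function names are illustrative):

```c
#include <stdlib.h>

struct BSTNode { int key; struct BSTNode *left, *right; };

/* Return the node holding key, or NULL if the key is absent. */
struct BSTNode *bst_search(struct BSTNode *t, int key) {
    if (t == NULL)
        return NULL;                      /* empty tree: not found         */
    if (key < t->key)
        return bst_search(t->left, key);  /* smaller keys live on the left  */
    if (key > t->key)
        return bst_search(t->right, key); /* larger keys live on the right  */
    return t;                             /* keys are equal: found          */
}
```

Since each comparison discards one sub-tree, the cost is proportional to the depth of the tree: O(log n) on average, O(n) in the worst (skewed) case.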
Insertion in Binary Search Tree:
•	Check whether the root node is present (i.e., whether the tree exists). If the root is
NULL, create the root node.
•	If the element to be inserted is less than the element in the current node, traverse
the left sub-tree recursively until T->left is NULL, then place the new node at
T->left.
•	If the element to be inserted is greater than the element in the current node, traverse
the right sub-tree recursively until T->right is NULL, then place the new node at
T->right.
Algorithm for insertion in Binary Search Tree:
TreeNode *insert(int data, TreeNode *T) {
    if (T == NULL) {
        /* Allocate memory for the new node and load the data into it */
        T = (TreeNode *)malloc(sizeof(struct TreeNode));
        T->data  = data;
        T->left  = NULL;
        T->right = NULL;
    } else if (data < T->data) {
        /* The node belongs in the left sub-tree, so recursively
           traverse it to find the place for the new node */
        T->left = insert(data, T->left);
    } else if (data > T->data) {
        /* The node belongs in the right sub-tree, so recursively
           traverse it to find the place for the new node */
        T->right = insert(data, T->right);
    }
    return T;
}
Example:
Insert 30 into the Binary Search Tree.
The tree does not exist yet, so create the root node and place 30 in it.
30

Insert 35 into the given Binary Search Tree. 35 > 30 (data in root), so 35 needs to be
inserted in the right sub-tree of 30.
  30
    \
    35

Insert 20 into the given Binary Search Tree. 20 < 30 (data in root), so 20 needs to be
inserted in the left sub-tree of 30.
  30
 /  \
20    35

Insert 15 into the given Binary Search Tree. 15 < 30 and 15 < 20, so 15 goes to the
left of 20.
      30
     /  \
   20    35
  /
15
Inserting 25.
      30
     /  \
   20    35
  /  \
15    25

Inserting 27.
      30
     /  \
   20    35
  /  \
15    25
        \
        27
Inserting 32.
      30
     /  \
   20    35
  /  \  /
15   25 32
       \
       27

Inserting 40.
      30
     /  \
   20    35
  /  \  /  \
15   25 32  40
       \
       27

Inserting 38.
      30
     /  \
   20    35
  /  \  /  \
15   25 32  40
       \    /
       27  38
Deletion in Binary Search Tree:
How do we delete a node from a binary search tree? There are three different cases that
need to be considered:
Case 1: Node with no children (a leaf node)
Case 2: Node with one child
Case 3: Node with two children
      30
     /  \
   20    35
  /  \  /  \
15   25 32  40
       \    /
       27  38
Case 1: Delete a leaf node (a node with no children).
      30
     /  \
   20    35
  /  \  /  \
15   25 32  40
       \    /
       27  38

Delete 38 from the above binary search tree. 38 is a leaf, so it is simply removed.
      30
     /  \
   20    35
  /  \  /  \
15   25 32  40
       \
       27
Case 2: Delete a node with one child.
      30
     /  \
   20    35
  /  \  /  \
15   25 32  40
       \
       27

Delete 25 from the above binary search tree. 25 has one child (27), so 27 takes its place.
      30
     /  \
   20    35
  /  \  /  \
15   27 32  40
Case 3: Delete a node with two children.
To delete a node with two children, replace it with the smallest node in its right sub-tree
(its in-order successor), then delete that successor from the right sub-tree.
      30
     /  \
   20    35
  /  \  /  \
15   25 32  40
       \
       27

Delete 20 from the above binary tree. The smallest node in the right sub-tree of 20 is
25, so replace 20 with 25.
      30
     /  \
   25    35
  /  \  /  \
15   27 32  40

Delete 30 from the below binary search tree.
      30
     /  \
   20    35
  /  \  /  \
15   25 32  40
          \
          34

Find the smallest node in the right sub-tree of 30; that smallest node is 32. So,
replace 30 with 32. Since 32 has only one child (34), the pointer currently pointing to 32
is made to point to 34. The resultant binary tree is shown below.
      32
     /  \
   20    35
  /  \  /  \
15   25 34  40
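The three cases above can be handled by a single recursive routine. This is a sketch in C; the helper `bst_min` finds the in-order successor used in case 3, and the type and function names are illustrative choices, not from the text:

```c
#include <stdlib.h>

struct BSTNode { int key; struct BSTNode *left, *right; };

/* Smallest key in a non-empty tree: follow left links to the end. */
static struct BSTNode *bst_min(struct BSTNode *t) {
    while (t->left != NULL)
        t = t->left;
    return t;
}

/* Delete key from the tree rooted at t; return the new root. */
struct BSTNode *bst_delete(struct BSTNode *t, int key) {
    if (t == NULL)
        return NULL;                              /* key not present */
    if (key < t->key) {
        t->left = bst_delete(t->left, key);
    } else if (key > t->key) {
        t->right = bst_delete(t->right, key);
    } else if (t->left != NULL && t->right != NULL) {
        /* Case 3: two children - copy in the smallest key of the right
           sub-tree, then delete that node from the right sub-tree. */
        t->key = bst_min(t->right)->key;
        t->right = bst_delete(t->right, t->key);
    } else {
        /* Cases 1 and 2: zero or one child - splice the node out. */
        struct BSTNode *child = (t->left != NULL) ? t->left : t->right;
        free(t);
        return child;
    }
    return t;
}
```

Note that case 3 reduces to case 1 or 2, since the in-order successor has no left child by construction.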
(iii) Complete Binary Tree
A complete binary tree is a binary tree in which every level, except possibly the last, is
completely filled, and the nodes in the last level are as far left as possible. If the nodes of a
complete binary tree are labeled in level order, starting with 1, then each node is exactly as
many levels above the leaves as the highest power of 2 that divides its label.
The following figure 5.1.16 depicts how a complete binary tree looks.
Figure 5.1.16: An Example of Complete Binary Tree
In the above figure 5.1.16, the nodes are labeled in the order in which a level-order traversal
visits them: level by level, from left to right.
Self-assessment Questions
7) At most how many child nodes can a node in a binary tree have?
a) One
b) Two
c) Three
d) Four
8) The maximum depth of a binary tree can be _________
a) n+1
b) n
c) n-1
d) 0
9) Binary search trees are aimed at providing efficient searching, sorting and retrieval.
a) True
b) False
5.1.4 Heap
In computer science, a heap is a specialized tree-based data structure that satisfies the heap
property: if A is a parent node of B, then the key of node A is ordered with respect to the key
of node B, with the same ordering applying across the heap.
A heap has two properties: the structure property and the heap order property. A heap is
basically a binary tree which is completely filled (with a possible exception at the bottom level)
from left to right. Such a tree is also referred to as a complete binary tree. Figure 5.1.17 shows
an example of a complete binary tree.
Figure 5.1.17: Complete Binary Tree
In a complete binary tree stored in an array, for any element at index i, its left child is at
position 2i and its right child is in the next cell (2i+1). Therefore, no pointers are required,
and tree traversal is very simple and fast on most computers. The only limitation of this type
of implementation is that the heap size must be estimated in advance, which is usually not a
major issue.
(i) Heap Order Property
This property allows operations to be performed quickly on binary trees. Suppose the
operation is to find the minimum element quickly; then it makes logical sense to keep the
smallest element at the root. If we require that every sub-tree should also be a heap, then any
node must be smaller than all of its descendants.
Applying this logic to a binary tree gives the heap order property: for every node X in the
heap, the key in the parent of X is smaller than (or equal to) the key in X, except for the root,
which has no parent. In Figure 5.1.18 the tree on the left is a heap, but the tree on the right is
not (the dashed line shows the violation of heap order). As usual, we will assume that the keys
are integers, although they could be arbitrarily complex.
Figure 5.1.18: Two Complete Trees (only the Left Tree is a Heap)
On similar lines, a max heap can be defined, which can efficiently find and remove the
maximum element, by reversing the heap order property. Thus, a priority queue can be used
to find either a minimum or a maximum, but this needs to be decided ahead of time.
By the heap order property, the minimum element can always be found at the root. Thus, we
get the extra operation, find_min, in constant time.
A min-heap is a binary tree such that:
•	the data contained in each node is less than (or equal to) the data in that node’s children;
•	the binary tree is complete.
The following figure 5.1.19 depicts a min heap property.
Figure 5.1.19: Min Heap
A max-heap is a binary tree such that:
•	the data contained in each node is greater than (or equal to) the data in that node’s children;
•	the binary tree is complete.
The following figure 5.1.20 depicts a max heap property.
Figure 5.1.20: Max Heap
A binary heap is a heap data structure created using a binary tree.
A binary heap obeys two rules:
1. It has to be a complete binary tree: all levels are full except possibly the last. This is
called the shape property.
2. Every node is either greater than or equal to (max-heap) or less than or equal to (min-heap)
each of its child nodes. This is called the heap property.
Implementation:
•	Use an array to store the data.
•	Start storing from index 1, not 0.
•	For any given node at position i:
	- its left child is at [2*i], if present;
	- its right child is at [2*i+1], if present;
	- its parent node is at [i/2], if present.
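The index arithmetic above can be captured in three tiny helpers; this is a sketch using 1-based indices as in the bullets, and the function names are illustrative:

```c
/* 1-based heap index arithmetic: the children of node i sit at 2i and
   2i+1, and the parent of node i sits at i/2 (integer division). */
int heap_left(int i)   { return 2 * i; }
int heap_right(int i)  { return 2 * i + 1; }
int heap_parent(int i) { return i / 2; }
```

Because integer division truncates, both children 2i and 2i+1 map back to the same parent i, which is what makes the array navigation work without pointers.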
Min heap
Here the value of each node is less than or equal to the values of both its children.
The following figure 5.1.21 shows a min heap example.
Figure 5.1.21: Min Heap Example
274
Max heap
Here the value of each node is greater than or equal to the values of both its children.
The following figure 5.1.22 shows a max heap example.
Figure 5.1.22: Max Heap Example
Below is the algorithm used for maintaining the heap property (max-heap version).

Max-Heapify(A, i)
// Input: A: an array where the left and right children of i root heaps (but
// i may not), i: an array index
// Output: A modified so that i roots a heap
// Running Time: O(log n) where n = heap-size[A] − i
l ← Left(i)
r ← Right(i)
if l ≤ heap-size[A] and A[l] > A[i]
    largest ← l
else largest ← i
if r ≤ heap-size[A] and A[r] > A[largest]
    largest ← r
if largest != i
    exchange A[i] and A[largest]
    Max-Heapify(A, largest)
Heap build algorithm

BUILD-HEAP(A)
// Input: A: an (unsorted) array
// Output: A modified to represent a heap.
// Running Time: O(n) where n = length[A]
heap-size[A] ← length[A]
for i ← ⌊length[A]/2⌋ downto 1
    Max-Heapify(A, i)
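The Max-Heapify and BUILD-HEAP routines above can be sketched in C with 1-based indexing (index 0 unused), where `n` plays the role of heap-size[A]; the names are illustrative:

```c
/* Sift A[i] down until the sub-tree rooted at i satisfies the
   max-heap property. 1-based indexing: A[1..n] holds the heap. */
void max_heapify(int A[], int n, int i) {
    int l = 2 * i, r = 2 * i + 1, largest = i;
    if (l <= n && A[l] > A[largest]) largest = l;
    if (r <= n && A[r] > A[largest]) largest = r;
    if (largest != i) {
        int tmp = A[i]; A[i] = A[largest]; A[largest] = tmp;
        max_heapify(A, n, largest);       /* continue sifting downwards */
    }
}

/* Build a max heap bottom-up, as in BUILD-HEAP: every index above
   n/2 is a leaf and is already a trivial heap, so start at n/2. */
void build_max_heap(int A[], int n) {
    for (int i = n / 2; i >= 1; i--)
        max_heapify(A, n, i);
}
```

Starting from ⌊n/2⌋ works because the elements past that index are leaves, which satisfy the heap property vacuously.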
The following figure 5.1.23 shows Example for building a heap
Figure 5.1.23: Example for Building a Heap
(ii) Heap Sort
The heap sort algorithm starts by using BUILD-HEAP to build a heap on the input array
A[1 . . n], where n = length[A]. Since the maximum element of the array is stored at the root
A[1], it can be put into its correct final position by exchanging it with A[n]. If we now "discard"
node n from the heap (by decrementing heap-size[A]), we observe that A[1 . . (n - 1)] can easily
be made into a heap. The children of the root remain heaps, but the new root element may
violate the heap property. All that is needed to restore the heap property, however, is one
call to HEAPIFY(A, 1), which leaves a heap in A[1 . . (n - 1)]. The heap sort algorithm then
repeats this process for the heap of size n - 1 down to a heap of size 2.
Program for heap sort:
/*
 * C program to sort an array based on the heap sort algorithm (MAX heap)
 */
#include <stdio.h>

int main()
{
    int heap[10], n, i, j, cnt, root, temp;

    printf("\nEnter the number of elements: ");
    scanf("%d", &n);
    printf("\nEnter the elements: ");
    for (i = 0; i < n; i++)
        scanf("%d", &heap[i]);

    /* Build a MAX heap: sift each new element up towards the root */
    for (i = 1; i < n; i++)
    {
        cnt = i;
        do
        {
            root = (cnt - 1) / 2;
            if (heap[root] < heap[cnt])    /* child larger than parent: swap */
            {
                temp = heap[root];
                heap[root] = heap[cnt];
                heap[cnt] = temp;
            }
            cnt = root;
        } while (cnt != 0);
    }

    printf("Heap array elements are: ");
    for (i = 0; i < n; i++)
        printf("%d ", heap[i]);

    /* Repeatedly move the maximum to the end and re-heapify the rest */
    for (j = n - 1; j >= 0; j--)
    {
        temp = heap[0];          /* swap max element with the rightmost */
        heap[0] = heap[j];       /* element of the unsorted region      */
        heap[j] = temp;
        root = 0;
        do
        {
            cnt = 2 * root + 1;  /* left child of root element */
            if (cnt < j - 1 && heap[cnt] < heap[cnt + 1])
                cnt++;           /* pick the larger of the two children */
            if (cnt < j && heap[root] < heap[cnt])  /* rearrange to max heap */
            {
                temp = heap[root];
                heap[root] = heap[cnt];
                heap[cnt] = temp;
            }
            root = cnt;
        } while (cnt < j);
    }

    printf("\nThe sorted array elements after heap sort are: ");
    for (i = 0; i < n; i++)
        printf("%d ", heap[i]);
    return 0;
}
The HEAPSORT procedure takes time O(n lg n), since the call to BUILD-HEAP takes
time O(n) and each of the n - 1 calls to HEAPIFY takes time O(lg n). The Figure 5.1.24 shows
heap sort.
Figure 5.1.24: HeapSort
Did you know?
The term "heap" is used in various places; for example, a heap is used for dynamic memory
allocation wherever it is needed. There are two types of heap: the "ascending heap" and the
"descending heap". In an ascending heap the root is the smallest element, and in a descending
heap the root is the largest element of the complete or almost complete binary tree.
Self-assessment Questions
10) If A is a parent node of B then the key of node A is ordered with respect to the key of
node B with the same ordering applying across the heap.
a) True
b) False
11) As per the min heap property it makes logical sense to have the smallest element at __________
a) Leaf or terminal nodes
b) Intermediate nodes
c) Root nodes
d) Non-terminal nodes
12) We can easily implement a priority queue using the heap order property.
a) True
b) False
5.1.5 Binary Tree
As discussed in section 5.1.3, a binary tree is a type of tree in which each node can have at most
two child nodes; that is, the maximum degree of a node in a binary tree is 2. Let us discuss how
a binary tree can be represented in the form of an array and also how to create binary trees.
(i) Array Representation
A binary tree can be represented using either a single-dimensional array or a two-dimensional
array called an Adjacency Matrix.
Adjacency Matrix representation:
A two-dimensional array can be used to store the adjacency relations very easily and can be
used to represent a binary tree. In this representation, an n×n matrix is used to represent a
binary tree with n vertices.
Figure 5.1.25(a) shows a binary tree and Figure 5.1.25(b) shows its adjacency matrix
representation.
Figure 5.1.25: Representation of a Binary Tree in the Form of Adjacency Matrix
From the above representation, we can understand that the storage space utilization is not
efficient. Let us see the space utilization of this method of binary tree representation. Let
‘n’ be the number of vertices. The space allocated is an n x n matrix, i.e., n² locations are
allocated, but there are only n-1 entries in the matrix (one per edge). Therefore, the
percentage of space utilization is:
	((n - 1) / n²) × 100
The percentage of space utilized decreases as n increases; for large ‘n’, it becomes
negligible. Therefore, this way of representing a binary tree is not efficient in terms of
memory utilization.
Single dimension array representation
Since the two dimensional array is a sparse matrix, we can consider the prospect of mapping it
onto a single dimensional array for better space utilization.
In this representation, we have to note the following points:
•	The left child of the ith node is placed at the 2ith position.
•	The right child of the ith node is placed at the (2i+1)th position.
•	The parent of the ith node is at the (i/2)th position in the array.
If l is the depth of the binary tree, then the number of possible nodes in the binary tree is
2^(l+1) - 1. Hence it is necessary to have 2^(l+1) - 1 locations allocated to represent the
binary tree.
If ‘n’ is the number of nodes, then the percentage of utilization is:
	(n / (2^(l+1) - 1)) × 100
Figure 5.1.26 shows a binary tree and Figure 5.1.27 shows its one-dimensional array
representation.
Figure 5.1.26: A Binary Tree
Figure 5.1.27: One- dimensional Array Representation
For a complete and full binary tree, there is 100% utilization, and there is maximum wastage
if the binary tree is right-skewed or left-skewed, where only l+1 locations are used out of the
2^(l+1) - 1 allocated.
An important observation to be made here is that the organization of the data in the binary tree
decides the space utilization of the representation used.
(ii) Creation of a Binary Tree
Following is an algorithm for creation of a binary tree.
Step 1: Pick an element from Preorder. Increment a Preorder Index Variable (preIndex in
below code) to pick next element in next recursive call.
Step 2: Create a new tree node tNode with the data as picked element.
Step 3: Find the picked element’s index in Inorder. Let the index be inIndex.
Step 4: Call buildTree for elements before inIndex and make the built tree as left subtree of
tNode.
Step 5: Call buildTree for elements after inIndex and make the built tree as right subtree of
tNode.
Step 6: Return tNode.
Below is the C function implementing the above algorithm.
struct node* buildTree(char in[], char pre[], int inStrt, int inEnd)
{
    static int preIndex = 0;
    if (inStrt > inEnd)
        return NULL;
    /* Pick the current node from Preorder and advance preIndex */
    struct node* tNode = newNode(pre[preIndex++]);
    /* Find the picked element's index in Inorder */
    int inIndex = search(in, inStrt, inEnd, tNode->data);
    /* Using index in Inorder traversal, construct left and
       right subtrees */
    tNode->left = buildTree(in, pre, inStrt, inIndex - 1);
    tNode->right = buildTree(in, pre, inIndex + 1, inEnd);
    return tNode;
}
/* UTILITY FUNCTIONS */
/* Function to find the index of value in arr[start...end].
   The function assumes that value is present in arr[] */
int search(char arr[], int strt, int end, char value)
{
    int i;
    for (i = strt; i <= end; i++)
    {
        if (arr[i] == value)
            return i;
    }
    return -1; /* not reached when value is guaranteed present */
}
/* Helper function that allocates a new node with the
given data and NULL left and right pointers. */
struct node* newNode(char data)
{
    struct node* node = (struct node*)malloc(sizeof(struct node));
    node->data = data;
    node->left = NULL;
    node->right = NULL;
    return (node);
}
Program:
#include <stdio.h>
#include <stdlib.h>
typedef struct tnode
{
    int data;
    struct tnode *right, *left;
} TNODE;
TNODE *CreateBST(TNODE *, int);
void Inorder(TNODE *);
void Preorder(TNODE *);
void Postorder(TNODE *);
int main()
{
    TNODE *root = NULL;
    /* Main Program */
    int opn, elem, n, i;
    do
    {
        printf("\n ### Binary Search Tree Operations ### \n\n");
        printf("\n Press 1-Creation of BST");
        printf("\n       2-Traverse in Inorder");
        printf("\n       3-Traverse in Preorder");
        printf("\n       4-Traverse in Postorder");
        printf("\n       5-Exit\n");
        printf("\n Your option ? ");
        scanf("%d", &opn);
        switch (opn)
        {
        case 1: root = NULL;
                printf("\n\nBST for How Many Nodes ?");
                scanf("%d", &n);
                for (i = 1; i <= n; i++)
                {
                    printf("\nRead the Data for Node %d ?", i);
                    scanf("%d", &elem);
                    root = CreateBST(root, elem);
                }
                printf("\nBST with %d nodes is ready to Use!!\n", n);
                break;
        case 2: printf("\n BST Traversal in INORDER \n");
                Inorder(root); break;
        case 3: printf("\n BST Traversal in PREORDER \n");
                Preorder(root); break;
        case 4: printf("\n BST Traversal in POSTORDER \n");
                Postorder(root); break;
        case 5: printf("\n\n Terminating \n\n"); break;
        default: printf("\n\nInvalid Option !!! Try Again !! \n\n");
                 break;
        }
        printf("\n\n\n\n Press Enter to Continue . . . ");
        getchar(); getchar(); /* consume the pending newline, then wait */
    } while (opn != 5);
    return 0;
}
TNODE *CreateBST(TNODE *root, int elem)
{
    if (root == NULL)
    {
        root = (TNODE *)malloc(sizeof(TNODE));
        root->left = root->right = NULL;
        root->data = elem;
        return root;
    }
    else
    {
        if (elem < root->data)
            root->left = CreateBST(root->left, elem);
        else if (elem > root->data)
            root->right = CreateBST(root->right, elem);
        else
            printf(" Duplicate Element !! Not Allowed !!!");
        return (root);
    }
}
void Inorder(TNODE *root)
{
    if (root != NULL)
    {
        Inorder(root->left);
        printf(" %d ", root->data);
        Inorder(root->right);
    }
}
void Preorder(TNODE *root)
{
if( root != NULL)
{
printf(" %d ",root->data);
Preorder(root->left);
Preorder(root->right);
}
}
void Postorder(TNODE *root)
{
if( root != NULL)
{
Postorder(root->left);
Postorder(root->right);
printf(" %d ",root->data);
}
}
286
Self-assessment Questions
12) A tree can only be represented as a two-dimensional array.
a) True
b) False
13) A two-dimensional array representation of a binary tree is a sparse matrix.
a) True
b) False
287
5.1.6 Traversal of Binary Tree
A traversal of a binary tree visits its nodes in a particular, repeatable order, rendering a linear order of the nodes or of the information represented by them. There are three simple ways to traverse a tree. They are called preorder, inorder, and postorder. In each technique, the
left sub-tree is traversed recursively, the right sub-tree is traversed recursively, and the root is
visited. What distinguishes the techniques from one another is the order of those three tasks.
The following sections discuss these three different ways of traversing a binary tree.
(i) Preorder Traversal
In this traversal, the nodes are visited in the order of root, left child and then right child.
• Process the root node first.
• Traverse the left sub-tree.
• Traverse the right sub-tree.
Repeat the same for each of the left and right sub-trees encountered. Here, the leaf nodes
represent the stopping criteria. The pre-order traversal sequence for the binary tree shown in
Figure 5.1.28 is: A B D E H I C F G
Figure 5.1.28: A Binary Tree
Consider the following example
The following figure 5.1.29 shows pre-order traversal example
Figure 5.1.29: Pre-order Traversal Example
The pre-order traversal for the above tree is 1->2->4->5->3.
(ii) Inorder Traversal
In this traversal, the nodes are visited in the order of left child, root and then right child. i.e.,
the left sub-tree is traversed first, then the root is visited and then the right sub-tree is traversed.
The function must perform only three tasks.
• Traverse the left subtree.
• Process the root node.
• Traverse the right subtree.
Remember that visiting a node means doing something to it: displaying it, writing it to a file
and so on. The inorder traversal sequence for the binary tree shown in Figure 5.1.28 is: D B H
E I A F C G.
Consider the following example:
The following figure 5.1.30 shows in-order traversal example
Figure 5.1.30: In-order Traversal Example
289
The in-order traversal for the above tree is 4->2->5->1->3.
(iii) Postorder Traversal
In this traversal, the nodes are visited in the order of left child, right child and then the root.
i.e., the left sub-tree is traversed first, then the right sub-tree is traversed and finally the root is
visited. The function must perform the following tasks.
• Traverse the left subtree.
• Traverse the right subtree.
• Process the root node.
The postorder traversal sequence for the binary tree shown in Figure 5.1.28 is: D H I E B F G
C A.
Consider the following example:
The following figure 5.1.31 shows post-order traversal example
Figure 5.1.31: Post-order Traversal Example
The post-order traversal for the above tree is 4->5->2->3->1.
Self-assessment Questions
14) Traversal in a binary tree is the order in which we visit the nodes in a tree.
a) True
b) False
15) In preorder traversal nodes are visited in order of __________
a) Left child – Root – Right child
b) Left child – Right child – Root
c) Root – Left child – Right child
d) Root – Right child – Left child
Summary
o Tree can be defined as a non-linear data structure consisting of a root node and
other nodes present at different levels forming a hierarchy.
o There are three different types of trees: binary trees, binary search trees and
complete binary trees.
o A binary tree is a type of tree in which a node can have at most two child nodes.
o A binary search tree is a binary tree that is either empty or in which every node has
a key, with all keys in a node's left subtree smaller than the node's key and all keys
in its right subtree larger.
o A complete binary tree is a binary tree in which every level is completely filled,
except possibly the last, which is filled from left to right.
o A heap is a specialized tree-based data structure that satisfies the heap property,
which makes it suitable for implementing priority queues.
o There are various terminologies used for identification and analysis of various types
of trees like root, nodes, degree of node/tree, terminal nodes, non-terminal nodes,
siblings, level, edge, path, depth, parent node and ancestral node.
o Binary tree can be implemented in both single and two dimensional arrays.
o Binary tree can be traversed using three orders or traversal; inorder, preorder and
postorder.
Terminal Questions
1. List and explain advantages of a tree.
2. Explain different types of tree.
3. Explain various terminologies used in context of a tree.
4. Explain different types of tree traversals.
291
Answer Keys
Self-assessment Questions
Question No.	Answer
1	c
2	a
3	b
4	b
5	c
6	d
7	b
8	a
9	c
10	a
11	c
12	b
13	a
14	a
15	c
16	c
Activity
Activity Type: Offline
Duration: 15 Minutes
Description:
Ask all the students to solve the given problem,
Convert an array [10,26,52,76,13,8,3,33,60,42] into a maximum heap.
292
Bibliography
e-References
• cs.cmu.edu, (2016). Binary Trees. Retrieved on 19 April 2016, from
https://www.cs.cmu.edu/~adamchik/15-121/lectures/Trees/trees.html
• comp.dit.ie, (2016). Heap Sort. Retrieved on 19 April 2016, from
http://www.comp.dit.ie/rlawlor/Alg_DS/sorting/heap%20sort.pdf
External Resources
• Kruse, R. (2006). Data Structures and Program Design Using ‘C’ (2nd ed.). Pearson
Education.
• Srivastava, S. K., & Srivastava, D. (2004). Data Structures Through C in Depth (2nd
ed.). BPB Publications.
• Weiss, M. A. (2001). Data Structures and Algorithm Analysis in C (2nd ed.). Pearson
Education.
Video Links
Topic	Link
Tree Terminologies	https://www.youtube.com/watch?v=nq7m0Gll-60
Binary Tree Traversal	https://www.youtube.com/watch?v=-aIcPlIQ_MI
Binary Tree Representation	https://www.youtube.com/watch?v=1EsBpPmGEEE
293
Notes:
294
Chapter Table of Contents
Chapter 5.2
Graph Fundamentals
Aim ..................................................................................................................................................... 295
Instructional Objectives................................................................................................................... 295
Learning Outcomes .......................................................................................................................... 295
Introduction ...................................................................................................................................... 296
5.2.1 Definition of Graph ................................................................................................................ 296
Self-assessment Questions ...................................................................................................... 308
5.2.2 Types of Graphs ...................................................................................................................... 309
(i) Directed Graph .................................................................................................... 309
(ii) Undirected Graph .............................................................................................. 312
Self-assessment Questions ...................................................................................................... 316
5.2.3 Graph Traversal....................................................................................................................... 317
(i) Depth First Search (DFS) Traversal ................................................................................. 317
(ii) Breadth First Search (BFS) Traversal.............................................................................. 326
Self-assessment Questions ...................................................................................................... 334
Summary ........................................................................................................................................... 335
Terminal Questions.......................................................................................................................... 336
Answer Keys...................................................................................................................................... 337
Activity............................................................................................................................................... 338
Case study .......................................................................................................................................... 339
Bibliography ...................................................................................................................................... 341
e-References ...................................................................................................................................... 341
External Resources ........................................................................................................................... 341
Video Links ....................................................................................................................................... 341
Aim
To educate the students about the basics of Graphs and their applications
Instructional Objectives
After completing this chapter, you should be able to:
•
Describe graph and its types
•
Discuss the depth first search and breadth first search algorithm
•
Distinguish between different types of graphs
Learning Outcomes
At the end of this chapter, you are expected to:
•
Outline the steps to traverse depth first search and breadth first search
•
Write a C programme for DFS and BFS
•
Compare DFS and BFS
•
Outline applications of graph
295
Introduction
Till now we have studied data structures like arrays, stacks, queues, linked lists, trees etc. In this
chapter, we introduce an important mathematical and graphical structure called a Graph.
Graphs are used in subjects like Geography, Chemistry etc. This chapter deals with the study
of graphs and how to solve problems using graph theory. It also covers the different types of
graph traversals.
5.2.1 Definition of Graph
A Graph G is a pictorial representation of a collection of interconnected objects. The objects
form the vertices (or nodes), and the links that join pairs of vertices are called edges.
A Graph G consists of a set V of vertices and a set E of edges. A Graph is defined as G = (V, E),
where V is a finite and non-empty set of all the Vertices and E is a set of pairs of vertices called
as edges. A graph is depicted below in the figure 5.2.1.
Figure 5.2.1: A Graph
Thus V (G) is the set of vertices of a graph and E (G) is a set of edges of that Graph.
Figure 5.2.1 shows an example of a simple graph. In this graph:
V (G) = {V1, V2, V3, V4, V5, V6}, forming 6 vertices and
E (G) = {E1, E2, E3, E4, E5, E6, E7}, forming 7 edges.
296
Graphs are one of the objects of study in discrete mathematics and are very important in the
field of computer science for many real world applications from building compilers to
modelling communication networks.
A graph, G, consists of two sets V and E. V is a finite non-empty set of vertices. E is a set of
pairs of vertices, these pairs are called edges.
V(G) and E(G) will represent the sets of vertices and edges of graph G.
We can also write G = (V,E) to represent a graph.
Terminologies used in a Graph
1. Adjacent Vertices:
Any vertex v1 is said to be an adjacent vertex of another vertex v2 if there exists an edge from
vertex v1 to vertex v2.
Consider the following graph in figure 5.2.2. In this graph, adjacent vertices to vertex V1 are
V2 and V4, whereas adjacent vertices for V4 are vertex V2, V1 and V5
Figure 5.2.2: Adjacent Vertices
2. Point:
A point is any position in one-dimensional, two-dimensional or three-dimensional space. It is
usually denoted by an alphabet or a dot.
3. Vertex:
A vertex is defined as the node where multiple lines meet. In the above figure, V1 is a vertex;
V2 is a vertex and so on.
297
4. Edge:
An edge is a line which joins any two vertices in a graph. In the above sample graph, E1 is an
Edge joining two vertices V1 and V2.
5. Path:
A path is a sequence of vertices in which each vertex is adjacent to the next, starting from some vertex v.
Consider above given figure 5.2.2. In this figure, V1, V2, V4, V5 is a path.
6. Cycle:
A cycle is a path whose first and last vertices are the same, thus forming a loop-like
structure. In the above figure, V1, V2, V4, V1 is a cycle.
7. Connected Graph:
A graph is a connected graph if there is a path from any vertex to any other vertex. Consider
the following figure 5.2.3 (a), this graph is a connected graph as there is a path from every vertex
to other vertex.
Figure 5.2.3(b) shows an unconnected graph as there is no edge between vertex V4 and V5.
Thus it forms two disconnected components of a graph.
Figure 5.2.3 (a): A Connected Graph
Figure 5.2.3 (b): An Unconnected Graph
298
8. Degree of Graph:
It means the number of edges incident on a vertex. The degree of any vertex v is denoted as
degree (v). If degree (v) is 0, it means there are no edges incident on vertex v. Such a vertex is
called an isolated vertex. Figure 5.2.4 below shows a graph and the degrees of all its vertices.
Figure: 5.2.4: Degree of Vertices
9. Complete Graph:
A graph is said to be a Complete Graph if there is an edge between every pair of distinct vertices.
It is also called a Fully Connected Graph.
Figure 5.2.5 shows a complete Graph.
Figure 5.2.5: A Complete or Fully Connected Graph
299
Graph Data Structure
Graphs can be formally defined as an abstract data type with data objects and operations on
them as follows:
Data objects: A graph G of vertices and edges. Vertices represent data or elements. The below
given figure 5.2.6 shows a simple graph with 5 vertices.
Figure 5.2.6: A Simple Graph
Operations
• Check-Graph-Empty (G): Check if graph G is empty – a Boolean function. The above graph
has 5 vertices, so this operation will return false, as the graph is not empty.
• Insert-Vertex (G, V): Insert an isolated vertex V into a graph G. Ensure that vertex V
does not exist in G before insertion. In the graph in figure 5.2.7, we are adding a
new vertex named F. Before being connected to the rest of the graph, it acts as an isolated vertex.
Figure 5.2.7: Isolated Vertex
300
• Insert-Edge (G, u, v): Insert an edge connecting vertices u and v into a graph G. Ensure
that the edge does not exist in G before insertion. In figure 5.2.8, a new
edge joining vertices E and F is inserted.
Figure 5.2.8: Adding New Edge
• Delete-Vertex (G, V): Delete vertex V and all the edges incident on it from the
graph G. Ensure that such a vertex exists in the graph G before deletion. In the graph
in figure 5.2.9, vertex D is deleted along with all the edges incident on it.
Figure 5.2.9: Vertex Deletion
• Delete-Edge (G, u, v): Delete the edge connecting vertices u and v from the graph G.
Ensure that such an edge exists before deletion. In the graph shown in figure
5.2.10, the edge connecting vertices B and D is removed.
301
Figure 5.2.10: Deleting an Edge
• Store-Data (G, V, Item): Store Item in a vertex V of graph G. Once a vertex is created,
its value can be added too. In the graph in figure 5.2.11, a vertex is created and
item G is stored in that vertex.
Figure 5.2.11: Adding Data into Vertex
• Retrieve-Data (G, V, Item): Retrieve the data of a vertex V in the graph G and return
it in Item. In the example in figure 5.2.12, we have retrieved data from the
vertex G.
302
Figure 5.2.12: Retrieving Data from a Vertex
• BFT (G): Perform Breadth First Traversal of a graph. This traversal starts from the root
node and explores the nodes level by level, completing each level before moving deeper.
• DFT (G): Perform Depth First Traversal of a graph. This traversal starts from a root
node and visits adjacent nodes as deep as possible before backtracking.
Representation of Graphs
Since a graph is a mathematical structure, the representations of graphs are categorised into two
types, namely (i) sequential representation and (ii) linked representation. Sequential
representation uses the array data structure, whereas linked representation uses singly linked lists as
its data structure.
The sequential or the matrix representations of graphs have the following methods:
• Adjacency Matrix Representation
• Incidence Matrix Representation
a) Adjacency Matrix Representation
A graph with n nodes can be represented as an n x n adjacency matrix A such that:

A[i][j] = 1 if there exists an edge between nodes i and j,
A[i][j] = 0 otherwise.
303
For example 1:
Figure 5.2.13 (a) Graph
Figure 5.2.13 (b) Adjacency Matrix
Explanation: The above graph contains 5 vertices: 1, 2, 3, 4 and 5.
Consider vertex 1: vertex 1 is connected to vertex 2 and vertex 5.
Thus A[1][2] = 1 and A[1][5] = 1.
Similarly, vertex 1 is not connected to vertices 3 and 4, nor to itself.
Thus A[1][3] = 0, A[1][4] = 0 and A[1][1] = 0.
Consider vertex 5: vertex 5 is connected to vertex 1, vertex 2 and vertex 4.
Thus A[5][1] = 1, A[5][2] = 1 and A[5][4] = 1.
Similarly, vertex 5 is not connected to vertex 3 or to itself.
Thus A[5][3] = 0 and A[5][5] = 0.
The same rule applies to the other vertices.
For example 2:
Figure 5.2.14 (a) Digraph
Figure 5.2.14 (b) Adjacency Matrix
Consider vertex 4: the given graph is a digraph; hence there is only a unidirectional edge
from 4 to 2.
Thus A[4][2] = 1.
Since vertex 4 is not connected to any other vertex, the rest of the entries in that row are 0.
b) Incidence Matrix Representation
Let G be a graph with n vertices and e edges. Define an n x e matrix M = [mij], whose n rows
correspond to the n vertices and e columns correspond to the e edges, as:

m[i][j] = 1 if edge ej is incident upon vertex vi,
m[i][j] = 0 otherwise.
For example:
Figure 5.2.15 (a) Undirected Graph

      e1  e2  e3  e4  e5  e6  e7
v1     1   0   0   0   1   0   0
v2     1   1   0   0   0   1   1
v3     0   1   1   0   0   0   0
v4     0   0   1   1   0   0   1
v5     0   0   0   1   1   1   0

Figure 5.2.15 (b) Incidence Matrix
Explanation: Above graph contains 5 vertices: 1, 2, 3, 4 and 5
Consider vertex 1: 2 edges are incident on vertex 1. They are e1 and e5.
Thus M1,e1 = 1 and M1,e5 = 1.
The rest of the entries in that row will be 0.
Consider vertex 4: 3 edges are incident on vertex 4. They are e3, e4 and e7.
Thus M4,e3 = 1, M4,e4 = 1 and M4,e7 = 1.
The rest of the entries in that row will be 0.
The incidence matrix contains only two elements, 0 and 1. Such a matrix is also called a binary
matrix or a (0, 1)-matrix.
305
c) Linked Representation of Graphs
The linked representation of graphs is also referred to as the adjacency list representation and is
comparatively space-efficient with regard to the adjacency matrix representation.
In the linked representation, a graph is stored as a linked structure of nodes. A graph G = (V, E)
is represented by keeping all vertices in a list, with each vertex pointing to a singly linked list of
the nodes adjacent to it.
For example 1:
Figure 5.2.16 (a) Undirected Graph
Figure 5.2.16 (b) Linked Representation of a Graph
In the above figure 5.2.16(a), an undirected graph is shown. In figure 5.2.16(b), its equivalent
Linked list representation is shown.
Adjacent vertices to vertex 1 are vertex 2 and 5. Thus in the linked list representation, vertex 1
is shown linked to node 2 and node 2 is linked to node 5, thus forming a chain like structure.
Similarly, the adjacent vertices to vertex 2 are vertices 1, 5, 3 and 4. Thus in the linked list
representation, vertex 2 is shown linked to node 1, node 1 is linked to node 5, node 5 is linked
to node 3, and node 3 in turn is linked to node 4, thus forming a linked list of nodes.
306
For example 2:
Figure 5.2.17 (a) Digraph
Figure 5.2.17 (b) Linked Representation of a Graph
Did you know?
Social network graphs: to tweet or not to tweet. Graphs that represent who knows whom, who
communicates with whom, who influences whom or other relationships in social structures. An
example is the twitter graph of who follows whom. These can be used to determine how
information flows, how topics become hot, how communities develop, or even who might be a
good match for who, or is that whom.
307
Self-assessment Questions
1) A Graph is a mathematical representation of a set of objects consisting of ________ and
____________.
a) Vertices and indices
b) Indices and edges
c) Vertices and edges
d) Edges and links
2) In a graph if e=(u,v) means _____________.
a) u is adjacent to v but v is not adjacent to u.
b) e begins at u and ends at v
c) u is node and v is an edge.
d) Both u and v are edges.
3) Graph can be represented as an adjacency matrix
a) True
b) False
4) A ________ is a particular position in a one-dimensional, two-dimensional, or three-dimensional space.
a) Point
b) Node
c) Edge
d) Vertex
5.2.2 Types of Graphs
Based on the direction of their edges, graphs are categorised into two types namely,
1. Undirected Graph
2. Directed Graph
In an undirected graph, the pair of vertices representing any edge is unordered. Thus, the pairs
(v1, v2) and (v2, v1) represent the same edge.
In a directed graph, each edge is represented by a directed pair (v1, v2) where v1 is the tail and
v2 is the head of the edge. Therefore <v2, v1> and <v1, v2> represent two different edges.
(i) Directed Graph
Degree of Vertex in a Directed Graph
In a directed graph, each vertex has an indegree and an outdegree.
Consider the following Directed graph from below figure:
Figure 5.2.18: Directed Graph
Indegree of a Graph
Indegree of vertex V is the number of edges which are coming into the vertex V (incoming
edges).
Notation − deg+(V).
309
In the above example directed graph, there are 5 vertices: V1, V2, V3, V4 and V5.
Consider vertex V1: V2 is connected to V1 through edge E1. The edge comes from V2 towards
Vertex V1. Thus the Indegree of vertex V1 is 1.
i.e., deg+(V1)=1
Consider vertex V3: V1 and V4 are connected to V3 through edges E4 and E5. Thus the
Indegree of vertex V3 is 2.
i.e., deg+(V3)=2
Outdegree of a Graph
Outdegree of vertex V is the number of edges which are going out from the vertex V (outgoing
edges).
Notation − deg-(V).
In the above example directed graph, there are 5 vertices: V1, V2, V3, V4 and V5.
Consider vertex V1: V1 is connected to V3 through edge E4. The edge goes from V1 towards
Vertex V3. Thus the Outdegree of vertex V1 is 1.
i.e., deg-(V1)=1
For example 1:
Consider the following directed graph.
Vertex ‘a’ has two edges, ‘ad’ and ‘ab’, which are going outwards. Hence its outdegree is 2.
Similarly, there is an edge ‘ga’, coming towards vertex ‘a’. Hence the indegree of ‘a’ is 1.
310
The indegree and outdegree of other vertices are shown in the following table.
Table 5.2.1: Indegree and Outdegree of Vertices

Vertex	Indegree	Outdegree
a	1	2
b	2	0
c	2	1
d	1	1
e	1	1
f	1	1
g	0	2
For example 2:
Consider the following directed graph.
Vertex ‘a’ has an edge ‘ae’ going outwards from vertex ‘a’. Hence its outdegree is 1. Similarly,
the graph has an edge ‘ba’ coming towards vertex ‘a’. Hence the indegree of ‘a’ is 1.
311
The indegree and outdegree of other vertices are shown in the following table.
Table 5.2.2: Indegree and Outdegree of Vertices

Vertex	Indegree	Outdegree
a	1	1
b	0	2
c	2	0
d	1	1
e	1	1

(ii) Undirected Graph
An undirected graph is a graph in which the nodes are connected by undirected arcs. An
undirected arc is an edge that has no arrow. Both ends of an undirected arc are equivalent: there
is no head or tail. Therefore, we represent an edge in an undirected graph as a set rather than
an ordered pair:
Definition (Undirected Graph): An undirected graph G is an ordered pair (V, E) with the
following properties:
1. The first component, V, is a finite, non-empty set. The elements of V are called
the vertices of G.
2. The second component, E, is a finite set of sets. Each element of E is a set comprised
of exactly two (distinct) vertices. The elements of E are called the edges of G.
For example, consider an undirected graph comprised of four vertices and four
edges. The graph can be represented graphically as shown in Figure 5.2.19. The vertices are
represented by appropriately labelled circles, and the edges are represented by lines that
connect associated vertices.
Figure 5.2.19: An Undirected Graph
Notice that because an edge in an undirected graph is a set, {u, v} and {v, u} denote the same
edge, and since E is also a set, it cannot contain more than one instance of a given edge. Another
consequence of the definition is that there cannot be an edge from a node to itself in an undirected
graph, because an edge is a set of size two and a set cannot contain duplicates.
Degree of Vertex in an Undirected Graph
In an undirected graph, edges have no direction, so each vertex simply has a degree: the number
of edges incident on it.
For example 1:
Consider the following graph −
In the above Undirected Graph,
313
deg(a) = 2, since there are 2 edges meeting at vertex ‘a’.
deg(b) = 3, since there are 3 edges meeting at vertex ‘b’.
deg(c) = 1, since there is 1 edge formed at vertex ‘c’
deg(d) = 2, since there are 2 edges meeting at vertex ‘d’.
deg(e) = 0, since there are 0 edges formed at vertex ‘e’.
For example 2:
Consider the following graph −
In the above graph,
deg(a) = 2, deg(b) = 2, deg(c) = 2, deg(d) = 2, and deg(e) = 0.
Tree
A tree can be defined as a nonlinear data structure, similar to a graph, in which the elements
are arranged in a hierarchical manner. A tree can be used to represent hierarchical relations
among various data elements. Trees do not contain any cycle.
A tree has a root node from where the tree structure begins. Starting from the root node, the tree
will have many subtrees formed by its child nodes. A node is a data element present in a tree.
314
The figure 5.2.20 given below shows a simple tree:
Figure 5.2.20: A Tree
Degree of any node:
• Degree of a node is defined as the number of subtrees of that node.
• For example, the degree of node A is 3, whereas the degree of node H is 0, as there are no
subtrees of H.
Degree of a tree:
• Degree of a tree is the maximum degree of any node in the given tree.
• For example, in the above tree, the degree of the tree is 3, as node A has degree 3, which
is the maximum value in that tree.
Did you know?
Graphs are often used to represent constraints among items. For example the GSM network for
cell phones consists of a collection of overlapping cells. Any pair of cells that overlap must
operate at different frequencies. These constraints can be modeled as a graph where the cells are
vertices and edges are placed between cells that overlap.
315
The weight of an edge is often referred to as the "cost" of the edge.
In applications, the weight may be a measure of the length of a route, the capacity of a line, the
energy required to move between locations along a route, etc.
Given a weighted graph, and a designated node S, we would like to find a path of least total
weight from S to each of the other vertices in the graph.
The total weight of a path is the sum of the weights of its edges.
Self-assessment Questions
5) An undirected graph has no directed edges
a) True
b) False
6) ___________ of vertex V is the number of edges which are coming into the vertex
a) Degree
b) Indegree
c) Outdegree
d) Path
7) A connected acyclic graph is called a ___________
a) Connected Graph
b) Tree
c) Hexagon
d) Pentagon
8) ______________ of vertex V is the number of edges which are going out from the
vertex
a) Degree
b) Indegree
c) Outdegree
d) Path
5.2.3 Graph Traversal
A graph traversal is a method by which we visit all the nodes in a given graph. Graph
traversals are required in many application areas, like searching for an element in a graph, finding
the shortest path to a node, etc. There are many methods for traversing a graph. In
this chapter we will study the following 2 methods of graph traversal. They are:
1. Depth First Search (DFS)
2. Breadth First Search (BFS)
(i) Depth First Search (DFS) Traversal
In the Depth First Search method, we start from any root node in a graph and explore as far as
possible along each branch before backtracking. It means we explore all the unvisited successors
of a node before returning. We can use a Stack data structure to keep track of the nodes still to
be explored.
Algorithm:
1. Start
2. Consider a Graph G = (V, E)
3. Initially mark all the nodes of graph G as unvisited
4. Push the root node onto the Stack S, from where the traversal begins
5. Repeat steps 6–8 until the Stack S is empty
6. Pop a node v from the Stack S
7. If the popped node v is unvisited, mark it as visited and push all its unvisited adjacent
nodes onto the Stack S
8. If the popped node has no unvisited adjacent nodes, simply continue (backtrack)
9. End
317
Pseudo Code:
Consider a Graph G = (V, E), where V is the set of vertices and E is the set of edges.
DFS(G, root)
{
root: the vertex from where the traversal begins
Visited[v] is a status flag denoting that vertex v has been visited
Consider Stack S used to store vertices awaiting exploration
For each vertex v in the graph
Set Visited[v] = false;
//to mark all nodes unvisited initially
Push root vertex on Stack S;
//Start from Root vertex
While Stack S is not empty
{
Pop element from S and put it in v
If (Visited[v] = false)
{
Set Visited[v] = true;
//mark the popped vertex as visited
For every unvisited node x adjacent to v
Push vertex x on Stack S
}
}
}
Example:
Consider the following Graph. We will apply the above algorithm on this graph to implement
Depth first Search Traversal.
Step 1: Initially we start from root node B. Stack S is empty.
Step 2: Mark the root node B as visited and Push B on the Stack S.
Step 3: Check for the nodes adjacent to node B. Only Node D is adjacent to node B. So visit
that node D and Push it onto the stack S.
Step 4: Check for the nodes adjacent to node D. There are 3 Nodes adjacent to node D. They
are B, C and E. But node B is already visited. So we need to consider either of the nodes C and
E. Let us consider node C. Mark first child node C as visited and push C on stack S.
Step 5: Check for the nodes adjacent to node C. There are 2 Nodes adjacent to node C. They
are D and E. But node D is already visited. So select node E as the next node. Mark node E as
visited and push E on stack S.
Step 6: Check for the nodes adjacent to node E. There are 3 Nodes adjacent to node E. They
are D, C and A. But nodes D and C are already visited. So select node A as the next node. Mark
node A as visited and push A on stack S.
Step 7: Check for the nodes adjacent to node A. But there is only 1 node adjacent to A i.e. E
and E is already visited. So as per the algorithm, if there are no unseen nodes then pop that
node from stack. So pop out node A from stack. Hence A has no vertices left to be visited now.
So we need to backtrack to node E.
Step 8: Check for the nodes adjacent to node E. Nodes A, C and D are adjacent. But A, C and
D are already marked as visited. So as per the algorithm, if there are no unseen nodes then pop
that node from stack. So pop out node E from stack. Hence E has no vertices left to be visited
now. So we need to backtrack to node C.
Step 9: Check for the nodes adjacent to node C. Nodes D and E are adjacent. But D and E are
already marked as visited. So pop out node C from stack. So we need to backtrack to node D.
Step 10: Check for the nodes adjacent to node D. Nodes B, C and E are adjacent. But C, B and
E are already marked as visited. So pop out node D also from stack. So we need to backtrack to
node B.
Step 11: Check for the nodes adjacent to node B. Node D is adjacent. But D is already marked
as visited. So pop out node B(root) also from stack. Hence the stack is empty now. It means the
algorithm is successfully executed.
Result of DFS Traversal: B, D, C, E, A
Program:
/*C program for Depth first search graph traversal using Stack */
#include<stdio.h>
char nodeid[20];            //to store node names
char stack[50];
int temp=0;                 //number of nodes already visited
int tos=-1, nodes;          //top of stack initialized to -1
char arr[20];               //visit order
int matrix[30][30];         //adjacency matrix
void push(char val)         //push vertex on stack
{
    tos=tos+1;
    stack[tos]=val;
}
char pop()                  //pop vertex from stack
{
    return stack[tos--];
}
void outputDFS()
{
    printf("Depth First Traversal gives: ");
    for(int i=0; i<nodes; i++)
        printf("%c ",arr[i]);
}
int unVisited(char val)     //1 if val is neither visited nor already on the stack
{
    for(int i=0; i<temp; i++)
        if(val==arr[i])
            return 0;
    for(int i=0; i<=tos; i++)
        if(val==stack[i])
            return 0;
    return 1;
}
char dfs(int i)
{
    char m;
    if(tos==-1)             //first call: seed the stack with the root
    {
        push(nodeid[i]);
    }
    m=pop();
    arr[temp]=m;            //record the visit
    temp++;
    for(int j=0; j<nodes; j++)
    {
        if(matrix[i][j]==1 && unVisited(nodeid[j]))
        {
            push(nodeid[j]);
        }
    }
    return (tos>=0) ? stack[tos] : m;   //next vertex to expand
}
int main()
{
    int v, l=0;
    printf("How many nodes in graph? ");
    scanf("%d",&nodes);
    printf("Enter the names of nodes one by one: \n");
    for(int i=0; i<nodes; i++)
    {
        scanf(" %c",&nodeid[i]);    //each node name is a single character
    }
    char root=nodeid[0];            //consider first node as root node
    printf("Enter the adjacency matrix. Edge present=1, else 0\n");
    for(int i=0;i<nodes; i++)
    {
        for(int j=0; j<nodes; j++)
        {
            printf("matrix[%c][%c]= ", nodeid[i], nodeid[j]);
            scanf("%d", &v);
            matrix[i][j]=v;
        }
    }
    for(int i=0;i<nodes;i++)
    {
        l=0;
        while(root!=nodeid[l])      //find the index of the next vertex
            l++;
        root=dfs(l);
    }
    outputDFS();
    return 0;
}
Output:
(ii) Breadth First Search (BFS) Traversal
In this method of traversing, we select a node as the start node, from where the traversal begins. It is visited and marked, and then all the unvisited nodes adjacent to it are visited and marked in order. Similarly, the unvisited neighbours of those nodes are visited and marked in turn, until all the nodes of the graph are covered.
Algorithm:
1. Start
2. Consider a graph G = (V, E)
3. Initially mark all the nodes of graph G as unvisited
4. Add the root node (the vertex from where the traversal begins) into the queue Q
5. Repeat steps 6 to 7 until the queue Q is empty
6. Remove a node from the queue Q
7. Visit and mark all the unvisited nodes adjacent to the removed node, and add them to the queue Q
8. End
Pseudo Code:
Consider a graph G = (V, E) where V is a set of vertices and E is a set of edges.
BFS(G, root)
{
    root: the vertex from where the traversal begins
    Visited[v]: a status flag denoting whether vertex v has been visited
    Queue Q: holds the vertices awaiting exploration
    For each vertex v in the graph
        Set Visited[v] = false        //mark all nodes unvisited initially
    Set Visited[root] = true
    Add the root vertex to the queue Q   //start from the root vertex
    While the queue Q is not empty
    {
        Remove an element from Q and put it in v
        For every unvisited neighbour x of v
        {
            Set Visited[x] = true
            Add x to the queue Q
        }
    }
}
Example:
Consider the following Graph. We will apply the above algorithm on this graph to implement
Breadth first Search Traversal.
Step 1: We will start from node B; hence B is the root node. We have a queue Q for keeping track of vertices. The queue Q is initially empty.
Step 2: Add B to the queue Q and mark B as visited.
Step 3: Nodes adjacent to B are A, C and D. All 3 are unvisited. But we will start from A. As B
is visited, Remove B from Queue. Mark A as visited and add it to the Queue Q.
Step 4: Next Node adjacent to B is C. Mark C as visited and add it to the Queue Q.
Step 5: Next Node adjacent to B is D. Mark D as visited and add it to the Queue Q.
Step 6: Now the node B is fully explored. So, the next node to be visited is A. Nodes adjacent
to node A are B, C and E. But B and C are already visited. Mark E as visited and add it to the
Queue Q. But remove A from queue Q as it is fully explored.
Step 7: Now the nodes adjacent to E are A, C and D. But all are marked and visited.
If we see the graph, vertices E, C and D are fully visited so remove them from the queue. Hence
the Queue is empty. Thus the algorithm is successfully implemented.
Result of BFS: B, A, C, D, E
Program:
/*C program for Breadth first search graph traversal using Queue */
#include<stdio.h>
char nodeid[20];            //to store node names
char queue[50];
int temp=0;                 //number of nodes already visited
int front=0, rear=0, nodes; //front and rear initialized to 0
char arr[20];               //visit order
int matrix[30][30];         //adjacency matrix
void qadd(char value)       //function to add vertex to the queue
{
    queue[front]=value;
    front++;
}
char qremove()              //function to remove visited vertex from queue
{
    rear=rear+1;
    return queue[rear-1];
}
void outputBFS()
{
    printf("Breadth First Traversal gives: ");
    for(int i=0; i<nodes; i++)
        printf("%c ",arr[i]);
}
int unVisited(char value)   //1 if value was never added to the queue
{
    for(int i=0; i<front; i++)
    {
        if(value==queue[i])
            return 0;
    }
    return 1;
}
int bfs(int i)
{
    char r;
    if(front==0)            //first call: seed the queue with the root
    {
        qadd(nodeid[i]);
    }
    r=qremove();            //take the next vertex in FIFO order
    arr[temp]=r;            //record the visit
    temp++;
    int k=0;                //find the index of the removed vertex
    while(nodeid[k]!=r)
        k++;
    for(int j=0; j<nodes; j++)
    {
        if(matrix[k][j]==1 && unVisited(nodeid[j]))
        {
            qadd(nodeid[j]);
        }
    }
    return 0;
}
int main()
{
    int v;
    printf("How many nodes in graph? ");
    scanf("%d",&nodes);
    printf("Enter the names of nodes one by one: \n");
    for(int i=0; i<nodes; i++)
    {
        scanf(" %c",&nodeid[i]);    //each node name is a single character
    }
    printf("Enter the adjacency matrix. Edge present=1, else 0\n");
    for(int i=0;i<nodes; i++)
    {
        for(int j=0; j<nodes; j++)
        {
            printf("matrix[%c][%c]= ", nodeid[i], nodeid[j]);
            scanf("%d", &v);
            matrix[i][j]=v;
        }
    }
    for(int i=0;i<nodes;i++)
        bfs(i);
    outputBFS();
    return 0;
}
Output:
Comparison: DFS versus BFS
Table 5.2.1: Comparison of DFS with BFS

Depth First Search Traversal:
• This traversal starts from a root node and visits all the adjacent nodes completely in depth before backtracking.
• It may not necessarily give a shortest path in a graph.
• If a loop exists in a graph, the algorithm may go into an infinite loop. Hence care should be taken while marking visited vertices.
• Applications: 1. Connectivity testing 2. Spanning trees

Breadth First Search Traversal:
• This traversal starts from the root node and explores the nodes level by level, thus exploring each node completely before moving on.
• It always gives the shortest path (in terms of number of edges) within a graph, thus giving an optimal solution.
• Visited nodes must be marked here as well; otherwise, in a graph with cycles, vertices may be enqueued repeatedly. Marking also improves the efficiency of the algorithm.
• Applications: 1. Finding the shortest path 2. Spanning tree
Self-assessment Questions
9) Sequential representation of binary tree uses___________.
a) Array with pointers
b) Single linear array
c) Two dimensional arrays
d) Three dimensional arrays
10) In the _____________traversal, we process all of a vertex’s descendants before we move
to an adjacent vertex.
a) Depth First
b) Breadth First
c) Path First
d) Root First
11) The data structure required for Breadth First Traversal on a graph is__________.
a) Tree
b) Stack
c) Array
d) Queue
12) The aim of the BFS algorithm is to first traverse the nodes of the graph that are __________
a) As close as possible to the root node
b) With high depth
c) With large breadth
d) With large number or nodes
Summary
o Graphs are non-linear data structures. Graph is an important mathematical
representation of a physical problem.
o Graphs and directed graphs are important to computer science for many real world
applications from building compilers to modeling physical communication
networks.
o A graph is an abstract notion of a set of nodes (vertices or points) and connection
relations (edges or arcs) between them.
o The representation of graphs can be categorized as (i) sequential representation and
(ii) linked representation.
o The sequential representation makes use of an array data structure whereas the
linked representation of a graph makes use of a singly linked list as its fundamental
data structure.
o The depth first search (DFS) and breadth first search (BFS) are the two algorithms used for traversing and searching for a node in a graph.
Terminal Questions
1. Explain graphs as a data structure.
2. Explain two different ways of sequential representation of a graph with an example.
3. Explain the linked representation of an undirected and directed graph.
4. Which are the two standard ways of traversing a graph? Explain them with an
example of each.
5. Consider the following specification of a graph G,
V(G) = { 4,3,2,1 }
E(G) = {( 2,1 ),( 3,1 ),( 3,3 ),( 4,3 ),( 1,4 )}
a) Draw an undirected graph.
b) Draw its adjacency matrix.
Answer Keys
Self-assessment Questions
Question No. – Answer
1 – c
2 – d
3 – a
4 – a
5 – a
6 – b
7 – b
8 – c
9 – b
10 – a
11 – d
12 – a
Activity
Activity Type: Offline
Duration: 30 Minutes
Description:
Ask the students to solve the given problem:
Consider the graph G with vertices V = {1, 2, 3, 4} and edges
E = {(1,2), (2,3), (3,4), (4,1), (2,1), (2,4)}.
• For every vertex u, find its indegree in(u) and its outdegree out(u).
• What is the value of the following sum for this graph?
Case study
Study of different applications of Graphs
Since they are powerful abstractions, graphs can be very important in modelling data. In fact,
many problems can be reduced to known graph problems. Here we outline just some of the
many applications of graphs.
1. Transportation networks. In road networks vertices are intersections and edges are the road
segments between them, and for public transportation networks vertices are stops and
edges are the links between them. Such networks are used by many map programs, such as Google Maps, Bing Maps and Apple iOS 6 Maps, to find the best routes between locations. They are also used for studying traffic
patterns, traffic light timings, and many aspects of transportation.
2. Utility graphs. The power grid, the Internet, and the water network are all examples of
graphs where vertices represent connection points, and edges the wires or pipes between
them. Analysing properties of these graphs is very important in understanding the
reliability of such utilities under failure or attack, or in minimizing the costs to build
infrastructure that matches required demands.
3. Document link graphs. The best known example is the link graph of the web, where each
web page is a vertex, and each hyperlink a directed edge. Link graphs are used, for example,
to analyse relevance of web pages, the best sources of information, and good link sites.
4. Protein-protein interactions graphs. Vertices represent proteins and edges represent
interactions between them that carry out some biological function in the cell. These graphs
can be used, for example, to study molecular pathways—chains of molecular interactions
in a cellular process. Humans have over 120K proteins with millions of interactions among
them.
5. Network packet traffic graphs. Vertices are IP (Internet protocol) addresses and edges are
the packets that flow between them. Such graphs are used for analysing network security,
studying the spread of worms, and tracking criminal or non-criminal activity.
6. Scene graphs. In graphics and computer games scene graphs represent the logical or special
relationships between objects in a scene. Such graphs are very important in the computer
games industry.
7. Finite element meshes. In engineering many simulations of physical systems, such as the
flow of air over a car or airplane wing, the spread of earthquakes through the ground, or
the structural vibrations of a building, involve partitioning space into discrete elements.
The elements along with the connections between adjacent elements form a graph that is
called a finite element mesh.
8. Robot planning. Vertices represent states the robot can be in and the edges the possible
transitions between the states. This requires approximating continuous motion as a
sequence of discrete steps. Such graph plans are used, for example, in planning paths for
autonomous vehicles.
9. Neural networks. Vertices represent neurons and edges the synapses between them. Neural
networks are used to understand how our brain works and how connections change when
we learn. The human brain has about 10^11 neurons and close to 10^15 synapses.
10. Graphs in quantum field theory. Vertices represent states of a quantum system and the
edges the transitions between them. The graphs can be used to analyse path integrals and
summing these up generates a quantum amplitude.
11. Semantic networks. Vertices represent words or concepts and edges represent the
relationships among the words or concepts. These have been used in various models of how
humans organize their knowledge, and how machines might simulate such an organization.
12. Graphs in epidemiology. Vertices represent individuals and directed edges the transfer of
an infectious disease from one individual to another. Analysing such graphs has become an
important component in understanding and controlling the spread of diseases.
13. Graphs in compilers. Graphs are used extensively in compilers. They can be used for type
inference, for so called data flow analysis, register allocation and many other purposes.
They are also used in specialized compilers, such as query optimization in database
languages.
Questions:
1. List down different applications of graphs from above study.
2. Explain in brief how graphs can be used in computer networks.
3. Can you think of any additional applications of graphs to solve real-world problems?
Bibliography
e-Reference
• courses.cs.vt.edu (2016). Graph Traversals. Retrieved on 19 April 2016, from http://courses.cs.vt.edu/~cs3114/Fall09/wmcquain/Notes/T20.GraphTraversals.pdf
External Resources
• Kruse, R. (2006). Data Structures and Program Designing Using 'C' (2nd ed.). Pearson Education.
• Srivastava, S. K., & Srivastava, D. (2004). Data Structures Through C in Depth (2nd ed.). BPB Publications.
• Weiss, M. A. (2001). Data Structures and Algorithm Analysis in C (2nd ed.). Pearson Education.
Video Links
Topic – Link
Introduction to Graphs – https://www.youtube.com/watch?v=vfCo5A4HGKc
Graph Types and Representations – https://www.youtube.com/watch?v=VeEneWqC5a4
Graph Traversals – https://www.youtube.com/watch?v=H4_vRy4xQpc&list=PLT2H5PXNSXgM_Mqzk7bChFvB6xyuWilIa