
Unit I

Introduction to
Data Structures
This document has been made available as study material for the students of B.Tech. CSE and
the content has been collected from various resources available online and offline by:
Er. ANDLEEB ZUHRA
andleebz@uok.edu.in
(Lecturer)
Department of Computer Science and Engineering
University of Kashmir, Srinagar
(North Campus)
Introduction to Data Structures
and Algorithms
What are Data Structures?
A data structure is defined by
• the logical arrangement of data elements, combined with
• the set of operations we need to access the elements.
A data structure is the structural representation of the logical relationships
between elements of data. In other words, a data structure is a way of organizing
data items by considering their relationships to each other.
A data structure mainly specifies the structured organization of data and
provides access methods with the correct degree of associativity. The choice of
data structure affects the design of both the structural and functional aspects
of a program.
Algorithm + Data Structure = Program
Data structures are the building blocks of a program. Selecting an appropriate
data structure helps the programmer to design more efficient programs, which
matters because the complexity and volume of the problems solved by computers
are steadily increasing. Programmers have to work hard to solve these problems.
If a problem is analysed and divided into sub-problems, the task becomes much
easier, i.e., divide, conquer and combine.
A complex problem usually cannot be divided and programmed as a set of
modules unless its solution is structured or organized. This is because when we
divide a big problem into sub-problems, these sub-problems will be
programmed by different programmers or groups of programmers. All the
programmers should follow a standard structural method so that these modules
can be integrated easily and efficiently. Such hierarchical structuring of
program modules and sub-modules should not only reduce the complexity and
control the flow of program statements but also promote the proper structuring of
information. By choosing a particular structure (or data structure) for the data
items, certain data items become closely related while others lose their relationship.
Example: A library is composed of elements (books). Accessing a particular
book requires knowledge of the arrangement of the books. Users access books
only through the librarian.
Here the arrangement of the books is the logical arrangement of data elements,
and the requests made through the librarian are the set of operations we need
to access the elements.
Algorithm:
A computable set of steps to achieve a desired result. An algorithm is closely
related to the data structure it operates on.
Example: Find an element.
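As a minimal sketch of the "find an element" example, assuming the data structure is a plain C array and the algorithm is a linear scan, the steps could look like this. The function name find_element is hypothetical and used only for illustration.

```c
/* Linear search: examine each element in turn until the key is found.
   find_element is a hypothetical name chosen for this sketch.
   Returns the index of the first match, or -1 if the key is absent. */
int find_element(const int a[], int n, int key)
{
    for (int i = 0; i < n; i++) {
        if (a[i] == key)
            return i;      /* found: stop early */
    }
    return -1;             /* scanned all n elements without a match */
}
```

The arrangement (elements stored one after another) determines the algorithm: with no ordering information, every element may have to be inspected.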
Complexity of Algorithms
• In Computer Science, it is important to measure the quality of algorithms,
especially the specific amount of a certain resource an algorithm needs.
• Resources: time or memory storage (PDA?)
• Different algorithms do the same task with a different set of instructions, in
less or more time, space or effort than others.
• The analysis has a strong mathematical background.
• The most common way of characterizing an algorithm is asymptotic
notation, also called Big O.
• It is generally written as O(f(n)), where n is the size of the problem.
• Polynomial time algorithms
O(1) --- Constant time --- the time does not change in response to
the size of the problem.
O(n) --- Linear time --- the time grows linearly with the size (n) of
the problem.
O(n^2) --- Quadratic time --- the time grows quadratically with the
size (n) of the problem. In big O notation, all polynomials with the same
degree are equivalent, so O(3n^2 + 3n + 7) = O(n^2).
• Sub-linear time algorithms
O(log n) --- Logarithmic time
• Super-polynomial time algorithms
O(n!)
O(2^n)
Example 1: Complexity of an algorithm
#include <iostream>
using namespace std;

void f ( int a[], int n )
{
    int i;
    cout << "N = " << n;          // O(1)
    for ( i = 0; i < n; i++ )     // O(N): the loop body runs n times
        cout << a[i];
    cout << "\n";                 // O(1)
}
Total: 2 * O(1) + O(N) = O(N)
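Example 2: Complexity of an algorithm
#include <iostream>
using namespace std;

void f ( int a[], int n )
{
    int i;
    cout << "N = " << n;              // O(1)
    for ( i = 0; i < n; i++ )         // O(N^2): nested loops, n * n iterations
        for ( int j = 0; j < n; j++ )
            cout << a[i] << a[j];
    for ( i = 0; i < n; i++ )         // O(N)
        cout << a[i];
    cout << "\n";                     // O(1)
}
Total: 2 * O(1) + O(N) + O(N^2) = O(N^2)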
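To make the two totals concrete, the following sketch counts how many times the loop bodies of the two examples would run for a given n. It is an illustration only: the function names are hypothetical and the output statements are replaced by a step counter.

```c
/* Counts loop-body executions for the single loop of Example 1. */
long count_linear(int n)
{
    long steps = 0;
    for (int i = 0; i < n; i++)
        steps++;                       /* runs n times */
    return steps;
}

/* Counts loop-body executions for the loops of Example 2:
   a nested pair (n * n steps) followed by a single loop (n steps). */
long count_quadratic_plus_linear(int n)
{
    long steps = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            steps++;                   /* runs n * n times */
    for (int i = 0; i < n; i++)
        steps++;                       /* runs n more times */
    return steps;
}
```

For n = 10 the counts are 10 and 110; for n = 100 they are 100 and 10100. The n^2 term dominates as n grows, which is why the second total is written as O(N^2).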
Structures
Structures are used when you want to process data of multiple data types
but still want to refer to the data as a single entity.
Access data:
structurename.membername
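A small sketch of the access syntax, assuming a hypothetical student record with three members of different types (the structure and the helper make_student are illustrative, not taken from the notes):

```c
#include <string.h>

/* One entity grouping an int, a character array and a float. */
struct student {
    int   roll_no;
    char  name[32];
    float marks;
};

/* Fills in a student using structurename.membername access. */
struct student make_student(int roll_no, const char *name, float marks)
{
    struct student s;
    s.roll_no = roll_no;
    strncpy(s.name, name, sizeof s.name - 1);
    s.name[sizeof s.name - 1] = '\0';   /* strncpy may not terminate */
    s.marks = marks;
    return s;
}
```

A call such as make_student(7, "Asha", 81.5f) returns a single value whose parts are read back as s.roll_no, s.name and s.marks.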
Structure Pointers
A structure can also be processed through a structure pointer; members are
then accessed as pointername->membername.
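The following sketch shows one way a structure might be processed through a pointer. The struct point type and the translate function are assumptions made for the example.

```c
struct point {
    int x;
    int y;
};

/* Moves the point in place. p->x is shorthand for (*p).x;
   both forms are shown here for comparison. */
void translate(struct point *p, int dx, int dy)
{
    p->x += dx;
    (*p).y += dy;
}
```

Because the function receives the address of the structure, the caller's variable is modified: after translate(&pt, 3, -2) on a point {1, 2}, pt holds {4, 0}.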
CLASSIFICATION OF DATA STRUCTURES
Data structures are broadly divided into two:
1. Primitive data structures: These are the basic data structures and are
directly operated upon by machine instructions, i.e., at a primitive
level. They are integers, floating point numbers, characters, string
constants, pointers etc. These primitive data structures are the basis for the
more sophisticated (non-primitive) data structures discussed later.
2. Non-primitive data structures: These are more sophisticated data structures
that emphasize the structuring of a group of homogeneous (same type) or
heterogeneous (different type) data items. Arrays, lists, files, linked lists, trees
and graphs fall in this category.
Fig. 1. Classifications of data structures
Fig. 1 briefly shows other classifications of data structures. The basic
operation on a data structure is to create a (non-primitive) data structure, which
is considered to be the first step of writing a program. For example, in Pascal, C
and C++, variables are created by using declaration statements.
int Int_Variable;
In C/C++, memory space is allocated for the variable “Int_Variable” when
the above declaration statement executes. That is, a data structure is created.
Discussion of primitive data structures is beyond the scope of this book. Let
us consider non-primitive data structures.
Arrays
Arrays are the most frequently used data structure in programming. Mathematical
problems such as matrix algebra can be easily handled by arrays. An array is a
collection of homogeneous data elements described by a single name. Each
element of an array is referenced by a subscripted variable or value, called a
subscript or index, enclosed in parentheses (square brackets in C). If an element
of an array is referenced by a single subscript, the array is known as a
one-dimensional array or linear array; if two subscripts are required to reference
an element, the array is known as a two-dimensional array, and so on. Arrays whose
elements are referenced by two or more subscripts are called multi-dimensional
arrays.
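A short sketch of one-subscript and two-subscript access in C. The sizes ROWS and COLS and the function names are arbitrary choices for the illustration.

```c
#define ROWS 2
#define COLS 3

/* One-dimensional array: each element is reached with a single subscript. */
int sum_1d(const int a[], int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Two-dimensional array: each element needs a row and a column subscript. */
int sum_2d(int m[ROWS][COLS])
{
    int total = 0;
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            total += m[r][c];
    return total;
}
```

Note that C subscripts start at 0, so an array declared with 4 elements has valid subscripts 0 to 3.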
Lists
As we have discussed, an array is an ordered set consisting of a fixed
number of elements. Insertion and deletion are not directly supported on arrays,
and because the length is fixed we cannot add elements beyond the declared
size. Lists overcome these limitations. A list is an ordered set consisting
of a varying number of elements, to which insertions and deletions can be made. A
list represented by displaying the relationship between the adjacent elements is
said to be a linear list. Any other list is said to be non-linear. A list can be
implemented by using pointers. Each element is referred to as a node; therefore a
list can be defined as a collection of nodes as shown below:
Fig. 2. Model of a List data structure.
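As a rough sketch of a list built from nodes and pointers, assuming a singly linked list of integers (the names node, push_front, delete_front and list_length are hypothetical):

```c
#include <stdlib.h>

/* A node holds one data item and a pointer to the next node. */
struct node {
    int data;
    struct node *next;
};

/* Inserts a value at the front; returns the new head.
   If allocation fails the list is returned unchanged. */
struct node *push_front(struct node *head, int value)
{
    struct node *n = (struct node *) malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->data = value;
    n->next = head;
    return n;
}

/* Deletes the first node (if any); returns the new head. */
struct node *delete_front(struct node *head)
{
    if (head == NULL)
        return NULL;
    struct node *rest = head->next;
    free(head);
    return rest;
}

/* Counts the nodes by following the next pointers. */
int list_length(const struct node *head)
{
    int count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}
```

Unlike an array, the list grows and shrinks one node at a time, and no existing element has to be moved when a node is inserted or deleted.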
Files and Records
A file is typically a large list that is stored in the external memory (e.g., a
magnetic disk) of a computer.
A record is a collection of information (or data items) about a particular
entity. More specifically, a record is a collection of related data items, each of
which is called a field or attribute, and a file is a collection of similar records.
Although a record is a collection of data items, it differs from a linear array in
the following ways:
(a) A record may be a collection of non-homogeneous data; i.e., the data items
in a record may have different data types.
(b) The data items in a record are indexed by attribute names, so there may not
be a natural ordering of its elements.
MEMORY MANAGEMENT
A memory or store is required in a computer to store programs (or
information or data). Data used by the variables in a program is also loaded into
memory for fast access. A memory is made up of a large number of cells, where
each cell is capable of storing one bit. The cells may be organized as a set of
addressable words, each word storing a sequence of bits. These addressable
memory cells should be managed effectively to increase their utilization. That is,
memory management handles requests for storage (new memory
allocations for a variable or data) and release of storage (freeing the memory)
in the most effective manner. While designing a program, the programmer should
take care to allocate memory when it is required and to deallocate it once its
use is over.
A dynamic data structure provides flexibility in adding,
deleting or rearranging data items at run time. Dynamic memory management
techniques permit us to allocate additional memory space or to release unwanted
space at run time, thus optimizing the use of storage space. The next topic gives
a brief introduction to storage management and to the static as well as dynamic
allocation functions available in C.
MEMORY ALLOCATION IN C
There are two types of memory allocations in C:
1. Static memory allocation or compile-time allocation
2. Dynamic memory allocation or run-time allocation
In static or compile-time memory allocation, the required memory is
allocated to the variables at the beginning of the program. Here the memory to be
allocated is fixed and is determined by the compiler at compile time itself. For
example
int i, j;         // Two bytes per integer variable (2 variables in total)
float a[5], f;    // Four bytes per floating point variable (6 in total)
When the first statement is compiled, two bytes each are allocated for the
variables ‘i’ and ‘j’ (assuming a 2-byte int, as on older 16-bit compilers; on
most current compilers an int occupies 4 bytes). The second statement allocates
20 bytes to the array ‘a’ [5 elements of floating point type, i.e., 5 × 4] and
four bytes for the variable ‘f’.
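Since the byte counts depend on the compiler and platform, they can be checked with the sizeof operator rather than assumed. The sketch below is illustrative; the function names are hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

/* Returns the number of bytes the compiler reserves for float a[5]. */
size_t bytes_in_float_array(void)
{
    float a[5];
    return sizeof a;            /* same as 5 * sizeof(float) */
}

/* Prints the sizes used by the current compiler (values vary by platform). */
void print_sizes(void)
{
    printf("sizeof(int)   = %zu\n", sizeof(int));
    printf("sizeof(float) = %zu\n", sizeof(float));
    printf("float a[5]    = %zu bytes\n", bytes_in_float_array());
}
```

Whatever the platform, the array always occupies exactly 5 times the size of one float, because the size is fixed at compile time.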
But static memory allocation has the following drawbacks.
If you try to read 15 elements of an array whose size is declared as 10,
the first 10 values are read correctly, but the other five reads go past the end
of the array. In C this is undefined behaviour; in practice, whatever unknown
values happen to lie in the adjacent memory are usually read. Similarly, if you
try to assign values to 15 elements of an array whose size is declared as 10,
only the first 10 elements can be assigned safely; the other 5 assignments write
outside the array and may corrupt other data or crash the program.
The second problem with static memory allocation is that if you store fewer
elements than the number of elements for which you have declared
memory, the rest of the memory is wasted. That is, the unused
memory cells are not made available to other applications (or processes
running in parallel with the program); their status remains allocated and not
free. This leads to inefficient use of memory.
Dynamic or run-time memory allocation helps us to overcome this
problem. It makes efficient use of memory by allocating the required amount of
memory whenever it is needed. In most real-world problems, we cannot predict
the memory requirements in advance. Dynamic memory allocation does the job at
run time. C provides the following dynamic allocation and de-allocation functions:
(i) malloc( )
(ii) calloc( )
(iii) realloc( )
(iv) free( )
ALLOCATING A BLOCK OF MEMORY – malloc()
The malloc( ) function is used to allocate a block of memory of a given size
in bytes. It returns a pointer to the allocated block (of type void *), which is
cast to a pointer of the required data type. It is of the form
ptr = (int_type *) malloc (block_size)
Here ‘ptr’ is a pointer of type ‘int_type’ and block_size is the size in bytes
of the memory block to be allocated. For example
ptr = (int *) malloc (10 * sizeof (int));
On execution of this statement, memory space equivalent to 10 times the size
of an ‘int’ is allocated and the address of the first byte is assigned to the
pointer variable ‘ptr’ of type ‘int *’.
Remember that the malloc() function allocates a block of contiguous bytes. The
allocation can fail if the space in the heap is not sufficient to satisfy the request.
If it fails, it returns a NULL pointer. So it is always better to check whether the
memory allocation was successful before using the newly allocated memory
pointer.
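A minimal sketch of checking the result of malloc() before using the memory. The function make_sequence is a hypothetical helper introduced for this example; the caller is assumed to release the block with free().

```c
#include <stdlib.h>

/* Allocates an array of n ints on the heap and fills it with 1..n.
   Returns NULL if n <= 0 or if malloc() cannot satisfy the request.
   The caller is responsible for calling free() on the result. */
int *make_sequence(int n)
{
    if (n <= 0)
        return NULL;

    int *ptr = (int *) malloc((size_t) n * sizeof (int));
    if (ptr == NULL)
        return NULL;            /* allocation failed: do not touch ptr */

    for (int i = 0; i < n; i++)
        ptr[i] = i + 1;
    return ptr;
}
```

The NULL test comes before the first use of ptr, so a failed allocation is reported to the caller instead of being dereferenced.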
ALLOCATING MULTIPLE BLOCKS OF MEMORY – calloc()
The calloc() function works much like the malloc() function, except that
it needs two arguments as against the one argument required by malloc().
While malloc() allocates a single block of memory space,
calloc() allocates multiple blocks of memory, each of the same size, and
then sets all bytes to zero. The general form of calloc(), followed by the
malloc() call that requests the same amount of memory, is
ptr = (int_type *) calloc (n, block_size);
ptr = (int_type *) malloc (n * block_size);
The calloc() statement allocates contiguous space for ‘n’ blocks, each of size
block_size bytes. All bytes are initialized to zero and a pointer to the first byte
of the allocated memory block is returned. If there is not sufficient memory space,
a NULL pointer is returned. For example
ptr = (float *) calloc (25, 4);
ptr = (float *) calloc (25, sizeof (float));
Here, in the first statement, 4 specifies the size in bytes of the data type for
which allocation is to be made (4 bytes for a floating point number) and 25
specifies the number of elements for which allocation is to be made. The second
statement requests the same allocation but obtains the size through sizeof.
Note: The memory allocated using the malloc() function contains garbage
(indeterminate) values, whereas the memory allocated by the calloc() function
is set to zero.
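The zero-initialization can be observed with a small sketch. The helper names below are hypothetical, and int is used as the element type because an all-zero-bytes int is guaranteed to have the value 0.

```c
#include <stdlib.h>

/* Allocates n int counters, all initialized to zero by calloc().
   Returns NULL on failure; the caller frees the result. */
int *make_zeroed_counters(size_t n)
{
    return (int *) calloc(n, sizeof (int));
}

/* Returns 1 if every one of the n elements is zero, otherwise 0. */
int all_zero(const int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != 0)
            return 0;
    return 1;
}
```

With malloc() the same check could not be relied upon, since the contents of the block would be indeterminate until assigned.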
RELEASING THE USED SPACE – free()
Dynamic memory allocation allocates block(s) of memory when required
and deallocates or releases them when no longer in use. It is
our responsibility to release a memory block for future use when it is no longer
needed, using the free() function.
The free() function is used to deallocate memory previously allocated
using the malloc() or calloc() function. The syntax of this function is
free(ptr);
‘ptr’ is a pointer to a memory block which has already been allocated by
the malloc() or calloc() functions. Trying to release an invalid pointer (one not
returned by these functions, or one that has already been freed) results in
undefined behaviour and may crash the program.
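One common defensive habit, shown here only as a sketch, is to set a pointer to NULL immediately after freeing it, so that an accidental second release becomes harmless (free(NULL) is defined to do nothing). The helper name release is hypothetical.

```c
#include <stdlib.h>

/* Frees the block that *pp points to and then clears the caller's pointer,
   so the stale address cannot be used or freed again by mistake. */
void release(int **pp)
{
    if (pp == NULL)
        return;
    free(*pp);      /* free(NULL) is a no-op, so a repeated call is safe */
    *pp = NULL;
}
```

A caller would write release(&ptr) instead of free(ptr); afterwards ptr compares equal to NULL.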
RESIZING A MEMORY BLOCK – realloc()
In some situations, the previously allocated memory is insufficient to run
the current application, i.e., we want to increase the memory space. It is also
possible that the memory allocated is much larger than necessary, i.e., we want
to reduce the memory space. In both cases we want to change the size of the
allocated memory block, and this can be done by the realloc() function. This process
is called reallocation of memory. The syntax of this function is
ptr = realloc(ptr, New_Size)
where ‘ptr’ is a pointer holding the starting address of the allocated
memory block and New_Size is the new size in bytes of the block. If the request
cannot be satisfied, realloc() returns NULL and leaves the original block
unchanged. The following example elaborates the concept of reallocation of
memory.
ptr = (int *) malloc(sizeof (int));
The following two statements are the same (assuming a 2-byte int):
ptr = (int *) realloc(ptr, sizeof (int));
ptr = (int *) realloc(ptr, 2);
The following two statements are the same (assuming a 4-byte float):
ptr = (int *) realloc(ptr, sizeof (float));
ptr = (int *) realloc(ptr, 4);
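Assigning the result of realloc() straight back to the same pointer loses the original block if the call fails. A slightly safer pattern, sketched below with a hypothetical helper resize_int_array, keeps the result in a temporary pointer first.

```c
#include <stdlib.h>

/* Resizes *arr so that it can hold new_count ints.
   Returns 1 on success and 0 on failure; on failure the original
   block is still valid and *arr is left unchanged. */
int resize_int_array(int **arr, size_t new_count)
{
    if (arr == NULL || new_count == 0)
        return 0;

    int *tmp = (int *) realloc(*arr, new_count * sizeof (int));
    if (tmp == NULL)
        return 0;           /* old block untouched, nothing leaked */

    *arr = tmp;             /* the block may have moved to a new address */
    return 1;
}
```

The values already stored in the block are preserved up to the smaller of the old and new sizes; any newly added elements are uninitialized.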
FUNCTION & RECURSION
FUNCTION
Functions
• provide modularity to the software
• divide complex tasks into small manageable tasks
• avoid duplication of work
THE CONCEPT OF STACK
A stack is memory in which values are stored and retrieved in "last in first out"
manner by using operations called push and pop.
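The push and pop operations can be sketched with a fixed-size array. This is only an illustration of the "last in first out" behaviour, not how the hardware call stack is implemented; the capacity and the names are assumptions.

```c
#define STACK_CAPACITY 100

static int stack_items[STACK_CAPACITY];
static int stack_top = 0;          /* number of items currently stored */

/* Stores a value on top of the stack. Returns 1 on success, 0 if full. */
int push(int value)
{
    if (stack_top == STACK_CAPACITY)
        return 0;
    stack_items[stack_top++] = value;
    return 1;
}

/* Removes the most recently pushed value into *out.
   Returns 1 on success, 0 if the stack is empty. */
int pop(int *out)
{
    if (stack_top == 0)
        return 0;
    *out = stack_items[--stack_top];
    return 1;
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1: the last value stored is the first one retrieved.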
THE SEQUENCE OF EXECUTION DURING A FUNCTION CALL
• When a function is called, the current execution is temporarily stopped
and control goes to the called function. After the call, execution
resumes from the point at which it was stopped.
• To get the exact point at which execution is resumed, the address of the
next instruction is stored in the stack. When the function call completes,
the address at the top of the stack is taken.
• Functions or sub-programs are implemented using a stack.
• When a function is called, the address of the next instruction is pushed onto
the stack.
• When the function is finished, the address for execution is taken by using
the pop operation.
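PARAMETER & REFERENCE PASSING
• Passing by value: The function receives a copy, so the caller's variable has
the same value before and after the call.
• Passing by reference: The function works on the caller's variable itself, so
its value may be changed after the function completes.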
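A small sketch of the difference in C. Strictly speaking C always passes arguments by value, so passing by reference is simulated here by passing the address of the variable; the function names are hypothetical.

```c
/* Receives a copy of the argument: the caller's variable is unaffected. */
void increment_by_value(int x)
{
    x = x + 1;
}

/* Receives the address of the caller's variable and changes it in place. */
void increment_by_reference(int *x)
{
    *x = *x + 1;
}
```

With int n = 5, calling increment_by_value(n) leaves n equal to 5, while increment_by_reference(&n) changes n to 6.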
RESOLVING VARIABLE REFERENCES
When a variable name can be resolved to more than one definition, the local
(innermost) definition takes precedence.
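The rule can be seen in a short sketch where the same name, counter (chosen arbitrarily for this example), is defined at global, function and block level.

```c
int counter = 100;                 /* global definition */

/* No local definition exists, so the global one is used. */
int read_global(void)
{
    return counter;
}

/* The local definition hides the global one inside this function. */
int read_local(void)
{
    int counter = 7;
    return counter;
}

/* The innermost block's definition hides both outer ones. */
int read_innermost(void)
{
    int counter = 7;
    {
        int counter = 1;
        return counter;
    }
}
```

The three functions return 100, 7 and 1 respectively, and the global variable itself is never modified.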
RECURSION
A method of programming whereby a function directly or indirectly calls itself.
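A classic illustration of a function calling itself directly is the factorial, sketched below. The example is an assumption of this rewrite rather than one given in the notes.

```c
/* n! = n * (n-1)!, with 0! = 1! = 1 as the terminating condition.
   The result overflows unsigned long for large n, so keep n small. */
unsigned long factorial(unsigned int n)
{
    if (n <= 1)
        return 1;                      /* terminating condition */
    return n * factorial(n - 1);       /* the function calls itself */
}
```

Each call waits for the result of the call below it, so computing factorial(5) places five activations on the stack before any multiplication completes.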
Example: Tower of Hanoi
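The notes do not reproduce the solution here, so the following is a sketch of the standard recursive approach: move n-1 disks to the spare peg, move the largest disk, then move the n-1 disks on top of it. The function name and the peg labels are arbitrary.

```c
#include <stdio.h>

/* Moves n disks from peg `from` to peg `to`, using `via` as the spare peg.
   Prints every move and returns the total number of moves made. */
unsigned long hanoi(int n, char from, char to, char via)
{
    if (n <= 0)
        return 0;                                   /* nothing to move */

    unsigned long moves = hanoi(n - 1, from, via, to);
    printf("Move disk %d from %c to %c\n", n, from, to);
    moves += 1;
    moves += hanoi(n - 1, via, to, from);
    return moves;
}
```

The number of moves for n disks is 2^n - 1, so hanoi(3, 'A', 'C', 'B') makes 7 moves.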
Stack Overheads in Recursion
Two important results:
• the depth of recursion, and
• stack overheads in recursion
Writing a Recursive Function
Recursion enables us to write a program in a natural way. A recursive
program usually runs slower than an equivalent iterative one because of stack
overheads.
In a recursive program you have to specify recursive conditions, terminating
conditions, and recursive expressions.
TYPES OF RECURSION
Following are the types of Recursion:
1) Linear Recursion
2) Tail Recursion
3) Binary Recursion
4) Exponential Recursion
5) Nested Recursion
6) Mutual Recursion
1. Linear Recursion
• A linearly recursive function makes only a single call to itself each time
the function runs.
2. Tail Recursion
• Tail recursion is a form of linear recursion.
• In tail recursion, the recursive call is the last thing the function does.
Often, the value of the recursive call is returned.
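To contrast the two forms, the sketch below shows a linear recursion that is not tail recursive (power) next to one that is (gcd, Euclid's algorithm). Both examples are assumptions chosen for illustration.

```c
/* Linear recursion, not tail recursive: the multiplication is still
   pending after the recursive call returns. */
long power(long base, unsigned int exp)
{
    if (exp == 0)
        return 1;
    return base * power(base, exp - 1);
}

/* Tail recursion: the recursive call is the last action, and its
   value is returned unchanged. */
unsigned int gcd(unsigned int a, unsigned int b)
{
    if (b == 0)
        return a;
    return gcd(b, a % b);
}
```

Because nothing remains to be done after a tail call, some compilers can turn it into a loop, although the C language does not require this.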
3. Binary Recursion
• Some recursive functions don't just have one call to themselves; they
have two (or more).
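The Fibonacci numbers are a common example of a function with two recursive calls. The sketch below uses the naive definition, which is easy to read but slow for large n.

```c
/* fib(0) = 0, fib(1) = 1, fib(n) = fib(n-1) + fib(n-2).
   Each call with n >= 2 makes two further recursive calls. */
unsigned long fib(unsigned int n)
{
    if (n < 2)
        return n;
    return fib(n - 1) + fib(n - 2);
}
```

For instance, fib(10) evaluates to 55.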
4. Exponential Recursion
• An exponential recursive function is one that, if you were to draw
out a representation of all the function calls, would have an
exponential number of calls in relation to the size of the data set
(exponential meaning that if there were n elements, there would be O(a^n)
function calls, where a is a constant greater than 1).
5. Nested Recursion
• In nested recursion, one of the arguments to the recursive function
is itself a recursive call to the same function.
• These functions tend to grow extremely fast.
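The Ackermann function is the usual textbook example of nested recursion; it is sketched here as an illustration and should only be called with very small arguments, since both its value and its recursion depth grow extremely fast.

```c
/* Ackermann function:
   A(0, n) = n + 1
   A(m, 0) = A(m - 1, 1)
   A(m, n) = A(m - 1, A(m, n - 1))   <- the argument is a recursive call */
unsigned long ackermann(unsigned long m, unsigned long n)
{
    if (m == 0)
        return n + 1;
    if (n == 0)
        return ackermann(m - 1, 1);
    return ackermann(m - 1, ackermann(m, n - 1));
}
```

For example, ackermann(2, 3) is 9 and ackermann(3, 3) is 61, while ackermann(4, 2) is already far too large to compute this way.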
6. Mutual Recursion
• A recursive function doesn't necessarily need to call itself.
• Some recursive functions work in pairs or even larger groups. For
example, function A calls function B which calls function C which
in turn calls function A.
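A two-function version of this idea can be sketched with a pair that decides whether a number is even or odd. The pair is deliberately simple and is meant only to show the calling pattern.

```c
int is_odd(unsigned int n);        /* forward declaration: is_even needs it */

/* n is even if n == 0, or if n - 1 is odd. */
int is_even(unsigned int n)
{
    if (n == 0)
        return 1;
    return is_odd(n - 1);
}

/* n is odd if n != 0 and n - 1 is even. */
int is_odd(unsigned int n)
{
    if (n == 0)
        return 0;
    return is_even(n - 1);
}
```

Neither function calls itself directly, yet each call of is_even(10) leads through is_odd(9), is_even(8) and so on down to 0.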
Exercise 1: Draw the pattern using the concept of recursion:
Exercise 2: Convert a number from base 10 to base 2.
Exercise 3: Write a program to compute
S=1+2+3+…+n
using recursion.
Exercise 4: Write a program to print the reverse of a number.
Example: Input n=12345. Print out: 54321.
Exercise 5: Write a recursive function to find the sum of the digits of an integer
number.
Example: n=1980. Sum=1+9+8+0=18.
Exercise 6: Write a recursive function to calculate the sum of the elements of an array.
That is,
S = a[0] + a[1] + … + a[n-1]
where a is an array of integer numbers.
Exercise 7: Write a recursive function to find an element in an array (using the
linear search algorithm).
Exercise 8: Print following pattern using recursion:
Exercise 9: Print following pattern using recursion:
Exercise 10: Print following pattern using recursion:
Exercise 11: Print following pattern using recursion:
Exercise 12: Minesweeper game
**************************