Additional Laboratory in ADA TDDB 57 Data structures and algorithms Fall 2006

September 2006
Getting started
To get started you will need access to the assignment skeletons. These can be found in the course
directory /home/TDDB57/skel/. Copy these files to your home directory with the commands below:
mkdir ~/dalg/
cp -r ~TDDB57/skel/* ~/dalg
After this, all necessary files can be found in your directory. Go to the directory by typing:
cd ~/dalg/
ls
This lab assignment needs some programs from the directory /home/TDDB57/bin and access to an
Ada compiler. You should therefore add these paths with the following commands:
module add ~TDDB57/modules/tddb57
module initadd ~TDDB57/modules/tddb57
After you have passed the laboratory course you can remove the paths:
module initrm ~TDDB57/modules/tddb57
Good luck!
Laboratory 5
Aim. The objective is to learn implementation techniques for heaps and to demonstrate how a heap
can be used for sorting.
Preparation. Read sections 8.1.3, 8.3.1-8.3.3 and 8.3.5 of the course book.
Heapsort
One of the applications of the heap data structure is the sorting algorithm Heapsort. Heapsort has
worst-case time complexity O(n log n), which is better than the O(n^2) worst case of Quicksort (see
footnote 1). Another advantage of Heapsort is that it does not require any additional memory space.
More precisely, the size of the memory used in addition to that needed for storing the input data is
constant (and very small).
One can develop a sorting algorithm based on a priority queue implemented with a heap. The
idea would be to place all elements of the sequence to be sorted in the priority queue, using insert,
and then to iterate the removeMin operation to retrieve them in increasing order. Recall that a heap
is a partially ordered tree with an implicit representation in an array. As described in Section 8.3.3
of the course book, insert places a new element as an additional leaf of a given heap. The resulting
tree may not be a partially ordered tree, since the new node may be smaller than its parent. The
insertion is completed by iteratively swapping the child node that is smaller than its parent with the
parent until a partially ordered tree is obtained.
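The insertion step described above can be sketched as follows. This is only an illustration, not part of the lab skeleton: the names Insert_Demo, Heap, Count and Input, the capacity of 16 and the sample data are all assumptions made for the example. The sketch stores the tree in positions 1..Count of an array, so that the parent of node I is node I / 2.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Insert_Demo is
   Capacity : constant := 16;  -- assumed capacity, for illustration only
   type Heap_Array is array (1 .. Capacity) of Integer;

   Heap  : Heap_Array := (others => 0);
   Count : Natural := 0;  -- number of elements currently in the heap

   --  Sample input, chosen arbitrarily for this sketch.
   Input : constant array (1 .. 6) of Integer := (5, 3, 8, 1, 9, 2);

   --  Place X as a new last leaf, then swap it with its parent for as
   --  long as it is smaller than the parent (minimum in the root).
   procedure Insert (X : Integer) is
      I   : Positive;
      Tmp : Integer;
   begin
      Count := Count + 1;
      Heap (Count) := X;
      I := Count;
      while I > 1 and then Heap (I) < Heap (I / 2) loop
         Tmp := Heap (I);
         Heap (I) := Heap (I / 2);
         Heap (I / 2) := Tmp;
         I := I / 2;
      end loop;
   end Insert;

begin
   for K in Input'Range loop
      Insert (Input (K));
   end loop;

   --  Print the array representation of the resulting heap.
   for K in 1 .. Count loop
      Put (Integer'Image (Heap (K)));
   end loop;
   New_Line;
end Insert_Demo;
```

For this input the array ends up holding 1 3 2 5 9 8: the smallest element is in the root, and no node is smaller than its parent, although the array as a whole is not sorted.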
In this lab we also use a variant of the heap, but we organize sorting in a different way. Our variant
of the heap has its maximal element in the root, and the children of every node are not larger than
this node. All input data are initially placed in an array. The array is seen as an implicit
representation of a tree. We first have to make it into a heap. Then the root of the tree (i.e. the first
element of the array) is the maximal element in the given ordering. We swap it with the last element
of the array and make the remaining part of the array into a heap. Iterating this process places all
elements of the array in non-decreasing order. Thus our heapsort algorithm has the following two
phases:
1. The heapification phase starts with the input data stored in an array. Go through the array in
reverse order and, for each element x, heapify the subtree whose root is at the position of x. Since
we heapify in reverse order, the child subtrees of x are already heapified; for proper
placement of x we can use the procedure shiftdown (see below) which, if necessary, swaps x
with the larger of its children, and subsequently with some further descendants. As a result
of this phase the array becomes a heap with the maximal element placed in the root (i.e. as the
first element of the array).
2. The sorting phase starts with the heap obtained as the result of the first phase. At every step
of the sorting phase the array consists of two parts: a heap placed at the beginning of the array,
followed by a sorted list. Initially the sorted list is empty. Each step of the sorting phase goes
as follows. Let k > 1 be the size of the current heap (the heap is placed in
the first k elements of the array). Do the removeMax operation by swapping the first and the
last element of the heap, followed by a shiftdown operation on
the first element to restore the heap of size (k − 1). This phase is completed when k = 1, in
which case the array contains the initial data sorted in increasing order.
Footnote 1. Despite this fact, Quicksort is often faster than Heapsort in practice.
The Heapsort algorithm outlined above uses the auxiliary procedure described below:
• procedure shiftdown(a: ref array t; l, u: integer)
The array a is an implicit representation of a tree. The indices in the interval l:u thus denote
specific nodes of the tree, as described in the course book, p. 338. Before the call to shiftdown,
every node of the tree, possibly except for the first one (position a[l]), should be greater than or
equal to each of its children. After the execution this property should hold for all nodes of the
interval, including the first one. This is achieved by iteratively swapping the node that violates
this property with its maximal child.
Notice that the section of the array processed by this procedure need not be a heap.
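One possible reading of the two phases and of the shiftdown procedure is sketched below. It is not the reference solution for the lab. The sketch assumes that the array is indexed from 1, so that the children of node I are nodes 2 * I and 2 * I + 1, and it starts the heapification at position A'Last / 2 because all later positions are leaves and are therefore already heaps. The names Heapsort_Sketch, Swap, Is_Sorted and Data, the array size 8 and the sample data are assumptions made for the example.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Heapsort_Sketch is
   Size : constant := 8;  -- small size chosen for this illustration
   type Array_T is array (1 .. Size) of Integer;

   procedure Swap (X, Y : in out Integer) is
      Tmp : constant Integer := X;
   begin
      X := Y;
      Y := Tmp;
   end Swap;

   --  Restore the max-heap property in positions L .. U of A, assuming
   --  that only the node at position L may be smaller than a child.
   procedure Siftdown (A : in out Array_T; L, U : Integer) is
      I     : Integer := L;
      Child : Integer;
   begin
      while 2 * I <= U loop
         Child := 2 * I;
         --  Pick the larger of the two children, if there are two.
         if Child < U and then A (Child + 1) > A (Child) then
            Child := Child + 1;
         end if;
         exit when A (I) >= A (Child);
         Swap (A (I), A (Child));
         I := Child;
      end loop;
   end Siftdown;

   procedure Heapsort (A : in out Array_T) is
   begin
      --  Phase 1: heapify in reverse order; leaves need no work.
      for I in reverse 1 .. A'Last / 2 loop
         Siftdown (A, I, A'Last);
      end loop;
      --  Phase 2: move the maximum behind the heap, then restore
      --  the heap on the remaining K - 1 elements.
      for K in reverse 2 .. A'Last loop
         Swap (A (1), A (K));
         Siftdown (A, 1, K - 1);
      end loop;
   end Heapsort;

   --  Hypothetical helper used only to check the result.
   function Is_Sorted (A : Array_T) return Boolean is
   begin
      for I in A'First .. A'Last - 1 loop
         if A (I) > A (I + 1) then
            return False;
         end if;
      end loop;
      return True;
   end Is_Sorted;

   Data : Array_T := (5, 3, 8, 1, 9, 2, 7, 4);  -- arbitrary sample data
begin
   Heapsort (Data);
   if not Is_Sorted (Data) then
      raise Program_Error;
   end if;
   for I in Data'Range loop
      Put (Integer'Image (Data (I)));
   end loop;
   New_Line;
end Heapsort_Sketch;
```

The only storage used besides the array itself is a few local variables, which matches the constant extra memory mentioned earlier.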
Instead of using shiftdown it is possible to implement shiftup (not done in this lab though) and use it
to create the heap by adding the elements one by one. In this case the added element would be the
last element of the considered section of the array, so a violation of the heap condition would
only be possible at the end of the section, not at the beginning. Otherwise the same requirements as
for shiftdown would apply.
The Assignment. Write a Heapsort program and the procedure shiftdown as described above (the
skeleton names it Siftdown). Analyze the time complexity of shiftdown and of the heapification phase
of Heapsort. What would be the time complexity of heapification of the input data done by
one-by-one insertion using insert and shiftup?
The skeleton for this assignment, hsort.adb, is shown below. Replace the dummy code sections with
your implementation of the procedures. Compile your program with the command
make hsort
The resulting executable file gets the name hsort. Run it and check the output:
> ./hsort < random | checkorder | more
Reporting. Demonstrate your program to the assistant. The report should include the printout of
the code and complexity analysis of both heapification algorithms discussed above.
File hsort.adb
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;
with Ada.Float_Text_IO; use Ada.Float_Text_IO;
with Ada.Real_Time; use Ada.Real_Time;

procedure Hsort is
   -- optimize for speed
   -- comment-out when debugging
   pragma Suppress(All_Checks);

   Size: constant := 1000;
   type Array_T is array(1..Size) of Integer;

   procedure Siftdown(A: in out Array_T; L, U: Integer) is
   begin
      null; -- dummy code
   end;
   pragma Inline(Siftdown);

   procedure Heapsort(A: in out Array_T) is
   begin
      null; -- dummy code
   end;

   procedure Read_Sequence(A: out Array_T) is
   begin
      for I in A'Range loop
         Get(A(I));
      end loop;
   end;

   procedure Write_Sequence(A: Array_T) is
   begin
      for I in A'Range loop
         Put(A(I), 7);
         if I mod 10 = 0 then
            New_Line;
         end if;
      end loop;
   end;

   Iter: constant := 200;
   A: Array_T;
   As: array(1..Iter) of Array_T;
   Start: Time;
   T: Float;
begin
   Read_Sequence(A);
   As(1..Iter) := (others => A);
   Start := Clock;
   for I in 1..Iter loop
      Heapsort(As(I));
   end loop;
   T := Float(To_Duration(Clock - Start));
   Put("Time used in heapsort: ");
   Put(1000.0*T/Float(Iter), Exp => 0, Fore => 3, Aft => 2);
   Put_Line(" ms");
   Write_Sequence(As(1));
end;
Appendix A
Tracing program errors
There are different ways to debug a program:
1. Looking into the code and searching for errors.
2. Adding to the code additional statements for printing values of selected variables and/or the
trace of the parts of the program entered during the test execution.
3. Using a debugger, such as gdb.
Before asking the assistant what is wrong with your code you should first try to locate the problem
with methods 1, 2 and 3, in this order.
The use of a debugger is recommended if the execution in Ada ends abruptly. This is caused by
an exception that is not caught by any exception handler, and the debugger usually makes it easier
to find the cause of such errors.
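Method 2 and the effect of an unhandled exception can be illustrated with a small program. This is a sketch only: the names Trace_Demo, A and Limit are made up for the example, and the out-of-range index is deliberate. It prints a trace line before every assignment and catches the resulting exception in a handler, using Exception_Name and Exception_Information from the standard package Ada.Exceptions. The compiler may warn about the out-of-range access. Note that the index check must be enabled for the exception to be raised, which is why the skeleton's pragma Suppress(All_Checks) should be commented out when debugging.

```ada
with Ada.Text_IO;    use Ada.Text_IO;
with Ada.Exceptions; use Ada.Exceptions;

procedure Trace_Demo is
   type Array_T is array (1 .. 5) of Integer;
   A     : Array_T := (others => 0);
   Limit : Integer := 6;  -- deliberately one past the end of A
begin
   for I in 1 .. Limit loop
      --  Trace output: shows how far the loop got before the error.
      Put_Line ("Trace: about to assign A (" & Integer'Image (I) & ")");
      A (I) := I;  -- the index check fails when I = 6
   end loop;
   Put_Line ("Trace: loop finished");  -- not reached
exception
   when E : others =>
      --  Without this handler the program would end abruptly.
      Put_Line ("Caught " & Exception_Name (E));
      Put_Line (Exception_Information (E));
end Trace_Demo;
```

The last trace line printed before the handler's message shows which iteration failed, which is often enough to locate an index error without a debugger.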
The debugger can also be used for tracing the execution step-by-step (for more details see Section
A).
Using the gdb debugger
Gdb is primarily designed for C++ and its use for Ada can sometimes be more difficult. To be able
to use the debugger properly, compilation has to be done using a special flag which causes
additional information to be stored in the compiled program.
Use the flag -g after gnatmake. For the first use of -g one should also add -f for recompilation of
all files, e.g.:
gnatmake -f -g program
Start the debugger gdb in emacs by typing: M-x gdb, then state the name of the executable program
as an argument to gdb.
Start the program by typing run and possible parameters, e.g.:
(gdb) run arg1 arg2
which corresponds to execution of program arg1 arg2 in a shell.
Almost all commands in gdb have an abbreviation, e.g. run can be shortened to r. In the sequel all
abbreviations are written in parentheses.
To halt execution when an exception occurs, a breakpoint has to be set. This is done with the
command break exception (b exception) before giving the command run (r). When the execution
of a program halts, gdb prints out information about the line and the function that was last
executed. If gdb is run inside emacs the window is split into two parts, where the lower one shows
the program code with an arrow => pointing out the last executed line.
To go to the last executed line type: up RET RET.
Use the command backtrace (bt) to look at the whole call-chain.
The most common error is a memory access violation, which is often caused by an illegal value of a
pointer or array index. To check the value of a variable or an expression use the command print
(p). This is illustrated by the following example, where i is an integer variable, e.g. an index of an
array:
(gdb) p i
$1 = 5627
The value of i is in this case 5627, and the $1 part is a numbered variable that gdb generates for
every print command.
To access variables used in earlier calls use the command up. To go back, use the command down
(do). There is also a possibility to refer to a variable used in a particular function. This is done using
::. For example the command p main::a will display on the screen the value of a variable in main.
The values of all local variables of a function can be printed with the command info locals (i lo).
Similarly, the values of all arguments of a function can be obtained by info args (i arg).
Any change in the program requires recompilation before using the debug tool again. To restart the
program in the debugger type run (r) again; the arguments used last time will be reused
automatically. Use the command set confirm off to remove the question "Do you really want to
restart?" each time the run command is used.
To apply a new set of arguments use set args command, e.g.
set args arg1 arg2 ...
To exit the debugger, close the gdb buffer, return to the code window and remove the upper emacs
window with C-x k RET C-x o C-x 1.
Stepwise execution
If the examination of variable values at the selected program point is not sufficient to locate the
error, one has to look more carefully at the steps of execution at the program points where things
are suspected to go wrong. For this one has to halt the execution at these points by setting
so-called breakpoints. Breakpoints are often set at the entrance to a function or at a certain line
of the code. This is done by the command break (b) with the argument being a code line number
or a function name. For example, to halt the execution directly after start use:
(gdb) b main
If there are several functions with the same name, a list from which you can select one of them will
appear.
Once the breakpoints are in position, use run (r) to start the execution of the program. The
execution will halt at the first breakpoint reached. Now one can go stepwise through the code using
the commands step (s) or next (n). There is a fundamental difference between the two commands:
step will go into a function whenever there is a function call, while next will not do that. Simply
pressing the return key repeats the last command.
Another useful command is until (u) with a line number or function name as an (optional) argument.
It causes the execution to continue to the program point indicated by the argument, but only within
the body of the function. If no argument is given the execution goes to the next program line, which
is often used for quick loop traversal.
The command clear (cl) is used to remove a breakpoint. All breakpoints can be listed by the
command info break (i b) which also assigns numbers to the listed breakpoints. A breakpoint can
also be removed by the command delete (d) where the argument is the breakpoint listing number.
There are many more features and commands in gdb not discussed here. For details see e.g. the
manuals available on the Internet.
Program versions
This Appendix concerns the use of gnat-3.12p and gnat-patched gdb-4.17.