
Answers Through Day 19

Day 2 : March 13th
Can you explain why the while loop in Binary Array Search is guaranteed to terminate? That is,
how can you be sure that low will eventually be greater than high?
Here is the code for Binary Array Search as a refresher:
boolean contains(int[] collection, int target) {
    int low = 0;
    int high = collection.length - 1;
    while (low <= high) {
        int mid = (low + high) / 2;
        int rc = collection[mid] - target;
        if (rc < 0) {
            low = mid + 1;
        } else if (rc > 0) {
            high = mid - 1;
        } else {
            return true;
        }
    }
    return false;
}
If the collection is empty, then its length is 0 and high is set to -1, and the while loop never executes. So
we have a non-empty collection with at least 1 element.
Mid is computed within the while loop, when low <= high; if this is the case, then low <= mid <= high
because of the computation. Now, if the value is ever found within the while loop, then true is returned.
But what if the value is not found? Observe that low only increases in value (i.e., low = mid + 1) and high
only decreases in value (i.e., high = mid – 1) with each pass through the while loop, thus ensuring
forward progress with each iteration. Low and high will converge on each other, and eventually low will
exceed high (which is the same thing as saying high will be lower than low).
Eventually (after no more than floor(log N) + 1 iterations), you will have either located the value or
determined it doesn't exist.
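To see this bound concretely, here is a small sketch (my own code, not from the lecture) that counts loop iterations; for N = 1000 the worst case is floor(log2 1000) + 1 = 10:

```java
// A sketch (not from the notes) that counts iterations of Binary Array
// Search to check the floor(log2 N) + 1 bound.
public class SearchSteps {
    // Returns the number of loop iterations before the search terminates.
    static int iterations(int[] collection, int target) {
        int low = 0, high = collection.length - 1, count = 0;
        while (low <= high) {
            count++;
            int mid = (low + high) / 2;
            if (collection[mid] < target)      low = mid + 1;
            else if (collection[mid] > target) high = mid - 1;
            else                               return count;
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        int worst = 0;
        for (int t = -1; t <= n; t++)       // include misses on both ends
            worst = Math.max(worst, iterations(a, t));
        System.out.println(worst);          // never exceeds floor(log2 1000) + 1 = 10
    }
}
```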
Day 3 : March 14th
What is the fewest number of comparisons to determine both the largest and the smallest integer
in an array of size n (with index positions 0 to n-1).
Can you explain your reasoning?
First assume that N >= 2; otherwise the problem has no solution.
Use 1 comparison to determine the larger of ar[0] and ar[1]; set the larger to max and the smaller to
min. There are now n-2 elements remaining to be checked.
In the best case, each of the subsequent n-2 elements proves to be greater than max, so you simply
replace max each time and never need a second comparison. Thus, in the best case, you need n - 2 further
comparisons.
However, in the worst case, an element never exceeds max and must always also be compared against min.
This means 2(n - 2) possible comparisons. All told, this means 2n - 4 + 1 (don't forget the first
comparison we made), resulting in 2n - 3 in total.
To double check, if n=2, then this is just 1 (as expected). If n=3, then this is 3, which confirms 1 for the
first and 2 more in the worst case.
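The counting argument can be checked with a short sketch (my own instrumented version; the less() counter and method names are assumptions, not lecture code):

```java
// A sketch (not from the notes) that counts comparisons while finding both
// the largest and smallest value, confirming the 2n - 3 worst case.
public class MinMaxCount {
    static int comparisons;   // incremented by every value comparison

    static boolean less(int x, int y) { comparisons++; return x < y; }

    // Returns {min, max} of ar, which must have at least two elements.
    static int[] minMax(int[] ar) {
        comparisons = 0;
        int min, max;
        if (less(ar[0], ar[1])) { min = ar[0]; max = ar[1]; }   // 1 comparison
        else                    { min = ar[1]; max = ar[0]; }
        for (int i = 2; i < ar.length; i++) {
            if (less(max, ar[i])) max = ar[i];         // best case stops here
            else if (less(ar[i], min)) min = ar[i];    // worst case: a 2nd comparison
        }
        return new int[] { min, max };
    }

    public static void main(String[] args) {
        minMax(new int[] {5, 4, 3, 2, 1});   // strictly decreasing: worst case
        System.out.println(comparisons);     // 2*5 - 3 = 7
    }
}
```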
Day 4 : March 15th
You are given a stack with five integer values. You are asked to modify this stack so all of its
values are reversed (thus, the topmost element becomes the bottom-most element, and vice versa).
In Java, this method might look like:
public void reverseFive(Stack<Integer> stack) { ... }
For example, given the stack [1, 3, 4, 5, 7], this function should alter the stack's contents so it
becomes [7, 5, 4, 3, 1].
Describe in English how you would accomplish this.
Since you know in advance you have five values, just pop them off one at a time into variables, let’s call
these A, B, C, D, and E.
Then simply push them back in the order you popped them, namely, push A, then B, C, D, and E. Since A
came off the top, pushing it back first puts it on the bottom, which reverses the stack.
Since you know in advance the total number, there is no need for anything more complicated.
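Here is a minimal sketch of that approach in Java (my own code; only the method signature comes from the question):

```java
import java.util.Stack;

// A minimal sketch of the pop-into-variables approach described above.
public class ReverseFive {
    public static void reverseFive(Stack<Integer> stack) {
        // Pop the five known values off the top...
        int a = stack.pop(), b = stack.pop(), c = stack.pop(),
            d = stack.pop(), e = stack.pop();
        // ...and push them back in the order they came off; the old top (a)
        // becomes the new bottom, reversing the stack.
        stack.push(a); stack.push(b); stack.push(c);
        stack.push(d); stack.push(e);
    }

    public static void main(String[] args) {
        Stack<Integer> s = new Stack<>();
        for (int v : new int[] {1, 3, 4, 5, 7}) s.push(v);
        reverseFive(s);
        System.out.println(s);   // bottom-to-top: [7, 5, 4, 3, 1]
    }
}
```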
Day 5 : March 18th
The CircularBufferQueue stores the following pieces of information:
Item[] a -- An array of items whose initial size is determined when constructed
int N -- the number of items in the queue
int first -- the starting index of the queue
int last -- the ending index of the queue
It is possible to eliminate the last attribute in CircularBufferQueue by recognizing the
relationship between the value of last and the other pieces of information listed above.
Can you complete the following expression using just the other pieces of information?
last = ....
Since N is the number of elements in the queue, you might start by assuming last = first + N; however, this
won't work once first + N exceeds the array length, so you need last = (first + N) % a.length.
I have provided sample code (in day05) showing how this would look in practice. Note it doesn’t make
the code any less efficient (or more efficient); just one more thing to compute instead of store.
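A sketch of how the derived last would look in practice (my own minimal version, not the actual day05 sample; the method names are assumptions):

```java
// A sketch (names assumed, not the day05 sample itself) showing last
// computed from first, N, and a.length instead of being stored.
public class CircularBufferQueue<Item> {
    private Item[] a;     // the circular buffer
    private int N;        // number of items in the queue
    private int first;    // index of the front item

    @SuppressWarnings("unchecked")
    public CircularBufferQueue(int capacity) {
        a = (Item[]) new Object[capacity];
    }

    // last is derived on demand: the slot just past the final item.
    private int last() { return (first + N) % a.length; }

    public void enqueue(Item item) {
        if (N == a.length) throw new IllegalStateException("full");
        a[last()] = item;          // wraps around when first + N >= a.length
        N++;
    }

    public Item dequeue() {
        if (N == 0) throw new IllegalStateException("empty");
        Item item = a[first];
        a[first] = null;           // let the garbage collector reclaim it
        first = (first + 1) % a.length;
        N--;
        return item;
    }

    public int size() { return N; }
}
```

Because last() is one modulo operation, computing it on demand costs essentially the same as reading a stored field.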
Day 6 : March 19th
Each sorting algorithm has a distinct behavior when it comes to less (comparing two values) and
exch (exchanging the location of two values in an array).
Assuming that you are using selection sort to sort an array of N elements, how many times do
you exchange two values (using exch)? How many times do you compare two values (using less)?
In selection sort, iteration i finds the minimum of the remaining N - i elements, repeatedly calling less
on each of them. We have to subtract 1 because we assume A[i] is the minimum until proven otherwise,
giving N - i - 1 comparisons. We sum (N-1) + (N-2) + ... + 2 + 1, which is T(N-1), or N*(N-1)/2. To double
check, for N=4 values, we add 3 + 2 + 1 = 6, which equals 4*3/2.
Note: even when A[i] is already the minimum, we still call exch (a, i, min) which will have no effect. But
this is far better than wasting time checking (each time) whether min != i. We thus call exch N times.
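To confirm both counts, here is a sketch with counters added (the instrumented less/exch and class name are mine, not lecture code):

```java
// A sketch (counters added by me) confirming the N(N-1)/2 comparison count
// and the N exchange count for selection sort.
public class SelectionCount {
    static int lessCalls, exchCalls;

    static boolean less(int[] a, int i, int j) { lessCalls++; return a[i] < a[j]; }

    static void exch(int[] a, int i, int j) {
        exchCalls++;
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    static void sort(int[] a) {
        lessCalls = exchCalls = 0;
        int n = a.length;
        for (int i = 0; i < n; i++) {
            int min = i;
            for (int j = i + 1; j < n; j++)
                if (less(a, j, min)) min = j;
            exch(a, i, min);   // called even when min == i, as noted above
        }
    }

    public static void main(String[] args) {
        sort(new int[] {4, 1, 3, 2});
        System.out.println(lessCalls + " compares, " + exchCalls + " exchanges");
        // N=4: 4*3/2 = 6 compares, 4 exchanges
    }
}
```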
Day 7 : March 21st
MergeSort is invoked by the method signature
void sort(Comparable[] a)
But you also know that MergeSort is a recursive procedure.
Explain why you need to define a recursive helper method for mergesort (something we didn't
need to do for SelectionSort and InsertionSort).
Now every recursive method needs a base case that determines the stopping point of the
recursion. Define this base case.
Neither Selection Sort nor Insertion Sort needed a helper method, since they both strictly used for loops
to process their iterations.
However, MergeSort is recursive, and simply by the above method signature, there is no way to signal
that you are only attempting to sort a smaller part of the original array. So the helper method needs to
identify not just the size of the subproblem, but its boundaries, thus:
void sort(Comparable[] a, int lo, int hi)
The recursion proceeds by dividing a list into two (more-or-less) equal sublists. If N is a power of 2, then
you can stop when you get to a single element (i.e., lo = hi). However, to be safe, we have to allow for
situations when N is odd, which might leave us with an empty list. Thus the base case is:
if (hi <= lo) return;
which covers both the case when lo == hi and the case when hi < lo.
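Putting the helper and its base case together, a sketch might look like this (the merge implementation here is my own straightforward version, not necessarily the lecture's):

```java
import java.util.Arrays;

// A sketch of the recursive helper with its base case; merge() is a plain
// top-down merge, not necessarily the version shown in class.
public class Merge {
    public static void sort(Comparable[] a) {
        sort(a, 0, a.length - 1);   // public entry point delegates to the helper
    }

    private static void sort(Comparable[] a, int lo, int hi) {
        if (hi <= lo) return;       // base case: zero or one element
        int mid = lo + (hi - lo) / 2;
        sort(a, lo, mid);
        sort(a, mid + 1, hi);
        merge(a, lo, mid, hi);
    }

    @SuppressWarnings("unchecked")
    private static void merge(Comparable[] a, int lo, int mid, int hi) {
        Comparable[] aux = Arrays.copyOfRange(a, lo, hi + 1);  // aux index = global - lo
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)                                a[k] = aux[j++ - lo];
            else if (j > hi)                                 a[k] = aux[i++ - lo];
            else if (aux[j - lo].compareTo(aux[i - lo]) < 0) a[k] = aux[j++ - lo];
            else                                             a[k] = aux[i++ - lo];
        }
    }

    public static void main(String[] args) {
        Comparable[] a = {5, 3, 8, 1, 9, 2};
        sort(a);
        System.out.println(Arrays.toString(a));   // [1, 2, 3, 5, 8, 9]
    }
}
```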
Day 8 : March 21st
Using the partition logic that was presented in class today, what is the result of partition(a, lo, hi) when
a[lo] is smaller than every other value in a[lo+1 .. hi]?
Partition makes two individual sweeps, one from (lo+1) to the right (searching for a value that is greater
than or equal to a[lo]) and one from hi to the left (searching for a value that is less than or equal to a[lo]).
Since a[lo] is smaller than all other values, the rightward sweep stops immediately at lo+1, while the
leftward sweep runs all the way down to lo, since every element a[X] is greater than a[lo].
The pointers cross right away, so nothing is exchanged (the pivot is swapped with itself) and partition
returns lo. The problem has not been cut in half, but only advanced by 1. That is, the next sort will be
called on sort(a, lo+1, hi).
Day 9 : March 25th
Quicksort has the following structure:
public static void sort(Comparable[] a) {
    sort(a, 0, a.length - 1);
}

// quicksort the subarray from a[lo] to a[hi]
private static void sort(Comparable[] a, int lo, int hi) {
    if (hi <= lo) return;
    int j = partition(a, lo, hi);
    sort(a, lo, j-1);
    sort(a, j+1, hi);
}
Explain what the partition function does.
The job of partition is to select an element of a[lo .. hi] and place it into its properly-sorted location
a[j], returning j. It does this by rearranging the elements of the sub-arrays a[lo .. j-1] and a[j+1 .. hi]
such that all elements in a[lo .. j-1] are <= a[j], and a[j] is <= all elements in a[j+1 .. hi].
Note that partition doesn’t sort these sub-arrays.
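For reference, here is one common implementation of partition (a Sedgewick-style sketch using a[lo] as the pivot; the version from class may differ in details):

```java
// One common implementation of partition (Sedgewick-style, pivot a[lo]);
// a sketch for reference, not necessarily the lecture version.
public class Partition {
    static int partition(Comparable[] a, int lo, int hi) {
        Comparable v = a[lo];                  // the partitioning element
        int i = lo, j = hi + 1;
        while (true) {
            while (a[++i].compareTo(v) < 0)    // sweep right for an item >= v
                if (i == hi) break;
            while (v.compareTo(a[--j]) < 0)    // sweep left for an item <= v
                if (j == lo) break;
            if (i >= j) break;                 // pointers crossed
            exch(a, i, j);
        }
        exch(a, lo, j);                        // put the pivot into place
        return j;
    }

    static void exch(Comparable[] a, int i, int j) {
        Comparable t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        Comparable[] a = {4, 9, 6, 5, 8};
        // The pivot 4 is already the smallest, so partition returns lo = 0.
        System.out.println(partition(a, 0, a.length - 1));   // prints 0
    }
}
```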
Day 10 : March 26th
The heap data structure is maintained using two operations, swim and sink. Heapsort only uses the sink
operator. Why doesn't it need to use swim?
First recognize that the Heap being maintained is a MaxHeap, which means each element is greater than
or equal to either of its (potential) two children. Now, HeapSort works by swapping this value with the
last value in the heap, and then reducing the heap size by 1. Thus the only value now out of place is the
newly appointed root, which might be smaller than one of its children, and we have to sink it to its
proper place.
Swim is only used when a value has increased in size (for example, when a new value is inserted at the
bottom of the heap), and that never happens during HeapSort.
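Here is a sketch of sink using the usual 1-based array convention pq[1..n] (the code itself is my own, since the notes don't show it):

```java
// A sketch of sink on a 1-based MaxHeap array (pq[1..n] convention);
// my own code, since the notes don't show the lecture version.
public class Sink {
    // Restore heap order by sinking pq[k] below its larger children.
    static void sink(Comparable[] pq, int k, int n) {
        while (2 * k <= n) {
            int j = 2 * k;                                    // left child
            if (j < n && pq[j].compareTo(pq[j + 1]) < 0) j++; // pick the larger child
            if (pq[k].compareTo(pq[j]) >= 0) break;           // heap order holds
            Comparable t = pq[k]; pq[k] = pq[j]; pq[j] = t;   // swap downward
            k = j;
        }
    }

    public static void main(String[] args) {
        // pq[0] is unused; root 2 is out of place in a MaxHeap over {9, 7, 2}
        Comparable[] pq = {null, 2, 9, 7};
        sink(pq, 1, 3);
        System.out.println(java.util.Arrays.toString(pq));   // [null, 9, 2, 7]
    }
}
```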
Day 11 : March 26th
In the delete(Key key) method for the Symbol table, why is it so important to validate that 'first == null'
as the initial statement in this operation? Is this a special case that is being handled, and if so, what is
that special case?
As it turns out, this is not really required. The special case being handled is deleting a key value from an
empty linked list.
public void delete(Key key) {
    if (first == null) { return; }

    Node prev = null;
    Node n = first;
    while (n != null) {
        if (key.equals(n.key)) {
            if (prev == null) {
                first = n.next;          // no previous? Must have been first
            } else {
                prev.next = n.next;      // have previous one link around
            }
            return;
        }
        prev = n;
        n = n.next;                      // don't forget to update!
    }
}
So while the above code double-checks for an empty list, there is really no need: if first is null, then n
starts out as null and the while loop never executes.
Day 12 : March 26th
In a Separate Chaining Hash Symbol Table as discussed today, a hash table partitions the keys
into M distinct bins or buckets, each of which is stored as a SequentialSearchST.
If you know in advance that you will be storing 1000 items in a Hashtable, what happens if you
simply instantiate SeparateChainingHashST(1000) to make sure there is room enough for all
items. Will this guarantee that each get(key) request performs in constant time? Explain why or
why not.
So I was a bit vague in my question. Let me break this up into two parts.
If you know you will have 1000 items, this tells you nothing about the distribution of these items into
the M buckets. As an example of a worst case, what if your hashCode method assigns the same bin to all
items? Then it doesn’t matter how many bins you have.
However, if you have a reasonable hash function, then starting with a larger number of bins simply
means that the hashtable will not be required to rehash all elements (which saves time). If the hash
function is good, then the performance should also be good, although it wastes some space. As further
evidence, go back and modify the sample SeparateChainingHashST to initialize with 321165 indices.
SeparateChainingHashST<String, String> table = new SeparateChainingHashST<String, String>(321165);
When you run, the following is output:
Table has 321165 indices.
there are 118063 empty indices 63.23914498777887%
maximum chain is 8
number of single is 118426
size distribution:[118063, 118426, 58847, 19703, 4912, 1021, 169, 23, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Empty Table has 9 indices.
Observe how many more empty bins there are! However, the end result is that the worst chain has only 8
nodes in it, and more than a third of the keys are in a chain by themselves (which is optimal).
Here is the original output by comparison
Table has 49151 indices.
there are 79 empty indices 99.8392708184981%
maximum chain is 19
number of single is 473
size distribution:[79, 473, 1589, 3277, 5369, 7003, 7757, 7314, 5907, 4217, 2757,
1704, 865, 472, 198, 109, 40, 14, 5, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Day 13 : April 1st
What happens to the behavior of get(key) in SeparateChainingHashST if the hash function always
returns 0 for every key?
Then you simply get a single linear linked list, which means the performance to retrieve a key will not
be amortized O(1), but rather O(n).
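A tiny sketch (my own minimal chaining table, not the textbook class) makes the degeneration visible: with a constant hash function, every key lands in bucket 0, and get() becomes a linear scan of one long chain.

```java
// A minimal separate-chaining sketch (my own, not SeparateChainingHashST)
// with a pathological hash that sends every key to bucket 0.
public class DegenerateHash {
    static class Node {
        String key; int val; Node next;
        Node(String key, int val, Node next) { this.key = key; this.val = val; this.next = next; }
    }

    static final int M = 97;                    // number of buckets
    static Node[] buckets = new Node[M];

    static int hash(String key) { return 0; }   // the pathological hash

    static void put(String key, int val) {
        int h = hash(key);
        buckets[h] = new Node(key, val, buckets[h]);
    }

    // Worst case walks the entire chain: O(n), not amortized O(1).
    static int get(String key) {
        for (Node n = buckets[hash(key)]; n != null; n = n.next)
            if (n.key.equals(key)) return n.val;
        throw new java.util.NoSuchElementException(key);
    }

    static int chainLength(int b) {
        int len = 0;
        for (Node n = buckets[b]; n != null; n = n.next) len++;
        return len;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) put("key" + i, i);
        System.out.println(chainLength(0));   // all 1000 keys in a single chain
    }
}
```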
Day 15 : April 4th
In what order would you insert the following values so you end up with a binary tree whose
height is 2?
There are many answers to this question. Since we want a tree of height 2, the root can have no more
than two levels below it. Start with 'O' as the root, since this letter is the median of the seven letters.
Then the three smaller letters are "EMN" and the three larger are "RST". Essentially repeat this process
for each of these sublists.
Here are two possible orderings to insert
#1: O M S E N R T
#2: O M E N S T R
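A sketch verifying ordering #1 (the minimal BST insert and height helpers below are my own, not class code; height is counted in edges, so a single node has height 0):

```java
// A sketch (my own minimal unbalanced BST) verifying that insertion
// order #1, "O M S E N R T", yields a tree of height 2.
public class HeightCheck {
    static class Node {
        char key; Node left, right;
        Node(char key) { this.key = key; }
    }

    static Node insert(Node n, char key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;
    }

    // Height in edges: null is -1, a single node is 0.
    static int height(Node n) {
        if (n == null) return -1;
        return 1 + Math.max(height(n.left), height(n.right));
    }

    public static void main(String[] args) {
        Node root = null;
        for (char c : "OMSENRT".toCharArray()) root = insert(root, c);
        System.out.println(height(root));   // 2
    }
}
```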
Day 16 : April 5th
Finding the minimum (or maximum) value in a collection is a fundamental operation. In lecture
we used the tilde approximation to define the following performance expectations:
Sorted Array: Constant Time or "1"
BST : Logarithmic Time or "~log N"
ChainST: Linear Time or "N"
Explain why these are the expected performance behaviors for these three arrangements.
One can find the min (or max) in a sorted array by simply inspecting the contents of a[0] or a[a.length-1]
given a sorted array. Since this is just one operation, we can safely claim constant time, or ~1
If one stores all values in a ChainST symbol table, then you have to look at all elements in the table, since
they are unordered. You have to inspect all N keys, which means linear time or ~N.
For the Binary Search Tree, you still have to assume a reasonable distribution of values inserted into the
tree, which would lead to a tree that is "more or less balanced". If this is the case, then the longest path
in the tree is the worst case to find a value in the tree. In balanced trees, the height of a BST is
proportional to log (N) where N is the number of keys in the tree. To find the minimum, follow left
branches until you run out; in the worst case, this is the height of the tree (or ~log N). Similar logic
applies to finding the maximum value.
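The three strategies can be sketched side by side (my own minimal stand-ins for the sorted array, BST, and chain; none of this is the lecture code):

```java
// A sketch of the three min-finding strategies; the BST node and chain
// types are minimal stand-ins, not the lecture classes.
public class FindMin {
    // Sorted array: one inspection, ~1.
    static int minSorted(int[] sorted) { return sorted[0]; }

    static class Node {
        int key; Node left, right;
        Node(int key) { this.key = key; }
    }

    // BST: follow left links until they run out; ~log N when balanced.
    static int minBST(Node root) {
        Node n = root;
        while (n.left != null) n = n.left;
        return n.key;
    }

    // Unordered chain: must inspect every key, ~N.
    static int minChain(java.util.LinkedList<Integer> chain) {
        int min = chain.getFirst();
        for (int k : chain) if (k < min) min = k;
        return min;
    }

    public static void main(String[] args) {
        Node root = new Node(8);
        root.left = new Node(3);
        root.left.left = new Node(1);
        root.right = new Node(10);
        System.out.println(minBST(root));   // 1
    }
}
```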
Day 17 : April 8th
In class today someone asked if the following would produce the values in reverse order:
// invoke a reverse in-order traversal of the tree
public void revorder() { revorder(root); }

private void revorder(Node n) {
    if (n == null) { return; }
    revorder(n.right);
    StdOut.println(n.key + "=" + n.val);
    revorder(n.left);
}
It turns out this WORKS! Clearly I wasn't able to think clearly on my feet on Monday Morning.
So please verify with a small example by inserting the following into a tree and demonstrating the output.
The resulting tree is a balanced BST of height 2: the leaves are 1, 6, 10, and 14, and the right subtree is
rooted at 11, with 10 and 14 as its children.
When you hand-execute the logic above, you can see that ‘14’ is output first, and then as the recursion
unwinds backwards, 11 comes next, followed by the reverse traversal of the subtree rooted at 10; thus 10 is
output next, and so on until 1, the smallest key, is output last.
When I coded this in class, I must have made a typo in the method.
Day 18 : April 9th
Is the following a true statement:
An AVL Binary Tree ensures that for every node in the tree, the number of descendant nodes in
the left subtree is the same as the number of descendant nodes in the right subtree.
If not, provide a counter example. If it is, explain why it is true always.
The AVL tree only guarantees that the height differential of a left-subtree and a right-subtree is either
–1 , 0, or +1.
This is related to the number of descendants but has more to do with structure. Thus the following is a
balanced AVL tree, even though the root has a left sub-tree with three descendants and a right sub-tree
with just one.
    ?
   / \
  5   12
 / \
2   7
Day 19 : April 11th
An undirected graph contains N vertices and has E edges. You are told that the graph has no
cycles but it is connected (that is, there exists a path between any two vertices in the graph).
What is the fewest number of edges in this graph?
Now answer the same question if you know that a cycle exists. That is, what is the fewest
number of edges in an undirected graph which is allowed to contain a cycle?
If an undirected graph is connected then it must be possible to traverse from any vertex to any other
vertex using the edges in the graph. Let’s assume N > 1 so we have at least one edge. When N=2 then
only 1 edge is needed (and there is no cycle). As each new vertex is added to the graph, you only need
one additional edge to connect it so all vertices remain connected. Thus the minimum value of E is N-1.
If a cycle exists, there must be some path of edges which starts and ends at the same vertex. Imagine
putting all N vertices on paper in the shape of a circle, and then adding an edge between neighboring
vertices. You can see that a single large cycle results, and the removal of any edge breaks the cycle. You
can see that you need N edges to ensure full connectivity while retaining a cycle. The minimum value of
E is N.
Another approach for the graph with a cycle is to start with a single smallest cycle of three vertices
(and three edges among them). Then add the N-3 remaining vertices as a linear chain, each with one edge
connecting back. Counting edges, there are 3 + (N-3) = N, so this again gives the fewest number of edges.