Day 2 : March 13th

Can you explain why the while loop in Binary Array Search is guaranteed to terminate? That is, how can you be sure that low will eventually be greater than high? Here is the code for Binary Array Search as a refresher:

boolean contains(int[] collection, int target) {
    int low = 0;
    int high = collection.length - 1;
    while (low <= high) {
        int mid = (low + high) / 2;
        int rc = collection[mid] - target;
        if (rc < 0) {
            low = mid + 1;
        } else if (rc > 0) {
            high = mid - 1;
        } else {
            return true;
        }
    }
    return false;
}

Answer: If the collection is empty, then its length is 0, high is set to -1, and the while loop never executes. So assume a non-empty collection with at least one element. Mid is computed within the while loop only when low <= high; if this is the case, then low <= mid <= high because of the computation. Now, if the value is ever found within the while loop, then true is returned. But what if the value is not found? Observe that low only increases in value (i.e., low = mid + 1) and high only decreases in value (i.e., high = mid - 1) with each pass through the while loop, thus ensuring forward progress with each iteration. Low and high converge on each other, and eventually low will exceed high (which is the same thing as saying high will be lower than low). Eventually (after no more than floor(log N) + 1 iterations) you will have either located the value or determined it doesn't exist.

Day 3 : March 14th

What is the fewest number of comparisons needed to determine both the largest and the smallest integer in an array of size n (with index positions 0 to n-1)? Can you explain your reasoning?

Answer: First assume that n >= 2, otherwise the problem has no solution. Use 1 comparison to determine the larger of ar[0] and ar[1] (set it to max), and the smaller becomes min. There are now n-2 elements remaining to be checked. In the best case, each of the subsequent n-2 elements proves to be greater than max, so a single comparison per element suffices to update max.
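The scan just described can be sketched in Java. This is a minimal illustration (the class and variable names are my own, not from the course code) that also counts comparisons to confirm the best and worst cases:

```java
// Sketch: find both max and min of an array, counting element comparisons.
// Best case: each element exceeds the running max (n-1 total comparisons).
// Worst case: each element must also be checked against min (2n-3 total).
public class MaxMin {
    static void run(int[] ar) {
        int comparisons = 1;                    // the initial ar[0] vs ar[1] comparison
        int max, min;
        if (ar[0] > ar[1]) { max = ar[0]; min = ar[1]; }
        else               { max = ar[1]; min = ar[0]; }
        for (int i = 2; i < ar.length; i++) {
            comparisons++;
            if (ar[i] > max) {
                max = ar[i];                    // one comparison suffices
            } else {
                comparisons++;                  // second comparison against min needed
                if (ar[i] < min) min = ar[i];
            }
        }
        System.out.println("max=" + max + " min=" + min + " comparisons=" + comparisons);
    }

    public static void main(String[] args) {
        run(new int[] {1, 2, 3, 4, 5});   // best case:  n-1  = 4 comparisons
        run(new int[] {5, 4, 3, 2, 1});   // worst case: 2n-3 = 7 comparisons
    }
}
```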
Thus, in the best case, you need n - 2 further comparisons. However, in the worst case, each element fails the comparison against max and must also be compared against min. This means 2 * (n - 2) comparisons. All told, this means 2n - 4 + 1 (don't forget the first comparison we made), resulting in 2n - 3 in total. To double-check: if n = 2, then this is just 1 (as expected). If n = 3, then this is 3, which confirms 1 for the first comparison and 2 more in the worst case.

Day 4 : March 15th

You are given a stack with five integer values. You are asked to modify this stack so all of its values are reversed (thus, the topmost element becomes the bottom-most element, and vice versa). In Java, this method might look like:

public void reverseFive(Stack<Integer> stack) { ... }

For example, given the stack [1, 3, 4, 5, 7], this method should alter the stack's contents so it becomes [7, 5, 4, 3, 1]. Describe in English how you would accomplish this.

Answer: Since you know in advance you have five values, just pop them off one at a time into variables; let's call these A, B, C, D, and E. Then simply push them back on in reverse order, namely, push E, D, C, B, and A. Since you know in advance the total number, there is no need for anything more complicated.

Day 5 : March 18th

The CircularBufferQueue stores the following pieces of information:

Item[] a -- an array of items whose initial size is determined when constructed
int N -- the number of items in the queue
int first -- the starting index of the queue
int last -- the ending index of the queue

It is possible to eliminate the last attribute in CircularBufferQueue by recognizing the relationship between the value of last and the other pieces of information listed above. Can you complete the following expression using just the other pieces of information?

last = ....
Answer: Since N is the number of elements in the queue, you can start by assuming last = first + N. However, this won't work when first + N exceeds the length of the array, so you need to wrap around:

last = (first + N) % a.length

I have provided sample code (in day05) showing how this would look in practice. Note it doesn't make the code any less efficient (or more efficient); it is just one more thing to compute instead of store.

Day 6 : March 19th

Each sorting algorithm has a distinct behavior when it comes to less (comparing two values) and exch (exchanging the location of two values in an array). Assuming that you are using selection sort to sort an array of N elements, how many times do you exchange two values (using exch)? How many times do you compare two values (using less)?

Answer: In selection sort, on pass i the goal is to find the minimum value among the remaining N - i - 1 elements, repeatedly calling less on each of them. We subtract 1 because we assume A[i] is the minimum until proven otherwise. We sum (N-1) + (N-2) + ... + 2 + 1, which is the triangular number T(N-1), or N*(N-1)/2. To double-check: for N = 4 values, we add 3 + 2 + 1 = 6, which equals 4*3/2. Note: even when A[i] is already the minimum, we still call exch(a, i, min), which has no effect. But this is far better than wasting time checking (each time) whether min != i. We thus call exch N times.

Day 7 : March 21st

MergeSort is invoked by the method signature

void sort(Comparable[] a)

But you also know that MergeSort is a recursive procedure. Explain why you need to define a recursive helper method for MergeSort (something we didn't need to do for SelectionSort and InsertionSort). Now, every recursive method needs a base case that determines the stopping point of the recursion. Define this base case.

Answer: Neither Selection Sort nor Insertion Sort needed a helper method, since they both strictly use for loops to process their iterations.
However, MergeSort is recursive, and with only the above method signature there is no way to signal that you are attempting to sort just a smaller part of the original array. So the helper method needs to identify not just the size of the subproblem but its boundaries, thus:

void sort(Comparable[] a, int lo, int hi)

The recursion proceeds by dividing a list into two (more-or-less) equal sublists. If N is a power of 2, then you can stop when you get down to a single element (i.e., lo == hi). However, to be safe, we have to allow for situations when N is odd, which might leave us with an empty list. Thus the base case is:

if (hi <= lo) return;

which covers both the case when lo == hi and the case when hi < lo.

Day 8 : March 21st

Using the partition logic that was presented in class today, what is the result of partition(a, lo, hi) when a[lo] is smaller than every other value in a[lo+1 .. hi]?

Answer: Partition makes two individual sweeps: one from (lo+1) moving right (searching for a value that is greater than or equal to a[lo]) and one from hi moving left (searching for a value that is smaller than or equal to a[lo]). Since a[lo] is smaller than all other values, the first sweep stops immediately at lo+1, while the second sweep runs all the way down and stops only when j == lo. The pointers have crossed, so the final exchange of a[lo] with a[j] has no effect, and the problem has not been cut in half but only advanced by 1. That is, partition returns lo, and the next sort will be called on sort(a, lo+1, hi).

Day 9 : March 25th

Quicksort has the following structure:

public static void sort(Comparable[] a) {
    shuffle(a);
    sort(a, 0, a.length - 1);
}

// quicksort the subarray from a[lo] to a[hi]
private static void sort(Comparable[] a, int lo, int hi) {
    if (hi <= lo) return;
    int j = partition(a, lo, hi);
    sort(a, lo, j-1);
    sort(a, j+1, hi);
}

Explain what the partition function does.

Answer: The job of partition is to find an element in the array a[lo .. hi] and to place it into its properly-sorted location a[j]; the index j is returned. It does this by rearranging the elements in the sub-arrays a[lo .. j-1] and a[j+1 .. hi] such that all elements in a[lo .. j-1] are <= a[j] and a[j] is <= all elements in a[j+1 .. hi]. Note that partition doesn't sort these sub-arrays.

Day 10 : March 26th

The heap data structure is maintained using two operations, swim and sink. Heapsort only uses the sink operation. Why doesn't it need to use swim?

Answer: First recognize that the heap being maintained is a MaxHeap, which means each element is greater than or equal to either of its (potential) two children. Now, HeapSort works by swapping the root value with the last value in the heap, and then reducing the heap size by 1. Thus the only value now out of place is the newly appointed root, which might be smaller than its two children, and we have to sink it to its proper spot. Swim is only needed when a value has increased, which never happens during HeapSort.

Day 11 : March 26th

In the delete(Key key) method for the Symbol Table, why is it so important to validate that first == null as the initial statement in this operation? Is this a special case that is being handled, and if so, what is that special case?

Answer: As it turns out, this check is not really required. The special case being handled is deleting a key from an empty linked list:

public void delete(Key key) {
    if (first == null) { return; }

    Node prev = null;
    Node n = first;
    while (n != null) {
        if (key.equals(n.key)) {
            if (prev == null) {
                first = n.next;      // no previous? must have been first
            } else {
                prev.next = n.next;  // have a previous? link around n
            }
            return;
        }
        prev = n;
        n = n.next;                  // don't forget to update!
    }
}

But the above code double-checks: there is really no need for the initial test, since if first is null, then n is null and the while loop never executes.
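The point that the guard is redundant can be checked with a minimal stand-alone sketch. The class below is hypothetical (not the course's symbol table) and omits the first == null test entirely, yet still handles the empty list correctly:

```java
// Minimal sketch of the delete logic on a bare singly linked list of String keys.
// Note: no "if (first == null) return;" guard -- the loop condition covers it.
public class DeleteDemo {
    static class Node {
        String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    static Node first;   // head of the list; null means the list is empty

    static void delete(String key) {
        Node prev = null;
        for (Node n = first; n != null; prev = n, n = n.next) {
            if (key.equals(n.key)) {
                if (prev == null) first = n.next;   // deleting the head node
                else prev.next = n.next;            // link around the deleted node
                return;
            }
        }
        // key not found (including the empty-list case): nothing to do
    }

    static String contents() {
        StringBuilder sb = new StringBuilder();
        for (Node n = first; n != null; n = n.next) {
            if (sb.length() > 0) sb.append("->");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        delete("x");    // empty list: loop body never executes, no NullPointerException
        first = new Node("a", new Node("b", new Node("c", null)));
        delete("b");
        System.out.println(contents());   // a->c
        delete("a");
        System.out.println(contents());   // c
    }
}
```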
Day 12 : March 26th

In a Separate Chaining Hash Symbol Table as discussed today, a hash table partitions the keys into M distinct bins or buckets, each of which is stored by a SequentialSearchST. If you know in advance that you will be storing 1000 items in a hashtable, what happens if you simply instantiate SeparateChainingHashST(1000) to make sure there is room enough for all items? Will this guarantee that each get(key) request performs in constant time? Explain why or why not.

Answer: So I was a bit vague in my question. Let me break this up into two parts. If you know you will have 1000 items, this tells you nothing about the distribution of these items into the M buckets. As an example of a worst case, what if your hashCode method assigns the same bin to all items? Then it doesn't matter how many bins you have. However, if you have a reasonable hash function, then starting with a larger number of bins simply means that the hashtable will not be required to rehash all elements (which saves time). If the hash function is good, then the performance should also be good, although it wastes some space.

As further evidence, go back and modify the sample SeparateChainingHashST to initialize with 321165 indices:

SeparateChainingHashST<String, String> table = new SeparateChainingHashST<String, String>(321165);

When you run, the following is output:

Table has 321165 indices.
there are 118063 empty indices
63.23914498777887%
maximum chain is 8
number of single is 118426
size distribution: [118063, 118426, 58847, 19703, 4912, 1021, 169, 23, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Empty Table has 9 indices.

Observe how many more empty bins there are! However, the end result is that the worst chain has only 8 nodes in it, and nearly 1/3rd of the entries are in a chain by themselves (which is optimal). Here is the original output by comparison:

Table has 49151 indices.
there are 79 empty indices
99.8392708184981%
maximum chain is 19
number of single is 473
size distribution: [79, 473, 1589, 3277, 5369, 7003, 7757, 7314, 5907, 4217, 2757, 1704, 865, 472, 198, 109, 40, 14, 5, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Day 13 : April 1st

What happens to the behavior of get(key) in SeparateChainingHashST if the hash function always returns 0 for every key?

Answer: Then you simply get a single linear linked list, which means the performance to retrieve a key will not be amortized O(1), but rather O(n).

Day 15 : April 4th

In what order would you insert the following values so you end up with a binary tree whose height is 2?

MONSTER

Answer: There are many answers to this question. Since we want a tree of height 2, the root can have no more than two levels beneath it. Start with 'O' as the root, since this letter is the median of the letters in the word. Then the three smaller letters are "EMN" and the three larger are "RST". Essentially repeat this process for each of these two groups. Here are two possible insertion orderings:

#1: O M S E N R T
#2: O M E N S T R

Day 16 : April 5th

Finding the minimum (or maximum) value in a collection is a fundamental operation. In lecture we used the tilde approximation to define the following performance expectations:

Sorted Array: constant time, or "~1"
BST: logarithmic time, or "~log N"
ChainST: linear time, or "~N"

Explain why these are the expected performance behaviors for these three arrangements.

Answer: One can find the min (or max) in a sorted array by simply inspecting the contents of a[0] or a[a.length-1]. Since this is just one operation, we can safely claim constant time, or ~1.

If one stores all values in a ChainST symbol table, then you have to look at all elements in the table, since they are unordered. You have to inspect all N keys, which means linear time, or ~N.
For the Binary Search Tree, you still have to assume a reasonable distribution of the values inserted into the tree, which would lead to a tree that is "more or less balanced". If this is the case, then the longest path in the tree is the worst case to find a value. In balanced trees, the height of a BST is proportional to log(N), where N is the number of keys in the tree. To find the minimum, follow left branches until you run out; in the worst case, this traverses the height of the tree (or ~log N). Similar logic applies to finding the maximum value.

Day 17 : April 8th

In class today someone asked if the following would produce the values in reverse order:

// invoke a reverse in-order traversal of the tree
public void revorder() { revorder(root); }

private void revorder(Node n) {
    if (n == null) { return; }
    revorder(n.right);
    StdOut.println(n.key + "=" + n.val);
    revorder(n.left);
}

It turns out this WORKS! Clearly I wasn't able to think clearly on my feet on Monday morning. So please verify in a small example by inserting the following into a tree and demonstrating as above:

bst.put(7)
bst.put(11)
bst.put(10)
bst.put(3)
bst.put(14)
bst.put(6)
bst.put(1)

Answer: The resulting tree looks like the following:

        7
      /   \
     3     11
    / \   /  \
   1   6 10   14

When you hand-execute the logic above, you can see that 14 is output first, and then as the recursion unwinds, 11 comes next, followed by the reverse traversal of the subtree rooted at 10, thus 10 is output; continuing in this fashion yields 7, 6, 3, and finally 1. When I coded this in class, I must have made a typo in the method.

Day 18 : April 9th

Is the following a true statement: an AVL binary tree ensures that for every node in the tree, the number of descendant nodes in the left subtree is the same as the number of descendant nodes in the right subtree. If not, provide a counterexample. If it is, explain why it is always true.

Answer: The AVL tree only guarantees that the height differential of a left subtree and a right subtree is either -1, 0, or +1.
This is related to the number of descendants but has more to do with structure. Thus the following is a balanced AVL tree, even though the root has a left subtree with three descendants and a right subtree with just one:

      8
     / \
    5   12
   / \
  2   7

Day 19 : April 11th

An undirected graph contains N vertices and has E edges. You are told that the graph has no cycles but it is connected (that is, there exists a path between any two vertices in the graph). What is the fewest number of edges in this graph? Now answer the same question if you know that a cycle exists. That is, what is the fewest number of edges in an undirected graph which is known to contain a cycle?

Answer: If an undirected graph is connected, then it must be possible to travel from any vertex to any other vertex using the edges in the graph. Let's assume N > 1 so we have at least one edge. When N = 2, only 1 edge is needed (and there is no cycle). As each new vertex is added to the graph, you only need one additional edge to connect it so all vertices remain connected. Thus the minimum value of E is N - 1.

If a cycle exists, there must be some path of edges which starts and ends at the same vertex. Imagine putting all N vertices on paper in the shape of a circle, and then adding an edge between neighboring vertices. A single large cycle results, and the removal of any edge breaks the cycle. You can see that you need N edges to ensure full connectivity while retaining a cycle, so the minimum value of E is N.

Another approach for the graph with a cycle is to assume there is one smallest cycle of three vertices (and three edges among them). Then add the remaining N - 3 vertices as a linear chain hanging off the cycle, each new vertex contributing one edge. Counting edges, there are 3 + (N - 3) = N, so this again gives the fewest number of edges.
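These two edge counts can be checked with a small sketch (the class FewestEdges and its DFS-based connectivity check are my own illustration, not course code): a path on N vertices is connected with N-1 edges, and closing it into a ring adds the one extra edge needed for a cycle:

```java
// Sketch: a path on n vertices needs n-1 edges to be connected (no cycle);
// closing the path into a ring uses n edges and introduces a cycle.
import java.util.*;

public class FewestEdges {
    public static void main(String[] args) {
        int n = 6;
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < n; v++) adj.add(new ArrayList<>());

        int edges = 0;
        for (int v = 0; v + 1 < n; v++) {   // build the path 0-1-2-...-(n-1)
            adj.get(v).add(v + 1);
            adj.get(v + 1).add(v);
            edges++;
        }
        System.out.println("path connected=" + connected(adj, n) + " edges=" + edges);

        adj.get(n - 1).add(0);              // one more edge closes the ring
        adj.get(0).add(n - 1);
        edges++;
        System.out.println("ring connected=" + connected(adj, n) + " edges=" + edges);
    }

    // depth-first search from vertex 0; connected iff every vertex is reached
    static boolean connected(List<List<Integer>> adj, int n) {
        boolean[] seen = new boolean[n];
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(0);
        seen[0] = true;
        int count = 1;
        while (!stack.isEmpty()) {
            int v = stack.pop();
            for (int w : adj.get(v)) {
                if (!seen[w]) { seen[w] = true; count++; stack.push(w); }
            }
        }
        return count == n;
    }
}
```

Removing any single edge from the ring leaves the path again, so N really is the minimum once a cycle is required.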