One-Time Binary Search Tree Balancing

Spreadsheet-Aided Numerical Experimentation: Analytic Formula for Fibonacci Numbers

Timothy J. Rolfe
Computer Science Department
Eastern Washington University
202 Computer Sciences Building
Cheney, WA 99004-2412
Timothy.Rolfe@ewu.edu
http://penguin.ewu.edu/~trolfe/

Abstract

Spreadsheet representations of recurrences allow numerical experimentation with potential analytic solutions to those recurrences. This paper uses a very simple recurrence for which the analytic solution is quite obvious when one examines the values generated by the recurrence, and then examines another recurrence for which the solution is not obvious.

Easy Case: the Towers of Hanoi Recurrence

The recurrence for the number of disk movements during the solution of the Towers of Hanoi problem is the following:

Base Case:   Hanoi(0) = 0
             NO disks move if there are no disks to move!
Recurrence:  Hanoi(n) = Hanoi(n–1) + 1 + Hanoi(n–1)
             (a) Move all but one disk out of the way
             (b) Move that disk to its destination
             (c) Move the rest of the disks on top of it

If spreadsheet column A contains the values for n, and column B contains the values for Hanoi(n), then the set-up is straightforward. After the first row containing numbers (row 2, allowing row 1 for column labels), the subsequent rows can contain formulas for recurrences: column A will have the successor recurrence (that is, successor(n) = successor(n–1) + 1), while column B will have the Hanoi recurrence (Hanoi(n) = 2*Hanoi(n–1) + 1). I will use row 3 as the specimen row and give the spreadsheet formulas in A3 and B3:

A3: =1+A2
B3: =1+B2*2

Spreadsheets typically provide an easy means to propagate values and formulas downward. Once we propagate row 3 through rows 4 through 7, we see the following:

n    Hanoi(n)
0    0
1    1
2    3
3    7
4    15
5    31

Anyone who has worked in the computer field for long will immediately see that column B contains values one less than the powers of 2, suggesting this solution: \(\text{Hanoi}(n) = 2^n - 1\).
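The spreadsheet set-up above can also be mimicked in a few lines of code. The following Python sketch is an illustration only (the name hanoi_column is hypothetical and does not come from the paper): it propagates the column B formula down six rows and checks the powers-of-two observation row by row.

```python
def hanoi_column(rows):
    """Mimic column B: start at Hanoi(0) = 0, then apply =1+B2*2 row by row."""
    values = [0]
    for _ in range(rows - 1):
        values.append(1 + 2 * values[-1])
    return values


column_b = hanoi_column(6)  # n = 0..5, like spreadsheet rows 2 through 7
for n, moves in enumerate(column_b):
    # If the observation holds, moves + 1 is exactly the n-th power of two.
    print(n, moves, moves + 1 == 2 ** n)
```

Every row should print True in its last field. As with the spreadsheet, this confirms the pattern only for the rows actually generated.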
We can add a new column to the spreadsheet and write in our analytic solution so that it references only values in column A; in this case, only the value in the same row of column A.

C2: =2^A2-1

Propagating that formula from row 2 down through row 7, we get the following:

n    Hanoi(n)   Formula
0    0          0
1    1          1
2    3          3
3    7          7
4    15         15
5    31         31

For a mathematician, of course, this only proves the formula for the range of n = 0 to n = 5. The final step is the inductive analytical proof that the formula is indeed the analytic solution to the recurrence. The Towers of Hanoi provides one of the standard recurrences used in teaching such inductive proofs.

More Difficult Case: The Fibonacci Recurrence

The Fibonacci recurrence is fairly similar to the Hanoi one, except that two previous values are involved and we don't have the "+1" (though a closely related recurrence describing worst-case AVL tree depth does have the "+1"):

Base Cases:  Fib(0) = 0    NO rabbits at the start
             Fib(1) = 1    ONE immature breeding pair of rabbits
Recurrence:  Fib(n+1) = Fib(n) + Fib(n–1)
             (a) All the rabbits we just had
             (b) Rabbits born to the mature breeding pairs

Again, spreadsheet column A contains the values for n, and column B contains the values for Fib(n). We have two rows with numbers, rows 2 and 3, because we have two base cases. Column A still has the successor recurrence, while column B contains the Fibonacci recurrence.

A4: =1+A3
B4: =B3+B2

The values obtained, though, don't show a sequence whose solution jumps out at us.

n    Fib(n)
0    0
1    1
2    1
3    2
4    3
5    5

One advantage of working with spreadsheets, though, is that they make it extremely easy to examine data graphically. If we extend the series up to Fib(20) = 6765 and then plot the result, we see the following:

[Figure: Fib(n) plotted against n for n = 0 to 20; the vertical axis runs from 0 to 7000.]

That certainly looks like exponential explosion! In other words, for larger values of n, the function is dominated by an exponential part: \(\text{Approx}(n) = k \cdot c^n\).
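The exponential impression can also be checked without a chart. The following Python sketch is an illustration under stated assumptions (fib_column is a hypothetical helper standing in for the propagated spreadsheet column): it regenerates the series up to Fib(20) and samples it every five rows. If growth is roughly exponential, each sampled value should be about the same multiple of the one before it.

```python
def fib_column(last_n):
    """Mimic column B: two base-case rows, then =B3+B2 propagated downward."""
    values = [0, 1]  # Fib(0) and Fib(1)
    for _ in range(2, last_n + 1):
        values.append(values[-1] + values[-2])
    return values[: last_n + 1]


fibs = fib_column(20)
print(fibs[20])  # 6765, matching the value quoted in the text

# Sample every fifth row and compare each sample with the previous one.
for n in (10, 15, 20):
    print(n, fibs[n], round(fibs[n] / fibs[n - 5], 3))
```

The sampled values are 55, 610 and 6765, each roughly eleven times its predecessor, which is the behavior expected from a function of the form k * c**n.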
One check is to compare ratios of adjacent values, which should approach c: in \((k \cdot c^n) / (k \cdot c^{n-1})\), everything cancels but one c. We use Fib(n) itself as a backwards "estimate" of Approx(n). Thus, for row 4, we have

C4: =B4/B3

We find that we converge to a particular value: c is the "golden ratio" \(\varphi = (1 + \sqrt{5})/2\), one of the solutions of \(x^2 = x + 1\).

The next step is to discover the value of k by rearranging the Approx(n) equation above so as to isolate k: \(k = \text{Approx}(n) / \varphi^n\), again using Fib(n) itself to stand in for Approx(n).

D4: =B4/$G$1^A4

Note that cell G1 contains the calculated value of \(\varphi\). We achieve convergence this time as well, and after playing around a little we discover that k is \(1/\sqrt{5}\), which is not surprising considering the occurrence of \(\sqrt{5}\) in \(\varphi\) itself.

n    Fib(n)   Adjacent Ratio   Estimate k-value
0    0
1    1
2    1        1                0.381966
3    2        2                0.472136
4    3        1.5              0.437694
5    5        1.666667         0.45085
6    8        1.6              0.445825
7    13       1.625            0.447744
8    21       1.615385         0.447011
9    34       1.619048         0.447291
10   55       1.617647         0.447184

Thus, we have obtained the following function: \(\text{Approx}(n) = \varphi^n / \sqrt{5}\)

We can now generate a column using that approximation.

C7: =$G$1^A7/$G$2

Note that cell G2 contains the calculated value for \(\sqrt{5}\). We see that the approximation successively overshoots and undershoots the exact value, but by less and less as n increases. It's easy to handle an alternating sign, so let's examine the absolute value of the error:

n    Fib(n)   Approx(n)   | Error(n) |
0    0        0.447214    0.447214
1    1        0.723607    0.276393
2    1        1.17082     0.17082
3    2        1.894427    0.105573
4    3        3.065248    0.065248
5    5        4.959675    0.040325
6    8        8.024922    0.024922
7    13       12.9846     0.015403
8    21       21.00952    0.009519
9    34       33.99412    0.005883
10   55       55.00364    0.003636

Again we have what looks like exponential behavior, but in this case a decaying exponential. We can use exactly the same approach as described above to characterize that exponential. As it turns out, we end up with exactly the same c and k, except that in this case we have \(c^{-n}\) rather than \(c^n\).
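The three estimates discussed above (the adjacent ratio, the k-value, and the characterization of the error) can be reproduced in code. This Python sketch is illustrative rather than the paper's own procedure: the names fib_column, PHI, ROOT5 and k_from_error are hypothetical, and the last function assumes the error has the form k * PHI**(-n) and solves for k.

```python
import math


def fib_column(last_n):
    """Fib(0)..Fib(last_n) from the two-term recurrence."""
    values = [0, 1]
    for _ in range(2, last_n + 1):
        values.append(values[-1] + values[-2])
    return values[: last_n + 1]


PHI = (1 + math.sqrt(5)) / 2  # plays the role of cell G1
ROOT5 = math.sqrt(5)          # plays the role of cell G2


def k_from_error(n, fibs):
    """Assume |Error(n)| = k * PHI**(-n) and isolate k."""
    error = abs(fibs[n] - PHI ** n / ROOT5)
    return error * PHI ** n


fibs = fib_column(20)
for n in range(2, 21, 3):
    ratio = fibs[n] / fibs[n - 1]  # like C4: =B4/B3
    k_est = fibs[n] / PHI ** n     # like D4: =B4/$G$1^A4
    print(n, round(ratio, 6), round(k_est, 6), round(k_from_error(n, fibs), 6))
```

The first two printed columns drift toward roughly 1.618034 and 0.447214 as n grows, while the third sits at about 0.447214 from the first row onward, which is the numerical sense in which the error shares the same c and k. Floating-point rounding limits how far this can be pushed; the sketch stops at n = 20 for that reason.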
We can get the alternating sign from \((-1)^{-n}\).

\[ \text{Fib}(n) = \frac{\varphi^{n} - (-\varphi)^{-n}}{\sqrt{5}} \]

Again, for the mathematician, the spreadsheet can only prove equality for a finite set of values. It is necessary to do the inductive analytical proof to establish the solution for all values. In this proof, it will be necessary to use the special properties of \(\varphi\): \(\varphi^{-1} = \varphi - 1\) and \(\varphi^{2} = \varphi + 1\).

Base cases:

\[
\begin{aligned}
\text{Fib}(0) &= (\varphi^{0} - \varphi^{-0}) / \sqrt{5} = (1 - 1) / \sqrt{5} = 0 && \text{QED} \\
\text{Fib}(1) &= (\varphi^{1} + \varphi^{-1}) / \sqrt{5} \\
              &= (\varphi + \varphi - 1) / \sqrt{5} && \text{substituting for } \varphi^{-1} \\
              &= \sqrt{5} / \sqrt{5} = 1 && \text{definition of } \varphi \text{; QED}
\end{aligned}
\]

The inductive proof is left as an exercise for the reader. Beyond the substitutions for \(\varphi^{2}\) and \(\varphi^{-1}\), it is mostly (as usual) just a matter of algebra! This is an example of "strong induction": to prove Fib(n+1) we need to use Fib(n) and Fib(n–1), not just Fib(n).

The above formula can be simplified by taking advantage of the other root of the equation \(x^2 = x + 1\). Let \(a = (1 - \sqrt{5})/2\). It turns out that using a simplifies the equation appreciably, because \(a = -1/\varphi\):

\[ \text{Fib}(n) = \frac{\varphi^{n} - a^{n}}{\sqrt{5}} \quad {}^{*} \]

* Derived analytically in Gilles Brassard and Paul Bratley, Fundamentals of Algorithmics (1996), pp. 120-21.

A final side note of possible interest:

\[ \text{Fib}(n) = \frac{\varphi^{n} - \cos(n\pi)\,\varphi^{-n}}{\sqrt{5}} \]

provides an alternative continuous form of the Fibonacci function even for negative n. For n < 0, it generates the same numeric values as Fib(n) for n > 0, but with alternating signs, consistent with a backwards recurrence Fib(n–1) = Fib(n+1) – Fib(n). That is, for n < 0, the value of Fib is dominated by the term \(\cos(n\pi)\,\varphi^{-n}\).

[Figure: "Continuous Fib." and "Discrete Fib." plotted for n from -4 to 4; the vertical axis runs from -3 to 3.]

A Messier Recurrence: Binary Search Analysis

If we want to find the average number of loop iterations to find an array entry using the binary search with early exit, then we need to find the total of loop iterations to find all array entries for a given array size n, and then divide that by n.
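The quantity just described can be measured directly. The following Python sketch is one possible way to do so, not the paper's own code: iterations_to_find and total_iterations are hypothetical names, and the search shown is an ordinary midpoint binary search that returns as soon as the target is hit (the "early exit").

```python
def iterations_to_find(sorted_items, target):
    """Binary search with early exit; returns the number of loop iterations used."""
    low, high = 0, len(sorted_items) - 1
    iterations = 0
    while low <= high:
        iterations += 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return iterations  # early exit on a match
        if sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    raise ValueError("target not present")


def total_iterations(n):
    """Total loop iterations needed to find every entry of an n-element array."""
    items = list(range(n))
    return sum(iterations_to_find(items, x) for x in items)


for n in (1, 3, 7, 15):
    total = total_iterations(n)
    print(n, total, round(total / n, 4))  # size, total, average per search
```

For the sizes shown, the totals are 1, 5, 17 and 49. These sizes were chosen so that the implied search tree is completely filled at every level.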
The structure of the problem makes it easier to deal only with complete binary trees. In that case, where we consider the binary tree root to be at level 1 (rather than 0), we can write the summation (which is, of course, an implicit recurrence) to find the total for a complete tree of height ht:

\[ Q(ht) = \sum_{k=1}^{ht} k \cdot 2^{k-1} \]

A colleague and good friend of mine, Dr. Brian Carlson (who is, unfortunately, no longer with us), discovered an analytic solution to this while doing experimentation exactly like that discussed here.

\[ Q(ht) = (ht - 1) \cdot 2^{ht} + 1 \]

It is discussed in greater detail in an earlier paper in a companion ACM SIG newsletter (which has since ceased publication; I don't think it was this paper that did it in!).**

** Timothy J. Rolfe, "Analytic Derivation of Comparisons in Binary Search", SIGNUM Newsletter, Vol. 32, No. 4 (October 1997), pp. 15-19. Text available through http://penguin.ewu.edu/~trolfe/.
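In the same experimental spirit, the summation and the closed form for Q(ht) can be compared numerically. This Python sketch is an illustration only (q_by_summation and q_closed_form are hypothetical names); like the spreadsheet, it confirms agreement for the heights tested and is not a proof.

```python
def q_by_summation(ht):
    """Level k of a complete tree holds 2**(k-1) nodes, each costing k iterations."""
    return sum(k * 2 ** (k - 1) for k in range(1, ht + 1))


def q_closed_form(ht):
    """The analytic solution quoted in the text: (ht - 1) * 2**ht + 1."""
    return (ht - 1) * 2 ** ht + 1


for ht in range(1, 11):
    n = 2 ** ht - 1  # number of nodes in a complete tree of height ht
    total = q_by_summation(ht)
    print(ht, n, total, total == q_closed_form(ht), round(total / n, 4))
```

The last column is the average number of iterations per successful search, Q(ht) divided by n, which for tall trees approaches ht - 1.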
