Chapter 10 Amortized Analysis 10 -1 An example– push and pop A sequence of operations: OP1, OP2, … OPm OPi : several pops (from the stack) and one push (into the stack) ti : time spent by OPi the average time per operation: t ave 1 m t m i i 1 10 -2 Example: a sequence of push and pop p: pop , u: push i O Pi 1 1u 2 1u 3 2p 1u 4 1u 5 1u 6 1u 7 2p 1u 8 1p 1u ti 1 1 3 1 1 1 3 2 tave = (1+1+3+1+1+1+3+2)/8 = 13/8 = 1.625 10 -3 Another example: a sequence of push and pop p: pop , u: push i O Pi 1 1u 2 1p 1u 3 1u 4 1u 5 1u 6 1u 7 5p 1u 8 1u ti 1 2 1 1 1 1 6 1 tave = (1+2+1+1+1+1+6+1)/8 = 14/8 = 1.75 10 -4 Amortized time and potential function ai = ti + i i 1 ai : amortized time of OPi i : potential function of the stack after OPi i i 1 : change of the potential m m m i 1 i 1 m i 1 a i t i ( i i 1 ) ti m 0 i 1 m If m 0 0, then a i represents an upper i 1 m bound of t i i 1 10 -5 Amortized analysis of the push-and-pop sequence i : # of elements We have in the stack m 0 0 Suppose that before we execute Opi , there are k elements in the stack and Opi consists of n pops and 1 push. i 1 k ti n 1 Then, i k n 1 a i t i i i 1 ( n 1) ( k-n 1) -k 2 10 -6 m We have ( a i ) / m 2 i 1 Then, t ave 2. By observation, at most m pops and m pushes are executed in m operations. Thus, t ave 2. 10 -7 Skew heaps meld: merge + swapping 1 1 5 50 10 19 13 12 50 5 20 40 30 14 19 16 10 25 Two skew heaps 40 13 12 16 30 14 20 Step 1: Merge the right paths. 5 right heavy nodes: yellow 25 10 -8 Step 2: Swap the children along the right path. 1 5 50 10 12 14 20 19 13 30 40 16 No right heavy node 25 10 -9 Amortized analysis of skew heaps meld: merge + swapping operations on a skew heap: find-min(h): find the min of a skew heap h. insert(x, h): insert x into a skew heap h. delete-min(h): delete the min from a skew heap h. meld(h1, h2): meld two skew heaps h1 and h2. The first three operations can be implemented by melding. 10 -10 Potential function of skew heaps wt(x): # of descendants of node x, including x. heavy node x: wt(x) wt(p(x))/2, where p(x) is the parent node of x. light node : not a heavy node potential function i: # of right heavy nodes of the skew heap. 10 -11 Any path in an n-node tree contains at most log2n light nodes. light nodes log2n # heavy=k3 log2n possible heavy nodes # of nodes: n The number of right heavy nodes attached to the left path is at most log2n . 10 -12 Amortized time # light log2n1 # heavy = k1 heap: h1 # of nodes: n1 # light log2n2 # heavy = k2 heap: h2 # of nodes: n2 10 -13 a i = t i + i i-1 t i : tim e spent by O P i t i 2+ log 2 n 1 + k 1 + log 2 n 2 + k 2 (“2” counts the roots of h 1 and h 2 ) 2+ 2log 2 n+ k 1 + k 2 w here n= n 1 + n 2 i i-1 = k 3 -(k 1 + k 2 ) log 2 n k 1 k 2 a i = t i + i i-1 2+ 2log 2 n+ k 1 + k 2 + log 2 n k 1 k 2 = 2+ 3log 2 n a i = O (log 2 n) 10 -14 AVL-trees height balance of node v: hb(v)= (height of right subtree) – (height of left subtree) The hb(v) of every node never differ by more than 1. -1 M -1 +1 I O 0 0 E K 0 -1 C G 0 0 B D N 0 J 0 0 0 L Q 0 0 P R 0 F Fig. An AVL-Tree with Height Balance Labeled 10 -15 Add a new node A. -2 M Critical node +1 -2 O I Critical path G C B D J L 0 0 P R 0 0 -1 0 0 -1 -1 Q N K E 0 0 0 -1 F 0 A Before insertion, hb(B)=hb(C)=hb(E)=0 hb(I)0 the first nonzero from leaves. 10 -16 Amortized analysis of AVL-trees Consider a sequence of m insertions on an empty AVL-tree. T0: an empty AVL-tree. Ti: the tree after the ith insertion. Li: the length of the critical path involved in the ith insertion. X1: total # of balance factor changing from 0 to +1 or -1 during these m insertions (the total cost for rebalancing) m X 1 = L i , w e w ant to find X 1 . i 1 V al(T ): # of unbalanced node in T (height balance 0) 10 -17 Case 1 : Absorption The tree height is not increased, we need not rebalance it. 1 0 Li 0 0 -1 0 newly added -1 -1 Ti-1 Ti Val(Ti)=Val(Ti-1)+(Li1) 10 -18 Case 2.1 Single rotation -1 D 0 B Li E(n) A(n) 0 C(n) E(n): height of tree E is n 0 newly added 10 -19 Case 2 : Rebalancing the tree D B A F D B A C C E G right rotation left rotation G E F B A D C F E G 10 -20 Case 2.1 Single rotation After a right rotation on the subtree rooted at D: 0 B 0 D A(n) -1 -1 C(n) E(n) Val(Ti)=Val(Ti-1)+(Li-2) 10 -21 Case 2.2 Double rotation -1 F 0 B 0 G(n) D A(n) C(n-1) E(n-1) newly added 10 -22 Case 2.2 Double rotation After a left rotation on the subtree rooted at B and a right rotation on the subtree rooted at F: 0 D 0 1 B A(n) F C(n-1) E(n-1) G(n) Val(Ti)=Val(Ti-1)+(Li-2) 10 -23 Case 3 : Height increase 0 Li A(n) -1 root 0 -1 0 -1 root newly added Li is the height of the root. Val(Ti)=Val(Ti-1)+Li 10 -24 Amortized analysis of X1 X 2: X 3: X 4: X 5: # # # # of of of of ab so rp tio n s in case 1 sin g le ro tatio n s in case 2 d o u b le ro tatio n s in case 2 h eig h t in creases in case 3 m V al(T m ) = V al(T 0 )+ L i X 2 2 (X 3 + X 4 ) i 1 = 0 + X 1 X 2 2 (X 3 + X 4 ) V al(T m ) 0 .6 1 8 m (p ro v ed b y K n u th ) X 1 = V al(T m )+ 2 (X 2 + X 3 + X 4 ) X 2 0 .6 1 8 m + 2 m = 2 .6 1 8 m 10 -25 A self-organizing sequential search heuristics 3 methods for enhancing the performance of sequential search (1 ) T ransp o se H euristics: Q u ery S eq u en ce B B D D B A D A B D D A B D D A B C D A C B A A D C B 10 -26 (2 ) M o ve-to -the-F ro nt H euristics: Q u ery S eq u en ce B B D D B A A D B D D A B D D A B C C D A B A A C D B 10 -27 (3 ) C o unt H euristics: (d ecreasing o rd er b y the co unt) Q u ery S eq u en ce B B D B D A B D A D D B A D D B A A D A B C D A B C A D A B C 10 -28 Analysis of the move-to-thefront heuristics interword comparison: unsuccessful comparison intraword comparison: successful comparison pairwise independent property: For any sequence S and all pairs P and Q, # of interword comparisons of P and Q is exactly # of interword comparisons made for the subsequence of S consisting of only P’s and Q’s. (See the example on the next page.) 10 -29 Pairwise independent property in move-to-the-front e.g . Q u ery S eq u en ce C C A A C C C A B B C A C C B A A A C B (A , B ) co m p ariso n # o f co m p ariso n s m ad e b etw een A an d B : 2 10 -30 C o n sid er th e su b seq u en ce co n sistin g o f A an d B : Q uery S equence (A , B ) com parison A A B B A A A B # o f co m p ariso n s m ad e b etw een A an d B : 2 10 -31 Query Sequence (A, B) (A, C) (B, C) C 0 0 0 A 0 1 1 C B 1 C 1 0 1 0 1 1 2 1 A 1 1 2 There are 3 distinct interword comparisons: (A, B), (A, C) and (B, C) We can consider them separately and then add them up. the total number of interword comparisons: 0+1+1+2+1+2 = 7 10 -32 Theorem for the move-to-thefront heuristics CM(S): # of comparisons of the move-to-thefront heuristics CO(S): # of comparisons of the optimal static ordering CM (S) 2CO(S) 10 -33 Proof Proof: InterM(S): # of interword comparisons of the move to the front heuristics InterO(S): # of interword comparisons of the optimal static ordering Let S consist of a A’s and b B’s, a b. The optimal static ordering: BA InterO(S) = a InterM(S) 2InterO(S) InterM(S) 2a 10 -34 Proof (cont.) Consider any sequence consisting of more than two items. Because of the pairwise independent property, we have InterM(S) 2InterO(S) IntraM(S): # of intraword comparisons of the moveto-the-front heuristics IntraO(S): # of intraword comparisons of the optimal static ordering IntraM(S) = IntraO(S) InterM(S) + IntraM(S) 2InterO(S) + IntraO(S) CM(S) 2CO(S) 10 -35 The count heuristics The count heuristics has a similar result: CC(S) 2CO(S), where CC(S) is the cost of the count heuristics 10 -36 The transposition heuristics The transposition heuristics does not possess the pairwise independent property. We can not have a similar upper bound for the cost of the transposition heuristics. e.g . C o n sid er p airs o f d istin ct item s in d ep en d en tly. Q u ery S eq u en ce C (A , B ) A C 0 (A , C ) 0 (B , C ) 0 0 1 1 B C 1 1 A 1 0 0 1 1 1 2 1 # o f in terw o rd co m p ariso n s: 7 (n o t co rrect) 1 2 10 -37 th e co rrect in terw o rd co m p ariso n s: Q uery S equence C A C B D ata O rdering C AC CA CBA CBA CAB 1 1 2 N u m b er o f In terw o rd 0 C o m p ariso n s C 0 A 2 6 10 -38