Lectures on Randomised Algorithms
COMP 523: Advanced Algorithmic Techniques
Lecturer: Dariusz Kowalski

Overview
Previous lectures:
• NP-hard problems
• Approximation algorithms
These lectures:
• Basic theory:
– probability, random variables, expected values
• Randomised algorithms

Probabilistic theory
Consider flipping two symmetric coins with sides 1 and 0
• Event: a situation that depends on the random generator
– Example: the event that the sum of the results on the two flipped coins is 1
• Random variable: a function that attaches a real value to each elementary outcome
– X = sum of the results on the two flipped coins
• Probability of an event: the proportion of the event to the set of all elementary outcomes (sometimes weighted)
– Pr[X = 1] = 2/4 = 1/2, since X = 1 is the event containing two elementary outcomes:
• 0 on the first coin and 1 on the second coin
• 1 on the first coin and 0 on the second coin

Probabilistic theory cont.
Consider flipping two symmetric coins with sides 1 and 0
• Expected value (of a random variable): the sum of all possible values of the random variable, weighted by the probabilities of these values occurring
– E[X] = 0 · 1/4 + 1 · 1/2 + 2 · 1/4 = 1
• Independence: two events are independent if the probability of their intersection equals the product of their probabilities
– Event 1: 1 on the first coin; Event 2: 0 on the second coin; Pr[Event1 ∩ Event2] = 1/4 = Pr[Event1] · Pr[Event2] = 1/2 · 1/2
– Event 3: the sum on the two coins is 2; Pr[Event1 ∩ Event3] = 1/4 ≠ Pr[Event1] · Pr[Event3] = 1/2 · 1/4 = 1/8, so Events 1 and 3 are not independent

Randomised algorithms
• Any kind of algorithm using a (pseudo-)random generator
• Main kinds of algorithms:
– Monte Carlo: the algorithm computes a proper solution with high probability (in practice: at least constant)
• A Monte Carlo algorithm always stops
– Las Vegas: the algorithm always computes a proper solution
• Sometimes the algorithm can run very long, but only with very small probability

Quick Sort - algorithmic scheme
Generic Quick Sort:
• Select one element x from the input
• Partition the input into the part containing the elements not greater than x and the part containing all bigger elements
• Sort each part separately
• Concatenate the sorted parts
Problem: how to choose the element x so as to balance the sizes of the two parts? (to obtain a recurrence similar to that of MergeSort)

Why should the parts be balanced?
Suppose we do not balance, but always choose the last element:
T(n) ≤ T(n-1) + T(1) + c·n, where T(1) ≤ c
Solution: T(n) ≤ d·n², for any constant d ≥ c
Proof: by induction.
– For n = 1 straightforward, since T(1) ≤ c ≤ d
– Suppose T(n-1) ≤ d·(n-1)²; then
T(n) ≤ T(n-1) + c + c·n ≤ d·(n-1)² + c·(n+1) ≤ d·(n-1)² + d·(2n-1) = d·n²
(the last inequality uses c·(n+1) ≤ d·(2n-1), which holds for d ≥ c and n ≥ 2)

Randomised approach
Randomised approach:
• Select the element x uniformly at random
• Expected time: O(n log n)
• Additional memory: O(n)
Uniform selection: each element has the same probability of being selected

Randomised approach - analysis
Let T(n) denote the expected time: the sum of all possible values of the running time, weighted by the probabilities of these values
T(n) ≤ (1/n)·([T(n-1)+T(1)] + [T(n-2)+T(2)] + … + [T(0)+T(n)]) + c·n
T(0) = T(1) = 1, T(2) ≤ c
Solution: T(n) ≤ d·n·log n, for some constant d ≥ 8c
Proof: by induction.
– For n = 2 straightforward
– Suppose T(m) ≤ d·m·log m for every m < n; then
(1 − 1/n)·T(n) ≤ (2/n)·(T(0) + … + T(n-1)) + c·n
≤ (2d/n)·(1·log 1 + 2·log 2 + … + (n-1)·log(n-1)) + c·n
≤ d·n·log n − d·n/4 + c·n    (using Σ_{m<n} m·log m ≤ n²·log n/2 − n²/8)
≤ d·n·log n − d·n/8    (since c·n ≤ d·n/8)
Hence T(n) ≤ (n/(n-1))·(d·n·log n − d·n/8) ≤ d·n·log n for sufficiently large n.
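The scheme above translates directly into code. Below is a minimal Python sketch of randomised Quick Sort (not part of the original slides; the function name is ours): random.choice plays the role of the uniform pivot selection, and the three-way split keeps duplicates of the pivot out of the recursive calls.

```python
import random

def quick_sort(a):
    """Randomised Quick Sort: expected O(n log n) comparisons."""
    if len(a) <= 1:
        return a
    x = random.choice(a)                      # pivot chosen uniformly at random
    smaller = [e for e in a if e < x]         # elements smaller than the pivot
    equal   = [e for e in a if e == x]        # the pivot and its duplicates
    bigger  = [e for e in a if e > x]         # elements bigger than the pivot
    return quick_sort(smaller) + equal + quick_sort(bigger)  # sort parts, concatenate

print(quick_sort([3, 1, 4, 1, 5, 9, 2, 6]))   # [1, 1, 2, 3, 4, 5, 6, 9]
```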
Tree structure of a random execution
[Figure: recursion tree of one random execution on the input 1, …, 8 — each internal node is a chosen pivot, the 8 leaves are the sorted elements; the depicted tree has height 5]

Minimum Cut in a graph
Minimum cut in an undirected multigraph G (there may be many edges between a pair of nodes):
– A partition of the nodes with the minimum number of crossing edges
• Deterministic approach:
– Transform the graph into an s-t network, for every pair of nodes s, t
– Replace each undirected edge by two directed edges in opposite directions, of capacity 1 each
– Replace all multiple parallel directed edges by one edge with capacity equal to the multiplicity of this edge
– Run Ford-Fulkerson (or another network-flow algorithm) to compute the max-flow, which is equal to the min-cut

Minimum Cut in a graph
Randomised approach:
• Select a random edge:
– contract its end nodes into one supernode,
– remove the edges between these two nodes,
– keep all other edges adjacent to the obtained supernode
• Repeat the above procedure until two supernodes remain
• Count the number of edges between the two remaining supernodes and return the result

Minimum Cut - analysis
Let K be a smallest cut (set of edges) and let k be its size.
• Compute the probability that in step j an edge of K is selected, provided that no edge from K has been selected before:
– Each supernode has at least k adjacent edges (otherwise the cut separating a supernode with fewer adjacent edges from the remaining supernodes would be smaller than K)
– The total number of remaining supernodes at the beginning of step j is n − j + 1
– The total number of edges at the beginning of step j is thus at least k·(n − j + 1)/2 (summing the degrees counts each edge twice)
– The probability of selecting (and so contracting) an edge of K in step j is at most k/[k·(n − j + 1)/2] = 2/(n − j + 1)

Minimum Cut - analysis cont.
• Event Bj: in step j of the algorithm an edge not in K is selected
• Conditional probability (of event A under condition B): Pr[A|B] = Pr[A ∩ B]/Pr[B]
• From the previous slide: Pr[Bj | Bj-1 ∩ … ∩ B1] ≥ 1 − 2/(n − j + 1)
• The following holds: Pr[Bj ∩ Bj-1 ∩ … ∩ B1] = Pr[B1] · Pr[B2|B1] · Pr[B3|B2 ∩ B1] · … · Pr[Bj|Bj-1 ∩ … ∩ B1]
• The probability of the sought event Bn-2 ∩ Bn-3 ∩ … ∩ B1 (i.e., that in all n − 2 steps of the algorithm only edges not in K are selected) is therefore at least
[1 − 2/n]·[1 − 2/(n−1)]·…·[1 − 2/3] = [(n−2)/n]·[(n−3)/(n−1)]·[(n−4)/(n−2)]·…·[2/4]·[1/3] = 2/[n·(n−1)]
(the product telescopes, leaving 2/[n·(n−1)])

Minimum Cut - analysis cont.
• If we iterate this algorithm independently n·(n−1)/2 times, always recording the minimum cut obtained so far, then the probability of success (i.e., of finding a min-cut) is at least
1 − (1 − 2/[n·(n−1)])^{n·(n−1)/2} ≥ 1 − 1/e
• To obtain a bigger probability we iterate this process more times
• The total time is O(n³) contractions
• Question: how to implement contraction efficiently?
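A minimal Python sketch of the contraction algorithm, not from the slides: it uses a naive label array and edge list (one contraction costs O(n + |E|), which matches the O(n³)-contractions budget but is far from the most efficient implementation). All names are ours; the sketch assumes a connected multigraph without self-loops.

```python
import random

def contract_min_cut(n, edges):
    """One run of the contraction algorithm on a multigraph.

    n     -- number of nodes, labelled 0..n-1
    edges -- list of (u, v) pairs; parallel edges may repeat
    Returns the size of the cut found by this run.
    """
    label = list(range(n))                        # label[v] = supernode containing v
    edges = list(edges)
    supernodes = n
    while supernodes > 2:
        u, v = random.choice(edges)               # pick a random remaining edge
        lu, lv = label[u], label[v]
        for w in range(n):                        # contract: merge supernode lv into lu
            if label[w] == lv:
                label[w] = lu
        edges = [(a, b) for (a, b) in edges
                 if label[a] != label[b]]         # drop edges inside the merged supernode
        supernodes -= 1
    return len(edges)                             # edges crossing the two supernodes

def karger(n, edges, iterations=None):
    """Repeat contraction; with n(n-1)/2 runs the success probability is >= 1 - 1/e."""
    if iterations is None:
        iterations = n * (n - 1) // 2
    return min(contract_min_cut(n, edges) for _ in range(iterations))

edges = [(0, 1), (0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
print(karger(4, edges))                           # 2: the cut isolating node 2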
Conclusions
• Probabilistic theory
– Events, random variables, expected values
• Basic algorithms
– LV: Randomised Quick Sort (randomised recurrence)
– MC: Minimum Cut (iterating to get a bigger probability of success)

Textbook and Exercises
READING:
• Chapter 13, Sections 13.2, 13.3, 13.5 and 13.12
EXERCISE:
• How many iterations of the randomised min-cut algorithm should we perform to obtain probability of success at least 1 − 1/n?
For volunteers:
• Suppose that we know the size of the min-cut. What is the expected number of iterations of the randomised min-cut algorithm needed to find a sample min-cut?

Overview
Previous lectures:
• Randomised algorithms
• Basic theory: probability, random variables, expected values
• Algorithms: LV (sorting) and MC (min-cut)
This lecture:
• Basic random processes

Expected number of successes
A sequence (possibly infinite) of independent random trials, each with probability p of success
• The expected number of successes in m trials is m·p:
– The probability of success in one trial is p, so let Xj be such that Pr[Xj = 1] = p and Pr[Xj = 0] = 1 − p, for 0 < j ≤ m
– E[Σ_{0<j≤m} Xj] = Σ_{0<j≤m} E[Xj] = m·p (by linearity of expectation)
• Memoryless guessing: n cards; you guess one, turn over one card, check whether you succeeded, shuffle the cards and repeat. How many proper guesses can you expect in n trials?
– Pr[Xj = 1] = 1/n and Pr[Xj = 0] = 1 − 1/n
– E[Σ_{0<j≤n} Xj] = Σ_{0<j≤n} E[Xj] = n · 1/n = 1

Guessing with memory
• n cards; you guess one, turn over one card, remove that card, shuffle the rest and repeat. How many successful guesses can you expect?
– Pr[Xj = 1] = 1/(n − j + 1) and Pr[Xj = 0] = 1 − 1/(n − j + 1)
– E[Σ_{0<j≤n} Xj] = Σ_{0<j≤n} E[Xj] = Σ_{0<j≤n} 1/(n − j + 1) = Σ_{0<j≤n} 1/j = Hn = ln n + O(1)

Waiting for the first success
A sequence (possibly infinite) of independent random trials, each with probability p of success
• The expected waiting time for the first success is (writing q = 1 − p)
Σ_{j>0} j·(1 − p)^{j−1}·p = p·Σ_{j≥1} j·q^{j−1} = p·(d/dq)(Σ_{j≥1} q^j) = p·(d/dq)(q/(1 − q)) = p·1/(1 − q)² = p·(1/p²) = 1/p

Collecting coupons
• n types of coupons are hidden randomly in a large number of boxes; each box contains one coupon. You choose a box and take the coupon from it. How many boxes can you expect to open in order to collect all kinds of coupons?
• Stage j: the time between selecting the (j − 1)-th distinct coupon and the j-th distinct coupon
– Independent trials Yi for each step i of stage j, satisfying Pr[Yi = 0] = (j − 1)/n and Pr[Yi = 1] = (n − j + 1)/n
– Let Xj be the length of stage j; by the expected waiting time for the first success, E[Xj] = 1/Pr[Yi = 1] = n/(n − j + 1)
• Finally E[Σ_{0<j≤n} Xj] = Σ_{0<j≤n} E[Xj] = Σ_{0<j≤n} n/(n − j + 1) = n·Σ_{0<j≤n} 1/j = n·Hn = n·ln n + O(n)

Conclusions
• Probabilistic theory
– Events, random variables, expected values, etc.
• Basic random processes
– Number of successes
– Guessing with or without memory
– Waiting for the first success
– Collecting coupons

Textbook and Exercises
READING:
• Section 13.3
EXERCISES:
• How many iterations of the randomised min-cut algorithm should we perform to obtain probability of success at least 1 − 1/n?
• A more general question: suppose that a Monte Carlo algorithm answers correctly with probability 1/2. How can we modify it to answer correctly with probability at least 1 − 1/n?
For volunteers:
• Suppose that we know the size of the min-cut. What is the expected number of iterations of the randomised min-cut algorithm needed to find a precise min-cut?
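Before moving to the next lecture, here is a small simulation sketch (not from the slides; the function name is ours) that checks the n·Hn prediction for coupon collecting empirically.

```python
import random

def boxes_until_all_coupons(n):
    """Open random boxes until all n coupon types have been seen."""
    seen, opened = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))   # each box holds a uniformly random coupon
        opened += 1
    return opened

n, trials = 50, 2000
avg = sum(boxes_until_all_coupons(n) for _ in range(trials)) / trials
harmonic = sum(1 / j for j in range(1, n + 1))
print(f"empirical: {avg:.1f}, predicted n*H_n: {n * harmonic:.1f}")
```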
Overview
Previous lectures:
• Randomised algorithms
• Basic theory: probability, random variables, expected values
• Algorithms: LV sorting, MC min-cut
• Basic random processes
This lecture:
• Randomised caching

Randomised algorithms
• Any kind of algorithm using a (pseudo-)random generator
• Main kinds of algorithms:
– Monte Carlo: the algorithm computes the proper solution with large probability (at least constant)
• A Monte Carlo algorithm always stops
• We want to have a high probability of success
– Las Vegas: the algorithm always computes the proper solution
• Sometimes the algorithm can run very long, but only with very small probability
• We want to achieve a small expected running time (or other complexity measure)

On-line vs. off-line
Dynamic data:
• Arrives during the execution
Algorithms:
• On-line: does not know the future, makes its decisions on-line
• Off-line: knows the future, makes its decisions off-line
Complexity measure:
Competitive ratio:
• The maximum ratio, taken over all input data, between the performance of the given on-line algorithm and the optimum off-line solution for the same data

Analysing the caching process
Two kinds of memory:
• Fast memory: cache of size k
• Slow memory: disc of size n
Examples:
• hard disc versus processor cache
• network resources versus local memory
Problem:
• In each step a request for a value arrives
• If the value is in the cache then answering costs nothing; otherwise it costs one unit (an access to the slow memory)
Performance measure:
• Count the number of accesses to the slow memory
• Compute the competitive ratio

Marking algorithm(s)
• The algorithm proceeds in phases
• Each item in the cache is either marked or unmarked
• At the beginning of each phase all items are unmarked
• Upon a request to item s:
– If s is in the cache then mark s (if it is not already marked)
– Else:
• If all items in the cache are marked, finish the current phase and start a new one: unmark all items in the cache
• Remove a randomly selected unmarked item from the cache and put s in its place; mark s

Example of processing by marking
Stream: 1, 2, 3, 4, 1, 2, 3, 4
Cache (for k = 3 items), one possible random run:
• Phase 1: request 1 → miss, cache {1}; request 2 → miss, {1, 2}; request 3 → miss, {1, 2, 3}; all items marked
• Phase 2: request 4 → miss, evicts 2, {1, 3, 4}; request 1 → hit, {1, 3, 4}; request 2 → miss, evicts 3, {1, 2, 4}; all items marked
• Phase 3: request 3 → miss, evicts 4, {1, 2, 3}; request 4 → miss, evicts 1, {2, 3, 4}
• Number of accesses to slow memory: 7
• Optimal off-line algorithm: 5
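The marking algorithm described above fits in a few lines of Python. A minimal sketch, not from the slides and with names of our choosing; it assumes the items are comparable (so a set can be sorted before the random choice).

```python
import random

def randomized_marking(stream, k):
    """Randomised marking cache of size k; returns the number of misses."""
    cache, marked, misses = set(), set(), 0
    for s in stream:
        if s in cache:
            marked.add(s)                          # hit: just mark the item
            continue
        misses += 1                                # miss: fetch from slow memory
        if len(cache) < k:
            cache.add(s)
        else:
            if not (cache - marked):               # all items marked: new phase
                marked.clear()                     # unmark everything
            victim = random.choice(sorted(cache - marked))  # random unmarked item
            cache.remove(victim)
            cache.add(s)
        marked.add(s)
    return misses

print(randomized_marking([1, 2, 3, 4, 1, 2, 3, 4], k=3))   # e.g. 7, as in the example
```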
Analysis
• Let r denote the number of phases of the algorithm
• An item can be:
– Marked
– Unmarked:
• Fresh - it was not marked during the previous phase
• Stale - it was marked during the previous phase
• Let:
– σ denote the stream of requests,
– cost(σ) denote the number of accesses to slow memory made by the algorithm on σ,
– opt(σ) denote the minimum possible cost on stream σ,
– optj(σ) be the number of "misses" of the optimal algorithm in phase j
• Let cj denote the number of requests in the stream to fresh items in phase j

Analysis: fresh items in the optimal solution
(*) After a phase, only items that have been requested in that phase can be stored in the cache
Properties:
• optj(σ) + optj+1(σ) ≥ cj+1
Indeed, in phases j and j+1 together there are at least cj+1 "misses" of the optimal algorithm, since, by (*), the fresh items requested in phase j+1 were not requested in phase j and so could not be present in the cache.
• 2·opt(σ) ≥ Σ_{0≤j<r} [optj(σ) + optj+1(σ)] ≥ Σ_{0≤j<r} cj+1
• opt(σ) ≥ 0.5·Σ_{0<j≤r} cj

Analysis - stale items
Let Xj be the number of misses of the marking algorithm in phase j
• No misses on marked items - they remain in the cache
• cj misses on fresh items in phase j
• At the beginning of phase j all items in the cache are stale - marked by requests in the previous phase, now unmarked
• Consider the i-th request to an unmarked stale item, say item s:
– each of the remaining k − i + 1 stale items is equally likely to be no longer in the cache; at most cj items have been replaced by fresh items, and so s is not in the cache with probability at most cj/(k − i + 1)
• E[Xj] ≤ cj + Σ_{0<i≤k} cj/(k − i + 1) ≤ cj·(1 + Σ_{0<i≤k} 1/(k − i + 1)) = cj·(1 + Hk)

Analysis - conclusions
• Let Xj be the number of misses of the marking algorithm in phase j
• cost(σ) denotes the number of accesses to slow memory made by the algorithm - a random variable
• opt(σ) denotes the minimum possible cost on stream σ - a deterministic value
E[cost(σ)] ≤ Σ_{0<j≤r} E[Xj] ≤ (1 + Hk)·Σ_{0<j≤r} cj ≤ (2·Hk + 2)·opt(σ)

Conclusions
• The randomised marking algorithm for caching is O(ln k)-competitive
• Lower bound of k on the competitiveness of any deterministic caching algorithm: for every deterministic algorithm there is a stream of requests on which it makes at least k times more accesses to slow memory than the optimal processing

Textbook and Exercises
READING:
• Section 13.8
EXERCISES (for volunteers):
• Modify the Randomised Marking algorithm to obtain a k-competitive deterministic algorithm.
• Prove that the Randomised Marking algorithm is at least Hk-competitive.
• Prove that every deterministic caching algorithm is at least k-competitive.
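To see the (2·Hk + 2) bound at work, one can compare the marking algorithm against the off-line optimum on random streams. The slides do not specify the optimal algorithm; the sketch below uses Belady's farthest-in-future eviction rule, which is the classical off-line optimum for caching. It reuses the randomized_marking function from the earlier sketch; all names are ours.

```python
import random

def belady_misses(stream, k):
    """Off-line optimum: evict the item whose next use is farthest in the future."""
    cache, misses = set(), 0
    for t, s in enumerate(stream):
        if s in cache:
            continue
        misses += 1
        if len(cache) == k:
            def next_use(x):                      # position of the next request to x
                for u in range(t + 1, len(stream)):
                    if stream[u] == x:
                        return u
                return float('inf')               # never used again: evict first
            cache.remove(max(cache, key=next_use))
        cache.add(s)
    return misses

k, n_items, length = 8, 20, 500
stream = [random.randrange(n_items) for _ in range(length)]
opt = belady_misses(stream, k)
alg = randomized_marking(stream, k)               # defined in the earlier sketch
print(f"marking: {alg}, optimal: {opt}, ratio: {alg / opt:.2f}")
```

On such streams the observed ratio typically stays far below the worst-case guarantee, which only promises O(ln k).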
Overview
Previous lectures:
• Randomised algorithms
• Basic theory: probability, random variables, expected values
• Algorithms: LV sorting, MC min-cut
• Basic random processes
• Randomised caching
This lecture:
• Multi-access channel protocols

Ethernet
The "dominant" LAN technology:
• cheap: $20 for 1000 Mbps!
• first widely used LAN technology
• simpler and cheaper than token LANs and ATM
• kept up with the speed race: 10, 100, 1000 Mbps
[Figure: Metcalfe's original Ethernet sketch]

Ethernet Frame Structure
The sending adapter encapsulates an IP datagram (or other network-layer protocol packet) in an Ethernet frame
Preamble:
• 7 bytes with the pattern 10101010 followed by one byte with the pattern 10101011
• used to synchronise the receiver and sender clock rates

Ethernet Frame Structure (more)
• Addresses: 6 bytes
– if the adapter receives a frame with a matching destination address, or with the broadcast address (e.g. an ARP packet), it passes the data in the frame to the network-layer protocol
– otherwise, the adapter discards the frame
• Type: indicates the higher-layer protocol (mostly IP, but others may be supported, such as Novell IPX and AppleTalk)
• CRC: checked at the receiver; if an error is detected, the frame is simply dropped

Unreliable, connectionless service
• Connectionless: no handshaking between the sending and receiving adapters
• Unreliable: the receiving adapter doesn't send acks or nacks to the sending adapter
– the stream of datagrams passed to the network layer can have gaps
– the gaps will be filled if the application uses TCP
– otherwise, the application will see the gaps

Random Access Protocols
• When a node has a packet to send:
– it transmits at the full channel data rate R
– there is no a priori coordination among nodes
• Multiple-access channel:
– one transmitting node at a time -> successful access/transmission
– two or more transmitting nodes at a time -> collision (no success)
• A random access MAC protocol specifies:
– how to detect collisions
– how to recover from collisions (e.g., via delayed retransmissions)
• Examples of random access MAC protocols:
– ALOHA (slotted, unslotted)
– CSMA (CSMA/CD, CSMA/CA)

Slotted ALOHA
Assumptions:
• all frames have the same size
• time is divided into equal-size slots, each the time to transmit one frame
• nodes start to transmit frames only at the beginnings of slots
• nodes are synchronised
• if 2 or more nodes transmit in a slot, all nodes detect the collision
Operation:
• when a node obtains a fresh frame, it transmits in the next slot
• if there is no collision, the node can send a new frame in the next slot
• if there is a collision, the node retransmits the frame in each subsequent slot with probability p until success

Slotted ALOHA
Pros:
• a single active node can continuously transmit at the full rate of the channel
• highly decentralised: only the slots of the nodes need to be in sync
• simple
Cons:
• collisions, wasting slots
• idle slots
• nodes may be able to detect a collision in less than the time to transmit a packet

Slotted ALOHA: analysis
Suppose that k stations want to transmit in the same slot. The probability that exactly one station transmits in the next slot is k·p·(1−p)^{k−1}
• If k ≤ 1/(2p) then k·p·(1−p)^{k−1} = Θ(k·p), and applying an analysis similar to the coupon collector problem we get that the average number of slots until all k stations have transmitted successfully is Θ(1/p + 1/(2p) + … + 1/(k·p)) = Θ((1/p)·Hk) = Θ((1/p)·ln k)
• If k > 1/(2p) then k·p·(1−p)^{k−1} = Θ(k·p/e^{k·p}), hence the expected time even for the first successful transmission is Θ((1/p)·e^{k·p}/k)
Conclusion: the choice of the probability matters! (See the simulation sketch below.)
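A small simulation sketch (not from the slides; names ours) that illustrates the conclusion: with k = 10 backlogged stations, a transmission probability near 1/k drains the backlog quickly, while p = 0.5 puts us in the k > 1/(2p) regime and the exponential e^{k·p} penalty shows up clearly.

```python
import random

def slots_until_all_sent(k, p):
    """Slotted ALOHA: k backlogged stations, each transmits w.p. p per slot.
    Returns the number of slots until every station has succeeded."""
    pending, slots = k, 0
    while pending > 0:
        slots += 1
        transmitters = sum(random.random() < p for _ in range(pending))
        if transmitters == 1:          # exactly one transmission: success
            pending -= 1               # two or more: collision, nobody succeeds
    return slots

k = 10
for p in (0.5, 1 / k):
    avg = sum(slots_until_all_sent(k, p) for _ in range(200)) / 200
    print(f"p = {p:.3f}: average {avg:.0f} slots")
```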
CSMA (Carrier Sense Multiple Access)
CSMA: listen before transmitting:
• If the channel is sensed idle: transmit the entire frame
• If the channel is sensed busy: defer the transmission
• Human analogy: don't interrupt others!

CSMA/CD (Collision Detection)
CSMA/CD: carrier sensing and deferral as in CSMA
– collisions are detected within a short time
– colliding transmissions are aborted, reducing channel wastage
• collision detection:
– easy in wired LANs: measure signal strengths, compare transmitted and received signals
– difficult in wireless LANs: the receiver is shut off while transmitting
• human analogy: the polite conversationalist

Ethernet uses CSMA/CD
• No slots
• An adapter doesn't transmit if it senses that some other adapter is transmitting, that is, carrier sense
• A transmitting adapter aborts when it senses that another adapter is transmitting, that is, collision detection
• Before attempting a retransmission, an adapter waits a random time, that is, random access

Ethernet CSMA/CD algorithm
1. The adapter gets a datagram and creates a frame
2. If the adapter senses the channel idle, it starts to transmit the frame. If it senses the channel busy, it waits until the channel is idle and then transmits
3. If the adapter transmits the entire frame without detecting another transmission, the adapter is done with the frame!
4. If the adapter detects another transmission while transmitting, it aborts and sends a jam signal
5. After aborting, the adapter enters exponential backoff: after the m-th collision, if m < M, the adapter chooses K at random from {0, 1, 2, …, 2^m − 1}, waits K·512 bit times and returns to Step 2

Ethernet's CSMA/CD (more)
Jam signal: makes sure all other transmitters are aware of the collision; 48 bits
Bit time: 0.1 microseconds for 10 Mbps Ethernet; for K = 1023, the wait time is about 50 msec
See/interact with the Java applet on the AWL Web site: highly recommended!
Exponential backoff:
• Goal: adapt the retransmission attempts to the estimated current load
– heavy load: the random wait will be longer
• first collision: choose K from {0, 1}; the delay is K × 512 bit transmission times
• after the second collision: choose K from {0, 1, 2, 3} …
• after ten collisions, choose K from {0, 1, 2, 3, 4, …, 1023}

Ethernet CSMA/CD modified algorithm
1. The adapter gets a datagram and creates a frame; K := 0
2. If the adapter senses the channel idle, it starts to transmit the frame. If it senses the channel busy, it waits until the channel is idle and then transmits
3. If the adapter transmits the entire frame without detecting another transmission, the adapter is done with the frame!
4. If the adapter detects another transmission while transmitting, it aborts and sends a jam signal
5. After aborting, the adapter enters modified exponential backoff: after the m-th collision, if m < M, the adapter
• waits (2^{m−1} − K)·512 bit times
• chooses a new K at random from {0, 1, 2, …, 2^m − 1}, waits K·512 bit times and returns to Step 2

Modified Exponential Backoff: analysis
Suppose some k stations start the protocol at the same time. The time for a given packet out of the k packets to be successfully transmitted is O(k) with probability at least 1/4:
• Consider the value of window such that 0.5·window ≤ k < window; since the windows double, the time required to reach this window size is O(k)
• The probability that a given packet is transmitted successfully during the run of the loop for this value of window is at least (1 − 1/window)^{k−1} > (1 − 1/k)^k > 1/4
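A minimal simulation sketch of this windowed backoff, not from the slides and with names ours. It abstracts away carrier sensing and jam signals: in each window every pending station picks one slot uniformly, a slot chosen by exactly one station is a success, and the window then doubles for the stations that remain.

```python
import random

def backoff_rounds(k, max_exponent=16):
    """Windowed exponential backoff for k initially colliding stations.
    Returns the total number of slots until all k packets are sent
    (or until the window cap is reached)."""
    pending, total_slots, m = k, 0, 1
    while pending > 0 and m <= max_exponent:
        window = 2 ** m                            # current contention window
        choices = [random.randrange(window) for _ in range(pending)]
        for slot in set(choices):
            if choices.count(slot) == 1:           # a unique slot: success
                pending -= 1
        total_slots += window
        m += 1                                     # double the window for the rest
    return total_slots

k = 16
print(sum(backoff_rounds(k) for _ in range(100)) / 100, "slots on average")
```

Consistent with the analysis, the average is dominated by the first window of size roughly 2k, i.e. O(k) slots.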
Modified Exponential Backoff: analysis cont.
Suppose some k stations start the protocol at the same time. The time for all k packets to be successfully transmitted is O(k²) with probability at least 1/2:
• Consider the value of window such that 0.5·window ≤ k² < window; the time required to reach this window size is O(k²)
• The probability that there is any collision during the run of the loop for this value x = window is at most
(k'·(k'−1)/2)·(1/x²)·x = (k'·(k'−1)/2)·(1/x) < (k'·(k'−1)/2)·(1/k²) < 1/2,
where k' ≤ k is the number of packets that have not been successfully transmitted before: there are k'·(k'−1)/2 pairs of stations that may collide, each pair collides in a given slot with probability 1/x², and there are x slots available for collisions

Ethernet Technologies: 10Base2
• 10: 10 Mbps; 2: under 200 metres maximum cable length
• thin coaxial cable in a bus topology
• repeaters used to connect multiple segments
– each segment up to 30 nodes and up to 185 metres long
– a maximum of 5 segments
• a repeater repeats the bits it hears on one interface to its other interfaces: a physical-layer device only!
• has become a legacy technology

10BaseT and 100BaseT
• 10/100 Mbps rate; the latter is called "fast ethernet"
• T stands for Twisted Pair
• Nodes connect to a hub: "star topology"; 100 m maximum distance between a node and the hub
• Hubs are essentially physical-layer repeaters:
– bits coming in on one link go out on all other links
– no frame buffering
– adapters detect collisions
– the hub provides network management functionality, e.g. disconnection of malfunctioning adapters/hosts

Gbit Ethernet
• uses the standard Ethernet frame format
• allows for point-to-point links and shared broadcast channels
• in shared mode, CSMA/CD is used; distances between nodes must be short to be efficient
• uses hubs, called here "Buffered Distributors"
• Full-Duplex at 1 Gbps for point-to-point links
• 10 Gbps now!

Textbook and Exercises
READING:
• Section 13.1
EXERCISES (for volunteers):
• Exercise 3 from Chapter 13