Randomized Quicksort. Ch. 5, CLRS. Dr. M. Sakalli, Marmara University. [Picture: 2006, RPI]

Randomized Algorithms
o Randomize the algorithm so that it works well with high probability on all inputs.
o In this case, randomize the order in which the candidates arrive.
o Universal hash functions: randomize the selection of the hash function to use.

Probabilistic Analysis and Randomized Algorithms: the Hiring Problem
o n candidates are interviewed for a position, one interview each day; if the new candidate is better qualified, hire the new one and fire the current employee.
o Best case: meet the most qualified candidate at the 1st interview.

  for k = 1 to n
      pay the interview cost ci
      if candidate(k) is better than the current best
          best = k                // sack the current employee and hire(k)
          pay the hiring (and sacking) cost ch

o Worst-case cost (candidates arrive in increasing order of quality): n*ci + n*ch; since ch > ci, this is O(n*ch).
o Best-case (least) cost: n*ci + ch, since every candidate is still interviewed but only the first one is hired. Prove that the hiring problem costs Ω(n) in every case.
o Suppose the applicants arrive in a random order of qualification, with each of the n! permutations of the ranks 1 through n equally likely: a uniform random permutation.

Probabilistic Analysis and Randomized Algorithms
o An algorithm is randomized not merely because its behavior depends on the input, but because its output is also determined by values produced by a random-number generator.
o Example: a C-style rand(a, b) with both boundaries inclusive, giving equally likely (equiprobable) outputs X = {x_i} for i = 1..n.
o Expected value: E[X] = Σ_{i=1}^{n} x_i Pr{X = x_i}, each value weighted by its probability.
o In the hiring case, define an indicator random variable X_i = I{candidate i is hired}, just like a coin: 1 for heads, 0 for tails.

o Three fair coins; every head earns $3, every tail loses $2. Expected earnings:
o HHH = $9, probability 1/8
o HHT = $4, probability 3/8
o HTT = -$1, probability 3/8
o TTT = -$6, probability 1/8
o E[earnings] = 9/8 + 12/8 - 3/8 - 6/8 = 12/8 = $1.50

Probabilistic Analysis and Randomized Algorithms
o Lemma 5.1. Given a sample space S and an event A in S, let X_A = I{A} be the indicator of the occurrence of A. Then E[X_A] = Pr{A}.
o Proof: From the definition of expected value, E[X_A] = 1 * Pr{A} + 0 * Pr{not A} = Pr{A}, where (not A) = S - A.
o In the binary, uniformly distributed case, E[X_A] = Pr{A} = 1/2.

o X = Σ_{i=1}^{n} X_i, where X_i indicates whether candidate i is hired.
o The expected value of X_i is the probability that candidate i is hired: E[X_i] = Pr{candidate i is hired}. Candidate i is hired only if better than each of the previous i-1 candidates; under a uniform random permutation that probability is 1/i, so E[X_i] = Pr{X_i = 1} = 1/i.
o Expected number of hirings (the average number hired out of n candidates arriving in random rank order):
o E[X] = E[Σ_{i=1}^{n} X_i] = Σ_{i=1}^{n} E[X_i], by linearity of expectation.
o Σ_{i=1}^{n} 1/i = 1 + 1/2 + 1/3 + ... = H_n, the nth harmonic number (the series diverges as n grows).
o Bounding the sum by integrals of 1/x: ln(n+1) = ∫_1^{n+1} (1/x) dx <= Σ_{i=1}^{n} 1/i, and Σ_{i=2}^{n} 1/i <= ∫_1^{n} (1/x) dx = ln(n), so Σ_{i=1}^{n} 1/i <= ln(n) + 1.
o Therefore E[X] = ln(n) + O(1): the expected number of hirings over all n candidates.
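A quick way to sanity-check the E[X] = ln(n) + O(1) result is a short simulation. The sketch below is not part of the original notes and its function and variable names are illustrative; it draws uniform random permutations, counts the hires the hiring algorithm makes, and compares the average against H_n and ln(n) + 1.

  // Sketch: estimate the expected number of hires in the hiring problem
  // by simulation and compare with H_n = 1 + 1/2 + ... + 1/n ~ ln(n) + O(1).
  #include <algorithm>
  #include <cmath>
  #include <iostream>
  #include <numeric>
  #include <random>
  #include <vector>

  int countHires(const std::vector<int>& rank) {
      int best = -1, hires = 0;
      for (int r : rank)                         // interview in arrival order
          if (r > best) { best = r; ++hires; }   // better than all seen so far: hire
      return hires;
  }

  int main() {
      const int n = 1000, trials = 10000;
      std::mt19937 gen(42);
      std::vector<int> rank(n);
      std::iota(rank.begin(), rank.end(), 1);    // qualification ranks 1..n

      double total = 0;
      for (int t = 0; t < trials; ++t) {
          std::shuffle(rank.begin(), rank.end(), gen);  // uniform random permutation
          total += countHires(rank);
      }
      double Hn = 0;
      for (int i = 1; i <= n; ++i) Hn += 1.0 / i;

      std::cout << "simulated E[X] = " << total / trials
                << ", H_n = " << Hn
                << ", ln(n)+1 = " << std::log(n) + 1 << '\n';
  }

For n = 1000 the simulated average should land close to H_1000, which is roughly 7.49, well below the ln(n) + 1 bound.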
o An upper bound on the expected number of hirings is therefore ln(n) + 1.
o Lemma 5.2. When the candidates are presented in random order, the expected hiring cost is O(ch ln n). The proof follows from Lemma 5.1 and the bound on E[X] above.
o In the case of a fair coin, Pr{heads} = 1/2, so for n tosses E[X] = Σ_{i=1}^{n} 1/2 = n/2.

Biased Coin
o Suppose you want to output 0 and 1, each with probability 1/2.
o You have a coin that outputs 1 with probability p and 0 with probability 1-p, for some unknown 0 < p < 1.
o Can you use this coin to output 0 and 1 fairly?
o What is the expected running time to produce the fair output, as a function of p?

o Let S_i be the probability that we successfully hire the best-qualified candidate AND this candidate is the ith one interviewed.
o Let M(j) = the candidate among candidates 1 through j with the highest score.
o What needs to happen for S_i to be true?
  - The best candidate is in position i: event B_i.
  - No candidate in positions k+1 through i-1 is hired: event O_i.
  - These two events are independent, so we can multiply their probabilities to get S_i (see Computing S below).

Characterizing the running time of a randomized algorithm
o E[T(n)] = Σ_i t_i Pr{T(n) = t_i}, summing over the possible running times t_i.

  void qksrt( vector<int> & data )
  {
      RandGen rand;
      qksrt( data, 0, data.size() - 1, rand );
  }

  void qksrt( vector<int> & dt, int srt, int end, RandGen & rand )
  {
      if( srt < end )
      {
          // pick the pivot position uniformly at random from [srt, end]
          int bnd = partition( dt, srt, end, rand.RandInt( srt, end ) );
          qksrt( dt, srt, bnd, rand );
          qksrt( dt, bnd + 1, end, rand );
      }
  }

o We are equally likely to get each possible split, since we choose the pivot uniformly at random.
o Express the expected running time as:
o T(0) = 0
o T(n) = Σ_{k=0}^{n-1} (1/n) [T(k) + T(n-k-1) + n]
o T(n) = n + (1/n) Σ_{k=0}^{n-1} [T(k) + T(n-k-1)]
o Bad split: the pivot index k falls in the 1st or 4th quarter of the range [i..j], that is, i <= k <= i + (1/4)(j-i+1) or j - (1/4)(j-i+1) <= k <= j.
o Good split: i + (1/4)(j-i+1) <= k <= j - (1/4)(j-i+1).

o Mixing good and bad splits with equal likelihood:
o T(n) = n + (1/n) Σ_{k=0}^{n-1} [T(k) + T(n-k-1)]
       = n + (2/n) Σ_{k=n/2}^{n-1} [T(k) + T(n-k-1)], by the symmetry of the two subproblems,
       = n + (2/n) { Σ_{k=n/2}^{3n/4} [T(k) + T(n-k-1)] + Σ_{k=3n/4}^{n-1} [T(k) + T(n-k-1)] }
      <= n + (2/n) { Σ_{k=n/2}^{3n/4} [T(3n/4) + T(n/4)] + Σ_{k=3n/4}^{n-1} [T(n-1) + T(0)] }
      <= n + (2/n)(n/4) { [T(3n/4) + T(n/4)] + [T(n-1)] }
       = n + (1/2) { [T(3n/4) + T(n/4)] + [T(n-1)] }.
o Prove that T(n) <= c n log(n) for all n, where T(n) satisfies the bound obtained above: an inductive proof, with the base case n = 0 and the inductive step for general n.
o The probability of many bad splits is very small. With high probability the list is divided into fractional (constant-ratio) pieces, which is enough balance to get the asymptotic n log n running time.
o Inductive step: n + (1/2)[c(3n/4) log(3n/4) + c(n/4) log(n/4)] + (1/2) c(n-1) log(n-1) <= c n log(n) for a large enough constant c.

MIT notes: alternating lucky and unlucky splits
o L(n) = 2U(n/2) + Θ(n)            lucky (balanced split)
o U(n) = L(n-1) + Θ(n)             unlucky (worst-case split)
o L(n) = 2[L(n/2 - 1) + Θ(n/2)] + Θ(n)
o L(n) = 2L(n/2 - 1) + Θ(n)
o L(n) = Θ(n log n)
o And there is more there..

Computing S
o B_i = 1/n
o O_i = k/(i-1)
o S_i = k/(n(i-1))
o S = Σ_{i>k} S_i = (k/n) Σ_{i>k} 1/(i-1) is the probability of success.
o S is roughly (k/n)(H_n - H_k) ≈ (k/n)(ln n - ln k).
o This is maximized when k = n/e,
o which leads to a probability of success of 1/e.
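As a sanity check on the 1/e result, here is a short simulation sketch. It is not part of the original notes and its names are illustrative: it observes the first k = n/e candidates without hiring, hires the first later candidate who beats all of them, and reports how often that candidate is the overall best.

  // Sketch: simulate the on-line hiring strategy with cutoff k = n/e and
  // measure how often it hires the single best candidate (expected ~ 1/e).
  #include <algorithm>
  #include <cmath>
  #include <iostream>
  #include <numeric>
  #include <random>
  #include <vector>

  int main() {
      const int n = 200, trials = 100000;
      const int k = static_cast<int>(n / std::exp(1.0));   // cutoff k = n/e
      std::mt19937 gen(1);
      std::vector<int> score(n);
      std::iota(score.begin(), score.end(), 1);            // best candidate has score n

      int successes = 0;
      for (int t = 0; t < trials; ++t) {
          std::shuffle(score.begin(), score.end(), gen);   // uniform random arrival order
          int bestSeen = *std::max_element(score.begin(), score.begin() + k);
          for (int i = k; i < n; ++i)
              if (score[i] > bestSeen) {                   // first candidate beating the sample
                  if (score[i] == n) ++successes;          // did we hire the overall best?
                  break;
              }
      }
      std::cout << "success rate = " << double(successes) / trials
                << ", 1/e = " << 1.0 / std::exp(1.0) << '\n';
  }

Trials in which the best candidate appears among the first k observe-only candidates count as failures, which is exactly why the success probability tops out near 1/e rather than approaching 1.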