Randomized algorithms

Pasi Fränti
1.10.2014

Treasure island
Treasure worth 20000 awaits
[Figure: island map with two possible treasure sites marked "?", route labels of 5000, and "Map for sale: 3000"; titled "DAA expedition"]

To buy or not to buy
Buy the map: 20000 - 5000 - 3000 = 12000
Take a chance:
  First guess right: 20000 - 5000 = 15000
  First guess wrong: 20000 - 5000 - 5000 = 10000
Expected result: 0.5 ∙ 15000 + 0.5 ∙ 10000 = 12500

Three types of randomization
1. Las Vegas
   - Output is always a correct result
   - Result is not always found
   - Probability of success p
2. Monte Carlo
   - Result is always found
   - Result can be inaccurate (or even false!)
   - Probability of success p
3. Sherwood
   - Balancing the worst-case behavior

Las Vegas

Dining philosophers

Las Vegas
Input: Binary vector A[1, n]
Output: Index of any 1-bit from A

LV(A, n)
  REPEAT
    k ← RAND(1, n);
  UNTIL A[k]=1;
  RETURN k

8-Queens puzzle
INPUT: Eight chess queens and an 8×8 chessboard
OUTPUT: Setup where no queens attack each other

8-Queens brute force
Brute force
• Try all positions
• Mark illegal squares
• Backtrack if dead-end
• 114 setups in total

Where next…?
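The brute-force strategy above (try positions row by row, skip attacked squares, backtrack at a dead end) can be sketched in Python roughly as follows. This is only an illustration, not the code behind the slides: the names `is_free` and `solve_queens`, the 0-based columns, and the choice to return the first solution found are all assumptions made here.

```python
from typing import List, Optional


def is_free(cols: List[int], row: int, col: int) -> bool:
    """True if a queen at (row, col) is not attacked by the queens in rows 0..row-1."""
    for r in range(row):
        c = cols[r]
        # Same column, or same diagonal (column distance equals row distance).
        if c == col or abs(c - col) == row - r:
            return False
    return True


def solve_queens(n: int = 8) -> Optional[List[int]]:
    """Return the first solution found as a list: cols[row] = column of that row's queen."""
    cols: List[int] = []

    def place(row: int) -> bool:
        if row == n:
            return True  # all rows filled
        for col in range(n):  # try all positions in this row, left to right
            if is_free(cols, row, col):
                cols.append(col)
                if place(row + 1):
                    return True
                cols.pop()  # dead end further down: backtrack
        return False

    return cols if place(0) else None
```

Calling `solve_queens(8)` returns one valid placement; `solve_queens(3)` returns `None` because no placement exists on a 3×3 board.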
Random
• Select positions randomly
• If dead-end, start over

Randomized
• Select k rows randomly
• Rest rows by Brute Force

8-Queens(k) Pseudo code

  FOR i=1 TO k DO                      // k Queens randomly
    r ← Random[1,8];
    IF Board[i,r]=TAKEN THEN RETURN Fail;
    ELSE ConquerSquare(i,r);

  FOR i=k+1 TO 8 DO                    // Rest by Brute Force
    r ← 1; found ← NO;
    WHILE (r≤8) AND (NOT found) DO
      IF Board[i,r] NOT TAKEN THEN
        ConquerSquare(i,r); found ← YES;
      ELSE r ← r+1;
    IF NOT found THEN RETURN Fail;

ConquerSquare(i,j)
  Board[i,j] ← QUEEN;
  FOR z=i+1 TO 8 DO                    // skip columns outside 1..8
    Board[z,j] ← TAKEN;
    Board[z,j-(z-i)] ← TAKEN;
    Board[z,j+(z-i)] ← TAKEN;

Probability of success
s = processing time in case of success
e = processing time in case of failure
p = probability of success
q = 1-p = probability of failure

$t = ps + q(e + t)$
$t = ps + qe + qt$
$t - qt = ps + qe$
$pt = ps + qe$
$t = s + \frac{q}{p} e$

Example: s=e=1, p=1/6 ⇒ t = 1 + (5/6)/(1/6) ∙ 1 = 6

Experiments with varying k
(S, E, T, P correspond to s, e, t, p above; no E is listed for k=0 and k=1, where P is 100%)

k    S      E      T      P
0    114    -      114    100%
1    39.6   -      39.6   100%
2    22.5   36.7   25.2   88%
3    13.5   15.1   29.0   49%
4    10.3   8.8    35.1   26%
5    9.3    7.3    46.9   16%
6    9.1    7      53.5   14%
7    9      7      56.0   13%
8    9      7      56.0   13%

Fastest expected time among the runs that place queens randomly: k=2 (T = 25.2)

Swap-based clustering
[Figure annotations: "One centroid, but two clusters." and "Two centroids, but only one cluster."]

Clustering by Random Swap
P. Fränti and J. Kivijärvi, "Randomised local search algorithm for the clustering problem", Pattern Analysis and Applications, 3 (4), 358-369, 2000.

RandomSwap(X) → C, P
  C ← SelectRandomRepresentatives(X);
  P ← OptimalPartition(X, C);
  REPEAT T times
    (Cnew, j) ← RandomSwap(X, C);              // Select random neighbor
    Pnew ← LocalRepartition(X, Cnew, P, j);
    Cnew, Pnew ← Kmeans(X, Cnew, Pnew);
    IF f(Cnew, Pnew) < f(C, P) THEN            // Accept only if it improves
      (C, P) ← Cnew, Pnew;
  RETURN (C, P);

Clustering by Random Swap
1. Random swap: $c_j \leftarrow x_i, \quad j = \mathrm{random}(1, M), \; i = \mathrm{random}(1, N)$
2. Re-partition vectors from old cluster: $p_i \leftarrow \arg\min_{1 \le k \le M} d(x_i, c_k)^2 \quad \forall i \mid p_i = j$
3. Create new cluster: $p_i \leftarrow \arg\min_{k = j \,\vee\, k = p_i} d(x_i, c_k)^2 \quad \forall i \in [1, N]$

Choices for swap
O(M) clusters to be removed × O(M) clusters where to add = O(M²) different choices in total
Swap is made from centroid rich area to centroid poor area.

Probability for successful Swap
Select a proper centroid for removal:
- M clusters in total: p_removal = 1/M
Select a proper new location:
- N choices: p_add = 1/N
- M of them significantly different: p_add = 1/M
In total:
- M² significantly different swaps
- Probability of each is p_swap = 1/M²
- Open question: how many of these are good
- Theorem: α choices are good for add and α for removal, so α² of the M² swaps are good

Clustering by Random Swap
Probability of not finding a good swap when iterated T times: $q = \left(1 - \frac{\alpha^2}{M^2}\right)^T$
Estimated number of iterations: $\log q = T \log\left(1 - \frac{\alpha^2}{M^2}\right) \;\Rightarrow\; T = \frac{\log q}{\log\left(1 - \alpha^2/M^2\right)}$

Bounds for the iterations
Upper limit: $T = \frac{\ln q}{\ln\left(1 - \alpha^2/M^2\right)} \le \frac{-\ln q}{\alpha^2/M^2} = -\ln q \cdot \frac{M^2}{\alpha^2}$
Lower limit similarly; resulting in: $T = \Theta\left(-\ln q \cdot \frac{M^2}{\alpha^2}\right)$

Total time complexity
Time complexity of single step (t): $t = O(\alpha N)$
Number of iterations needed (T): $T = \Theta\left(-\ln q \cdot \frac{M^2}{\alpha^2}\right)$
Total time: $T(N, M) = O\left(-\ln q \cdot \frac{M^2}{\alpha^2} \cdot \alpha N\right) = O\left(\frac{-\ln q \cdot N M^2}{\alpha}\right)$

Monte Carlo

Monte Carlo
Input: A bit vector A[1, n], iterations I
Output: An index of any 1 bit from A

MC(A, n, I)
  i ← 0;
  DO
    k ← RAND(1, n);
    i ← i + 1;
  WHILE (A[k]≠1 AND i < I)
  RETURN k                             // may point to a 0-bit if all I tries missed

Monte Carlo
Potential problems to be considered:
• Detecting prime numbers
• Calculating integral of a function

Sherwood

Selection of pivot element
Something about Quicksort and Selection:
• Practical example of re-sorting
• Median selection
[Figure: worst-case partitioning into 1 and N-1, 1 and N-2, 1 and N-3, … elements: O(N²)]

Simulated dynamic linked list
1. Sorted array
   - Search efficient: O(log N)
   - Insert and Delete slow: O(N)
2. Dynamically linked list
   - Insert and Delete fast: O(1)
   - Search inefficient: O(N)

Simulated dynamic linked list
Example
Linked list: Head → 1 → 2 → 4 → 5 → 7 → 15 → 21
Simulated by array (Head=4):

i      1   2   3   4   5   6   7
Value  2   4   15  1   5   21  7
Next   2   5   6   1   7   0   3

Simulated dynamic linked list
Divide-and-conquer with randomization

SEARCH (A, x)                          // x = value searched
  i := A.HEAD; max := A[i].VALUE;
  FOR k:=1 TO √N DO                    // √N random breakpoints
    j:=RANDOM(1, N); y:=A[j].VALUE;
    IF (max<y) AND (y≤x) THEN          // biggest breakpoint ≤ x
      i:=j; max:=y;
  RETURN LinearSearch(A, x, i);        // full search from breakpoint i

Analysis of the search
• Divide into √N segments
• Each segment has N/√N = √N elements
• Linear search within one segment
• Expected time complexity = √N + √N = O(√N)
[Figure: from the breakpoint "max", √N elements (on average) remain before the value searched for]

Experiment with students
Data (N=100) consists of numbers from 1..100: 1 2 3 4 … 99 100
Select √N breaking points.
Searching for… 42
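The SEARCH routine above can be sketched in Python along the following lines. Treat it as a rough illustration under stated assumptions: the number of random breakpoints is taken to be √N, the arrays are the slide's example shifted to 0-based indices, every array slot is assumed to belong to the list, and the names `NIL`, `linear_search` and `sherwood_search` are invented here.

```python
import math
import random
from typing import List

# Slide example shifted to 0-based indices (assumption); NIL marks the end of the list.
NIL = -1
VALUE = [2, 4, 15, 1, 5, 21, 7]
NEXT = [1, 4, 5, 0, 6, NIL, 2]
HEAD = 3  # index of the smallest value (1)


def linear_search(value: List[int], nxt: List[int], x: int, start: int) -> int:
    """Follow the links from start; return the index holding x, or NIL if absent."""
    i = start
    while i != NIL and value[i] < x:
        i = nxt[i]
    return i if i != NIL and value[i] == x else NIL


def sherwood_search(value: List[int], nxt: List[int], head: int, x: int) -> int:
    """Sample about sqrt(N) random slots, then search linearly from the best one."""
    n = len(value)
    i, best = head, value[head]
    for _ in range(math.isqrt(n)):
        j = random.randrange(n)
        y = value[j]
        if best < y <= x:  # biggest breakpoint seen so far that is still <= x
            i, best = j, y
    return linear_search(value, nxt, x, i)
```

For example, `sherwood_search(VALUE, NEXT, HEAD, 7)` returns the index whose value is 7, and a value that is not stored (such as 42) yields `NIL`. The random sampling only affects how far the linear search has to walk, not whether the answer is correct.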
