Document

advertisement
Jan 12, WSAC 2010
Randomized Algorithms
Kyomin Jung
KAIST
Applied Algorithm Lab
1
Randomness in Algorithms

A Probabilistic Turing Machine is a TM given with a
(binary) random “coin”.

In the computation process of the PTM, PTM can toss
a coin to decide its decision.

Ex) Computer games, random screen savor…

Note: in computer programming, the random number
generating function (ex, rand() in C) takes an initial
random seed, and generates a “deterministic”
sequence of numbers. Hence they are not “truly”
random.
2
Why randomness can be helpful?

A Simple example
 Suppose we want to check whether an integer set
A  {a1, a2 , a3...,an} has an even number or not.

Even when A has n/2 many even numbers, if we
run a Deterministic Algorithm, it may check n/2+1
many elements in the worst case.

A Randomized Algorithm: At each time, choose an
elements (to check) at random.
 Smooths the “worst case input distribution” into
“randomness of the algorithm”
3
Property of “independent” random trials

Law of Large Number
 If we repeat independent random samplings many
times according to a fixed distribution D, the “the
average becomes close to the expectation” (ex: dice
rolling)

Ex) Erdos-Renyi Random Graph G(n,p). Voting poll …
4
Chernoff Bound
• Suppose we have a coin with probability of heads is p.
• Let X be a random variable where X=1 if the coin flip
gives heads and X=0 otherwise.
E[X] = 1*p + 0*(1-p) = p
• After flipping a coin w times we can estimate the heads
prob by average of xi.
• The Chernoff Bound tells us that this estimate
converges exponentially fast to the true mean p.
w


1
Pr p  w  xi     exp   2 w
i 1




5
Las Vegas vs Monte Carlo

A Las Vegas algorithm is a randomized algorithm that
always gives correct results

Ex: randomized quick sort

A Monte Carlo algorithm is one whose running time is
deterministic, but whose output may be correct only
with a certain probability.

Class BPP (Bounded-error, Probabilistic, Polynomial time)

problem which has a Monte Carlo algorithm to solve it
6
Randomized complexity classes

Model: probabilistic Turing Machine

BPP
 L  BPP if there is a p.p.t. TM M:
x  L  Pry[M(x,y) accepts] ≥ 2/3
x  L  Pry[M(x,y) rejects] ≥ 2/3
 “p.p.t”
= probabilistic polynomial time
 y : (a sequence of) random coin tosses
7
Randomized complexity classes

RP (Random Polynomial-time)
 L  RP if there is a p.p.t. TM M:
x  L  Pry[M(x,y) accepts] ≥ ½
x  L  Pry[M(x,y) rejects] = 1

coRP (complement of Random Polynomial-time)
 L  coRP if there is a p.p.t. TM M:
x  L  Pry[M(x,y) accepts] = 1
x  L  Pry[M(x,y) rejects] ≥ ½
8
Error reduction for BPP

given L, and p.p.t. TM M:
x  L  Pry[M(x,y) accepts] ≥ ½ + ε
x  L  Pry[M(x,y) rejects] ≥ ½ + ε

new p.p.t. TM M’:
 simulate M k/ε2 times, each time with independent
coin flips
 accept if majority of simulations accept
 otherwise reject
9
Error reduction for BPP
 Xi
random variable indicating “correct” outcome
in i-th simulation (out of m = k/ε2 )
 Pr[Xi = 1] ≥ ½ + ε
 Pr[Xi = 0] ≤ ½ - ε
 E[Xi] ≥ ½+ε
 X = ΣiXi
 μ = E[X] ≥ (½ + ε)m
 By
Chernoff Bound: Pr[X ≤ m/2] ≤ 2-(ε
2 m)
10
Derandomization

Question: is BPP=P?

Goal: try to simulate BPP in polynomial time
use Pseudo-Random Generator (PRG):

seed
G
t bits

Good if t=O(log m)

Blum-Micali-Yao PRG
seed length t = mδ
output string
m bits
11
Example of randomized algorithm :
Minimum Cut Problem
C
A
B
D
Note: Size of the min cut must is no larger
than the smallest node degree in graph
12
Application: Internet Minimum Cut
13
Randomized Algorithm (by David Karger)

While |V| > 2:
a random edge (x, y) from E
 Contract the edge:
 Pick
Keep multi-edges, remove self-loops
 Combine nodes


The two remaining nodes represent
reasonable choices for the minimum cut sets
14
Analysis

Suppose C is a minimum cut (set of edges that
disconnects G)

When we contract edge e:
 Unlikely that e  C
 So, C is likely to be preserved
What is the probability a randomly
chosen edge is in C?
15
Random Edge in C?

|C| must be  degree of every node in G

How many edges in G:
|E| = sum of all node degrees / 2
 n |C| / 2
Probability a random edge is in C  2/n
16
Iteration





How many iterations? n - 2
Probability for first iteration:
Prob(e1  C)  1 – 2/n
Probability for second iteration:
Prob(e2  C | e1  C)  1 – 2/(n-1)
...
Probability for last iteration:
Prob(en-2  C |…)  1 – 2/(n-(n-2-1))  1 – 2/3
17
Probability of finding C?
 1  n2 1  n21 1  n22 ...1  23 


n2
n
  ...   
n3
n1
n4
n2
3
5
2
4
1
3
2
n ( n1)
Probability of not finding C
= 1 – 2/(n(n-1))
18
Probability of finding C?
Probability of not finding C on one trial:
 1 – 2/(n(n-1))  1 – 2/n2
Probability of not finding C on k trials:
 [1 – 2/n2]k
If k = cn2,
Prob failure  (1/e)c
Recall: lim (1 – 1/x)x = 1/e
x
19
Example of randomized algorithm :
Random Sampling

What is a random sampling?
a probability distribution  , pick a point
according to  .
 e.g. Monte Carlo method for integration
 Given

Choose numbers uniformly at random from the
integration domain, and sum up the value of f at
those points
20
How to use Random Sampling?

Volume computation in Euclidean space.
21
Hit and Run



Hit and Run algorithm is used to sample from a
convex set in an n-dimensional Eucliden space.
Assume that we can evaluate its ratios at given
points
It converges in O(n3 ) time.
22
Example of randomized algorithm :
Markov Chain Monte Carlo


Let   (1 ,...,  n ) be a probability density function
on S={1,2,..n}.
f(‧) is any function on S and we want to estimate
n
I  E ( f )   f (i ) i.
i 1

Construct P={Pij}, the transition matrix of an
irreducible Markov chain with states S, where
Pij  Pr{X t 1  j | X t  i}, X t  S
and π is its unique stationary distribution.
23
Markov Chain Monte Carlo

Run this Markov chain for times t=1,…,N and calculate
the Monte Carlo sum
ˆI  1
N
N
 f {X },
t
t 1
then Iˆ  I as N  .
24
Applied Algorithm Lab (http://aa.kaist.ac.kr/)
Research Projects









Markov Random Field and Image segmentation
Network consensus algorithm design
Community detection in complex networks
Routing/scheduling algorithm analysis in
wireless networks
Motion surveillance in video images
Character recognition in images
SAT solver design based on formula structures
Decentralized control algorithms for multi-agent
robot system
Financial data mining
Members
Professor
Kyomin Jung
Graduate Students
Yongsub Lim
Boyoung Kim Nam-ju Kwak
Download