CSE 5311 Notes 1: Mathematical Preliminaries
(Last updated 3/6/16 2:46 PM)
Chapter 1 - Algorithms & Computing
Relationship between complexity classes, e.g. $\log n$, $n$, $n \log n$, $n^2$, $2^n$, etc.
Chapter 2 - Getting Started
Loop Invariants and Design-by-Contract:
http://dl.acm.org.ezproxy.uta.edu/citation.cfm?doid=2578702.2506375
RAM Model
Assumed
Reality
Memory
Access
Arithmetic
Word size
Chapter 3 - Asymptotic Notation
$f(n) = O(g(n))$: $g(n)$ bounds $f(n)$ above (by a constant factor)
$f(n) = \Omega(g(n))$: $g(n)$ bounds $f(n)$ below (by a constant factor)
$f(n) = \Theta(g(n))$: $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$
Iterated Logarithms
$\log^k n = (\log n)^k$
$\log^{(k)} n = \log(\log^{(k-1)} n) = \log \log \cdots \log n$
($\log^{(1)} n = \log n$)
$\lg^* n$ = Count the times that you can punch a log key on your calculator before the value is $\le 1$
Arises for algorithms that run in “practically” or “nearly” constant time.
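The calculator description of the iterated logarithm can be sketched directly in Python. This is a minimal illustration, not part of the original notes; the function name `lg_star` is made up here, and base 2 is assumed for the log key.

```python
import math

def lg_star(n: float) -> int:
    """Count how many times lg must be applied before the value is <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count

# 65536 -> 16 -> 4 -> 2 -> 1 takes four applications of lg.
print([lg_star(x) for x in (1, 2, 4, 16, 65536)])
```

Even for inputs as large as 2^65536 the count is only 5, which is why such bounds are called "nearly" constant.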
Appendix A - Summations
Review:
Geometric Series (p. 1147)
Harmonic Series (p. 1147)
Approximation by integrals (p. 1154)
Chapter 4 - Recurrences
SUBSTITUTION METHOD - Review
1. Guess the bound.
2. Prove using (strong) math. induction.
Exercise:
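$$T(n) = 4T\left(\frac{n}{2}\right) + n$$
Try $O(n^3)$ and confirm by math induction
Assume $T(k) \le ck^3$ for $k < n$
$$T\left(\frac{n}{2}\right) \le c\frac{n^3}{8}$$
$$\begin{aligned}
T(n) &= 4T\left(\frac{n}{2}\right) + n \\
     &\le 4c\frac{n^3}{8} + n \\
     &= \frac{c}{2}n^3 + n \\
     &= cn^3 - \frac{c}{2}n^3 + n \\
     &\le cn^3
\end{aligned}$$
since $-\frac{c}{2}n^3 + n \le 0$ if $n$ is sufficiently large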
Improve bound to $O(n^2)$ and confirm
Assume $T(k) \le ck^2$ for $k < n$
$$T\left(\frac{n}{2}\right) \le c\frac{n^2}{4}$$
$$T(n) = 4T\left(\frac{n}{2}\right) + n \le 4c\frac{n^2}{4} + n = cn^2 + n \quad \text{STUCK!}$$
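$\Omega(n^3)$ as lower bound:
Assume $T(k) \ge ck^3$ for $k < n$
$$T\left(\frac{n}{2}\right) \ge c\frac{n^3}{8}$$
$$\begin{aligned}
T(n) &= 4T\left(\frac{n}{2}\right) + n \\
     &\ge 4c\frac{n^3}{8} + n \\
     &= \frac{c}{2}n^3 + n \\
     &= cn^3 - \frac{c}{2}n^3 + n \qquad \left(-\frac{c}{2}n^3 + n \ge 0\,?\right) \\
     &\ge cn^3 \quad \text{DID NOT PROVE!!!}
\end{aligned}$$
$\Omega(n^2)$ as lower bound:
Assume $T(k) \ge ck^2$ for $k < n$
$$T\left(\frac{n}{2}\right) \ge c\frac{n^2}{4}$$
$$T(n) = 4T\left(\frac{n}{2}\right) + n \ge 4c\frac{n^2}{4} + n = cn^2 + n \ge cn^2 \text{ for } 0 < c$$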
What’s going on . . .
T (1) = d
T (2) = 4T (1) + 2 = 4d + 2
T ( 4) = 4T (2) + 4 = 4 ( 4d + 2) + 4 = 16d + 8 + 4 = 16d + 12
T (8) = 4T ( 4) + 8 = 4 (16d + 12) + 8 = 64d + 48 + 8 = 64d + 56
T (16) = 4T (8) + 16 = 4 (64d + 56) + 16 = 256d + 224 + 16 = 256d + 240
Hypothesis: $T(n) = dn^2 + n^2 - n = (d+1)n^2 - n = cn^2 - n$
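The hypothesis can be checked numerically before attempting the induction. The sketch below evaluates the recurrence directly for powers of 2 and compares it with $(d+1)n^2 - n$; the helper names are illustrative only.

```python
def T(n: int, d: int) -> int:
    """T(n) = 4 T(n/2) + n with T(1) = d, for n a power of 2."""
    if n == 1:
        return d
    return 4 * T(n // 2, d) + n

def closed_form(n: int, d: int) -> int:
    """Hypothesized solution (d + 1) n^2 - n."""
    return (d + 1) * n * n - n

# Compare the recurrence with the hypothesis for several base values d.
for d in (0, 1, 5):
    for k in range(11):
        n = 2 ** k
        assert T(n, d) == closed_form(n, d)
print(T(16, 1), closed_form(16, 1))
```

Agreement on small cases is not a proof, but it suggests subtracting the lower-order term $n$ in the inductive hypothesis.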
$O(n^2)$:
Assume $T(k) \le ck^2 - k$ for $k < n$
$$T\left(\frac{n}{2}\right) \le c\frac{n^2}{4} - \frac{n}{2}$$
$$T(n) = 4T\left(\frac{n}{2}\right) + n \le 4\left(c\frac{n^2}{4} - \frac{n}{2}\right) + n = cn^2 - 2n + n = cn^2 - n$$
$\Omega(n^2)$: [This was already proven.]
Assume $T(k) \ge ck^2 - k$ for $k < n$
$$T\left(\frac{n}{2}\right) \ge c\frac{n^2}{4} - \frac{n}{2}$$
$$T(n) = 4T\left(\frac{n}{2}\right) + n \ge 4\left(c\frac{n^2}{4} - \frac{n}{2}\right) + n = cn^2 - 2n + n = cn^2 - n$$
Exercise:
$$T(n) = T\left(\sqrt{n}\right) + 1$$
$O(\log n)$:
Assume $T(k) \le c \lg k$ for $k < n$
$$T\left(\sqrt{n}\right) \le c \lg \sqrt{n} = c\frac{\lg n}{2}$$
$$T(n) = T\left(\sqrt{n}\right) + 1 \le c\frac{\lg n}{2} + 1 = c \lg n - c\frac{\lg n}{2} + 1 \le c \lg n \text{ if } c \ge 2$$
$O(\log\log n)$:
Assume $T(k) \le c \lg\lg k$ for $k < n$. (Note: $\log_a \log_a n \in \Theta(\log_b \log_b n)$)
$$T\left(\sqrt{n}\right) \le c \lg\lg \sqrt{n} = c \lg\frac{\lg n}{2} = c \lg\lg n - c$$
$$T(n) = T\left(\sqrt{n}\right) + 1 \le c \lg\lg n - c + 1 \le c \lg\lg n \text{ if } c \ge 1$$
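$\Omega(\log\log n)$:
Assume $T(k) \ge c \lg\lg k$ for $k < n$
$$T\left(\sqrt{n}\right) \ge c \lg\lg \sqrt{n} = c \lg\frac{\lg n}{2} = c \lg\lg n - c$$
$$T(n) = T\left(\sqrt{n}\right) + 1 \ge c \lg\lg n - c + 1 \ge c \lg\lg n \text{ if } 0 < c \le 1$$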
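A quick way to see the $\lg\lg n$ behavior is to count how many square roots are needed to bring $n$ down to a constant. The sketch below assumes the recursion stops at $n \le 2$ (the notes do not fix a base case) and uses integer square roots so that inputs of the form $2^{2^k}$ are handled exactly.

```python
import math

def sqrt_steps(n: int) -> int:
    """Times T(n) = T(sqrt(n)) + 1 recurses before n <= 2 (assumed base case)."""
    steps = 0
    while n > 2:
        n = math.isqrt(n)
        steps += 1
    return steps

# For n = 2^(2^k), lg lg n = k, and exactly k square roots are taken.
for k in range(7):
    print(k, sqrt_steps(2 ** (2 ** k)))
```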
Substitution Method is a general technique - See CSE 2320 Notes 08 for quicksort analysis
RECURSION-TREE METHOD - Review and Prelude to Master Method
Convert to summation and then evaluate
Example: Mergesort, p. 38 $T(n) = 2T\left(\frac{n}{2}\right) + n$ (case 2 for master method)
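Exercise:
$$T(n) = 4T\left(\frac{n}{2}\right) + n$$
Recursion tree ($\lg n + 1$ levels):
Subproblem | Contribution of one node | Total for level
$T(n)$ | $n$ | $n$
$T(n/2)$ | $n/2$ | $4n/2 = 2n$
$T(n/4)$ | $n/4$ | $16n/4 = 4n$
$T(n/8)$ | $n/8$ | $64n/8 = 8n$
$T(n/16)$ | $n/16$ | $256n/16 = 16n$
. . .
$T(2) = T(n/2^{\lg n - 1})$ | $2$ | $4^{\lg n - 1} \cdot n/2^{\lg n - 1} = 2^{\lg n - 1} \cdot n$
$T(1)$ | $1$ | $4^{\lg n} = n^{\lg 4} = n^2$
Using definite geometric sum formula:
$$\begin{aligned}
cn \sum_{k=0}^{\lg n - 1} 2^k + cn^2 &= cn\,\frac{2^{\lg n} - 1}{2 - 1} + cn^2 \\
 &= cn(n - 1) + cn^2 \\
 &= cn^2 - cn + cn^2 = \Theta(n^2)
\end{aligned}$$
(Case 1 of master method)
Using $\sum_{k=0}^{t} x^k = \frac{x^{t+1} - 1}{x - 1}$, $x \ne 1$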
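Exercise: (case 3 of master method)
$$T(n) = 2T\left(\frac{n}{2}\right) + n^3$$
Recursion tree ($\lg n + 1$ levels):
Subproblem | Contribution of one node | Total for level
$T(n)$ | $n^3$ | $n^3$
$T(n/2)$ | $n^3/8$ | $2n^3/8 = n^3/4$
$T(n/4)$ | $n^3/64$ | $4n^3/64 = n^3/16$
$T(n/8)$ | $n^3/512$ | $8n^3/512 = n^3/64$
$T(n/16)$ | $n^3/4096$ | $16n^3/4096 = n^3/256$
. . .
$T(2) = T(n/2^{\lg n - 1})$ | $8$ | $2^{\lg n - 1}(n/2^{\lg n - 1})^3 = n^3/4^{\lg n - 1}$
$T(1)$ | $1$ | $2^{\lg n} = n$
Using indefinite geometric sum formula:
$$\begin{aligned}
cn^3 \sum_{k=0}^{\lg n - 1} \frac{1}{4^k} + cn &\le cn^3 \sum_{k=0}^{\infty} \frac{1}{4^k} + cn \\
 &= cn^3\,\frac{1}{1 - \frac{1}{4}} + cn \\
 &= \frac{4}{3}cn^3 + cn = O(n^3)
\end{aligned}$$
From $\sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}$, $0 < x < 1$
Using definite geometric sum formula:
$$\begin{aligned}
cn^3 \sum_{k=0}^{\lg n - 1} \frac{1}{4^k} + cn &= cn^3\,\frac{\left(\frac{1}{4}\right)^{\lg n} - 1}{\frac{1}{4} - 1} + cn \\
 &= cn^3\,\frac{n^{-2} - 1}{-\frac{3}{4}} + cn = cn^3\,\frac{1 - n^{-2}}{\frac{3}{4}} + cn \\
 &= \frac{4}{3}cn^3\left(1 - \frac{1}{n^2}\right) + cn = \Theta(n^3)
\end{aligned}$$
Using $\sum_{k=0}^{t} x^k = \frac{x^{t+1} - 1}{x - 1}$, $x \ne 1$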
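The level-by-level bookkeeping of the two recursion-tree exercises can be sketched as a loop that sums (number of nodes) × (cost per node) for each level. This is an illustrative check, assuming $c = 1$, $T(1) = 1$, and $n$ an exact power of $b$; `tree_total` is a name invented for this sketch.

```python
def tree_total(a: int, b: int, f, n: int) -> int:
    """Sum per-level costs of T(n) = a T(n/b) + f(n), T(1) = 1, n a power of b."""
    total, nodes, size = 0, 1, n
    while size > 1:
        total += nodes * f(size)   # all nodes on this level cost f(size) each
        nodes *= a                 # each node spawns a children
        size //= b                 # children work on size/b
    return total + nodes           # leaves: one unit of work apiece

for k in range(1, 6):
    n = 2 ** k
    leaves_dominate = tree_total(4, 2, lambda m: m, n)       # 4T(n/2) + n
    root_dominates = tree_total(2, 2, lambda m: m ** 3, n)   # 2T(n/2) + n^3
    print(n, leaves_dominate, root_dominates)
```

With these assumptions the totals match the closed forms from the geometric sums: $n(n-1) + n^2 = 2n^2 - n$ for the first exercise and $\frac{4}{3}n^3\left(1 - \frac{1}{n^2}\right) + n$ for the second.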
MASTER METHOD/THEOREM (“new”) - CLRS 4.5
$$T(n) = aT\left(\frac{n}{b}\right) + f(n), \quad a \ge 1, \; b > 1$$
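Three mutually exclusive cases (proof sketched in 4.6.1):
1. $f(n) = O\left(\frac{n^{\log_b a}}{n^{\varepsilon}}\right)$, $\varepsilon > 0$ $\Rightarrow$ $T(n) = \Theta\left(n^{\log_b a}\right)$ (leaves dominate)
2. $f(n) = \Theta\left(n^{\log_b a}\right)$ $\Rightarrow$ $T(n) = \Theta\left(n^{\log_b a} \log n\right)$ (each level contributes equally)
3. $f(n) = \Omega\left(n^{\log_b a + \varepsilon}\right)$, $\varepsilon > 0$, and $a f\left(\frac{n}{b}\right) \le c f(n)$ for some constant $c < 1$ and all $n \ge n_0$ $\Rightarrow$ $T(n) = \Theta(f(n))$ (root dominates)
(Problem 4.6-3 shows that $\varepsilon > 0$ in 3. follows from the existence of $c$ and $n_0$. By taking $n = b^k n_0$ and the condition on $f$, $\left(\frac{a}{c}\right)^{\log_b n - \log_b n_0} f(n_0) \le f(n)$ may be established. Last step is $\varepsilon \le \log_b \frac{1}{c}$.)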
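When the driving function is a plain polynomial, $f(n) = \Theta(n^d)$, choosing a case reduces to comparing $d$ with $\log_b a$. The sketch below covers only that special situation (an assumption of this example, not of the theorem); for such $f$ the regularity condition in case 3 holds automatically with $c = a/b^d < 1$. The function name `master_case` is hypothetical.

```python
import math

def master_case(a: float, b: float, d: float) -> str:
    """Classify T(n) = a T(n/b) + Theta(n^d); polynomial f(n) only."""
    crit = math.log(a, b)                     # critical exponent log_b a
    if math.isclose(d, crit, abs_tol=1e-9):   # tolerate floating-point error
        return f"case 2: Theta(n^{d:g} log n)"
    if d < crit:
        return f"case 1: Theta(n^{crit:g})"
    return f"case 3: Theta(n^{d:g})"

# (a, b, d) triples; d is the exponent of the driving function.
for a, b, d in [(10, 10, 0.5), (4, 2, 1), (9, 3, 2), (2, 2, 3), (1, 2, 1)]:
    print(a, b, d, master_case(a, b, d))
```

Driving functions such as $n \lg n$ fall outside this sketch and must be checked against the theorem directly.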
Example:
$$T(n) = 10T\left(\frac{n}{10}\right) + \sqrt{n}$$
$a = 10$, $b = 10$, $f(n) = n^{.5}$, $n^{\log_{10} 10} = n$
Case 1:
$n^{.5} = O(n^{1-\varepsilon})$, $0 < \varepsilon \le 0.5$
$\Rightarrow T(n) = \Theta(n)$
Example: (recursion tree given earlier - leaves dominate)
$$T(n) = 4T\left(\frac{n}{2}\right) + n$$
$a = 4$, $b = 2$, $f(n) = n$, $n^{\log_2 4} = n^2$
Case 1:
$n = O(n^{2-\varepsilon})$, $0 < \varepsilon \le 1$
$\Rightarrow T(n) = \Theta(n^2)$
Example:
$$T(n) = 9T\left(\frac{n}{3}\right) + 2n^2$$
$a = 9$, $b = 3$, $f(n) = 2n^2$, $n^{\log_3 9} = n^2$
Case 2:
$2n^2 = \Theta(n^2)$
$\Rightarrow T(n) = \Theta(n^2 \log n)$
Recursion Tree
$T(n) = 2n^2 \log_3 n + cn^2 = \Theta(n^2 \log n)$
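Example:
$$T(n) = 2T\left(\frac{n}{2}\right) + n^3$$
$a = 2$, $b = 2$, $f(n) = n^3$, $n^{\lg 2} = n$
Case 3:
$n^3 = \Omega(n^{1+\varepsilon})$, $0 < \varepsilon \le 2$
$a f\left(\frac{n}{b}\right) = 2\left(\frac{n^3}{8}\right) \le cn^3 = cf(n)$, $\frac{1}{4} \le c < 1$
$\Rightarrow T(n) = \Theta(n^3)$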
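Example:
$$T(n) = T\left(\frac{n}{2}\right) + \frac{n}{2}$$
$a = 1$, $b = 2$, $f(n) = n/2$, $n^{\lg 1} = n^0 = 1$
Case 3:
$n/2 = \Omega(n^{0+\varepsilon})$, $0 < \varepsilon \le 1$
$a f\left(\frac{n}{b}\right) = 1 \cdot \frac{n/2}{2} = \frac{n}{4} \le c\,\frac{n}{2} = cf(n)$, $\frac{1}{2} \le c < 1$
$\Rightarrow T(n) = \Theta(n)$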
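From CLRS, p. 95:
$$T(n) = 2T\left(\frac{n}{2}\right) + n \lg n$$
$a = 2$, $b = 2$, $f(n) = n \lg n$, $n^{\lg 2} = n^1 = n$
$f(n) = \omega(n)$, but $f(n) \ne \Omega(n^{1+\varepsilon})$ for any $\varepsilon > 0$ . . .
$$\begin{aligned}
a f\left(\frac{n}{b}\right) = 2 f\left(\frac{n}{2}\right) = 2\,\frac{n}{2} \lg \frac{n}{2} &= n \lg n - n = cn \lg n + (1 - c)\,n \lg n - n \\
 &\le cf(n) = cn \lg n \text{ for } c \ge 1
\end{aligned}$$
No constant $c < 1$ works for all large $n$, so master method (case 3) does not support $T(n) = \Theta(n \lg n)$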
Using exercise 4.6-2 as a hint . . . (or a simple recursion tree with $n = 2^k$)
Substitution method to show $T(n) = \Theta(n \lg^2 n)$
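For $n = 2^k$ the recursion tree has level costs $n \lg n,\; n(\lg n - 1),\; \ldots,\; n \cdot 1$, which suggests $T(n) = n \cdot \frac{\lg n\,(\lg n + 1)}{2}$ when $T(1) = 0$. The sketch below checks that guess numerically; the base case $T(1) = 0$ and the function name are assumptions made for this illustration.

```python
def T_nlgn(n: int) -> int:
    """T(n) = 2 T(n/2) + n lg n, T(1) = 0 (assumed), n a power of 2."""
    if n == 1:
        return 0
    lg_n = n.bit_length() - 1          # exact lg n for powers of 2
    return 2 * T_nlgn(n // 2) + n * lg_n

# Compare with n * k * (k + 1) / 2 where k = lg n.
for k in range(1, 15):
    n = 2 ** k
    assert 2 * T_nlgn(n) == n * k * (k + 1)
print(T_nlgn(1024))
```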
PROBABILISTIC ANALYSIS
Hiring Problem - Interview potential assistants, always hiring the best available.
Input:
Permutation of 1 . . . n
Output:
The number of sequence values that are larger than all previous sequence elements.
3 1 2 4
1 2 3 4
4 3 2 1
Worst-case = n
Observation: For any k-prefix of sequence, the last element is the largest with probability ______.
Summing for all k-prefixes:
$$\ln n < \sum_{k=1}^{n} \frac{1}{k} = H_n \le \ln n + 1$$
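The output of the hiring problem and its expected value can be checked by brute force for small $n$. The sketch below counts left-to-right maxima and averages over all permutations using exact fractions; the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def hires(seq) -> int:
    """Count elements larger than every earlier element."""
    best, count = None, 0
    for x in seq:
        if best is None or x > best:
            best, count = x, count + 1
    return count

def harmonic(n: int) -> Fraction:
    """H_n as an exact fraction."""
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))

n = 4
perms = list(permutations(range(1, n + 1)))
average = Fraction(sum(hires(p) for p in perms), len(perms))
print(hires((3, 1, 2, 4)), hires((1, 2, 3, 4)), hires((4, 3, 2, 1)))
print(average, harmonic(n))
```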
Hiring m-of-n Problem
Observation: For any k-prefix of sequence, the last element is one of the m largest with probability ______.
______ if $k \le m$
______ otherwise
Sum for all k-prefixes (expected number that are hired)
$$m + \sum_{k=m+1}^{n} \frac{m}{k} = m + mH_n - mH_m$$
What data structure supports this application?
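One plausible choice, offered here as an assumption rather than as the notes' intended answer, is a min-heap holding the $m$ largest values seen so far: a new candidate is hired exactly when the heap is not yet full or the candidate beats the heap minimum. The sketch uses Python's `heapq` and verifies the expected count exhaustively for a small case.

```python
import heapq
from fractions import Fraction
from itertools import permutations

def hires_m(seq, m: int) -> int:
    """Count elements that rank among the m largest seen so far."""
    heap, count = [], 0              # min-heap of the m largest values so far
    for x in seq:
        if len(heap) < m:
            heapq.heappush(heap, x)
            count += 1
        elif x > heap[0]:
            heapq.heapreplace(heap, x)   # evict the current m-th largest
            count += 1
    return count

def harmonic(n: int) -> Fraction:
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))

n, m = 5, 2
perms = list(permutations(range(1, n + 1)))
average = Fraction(sum(hires_m(p, m) for p in perms), len(perms))
print(average, m + m * harmonic(n) - m * harmonic(m))
```

Each arrival costs $O(\log m)$ with this structure; other priority-queue implementations would serve equally well.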
Coupon Collecting (Knuth)
n types of coupon. One coupon per cereal box.
How many boxes of cereal must be bought (expected) to get at least one of each coupon type?
Collecting the n coupons is decomposed into n steps:
Step 0 = get first coupon
Step 1 = get second coupon
Step m = get m+1st coupon
Step n - 1 = get last coupon
Number of boxes for step m
Let $p_i$ = probability of needing exactly i boxes (difficult)
$p_i = \left(\frac{m}{n}\right)^{i-1} \frac{n-m}{n}$, so the expected number of boxes for coupon m + 1 is $\sum_{i=1}^{\infty} i p_i$
Let $q_i$ = probability of needing at least i boxes = probability that previous i - 1 boxes are failures
(much easier to use)
So, $p_i = q_i - q_{i+1}$
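$$\begin{aligned}
\sum_{i=1}^{\infty} i p_i &= \sum_{i=1}^{\infty} i (q_i - q_{i+1}) \\
 &= \sum_{i=1}^{\infty} i q_i - \sum_{i=1}^{\infty} i q_{i+1} \\
 &= \sum_{i=1}^{\infty} i q_i - \sum_{i=2}^{\infty} (i - 1) q_i \\
 &= q_1 + \sum_{i=2}^{\infty} i q_i - \sum_{i=2}^{\infty} i q_i + \sum_{i=2}^{\infty} q_i \\
 &= \sum_{i=1}^{\infty} q_i
\end{aligned}$$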
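$q_1 = 1$
$q_2 = \frac{m}{n}$
$q_3 = \left(\frac{m}{n}\right)^2$
$q_k = \left(\frac{m}{n}\right)^{k-1}$
$$\sum_{i=1}^{\infty} q_i = \sum_{i=0}^{\infty} \left(\frac{m}{n}\right)^i = \frac{1}{1 - \frac{m}{n}} = \frac{n}{n-m} = \text{Expected number of boxes for coupon } m + 1$$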
Summing over all steps gives
$$\sum_{i=0}^{n-1} \frac{n}{n-i} = n\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}\right) \le n(\ln n + 1)$$
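The per-step expectation $\frac{n}{n-m}$ and the overall total can be checked with a few lines of Python. This sketch computes the total exactly with fractions and, separately, sums $i\,p_i$ numerically (truncated after a fixed number of terms, an approximation chosen for this example) to confirm it agrees with $\frac{n}{n-m}$.

```python
import math
from fractions import Fraction

def boxes_for_step(n: int, m: int) -> Fraction:
    """Expected boxes to get a new coupon when m of n types are already held."""
    return Fraction(n, n - m)

def expected_boxes(n: int) -> Fraction:
    """Sum of the per-step expectations for steps m = 0 .. n-1."""
    return sum((boxes_for_step(n, m) for m in range(n)), Fraction(0))

def truncated_sum_i_p(n: int, m: int, terms: int = 2000) -> float:
    """Numerically sum i * p_i with p_i = (m/n)^(i-1) * (n-m)/n."""
    r = m / n
    return sum(i * r ** (i - 1) * (1 - r) for i in range(1, terms + 1))

print(expected_boxes(3), float(expected_boxes(10)), 10 * (math.log(10) + 1))
print(truncated_sum_i_p(6, 4), boxes_for_step(6, 4))
```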
Analyze the following game:
n sides on each die, numbered 1 through n.
Roll dice one at a time.
Keep a die only if its number > numbers on dice you already have.
a. What is the expected number of rolls (including those not kept) to get a die numbered n?
$$\sum_{i=0}^{\infty} \left(\frac{n-1}{n}\right)^i = \frac{1}{1 - \frac{n-1}{n}} = \frac{n}{n - (n-1)} = n$$
b. What number of dice do you expect (mathematically) to keep?
Keep going until an n is encountered.
Repeats of a number are discarded.
Results in hiring problem ($H_n \le \ln n + 1$)
Suppose a gambling game involves a sequence of rolls from a standard six-sided die. A player wins $1
when the value rolled is the same as the previous roll. If a sequence has 1201 rolls, what is the
expected amount paid out?
Probability of a roll paying $1 = 1/6. 1200/6 = $200
Suppose a gambling game involves a sequence of rolls from a standard six-sided die. A player wins $1
when the value rolled is larger than the previous roll. If a sequence has 1201 rolls, what is the
expected amount paid out?
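$$\text{Payout for next roll} = \frac{1}{6} \sum_{i=1}^{6} \text{Expected payout after roll of } i = \frac{1}{6}\left(\frac{5}{6} + \frac{4}{6} + \frac{3}{6} + \frac{2}{6} + \frac{1}{6} + \frac{0}{6}\right) = \frac{15}{36} = \frac{5}{12}$$
1200*5/12 = $500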
Suppose a gambling game involves a sequence of rolls from a standard six-sided die. A player wins k
dollars when the value k rolled is smaller than the previous roll. If a sequence has 601 rolls, what is
the expected amount paid out?
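$$\frac{1}{6}\left(1 \cdot \frac{5}{6} + 2 \cdot \frac{4}{6} + 3 \cdot \frac{3}{6} + 4 \cdot \frac{2}{6} + 5 \cdot \frac{1}{6} + 6 \cdot \frac{0}{6}\right) = \frac{5 + 8 + 9 + 8 + 5}{36} = \frac{35}{36}$$
600*35/36 = $583.33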
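The three per-roll expectations above can be confirmed by enumerating the 36 equally likely (previous, current) pairs and applying linearity of expectation. A minimal sketch with illustrative names:

```python
from fractions import Fraction
from itertools import product

def expected_payout(pay) -> Fraction:
    """Average pay(previous, current) over the 36 equally likely roll pairs."""
    pairs = list(product(range(1, 7), repeat=2))
    return Fraction(sum(pay(prev, cur) for prev, cur in pairs), len(pairs))

same = expected_payout(lambda prev, cur: 1 if cur == prev else 0)
larger = expected_payout(lambda prev, cur: 1 if cur > prev else 0)
smaller_k = expected_payout(lambda prev, cur: cur if cur < prev else 0)

# 1201 rolls give 1200 consecutive pairs; 601 rolls give 600 pairs.
print(same, larger, smaller_k)
print(1200 * same, 1200 * larger, round(float(600 * smaller_k), 2))
```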
Suppose a gambling game involves a sequence of rolls from a standard six-sided die. What is the
expected number of rolls until the first pair of consecutive sixes appears?
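$$\text{Expected rolls to get first six} = 6 \quad \left(= \sum_{i=0}^{\infty} \left(\frac{6-1}{6}\right)^i = \frac{1}{1 - \frac{6-1}{6}} = \frac{6}{6 - (6-1)}\right)$$
Probability that next roll is not a six = $\frac{5}{6}$
$$\text{Expected rolls for consecutive sixes} = (6 + 1) \sum_{k=0}^{\infty} \left(\frac{5}{6}\right)^k = (6 + 1)\,\frac{1}{1 - \frac{5}{6}} = 42$$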
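The value 42 can be cross-checked with a two-state argument, which is a different route from the geometric-series derivation above. Let $E_0$ be the expected rolls with no six pending and $E_1$ the expected rolls right after one six: $E_0 = 1 + qE_0 + pE_1$ and $E_1 = 1 + qE_0$, where $p$ is the success probability and $q = 1 - p$. The sketch solves these for $E_0$ with exact fractions; the function name is made up.

```python
from fractions import Fraction

def expected_rolls_two_in_a_row(p: Fraction) -> Fraction:
    """Expected trials until two consecutive successes of probability p.

    E0 = 1 + q*E0 + p*E1   (no success pending)
    E1 = 1 + q*E0          (one success pending; another one ends the game)
    Substituting E1 into the first equation and solving for E0.
    """
    q = 1 - p
    return (1 + p) / (1 - q - p * q)

print(expected_rolls_two_in_a_row(Fraction(1, 6)))
```

The same function gives 6 for a fair coin, the familiar expected wait for two heads in a row.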
Also, see:
http://dl.acm.org.ezproxy.uta.edu/citation.cfm?doid=2408776.2408800
http://dl.acm.org.ezproxy.uta.edu/citation.cfm?doid=2428556.2428578
and
along with the book by Grinstead & Snell, especially the first four chapters.
GENERATING RANDOM PERMUTATIONS
PERMUTE-BY-SORTING (p. 125, skim)
Generates randoms in 1 . . . $n^3$ and then sorts to get permutation in $\Theta(n \log n)$ time.
Can use radix/counting sort (CSE 2320 Notes 8) to perform in $\Theta(n)$ time.
RANDOMIZE-IN-PLACE
Array A must initially contain a permutation. Could simply be identity permutation: A[i] = i.
for i=1 to n
swap A[i] and A[RANDOM(i,n)]
Code is equivalent to reaching in a bag and choosing a number to “lock” into each slot.
Uniform - all n! permutations are equally likely to occur.
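A runnable counterpart of RANDOMIZE-IN-PLACE, written as a sketch with 0-based indexing; `rng.randrange(i, n)` plays the role of RANDOM(i, n), and the optional `rng` parameter is an addition for this example so that a seeded generator can be supplied.

```python
import random

def randomize_in_place(a: list, rng=random) -> list:
    """Shuffle a in place: slot i swaps with a random slot in i..n-1."""
    n = len(a)
    for i in range(n):
        j = rng.randrange(i, n)    # 0-based counterpart of RANDOM(i, n)
        a[i], a[j] = a[j], a[i]    # "lock" the chosen value into slot i
    return a

print(randomize_in_place(list(range(1, 11))))
```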
Problem 5.3-3 PERMUTE-WITH-ALL
for i=1 to n
swap A[i] and A[RANDOM(1,n)]
Produces $n^n$ outcomes, but $n!$ does not divide into $n^n$ evenly.
$\Rightarrow$ Not uniform - some permutations are produced more often than others.
Assume n=3 and A initially contains identity permutation. RANDOM choices that give each permutation:
1 2 3: | 1 3 2: | 2 1 3: | 2 3 1: | 3 1 2: | 3 2 1:
1 2 3  | 1 2 2  | 1 1 3  | 1 1 2  | 1 1 1  | 1 2 1
1 3 2  | 1 3 3  | 2 2 3  | 1 3 1  | 2 2 1  | 2 1 1
2 1 3  | 2 1 2  | 2 3 2  | 2 2 2  | 3 2 2  | 3 2 3
3 2 1  | 2 3 1  | 3 1 2  | 2 3 3  | 3 3 3  | 3 3 2
       | 3 1 1  | 3 3 1  | 3 1 3  |        |
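The counts for n = 3 can be reproduced by replaying every possible sequence of RANDOM results through both loops. In this sketch (names are illustrative) `apply_choices` runs the swap loop with a fixed list of 1-based choices, so the only difference between the two procedures is which choices are allowed at step i.

```python
from collections import Counter
from itertools import product

def apply_choices(n: int, choices) -> tuple:
    """Run the swap loop on the identity permutation with fixed RANDOM results."""
    a = list(range(1, n + 1))
    for i, r in enumerate(choices, start=1):
        a[i - 1], a[r - 1] = a[r - 1], a[i - 1]
    return tuple(a)

n = 3
# PERMUTE-WITH-ALL: every step may pick any of 1..n (n^n sequences).
with_all = Counter(apply_choices(n, c)
                   for c in product(range(1, n + 1), repeat=n))
# RANDOMIZE-IN-PLACE: step i picks from i..n (n! sequences).
in_place = Counter(apply_choices(n, c)
                   for c in product(*(range(i, n + 1) for i in range(1, n + 1))))

for perm in sorted(with_all):
    print(perm, with_all[perm], in_place[perm])
```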
RANDOMIZED ALGORITHMS
Las Vegas
Output is always correct.
Time varies depending on random choices (or randomness in input).
Challenge: Determine expected time.
Classic Examples:
Randomized (pivot) quicksort takes expected time in $\Theta(n \log n)$.
Universal hashing
This Semester:
Treaps
Perfect Hashing
Rabin-Karp Text Search
Asides:
Randomized Binary Search - About 50% more probes
List Ranking - Shared Memory and Distributed Versions
(http://ranger.uta.edu/~weems/NOTES4351/09notes.pdf)
Ethernet:
http://en.wikipedia.org/wiki/Exponential_backoff
Valiant-Brebner permutation routing
Monte Carlo
Correct solution with some (large) probability. Use repetition to improve odds.
Usually has one-sided error - one of the outcomes can be wrong
Example - testing primality without factoring
To check N for being prime:
Randomly generate some a with 1 < a < N
If N mod a = 0, then report composite else report prime
One-sided error - prime could be wrong.
(Several observations from number theory are needed to make this robust for avoiding
Carmichael numbers)
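The one-sided behavior of this naive test, and the effect of repetition, can be sketched as follows. The range $1 < a < N$, the `trials` parameter, and the function name are assumptions of this example; it is a toy illustration of one-sided error, not a practical primality test.

```python
import random

def probably_prime(N: int, trials: int = 20, rng=random) -> bool:
    """Monte Carlo trial division for N >= 3.

    A 'composite' report is always right; a 'prime' report may be wrong.
    """
    for _ in range(trials):
        a = rng.randrange(2, N)      # assumed range: 1 < a < N
        if N % a == 0:
            return False             # found a divisor: certainly composite
    return True                      # no divisor found in any trial

print(probably_prime(97), probably_prime(91, trials=200))
```

For a prime N the answer is always "prime"; for a composite N each extra trial only lowers the chance of a wrong "prime" report, which is the repetition idea described above.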
Concept: Repeated application of Monte Carlo technique can lead to:
Improved reliability
Las Vegas algorithm that gives correct result “with high probability”