Chapter 2: Fundamentals of the Analysis of Algorithm Efficiency

Notes on the analysis of multiplication algorithms.
Dr. M. Sakalli, Marmara University
Integer Multiplication (MIT notes and Wikipedia)
Example: classic high-school multiplication.
Let g = A|B and h = C|D, where A, B, C and D are (n/2)-bit integers.
Simple divide-and-conquer method: gh = (2^(n/2)·A + B)(2^(n/2)·C + D),
the same split as above, which expands into 4 multiplication routines:
gh = 2^n·AC + 2^(n/2)·(AD + BC) + BD, plus the carries c.
Long multiplication: r_i = c + Σ_{j+k=i} g_j·h_k, where c is the carry
from the previous column.
Running-time recurrence: T(n) ≤ 4T(n/2) + 100n (4 recursive
multiplications; the 100n term covers additions and carries).
In-place?
T(n) = Θ(n^2)
Provided that neither c nor the total sum exceeds log space: indeed,
a simple inductive argument shows that the carry c and the total sum
for r_i never exceed n and 2n respectively, so both fit in about
lg(2n) bits. Space efficiency: S(n) = O(log log N), where N = gh
(N has up to 2n bits, so log log N = O(log n)).
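The four-product split above can be sketched as runnable Python (a minimal illustration; the name `mult4` and the power-of-two assumption on n are choices made here, not from the notes):

```python
def mult4(g, h, n):
    """Divide-and-conquer multiply using 4 recursive products.

    g, h are n-bit nonnegative integers; n is assumed a power of 2.
    """
    if n == 1:
        return g * h
    half = n // 2
    A, B = g >> half, g & ((1 << half) - 1)   # g = 2^(n/2)*A + B
    C, D = h >> half, h & ((1 << half) - 1)   # h = 2^(n/2)*C + D
    AC = mult4(A, C, half)
    AD = mult4(A, D, half)
    BC = mult4(B, C, half)
    BD = mult4(B, D, half)
    # gh = 2^n*AC + 2^(n/2)*(AD + BC) + BD
    return (AC << n) + ((AD + BC) << half) + BD
```

The recurrence T(n) = 4T(n/2) + O(n) falls straight out of the four recursive calls plus the shifted additions on the last line.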
M, Sakalli, CS246 Design & Analysis of Algorithms, Lecture Notes
Pseudocode: log-space multiplication algorithm.
multiply(g[0..n-1], h[0..n-1]) // arrays holding the binary
// representations of g and h
x ← 0
for i = 0 : 2n-1
    for j = 0 : i
        k ← i - j
        x ← x + (g[j] × h[k]) // taking g[j] = h[k] = 0 outside 0..n-1
    r[i] ← x mod 2            // low bit of the column sum
    x ← floor(x/2)            // x is the carry into the next column
end
end
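The pseudocode above can be checked with a direct Python transcription (representation assumed here: equal-length bit lists, least-significant bit first):

```python
def multiply_logspace(g, h):
    """Schoolbook multiplication over bit arrays. Besides the output,
    only the running value x (carry plus column sum) is kept, and x
    stays O(log n) bits wide.

    g, h: little-endian bit lists of equal length n.
    Returns the 2n-bit product, little-endian.
    """
    n = len(g)
    r = [0] * (2 * n)
    x = 0
    for i in range(2 * n):
        for j in range(i + 1):
            k = i - j
            if j < n and k < n:          # bits outside 0..n-1 are 0
                x += g[j] * h[k]
        r[i] = x % 2                     # low bit of the column sum
        x //= 2                          # carry into the next column
    return r
```

For example, 11 × 3 with g = [1, 1, 0, 1] and h = [1, 1, 0, 0] yields the bits of 33.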
See also the lattice method (Muhammad ibn Musa al-Khwarizmi) and
Gauss's complex multiplication algorithm.
Karatsuba’s algorithm: polynomial extensions.
g = g1·10^(n/2) + g2
h = h1·10^(n/2) + h2
g·h = g1·h1·10^n + (g1·h2 + g2·h1)·10^(n/2) + g2·h2
(g1·h2 + g2·h1) = (g1 + g2)(h1 + h2) - (g1·h1 + g2·h2), so f(n) =
4 sums + 1 more final sum = 5n for n > 2; suppose it is a constant
times n, say 100n, plus some carries.
In the binary split:
XY = (2^n + 2^(n/2))·AC - 2^(n/2)·(A-B)(C-D) + (2^(n/2)+1)·BD
A(n) = 3A(n/2) + 5n,
A(n) = O(n^(lg 3)) ≈ O(n^1.585)
Base case: a constant (say 7) when n < 2.
Karatsuba(g, h : n-digit integers; n : integer)
// returns a (2n)-digit integer
g1, g2, h1, h2; // (n/2)-digit integers
U, V, W;        // n-digit integers
begin
if n == 1 then
    return g(0) * h(0);
else
    g1 ← g(n-1) ... g(n/2);
    g2 ← g(n/2-1) ... g(0);
    h1 ← h(n-1) ... h(n/2);
    h2 ← h(n/2-1) ... h(0);
    U ← Karatsuba(g1, h1, n/2);
    V ← Karatsuba(g2, h2, n/2);
    W ← Karatsuba(g1+g2, h1+h2, n/2);
    return U·10^n + (W - U - V)·10^(n/2) + V;
end if;
end Karatsuba;
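A runnable Python version of the same scheme, splitting in base 10 to match the 10^(n/2) notation of the slides (the digit-count handling is a choice made here, not from the notes):

```python
def karatsuba(g, h):
    """Karatsuba multiplication with 3 recursive products."""
    if g < 10 or h < 10:
        return g * h                       # base case: single digit
    n = max(len(str(g)), len(str(h)))
    half = n // 2
    p = 10 ** half
    g1, g2 = divmod(g, p)                  # g = g1*10^half + g2
    h1, h2 = divmod(h, p)                  # h = h1*10^half + h2
    U = karatsuba(g1, h1)                  # g1*h1
    V = karatsuba(g2, h2)                  # g2*h2
    W = karatsuba(g1 + g2, h1 + h2)        # (g1+g2)(h1+h2)
    # middle term g1*h2 + g2*h1 recovered as W - U - V
    return U * 10 ** (2 * half) + (W - U - V) * p + V
```

Three recursive calls instead of four is exactly what turns the exponent from lg 4 = 2 into lg 3 ≈ 1.585.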
FFT and Fast Matrix multiplication.
Quarter-square multiplier (Everett L. Johnson, 1980):
gh = {(g + h)^2 - (g - h)^2}/4 = {(g^2 + 2gh + h^2) - (g^2 - 2gh + h^2)}/4
Think of a hardware implementation with a lookup table (converter).
The difficulty is that the sum of two 8-bit numbers requires at
least 9 bits, and 18 bits once squared. But if each square is
divided by 4 before the subtraction (discarding the remainder,
which is legitimate because g + h and g - h have the same parity,
so the discarded remainders cancel), the table entries narrow and
the identity still holds exactly.
For single decimal digits the table index runs from 0 to 9+9 = 18,
with entries floor(x^2/4) from 0 to 81; each product needs one add,
one subtract and two lookups, with working space S(n) = Θ(n).
E.g. 7 by 3: the sum and difference are 10 and 4 respectively.
Looking both values up in the table yields 25 and 4, the
difference of which is 21.
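The lookup-table scheme, sketched in Python for 8-bit operands (table size and names are illustrative assumptions):

```python
# Precomputed table of floor(x^2 / 4). For 8-bit operands the sum
# g + h needs 9 bits, so 2^9 = 512 entries suffice.
QSQ = [x * x // 4 for x in range(512)]

def quarter_square_mult(g, h):
    """gh = floor((g+h)^2/4) - floor((g-h)^2/4).

    Exact because g+h and g-h have the same parity, so the two
    discarded remainders cancel.
    """
    return QSQ[g + h] - QSQ[abs(g - h)]
```

For the worked example in the notes, `quarter_square_mult(7, 3)` looks up 25 and 4 and returns 21.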
Russian (Egyptian) peasant binary multiplication
Shift and add. An in-place algorithm; may be implemented in 2n
space. Try more complex examples.
11 × 3, in binary 1011 × 11 (halve the left column, double the right,
add the right value whenever the left is odd):
11    3      1011     11    11 odd  → add    11
 5    6       101    110     5 odd  → add   110
 2   12        10   1100     2 even → skip
 1   24         1  11000     1 odd  → add 11000
Sum: 3 + 6 + 24 = 33 = 100001
T(n) = Θ(n) iterations but O(n^2) bit operations in total; think
about why. S(n) = Θ(log log(n)), which is the carry.
Is it invertible? Division: a potential question.
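The halve/double table above is a few lines of Python (the name `peasant_mult` is a choice made here):

```python
def peasant_mult(g, h):
    """Russian peasant multiplication: halve g, double h, and add h
    to the total whenever g is odd (shift-and-add)."""
    total = 0
    while g > 0:
        if g & 1:        # low bit set: this power of two contributes
            total += h
        g >>= 1          # halve
        h <<= 1          # double
    return total
```

Each iteration examines one bit of g, which is where the Θ(n) iteration count comes from.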
Matrix multiplication: 8 multiplications, O(n^3).

[A11 A12] [B11 B12]   [C11 C12]
[A21 A22] [B21 B22] = [C21 C22]

C11 = A11·B11 + A12·B21
C12 = A11·B12 + A12·B22
C21 = A21·B11 + A22·B21
C22 = A21·B12 + A22·B22

Pseudocode for MM:
MM(A, B)
for i ← 1 : N
    for j ← 1 : N
        C(i, j) ← 0;
        for k ← 1 : N
            C(i, j) ← C(i, j) + A(i, k) * B(k, j)
end, end, end
The time complexity of this algorithm is n^3 multiplications and additions.
Can we do better using divide and conquer?
Subdivide each matrix into four (n/2)×(n/2) sub-matrices.
T(n) = b for n ≤ 2,
T(n) = 8T(n/2) + cn^2 for n > 2, which has T(n) = O(n^3).
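The four-sub-matrix recursion can be sketched in Python with plain lists of lists (a minimal sketch assuming n is a power of 2; helper names are illustrative):

```python
def block_mm(A, B):
    """Divide-and-conquer 2x2 block multiply with 8 recursive
    products. A, B: n x n lists of lists, n a power of 2."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    m = n // 2

    def quad(M, r, c):                     # extract an m x m block
        return [row[c:c + m] for row in M[r:r + m]]

    def add(X, Y):                         # elementwise block sum
        return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

    A11, A12, A21, A22 = quad(A,0,0), quad(A,0,m), quad(A,m,0), quad(A,m,m)
    B11, B12, B21, B22 = quad(B,0,0), quad(B,0,m), quad(B,m,0), quad(B,m,m)
    # the four block formulas: 8 recursive multiplications
    C11 = add(block_mm(A11, B11), block_mm(A12, B21))
    C12 = add(block_mm(A11, B12), block_mm(A12, B22))
    C21 = add(block_mm(A21, B11), block_mm(A22, B21))
    C22 = add(block_mm(A21, B12), block_mm(A22, B22))
    return [r1 + r2 for r1, r2 in zip(C11, C12)] + \
           [r1 + r2 for r1, r2 in zip(C21, C22)]
```

With 8 recursive products this is still T(n) = 8T(n/2) + cn^2 = O(n^3); the point of Strassen's algorithm on the next slide is to cut the 8 to 7.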
Strassen’s Algorithm

P1 = (A11 + A22)(B11 + B22)
P2 = (A21 + A22)·B11
P3 = A11·(B12 - B22)
P4 = A22·(B21 - B11)
P5 = (A11 + A12)·B22
P6 = (A21 - A11)(B11 + B12)
P7 = (A12 - A22)(B21 + B22)

C11 = P1 + P4 - P5 + P7
C12 = P3 + P5
C21 = P2 + P4
C22 = P1 + P3 - P2 + P6

Strassen: 7 multiplies, 18 additions
T(n) = b, n2,
T(n) = 7T(n/2) + (7m+18s)n2, n>2, which has T(n) = O(n2.81)
7n2(1/4+1/16+…)
Strassen-Winograd: 7 multiplies, 15 additions.
Coppersmith-Winograd: O(n^2.376) (not easily implementable).
In practice Strassen is faster than the classical algorithm (the
hidden constants are not large) for relatively small n ~ 64 and
up, and usually stable, but it has been demonstrated that for some
matrices Strassen and Strassen-Winograd are too unstable.
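The seven products and four recombinations can be written out directly (a sketch assuming n is a power of 2, with illustrative helper names; real implementations switch to the classical algorithm below a cutoff size):

```python
def strassen(A, B):
    """Strassen's 7-multiplication scheme on 2x2 block matrices.
    A, B: n x n lists of lists, n a power of 2."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    m = n // 2

    def quad(M, r, c):                     # extract an m x m block
        return [row[c:c + m] for row in M[r:r + m]]

    def add(X, Y):
        return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

    def sub(X, Y):
        return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

    A11, A12, A21, A22 = quad(A,0,0), quad(A,0,m), quad(A,m,0), quad(A,m,m)
    B11, B12, B21, B22 = quad(B,0,0), quad(B,0,m), quad(B,m,0), quad(B,m,m)
    P1 = strassen(add(A11, A22), add(B11, B22))
    P2 = strassen(add(A21, A22), B11)
    P3 = strassen(A11, sub(B12, B22))
    P4 = strassen(A22, sub(B21, B11))
    P5 = strassen(add(A11, A12), B22)
    P6 = strassen(sub(A21, A11), add(B11, B12))
    P7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(P1, P4), P5), P7)    # P1 + P4 - P5 + P7
    C12 = add(P3, P5)                      # P3 + P5
    C21 = add(P2, P4)                      # P2 + P4
    C22 = add(sub(add(P1, P3), P2), P6)    # P1 + P3 - P2 + P6
    return [r1 + r2 for r1, r2 in zip(C11, C12)] + \
           [r1 + r2 for r1, r2 in zip(C21, C22)]
```

Only 7 recursive calls appear, giving the T(n) = 7T(n/2) + O(n^2) recurrence above.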