imult

advertisement
Karatsuba’s Algorithm for Integer
Multiplication
Jeremy R. Johnson
1
Introduction
• Objective: To derive a family of asymptotically fast integer
multiplication algorithms using polynomial interpolation
–
–
–
–
–
–
–
Karatsuba’s Algorithm
Polynomial algebra
Interpolation
Vandermonde Matrices
Toom-Cook algorithm
Polynomial multiplication using interpolation
Faster algorithms for integer multiplication
References: Lipson, Cormen et al.
2
Karatsuba’s Algorithm
• Using the classical pen and paper algorithm two n digit
integers can be multiplied in O(n2) operations. Karatsuba
came up with a faster algorithm.
• Let A and B be two integers with
– A = A110k + A0, A0 < 10k
– B = B110k + B0, B0 < 10k
– C = A*B = (A110k + A0)(B110k + B0)
= A1B1102k + (A1B0 + A0 B1)10k + A0B0
Instead this can be computed with 3 multiplications
• T0 = A0B0
• T1 = (A1 + A0)(B1 + B0)
• T2 = A1B1
• C = T2102k + (T1 - T0 - T2)10k + T0
3
Complexity of Karatsuba’s Algorithm
• Let T(n) be the time to compute the product of two n-digit
numbers using Karatsuba’s algorithm. Assume n = 2k.
T(n) = (nlg(3)), lg(3)  1.58
• T(n)  3T(n/2) + cn
 3(3T(n/4) + c(n/2)) + cn = 32T(n/22) + cn(3/2 + 1)
 32(3T(n/23) + c(n/4)) + cn(3/2 + 1)
= 33T(n/23) + cn(32/22 + 3/2 + 1)
…
 3iT(n/2i) + cn(3i-1/2i-1 + … + 3/2 + 1)
...
 c3k + cn[((3/2)k - 1)/(3/2 -1)] --- Assuming T(1)  c
 c3k + 2c(3k - 2k)  3c3lg(n) = 3cnlg(3)
4
Divide & Conquer Recurrence
Assume T(n) = aT(n/b) + (n)
• T(n) = (n)
[a < b]
• T(n) = (nlog(n)) [a = b]
• T(n) = (nlogb(a))
[a > b]
5
Polynomial Algebra
• Let F[x] denote the set of polynomials in the variable x
whose coefficients are in the field F.
• F[x] becomes an algebra where +, * are defined by
polynomial addition and multiplication.
m
n
A( x)   ai x , B ( x)   b j x
i
i 0
j
j 0
mn
C ( x)  A( x) B ( x)   ck x ,
k
k 0
c
k

a b
k i  j
i
j

min( k , m )
a b
i
i  max( 0 , k  n )
k i
6
Interpolation
• A polynomial of degree n is uniquely determined by its
value at (n+1) distinct points.
Theorem: Let A(x) and B(x) be polynomials of degree m. If
A(i) = B(i) for i = 0,…,m, then A(x) = B(x).
Proof.
Recall that a polynomial of degree m has m roots.
A(x) = Q(x)(x- ) + A(), if A() = 0, A(x) = Q(x)(x- ), and deg(Q)
= m-1
Consider the polynomial C(x) = A(x) - B(x). Since C(i) = A(i) B(i) = 0, for m+1 points, C(x) = 0, and A(x) must equal B(x).
7
Lagrange Interpolation Formula
• Find a polynomial of degree m given its value at (m+1)
distinct points. Assume A(i) = yi

( x  j ) 
y
A( x)  i 0  
 j i (  )  i
i
j 

m
• Observe that

( k  j ) 
y 
A( k )  i 0  
 j i (  )  i
i
j 

m
y
k
8
Matrix Version of Polynomial
Evaluation
• Let A(x) = a3x3 + a2x2 + a1x + a0
• Evaluation at the points , , ,  is obtained from the
following matrix-vector product

 A( )  1
 A(  ) 1


 A( )  1

 
 A( )  1





1
1
1
1




2
2
2
2
 
 3 a0
 

  a1
3
a 

  2
3
a3 

3
9
Matrix Interpretation of Interpolation
• Let A(x) = anxn + … + a1x +a0 be a polynomial of degree n.
The problem of determining the (n+1) coefficients an,…,a1,a0
from the (n+1) values A(0),…,A(n) is equivalent to solving
the linear system
 A( 0)   1

 
 A( 1)    1
 ...  ...

 
 A( n)  1


1
0
1
1
...

1
n
...
...
...
...
  a 
   a 
n
0
n
0
1
1
...  ... 


n 
 n a3
10
Vandermonde Matrix
1

1

V ( 0 ,..., n)  
...

 1


1
0
1
1
...

 0n
...  1 
... ... 

n
...  n 
...
n

det(V )   (  )
i j
1
n
j
i
V(0,…, n) is non-singular when 0,…, n are distinct.
11
Polynomial Multiplication using
Interpolation
• Compute C(x) = A(x)B(x), where degree(A(x)) = m, and
degree(B(x)) = n. Degree(C(x)) = m+n, and C(x) is uniquely
determined by its value at m+n+1 distinct points.
• [Evaluation] Compute A(i) and B(i) for distinct i,
i=0,…,m+n.
• [Pointwise Product] Compute C(i) = A(i)*B(i) for
i=0,…,m+n.
• [Interpolation] Compute the coefficients of C(x) = cnxm+n +
… + c1x +c0 from the points C(i) = A(i)*B(i) for i=0,…,m+n.
12
Interpolation and Karatsuba’s
Algorithm
• Let A(x) = A1x + A0, B(x) = B1x + B0, C(x) = A(x)B(x) = C2x2 +
C 1x + C 0
• Then A(10k) = A, B(10k) = B, and C = C(10k) = A(10k)B(10k) =
AB
• Use interpolation based algorithm:
–
–
–
–
Evaluate A(), A(), A() and B(), B(), B() for  = 0,  = 1, and  =.
Compute C() = A()B(), C() = A() B(), C() = A()B()
Interpolate the coefficients C2, C1, and C0
Compute C = C2102k + C110k + C0
13
Matrix Equation for Karatsuba’s
Algorithm
• Modified Vandermonde Matrix
• Interpolation
 C (0)  1 0 0 c0 
 C (1)   1 1 1  

 
  c1 
C () 0 0 1 c2 
c0   1 0 0  

A
0 B0
  



 c1    1 1  1  A0  A1B0  B1
c   0 0 1  

A
B
1
1


 2
14
Integer Multiplication Splitting the
Inputs into 3 Parts
• Instead of breaking up the inputs into 2 equal parts as is
done for Karatsuba’s algorithm, we can split the inputs into
three equal parts.
• This algorithm is based on an interpolation based
polynomial product of two quadratic polynomials.
• Let A(x) = A2x2 + A1x + A0, B(x) = B2x2 + B1x + B, C(x) =
A(x)B(x) = C4x4 + C3x3 + C2x2 + C1x + C0
• Thus there are 5 products. The divide and conquer part still
takes time = O(n). Therefore the total computing time T(n) =
5T(n/3) + O(n) = (nlog3(5)), log3(5)  1.46
15
Asymptotically Fast Integer
Multiplication
• We can obtain a sequence of asymptotically faster
multiplication algorithms by splitting the inputs into more
and more pieces.
• If we split A and B into k equal parts, then the
corresponding multiplication algorithm is obtained from an
interpolation based polynomial multiplication algorithm of
two degree (k-1) polynomials.
• Since the product polynomial is of degree 2(k-1), we need to
evaluate at 2k-1 points. Thus there are (2k-1) products. The
divide and conquer part still takes time = O(n). Therefore
the total computing time T(n) = (2k-1)T(n/k) + O(n) =
(nlogk(2k-1)).
16
Asymptotically Fast Integer
Multiplication
• Using the previous construction we can find an algorithm
to multiply two n digit integers in time (n1+ ) for any
positive .
– logk(2k-1) = logk(k(2-1/k)) = 1 + logk(2-1/k)
– logk(2-1/k)  logk(2) = ln(2)/ln(k)  0.
• Can we do better?
• The answer is yes. There is a faster algorithm, with
computing time (nlog(n)loglog(n)), based on the fast
Fourier transform (FFT). This algorithm is also based on
interpolation and the polynomial version of the CRT.
17
Download