Elliptic Curve Cryptography over GF(2^m)
on a Reconfigurable Computer:
Polynomial Basis vs. Optimal Normal Basis Representation
Comparative Study
Kris Gaj, Sashisu Bajracharya, Chang Shu, Sang Han
George Mason University
kgaj@gmu.edu, sbajrach@gmu.edu, cshu@gmu.edu
Tarek El-Ghazawi
The George Washington University
tarek@gwu.edu
Reconfigurable Computers are high-end computers based on the close system-level
integration of traditional microprocessors and Field Programmable Gate Arrays (FPGAs).
Public key cryptography is particularly suitable for implementation on FPGAs rather than
traditional microprocessors because of the need for computationally intensive arithmetic
operations with unconventionally long operand sizes of several hundreds or even
thousands of bits. Elliptic Curve Cryptosystems (ECCs) are a family of public key
cryptosystems that has emerged over the last ten years as a leading choice for
use in future communication networks.
In this paper, we present and contrast two fundamentally different families of Elliptic
Curve Cryptosystems from the point of view of their suitability for implementation on a
reconfigurable computer. Both families are based on operations in the Galois Fields
GF(2^m) with m in the range from 160 to 512 bits. They differ in the way the operands are
represented, and in the way the multiplication of two elements of the Galois Field
GF(2^m) is defined.
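To make the difference between the two representations concrete, the following is a minimal Python sketch, not the implementation evaluated in this paper. It assumes a toy field GF(2^4) with the reduction polynomial x^4 + x + 1 and the normal element beta = x^3; these parameters and all function names are chosen only for illustration. It shows shift-and-add multiplication in polynomial basis, and the fact that squaring in a normal basis is a cyclic shift of the coefficient vector.

```python
def gf2m_mul(a, b, m, poly):
    """Multiply a*b in GF(2^m), polynomial basis.

    Elements are ints whose bit i is the coefficient of x^i;
    poly is the reduction polynomial including its x^m term.
    """
    r = 0
    while b:
        if b & 1:
            r ^= a            # add (XOR) the current multiple of a
        b >>= 1
        a <<= 1               # multiply a by x
        if (a >> m) & 1:
            a ^= poly         # reduce modulo the field polynomial
    return r


def normal_basis(beta, m, poly):
    """Return [beta, beta^2, beta^4, ..., beta^(2^(m-1))] in polynomial basis."""
    basis = [beta]
    for _ in range(m - 1):
        basis.append(gf2m_mul(basis[-1], basis[-1], m, poly))
    return basis


def from_normal(coeffs, basis):
    """Convert normal-basis coefficients (list of 0/1) to a polynomial-basis int."""
    value = 0
    for c, b in zip(coeffs, basis):
        if c:
            value ^= b
    return value


def normal_square(coeffs):
    """Squaring in a normal basis: rotate the coefficient vector by one position."""
    return [coeffs[-1]] + coeffs[:-1]


# Toy parameters (assumed for illustration): GF(2^4), f(x) = x^4 + x + 1, beta = x^3.
M, POLY, BETA = 4, 0b10011, 0b1000
BASIS = normal_basis(BETA, M, POLY)

# Cross-check: the rotation agrees with a full polynomial-basis squaring.
for n in range(1 << M):
    c = [(n >> i) & 1 for i in range(M)]
    v = from_normal(c, BASIS)
    assert gf2m_mul(v, v, M, POLY) == from_normal(normal_square(c), BASIS)
```

In hardware the rotation costs only wiring, which is one reason the optimal algorithms for squaring and inversion differ between the two representations.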
Our goal is to determine which of the two possible Galois Field representations,
Polynomial Basis or Optimal Normal Basis (ONB), is more suitable for an
implementation on a reconfigurable computer. This suitability is determined in terms of
both the absolute execution time and the speed-up compared to a purely
microprocessor-based implementation.
As a platform for our experiments, we have chosen one of the first general-purpose,
stand-alone reconfigurable computers available on the market, the SRC-6E from SRC
Computers Inc. This machine allows an application to be executed on two Xilinx
Virtex-II XC2V6000 User FPGAs and two 1 GHz Pentium III (P3) microprocessors.
The first tentative results of the implementation of both classes of Elliptic Curve
Cryptosystems on the SRC-6E have been reported in our earlier publications [1, 2]. While
these publications gave the first rough estimate of the speed-up that can be achieved
with each representation, they used different implementation approaches and optimizations,
and as a result were not well suited for a direct comparison.
In this paper, we attempt to implement both classes of Elliptic Curve Cryptosystems
using very similar techniques and optimizations, allowing differences only for
operations that are fundamentally different in the two investigated representations.
In particular, the following ECC operations, and the ways they are implemented, are
common to both Galois Field representations:
1. Scalar multiplication performed using Montgomery Scalar Multiplication with
Projective Coordinates [3],
2. Elliptic Curve addition and doubling performed in Projective Coordinates [3],
3. Transformation from Projective Coordinates to Affine Coordinates.
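The three common operations listed above can be sketched as follows. This is a minimal Python illustration of the López-Dahab x-coordinate-only ladder from [3] for a curve y^2 + xy = x^3 + a*x^2 + b, not the hardware design described in this paper. The toy field GF(2^4), its reduction polynomial, and all function names are assumptions for illustration; a real design would use m in the 160 to 512 bit range.

```python
# Toy field (assumed for illustration only): GF(2^4), f(x) = x^4 + x + 1.
M = 4
POLY = 0b10011


def gf_mul(a, b):
    """Polynomial-basis multiplication in GF(2^M)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return r


def gf_inv(a):
    """Inverse of a nonzero element as a^(2^M - 2) (Fermat's little theorem)."""
    result, e = 1, (1 << M) - 2
    while e:
        if e & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        e >>= 1
    return result


def m_add(X1, Z1, X2, Z2, x):
    """Projective x-only addition; x is the affine x of the difference P2 - P1."""
    t1 = gf_mul(X1, Z2)
    t2 = gf_mul(X2, Z1)
    Z3 = gf_mul(t1 ^ t2, t1 ^ t2)
    X3 = gf_mul(x, Z3) ^ gf_mul(t1, t2)
    return X3, Z3


def m_double(X, Z, b):
    """Projective x-only doubling: X' = X^4 + b*Z^4, Z' = X^2 * Z^2."""
    X2 = gf_mul(X, X)
    Z2 = gf_mul(Z, Z)
    return gf_mul(X2, X2) ^ gf_mul(b, gf_mul(Z2, Z2)), gf_mul(X2, Z2)


def montgomery_scalar_mult(k, x, y, b):
    """Return k*(x, y) in affine coordinates, or None for the point at infinity.

    Assumes k >= 1, x != 0, and (x, y) on a curve with coefficient b != 0.
    The coefficient a does not appear in these formulas.
    """
    X1, Z1 = x, 1                    # P1 = P
    X2, Z2 = m_double(x, 1, b)       # P2 = 2P, so P2 - P1 = P throughout
    for i in range(k.bit_length() - 2, -1, -1):
        if (k >> i) & 1:
            (X1, Z1), (X2, Z2) = m_add(X1, Z1, X2, Z2, x), m_double(X2, Z2, b)
        else:
            (X2, Z2), (X1, Z1) = m_add(X1, Z1, X2, Z2, x), m_double(X1, Z1, b)
    # Transformation from projective to affine coordinates, recovering y.
    if Z1 == 0:
        return None                  # kP is the point at infinity
    if Z2 == 0:
        return x, x ^ y              # (k+1)P is infinity, so kP = -P
    x1 = gf_mul(X1, gf_inv(Z1))
    t = gf_mul(X1 ^ gf_mul(x, Z1), X2 ^ gf_mul(x, Z2))
    t ^= gf_mul(gf_mul(x, x) ^ y, gf_mul(Z1, Z2))
    t = gf_mul(t, gf_inv(gf_mul(x, gf_mul(Z1, Z2))))
    y1 = gf_mul(x ^ x1, t) ^ y
    return x1, y1
```

For example, the point (1, 6) lies on the toy curve with a = 1 and b = 1, so `montgomery_scalar_mult(3, 1, 6, 1)` computes 3P on that curve. Each ladder step performs one addition and one doubling regardless of the key bit, and the only field inversions occur in the final coordinate conversion.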
The two operations that are specific to a given representation are:
1. Galois Field multiplication and squaring, and
2. Galois Field inversion.
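This abstract does not state which inversion algorithm is used in each representation. As one widely used option, the sketch below shows Itoh-Tsujii inversion, which computes a^(-1) = a^(2^m - 2) with m - 1 squarings and only a few multiplications; it is attractive when squarings are cheap, as in a normal basis. The function names and the choice of fields in the usage note are assumptions for illustration.

```python
def gf2m_mul(a, b, m, poly):
    """Polynomial-basis multiplication in GF(2^m); poly includes the x^m term."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return r


def itoh_tsujii_inv(a, m, poly):
    """Invert a nonzero element of GF(2^m), m >= 2, as (a^(2^(m-1) - 1))^2.

    Maintains beta = a^(2^k - 1) and grows k to m - 1 along the bits of m - 1.
    """
    beta, k = a, 1
    for bit in bin(m - 1)[3:]:            # bits of m-1 after the leading 1
        # Doubling step: a^(2^(2k) - 1) = (beta^(2^k)) * beta
        t = beta
        for _ in range(k):
            t = gf2m_mul(t, t, m, poly)
        beta = gf2m_mul(t, beta, m, poly)
        k *= 2
        if bit == "1":
            # Increment step: a^(2^(k+1) - 1) = beta^2 * a
            beta = gf2m_mul(gf2m_mul(beta, beta, m, poly), a, m, poly)
            k += 1
    return gf2m_mul(beta, beta, m, poly)  # final squaring gives a^(2^m - 2)
```

As a usage example, with the NIST reduction polynomial x^163 + x^7 + x^6 + x^3 + 1 one can check that `gf2m_mul(a, itoh_tsujii_inv(a, 163, poly), 163, poly)` equals 1 for any nonzero `a`; for m = 163 the chain above needs 9 general multiplications in addition to the 162 squarings.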
Since the optimal ways of performing these operations are substantially different in each
representation, an effort has been made to devote a similar amount of time and effort to
their optimization. The limitation in both cases comes from the maximum clock
frequency, which is fixed at 100 MHz in the SRC architecture.
Based on the tentative results, we predict that our implementations will result in
similar absolute execution times for both Galois Field representations, and that the
speed-up compared to a microprocessor-based implementation will be substantially higher for
the Optimal Normal Basis representation. This speed-up has been estimated to be in the
range from 895 to 1300, depending on the chosen partitioning of the algorithm description
(the amount of code written in VHDL vs. C).
While earlier publications (e.g., [4]) regarding implementations of cryptography on
reconfigurable computers have already demonstrated a 1000x speed-up compared to
microprocessor-based implementations in terms of data throughput, this is the first
publication that shows a comparable speed-up for data latency.
This speed-up is even more remarkable taking into account that the selected operation has
only a limited amount of intrinsic parallelism, and cannot be easily sped up by multiple
instantiations of the same computational unit.
References
1. Nguyen N., Gaj K., Caliga D., El-Ghazawi T., “Implementation of Elliptic Curve
Cryptosystems on a Reconfigurable Computer,” IEEE International Conference on
Field-Programmable Technology, FPT 2003, Tokyo, Japan, Dec. 2003.
2. Bajracharya S., Shu C., Gaj K., El-Ghazawi T., “Implementation of Elliptic Curve
Cryptosystems over GF(2^n) in Optimal Normal Basis on a Reconfigurable Computer,” 14th
International Conference on Field Programmable Logic and Applications, FPL 2004,
Antwerp, Belgium, Aug.-Sep. 2004 (in print).
3. López J., Dahab R., “Fast Multiplication on Elliptic Curves over GF(2^m) without
precomputation,” CHES ’99, LNCS 1717, 1999.
4. Fidanci O. D., Poznanovic D., Gaj K., El-Ghazawi T., Alexandridis N., “Performance
and Overhead in a Hybrid Reconfigurable Computer,” Reconfigurable Architecture
Workshop, RAW 2003, Nice, France, Apr. 2003.