Devices Resistant to Attacks. Design Methodology.

Alexander Taubin and Mark Karpovsky
taubin@bu.edu, markkar@bu.edu
Boston University, Electrical and Computer Engineering Department

Security, reliability and survivability of computing are becoming increasingly important, not only in traditional roles such as payment mechanisms and access control, but also for various wireless computing devices (mobile phones, PDAs, heart monitors, burglar alarms, etc.). Applications with intermittent connectivity have to maintain much of their security locally rather than globally. Today, security is typically provided at the level of software (cryptographic algorithms). Traditional cryptographic protocol designs assume that input and output messages are available to attackers, but that no other information about the keys is available. However, during the last five years a new class of attacks against microchips has become public [1]. These attacks exploit easily accessible information such as power consumption, running time, and input-output behavior under malfunctions, and can be mounted by anyone using low-cost equipment. These so-called side-channel attacks (or non-invasive attacks) amplify and evaluate the leaked information with the help of statistical methods and are often much more powerful than classical cryptanalysis. Examples show that a very small amount of side-channel information is enough to completely break a cryptosystem [2]. While many previously known cryptanalytic attacks can be analyzed by studying algorithms, side-channel vulnerabilities result from the electrical behavior of transistors and circuits, which propagates to expose logic gates, microprocessor operation, and software implementations. This ultimately compromises cryptography and shifts the top priority in cryptography from the further improvement of algorithms to the prevention of such attacks by reducing variations in timing, power and radiation from the hardware [3].
Current techniques, such as randomised clocking and noise generation, are considered not effective enough [4, 1]. More sophisticated countermeasures have also been implemented; for example, clocked dual-rail logic, which represents 0 by "01" and 1 by "10", leads to a lower signal-to-noise ratio and a constant Hamming weight of the operands (Figure 1). Asynchronous (clockless) dual-rail pipelined circuit techniques can play a key role in hardware design that is inherently resistant to non-invasive attacks and fault injection. Using asynchronous circuit techniques has a number of advantages:

- The electromagnetic (or power) signature can be strongly reduced by replacing a synchronous processor with an asynchronous one (no clock harmonics). An asynchronous processor has a very broadband signature, with a reduction of 20 to 30 dB in the background noise, whereas a synchronous processor shows fine, sharply defined lines in its electromagnetic signature.
- Asynchronous circuitry holds out the prospect of power consumption that is independent of the data being processed. By combining a handshake protocol with multi-rail (e.g. dual-rail) encoding of data, one can drastically reduce data-dependent power consumption, removing any useful power signature (Figure 1).
- The absence of clocks makes glitch attacks infeasible. In a synchronous implementation, power supply fluctuations can also be used to force a circuit into an erroneous state and to apply DFA (differential fault analysis). Asynchronous circuits are much less sensitive to such attacks, since a drop in supply voltage will just slow the circuits down rather than lead to errors.

Figure 1. Power signatures related to different design styles (Self-Timed Solutions). Panels: conventional clocked design (peaks betray the number of 1's in the data word); dual-rail clocked design; asynchronous design.

Asynchronous fine-grain pipelining [5] is attractive for implementing basic public-key cryptosystem algorithms, since they rely on multiplication, in particular array multiplication. Some attacks that are dangerous even for cryptosystems with fault detection [6] become infeasible for an asynchronous pipelined implementation. Asynchronous fine-grain pipelines and arrays do not require balancing of pipeline stages and could help make the latency, speed and power consumption of operations independent of the data. This is helpful as a countermeasure against timing and power analysis attacks [7]. The absence of a clock also leads to low electromagnetic interference (EMI): transitions are distributed evenly in time and there are no peak currents around clock edges. This means that an asynchronous implementation of the digital part is very promising for computing in mixed-signal (digital/analog) embedded devices (such as mobile phones), where the analog part is very sensitive to substrate noise related to the clocks.

However, asynchronous design methodology by itself cannot protect components such as memory or communication channels, where dual-rail redundancy could be unacceptable. This is why we suggest a methodology based on both asynchronous fine-grain pipelining and robust encoding techniques. Furthermore, by combining the error-detection capabilities of an asynchronous implementation of computational modules with error detection based on optimal robust codes, we can provide a very effective technique both for anti-jamming and for the detection of fault-injection-based attacks. Error detection based on optimal robust codes [8, 9] is a unique technique that provides equal protection against all patterns of erroneous symbols: it is robust with respect to the statistics of error patterns.
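The notion of protection that does not depend on the error pattern can be illustrated with a small exhaustive experiment. The sketch below is only an illustration under stated assumptions, not the optimal robust codes of [8, 9]: it takes 4 information bits and a single check bit, and compares a linear check (the parity bit) with a nonlinear one (the quadratic function x0*x1 XOR x2*x3, chosen here purely as a toy example). For every nonzero error pattern it counts the fraction of codewords for which the error goes undetected. The names `linear_check`, `quadratic_check` and `masking_fractions` are hypothetical.

```python
from itertools import product

K = 4  # number of information bits in this toy example (assumption)

def linear_check(x):
    """Parity bit: a linear check function."""
    return x[0] ^ x[1] ^ x[2] ^ x[3]

def quadratic_check(x):
    """Nonlinear check bit x0*x1 XOR x2*x3 (illustrative choice only)."""
    return (x[0] & x[1]) ^ (x[2] & x[3])

def masking_fractions(check):
    """Map each nonzero error (ex, ew) to the fraction of codewords
    (x, check(x)) for which the distorted word still passes the check."""
    messages = list(product((0, 1), repeat=K))
    fractions = {}
    for ex in messages:          # error on the information bits
        for ew in (0, 1):        # error on the check bit
            if not any(ex) and ew == 0:
                continue         # skip the all-zero (no error) pattern
            missed = 0
            for x in messages:
                distorted = tuple(a ^ b for a, b in zip(x, ex))
                received_check = check(x) ^ ew
                if check(distorted) == received_check:
                    missed += 1  # error is masked for this codeword
            fractions[(ex, ew)] = missed / len(messages)
    return fractions

linear = masking_fractions(linear_check)
quadratic = masking_fractions(quadratic_check)

print(sorted(set(linear.values())))     # [0.0, 1.0]
print(sorted(set(quadratic.values())))  # [0.0, 0.5]
```

With the linear check, every error pattern is either always detected or never detected, so an attacker who can choose the pattern can always pick one that is missed. With the nonlinear check in this toy setting, no error pattern is missed for more than half of the codewords, whichever pattern is injected; longer check words would be needed to make that bound small.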
Conventional error-detection and signature-analysis techniques (see e.g. [10]) based on linear error-detecting codes only guarantee the detection of errors in which the number of distorted bits is less than the minimum distance of the code. If the number of erroneous bits reaches or exceeds the minimum distance, the detection probability for a given pattern of distorted bits is either 0 or 1: a pattern that is itself a codeword is never detected. For communication channels or memory, a "natural" distortion of a single bit in a message is more likely to occur than a distortion of two or more bits, and the use of linear error-detecting codes is justified. However, in the case of a device under attack (for example jamming, or extreme environmental conditions such as lightning), multiple errors may be as probable as single ones. The robust encoding technique [8] provides an equal error-detecting capability for all patterns of erroneous symbols, with moderate redundancy of the code (and hardware) and a high probability of error detection. Both of these parameters improve as the codewords become longer, which is very important for cryptographic applications (e.g. for a codeword of 1024 bits, to detect any error with a probability of mistake of 2^(-32), we would need only 64 redundant bits, or 6.25% code redundancy) [9]. The application of robust encoding techniques can easily be combined with the asynchronous pipelined implementation of encoders, decoders and related circuitry.

Recently, the first attempt was made to use some asynchronous design methodology to protect smart cards [11]. However, it is still far from a complete methodology. Our approach differs from [11] in two major points:

(1) We use fine-grain pipelining together with a two- (and three-) dimensional pipeline organization to combine high performance with a very small data dependency in the duration of operations. In [11], only traditional random delay insertion is considered as a way to diffuse data-dependent timing.
(2) We combine an asynchronous fine-grain pipelined implementation with the protection of memory and communication channels based on robust encoding. This not only increases the protection of portable devices against fault-injection-based attacks but also protects data communications and memory against jamming. There is no attempt to protect memory and communication channels in [11].

We address a number of hardware-level security and reliability issues and show how asynchronous fine-grain pipelined circuits together with robust encoding techniques can be used to build more robust and secure (trusted) microchips.

To summarize, we are developing a unified methodology for the design of devices inherently resistant to attacks and jamming. Such a methodology provides security and reliability of all components (computational devices, control, memory and communication) by:

- eliminating data-dependent power consumption and electromagnetic emission using asynchronous dual-rail fine-grain pipelined circuit techniques;
- defending against fault injection in a computational datapath by using a redundant encoding scheme (dual rail), the absence of a clock (glitch attacks become infeasible), and the array-based implementation of multipliers;
- defending against fault induction and jamming in communication channels and memory using error detection based on optimal robust codes with equal error-detection capabilities for all patterns of errors [8, 9];
- diffusing data-dependent timing using two- and three-dimensional asynchronous fine-grain pipelined architectures [5].

We are going to build prototypes of major components of highly secure wireless devices, such as cross-pipelined arrays, different kinds of multipliers (including finite field multipliers), etc. We also plan to combine online error detection based on robust error-detecting codes [9] with off-line built-in self-testing and compression of test responses based on decoders for these codes.
This approach will provide an additional tool for the detection of jamming attacks as well as intermittent and permanent faults [12].

References

[1] E. Hess, N. Janssen, B. Meyer, and T. Schütze, "Information Leakage Attacks Against Smart Card Implementations of Cryptographic Algorithms and Countermeasures - A Survey," Proceedings of EUROSMART Security Conference, 2000.
[2] J. Kelsey, B. Schneier, D. Wagner, and C. Hall, "Side Channel Cryptanalysis of Product Ciphers," ESORICS '98 Proceedings, 1998, pp. 97-110.
[3] C. D. Walter, "Montgomery's Multiplication Technique: How to make it Smaller and Faster," Proc. Workshop on Cryptographic Hardware and Embedded Systems (CHES 99), 1999, Lecture Notes in Computer Science, Vol. 1717, pp. 80-93.
[4] R. Anderson, "Protecting Embedded Systems - The Next Ten Years," Proc. Workshop on Cryptographic Hardware and Embedded Systems (CHES 2001), LNCS 2162, 2001.
[5] A. Taubin, K. Fant, and J. McCardle, "Design of Delay-Insensitive Three Dimension Pipeline Array Multiplier for Image Processing," Proceedings, 2002 IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD 2002), pp. 104-111.
[6] S.-M. Yen and M. Joye, "Checking Before Output May Not Be Enough Against Fault-Based Cryptanalysis," IEEE Transactions on Computers, Vol. 49, No. 9, 2000, pp. 967-970.
[7] P. C. Kocher, "Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems," CRYPTO '96, 16th Annual International Cryptology Conference, Proceedings, LNCS, Vol. 1109, Springer, 1996, pp. 104-113.
[8] M. G. Karpovsky and P. Nagvajara, "Optimal Robust Compression of Test Responses," IEEE Trans. on Computers, Vol. 39, No. 1, pp. 138-141, January 1990.
[9] M. G. Karpovsky and P. Nagvajara, "Optimal Codes for the Minimax Criterion on Error Detection," IEEE Trans. on Information Theory, November 1989.
[10] M. L. Bushnell and V. D. Agrawal, Essentials of Electronic Testing for Digital, Memory and Mixed-Signal VLSI Circuits, Kluwer, 2000.
[11] S. Moore, R. Anderson, P. Cunningham, R. Mullins, and G. Taylor, "Improving Smart Card Security using Self-timed Circuits," Proc. of the Eighth IEEE International Symposium on Asynchronous Circuits and Systems, 2002.
[12] M. G. Karpovsky, "Integrated On-Line and Off-Line Error Detection Mechanism in the Coding Theory Framework," VLSI Design, Vol. 5, No. 4, 1998, pp. 313-331.