An Optimized S-Box Circuit Architecture for Low Power AES

advertisement
An Optimized S-Box Circuit
Architecture for Low Power
AES Design
IBM Japan Ltd.
Tokyo Research Laboratory
Sumio Morioka and Akashi Satoh
Contents
Š Background
Š Power Analysis of Conventional S-Boxes
Š Multi-Stage PPRM S-Box for Low-Power H/W
Š Conclusion
Background
Back Ground
Š In 2001, NIST selected Rijndael as the new
symmetric key standard cipher AES
Š AES H/W will be integrated in various applications
Š Low-power feature is important not only in low-end
applications but also in high-end servers
Š A 10-Gbps AES H/W chip consumes several Watts
in 0.13-µm CMOS
Š Need for power analysis and development of lowpower architectures for AES H/W
Power Analysis Method
Š Power analysis is based on timing simulation
Š Gate switching including dynamic hazard is evaluated
Š Quite accurate estimation compared with static
analysis
VHDL Code
Net List
library ieee;
use ieee.std_logic;
entity XNOR is
port (A : in std_logic;
B : in std_logic;
Y : out std_logic;
end XNOR;
architecture RTL of XNOR is
begin
signal T : std_logic;
T <= A xor B;
Y <= not T;
end RTL
Wave Form
Timing
Simulation
Synthesis
DQ
BUFFER
MUX21I
XOR2
NAND2N1
XNOR2
XOR3
XOR2
NAND2N1
XOR3
0101010101
0011001101
0100101101
0100101010
1111000111
0101010100
1110001111
0111000111
0011100011
Test Vector
Power
Calculation
Power Analysis in AES H/W
Š 128-bit bus 11-round Loop Architecture
Š Table-lookup-based S-Box using SOP logic
128 Data
1%
Input
2:1
Data Register
75 %
10 %
1%
ShiftRows
SubBytes
MixColumns
3:1
15%
Others
Data Output
Add
RoundKey
For Low-Power AES H/W
Š Reducing S-Box power is most effective
Š There are various S-Box H/W architectures
Š Which architecture is suitable for low power ?
Objectives
Š Investigate performance of all conventional
S-Boxes
Š Develop a low-power S-Box architecture
Power Analysis of
Conventional S-Boxes
S-Box Definition
Š Nonlinear byte substitution function
Š Multiplicative Inversion on GF(28) + affine
transformation.
b
10001111
11000111
11100011
11110001
= 11111000
01111100
00111110
00011111
a
1
1
0
-1 0
+ 0
1
1
0
Affine trans.
Inversion
8 x 8 XOR matrix
Complicated math. logic
S-Box Architectures
GF inverter + affine
Š
Š
Š
Š
Various mathematical techniques can be applied
Rather slow
Compact implementation
S-Box and S-Box-1 can be merged
Direct mapping
Š
Š
Š
Š
Generate black box circuit from a truth table
High-speed
Rather large
S-Box and S-Box-1 cannot be merged
GF Inverter + Affine
Š Apply hieratical structure of composite field GF(((22)2)2)
Š Map elements on GF(28) onto GF(((22)2)2) by isomorphism
Š Each component is constructed using AND-XOR logic
GF(28)
isomorphism GF(((22)2)2)
inversion
GF(((22)2)2)
GF((22)2)
GF(22)
GF(2)
4
x2
λ
4
4
x -1
4
isomorphism -1
merged
GF(28)
affine trans.
2
2
x2
φ
2
x -1
2
Direct Mapping
Š 2-Level Logic
Š SOP (Sum of Products) : (NOT)-AND-OR
Š POS (Product of Sums) : (NOT)-OR-AND
Š PPRM (Positive Polarity Reed-Muller) : AND-XOR
8-bit input
AND
matrix
Several
hundred
wires
OR / XOR
matrix
8-bit output
Direct Mapping
Š Selector-Based Logic
Š BDD (Binary Decision Diagram)
Š Twisted BDD will appear at ICCD 2002 in Sep.
in0 in1 in2 in3 in4 in5 in6 in7
in0 in1
in2
out0
in3 in4 in5 in6 in7
out0
out1
out1
0
1
0
1
out7
out7
2:1MUX
BDD
Twisted
BDD
Power vs. Gate Count
Š Smaller circuits do not always consume less power
400
0.13-µm 1.5-V CMOS @ 10 MHz
Nominal conditions
350
Power (uW)
300
PPRM
BDD
Twisted
BDD
250
200
Composite
Field
150
100
POS
SOP
Table
Lookup
50
0
0
0.5
1
1.5
2
Gate Count (Kgate)
2.5
3
Analysis
Š Power consumption of S-Box is greatly
influenced by the number of dynamic hazards
Caused by
Š Differences of signal arrival times at each gate
Š Propagation probability of signal transitions
Analysis
Differences of signal arrival times at each gate
Š Composite field S-Box consumes a lot of power
in spite of having the smallest size
Š It has many crossing and branched signal paths
Switch many times
x2
λ
x -1
Multiple
signal
paths
Analysis
Propagation probability of signal transitions
Š An XOR gate transfers signal transitions from
input to output with probability 100%
Š For AND, OR gates, the probability is 50%
Š So power for SOP is lower than PPRM
0
0
1
XOR
0
AND
100% Probability
1
50% Probability
1
1
OR
0
Multi-Stage PPRM S-Box
for Low-Power H/W
Approach for Low Power S-Box
Š Use composite field S-Box
Š Reduce gate counts
Š Divide combination logic into multiple stages
Š Reduce probability of signal transitions
Š Adjust the signal timing between each stage
using 2-level logic
Š Reduce the number of dynamic hazards
Transition Probability
0.5
1.0
AND
array
XOR
array
0.5
1.0
0.5
1.0
0.5
1.0
Multi-Stage PPRM
Š Composite field S-Box is divided into several blocks
Š Each block is designed using PPRM logic suitable for
GF operations
Š Adjust signal timing by using 2-level (AND-XOR) logic
4
4
8
x2
λ
4
4
4
x
-1 4
4
-1
+
affine
8
XOR
array
8
4
Delay chain
4
8
AND
array
28
XOR
array
4
4
4
AND 12 XOR
array
array
3-stage PPRM
4
4
AND
array
32
Multi-Stage PPRM
Š PPRM S-Boxes can be divided many ways (1~6-stages)
XOR
AND-XOR
AND-XOR
AND-XOR
4
4
8
4
λ
x2
4
4
x
-1 4
4
4
2
x2
φ
2
AND-XOR
AND-XOR
2
x -1
2
AND-XOR
-1
+
affine
8
Power vs. Gate Count
Š 3-stage PPRM is the most effective architecture
Š Multi-stage SOP is not suitable for GF operation
800
0.13-µm 1.5-V CMOS @ 10 MHz
Nominal conditions
700
32-
Power (uW)
600
500
1-stage
(normal PPRM)
400
300
2-
200
64-
100
0
0
31
PPRM
SOP
1-stage
(normal SOP)
2
3
4
Gate Count (Kgate)
5
6
7
Power vs. Gate Count
Š 3-stage PPRM is the most effective architecture
Š Multi-stage SOP is not suitable for GF operation
400
0.13-µm 1.5-V CMOS @ 10 MHz
Nominal conditions
350
Power (uW)
300
PPRM
BDD
Twisted
BDD
250
1/3
power
with
1/2
gates
POS
Composite
200
150
Field
SOP
100
Table
Lookup
50
3-stage PPRM
0
0
0.5
1
1.5
2
Gate Count (Kgate)
2.5
3
Conclusion
Š The AES S-Box (SOP logic) consumes 75% of
the power
Š Dynamic hazards boost power needs of S-Boxes
Š A multi-stage PPRM architecture based on a
composite field S-Box was developed
Š 3-stage PPRM / SOP : Power = 1/3, Size = 1/2
Š This architecture can be applied to other S-Boxes
defined over Galois fields
Download