DIMACS Workshop on Coding Theoretic Methods for Network Security
9:15 - 10:15 AM, April 1st, 2015
Based on “How to Maintain a Secret” by Greg Dhuse and Jason Resch
Jason Resch
Cleversafe Research
Outline
Secret Sharing Schemes
Rebuilding Lost Data
Risks and Costs
Why it’s Necessary
A new way to Rebuild
Implementation Details
Efficiency and Security
Other Applications
Secret Sharing Schemes
Convert secret information (the secret) into n shares where each is given to one of n shareholders
The secret may be recovered from any t of the shares, where 1 ≤ t ≤ n
Importantly: no information can be gleaned (practically or in some cases theoretically) from fewer than t shares
Provides excellent confidentiality and availability:
Secret remains confidential despite (t – 1) exposures
Secret remains available despite (n – t) erasures
Comparison of Schemes
Name    | Mathematical Basis       | Space           | Time            | Security
Blakley | Hyperplane Geometry      | O(n × t × l)    | O(n × t × l)    | Informational
Shamir  | Polynomial Interpolation | O(n × l)        | O(n × t × l)    | Informational
XOR     | Bitwise Exclusive-or     | O(n × l)        | O(n × l)        | Informational
SSMS    | Information Dispersal    | O((n / t) × l)  | O(n × l)        | Computational
AONT-RS | Erasure Codes            | O((n / t) × l)  | O((n - t) × l)  | Computational
Equivalency of the Schemes
The aforementioned secret sharing schemes were first described in terms of very different math:
Hyperplane geometry, polynomial interpolation, bitwise xor, information dispersal, and erasure codes
Yet, all of these schemes are operationally equivalent:
Each is a system of linear equations in a finite field
Encoding and decoding work identically; the only difference is how the linear equations are formed
Encoding and Decoding
In each of these schemes:
Encoding is performed using a system of n equations to derive a set of n solutions (each solution being a share)
Decoding uses at least t solutions to solve for the unknowns (the variables) used in the equations
This is significant because the same math that enables efficient and secure rebuilding applies to all schemes
Shamir as a Linear SSS
Define a random (t – 1) degree polynomial f(x)
Encode the secret s as one of the coefficients, for example as the y-intercept:
f(x) = r_1 x^4 + r_2 x^3 + r_3 x^2 + r_4 x^1 + s x^0   (example with t = 5)
Pick n distinct values of (x > 0) to evaluate f(x):
share_1 = f(1), share_2 = f(2), …, share_n = f(n)
The polynomial can be solved with any t solutions:
Use polynomial interpolation to solve the equation, and then recover the coefficients (including the secret)
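The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the prime modulus `P` and the function names `split` and `recover` are assumptions made for the example, and the decoder interpolates only the y-intercept (the secret) rather than every coefficient.

```python
import secrets

P = 2**127 - 1  # a Mersenne prime; the field size is an illustrative assumption


def split(secret: int, t: int, n: int) -> list[tuple[int, int]]:
    """Return n shares (x, f(x)) of a random degree-(t - 1) polynomial with f(0) = secret."""
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, highest-degree coefficient first
            y = (y * x + c) % P
        shares.append((x, y))
    return shares


def recover(shares: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate f(0) from exactly t shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret


shares = split(123456789, t=5, n=9)
recovered = recover(shares[2:7])  # any 5 of the 9 shares suffice
```

Any other choice of five shares, such as `shares[:5]` or `shares[4:]`, yields the same value.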
Shamir as a Linear SSS
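\[
\underbrace{\begin{bmatrix}
1^0 & 1^1 & 1^2 & 1^3 & 1^4 \\
2^0 & 2^1 & 2^2 & 2^3 & 2^4 \\
3^0 & 3^1 & 3^2 & 3^3 & 3^4 \\
4^0 & 4^1 & 4^2 & 4^3 & 4^4 \\
5^0 & 5^1 & 5^2 & 5^3 & 5^4 \\
6^0 & 6^1 & 6^2 & 6^3 & 6^4 \\
7^0 & 7^1 & 7^2 & 7^3 & 7^4 \\
8^0 & 8^1 & 8^2 & 8^3 & 8^4 \\
9^0 & 9^1 & 9^2 & 9^3 & 9^4
\end{bmatrix}}_{9 \times 5 \text{ encoding matrix } V}
\times
\underbrace{\begin{bmatrix} s \\ r_4 \\ r_3 \\ r_2 \\ r_1 \end{bmatrix}}_{5 \text{ coefficients}}
=
\underbrace{\begin{bmatrix}
s \cdot 1^0 + r_4 \cdot 1^1 + r_3 \cdot 1^2 + r_2 \cdot 1^3 + r_1 \cdot 1^4 \\
s \cdot 2^0 + r_4 \cdot 2^1 + r_3 \cdot 2^2 + r_2 \cdot 2^3 + r_1 \cdot 2^4 \\
s \cdot 3^0 + r_4 \cdot 3^1 + r_3 \cdot 3^2 + r_2 \cdot 3^3 + r_1 \cdot 3^4 \\
s \cdot 4^0 + r_4 \cdot 4^1 + r_3 \cdot 4^2 + r_2 \cdot 4^3 + r_1 \cdot 4^4 \\
s \cdot 5^0 + r_4 \cdot 5^1 + r_3 \cdot 5^2 + r_2 \cdot 5^3 + r_1 \cdot 5^4 \\
s \cdot 6^0 + r_4 \cdot 6^1 + r_3 \cdot 6^2 + r_2 \cdot 6^3 + r_1 \cdot 6^4 \\
s \cdot 7^0 + r_4 \cdot 7^1 + r_3 \cdot 7^2 + r_2 \cdot 7^3 + r_1 \cdot 7^4 \\
s \cdot 8^0 + r_4 \cdot 8^1 + r_3 \cdot 8^2 + r_2 \cdot 8^3 + r_1 \cdot 8^4 \\
s \cdot 9^0 + r_4 \cdot 9^1 + r_3 \cdot 9^2 + r_2 \cdot 9^3 + r_1 \cdot 9^4
\end{bmatrix}}_{9 \text{ solutions of } f(x)}
\]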
Blakley as a Linear SSS
Choose a point p in t-dimensional space
Select p such that it encodes the secret, e.g. as one of the t coordinates that specify p
Generate n hyperplanes of dimensionality (t – 1) that intersect point p; the n hyperplanes are the shares
To find a random hyperplane intersecting p, where p has coordinates (s, x_2, x_3, x_4, x_5), generate t random coefficients (a_1, a_2, a_3, a_4, a_5); the plane equation is:
y = a_1 s + a_2 x_2 + a_3 x_3 + a_4 x_4 + a_5 x_5
share_i = y_i and the coefficients (a_{i,1}, a_{i,2}, a_{i,3}, a_{i,4}, a_{i,5})
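As a rough sketch of the construction above, the following Python generates random hyperplanes through a point and recovers the point from any t of them by Gauss-Jordan elimination. The modulus `Q` and the names `make_shares` and `solve` are assumptions for illustration only; with a field this large, a randomly chosen set of t hyperplanes is singular only with negligible probability, so the sketch does not handle that case.

```python
import secrets

Q = 2**61 - 1  # prime modulus; an illustrative assumption


def make_shares(point: list[int], n: int) -> list[tuple[list[int], int]]:
    """Each share is a random hyperplane through `point`: (coefficients a_i, value y_i)."""
    shares = []
    for _ in range(n):
        a = [secrets.randbelow(Q) for _ in point]
        y = sum(ai * pi for ai, pi in zip(a, point)) % Q
        shares.append((a, y))
    return shares


def solve(shares: list[tuple[list[int], int]]) -> list[int]:
    """Gauss-Jordan elimination mod Q on t hyperplanes; returns the point p."""
    t = len(shares)
    rows = [a[:] + [y] for a, y in shares]  # augmented matrix [A | y]
    for col in range(t):
        # Raises StopIteration if the system is singular (not handled in this sketch).
        pivot = next(r for r in range(col, t) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, Q)
        rows[col] = [v * inv % Q for v in rows[col]]
        for r in range(t):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(v - f * w) % Q for v, w in zip(rows[r], rows[col])]
    return [row[t] for row in rows]


# p = (s, x_2, x_3, x_4, x_5) with the secret s = 42 as the first coordinate
point = [42] + [secrets.randbelow(Q) for _ in range(4)]
shares = make_shares(point, n=9)
recovered_point = solve(shares[3:8])  # any t = 5 hyperplanes
```

The secret is the first coordinate of `recovered_point`.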
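Blakley as a Linear SSS
\[
\underbrace{\begin{bmatrix}
a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} & a_{1,5} \\
a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} & a_{2,5} \\
a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} & a_{3,5} \\
a_{4,1} & a_{4,2} & a_{4,3} & a_{4,4} & a_{4,5} \\
a_{5,1} & a_{5,2} & a_{5,3} & a_{5,4} & a_{5,5} \\
a_{6,1} & a_{6,2} & a_{6,3} & a_{6,4} & a_{6,5} \\
a_{7,1} & a_{7,2} & a_{7,3} & a_{7,4} & a_{7,5} \\
a_{8,1} & a_{8,2} & a_{8,3} & a_{8,4} & a_{8,5} \\
a_{9,1} & a_{9,2} & a_{9,3} & a_{9,4} & a_{9,5}
\end{bmatrix}}_{9 \times 5 \text{ encoding matrix } V}
\times
\underbrace{\begin{bmatrix} s \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix}}_{5 \text{ coordinates}}
=
\underbrace{\begin{bmatrix}
s a_{1,1} + x_2 a_{1,2} + x_3 a_{1,3} + x_4 a_{1,4} + x_5 a_{1,5} \\
s a_{2,1} + x_2 a_{2,2} + x_3 a_{2,3} + x_4 a_{2,4} + x_5 a_{2,5} \\
s a_{3,1} + x_2 a_{3,2} + x_3 a_{3,3} + x_4 a_{3,4} + x_5 a_{3,5} \\
s a_{4,1} + x_2 a_{4,2} + x_3 a_{4,3} + x_4 a_{4,4} + x_5 a_{4,5} \\
s a_{5,1} + x_2 a_{5,2} + x_3 a_{5,3} + x_4 a_{5,4} + x_5 a_{5,5} \\
s a_{6,1} + x_2 a_{6,2} + x_3 a_{6,3} + x_4 a_{6,4} + x_5 a_{6,5} \\
s a_{7,1} + x_2 a_{7,2} + x_3 a_{7,3} + x_4 a_{7,4} + x_5 a_{7,5} \\
s a_{8,1} + x_2 a_{8,2} + x_3 a_{8,3} + x_4 a_{8,4} + x_5 a_{8,5} \\
s a_{9,1} + x_2 a_{9,2} + x_3 a_{9,3} + x_4 a_{9,4} + x_5 a_{9,5}
\end{bmatrix}}_{9 \text{ solutions } (y_1 \ldots y_9)}
\]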
XOR as a Linear SSS
Choose (t – 1) random bit strings, each the same length as s
The first (t – 1) shares are these random bit strings
The final share is the bitwise exclusive-or of all the random bit strings together with the secret
Achieves Shannon perfect secrecy
Analogous to a one-time-pad with (t – 1) keys
To decode, xor all of the shares together
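The XOR scheme is short enough to sketch in full. The helper names `xor_bytes`, `xor_split` and `xor_recover` are made up for this illustration; as described above, every one of the t shares is required to decode.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise exclusive-or of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_split(secret: bytes, t: int) -> list[bytes]:
    """(t - 1) random pads plus one share that is the secret xor-ed with every pad."""
    pads = [secrets.token_bytes(len(secret)) for _ in range(t - 1)]
    last = reduce(xor_bytes, pads, secret)
    return pads + [last]


def xor_recover(shares: list[bytes]) -> bytes:
    """XOR all shares together; the pads cancel and the secret remains."""
    return reduce(xor_bytes, shares)


shares = xor_split(b"attack at dawn", t=5)
recovered = xor_recover(shares)
```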
XOR as a Linear SSS
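\[
\underbrace{\begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 & 1
\end{bmatrix}}_{5 \times 5 \text{ encoding matrix } V}
\times
\underbrace{\begin{bmatrix} r_1 \\ r_2 \\ r_3 \\ r_4 \\ s \end{bmatrix}}_{5 \text{ bit strings}}
=
\underbrace{\begin{bmatrix}
r_1 \\ r_2 \\ r_3 \\ r_4 \\ r_1 + r_2 + r_3 + r_4 + s
\end{bmatrix}}_{5 \text{ shares}}
\]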
XOR as a Linear SSS
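\[
\underbrace{\begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 & 1 \\
2^0 & 2^1 & 2^2 & 2^3 & 2^4 \\
3^0 & 3^1 & 3^2 & 3^3 & 3^4 \\
4^0 & 4^1 & 4^2 & 4^3 & 4^4 \\
5^0 & 5^1 & 5^2 & 5^3 & 5^4
\end{bmatrix}}_{9 \times 5 \text{ encoding matrix } V}
\times
\underbrace{\begin{bmatrix} r_1 \\ r_2 \\ r_3 \\ r_4 \\ s \end{bmatrix}}_{5 \text{ bit strings}}
=
\underbrace{\begin{bmatrix}
r_1 \\ r_2 \\ r_3 \\ r_4 \\
r_1 + r_2 + r_3 + r_4 + s \\
r_1 \cdot 2^0 + r_2 \cdot 2^1 + r_3 \cdot 2^2 + r_4 \cdot 2^3 + s \cdot 2^4 \\
r_1 \cdot 3^0 + r_2 \cdot 3^1 + r_3 \cdot 3^2 + r_4 \cdot 3^3 + s \cdot 3^4 \\
r_1 \cdot 4^0 + r_2 \cdot 4^1 + r_3 \cdot 4^2 + r_4 \cdot 4^3 + s \cdot 4^4 \\
r_1 \cdot 5^0 + r_2 \cdot 5^1 + r_3 \cdot 5^2 + r_4 \cdot 5^3 + s \cdot 5^4
\end{bmatrix}}_{9 \text{ shares}}
\]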
SSMS as a Linear SSS
Secret Sharing Made Short (SSMS) combines Rabin’s Information Dispersal Algorithm (IDA), encryption, and Shamir’s SSS
First, encrypt the secret with a random key
Second, disperse the encrypted data with the IDA
Third, encode the key using Shamir’s SSS
Shareholders get an IDA fragment and a Shamir share
Unlike Shamir, Blakley, and XOR, SSMS is conditionally secure:
Security depends on the security of the cipher, and the limited computational resources of adversaries
Rabin IDA as a Linear SSS
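\[
\underbrace{\begin{bmatrix}
1^0 & 1^1 & 1^2 & 1^3 & 1^4 \\
2^0 & 2^1 & 2^2 & 2^3 & 2^4 \\
3^0 & 3^1 & 3^2 & 3^3 & 3^4 \\
4^0 & 4^1 & 4^2 & 4^3 & 4^4 \\
5^0 & 5^1 & 5^2 & 5^3 & 5^4 \\
6^0 & 6^1 & 6^2 & 6^3 & 6^4 \\
7^0 & 7^1 & 7^2 & 7^3 & 7^4 \\
8^0 & 8^1 & 8^2 & 8^3 & 8^4 \\
9^0 & 9^1 & 9^2 & 9^3 & 9^4
\end{bmatrix}}_{9 \times 5 \text{ encoding matrix } V}
\times
\underbrace{\begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \\ d_5 \end{bmatrix}}_{5 \text{ data elements}}
=
\underbrace{\begin{bmatrix}
d_1 \cdot 1^0 + d_2 \cdot 1^1 + d_3 \cdot 1^2 + d_4 \cdot 1^3 + d_5 \cdot 1^4 \\
d_1 \cdot 2^0 + d_2 \cdot 2^1 + d_3 \cdot 2^2 + d_4 \cdot 2^3 + d_5 \cdot 2^4 \\
d_1 \cdot 3^0 + d_2 \cdot 3^1 + d_3 \cdot 3^2 + d_4 \cdot 3^3 + d_5 \cdot 3^4 \\
d_1 \cdot 4^0 + d_2 \cdot 4^1 + d_3 \cdot 4^2 + d_4 \cdot 4^3 + d_5 \cdot 4^4 \\
d_1 \cdot 5^0 + d_2 \cdot 5^1 + d_3 \cdot 5^2 + d_4 \cdot 5^3 + d_5 \cdot 5^4 \\
d_1 \cdot 6^0 + d_2 \cdot 6^1 + d_3 \cdot 6^2 + d_4 \cdot 6^3 + d_5 \cdot 6^4 \\
d_1 \cdot 7^0 + d_2 \cdot 7^1 + d_3 \cdot 7^2 + d_4 \cdot 7^3 + d_5 \cdot 7^4 \\
d_1 \cdot 8^0 + d_2 \cdot 8^1 + d_3 \cdot 8^2 + d_4 \cdot 8^3 + d_5 \cdot 8^4 \\
d_1 \cdot 9^0 + d_2 \cdot 9^1 + d_3 \cdot 9^2 + d_4 \cdot 9^3 + d_5 \cdot 9^4
\end{bmatrix}}_{9 \text{ fragments}}
\]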
AONT-RS as a Linear SSS
All-or-Nothing Transform Reed Solomon (AONT-RS) combines Rivest’s AONT with Reed-Solomon coding
First, encode the secret with the AONT
Second, encode redundancy with Reed-Solomon
Third, split the AONT package and redundancy
Each shareholder gets a fraction equal to (1 / t) of the secret
Unlike Shamir, Blakley, and XOR, but like SSMS, AONT-RS is conditionally secure:
Security depends on the security of the cipher, and the limited computational resources of adversaries
AONT-RS as a Linear SSS
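\[
\underbrace{\begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1^0 & 1^1 & 1^2 & 1^3 & 1^4 \\
2^0 & 2^1 & 2^2 & 2^3 & 2^4 \\
3^0 & 3^1 & 3^2 & 3^3 & 3^4 \\
4^0 & 4^1 & 4^2 & 4^3 & 4^4
\end{bmatrix}}_{9 \times 5 \text{ encoding matrix } V}
\times
\underbrace{\begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \\ d_5 \end{bmatrix}}_{5 \text{ data elements}}
=
\underbrace{\begin{bmatrix}
d_1 \\ d_2 \\ d_3 \\ d_4 \\ d_5 \\
d_1 \cdot 1^0 + d_2 \cdot 1^1 + d_3 \cdot 1^2 + d_4 \cdot 1^3 + d_5 \cdot 1^4 \\
d_1 \cdot 2^0 + d_2 \cdot 2^1 + d_3 \cdot 2^2 + d_4 \cdot 2^3 + d_5 \cdot 2^4 \\
d_1 \cdot 3^0 + d_2 \cdot 3^1 + d_3 \cdot 3^2 + d_4 \cdot 3^3 + d_5 \cdot 3^4 \\
d_1 \cdot 4^0 + d_2 \cdot 4^1 + d_3 \cdot 4^2 + d_4 \cdot 4^3 + d_5 \cdot 4^4
\end{bmatrix}}_{9 \text{ fragments}}
\]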
Decoding in a Linear SSS
Encoding: (9 × 5 encoding matrix V) × (5 inputs) = (9 outputs)
With any 5 of the 9 outputs retained: (5 × 5 encoding matrix V) × (5 inputs) = (5 outputs)
Multiply both sides by the decoding matrix: V^{-1} × V × (inputs) = V^{-1} × (outputs)
V^{-1} × V is the t × t identity matrix: I × (inputs) = V^{-1} × (outputs)
Result: (decoding matrix V^{-1}) × (outputs) = (inputs)
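The decoding steps can be sketched with a small Vandermonde encoding matrix. The field size `P = 257`, the sample inputs, and the helper names (`vandermonde`, `mat_vec`, `invert`) are assumptions chosen for readability, not details taken from the talk.

```python
P = 257  # small prime field, an assumption chosen only for readability


def vandermonde(n: int, t: int) -> list[list[int]]:
    """n x t encoding matrix whose row x is (x^0, x^1, ..., x^(t-1)) mod P."""
    return [[pow(x, k, P) for k in range(t)] for x in range(1, n + 1)]


def mat_vec(m: list[list[int]], v: list[int]) -> list[int]:
    return [sum(a * b for a, b in zip(row, v)) % P for row in m]


def invert(m: list[list[int]]) -> list[list[int]]:
    """Gauss-Jordan inversion of a t x t matrix mod P."""
    t = len(m)
    aug = [row[:] + [int(i == j) for j in range(t)] for i, row in enumerate(m)]
    for col in range(t):
        pivot = next(r for r in range(col, t) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(t):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(v - f * w) % P for v, w in zip(aug[r], aug[col])]
    return [row[t:] for row in aug]


V = vandermonde(9, 5)
inputs = [42, 7, 99, 200, 13]
outputs = mat_vec(V, inputs)           # 9 outputs

kept = [0, 2, 3, 6, 8]                 # any 5 surviving rows
V_sub = [V[i] for i in kept]           # 5 x 5 encoding matrix
V_inv = invert(V_sub)                  # decoding matrix
decoded = mat_vec(V_inv, [outputs[i] for i in kept])
```

Because the rows use distinct evaluation points, every 5 × 5 submatrix of this `V` is invertible, so any choice of `kept` with five distinct indices decodes to `inputs`.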
Rebuilding Lost Data
No storage system is perfect
Whatever storage media a shareholder uses for their share of the secret, it’s subject to a non-zero failure rate
The inverse of a failure rate is the mean time to failure
Once more than (n – t) shareholders have lost their shares, the secret is rendered irrecoverable
But before this point in time, we may rebuild the lost share to recover full availability and reliability
The most straightforward way to do this is to recover the secret from t shares, then re-encode the lost share
Risks of Rebuilding Data
Any time the secret is recovered it is vulnerable to interception, and accidental or malicious disclosure
No shareholder is necessarily trusted with knowledge of the secret, nor knowledge of other shareholders’ shares
To securely rebuild under the straightforward approach, an entity trusted with knowledge of the secret must be involved in every rebuild operation
But what if this entity is not available?
Cost of Rebuilding Lost Data
Rebuilding is expensive, not only in terms of computation but, most importantly, in terms of the I/O it requires: reads, seeks, and network transfers
To rebuild a single share corresponding to 1 TB of secret information in a t=5, n=9 secret sharing scheme:
SSMS and AONT-RS must read and transfer 1 TB of data
Shamir and XOR must read and transfer 5 TB of data
Blakley must read and transfer 25 TB of data!
Since the shares are on 9 independent storage devices, this rebuild cost must be paid at a rate 9 times greater than the failure rate of the underlying storage media
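The three figures follow from the per-share sizes implied by the Space column of the comparison table (total space divided by n), multiplied by the t shares a conventional rebuild must read. The function name and the dictionary below are illustrative assumptions, and the sketch ignores small overheads such as the key share in SSMS.

```python
def rebuild_read_tb(scheme: str, t: int, secret_tb: float) -> float:
    """Data read and transferred to rebuild one share conventionally (t shares are read)."""
    share_size = {
        "blakley": t * secret_tb,   # O(n * t * l) total, so t * l per share
        "shamir": secret_tb,        # O(n * l) total, so l per share
        "xor": secret_tb,
        "ssms": secret_tb / t,      # O((n / t) * l) total, so l / t per share
        "aont-rs": secret_tb / t,
    }[scheme]
    return t * share_size


costs = {name: rebuild_read_tb(name, t=5, secret_tb=1.0)
         for name in ("ssms", "aont-rs", "shamir", "xor", "blakley")}
```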
Necessity of Rebuilding
As costly and risky as rebuilding is, it is necessary to achieve any degree of reliability for the secret
Assume MTTF of a disk is 20 years (5% AFR):
For a t=5, n=9 secret sharing scheme with shares stored on these drives and no rebuilding, the mean time to irrecoverable loss of the secret is:
Time to (n – t + 1) disk failures = (20 years / 9) + (20 years / 8) + (20 years / 7) + (20 years / 6) + (20 years / 5) = 14.91 years
This is a 6.7% annual failure rate
Less reliable than storing secret on just a single disk!
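The arithmetic can be checked directly. With k drives still alive, the expected wait for the next failure is MTTF / k, and the secret is lost at the fifth failure. Variable names are illustrative.

```python
mttf_years = 20.0
n, t = 9, 5

# Expected waits with 9, 8, 7, 6 and then 5 surviving drives.
mean_time_to_loss = sum(mttf_years / k for k in range(n, t - 1, -1))
annual_failure_rate = 1 / mean_time_to_loss

print(f"{mean_time_to_loss:.2f} years, {annual_failure_rate:.1%} per year")
```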
Power of Rebuilding
Assume the same drives and the same t=5, n=9 secret sharing scheme, but add rebuilding
Assume a 48 hour time window to rebuild lost shares:
Mean time to Irrecoverable Loss of Secret would be:
5.63 trillion years
Chance of losing secret over 100 years:
Less than 1 chance in 50 billion!
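The slides do not state the reliability model behind these figures. As an assumption, one standard Markov-chain approximation (independent exponential failures, all failed drives rebuilt in parallel, rebuild time much shorter than MTTF) gives a value close to the one quoted; the formula and variable names below belong to this sketch rather than to the talk.

```python
import math

mttf = 20.0                 # years per drive
mttr = 48 / (24 * 365)      # 48-hour rebuild window, expressed in years
n, t = 9, 5
f = n - t                   # number of failures that can be tolerated

# Assumed approximation: MTTDL ~ f! * MTTF^(f+1) / (n (n-1) ... t * MTTR^f)
drives = math.prod(range(t, n + 1))   # 9 * 8 * 7 * 6 * 5
mttdl_years = math.factorial(f) * mttf ** (f + 1) / (drives * mttr ** f)

# Probability of loss within 100 years under an exponential loss model.
p_loss_100y = -math.expm1(-100 / mttdl_years)
```

Under these assumptions `mttdl_years` is on the order of 5.6 trillion years and `p_loss_100y` is below 1 in 50 billion.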
Rebuilding Conclusions
It’s costly, dangerous, but nonetheless necessary
The ideal rebuild method would not require shareholders to disclose their shares, nor require the secret to be recovered every time a disk fails
The above sounds like a pipe dream, but the math tells us that this is indeed possible, and in fact, that it can be more efficient than conventional rebuilding!
Partial Rebuilding
Partial rebuilding splits the rebuild operation into two separate mathematical stages:
The first stage generates “partial” rebuild results
The second stage combines the partial results
The share is recovered without ever reconstructing the secret
The combination can occur in different ways to decrease network overhead and increase performance
The partial results can also be masked in such a way that no one learns anything they didn’t already know
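A minimal sketch of the two stages, using Shamir-style shares over a prime field as the example scheme. Each helper multiplies its own share by a public coefficient (its entry in the product of the lost share's encoding row and the decoding matrix, which for a Vandermonde matrix is a Lagrange coefficient), and the partials are then summed. The field `P` and the function names are assumptions for illustration.

```python
import secrets

P = 2**127 - 1  # prime field; an illustrative assumption


def evaluate(coeffs: list[int], x: int) -> int:
    """Horner evaluation of the encoding polynomial at x."""
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % P
    return y


def partial(x_i: int, y_i: int, helper_xs: list[int], x_lost: int) -> int:
    """Stage 1: computed locally by the holder of share (x_i, y_i)."""
    num, den = 1, 1
    for x_j in helper_xs:
        if x_j != x_i:
            num = num * (x_lost - x_j) % P
            den = den * (x_i - x_j) % P
    return y_i * num * pow(den, -1, P) % P


coeffs = [secrets.randbelow(P) for _ in range(5)]          # t = 5; coeffs[0] is the secret
shares = {x: evaluate(coeffs, x) for x in range(1, 10)}    # n = 9

x_lost = 3
helper_xs = [1, 2, 5, 7, 9]                                # any t surviving shareholders
partials = [partial(x, shares[x], helper_xs, x_lost) for x in helper_xs]

rebuilt = sum(partials) % P                                # Stage 2: combine
```

The secret `coeffs[0]` is never computed: only the lost share is produced. Because the combination is a plain sum, it can be performed in any order or grouping.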
Traditional Rebuilding
(decoding matrix V^{-1}) × (outputs) = (inputs)
(1 × 5 encoding vector V) × (inputs) = (rebuilt output)
Full Rebuild Process
(1 × 5 encoding vector V) × (decoding matrix V^{-1}) × (outputs) = (rebuilt output)
Decomposed Rebuild Process
(decoding matrix V^{-1}) × (outputs) = (inputs)
For each of the 5 outputs taken alone: (decoding matrix V^{-1}) × (output) = (partially decoded result)
Combination Phase
(partially decoded result 1) + (partially decoded result 2) + (partially decoded result 3) + (partially decoded result 4) + (partially decoded result 5) = (inputs)
Why it Works
(decoding matrix V^{-1}) × (outputs) = (decoding matrix V^{-1}) × (output 1 + output 2 + output 3 + output 4 + output 5)
Partial Decoding and Encoding
(1 × 5 encoding vector V) × (decoding matrix V^{-1}) × (outputs) = (rebuilt output)
For each of the 5 outputs taken alone: (1 × 5 encoding vector V) × (decoding matrix V^{-1}) × (output) = (partially rebuilt output), giving partially rebuilt outputs a, b, c, d, e
a + b + c + d + e = (rebuilt output)
Why it Works
(1 × 5 encoding vector V) × (decoding matrix V^{-1}) × (output 1 + output 2 + output 3 + output 4 + output 5)
[Figure: worked example with inputs x, y; a 2 × 2 encoding matrix with entries a, b, c, d and its inverse; an additional encoding vector (e, f); and the resulting terms ax, by, cx, dy, ex, fy]
Utility of Partial Rebuilding
In conventional rebuilding, t times as much data must traverse the network as the amount rebuilt
Partial rebuilding can reduce WAN traffic when some shareholders are local to each other
For example, in a t=10, n=15 system spread across 3 sites, instead of 10 shares being sent over the WAN, the shareholders at each site compute and sum their partials before sending that sum over a WAN link
Only 2 shares’ worth of data goes over the WAN
Compared to 10 shares in conventional rebuilding!
Overcoming Bottlenecks
In conventional rebuilding, the location doing the rebuilding must receive t times the data it rebuilds
The rebuilding entity becomes inundated with traffic
With a 10 Gbps NIC and t=10, it can rebuild at 1 Gbps
Partial rebuilding permits a “ring” topology, where each participant computes its partial, adds it to the running sum, and sends it to the next participant
No NIC bottleneck, enables rebuild at full 10 Gbps
Security of Partial Rebuilding
Unfortunately, the security of partial rebuilding is really no better than that of conventional rebuilding
It is a straightforward linear transformation to convert a partial back into the share (slice) that produced it
With t partials a malicious recipient could convert them to the t shares, and thus recover the secret
Fortunately, there is a solution, an extension to partial rebuilding we call Zero Information Gain rebuilding
Each partial is masked with (t – 1) encryption keys
The keys cancel out only when partials are combined
ZIG Rebuilding
Participants: A, B, C, D, E
E(P_a) = P_a ⊕ K_ab ⊕ K_ac ⊕ K_ad ⊕ K_ae
E(P_b) = P_b ⊕ K_ab ⊕ K_bc ⊕ K_bd ⊕ K_be
E(P_c) = P_c ⊕ K_ac ⊕ K_bc ⊕ K_cd ⊕ K_ce
E(P_d) = P_d ⊕ K_ad ⊕ K_bd ⊕ K_cd ⊕ K_de
E(P_e) = P_e ⊕ K_ae ⊕ K_be ⊕ K_ce ⊕ K_de
share = E(P_a) ⊕ E(P_b) ⊕ E(P_c) ⊕ E(P_d) ⊕ E(P_e)
share = P_a ⊕ P_b ⊕ P_c ⊕ P_d ⊕ P_e ⊕ (K_ab ⊕ K_ab) …
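The cancellation can be demonstrated with random stand-ins for the partials. This sketch assumes the partials combine by XOR (addition in a binary field) and that each pair of participants already holds a shared key; how those pairwise keys are agreed is not covered here, and all names are hypothetical.

```python
import secrets
from functools import reduce
from itertools import combinations


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


names = "abcde"
size = 16

# Random stand-ins for the partials P_a .. P_e.
partials = {p: secrets.token_bytes(size) for p in names}

# One key per unordered pair: K_ab, K_ac, ..., K_de (10 keys in total).
keys = {frozenset(pair): secrets.token_bytes(size) for pair in combinations(names, 2)}


def mask(p: str) -> bytes:
    """E(P_p): the partial xor-ed with the (t - 1) keys that p shares with the others."""
    out = partials[p]
    for q in names:
        if q != p:
            out = xor_bytes(out, keys[frozenset((p, q))])
    return out


masked = {p: mask(p) for p in names}
share = reduce(xor_bytes, masked.values())        # every key appears twice and cancels
expected = reduce(xor_bytes, partials.values())   # what unmasked partials would give
```

Each key appears in exactly two masked partials, so the keys vanish only once all five values have been combined.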
ZIG Verification
ZIG rebuilding enables another useful technique
The ability to verify that one’s presently held share is correct and not corrupted, without others even knowing what data you hold
Moreover, it can be done with almost no network use
Certain integrity checks, such as CRC, are linear, meaning CRC(x) ⊕ CRC(y) = CRC(x ⊕ y)
It follows that the xor of the CRCs of all the partials, or of all the ZIG-encrypted partials, is the CRC of the share
Ask t other shareholders for the CRCs of their ZIG-encrypted partials of your share, xor them together, and compare the result with the CRC of the share you hold
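The linearity property can be checked with a hand-rolled CRC. The sketch assumes equal-length inputs and a CRC-32 variant with a zero initial value and no final XOR; widely used variants such as `zlib.crc32` add a nonzero initial value and a final XOR, which makes them affine rather than strictly linear. The function name `crc32_linear` and the random stand-in partials are illustrative.

```python
import secrets
from functools import reduce


def crc32_linear(data: bytes) -> int:
    """Bitwise CRC-32 (reflected polynomial 0xEDB88320), zero initial value, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


# Random stand-ins for five equal-length partials whose xor is the share.
partials = [secrets.token_bytes(64) for _ in range(5)]
share = reduce(xor_bytes, partials)

crc_of_share = crc32_linear(share)
xor_of_crcs = reduce(lambda a, b: a ^ b, (crc32_linear(p) for p in partials))
```

Only the 4-byte CRCs need to cross the network, which is why the check uses almost no bandwidth.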
Conclusions
Many Secret Sharing Schemes are fundamentally the same: they are systems of linear equations in a finite field
Rebuilding is currently a painful but necessary evil
New methods of rebuilding can significantly reduce that pain, and make the process much more secure