Encryption Key Recovery Using Coupons 2

Cloud Based Recoverable Encryption through Noised Secret with Discounts

Sushil Jajodia, Witold Litwin, Thomas Schwarz

April 12, 2013

Extended Abstract

Abstract. The danger of key loss is the Achilles' heel of cryptography. Backup copies of keys held by escrows help, but favor disclosure. Schemes for recoverable encryption through noised secret alleviate the dilemma. The escrow's backup is itself specifically encrypted, and the recovery must use a cloud, possibly thousands of nodes large. The cloud cost, the money trail, etc. may be expected to make illegal attempts rare. We now add to the known schemes the concept of a discount. The recovery requestor optionally provides the discount with the request. The discount contains a code that lowers the recovery complexity, easily by orders of magnitude. A smaller cloud may then suffice for the same recovery time. Alternatively, the same cloud may provide a faster recovery. We define the concept of a coupon and adapt known schemes for recoverable encryption through noised secret. We analyze various properties of the new schemes.

1 Introduction

Key recovery is a classical goal. The key escrow idea, where a copy of the key is backed up with some trusted service, was proposed as a basis for solutions. The approach did not become popular. As is now well known, key disclosure creates temptations that an escrow may not resist. The concept of recoverable encryption was meant as a basis for more realistic solutions. The backup is itself encrypted in a novel way. This encryption should make (1) the brute-force recovery, i.e., from the backup alone, always feasible, although computationally hard, unlike for a key copy. The recovery complexity (hardness) should be (2) arbitrarily fixed by the key owner, depending on the trust in the escrow. As a result, the recovery decryption on the escrow's site (node) alone should become (3) impractical, e.g., it could usually last dozens of days at least.
Nevertheless, (4) the maximal recovery time should shrink over a cloud, with a possibly linear speed-up to O (M / N). Here M is a key-owner defined integer providing the O (M) complexity and N is the cloud size in nodes. The recovery requestor is expected to provide this time. Practical timing, e.g., in minutes, should (5) imply N in the thousands. The hiring of such a large cloud may be expected to be usually noticeable, as well as somewhat costly and traceable, e.g., through the money trail. All these properties should predictably make illegal recovery attempts, e.g., by an escrow-side insider, arbitrarily less tempting than up to now.

The class of recoverable encryption schemes termed through noised secret sharing, RENS in short, proved the concept feasible over the current cloud infrastructure, [ ]. We now extend these schemes with the concept of a discount. As the name suggests, a discount contains a code that lowers the recovery cost with respect to the brute-force one. Technically, it lowers the calculation complexity, easily by orders of magnitude. We speak below about the discounted recovery. The discounted recovery time, whether the worst-case or the average one, is accordingly smaller for the same N. A discount of, say, 50% halves it. Alternatively, the discounted recovery uses a smaller cloud for the same timing, e.g., twice as small.

The requestor sends the discount to the escrow within the recovery request. The escrow is not aware of any discount code otherwise. The code amends the otherwise brute-force RENS request sent to the cloud. If the requestor is unable to provide a coupon, the brute-force recovery is always possible. In this sense, a key recoverable according to an RENS scheme is never lost. For any given key, only specific discount codes lead to the recovery. Any discount provided nevertheless triggers a recovery attempt with the associated lower cost. An unsuccessful recovery also respects the requestor's timeline.
However, it doubles the cost of the successful one. The full-cost brute-force recovery always remains an option. Finally, every discount is guaranteed successful only for one key. For any different key, it basically acts as a random guess of what the right discount could be.

The key owner may send out discounts to other potential requestors. Different discounts for the same backup are possible. A greater discount may reflect a higher trust that the selected user performs a legitimate recovery only.

Below, we first define and analyze the RENS recovery with coupons, using the simplest form of the code, which we define, and the schemes based on the so-called 2-share noised secret. We reuse for this purpose the schemes in []. We define successively the backup creation, then the discounted recovery calculation. Next, we analyze the correctness, the complexity and the safety of the resulting schemes. Next, we discuss other useful coupon code specifications. In Section _, we address similarly the schemes using more noised shares. We then discuss the related work and we finally conclude.

2 Backup Creation

Let S be the key to back up, e.g., a 256-bit AES key. The key owner or the agent, i.e., a client program, say C, running on the owner's site, first creates a usual 2-share secret, with shares, say, s0 and s1 = S XOR s0. It is common knowledge that S = s0 XOR s1. Next, the owner chooses some time D, e.g., 70 days. D is the presumed time of the recovery of S on the escrow's site alone, assuming a 1-node (core) site. After that, C defines the hint h = H (s0), using some one-way hash function H, e.g., H = SHA256. We recall that in practice (i) h is unique for any s0 and (ii) it is infeasible, for a good H such as the one mentioned, to find s0 as H^-1 (h). Next, C determines the number M of match attempts H (s) =? h that a 1-node site could perform in time D, where s is an integer that could be s0. Then, C chooses a random noise n that is an integer in the noise space I = [0, M[.
An integer f = s0 – n becomes the base noise share, while we call s0 the noised one. The name comes from the backup representation of s0, which is P = (f, h, M). The backup sent to the escrow is the couple (s1, P).

3 Discount

In [ ], the brute-force recovery, i.e., the one using exclusively the backup as defined there and until now here, was the only capability of the RENS schemes. One may nevertheless observe that the requestor could in fact have some useful prior knowledge of s0. The analysis in Section _ shows that the recovery calculation can make good use of such knowledge. Transmitted with the recovery request, it may lower the recovery computation complexity. For instance, the key owner could note that s0 is an odd integer. This would lower the complexity by 50%, as we show. We say that any such knowledge defines a discount, of 50% in this case.

More precisely, the key owner defines the discount for a given backup according to an RENS scheme through some discount code. At its simplest, the code is an m-bit suffix of the noised share s0 ; m = 0, 1, …. We implicitly focus on such codes until we say otherwise. The value m = 0 tacitly means the no-discount request, i.e., for the brute-force recovery. Otherwise, we expect m = 8 or m = 16 at most, in practice. We call discount value the reduction that the recovery with the code provided offers with respect to the brute-force recovery complexity. The analysis later on shows that the value of any m-bit long code is 2^m, for both the worst-case and the average complexity. We thus expect a reduction of each complexity of up to 2^8 = 256 and 2^16 = 64K times, respectively. Codes that long may appear as one or two ASCII characters. One may expect them to be easy to keep safely, e.g., on a smartphone, or just in memory. Recall that Europeans routinely keep in mind the 4-digit credit card codes. As a result, these codes may lower the recovery time 2^8 to 2^32 times with respect to the brute-force figures.
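The backup creation of Section 2 and the simplest discount code of Section 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the big-endian byte encoding fed to SHA-256 and the resampling when s0 < n are choices made here and are not specified in the text. M is passed in as a parameter rather than measured from D.

```python
import hashlib
import secrets

KEY_BITS = 256  # share length, matching a 256-bit AES key


def H(x: int) -> bytes:
    """One-way hash of a share: SHA-256 over its big-endian bytes (encoding is an assumption)."""
    length = max(KEY_BITS // 8, (x.bit_length() + 7) // 8)
    return hashlib.sha256(x.to_bytes(length, "big")).digest()


def create_backup(S: int, M: int):
    """Return (s1, P, s0): (s1, P) goes to the escrow, s0 stays with the owner."""
    while True:
        s0 = secrets.randbits(KEY_BITS)   # noised share
        n = secrets.randbelow(M)          # noise in I = [0, M[
        f = s0 - n                        # base noise share
        if f >= 0:                        # assumption: resample in the rare case s0 < n
            break
    s1 = S ^ s0                           # so that S = s0 XOR s1
    P = (f, H(s0), M)                     # backup representation of s0
    return s1, P, s0


def discount_code(s0: int, m: int) -> int:
    """The m-bit suffix of the noised share s0; m = 0 means no discount."""
    return s0 & ((1 << m) - 1)
```

With m = 8, `discount_code` returns one byte that the owner can keep separately from the backup; the escrow only ever receives (s1, P).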
Instead of shortening the recovery, such codes may reduce the cloud size for the same timing. In the end, they discount the cloud cost accordingly. Obviously, a necessary condition for a successful match attempt is that the noise share embeds the code. The discounted recovery calculation we define below attempts matches only for such shares.

4 Recovery

4.1 Recovery Request

The escrow performs the recovery upon a legitimate request. How the escrow knows which request is legitimate is out of scope here. The recovery schemes with discount discussed below reuse the schemes for the brute-force recovery defined in [_]. In particular, the recovery request has the same form, augmented however with the discount code. It is thus formally the tuple Pd = (P, R, d). As for the brute-force recovery alone, R designates here the desired bound on the recovery time, e.g., 10 min.

4.2 Brute-Force Recovery

If the request has no coupon, i.e., m = 0 in d, the escrow proceeds with the brute-force recovery decryption. The schemes in [] then apply as is. The escrow thus forwards Pd, but not s1, to some cloud node called the coordinator. In this way no cloud insider can disclose the recovered key. With respect to the actual execution on the cloud, managed by the coordinator, we recall that there were basically two schemes, with so-called static and scalable partitioning. The former was proposed for a homogeneous cloud, the latter for a heterogeneous one. Their common characteristic is that the recovery calculations attempt the matches over different noise shares f + n, until the successful match. This one must occur, but the attempts may possibly explore every n in I. Both schemes partition the attempts over N nodes, with the linear speed-up O (N). The choice of the value of N depends on the scheme.
In both cases, it makes the recovery computation at each node fit the time bound provided by the requestor, e.g., 10 min. As a result, the whole calculation fits this bound. Typically, N should possibly be in the thousands, as we discussed. The cloud delivers the noised share s0 found to the escrow. The escrow XORs it with s1 and, finally, delivers the key to the requestor.

4.3 Discounted Recovery

The discounted recovery request differs from the brute-force one only by the additional presence of the discount code with m > 0. The cloud uses either scheme defined for the brute-force recovery, with the following modifications.

1. The coordinator calculates M' = M \ 2^m.
2. It initiates the static or the scalable scheme with M' instead of M. We recall that this step determines N as a function of M' and R.
3. It determines the smallest noise share with suffix d that is greater than or equal to f. It may calculate it as follows:
    d' = f – f \ 2^m * 2^m ;  /* d' is the m-bit suffix of f */
    If d' = d then f' = f ;
    else f' = f \ 2^m * 2^m + d ;  /* f' is an integer ending with d */
    If f' < f then f' := f' + 2^m ;  /* f' is the smallest noise share ending with d */
4. It delivers (1) the "usual" brute-force request for match attempts and (2) d in addition, to each of the N nodes.
5. Using M' instead of M, every node calculates, one after another, each value of the noise n for which it should generate the noise share s for the match attempt H (s) =? h.
6. For each n, the node calculates s as s = f' + n * 2^m.
7. As for the brute-force recovery, except for M' instead of M, the node starts the match attempts with some n, usually n = 0. If successful, the node reports the result to the coordinator, unless the node is the coordinator itself. Otherwise, the node tries out some next n. The node continues until the last n < M' or until it receives the message from the coordinator requesting to stop the attempts.
8. Assuming the cloud finds the noised share s0, it returns it to the escrow.
The escrow XORs it with s1 and returns the recovered key to the requestor.

If the cloud finds s0, we say that the discount was valid. Otherwise, the cloud got an invalid one. The legitimate requestor perhaps made some error, or the discount came from an intruder. The cloud acts in the same way for a valid or an invalid d. There is no way for the coordinator to distinguish between both upfront. Notice that, in contrast, the brute-force recovery (normally) always terminates with a successful search.

Ex. 1. Consider M = 2^50, which should be rather typical. We suppose the noise shares 256 bits long, as an AES key. Next, let f = '10….1011' and m = 2. We further suppose d = '01'. The coordinator calculates M' = 2^48. Let us suppose that the static scheme is used and that the calculations for M', as in [ ] for M, yield N = 1K, hence nodes numbered n = 0, 1, …, N – 1. The coordinator first calculates d' = '11'. Since d' is not d, the next calculation yields f' = '10….1001'. We have f' < f and, since f is the minimal noise share, f' cannot be one. The smallest noise share with suffix d is therefore f' = f' + 2^2 = '10….1101'. Node 0 attempts the matches for noises k = 0, 1024, 2048, …, i.e., with each successive k such that k mod N = 0, up to the largest such k < M'. Each k is multiplied by 2^2 and added to f', then node 0 attempts the match of the resulting noise share, etc. Likewise, node 1 attempts the matches for noises k = 1, 1025, 2049, …, i.e., where k mod N = 1, also up to the largest such k < M'. In general, every node n attempts in this way the matches for each and only those k that yield k mod N = n.

The integer division '\' by 2^m denotes in fact an m-bit right shift. Likewise, the multiplication by 2^m denotes an m-bit left shift. Dedicated functions, when available to the compiler, may be faster than the arithmetic calculations. There are thus various ways to implement the algorithm, which we do not address further here.
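The steps of Section 4.3 with static partitioning can be sketched as follows. This is a minimal single-process illustration, not the scheme's actual cloud implementation: the N nodes are simulated one after another, the function names and the byte encoding fed to SHA-256 are assumptions made here, and M is assumed to be a multiple of 2^m so that M' noises cover every share ending with d.

```python
import hashlib


def H(x: int) -> bytes:
    """SHA-256 of a share over at least 32 big-endian bytes (encoding is an assumption)."""
    return hashlib.sha256(x.to_bytes(max(32, (x.bit_length() + 7) // 8), "big")).digest()


def smallest_share_with_suffix(f: int, d: int, m: int) -> int:
    """Step 3: the smallest integer >= f whose m low-order bits equal d."""
    d_prime = f - ((f >> m) << m)        # m-bit suffix of f
    if d_prime == d:
        return f
    f_prime = ((f >> m) << m) + d        # an integer ending with d
    if f_prime < f:
        f_prime += 1 << m                # next integer ending with d
    return f_prime


def node_candidates(f_prime: int, M_prime: int, m: int, N: int, node: int):
    """Static partitioning: a node handles every noise k < M' with k mod N == node."""
    for k in range(node, M_prime, N):
        yield f_prime + (k << m)         # s = f' + k * 2^m


def discounted_recovery(f: int, h: bytes, M: int, d: int, m: int, N: int):
    """Return s0 if the discount code d is valid, else None."""
    M_prime = M >> m                     # step 1: M' = M \ 2^m
    f_prime = smallest_share_with_suffix(f, d, m)
    for node in range(N):                # nodes simulated sequentially here
        for s in node_candidates(f_prime, M_prime, m, N, node):
            if H(s) == h:                # match attempt H(s) =? h
                return s
    return None
```

Calling `discounted_recovery` with m = 0 and d = 0 degenerates to the brute-force enumeration over all M noises, which mirrors the convention that m = 0 means no discount.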
5 Algorithm Analysis

5.1 Correctness

For every N, every f and every d, each RENS scheme under consideration generates every noise share ending with d. No such share is generated twice.

The proof is rather easy to see from the example. First, we start with determining the smallest noise share, here f', that ends with d. The loop that attempts the matches at each node starts with f' or with the smallest value above f' among those handled by the node. The value of f' has to be greater than or equal to f. By the definition of f, a smaller integer simply cannot be a noise share. The algorithm nevertheless first calculates f \ 2^m * 2^m, whose last m bits are '0..0'. This value can be under f. The algorithm then adds d to it, to have a value f' with the right suffix. The example showed one case where this value is smaller than f. That is why it was then adjusted to, obviously, the smallest possible value over f ending with d. Next, observe that for d = '11' instead, we would have d' = d, hence f' = f. Finally, for f = '10….1001' and d = '11', the calculation f' = f \ 2^m * 2^m + d would directly yield f' > f. Obviously, there cannot be other cases for f' being the smallest noise share to begin with. Next, it should be clear from the example that, whatever N and the discount code are, all M' noises are possibly explored, once per noise. A similar analysis holds for any partitioning that could be generated by a scalable scheme. We skip here the tedious details, referring the reader to [].

5.2 Complexity

An m-bit long d decreases the recovery calculation complexity (hardness) 2^m times in practice. Respectively, we have O (M / 2^m) for the worst case and O (M / 2^(m+1)) on the average.

Proof. For the brute-force recovery, the complexity could be measured basically by the number of noises to try out, at most or on the average. Each noise may indeed trigger a match attempt.
The computational cost of SHA256, as well as of any other known good 1-way hash function, dominates the additional operations required at the start-up, at the termination, etc. of the algorithms. We thus had basically the complexity of O (M) in the worst case, for both the static and the scalable schemes. The algorithm for the discounted recovery has at most M' noises to try out, in both cases as well. This is 2^m times smaller. On the other hand, the discounted recovery algorithm requires an additional initial calculation of M'. Next, it requires the calculation of f'. Finally, at each attempt, there is an additional multiplication by 2^m. However, it is common knowledge that the cumulated computational cost of a few such operations is again negligible with respect to that of SHA256 or of another good 1-way hash calculation. Hence, we have basically the O (M') worst-case complexity, i.e., the O (M / 2^m) one.

For the average case, we had under similar assumptions O (M / 2) for the brute-force recovery. The reason was that both schemes enumerated all attempts until the successful one, while every noise, hence every noise share tried out, was equally likely to succeed, provided a good 1-way hash, as we supposed. For the discounted recovery, every attempt again uses a different noise and at worst all M' noises are explored. The discount code is (pseudo)random, hence every code is equally likely. Also, the rest of s0, beyond the discount code, is (pseudo)random. Hence, every noise share generated is again equally likely to be the noised one, under the same good 1-way hash assumption. We thus have on the average the O (M' / 2) complexity, hence O (M / 2^(m+1)).

Ex. 2. Consider the running example in [_], where the encryption complexity is set up so that the 1-node recovery would require up to a prohibitive 70 days, and 35 days on the average. To recover the key in 10 min at most instead, using the brute force, a 10K-node wide cloud may do. The actual cost could be $200.
Consider that the owner retained some 8-bit discount code. Now, a 40-node cloud may suffice for the same timing. Alternatively, the same 10K-node cloud delivers the discounted recovery in up to a couple of seconds. In both cases, the cost theoretically drops to less than $1. A 16-bit discount would reduce these figures further by the same factor. The requestor could even recover the key at her/his own, presumably single, node in about 2 min.

5.3 Safety

1. Knowledge of a discount code cannot lower the complexity of the requested backup under the values O (M') at worst and O (M' / 2) on the average provided by our algorithm (see below).

Proof. Our algorithm enumerates all attempts until the successful one (if any). Every attempt uses a different noise among M' and, at worst, all M' noises must be explored. The rest of s0, beyond the discount code, is (pseudo)random and thus independent of the discount code value. Also, for a good 1-way hash as we suppose, each such value is equally likely to generate the match. Hence, whatever a given discount code is, one cannot calculate, from it or otherwise, any f' that could be less or more likely than any other possible one. No method exists that would allow attacking the requested backup from its given discount so as to lower the complexity under that of our algorithm.

2. Guessing a discount code does not lower the complexity of any backup under O (M).

Proof sketch. An attacker A considers m = 1 and guesses the value of the 1-bit code. The success probability is 0.5. If A succeeds, there are up to M / 2 attempts and M / 4 on the average. If A misses, there are M / 2 attempts, all unsuccessful. They have to be followed by up to M / 2 attempts until the successful one. On the average, there are then M / 4 such attempts. In total, the maximal complexity remains M (and O (M) more generally). The average one is (M / 4 + (M / 2 + M / 4)) / 2 = M / 2, i.e., that of the brute-force recovery. The guessing did not help.
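The same accounting extends to an m-bit guess. As a sketch, assume (beyond what the text states) that the attacker tries the 2^m candidate codes one after another without repetition, that the true code is uniformly distributed, that each wrong code costs the full M / 2^m attempts, and that the right one costs M / 2^(m+1) attempts on the average. The expected number of wrong codes tried before the right one is then (2^m – 1) / 2, which gives:

```latex
\begin{align*}
C_{\max} &= 2^m \cdot \frac{M}{2^m} = M, \\
C_{\mathrm{avg}} &= \frac{2^m - 1}{2} \cdot \frac{M}{2^m} + \frac{M}{2^{m+1}}
                  = \frac{M}{2^{m+1}} \bigl( (2^m - 1) + 1 \bigr)
                  = \frac{M}{2}.
\end{align*}
```

Both values coincide with the worst-case and the average figures of the brute-force recovery.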
It is easy, though tedious, to see that the end result is the same for any m > 1.

3. A discount code d for a backup B does not lower the complexity of any discounted recovery using d for a different backup B'. The latter remains O (M), characterizing the brute-force recovery of B'.

Proof. The discount codes being (pseudo)random, it would indeed be like guessing in (2).

Property 3 means that the knowledge by the escrow of a discount code for a backup does not threaten any different backup in the escrow's possession. In other words, a discount code once used by the escrow is of no further utility.

6 Key-owner defined discount codes

The owner chooses for a key a convenient discount code, i.e., one easy to store or remember. Possibly, one then chooses a code that an outsider cannot know. It could be, e.g., the first two letters of the name of a childhood pet. This way of proceeding is acceptable when a single backup is used. An attacker getting the knowledge of any such code within the recovery request cannot lower the complexity, since all the other bits of s0 are random. It may be in contrast risky for the safety of multiple backups. In particular, using the same code for multiple backups is an obvious invitation to disaster. One discount code production rule could then be that the owner derives the discount from the ID of the encrypted dataset concatenated with a secret, convoluted passphrase that is in practice impossible to guess, e.g., Dali'srabbit'sSwatch*shows17:15. The derivation may then use a 1-way hash, e.g., our SHA256, finally truncated to the desired length. The rule seems safe as long as the secret data remain undisclosed. Any dictionary attack etc. indeed appears to have impractical complexity.

6.1 Partly defined discounts

Ex. The discount code has m = 8 and is a digit 0…9 in ASCII code. Such a discount specification potentially reduces the recovery complexity 25.6 times.

TBC
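The production rule of Section 6, where the dataset ID is concatenated with a secret passphrase, hashed and truncated, can be sketched as follows. The function names, the UTF-8 encoding, the choice of keeping the leading m bits of the digest and the way the code is embedded into s0 are all assumptions made for illustration. The text only states that SHA256 is truncated to the desired length and that the other bits of s0 are random.

```python
import hashlib
import secrets


def derive_discount_code(dataset_id: str, passphrase: str, m: int) -> int:
    """Hash dataset ID || passphrase with SHA-256 and keep m bits of the digest."""
    digest = hashlib.sha256((dataset_id + passphrase).encode("utf-8")).digest()
    # Keep the leading m bits; which end is truncated is an assumption.
    return int.from_bytes(digest, "big") >> (256 - m)


def share_with_code(code: int, m: int, key_bits: int = 256) -> int:
    """Assumption: s0 is random except for its m-bit suffix, which is set to the code."""
    return (secrets.randbits(key_bits - m) << m) | code
```

Because the dataset ID enters the hash, two backups protected by the same passphrase get unrelated codes, which is the point of the rule.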
