Secure Lightweight Entity Authentication with Strong PUFs: Mission Impossible?
Authors: Jeroen Delvaux (1,2), Dawu Gu (1), Dries Schellekens (2), Ingrid Verbauwhede (2)
(1) CSE/LoCCS, Shanghai Jiao Tong University, Shanghai, China
(2) ESAT/COSIC, KU Leuven, Belgium
Abstract. Physically unclonable functions (PUFs) exploit the unavoidable manufacturing variations of an
integrated circuit (IC). Their input-output behavior serves as a unique IC ‘fingerprint’. Therefore, they
have been envisioned as an IC authentication mechanism, in particular for the subclass of so-called
strong PUFs. The protocol proposals are typically accompanied with two PUF promises: lightweight and
an increased resistance against physical attacks. In this work, we review eight prominent proposals in
chronological order: from the original strong PUF proposal to the more complicated converse and
slender PUF proposals. The novelty of our work is threefold. First, we employ a unified notation and
framework for ease of understanding. Second, we initiate direct comparison between protocols, which
has been neglected in each of the proposals. Third, we reveal numerous security and practicality issues, to such an extent that we cannot support the use of any proposal in its current form. All proposals aim to compensate for the lack of cryptographic properties of the strong PUF; however, proper compensation seems to oppose the lightweight objective.
Keywords: physically unclonable function, entity authentication, lightweight
1. Introduction
In this work, we consider a common authentication scenario with two parties: a low-cost resource-constrained token and a resource-rich server. Practical instantiations include RFID tags, smart cards and wireless sensor nodes. One-way or possibly mutual entity authentication is the objective.
The server has secure computing and storage at its disposal. Providing security is a major challenge
however, given the requirements at the token side. Tokens typically store a secret key in non-volatile
memory (NVM), using a mature technology such as EEPROM and its successor Flash, battery backed
SRAM or fuses. Cryptographic primitives import the key and perform an authentication protocol. Today's
problems are as follows. First, implementing cryptographic primitives in a resource-constrained manner
is rather challenging. Second, an attacker can gain physical access to the integrated circuit (IC) of a
token. NVM has proven to be vulnerable against physical attacks [19]: the secret is stored permanently
in a robust electrical manner. Third, most NVM technologies oppose the low-cost objective.
EEPROM/Flash requires floating gate transistors, resulting in additional manufacturing steps with
respect to a regular CMOS design flow. Battery backed SRAM requires a battery. Circuitry to protect the
NVM contents (e.g. a mesh of sensing wires) tends to be expensive.
Physically unclonable functions (PUFs) offer a promising alternative. Essentially, they are binary
functions, with their input-output behavior determined by IC manufacturing variations. Therefore, they
can be understood as a unique IC ‘fingerprint’, analogous to human biometrics. They might alleviate the
aforementioned problems. Many PUFs allow for an implementation which is both resource-constrained
and CMOS compatible. Furthermore, the secret is hidden in the physical structure of an IC, which is a
much less readable format. Invasive attacks might easily destroy this secret, as an additional advantage.
Several PUF-based protocols have been proposed, in particular for the subclass of so-called strong PUFs.
We review the most prominent proposals: controlled PUFs [4-6], Ozturk et al. [15], Hammouri et al. [7],
logically reconfigurable PUFs [10], reverse fuzzy extractors (FEs) [21], slender PUFs [14] and the converse
protocol [11]. The novelty of our work is threefold. First, we employ a unified notation and framework
for ease of understanding. Second, we initiate direct comparison between protocols, which has been
neglected in each of the proposals. Third, we reveal numerous security and practicality issues, to such an extent that we cannot support the use of any proposal in its current form. All proposals aim to compensate for the lack of cryptographic properties of the strong PUF; however, proper compensation seems to oppose the lightweight objective. The observations and lessons learned in this work might
facilitate future protocol design.
The organization of this paper is as follows. Section 2 introduces notation and preliminaries. Section
3 analyzes the strong PUF protocols. Section 4 provides an overview of the protocol issues. Section 5
concludes the work. We operate at protocol level, considering PUFs as a black box. The protocol of
Hammouri et al. is therefore discussed in Appendix B: a good understanding of PUF internals is required,
as described in Appendix A.
2. Preliminaries
2.1. Physically Unclonable Functions: Black Box Description
The m-bit input and n-bit output of a PUF are referred to as challenge C and response R respectively.
Unfortunately for cryptographic purposes, the behavior of the challenge-response pairs (CRPs) does not
correspond with a random oracle. First, the response bits are not perfectly reproducible: noise and
various environmental perturbations (supply voltage, temperature, etc.) result in non-determinism.
The reproducibility (error rate) differs per response bit. Second, the response bits are non-uniformly
distributed: bias and correlations are present. The latter might enable so-called modeling attacks. One
tries to construct a predictive model for the PUF, given a limited set of training CRPs. Machine learning
algorithms have proven to be successful [18].
PUFs are often subdivided into two classes, according to their number of CRPs. Weak PUFs offer few CRPs: their total content (2ᵐ·n bits) is of main interest. Architectures typically consist of an array of identically laid-out cells (or units), each producing one response bit. The SRAM PUF [8] and the ring oscillator PUF¹ [20], for example, are both very popular. The total bit content scales roughly linearly with the required IC area.
Although there might be some correlation or a general trend among cells, a predictive model is typically
of no concern. The response bits are mostly employed to generate a secret key, to be stored in volatile
memory, in contrast to NVM. Post-processing logic, typically a fuzzy extractor (FE) [3], is required to
ensure a reproducible and uniformly distributed key.
Strong PUFs offer an enormous number of CRPs, often scaling exponentially with the required IC area.
They might greatly exceed the need for secret key generation and have been promoted primarily as
lightweight authentication primitives. Architectures are typically able to provide a large challenge space
(e.g. m = 128), but only a very small response space, mostly n = 1. CRPs are highly correlated, making
modeling attacks a major threat. The most famous example is the arbiter PUF [12], described in
appendix A.
2.2. Secure Sketch
The non-determinism of a PUF causes the regenerated instance of a response R to be slightly different: Ṙ
= R + E, with HW(E) small. Secure sketches [3] are a useful reconstruction tool, as defined by a two-step
procedure. First, public helper data is generated: P = Gen(R). Second, reproduction is performed: R =
Rep(Ṙ, P). Helper data P unavoidably leaks some information about R, although this entropy loss is
supposed to be limited. Despite the rather generic definition, two constructions dominate the
implementation landscape, as specified below. Both the code-offset and syndrome construction employ a binary [n,k,t] block code C, with t the error-correcting capability. The latter construction requires a linear block code, as it employs the parity check matrix H. Successful reconstruction is guaranteed for both constructions, given HW(E) ≤ t. Information leakage is limited to n-k bits. The hardware footprint is asymmetric: Gen is better suited for resource-constrained devices than Rep [21].
¹ We consider the most usable read-out modes which avoid correlations, e.g. pairing neighboring oscillators.
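As a toy illustration of the code-offset construction (our own example using a [3,1,1] repetition code applied per 3-bit response block; deployed designs use e.g. BCH codes):

```python
import random

# Toy code-offset secure sketch with a [3,1,1] repetition code, applied
# per 3-bit response block (illustrative; real designs use e.g. BCH codes).

def majority(block):
    return int(sum(block) >= 2)

def gen(R):
    """Gen: helper data P = R XOR W, with W a random codeword."""
    W = [[random.randint(0, 1)] * 3 for _ in R]
    return [[r ^ w for r, w in zip(rb, wb)] for rb, wb in zip(R, W)]

def rep(R_noisy, P):
    """Rep: decode R' XOR P to the nearest codeword, then add P back."""
    out = []
    for rb, pb in zip(R_noisy, P):
        w = majority([r ^ p for r, p in zip(rb, pb)])  # corrects up to t=1 error
        out.append([w ^ p for p in pb])
    return out

# One enrollment, then a noisy regeneration with one bit flip per block.
R = [[0, 1, 0], [1, 1, 0], [1, 0, 1]]
P = gen(R)
R_noisy = [[rb[0] ^ 1] + rb[1:] for rb in R]  # HW(E) = 1 <= t per block
assert rep(R_noisy, P) == R
```

Note that Gen only needs a random codeword and an XOR, whereas Rep must decode, which is why the hardware footprint is asymmetric.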
Figure 1: Token representation for all protocols and the two references.
3. Lightweight Authentication with Strong PUFs
We analyze all strong PUF authentication schemes in chronological order. One can read the sections in arbitrary order, although we highly recommend reading Section 3.1 first. All schemes employ two phases. The
first phase is a one-time enrollment in a secure environment, following IC manufacturing. The server
then obtains some information about the PUF, CRPs or even a predictive model via machine learning, to
establish a shared secret. The destruction of one-time interfaces might permanently restrict the PUF
access afterwards. The second phase is in-the-field deployment, where tokens are vulnerable to physical
attacks. Token and server then authenticate over an insecure communication channel. In general:
challenge C and response R are required to be of sufficient length, e.g. m = n = 128, to counteract brute
force attacks and random guessing.
For proper assessment, we define two reference authentication methods. Reference I employs a token
with a secret key K programmed in NVM, as represented by Figure 1(a). Additional cryptographic logic
performs the authentication. For ease of comparison, we opt for a hash function, hereby limiting
ourselves to token authenticity only. The server checks whether a token can compute a ← Hash(K,N),
with N a random nonce. Reference II employs PUF technology, potentially providing more physical
security at a lower manufacturing cost. We employ a weak PUF² to generate a secret key, as represented by Figure 1(b). The reproducibility and non-uniformity issues are resolved in a sequential manner, using a
FE. A secure sketch first ensures reproducibility. Gen is executed only once during enrollment. The public
helper bits are stored by the server, or alternatively at the token side in insecure off-chip NVM. A hash
function performs entropy compression, hereby compensating the non-uniformity of R, in addition to
the entropy loss caused by the helper data. One could generate a key as K ← Hash(R). We perform an
optimization by merging this step with the authentication.
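Reference I's check a ← Hash(K, N) might be sketched as follows (SHA-256 stands in for the unspecified hash; for reference II, the reconstructed weak-PUF response R takes the role of K):

```python
import hashlib
import secrets

def hash_auth(key: bytes, nonce: bytes) -> bytes:
    """a <- Hash(K, N); SHA-256 stands in for the unspecified hash."""
    return hashlib.sha256(key + nonce).digest()

# K is the NVM key (reference I); for reference II one would use the
# reconstructed weak-PUF response R directly, i.e. a <- Hash(R, N).
K = secrets.token_bytes(16)
N = secrets.token_bytes(16)      # fresh server nonce, prevents replay
a_token = hash_auth(K, N)        # computed on the token
assert secrets.compare_digest(a_token, hash_auth(K, N))  # server-side check
```

The nonce N ensures that each transcript is fresh, so an eavesdropper cannot replay an earlier answer.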
3.1. Naive Authentication
The simplest authentication method employs an unprotected strong PUF only [16], as shown in
Figure 1(c). Figure 2 represents the corresponding protocol. The server collects a database of d arbitrary
CRPs during enrollment. We assume the use of a true random number generator (TRNG). A genuine
token should be able to reproduce the response for each challenge in the server database. Only an
approximate match is required, taking PUF non-determinism into account: Hamming distance threshold
Ɛ implements this. To avoid token impersonation via replay, CRPs are discarded after use, limiting the
number of authentications to d. Choosing e.g. m = 128, an attacker cannot gather all CRPs and
impersonate a token as such. Choosing e.g. n = 128, randomly guessing R is extremely unlikely to
succeed.
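A minimal sketch of the naive protocol (our own toy model; `toy_puf` stands in for a real strong PUF, and d, ε are illustrative):

```python
import secrets

def hd(a: int, b: int) -> int:
    """Hamming distance between two 128-bit responses."""
    return bin(a ^ b).count("1")

class Server:
    """Enrollment: store d arbitrary CRPs; each is discarded after use."""
    def __init__(self, puf, d=3, eps=20):
        self.db = [(c, puf(c)) for c in (secrets.randbits(128) for _ in range(d))]
        self.eps = eps

    def authenticate(self, token_puf) -> bool:
        if not self.db:
            return False                 # CRP database exhausted
        c, r_ref = self.db.pop()         # discard after use (replay protection)
        return hd(token_puf(c), r_ref) <= self.eps

# A deterministic toy function stands in for a real strong PUF (assumption).
toy_puf = lambda c: (c * 0x9E3779B97F4A7C15) % (1 << 128)
server = Server(toy_puf, d=3)
assert server.authenticate(toy_puf)      # genuine token: approximate match
```

After d runs the database is empty and authentication necessarily fails, which is exactly the limitation the threshold-and-discard design accepts.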
² Logic for generating challenges is implicitly present and might be as simple as reading out the full cell array.
Figure 2: Naive authentication protocol. The thick arrow points from verifier to prover.
Modeling Attacks Strong PUFs are too fragile for unprotected exposure, as demonstrated by a history of
machine learning attacks [18]. A predictive PUF model would enable token impersonation. So far, no
architecture can claim to be practical, well-validated and robust against modeling. Two fundamental
problems undermine the optimism for a breakthrough. First, strong PUFs extract their enormous
amount of bits from a limited IC area only, hereby using a limited amount of circuit elements. A highly
correlated structure is the unavoidable consequence. The arbiter PUF model in appendix A.2 provides
some insights in this matter. Second, the more complicated the structure of the PUF, the more robust
against modeling, but the less reproducible the response, as local non-determinism propagates to the
PUF output. Appendix A.3 provides some insights in this matter.
Limited Response Space In practice, strong PUFs provide a limited response space only, often n = 1.
Replicating the PUF circuit is a simple but unfortunately very expensive solution. The lightweight
approach is to evaluate a list of n challenges, hereby concatenating the response bits. Various methods
can be employed to generate such a list. The server could generate the list, requiring no additional IC
logic, but resulting in a large communication overhead. A small pseudorandom number generator
(PRNG), such as a linear feedback shift register (LFSR), is often employed. Challenge C is then used as a
seed value: R ← PUF(PRNG(C)). A variety of counter-based solutions can be applied as well. All protocol proposals in the remainder of this work typically suggest a particular response expansion method. We abstract away from this choice, except when there is a related security problem.
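As an illustration, LFSR-based response expansion might be sketched as follows (toy 8-bit parameters and a stand-in PUF, both our own assumptions):

```python
def lfsr_step(state: int, taps: int = 0b1011_0100, width: int = 8) -> int:
    """One step of a Fibonacci LFSR (toy 8-bit parameters, illustrative)."""
    fb = bin(state & taps).count("1") & 1    # parity of the tapped bits
    return ((state << 1) | fb) & ((1 << width) - 1)

def expand_response(puf, c: int, n: int) -> int:
    """R <- PUF(PRNG(C)): concatenate n single-bit PUF responses."""
    r, state = 0, c
    for _ in range(n):
        state = lfsr_step(state)             # note: state 0 is a fixed point
        r = (r << 1) | (puf(state) & 1)
    return r

toy_puf = lambda ch: bin(ch).count("1") & 1  # parity as a stand-in PUF
r = expand_response(toy_puf, 0b1011, 16)
assert r == expand_response(toy_puf, 0b1011, 16)  # deterministic expansion
```

The expansion is deterministic in C, so the server can reproduce the same challenge list from its stored CRPs or its PUF model.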
3.2. Controlled PUFs
Controlled PUFs [4-6] provide reinforcement against modeling via a cryptographic (one-way) hash function. Two instances, preceding and succeeding the PUF respectively, are shown in Figure 1(d). Figure 3 represents the corresponding protocol³. The preceding instance eliminates the chosen-challenge advantage of an attacker. The succeeding instance hides exploitable correlations due to the PUF structure. The latter hash seems to provide full protection by itself, but requires the use of a secure sketch: its avalanche effect would trigger on a single bit flip in Ṙ'. CRPs stored by the server are accompanied by helper data.
Inferior to reference II The proposal seems to be inferior to reference II. First, the PUF is required to
have an enormous instead of modest input-output space. This is inherently more demanding, even if
one would extend reference II with a few challenge bits to enable the use of multiple keys. Second, server storage requirements scale proportionally with the number of authentications, in contrast to the constant-size storage of reference II.
³ Controlled PUFs were proposed in a wider context than CRP-based token authentication only.
Figure 3: Authentication with controlled PUFs.
3.3. Ozturk et al.
Ozturk et al. [15] employ two easy-to-model PUFs, as shown in Figure 1(e). Figure 4 represents the
corresponding protocol. Both PUFs are assumed to possess a large response space, without defining
challenge expansion logic, however. The outer PUF provides the token with CRP behavior. To prevent its modeling by an attacker, an internal secret X is XORed within the challenge path. A feedback loop,
containing a (repeated) permutation and an inner PUF, is employed to update X continuously. During
enrollment, the server has both read and write access to X via a one-time interface. This allows the
server to construct models for either PUF, followed by an initialization of X. The server has to keep track
of X, which is referred to as synchronization. The non-determinism of the inner PUF makes this non-trivial. One assumes an excellent match between the responses of the inner PUF and its model. In the rare case that an error does occur, at most one bit of X is assumed to be affected. An authentication failure (violation of Ɛ) provides an indication thereof. A simple recovery procedure is proposed at the server side: bits of X are successively flipped until the authentication succeeds.
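The recovery search can be sketched as follows (our own minimal interpretation; `check_auth` stands for re-running the authentication check with a candidate state):

```python
def recover_state(x_server, check_auth, width=64):
    """Flip single bits of the server's copy of X until authentication
    succeeds. Assumes at most one bit of X drifted (the proposal's
    assumption)."""
    if check_auth(x_server):
        return x_server
    for i in range(width):
        candidate = x_server ^ (1 << i)       # try a single-bit correction
        if check_auth(candidate):
            return candidate
    return None   # recovery failed: more than one bit drifted

# Toy check: the token's true X differs from the server's copy in one bit.
x_token = 0b1011
x_server = x_token ^ 0b0100
assert recover_state(x_server, lambda x: x == x_token, width=4) == x_token
```

The search is linear in the width of X for a single-bit error, but grows combinatorially once several bits drift, which is exactly the desynchronization concern raised below.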
Figure 4: Authentication protocol of Ozturk et al.
Issues Regarding X There are several issues related to the use of X. First, it implies the need for secure reprogrammable NVM, hereby undermining the advantages of PUFs. Either read or write access would enable an attacker to model the system, as during enrollment. Second, the synchronization effort
is too optimistic. PUFs and their models typically have a 3% error rate. The server faces a continuous
synchronization effort, not necessarily limited to a single error. Third, it enables denial-of-service
attacks. An attacker can collect an (unknown) large number of CRPs, desynchronizing X. Propagation of
errors across authentications makes the recovery effort rapidly infeasible.
Feedback Loop Comments The permutation has to be chosen carefully. Repetition is bound to occur, as determined by the order k: π^k = π. This would imply that X has identical bits, opposing its presumed non-uniformity. A simple simulation confirms the significance for 64-bit vectors: the estimated probability that a randomly chosen permutation has k ≤ 63 is ≈ 9%. Furthermore, the need for the
inner PUF is questionable. First, its non-determinism poses a limit on the complexity of the feedback
loop, and hence the modeling resistance of the overall system. Second, the outer PUF and the
initialization of X already induce IC-specific behavior. Using a cryptographic hash function as feedback
would resolve all foregoing comments. The resulting system would then be remarkably similar to a later proposal: logically reconfigurable PUFs [10].
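The simulation is easy to reproduce. The sketch below (our own, under the interpretation that π^k = π first holds for k = order(π) + 1, so k ≤ 63 means order ≤ 62) draws random permutations of 64 elements:

```python
import random
from math import lcm

def perm_order(perm):
    """Order of a permutation: lcm of its cycle lengths."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, cur = 0, start
        while cur not in seen:        # walk one cycle
            seen.add(cur)
            cur = perm[cur]
            length += 1
        lengths.append(length)
    return lcm(*lengths)

random.seed(1)
trials = 2000
# pi^k = pi first holds for k = order + 1, so k <= 63 means order <= 62.
hits = sum(perm_order(random.sample(range(64), 64)) <= 62
           for _ in range(trials))
p = hits / trials
print(p)   # rough estimate of the probability in question
```

The exact estimate depends on the number of trials and the interpretation of k; the paper reports ≈ 9%.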
3.4. Logically Reconfigurable PUF
Logically reconfigurable PUFs [10] were proposed in order to make tokens recyclable, hereby reducing
electronic waste. An internal state X is therefore mixed into the challenge path, as shown in Figure 1(g).
Read access to X is allowed, although not required. There is no direct write access, although one can
perform an update: X ← Hash(X). Figure 5 represents the authentication protocol.
Figure 5: Authentication protocol for logically reconfigurable PUFs.
Exacting PUF Requirements The proposal does not aim to prevent PUF modeling attacks, despite
providing forward/backward security proofs with respect to X. Therefore, the practical value of the
proposal is rather questionable: the protocol cannot be instantiated due to the lack of an appropriate strong PUF.
Issues Related to NVM The proposal requires reprogrammable write-secure NVM, which is not free of issues. First, it undermines a main advantage of PUFs: low-cost manufacturing. Second, it enables
denial-of-service attacks. An attacker can update X one or more times, invalidating the CRP database of
the server. The proposal does not describe an authentication mechanism for the reconfiguration.
3.5. Reverse Fuzzy Extractor
The reverse FE proposal [21] provides mutual authentication, in contrast to previous work. The term
‘reverse’ highlights that Gen and not Rep is implemented as token hardware, as shown in Figure 1(h). As
such, one benefits from the lightweight potential of Gen. Figure 6 represents the protocol⁴. One author proposed a modified version in [13], with a reversal of authentication checks stated to be the main difference.
PUF Non-Determinism Non-determinism is essential to avoid replay attacks. For tokens, one reuses the PUF for this purpose, hereby avoiding the need for a TRNG. A lower bound for the PUF non-determinism is imposed, in order to avoid server impersonation. However, this seems to be a very
delicate balancing exercise in practice, opposing more conventional design efforts to stabilize PUFs. The need for environmental robustness (perturbations of outside temperature, supply voltage, etc.) magnifies this opposition. An attacker might immerse the IC in an extremely stable environment, hereby minimizing the PUF error rate. For genuine use however, one might have to provide correctness despite large perturbations, corresponding to the maximum error rate. The modified protocol proposal [13] does not discuss the foregoing contradiction, although its use of a token TRNG provides a resolution.
⁴ The use of a token identifier (public, could be stored in insecure off-chip NVM) is omitted for simplicity, as it seems to have no impact on security. The server has been maintained as protocol initiator.
Figure 6: Reverse FE protocol (token identifier omitted).
PRNG Exploitation The proof-of-concept implementation employs a 64-bit arbiter PUF in combination with a PRNG to expand its response space: Ṙ ← PUF(LFSR(C)). The concatenation of 255 arbiter response bits forms a single response R. An [n=255, k=21, t=55] BCH code has been employed, allowing for an efficient implementation of Gen. One generates 7 responses (and helper sets) from a single C, aiming at a security level of 7k > 128 bits.
The proposal does not provide an explicit warning to refuse fixed points of the LFSR, e.g. C = 0. This would have been appropriate however, in order to avoid a trivial server impersonation attack. A fixed point will result in either Ṙ = 0 or Ṙ = 1 (all-zeros or all-ones), assuming stability for the one particular response bit that is repeated. An attacker can guess this bit with probability 1/2. Another issue, which is far less obvious and hence more serious, is described next.
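The fixed-point issue can be verified with a toy LFSR (illustrative 8-bit parameters, since the proposal does not specify the feedback polynomial):

```python
def lfsr_step(state: int, taps: int = 0b1011_0100, width: int = 8) -> int:
    """One step of a Fibonacci LFSR (toy parameters; the proposal's
    feedback polynomial is unspecified)."""
    fb = bin(state & taps).count("1") & 1
    return ((state << 1) | fb) & ((1 << width) - 1)

# C = 0 is a fixed point: every expanded challenge equals 0, so a stable
# PUF evaluates the very same challenge n times and R is a repetition of
# one response bit, guessable with probability 1/2.
state, seen = 0, []
for _ in range(4):
    state = lfsr_step(state)
    seen.append(state)
assert seen == [0, 0, 0, 0]
```

A simple blacklist of the all-zero seed (and any other fixed points of the chosen polynomial) would close this particular gap.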
The proposal raises the concern of repeated helper data exposure. An attacker might collect helper data
{P1, P2, …} for the same challenge C, trying to retrieve R. Therefore, one recommends the syndrome construction, as there is provably no additional leakage with respect to the traditional n-k entropy loss⁵.
However, the following threat has been overlooked: helper data exposure for correlated CRPs, as
enabled by the PRNG.
We first consider the (unrealistic but desired) case of a perfectly deterministic PUF. Helper data exposure can be understood via an underdetermined system of linear equations H·Rᵀ = Pᵀ, having n - k = 234 equations for n = 255 unknowns. Challenge C1 provided by the server is employed as the LFSR initial value, resulting in n = 255 challenges after expansion, leading to a response R = (r1, r2, …, r255). One could easily construct a challenge C2 leading to a response R = (r2, r3, …, r256). Subsequently, both systems of equations can be merged: there are 2(n-k) = 468 equations for n + 1 = 256 unknowns, which seems to be solvable at first sight.
⁵ Apart from security, one does not mention that the code-offset construction seems less practical: it requires a random codeword.
However, equation dependencies have to be considered. Therefore, we distinguish two types of block
codes. First, the case of a non-cyclic code. Our experiments indicate that there are n independent
equations for n + 1 unknowns, leading to a successful attack. Helper data leakage cannot distinguish
between R and not(R), since ri + rj = not(ri) + not(rj). Second, the case of a cyclic code (e.g. BCH). Our
experiments indicate that there are n - k + 1 independent equations for n + 1 unknowns, which does not
benefit an attacker. However, an alternative method does circumvent this limitation. Arbiter PUFs can
be modeled with only a few thousand CRPs, using machine learning techniques. By repeatedly merging
helper data, one can obtain a system of n - k + q independent equations for n + q unknowns, given an
arbitrary q ≥ 1. We choose e.g. q = 10000, including both training and verification CRPs. Subsequently,
machine learning is performed in a brute-force manner, guessing the k = 21 unknowns, which seems to
be feasible. The correct guess would result in the observable event of a high modeling accuracy.
We now consider a PUF which is not perfectly deterministic. To minimize this inconvenience, an attacker
could ensure the IC's environment to be very stable. Code parameters, which are normally chosen in
order to maintain robustness against a variety of environmental perturbations (supply voltage, outside
temperature, etc.), become relatively relaxed then. Subsequently, we consider a transformation H* = T·H, which selects linear combinations of the rows of H. The transformation is chosen so that the rows of H* have a low Hamming weight. Many algorithms could find a suitable transformation. Transforming H to reduced row echelon form, possibly after permuting its columns, might be a good start, for instance.
Suboptimal Even with d = 1, the protocol still allows for an unlimited number of authentications. Token
impersonation via replay has already been prevented by the use of nonce N. This observation could
result in numerous protocol simplifications, as has been acknowledged in the proposing paper.
However, one does not state clearly that there would be a simplification for the PUF as well: the enormous input-output space is no longer required; even a weak PUF could be employed. Despite all the former,
one still promotes the use of d > 1, arguing that it offers an increased side channel resistance. We argue
that this countermeasure might not outweigh the missed advantages and that there might be numerous
more rewarding countermeasures. The modified protocol proposal [13] does not discuss the foregoing
matter, although it uses d = 1, hereby providing a weak PUF with implicit challenge as an example.
3.6. Slender PUF
The slender PUF proposal [14] includes three countermeasures against modeling, while avoiding the
need for cryptographic primitives, as clear from Figure 1(i). First, one employs a strong PUF with a high
resistance, which is modeled during enrollment via auxiliary one-time interfaces. One does propose the
XOR arbiter PUF (see appendix A.3). Second, the exposure of R is limited to random substrings S, hereby
obfuscating the CRP link. The corresponding procedure SubCirc treats the bits in a circular manner.
Third, server and token both contribute to the challenge via their respective nonces CS and CT,
counteracting chosen-challenge attacks. Figure 7 represents the protocol.
Figure 7: Slender PUF protocol.
PRNG Exploitation The proof-of-concept implementation employs a PRNG to expand the PUF response space: Ṙ ← PUF(PRNG(CS, CT)). However, the employed construction might allow for token impersonation: PRNG(CS, CT) = LFSR(CS) + LFSR(CT). We assume an identical feedback polynomial for both LFSRs, which is the most intuitive assumption⁶. A malicious token might then return CT ← CS, resulting in an expanded challenge C' = 0. The server's PUF model outputs either Ṙ = 0 or Ṙ = 1. So provided a substring S = 0, authentication succeeds with probability 1/2. Via replay, one could increase the probability to 1. We require eavesdropping on a single genuine protocol execution: CS(1), CT(1) and S(1). The malicious prover gets authenticated with the following: CT(2) ← CS(2) + CS(1) + CT(1) and S(2) ← S(1). The expanded challenges are equal.
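The exploited linearity is easy to verify with a toy LFSR (illustrative 8-bit parameters; the actual design uses 128-bit LFSRs with unspecified polynomials):

```python
import random

def lfsr_stream(seed: int, n: int, taps: int = 0b1011_0100,
                width: int = 8):
    """First n output bits of a Fibonacci LFSR (toy parameters)."""
    out, s = [], seed
    for _ in range(n):
        fb = bin(s & taps).count("1") & 1
        s = ((s << 1) | fb) & ((1 << width) - 1)
        out.append(s & 1)
    return out

random.seed(0)
cs, ct = random.randrange(256), random.randrange(256)
xor = [a ^ b for a, b in zip(lfsr_stream(cs, 32), lfsr_stream(ct, 32))]
assert xor == lfsr_stream(cs ^ ct, 32)   # the LFSR is linear over GF(2)
assert lfsr_stream(0, 32) == [0] * 32    # C_T = C_S gives expanded challenge 0
```

Because the state update and the output tap are both linear over GF(2), XORing two streams with identical polynomials equals the stream of the XORed seeds, which is precisely what the malicious token exploits.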
Exacting PUF Requirements The PUF requirements are rather exacting and partly opposing. On the one hand, the PUF should be easy-to-model, requiring a highly correlated structure. On the other hand, CRP correlations do enable statistical attacks, due to the lack of cryptographic primitives. Although the XOR arbiter PUF seems to offer this delicate balance, it comes at a price: several one-time interfaces⁷, high response non-determinism and a limited modeling accuracy. Furthermore, the user's control regarding the challenge list should be highly restricted: statistical attacks are very powerful if an attacker has a (partial) chosen-challenge advantage. The PRNG-TRNG construction should accomplish this goal.
Statistical attacks exploit the knowledge of a function P(ru = rv) = f(Cu, Cv) for a certain strong PUF architecture. CRPs with |P(ru = rv) - ½| > 0 are correlated and hence exploitable. One might be able to retrieve index j as such⁸, circumventing the SubCirc countermeasure. This subsequently enables machine
learning attacks, as the CRP link has been restored. The proposal plots f for the arbiter PUF and its XOR
variant based on simulations. However, we believe the simulation results of the XOR variant to be
erroneous, and provide a correction in appendix A.4. Although the correlations have been
underestimated, the larger XOR variants might still offer security.
⁶ The proof-of-concept implementation employs 128-bit LFSRs, with CS and CT as the initial states, without specifying feedback polynomials. Furthermore, FPGA implementation results (Table III) strongly suggest the use of identical feedback polynomials.
⁷ Chains are modeled separately via machine learning, before XORing takes place.
⁸ An identical attack could result in full key recovery for the protocol predecessor: the pattern matching key generator [17], which serves as an alternative for the fuzzy extractor (reference II). This risk has not been acknowledged in either paper. We note that another statistical attack of a fundamentally different nature has recently been published for the predecessor [2].
3.7. Converse Authentication
Figures 1(j) and 8 represent the converse authentication protocol [11]. The authentication is one-way, in
a less conventional setting where tokens verify the server. The difference vector (XOR) between two
responses is denoted as ∆. The restriction ∆ ≠ 0 prevents an attacker from impersonating the server,
choosing {Ci = Cj, Pi = Pj}. Optionally, one can extend the protocol with the establishment of a session key
Hash(Ri || Rj ). The attacker capabilities are restricted: invasive analysis of the prover IC is assumed to be
impossible. Furthermore, one assumes an eavesdropping attacker, trying to impersonate the server
given a log of genuine authentication transcripts.
Figure 8: Converse authentication protocol.
Attacker Overly Restricted The attacker capabilities are unclear and too restricted to be practical. First, a restriction on invasive attacks would presumably extend to physical attacks in general, hereby severely reducing the need for PUFs. Second, it is not clear whether an attacker is allowed to freely query the server, as exploited hereafter. We argue that this should be possible, as the protocol is initiated by the token.
Scalability Issues The probability of success for a randomly guessing attacker would be 1/2ⁿ, when trying to impersonate the server. In a certain sense, the server faces a similar probability, posing an upper limit on n for practicality reasons. Successful authentication of the server relies on the availability of a given ∆
within its database. A scalable method to (construct and) search within a database was not discussed
and seems far from obvious. Assume a cumbersome trial-and-error procedure as a search method: a pairwise selection has a probability of 1/2ⁿ to be usable. Memory requirements are then mild compared to the search workload, as there are d(d-1)/2 pairwise selections, although still enormous in comparison to all other proposals. A database of size d = 2²⁵ is mentioned to be realistic. A major threat for server
impersonation is the following. An attacker might query the server and construct a personal database:
<∆i, Ci1, Pi1, Ci2, Pi2>. Authentication will succeed, although a session key cannot be retrieved if present.
Predecessor Issues Although the protocol is in essence identical to a direct predecessor [1], it is not
described in terms of modifications. Nevertheless, a few issues have (quietly) been resolved. We
observe five modifications. First, Rep is acknowledged to require helper data. Second, one introduces
the restriction ∆ ≠ 0. However, this event occurs so seldom that an explicit check could be regarded as
overhead. Third, ∆ is generated by a TRNG instead of a PRNG. To avoid replay, the latter construction
would require secure NVM, hereby undermining the advantages of PUFs. Fourth, the protocol is
initiated by a token instead of the server. Unfortunately, this enables exhaustion of the server database
as described before. Fifth, PUF responses are reinforced by a cryptographic hash function. In its absence,
we see a direct opportunity for prover impersonation in the occasional case that HW(E) < t. Consider an
arbitrary response R = Rep(PUF(C),P). For both secure sketch constructions, an attacker might be able to
produce a given ∆:
Code-offset construction: R + ∆ = Rep(PUF(C), P + ∆)
Syndrome construction: R + ∆ = Rep(PUF(C), P + H·∆ᵀ)
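For the code-offset case, this can be illustrated with a toy [3,1,1] repetition code (our own example, not from the proposal): shifting the helper data by a low-weight ∆ shifts the reconstructed response accordingly, provided HW(E ⊕ ∆) ≤ t.

```python
def majority(block):
    return int(sum(block) >= 2)

def rep(R_noisy, P):
    """Code-offset Rep with a toy [3,1,1] repetition code (per block)."""
    out = []
    for rb, pb in zip(R_noisy, P):
        w = majority([r ^ p for r, p in zip(rb, pb)])
        out.append([w ^ p for p in pb])
    return out

# Enrollment: P = R XOR W for a codeword W (here W = 111, 000).
R = [[0, 1, 0], [1, 1, 0]]
W = [[1, 1, 1], [0, 0, 0]]
P = [[r ^ w for r, w in zip(rb, wb)] for rb, wb in zip(R, W)]

# Attacker shifts the helper data by a low-weight Delta (one flip per block):
Delta = [[1, 0, 0], [1, 0, 0]]
P_mal = [[p ^ d for p, d in zip(pb, db)] for pb, db in zip(P, Delta)]

# With HW(E) = 0 < t, the token reconstructs R XOR Delta instead of R.
R_shift = [[r ^ d for r, d in zip(rb, db)] for rb, db in zip(R, Delta)]
assert rep(R, P_mal) == R_shift
```

The hash reinforcement in the converse protocol removes this degree of freedom, since the attacker can no longer predict Hash(R + ∆) from transcripts involving R.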
4. Overview and Discussion
Tables 1 and 2 provide an overview of Section 3. We do not support the use of any strong PUF proposal in its current form, given the numerous issues. However, we do not object to the use of the two weak PUF methods: reference II and the modified reverse FE proposal [13]. We now discuss the strong PUF issues.
Two proposals rely on secure reprogrammable NVM: Ozturk et al. and logically reconfigurable PUFs. The
need for reprogramming undermines a major advantage which PUFs might offer: low-cost
manufacturing. The secrecy of the NVM state in the former proposal undermines a second advantage as
well: an improved resistance against physical attacks. Furthermore, the hardware footprint of the latter
proposal is not expected to improve with respect to reference I. Note that reference I requires one-time
programmable NVM only: e.g. fuses can be used. Finally, updating the NVM state was identified as a
denial-of-service attack for both proposals.
PUF responses are not perfectly reproducible. Protocols employ two approaches to overcome this issue:
error correction and error tolerance. Unfortunately, two proposals struggle with the error tolerance
approach. PUF non-determinism is greatly underestimated in Ozturk et al.: the protocol synchronization
effort is presented too optimistically. We believe the proposal of Hammouri et al. to be non-functional
because of internal error propagation.
In practice, strong PUFs do have a small output space. There are various methods to resolve this issue,
all imposing a certain efficiency burden. However, protocol proposals pay very little attention to this topic, even though system security is not necessarily unaffected. We specified attacks on the proof-of-concept implementations of the reverse FE and slender PUF proposals, both exploiting the challenge expansion PRNG.
Strong PUFs are insecure against modeling, in practice. Two proposals are therefore too demanding:
logically reconfigurable PUFs and the slender PUF protocol. Although the latter offers some
countermeasures, there is an opposing requirement for the PUF to be easy-to-model, leading to a
delicate balancing exercise. In general, proposals not reinforced by a cryptographic hash function are
much more likely to be vulnerable.
Several proposals rely on a secure TRNG. Tampering with its randomness opens new perspectives for a physical attack. Its use seems unavoidable, however, if server authentication is a must (as assumed for the converse protocol), in order to avoid replay attacks. The reverse FE proposal extracts its non-determinism from the PUF instead, which has been identified as a delicate balancing exercise. Two protocols without server authentication employ their TRNG as a modeling countermeasure: Hammouri et al. and the slender PUF protocol.
All proposals aim to provide lightweight entity authentication, which addresses a highly relevant need.
However, in many use cases, there will be accompanying security needs: message confidentiality and
integrity, privacy, etc. We did not consider protocol extensibility in this work, although it might be of
interest when designing a new protocol. References I and II, which employ a secret key, might benefit
from a huge amount of scientific literature. Similarly, the establishment of a session key has been proposed as an extension of the converse protocol.
5. Conclusion
Various protocols utilize a strong PUF to provide lightweight entity authentication. We described all such
proposals using a unified notation, thereby creating a first overview and initiating direct comparison as well. We defined two reference authentication methods to identify the misuse of PUFs. Our analysis
revealed numerous security and practicality issues. Therefore, we do not recommend the use of any
strong PUF proposal in its current form. More fundamental research is required, aiming to create a truly
strong PUF with great cryptographic properties. If not, secure lightweight authentication might become
a mission impossible. The observations and lessons learned in this work will facilitate future protocol
design.
Acknowledgment
The authors greatly appreciate the support received from: the European Commission through the ICT programme under contract FP7-ICT-2011-317930 HINT; the Research Council of KU Leuven through GOA TENSE (GOA/11/007); the Flemish Government through FWO G.0550.12N and the Hercules Foundation AKUL/11/19; and the national major development program for fundamental research of China (973 Plan) under grant no. 2013CB338004. Jeroen Delvaux is funded by IWT-Flanders grant no. 121552. The authors would like to thank Anthony Van Herrewege (KU Leuven, ESAT/COSIC) for his valuable comments.
References
1. Das, A., Kocabas, U., Sadeghi, A.-R., Verbauwhede, I.: PUF-based secure test wrapper design for
cryptographic SoC testing. In: Design, Automation & Test in Europe (DATE), IEEE, pp. 866-869, Mar.
2012.
2. Delvaux, J., Verbauwhede, I.: Attacking PUF-Based Pattern Matching Key Generators via Helper Data
Manipulation. In: CT-RSA 2014.
3. Dodis, Y., Ostrovsky, R., Reyzin, L., Smith, A.: Fuzzy Extractors: How to Generate Strong Keys from
Biometrics and Other Noisy Data. SIAM J. Comput., vol. 38, no. 1, pp. 97-139, Mar. 2008.
4. Gassend, B., Clarke, D.E., van Dijk, M., Devadas, S.: Silicon physical random functions. In: ACM
Conference on Computer and Communications Security, pp. 148-160, Nov. 2002.
5. Gassend, B., Clarke, D.E., van Dijk, M., Devadas, S.: Controlled Physical Random Functions. In: Annual
Computer Security Applications Conference (ACSAC), pp. 149-160, Dec. 2002.
6. Gassend, B., van Dijk, M., Clarke, D.E., Torlak, E., Devadas, S., Tuyls, P.: Controlled physical random
functions and applications. In: ACM Trans. Inf. Syst. Secur., vol. 10, no. 4, 2008.
7. Hammouri, G., Ozturk, E., Sunar, B.: A tamper-proof and lightweight authentication scheme. In: Pervasive and Mobile Computing, vol. 6, no. 4, 2008.
8. Holcomb, D.E., Burleson, W.P., Fu, K.: Power-Up SRAM State as an Identifying Fingerprint and Source
of True Random Numbers. In: IEEE Trans. Computers, vol. 58, no. 9, 2009.
9. Hospodar, G., Maes, R., Verbauwhede, I.: Machine Learning Attacks on 65nm Arbiter PUFs: Accurate
Modeling poses strict Bounds on Usability. In: Workshop on Information Forensics and Security (WIFS),
IEEE 2012, pp. 37-42, Dec. 2012.
10. Katzenbeisser, S., Kocabas, U., van der Leest, V., Sadeghi, A.-R., Schrijen, G.-J., Schröder, H.,
Wachsmann, C.: Recyclable PUFs: Logically Reconfigurable PUFs. In: Cryptographic Hardware and
Embedded Systems (CHES), pp. 374-389, Sep 2011.
11. Kocabas, U., Peter, A., Katzenbeisser, S., Sadeghi, A.-R.: Converse PUF-Based Authentication. In: Trust
and Trustworthy Computing (TRUST), LNCS, vol. 7344, pp. 142-158, Jun. 2012.
12. Lee, J.W., Lim, D., Gassend, B., Suh, G.E., van Dijk, M., Devadas, S.: A technique to build a secret key
in integrated circuits for identification and authentication applications. In: VLSI Circuits, 2004
Symposium on, pp. 176-179, Jun. 2004.
13. Maes, R.: Physically Unclonable Functions: Constructions, Properties and Applications. PhD Thesis, KU
Leuven, Chapter 5, 2012.
14. Majzoobi, M., Rostami, M., Koushanfar, F., Wallach, D.S., Devadas, S.: Slender PUF Protocol: A
Lightweight, Robust, and Secure Authentication by Substring Matching. In: IEEE Symposium on Security
and Privacy (SP), pp. 33-44, May 2012.
15. Ozturk, E., Hammouri, G., Sunar, B.: Towards Robust Low Cost Authentication for Pervasive Devices.
In: IEEE Conference on Pervasive Computing and Communications (PerCom), Mar. 2008.
16. Pappu, R.: Physical One-Way Functions. PhD Thesis, MIT, Chapter 9, 2001.
17. Paral, Z., Devadas, S.: Reliable and Efficient PUF-Based Key Generation Using Pattern Matching. In:
Hardware-Oriented Security and Trust (HOST), 2011 IEEE International Symposium on, pp. 128-133, Jun.
2011.
18. Rührmair, U., Sehnke, F., Sölter, J., Dror, G., Devadas, S., Schmidhuber, J.: Modeling attacks on
physical unclonable functions. In: Computer and Communications Security (CCS), 2010 ACM conference
on, pp. 237-249, Oct. 2010.
19. Skorobogatov, S.: Semi-invasive attacks - a new approach to hardware security analysis. Technical
Report, UCAM-CL-TR-630, University of Cambridge, Computer Laboratory, Apr. 2005.
20. Suh, G.E., Devadas, S.: Physical unclonable functions for device authentication and secret key
generation. In: Design Automation Conference (DAC), pp. 9-14, Jun. 2007.
21. Van Herrewege, A., Katzenbeisser, S., Maes, R., Peeters, R., Sadeghi, A.-R., Verbauwhede, I.,
Wachsmann, C.: Reverse Fuzzy Extractors: Enabling Lightweight Mutual Authentication for PUF-Enabled
RFIDs. In: Financial Cryptography and Data Security (FC), LNCS, vol. 7397, pp. 374-389, Feb. 2012.
Appendix
A. Arbiter PUF
A.1 Architecture
Arbiter PUFs [12] quantify manufacturing variability via the propagation delay of logic gates and
interconnect. The high-level functionality is represented by Figure 9(a). A rising edge propagates through
two paths with identically designed delays, as imposed by layout symmetry. Because of nanoscale
manufacturing variations, however, there is a delay difference ∆t between both paths. An arbiter decides which path 'wins' the race (∆t > 0) and generates a response bit r.
Figure 9: Arbiter PUF (a) and its XOR variant (b).
The two paths are constructed from a series of m switching elements. The latter are typically
implemented with a pair of 2-to-1 multiplexers. Challenge bits determine for each stage whether path
segments are crossed or uncrossed. Each stage has a unique contribution to ∆t, depending on its
challenge bit. Challenge vector C determines the time difference ∆t and hence the response bit r. The
number of CRPs equals 2^m. The response reproducibility differs per CRP: the smaller |∆t|, the more easily the response bit flips because of various physical perturbations.
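The behaviour described above can be sketched in a few lines. The per-stage delay values below are assumed toy numbers drawn from a Gaussian, not a circuit-accurate characterisation; the sketch only reproduces the sign-flip-and-accumulate mechanics of the switching stages.

```python
# Minimal behavioural sketch of an m-stage arbiter PUF.
import random

random.seed(1)
M = 64
# Stage contribution to the delay difference for the uncrossed (c = 0)
# and crossed (c = 1) configuration (assumed manufacturing variation):
dt0 = [random.gauss(0.0, 1.0) for _ in range(M)]
dt1 = [random.gauss(0.0, 1.0) for _ in range(M)]

def arbiter_puf(challenge, noise_sigma=0.0):
    """Evaluate the response bit r for an M-bit challenge."""
    delta_t = 0.0
    for i, c in enumerate(challenge):
        if c:                       # crossed stage: paths swap, sign flips
            delta_t = -delta_t + dt1[i]
        else:                       # uncrossed stage
            delta_t = delta_t + dt0[i]
    if noise_sigma:                 # physical perturbations at evaluation
        delta_t += random.gauss(0.0, noise_sigma)
    return 1 if delta_t > 0 else 0  # the arbiter decides the race

chal = [random.randrange(2) for _ in range(M)]
print("response:", arbiter_puf(chal))
```

With noise_sigma > 0, challenges whose noise-free |∆t| is small flip their response bit most often, matching the reproducibility remark above.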
A.2. Vulnerability to Modeling Attacks
Arbiter PUFs show additive linear behavior, which makes them vulnerable to modeling attacks. A
decomposition in individual delay elements is given in Figure 10. Both intra- and inter- switch delays
contribute to ∆t, as represented by white and gray/black squares respectively. We incorporate the latter
in their preceding switches, without loss of generality, to facilitate the derivation of a delay model.
Figure 10: Arbiter PUF decomposed in individual delay elements, represented by small squares and prone to manufacturing
variability. The interconnecting lines have zero delay.
In the end, delay elements are important as far as they generate delay differences between both paths.
Therefore, each stage can be described by two delay parameters only: one for each challenge bit state,
as illustrated in Figure 11. The delay difference at the input of stage i flips in sign for the crossed
configuration and is incremented with δt^1_i or δt^0_i for the crossed and uncrossed configuration respectively.
Figure 11: Arbiter stage.
The impact of a δt on ∆t is incremental or decremental for an even and odd number of subsequent
crossed stages respectively. By lumping together the δt's of neighboring stages, one can model the
whole arbiter PUF with m + 1 independent parameters only (and not 2m). A formal expression for ∆t is
as follows [18]:
∆t = τ^T γ, with γ(i) = (1 - 2C(i)) · (1 - 2C(i+1)) · … · (1 - 2C(m)) for i ≤ m, and γ(m+1) = 1.
Vector γ is a transformation of challenge vector C. Vector τ contains the lumped stage delays. The more
linear a system, the easier it is to learn its behavior [9]. By using γ instead of C as the ML input, a great deal of non-linearity is avoided. The non-linear threshold operation ∆t > 0 remains, however. Only 5000 CRPs
were demonstrated to be sufficient to model non-simulated 64-stage arbiter PUFs with an accuracy of
about 97% [9].
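As a minimal illustration of such an attack, the following sketch simulates an additive-delay arbiter PUF, maps challenges to the γ features, and learns the lumped delay vector τ with a plain perceptron. All parameters (stage count, CRP budget, epochs) are our own choices for a quick demonstration, not those of [9] or [18].

```python
# Sketch of a modeling attack on a simulated arbiter PUF: transform
# challenges into the gamma parity features, then learn the lumped
# delays with a perceptron (linearly separable data).
import numpy as np

rng = np.random.default_rng(0)
M = 32
tau = rng.normal(0, 1, M + 1)           # secret lumped stage delays

def gamma(C):                            # gamma_i = prod_{j>=i}(1 - 2c_j)
    signs = 1 - 2 * C                    # challenge bits -> +-1
    g = np.cumprod(signs[:, ::-1], axis=1)[:, ::-1]
    return np.hstack([g, np.ones((C.shape[0], 1))])

def respond(C):                          # response r = (Delta t > 0)
    return (gamma(C) @ tau > 0).astype(int)

# Attacker collects CRPs and trains a linear separator on gamma:
C_train = rng.integers(0, 2, (3000, M))
X, y = gamma(C_train), 2 * respond(C_train) - 1   # labels in {-1, +1}

w = np.zeros(M + 1)
for _ in range(100):                     # perceptron epochs
    for xi, yi in zip(X, y):
        if yi * (w @ xi) <= 0:
            w += yi * xi

C_test = rng.integers(0, 2, (2000, M))
acc = np.mean((gamma(C_test) @ w > 0).astype(int) == respond(C_test))
print(f"model accuracy: {acc:.3f}")
```

The perceptron succeeds precisely because the γ transformation removes the challenge-dependent non-linearity, leaving only the threshold.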
A.3. XOR Variant
Several variants of the arbiter PUF increase the resistance against ML. They introduce various forms of
non-linearity for this purpose. We only consider the XOR variant. The response bits of multiple arbiter
chains are XORed to a single response bit, as shown in Figure 9(b). All chains have the same challenge as
input. The more chains, the more resistance against ML: the required number of CRPs and the
computation time both increase rapidly [18]. However, the reproducibility of r decreases with the
number of chains as well: each additional chain injects non-determinism into the overall system. A
practical limit on the ML resistance is hence imposed.
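The noise accumulation admits a simple closed form: if each chain's response bit flips independently with probability p, the XORed bit flips whenever an odd number of inputs flip. The short sketch below (a standard piling-up-style computation, with p = 0.05 as an assumed per-chain error rate) shows the flip probability approaching the pure-noise value 1/2 as chains are added.

```python
# XORing k independently noisy response bits: the XORed bit flips with
# probability (1 - (1 - 2p)**k) / 2, i.e. P(odd number of input flips).
def xor_flip_prob(p, k):
    return (1 - (1 - 2 * p) ** k) / 2

for k in (1, 2, 4, 8):
    print(k, xor_flip_prob(0.05, k))
```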
A.4. CRP Correlation Function: Enabling Statistical Attacks
Machine learning attacks exploit CRP correlations in an implicit manner. Statistical attacks benefit from
their explicit exploitation, thereby assuming knowledge of a function P(r1 = r2) = f(C1, C2). Such a function
has already been determined for the arbiter PUF and its XOR variant in [14] via simulations. However,
we believe the simulation results for the XOR variant to be erroneous. Therefore, we repeat the
simulation experiment and we derive an analytical model as well, as shown in Figure 12. Let N(μ, σ) denote a normal distribution with mean μ and standard deviation σ. Let φ(x, σ) and Φ(x, σ) denote the probability density function and cumulative distribution function respectively, assuming μ = 0. We assume all δt's to have a distribution N(0, σ1), which is a common and well-validated practice in previous work. Although not fully correct, we then assume the elements of τ to have a distribution N(0, 2σ1). We introduce the variable h = HD(γ1, γ2).
Figure 12: Correlations for the arbiter PUF and its XOR variant. Dots represent simulation results. The mathematical model,
drawn continuously although of discrete nature, matches reasonably well. Vertical dashed lines enclose 99% of the data for
randomly chosen challenges. The more chains being XORed, the better one approximates the ideal behavior f = 1/2, but the
larger the response non-determinism.
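For the single-chain case, the shape of Figure 12 can be cross-checked numerically. Under the Gaussian assumptions above, (∆t1, ∆t2) are jointly Gaussian with correlation ρ = (n - 2h)/n for n = m + 1 lumped delays, giving P(r1 = r2) = 1/2 + arcsin(ρ)/π by the Gaussian orthant probability. The sketch below compares this closed form against a Monte-Carlo simulation; sample sizes and n are our own choices.

```python
# Monte-Carlo check of the single-arbiter CRP correlation as a function
# of h = HD(gamma1, gamma2).
import math
import numpy as np

rng = np.random.default_rng(7)
n = 65                                    # m + 1 lumped delays, m = 64

def p_equal_model(h):                     # closed form via arcsin law
    rho = (n - 2 * h) / n
    return 0.5 + math.asin(rho) / math.pi

def p_equal_mc(h, trials=20000):          # Monte-Carlo estimate
    tau = rng.normal(0, 1, (trials, n))   # fresh delay vectors
    g1 = np.ones(n)
    g2 = g1.copy()
    g2[:h] *= -1                          # flip h feature signs
    r1 = tau @ g1 > 0
    r2 = tau @ g2 > 0
    return np.mean(r1 == r2)

for h in (0, 8, 32, 65):
    print(h, round(p_equal_model(h), 3), round(p_equal_mc(h), 3))
```

Only the inner product of γ1 and γ2 matters, so fixing γ1 to the all-ones vector is without loss of generality.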
B. Hammouri et al.
Hammouri et al. [7] employ two strong PUFs, as shown in Figure 1(f). Both PUFs are modeled during
enrollment. The outer PUF is an arbiter PUF. The inner PUF is a custom architecture based on the arbiter
PUF. Lin compensates the non-linearity of the outer arbiter PUF, apart from the final thresholding step:
∆t is a linear function of Lin(X). Its form is relatively simple, as shown below. Figure 13 represents the
authentication protocol.
Figure 13: Authentication protocol of Hammouri et al.
B.1. Inner Architecture: Bizarre Architecture, Prone to Modeling Deficiencies
Architecture The inner PUF is a custom architecture based on the arbiter PUF. One proposes a rather
bizarre extension of the challenge space. Out of two individual chains, one aims to construct a single
reconfigurable chain. For each stage, one out of two switching elements is selected, thereby introducing a second challenge S. The proposal omits a description of the reconfiguration logic; Figure 14 (right) provides a generic schematic, including both intra- and inter-switch delays. Via reconfiguration,
one aims to provide a large response space. One evaluates CI for a fixed list of configuration vectors {S1, S2, …}, thereby concatenating the response bits. The configuration vectors are generated by a
PRNG, having S1 as initial state. Note that one could have provided a large response space with a regular
arbiter PUF as well, given the use of a PRNG.
Figure 14: Reconfigurable stage of the inner PUF.
Prone to Modeling Deficiencies. The architecture is required to be easy-to-model: the server has to
construct a model during enrollment. The overall structure is additive, as for a regular arbiter PUF, and
therefore we expect modeling to be feasible. Although one could have derived a generic (but
complicated) delay model, possibly leading to an efficient modeling method, the authors do propose a
shortcut. The overall delay model is supposed to be separable in terms of the individual chains. One constructs a model τU for the upper chain, applying S = 0. Similarly, one obtains a model τL for the lower chain, applying S = not(0). The overall model τ(S) selects elements from both vectors: τ(i) = S(i)·τL(i) + not(S(i))·τU(i). The variability of intra-stage delays is neglected, justified by placing the stages far apart in the circuit layout. An approximating delay model for the upper/lower chain is derived as well, as
shown in Figure 14 (left).
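The separability shortcut amounts to an element-wise selection between the two single-chain models. A minimal sketch, assuming (as the proposal does) that intra-stage variability is negligible; the delay values are illustrative:

```python
# The overall model tau(S) is assembled element-wise from the two
# single-chain models: tau(i) = S(i)*tau_L(i) + not(S(i))*tau_U(i).
import numpy as np

rng = np.random.default_rng(3)
m = 16
tau_U = rng.normal(0, 1, m)      # model of the upper chain (S = 0)
tau_L = rng.normal(0, 1, m)      # model of the lower chain (S = not(0))

def tau_of(S):                   # select per stage according to S
    return np.where(S == 1, tau_L, tau_U)

S = rng.integers(0, 2, m)
tau = tau_of(S)
assert np.all(tau[S == 1] == tau_L[S == 1])
assert np.all(tau[S == 0] == tau_U[S == 0])
```

Any intra-stage variability, or variability of the interconnect after the selection logic, breaks this element-wise composition, which is exactly the concern raised below.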
Although it might be possible to make all of the former workable (there is no proof-of-concept implementation), the approach is strongly layout dependent and prone to modeling deficiencies, apart from being area-consuming: positioning stages far apart might not be sufficient to justify separability. Intra- and inter-stage variations originate from CMOS transistors and metal interconnect respectively. We highly doubt that the former is per se negligible with respect to the latter. It is possible
though to enhance separability: upsizing the transistors of the switches, inserting minimum-sized inverter
chains in between switches, etc. Furthermore, one should distinguish the metal interconnect before and
after the selection logic, as shown in Figure 14 (right). Variability of the latter would undermine the
separability. We stress that all these complications could have been avoided easily, e.g. by using a
regular arbiter PUF.
B.2. Bizarre Modeling Procedure: Contradictory and Overcomplicated
During enrollment, the server constructs a model for both the inner and outer PUF. Reading out X, the
output of the inner PUF, via a one-time interface would have made things easy: it would allow both PUFs to be modeled separately. However, one designed a rather complicated procedure to model the inner PUF via the output of the outer PUF. Therefore, one introduces a function Lin to linearize the outer PUF. We note that this is rather contradictory: it degrades the overall modeling resistance. Fixing CO = 0, one obtains a system r ← PUFO(Lin(X)). Error propagation from X to r is then very limited. An error in bit X(i)
would flip the sign of γ(i) only, corresponding to h = 1 in Figure 12. During enrollment, one can force the
PRNG to maintain either 0 or not(0) as its state, allowing the upper and lower chains to be modeled separately.
This requires some sort of destructive interface, similar to our proposal to read out X directly.
Depending on CI, X will be either 0 or not(0), apart from potential noisiness, thereby maximizing h. As a consequence, one can distinguish either case, as r is very likely to flip. This enables modeling of the inner PUF. Some samples might be problematic though: the pair X = 0/not(0) occasionally results in ∆t ≈ 0, leaving r in a noisy state.
B.3 Non-functional: Error Propagation
We believe the proposal to be non-functional because of internal error propagation. The minimization
of h does not hold for the system r ← PUFO(Lin(X') + CO), in the general case that CO ≠ 0. A single error in X will flip the sign of r with a probability close to 1/2. This could have been avoided by implementing the system r ← PUFO(Lin(X' + CO)) instead.
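The error-propagation claim can be checked with a Monte-Carlo experiment on the additive arbiter-PUF delay model, used here as an assumed stand-in for the actual outer PUF. A perturbation that flips a single γ sign (the compensated CO = 0 case, h = 1) rarely flips r, whereas a single challenge-bit error flips a whole γ prefix and flips r with probability close to 1/2.

```python
# Monte-Carlo illustration of internal error propagation in the
# additive arbiter-PUF delay model.
import numpy as np

rng = np.random.default_rng(5)
m, trials = 64, 5000

def gamma(C):                    # gamma_i = prod_{j>=i}(1 - 2c_j), plus 1
    return np.append(np.cumprod((1 - 2 * C)[::-1])[::-1], 1.0)

flips_h1 = 0
flips_chal = 0
for _ in range(trials):
    tau = rng.normal(0, 1, m + 1)
    C = rng.integers(0, 2, m)
    g = gamma(C)
    r = g @ tau > 0
    # (i) flip a single gamma sign (h = 1, the C_O = 0 case):
    g1 = g.copy()
    g1[rng.integers(m)] *= -1
    flips_h1 += (g1 @ tau > 0) != r
    # (ii) flip a single challenge bit (flips a whole gamma prefix):
    C2 = C.copy()
    C2[rng.integers(m)] ^= 1
    flips_chal += (gamma(C2) @ tau > 0) != r

print("P(flip), h = 1:          ", flips_h1 / trials)
print("P(flip), challenge error:", flips_chal / trials)
```

The first estimate stays well below 1/2 (it corresponds to h = 1 in Figure 12), while the second averages to roughly 1/2, matching the non-functionality argument above.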