Fine-Tuning XoP Security: Ideal Worlds & Permutation Outputs

Fine-Tuning Ideal Worlds for the Xor of Two Permutation Outputs Wonseok Choi1 , Minki Hhan2 , Yu Wei1 , and Vassilis Zikas1 1 Purdue University, West Lafayette, IN, USA {wonseok, yuwei}@purdue.edu, {vzikas}@cs.purdue.edu 2 KIAS, Seoul, Korea {minkihhan}@kias.re.kr Abstract. Security proofs of symmetric-key primitives typically consider an idealized world with access to a (uniformly) random function. The starting point of our work is the observation that such an ideal world leads to underestimating the actual security of certain primitives. As a demonstrating example, XoP2, which relies on two independent random permutations, is proven to exhibit far superior concrete security compared to XoP, which employs a single permutation with domain separation. But the main reason for this is an artifact of the idealized model used in the proof, in particular, that (in the random-function-ideal world) XoP might hit a trivially bad event (outputting 0) which does not occur in the real/domain-separated world. Motivated by this, we put forth the analysis of such primitives in an updated ideal world, which we call the fine-tuned setting, where the above artifact is eliminated. We provide fine-tuned (and enhanced) security analyses for XoP and XoP-based MACs: nEHtM and DbHtS. Our analyses demonstrate that the security of XoP-based and XoP2-based constructions are, in fact, far more similar than what was previously proven. Concretely, for the number of users u and the maximum number of queries per user qm , we show that the multi-user “fine-tuned” security bound of XoP can be proven as O u0.5 qm 2 /22n via the Squared-ratio method proposed by Chen et al. [CRYPTO’23], resulted to the same security bound of XoP2 proven there. We also show the compatibility of the fine-tuned model with the Chi-squared method proposed by Dai et al. [CRYPTO’17], and show that XoP and XoP2 enjoy the same security bound in the fine-tuned setting regardless of proving tools. Finally, we turn to the security analysis of MACs in the multi-user setting, where the effect of transitioning the proofs to the fine-tuned setting is even higher. Concretely, we are able to prove unexpected improvements in the security bounds for both nEHtM and DbHtS. Our security proofs rely on a fine-tuned and extended version of Mirror theory for both lower and upper bounds, which yields more versatile and improved security proofs. Of independent interest, this extension allows us to prove the multi-user MAC security of nEHtM in the nonce-misuse model, while the previous analysis only applied to the multi-user PRF security in the nonce-respecting model. As a side note, we also point out (and fix) a flaw in the original analysis of Chen et al.. Table of Contents Fine-Tuning Ideal Worlds for the Xor of Two Permutation Outputs . . . . . . Wonseok Choi1 , Minki Hhan2 , Yu Wei1 , and Vassilis Zikas1 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1 Xor of Two Permutation Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 MAC constructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.3 Exaggerated Assumption in MAC Security Notion . . . . . . . . . . . . . 1.4 Our Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.5 The security bound of nEHtM2 in [11] . . . . . . . . . . . . . . . . . . . . . . . . 1.6 Version Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1 The Chi-Squared Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 The Squared-Ratio Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3 Patarin’s H-Coefficient Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.4 Useful Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Fine-Tuning Security Notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Fine-Tuning Extended Mirror Theory with Upper Bounds . . . . . . . . . . . 4.1 Proof of Mirror Theory - Lower Bound for ξmax > 2 . . . . . . . . . . . 4.2 Proof of Mirror Theory - Lower Bound for ξmax = 2 . . . . . . . . . . . 4.3 Proof of Mirror Theory - Upper Bound for ξmax > 2 . . . . . . . . . . . 4.4 Proof of Mirror Theory - Upper Bound for ξmax = 2 . . . . . . . . . . . 5 Multi-User Security of XoP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.1 Proof of Multi-User Security of XoP via the Chi-Squared Method 5.1.1 Proof of Lemma 22 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2 Proof of Multi-User Security of XoP via the Squared-Ratio Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 Multi-User Security of nEHtM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.1 Bad and Good Transcripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.2 Proof of Theorem 11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.3 Bad Transcript Analysis and Interpretations . . . . . . . . . . . . . . . . . . 6.4 Nonce-respecting Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.5 Using Stronger Hash and Proofs in [11] . . . . . . . . . . . . . . . . . . . . . . 7 Multi-User Security of DbHtS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.1 Proof of Theorem 12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.2 Proof of Theorem 13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 3 3 4 5 5 8 8 9 10 11 11 11 13 14 19 22 35 41 42 43 46 48 49 51 56 59 63 64 68 69 75 1 Introduction Block ciphers are often regarded as pseudorandom permutations (PRPs) in cryptography. This so-called standard model in symmetric cryptography assumes that distinguishing a secure block cipher from a random permutation is almost impossible until a certain number of encryption and decryption queries have been made, enabling a black-box methodology for block ciphers. On the other hand, many symmetric-key constructions, such as message authentication codes or authenticated encryptions, rely on pseudorandom functions (PRFs) as foundational building blocks to achieve beyond-birthday-bound security [2, 3, 7, 15]. However, substituting PRFs with PRPs in such constructions results in compromised security, leading to vulnerabilities concerning the birthday-bound [4, 5, 6, 9, 26, 27]. Because block ciphers take many advantages, e.g., AES-NI for AES, it is ideal for constructing other cryptographic primitives based on block ciphers. 1.1 Xor of Two Permutation Outputs Bellare, Krovetz, and Rogaway [5] and Hall et al. [26] pioneered the investigation of constructing beyond-birthday-bound secure PRFs from PRPs, which has since attracted considerable attention [5, 26, 35, 36, 31, 19, 8, 24, 14, 25, 11]. One of the most well-known such constructions is so-called the xor of two permutations. Given a n-bit (keyed) PRP P, XoP maps x ∈ {0, 1}n−1 to def XoP[P](x) = P(0 ∥ x) ⊕ P(1 ∥ x). Alternatively, given two n-bit (keyed) PRPs P and Q, their sum, denoted XoP2, maps x ∈ {0, 1}n to def XoP2[P, Q](x) = P(x) ⊕ Q(x). After the initial introduction of XoP construction [5, 26], several studies have built upon and enhanced this groundbreaking work [1, 18, 32, 34]. The most notable advancements include proofs by Dai, Hoang, and Tessaro [19] and Dutta, Nandi, and Saha [22], which established that XoP and XoP2 are secure up to O(2n ) queries. The two works use the chi-squared method and a verifiable version of mirror theory, respectively. However, their concrete security bounds are q different; The tight bound of XoP is 2n while the best known bound of XoP2 is O q2 22n where q is the number of queries made by an adversary. The difference can be more significant in the multi-user model. In the multiuser setting, Choi et al. [13] and Chen, Choi, and Lee [11] improved the multi-user security bound of XoP2. Their result implies that if there are O(2n ) number of XoP instances, i.e., O(2n ) users, only one query per instance suffices to break PRF security of XoP. On the other hand, XoP2 still enjoys beyond-birthdaybound in the same case. 3 1.2 MAC constructions Nonce-Enhanced Hash-Then-MAC. On the other side, Dutta, Nandi, and Talnikar [23] presented an efficient construction of a message authentication code (MAC) called nonce-enhanced hash-then-MAC (nEHtM), achieving the BBB security both as a PRF and a MAC. Furthermore, this construction provides graceful security degradation of nonce misuse and only uses a (two-call of) single-block cipher and a single-block hash function such as the polynomial hash, making it a preferable option. The original construction of nEHtM is of the form: def nEHtM[H, P](N, M ) = P(0 ∥ N ) ⊕ P(1 ∥ HKh (M ) ⊕ N ) for a permutation P and appropriate hash function H. The original paper proved the single-user security of nEHtM up to O(22n/3 ) MAC queries and O(2n ) verification queries when the number of faulty queries is sufficiently small. Choi et al. [16] later improved this upto O(23n/4 ) MAC queries and O(2n ) verification queries. More recently, a variant of nEHtM has been considered, defined as def nEHtM2[H, P, Q](N, M ) = P(N ) ⊕ Q(HKh (M ) ⊕ N ) using two permutations, which we refer to nEHtM2. This was first considered by Chen, Mennink, and Preneel [12], showing the single-user PRF security of this variant up to O(23n/4 ) queries. Chen, Choi, and Lee [11] proved that nEHtM2 achieves stronger PRF security in the multi-user setting than the original nEHtM. In particular, they showed the BBB PRF security of nEHtM2 for the number of users is about 2n/2 , which was impossible for the original nEHtM because of the uq/2n term in the advantage bound. The (improved) MAC security of nEHtM and its variant in the multi-use setting is, on the other hand, left as an open problem. Double-block Hash-then-Sum. Double-block Hash-then-Sum (DbHtS) paradigm was proposed by Datta et al. [20]. DbHtS has two versions: the two-key version based on XoP1 and the three-key version based on XoP2 where the number of keys implies the total number of keys used in a 2n-bit output hash function and an n-bit block cipher. In this paper, we will focus on a variant of the twokey version of DbHtS, defined as follows: Let H = (H1 , H2 ) : {0, 1}2k × M → {0, 1}n−1 × {0, 1}n−1 be a (2n − 2)-bit hash function. H can be decomposed into two (n − 1)-bit hash functions H1 , H2 : Kh × M → {0, 1}n−1 , and thus have HKh (M ) = (H1Kh (M ), H2Kh (M )) where Kh = (Kh1 , Kh2 ) ∈ {0, 1}k × {0, 1}k . 1 2 Let E : {0, 1}k × {0, 1}n → {0, 1}n be a block cipher modeled as an ideal cipher. We define (a variant of) the DbHtS constructions as follows def DbHtS[H, E](Kh , K, M ) = EK (0∥H1Kh,1 (M )) ⊕ EK (1∥H2Kh,2 (M )). Note here we drop the least significant bit of the last round output of E deployed by H1 , H2 so that both H1 and H2 outputs are n − 1 bits, which is differentiated 4 from previous works. We assume H1 , H2 are δ1 -regular and δ2 -almost universal for (n − 1)-bit outputs. There are several works improved security analysis of DbHtS [29, 37, 21]; Notably, two-keyed DbHtS are proven to be tightly secure with a multi-user ℓq 4/3 in the ideal cipher model when ℓ is the maximum length of bound O 2n messages, a 2ℓn -universal (and regular) hash is used and discarding primitive queries by Datta et al. [21]. Those works on two-keyed DbHtS paradigm [37, 21] did not focus on domain separation and used the same block cipher call for each hash output; however, they required an additional hash property, namely cross-collision resistant, which can be realized by introducing domain separation. 1.3 Exaggerated Assumption in MAC Security Notion Message authentication codes are widely accepted symmetric-key constructions. To ensure the security of MACs, we often compare those constructions with uniformly random functions. PRF security itself is not directly related to MAC security, but the indistinguishability from the uniform random functions is often used as an intermediate step for the security proof. In turn, the uniform random functions are somehow considered as the ideal world. Albeit the ideal worlds are often identified as random functions or include random functions as a part of their interface, we notice that the random functions for MAC security do not need to be chosen uniformly randomly from the set of all possible functions: It does not rely on the all-zero output, denoted by 0, in the ideal worlds—they were introduced only to make analyses easier. Indeed, we can expect improvement by modifying the ideal worlds from the proof of [19]. While they introduced fine-tuned ideal worlds as intermediate worlds, their security bounds could not surpass O 2qn due to the presence of a trivial bad event that outputs 0 in the vanilla ideal world. This observation can lead to immediate improvement of security bounds of XoP-based constructions such as nEHtM [23, 16, 11] by the rid of the overestimated assumption. From this observation, we also newly introduce a variant of DbHtS [20, 29, 37, 21], which uses one block cipher key and domain separation. 1.4 Our Contribution Our main contribution is, as stated above, observing the unnecessary loss in the previous security analyses and providing improved analyses in various models by introducing fine-tuning ideal worlds for those constructions: XoP, nEHtM, and (a variant of) DbHtS. We denote u as the number of users and qm as the maximum number of queries per user allowed to an adversary. We use the standard model for XoP and nEHtM and the ideal cipher model for DbHtS to show that our observation can be applied regardless of the choice of models and the proof strategies. In the ideal cipher model, p stands for the number of primitive queries allowed to an adversary. 5 Security of XoP. We show that the “fine-tuned” multi-user PRF security bound of XoP from the random ideal world without outputting zero can be 0.5 1.5 qm 1. O u 21.5n via the Chi-squared method [19] where the same security bound for XoP2 was proven in Choi et al. [13] at ASIACRYPT’22; 0.5 2 qm via the Squared-ratio method [11] where the same security 2. O u 22n bound for XoP2 was proven in the same paper at CRYPTO’23. Note that just checking if there is an output 0 of the oracle breaks the standard m PRF security for q ≥ 2n , making no hope for better than O uq security. Our n 2 result for XoP demonstrates this barrier is entirely due to the output 0. Security of nEHtM. We revisit the multi-user security of the original nEHtM in the multi-use setting. We prove that nEHtM enjoys strong MAC multi-user security similar to the multi-user PRF result in [11] while using less key size with graceful security degradation under nonce misuse, resolving the open question posed in [11]. When the number of users u = O(2n/2 ) and each user makes the faulty queries much less than 2n/4 times, then our result indicates that nEHtM is BBB secure MAC. This was believed to be impossible, at least through the standard ideal world—with outputting zero. we prove that the multiConcretely, 1/2 4 uqm user MAC security bound of nEHtM is O as long as the number of 23n faulty and verification queries is sufficiently small qm is large enough. The and 2 uqm [16]. A similar security previous best bound in a similar setting was O 21.5n of nEHtM as PRF without outputting zero is also proven. Along the way, we figure out that the multi-user PRF security of nEHtM2 in [11] is buggy (see Sections 1.5 and 6.5), resulting in a slightly worse bound than they claimed; for example, the claimed birthday bound security for u ≈ 2n is false. Despite this, we develop and fine-tune the relevant extended mirror theory without outputting zero and the security proof of nEHtM, resulting in the even better bound than one for nEHtM2 in [11] in some sense. For example, their security bound does not work for qm ≈ 23n/4 . We refer to Figure 2 for the graphical comparison. We also study variants of nEHtM and nEHtM2 based on a stronger hash function. This variant (almost) recovers the security multi-used PRF claim for nEHtM2 in [11] if we use theiroriginal proof, and the even better MAC security 4 u0.5 qm bound of nEHtM including O if we exploit the improved strategies and 23n mirror theory in this paper. This exhibits the power of our fine-tuning and indicates that the current obstacles to better and cleaner security are from the hash functions, either its property itself or its current analysis. Security of DbHtS. At last, we explore the multi-user MAC security of DbHtS. Our main targets are the variants of [21, 37] using the domain separation. We focus on the security bound that is fine-tuned in terms of the query bound qm for each user, instead of the total number of queries q across all users. In the worst case, q = uqm holds. Our results can be summarized as follows. 6 – Under the ideal cipher model as in [37], we analyze the security of DbHtS based on a dedicated analysis regarding qm along with the idea of finetuning but mainly following the original approach. This leads to a better security bound than one by the so-called generic reduction and also achieves an improvement over the original result in the same setting, except for the domain separation. – For [21], if we focus on qm , we observe that naïvely following the original 4/3 proof cannot avoid uqm /2n , which is even worse than the trivial bad probn ability of uqm /2 . Inspired by the case of nEHtM, we show that an improved bound can be achieved assuming stronger underlying hash functions. A pictorial comparison is shown in Figure 1. The effect of fine-tuning also appears in the low end, for example, when qm ≲ 2n/3 still allows the 1/2n security bound in both cases when the other parameters are sufficiently small, which was impossible in the previous bounds. log2 qm log2 qm n Ours q = uqmax n 3n 4 2n 3 n 2 n 3 n 2 log2 u 0 n 2 3n 4 log2 u 0 n 2 n 2n 3 n Fig. 1: Comparison of the security bounds (in terms of the threshold number of queries per user) as functions of log2 u. The solid line represents our bounds, and the dash-dotted line represents the previous bound where q = uqmax . The left figure compares our Theorem 13 with Theorem 1 from [21]. The right figure compares our Theorem 12 with Theorem 1 from [37], where we set ϵ3 , ϵ4 = O(2−2n ) according to [37, p. 19]. We set p = 0, k = n, δ = 2−n , and neglect l and the logarithmic term of n in all graphs. We remark that the use of strong hash functions for better security is reminiscent of the recent advances in the cascaded LRW2 security. Mennink [33] presented an attack on Cascaded LRW2 [30], or CLRW2, and showed that matching security bound, under several assumptions, including a stronger property of hash functions about the multiple collisions, very similar to ours. Subsequently, Jha and Nandi [28] demonstrated how to eliminate those assumptions and developed a couple of new tools for dealing with the multiple collisions of the hash functions, apparently inspired by the result of Mennink. In turn, these tools are 7 frequently used in the later works [11, 16, 12] as well as this work. We hope our security bounds with strong hashes highlight the specific point in the security proofs that future works resolve. 1.5 The security bound of nEHtM2 in [11] We briefly sketch the problems in the original multi-user nEHtM2 security proof in [11], confirmed by private communications with the authors. We stress that the main problems are from the security proof of nEHtM itself, not from their main tool, the Squared-ratio method and Mirror theory. The main issue is the behavior of the property of hash functions H and qc , the number of edges in the components of size > 2 in the graph representation of queries. In the nonce-respecting setting, qc increases when Xi = Xj happens for Xi = HKh (Mi ) ⊕ Ni for the message Mi and nonce Ni . Since H is δ-almost XOR universal, we can only predict the property of single event Xi = Xj that is paraphrased by HKh (Mi ) ⊕ HKh (Mj ) = Ni ⊕ Nj . This suffices for estimating 2 the expectation of qc . However, we need to compute the expectation 2 of qc and give the bound on qc with a high probability. Computing Ex qc is involved with multiple collisions, i.e., the event that Xi = Xj and Xk = Xℓ simultaneously happen. [11] implicitly assumes that two collisions happen independently, giving a nice upper bound of Ex qc2 (as in Fact 6 derived using stronger hash functions). However, what we can actually give is a worse bound as in Fact 5. They again use their false estimation when computing the probability for some bad event (bad5 in their proof). mu−prf Another minor issue is a missing term at the √ end of the proof. In AdvnEHtM unLqmax δ at the second line. bound in [11, page 28], there is a term about 2n However, there is no corresponding term in the final bound. A similar term √ 2 unLqmax δ 2 δ ≤ 1, is alive, which is smaller than the problematic term when qmax n 2 This missing part affects some exponent of the final security bound. We show that using a stronger hash function allows us to recover a similar result as the original. We refer to Section 6.5 for a more detailed analysis. 1.6 Version Notes Some parts of this paper have been revised from the original version to improve the presentations and some technical parts. We summarize the differences below. The Eurocrypt submission version. This is the original version. The Eprint (2023-Oct.) version. – We make the consistency between the presentations in each section, and correct and improve many presentations. – We add Lemmas 4 and 6 as separated lemmas. – Proofs of Mirror theory are revised. – We apply Lemma 6 in proving Theorem 10 for a cleaner presentation, though it has a slightly worse constant. 8 – In Section 6, we give a colorful transition of equations for easier verification. We 22n also made some changes at 1) the event bad6 by qc ≥ 186q 2 and the conditions m 2n n 2 2 of L1 , L2 (originally it was qc ≥ min 186q ), 2) the analysis 2 , m 3(5L1 +5L2 +2µm ) of bad3c based on ¬bad2a (originally it follows the analysis of [16, bad2e ]), and 3) the choice of L2 , which now considers two cases. These changes improve the final bound, allowing the security makes sense for u ≲ 22n , while the previous 5n boundonly works for u ≲ 2 3 . We also slightly improve the computations 2 of Ex ϵ1 (τ ) at the beginning of Section 6.2, and fill the sanity check for ξmax qm ≪ 2n which was not explicitly written in the original version. – We modify Figure 2; we adjust the improvement in this version for our result, and correct an error of the graph for [11] in the previous version. We also add the graphs when assuming δ-AXU(2) . – We provide a rigorous proof of Theorem 13. In the proof of Theorem 13, we replace the misused mirror theory lower bound ([21, Lemma 1]) with our fine-tuning extended mirror theory lower bound (Theorem 6). The correction doesn’t affect the dominant term in the statement. 2 Preliminaries Notation. Throughout this paper, we fix positive integers n and u to denote the block size and the number of users, respectively. For a non-empty finite set X , we let X ∗ℓ denote a set {(x1 , . . . , xℓ ) ∈ X ℓ | xi ̸= xj for i ̸= j}. For an integer A and b, we denote (A)b = A(A − 1) . . . (A − b + 1). A notation x ←$ X means that x is chosen uniformly at random from X . |X | means the number of elements in X . The set of all permutations of {0, 1}n is simply denoted Perm(n). The set of all functions with domain {0, 1}n and codomain {0, 1}m is simply denoted by Func(n, m). We additionally define Func∗ (n, m) ⊂ Func(n, m) by the set of all functions in Func(n, m) satisfying the following condition: for any f ∈ Func∗ (n, m), f (x) ̸= 0 for all x ∈ {0, 1}n . For a keyed function F : K × X → Y with key space K, and non-empty sets X and Y, we will denote F (K, ·) by FK (·) for K ∈ K. When two sets X and Y are disjoint, their (disjoint) union is denoted X ⊔ Y. We write Tre and Tid as random variables following the distribution of the transcripts in the real world and the ideal world, respectively. For any positive integer i, and a1 , . . . , ai , b ∈ {0, 1}n , We denote def {a1 , . . . , ai } ⊕ b = {a1 ⊕ b, . . . , ai ⊕ b} Almost XOR Universal Hash Functions. Let δ > 0, and let H : Kh ×M → X be a keyed function for three non-empty sets Kh , M, and X . We say that H is δ-XOR almost universal (δ-XAU) if for any distinct M, M ′ ∈ M and X ∈ X , Pr [Kh ←$ Kh : HKh (M ) ⊕ HKh (M ′ ) = X] ≤ δ. Regular and Almost Universal Hash Functions. Let δ1 , δ2 > 0, and let H : Kh × M → X be a keyed function for three non-empty sets Kh , M, and X . 9 We say that H is δ1 -regular if for any M ∈ M and X ∈ X , Pr [Kh ←$ Kh : HKh (M ) = X] ≤ δ1 , and H is δ2 almost universal (δ2 -AU) if for any distinct M, M ′ ∈ M and X ∈ X , Pr [Kh ←$ Kh : HKh (M ) = HKh (M ′ )] ≤ δ2 . 2.1 The Chi-Squared Method We give here the necessary background on the chi-squared method [19]. We fix a set of random systems, a deterministic distinguisher A that makes q oracle queries to one of the random systems, and a set Ω that contains all possible answers for oracle queries to the random systems. For a random system S and i ∈ {1, . . . , q}, let ZS,i be the random variable over Ω that follows the distribution of the i-th answer obtained by A interacting with S. Let def ZiS = (ZS,1 , . . . , ZS,i ), and let def piS (z) = Pr ZiS = z for z ∈ Ω i . We omit i when i = q. For i ≤ q and z = (z1 , . . . , zi−1 ) ∈ Ω i−1 such i−1 that pi−1 =z S (z) > 0, the probability distribution of ZS,i conditioned on ZS z will be denoted pS,i (·), namely for z ∈ Ω, def pzS,i (z) = Pr ZS,i = z | Zi−1 =z . S For two random systems S0 and S1 , and for i < q and z = (z1 , . . . , zi−1 ) ∈ i−1 2 z z Ω i−1 such that pi−1 S0 (z), pS1 (z) > 0, the χ -divergence for pS0 ,i (·) and pS1 ,i (·) is defined as follows. 2 X def pzS1 ,i (z) − pzS0 ,i (z) 2 z z . χ pS1 ,i (·), pS0 ,i (·) = pzS0 ,i (z) z∈Ω such that pzS ,i (z)>0 0 We will simply write χ2 (z) = χ2 pzS1 ,i (·), pzS0 ,i (·) when the random systems are clear from the context. If the support of pi−1 S1 (·) is contained in the support 2 z z of pi−1 (·), then we can view χ p (·), p (·) as a random variable, denoted S1 ,i S0 ,i S0 i−1 χ2 Zi−1 , where z follows the distribution of Z S1 S1 . Then A’s distinguishing advantage is upper bounded by the total variation distance of pS0 (·) and pS1 (·), denoted ∥pS0 (·)−pS1 (·)∥, and we have the following theorem. Theorem 1 ([19]). Suppose whenever pS11 (·) > 0 then pS01 (·) > 0. Then we have !1 q 2 i−1 2 1X ∥pS0 (·) − pS1 (·)∥ ≤ Ex χ ZS1 . (1) 2 i=1 10 2.2 The Squared-Ratio Method This method was first introduced in [11]. For multi-user security, we assume a random system S = (S 1 , . . . , S u ) and i ∈ {1, . . . , u}. Note that S01 is the ideal world and S11 is the real world for the first user, independent of the other user’s oracle. Theorem 2 ([11]). Suppose whenever pS11 (·) > 0 then pS01 (·) > 0. Let Ω = Γgood ⊔ Γbad . If a function ϵ1 (z) and a constant ϵ2 holds the following constraints pS11 (z) pS01 (z) − 1 ≤ ϵ1 (z) for all attainable z ∈ Γgood and h i Pr ZS01 ∈ Γbad ≤ ϵ2 , one has ∥pS1 (·) − pS0 (·)∥ ≤ p 2uEx [ϵ1 (z)2 ] + 2uϵ2 where the expectation is taken over the distribution of ZS01 . Note that the expectation in the squared-ratio method is the distribution from the ideal world, while the expectation in the Chi-squared method is taken over the distribution from the real world. 2.3 Patarin’s H-Coefficient Technique The well-known Patarin’s H-coefficient technique can be expressed as below: Suppose whenever pS11 (·) > 0 then pS01 (·) > 0. Let Ω = Lemma 1 ([10]). pS 1 (z) Γgood ⊔ Γbad . Let ϵ1 , ϵ2 ≥ 0 be two constants. If p 11 (z) ≥ 1 − ϵ1 holds for all S0 h i attainable z ∈ Γgood and Pr ZS01 ∈ Γbad ≤ ϵ2 . Then, it holds that ∥pS11 (·) − pS01 (·)∥ ≤ ϵ1 + ϵ2 , where the expectation is taken over the distribution of ZS01 . 2.4 Useful Inequalities We use the following inequalities multiple times in the proof. n Y (1 − xi ) ≥ 1 − i=1 n X xi if 0 ≤ xi ≤ 1 for all i (2) i=1 n X i=1 ik ≥ nk+1 for k ≥ 1 k+1 11 (3) Lemma 2 (Markov’s inequality). Let X be a non-negative random variable and a > 0. It holds that Pr[X ≥ a] ≤ Ex [X] /a. Lemma 3 (Chebyshev’s inequality). Let X be a random variable and t > 0. It holds that Var[X] . Pr[X ≥ Ex [X] + t] ≤ t2 Lemma 4 (Bonferroni’s inequality). For events A1 , ..., An , it holds that n X X Pr[Ai ] − i=1 Pr[Ai ∧ Aj ] ≤ Pr [∨ni=1 Ai ] ≤ n X Pr[Ai ]. i=1 1≤i<j≤n The upper bound is usually called the union bound. Lemma 5. Let M, q be positive integers. If (λ1 , ..., λq ) ∈ [M ]q are uniformly randomly distributed, then the number of collisions C = {(i, j) ∈ [q]2 : (i < j) ∧ (λi = λj )} satisfies the following inequalities hold for any t > 0: q2 q2 q2 q2 Ex [C] ≤ , Var[C] ≤ , Pr C ≥ +t ≤ . 2M 2M 2M 2M t2 Furthermore, if q 2 < 2M , it also holds that Ex C 2 ≤ q 2 /M, and Ex C 2 ≤ q 4 /2M 2 otherwise. Proof. Let PIi,j equal 1 if λi = λj , and 0 otherwise. It holds that Ex [Ii,j ] = 1/M and C = i<j Ii,j , and Ex [C] = X Ex [Ii,j ] = i<j≤q q2 q(q − 1) ≤ . 2M 2M For the variance, it holds that " Var[C] = Var[ X Ii,j ] = Ex XX i<j k<ℓ Ex Ii,j i<j i<j = X Ii,j − 1 M 1 − M Ik,ℓ − 1 M !2 # = XX i<j k<ℓ 1 Ex Ii,j Ik,ℓ − 2 M We with consider two cases as follows: 1) (i, j) = (k, ℓ), then Ex [Ii,j Ik,ℓ ] = 1/M q 2 possible choices, and 2) |{i, j}∩{k, ℓ}| ≤ 1, then Ex [I I ] = 1/M anyway i,j k,ℓ 2 and the relevant term becomes zero. Overall, it holds that q 1 1 q2 Var[C] = − 2 ≤ , 2 M M 2M 2 and finally Ex C 2 = Var[C] + Ex [C] gives the final statement. ⊔ ⊓ 12 Lemma 6. For X1 , ..., Xk , it holds that v !2 v u k u k k q u X u X X t Xi k · Xi 2 . ≤ tk · Xi2 ≤ i=1 i=1 i=1 In particular, it holds that v  u " k # !2  v u k k q u u X X X u t 2  tEx  Xi ≤ kEx [Xi2 ], Xi ≤ kEx i=1 i=1 i=1 The first inequality is due to Cauchy-Schwartz inequality, and the second is obvious. Although this is usually not tight (but only loss a constant factor for constant k), we use it in the squared-ratio method for deriving a simple upper bound of Ex ϵ1 (τ )2 . 3 Fine-Tuning Security Notions We define a new fine-tuned multi-user pseudorandom function security from the function domain Func∗ (m, n) instead of Func(m, n). Pseudorandom Functions without 0n . Let C : K × {0, 1}n → {0, 1}m be a keyed function with key space K. We will consider an (information-theoretic) distinguisher A that makes oracle queries to CKi for multiple keys Ki for i ∈ [u] and returns a single bit. The advantage of A in breaking the mu-prf ∗ security of C, i.e., in distinguishing C(K1 , ·), . . . , C(Ku , ·) where K1 , . . . , Ku ←$ K from uniformly chosen functions F1 , . . . , Fu ←$ Func∗ (n, m), is defined as h i -prf ∗ (A) = Pr K , . . . , K ← K : ACK1 (·),...,CKu (·) = 1 Advmu 1 u $ C h i − Pr F1 , . . . , Fu ←$ Func∗ (n, m) : AF1 (·),...,Fu (·) = 1 . -prf ∗ (u, q , t) as the maximum of Advmu-prf ∗ (A) over all the disWe define Advmu m C C tinguishers against C for u users making at most qm queries to each user and running in time at most t. When we consider information-theoretic security, we will drop the parameter t. When the global primitive—usually the ideal cipher—is given to an adversary, we take into account the number of queries to the primitive oracle made by -prf ∗ (u, q , p) as the maximum of Advmu-prf ∗ (D) the adversary. We define Advmu m C C over all the distinguishers against C making at most qm construction queries to each of u users and p primitive queries in total. Nonce-based MACs. Given four non-empty sets K, N , M, and T , a noncebased keyed function with key space K, nonce space N , message space M and tag space T is a function F : K × N × M → T . Stated otherwise, it is a keyed 13 function whose domain is a cartesian product N × M. We will sometimes write FK (N, M ) to denote F (K, N, M ). For K ∈ K, let AuthK be the MAC oracle which takes as input a pair (N, M ) ∈ N × M and returns FK (N, M ), and let VerK be the verification oracle which takes as input a triple (N, M, T ) ∈ N × M × T and returns ⊤ (“accept”) if FK (N, M ) = T , and ⊥ (“reject”) otherwise. We assume that an adversary makes queries to the two oracles AuthK and VerK for a secret key K ∈ K. Assuming that, without loss of generality, an adversary never makes the verification query that it received from the MAC query, we say that an adversary forges if its queries to the oracle VerK returns ⊤ for some K. A MAC query (N, M ) made by an adversary is called a faulty query if the adversary has already queried the MAC oracle with the same nonce but with a different message; we sometimes call both the faulty query and the corresponding previous query with the same nonce by a query with a repeated nonce. In the multi-user setting, a (u, µm , qm , vm , t)-adversary against the noncebased MAC security of F is an algorithm A with oracle access to AuthKi and VerKi for i ∈ [i], making at most qm MAC queries, at most µm faulty queries, and at most vm verification queries to AuthKi and VerKi for each i ∈ [i], and runs in time at most t. The multi-user MAC advantage of F against (u, µm , qm , vm , t)-mac (u, µ , q , v , t), is defined by adversary, denoted by Advmu m m m F max Pr K1 , ..., Ku ←$ K : AAuthK1 ,...,AuthKu ,VerK1 ,...,VerKu forges A where max is taken over all (u, µm , qm , vm , t)-adversary A against the noncebased MAC security of F. We occasionally drop the parameter t when we focus on information-theoretic security. When µm = 0, we say that A is nonce-respecting. In this work, we prove the MAC security of F by comparing it with the ideal world of MAC. That is, we consider the ideal world oracles Rand∗i and Reji for i ∈ [u], where Rand∗i returns an independent random value except 0n , instantiated by a truly random function from Func∗ (m, n), and Reji always returns ⊥ for -mac (u, µ , q , v , t) is bounded above by every verification query. Then, Advmu m m m F max Pr K1 , ..., Ku ←$ K : DAuthK1 ,...,AuthKu ,VerK1 ,...,VerKu = 1 D h i ∗ ∗ − Pr DRand1 ,...,Randu ,Rej1 ,...,Reju = 1 where max is taken over all (u, µm , qm , vm , t)-adversary. This is easily proven by using forgery adversary A to distinguish the two worlds. In turn, we mainly focus on showing the above indistinguishability for MAC security. 4 Fine-Tuning Extended Mirror Theory with Upper Bounds Definitions and Notations. We write N = 2n for simplicity. Let r, q, p be fixed nonnegative integers such that r ≤ 2(p + q). The sets P = {P1 , ..., Pr } 14 of unknown variables Pi ∈ {0, 1}n for i ∈ [r], where Pi ̸= Pi′ for i ̸= i′ . We consider two types of relations between variables, equations and non-equations. The system of equations is represented by a sequence of constants (λ1 , ..., λq ) ∈ ({0, 1}n )q along with indices γ1 , . . . , γq , γ1′ , . . . , γq′ ∈ [r] such that γi ̸= γi′′ for any i, i′ ∈ [r] and the equations  Pγ1 ⊕ Pγ1′ = λ1 ,     Pγ3 ⊕ Pγ ′ = λ2 , 2 Γ= : ..   .    ′ Pγ q ⊕ Pγ q = λ q hold. Similarly, a sequence of constants (µ1 , ..., µp ) ∈ ({0, 1}n )p and indices σ1 , . . . , σp , σ1′ , . . . , σp′ ∈ [r] determine the system of inequations Γ ̸=  Pσ1 ⊕ Pσ1′ ̸= µ1 ,     Pσ2 ⊕ Pσ′ ̸= µ2 , 2 : ..   .    Pσp ⊕ Pσp′ ̸= µp where σi ̸= σi′′ for any i, i′ ∈ [r] The overall system is denoted by Γ. When the variables in P are assigned by some values, we will identify the variables with the values assigned to them. Graph-theoretic interpretation. Two systems Γ = (Γ = , Γ ̸= ) give corresponding simple graphs G = = G(Γ = ) = (P, E = ) and G ̸= = G(Γ ̸= ) = (P, E ̸= ). The sets of edges are defined by E = = {(Pγi , Pγi′ ) : i ∈ [q]}, E ̸= = {(Pσi , Pσi′ ) : i ∈ [p]} . Each edge (P, P ′ ) ∈ E = is labeled by (=, λ) if P ⊕ P ′ = λ is included in Γ = and ⋆ (P, P ′ ) ∈ E ̸= is labeled by (̸=, µ) if P ⊕ P ′ ̸= µ. We sometimes write P − P ′ when an edge (P, P ′ ) is labeled with (=, ⋆), and define the label function λ by λ(P, P ′ ) = ⋆. We also define the function µ by µ(P, P ′ ) = ⋆ if (P, P ′ ) is labeled with (̸=, ⋆). Throughout this paper, we only consider the graph G = with no loops, i.e., that is acyclic. For the graph of equations G = , let L be a trail of ℓ-length λ1 λ2 λℓ L : V0 − V1 − · · · − Vℓ . We can naturally extend λ to the trails by defining def λ(L) = λ1 ⊕ λ2 ⊕ · · · ⊕ λℓ , def and we say that L is λ(L)-labeled. Since G = is acyclic, λ(V0 , Vℓ ) = λ(L) is well-defined. If V and V ′ are not connected, we define λ(V, V ′ ) = ⊥. 15 Recall that the equation graph G = is acyclic. Also, since the variables in P take the different values, G = must satisfy that λ(L) ̸= 0 for any trail L we say that the graph is non-degenerated if it satisfies this property. The union graph G = G(Γ ) = (V, E = ∪ E ̸= ) does not contain isolated vertices, i.e., every vertex has a positive degree. We decompose the set of vertices V of the graph G = into its connected components V = C1 ⊔ C2 ⊔ ... ⊔ Cα+β ⊔ D (4) for some α, β ≥ 0, where C1 , ..., Cα are the components of size greater than 2, and Cα+1 , ..., Cα+β denote the components of size 2. Finally, D = {D1 , ..., Ds } denotes the set of isolated vertices (that are connected by the edges in G ̸= ). For each component, we arbitrarily choose a representative Vi ∈ Ci . When we assign a value to Vi , each vertex W ∈ Ci is automatically assigned the value Vi ⊕ λ(Vi , W ) to satisfy the system of equations Γ = . With the representatives, def we define λi (W ) = λ(Vi , W ) for simplicity. Any assignment to the representatives (V1 , ..., Vα+β ) makes all equations in the system Γ = be satisfied. Still, the assignment may not satisfy one of the conditions that 1. the assignments to P are different, and 2. some non-equations from Γ ̸= . We also need to assign some values to the vertices in D. Below, we clarify when the assignment satisfies the conditions, which can be written in terms of the non-equations. Non-equations in the graph. Recall that P ∗2 denotes the set of pairs of ̸= different vertices included in the same set. We write Ei,j ⊂ Ci × Cj for i ̸= j 3 to denote the set of non-equations connecting vertices in Ci and Cj . We first consider that the assignments of P should be different. Fix arbitrary assignments of the representatives. Consider two vertices (W, W ′ ) ∈ P ∗2 such that W ∈ Ci and W ′ ∈ Cj . If i = j, W and W ′ take different values due to the non-degeneracy regardless of the assignments of the representatives. For i ̸= j, the condition W ̸= W ′ implies the non-equation Vi ⊕ λi (W ) ̸= Vj ⊕ λj (W ′ ). (5) Now we consider the edges in E ̸= with respect to Equation (4). Let V ⊕V ′ = ̸ µ ̸= be a non-equation in Γ ̸= for (V, V ′ ) ∈ Ei,j . For ν := µ ⊕ λi (V ) ⊕ λj (V ′ ), this non-equation can be written as Vi ̸= Vj ⊕ ν. 3 We assume that E ̸= does not contain an edge connecting two vertices in the same component, which trivially holds or induces a contradiction. 16 If there is (W, W ′ ) ∈ Ci ×Cj such that ν = λi (W )⊕λj (W ′ ) holds, then we say the non-equation V ⊕ V ′ ̸= µ is trivial, because it can be derived from Equation (5). ̸= give the same ν, we say that they are equivalent. Also, if two non-equations in Ei,j ̸= We assume that Γ does not include trivial non-equations or equivalent nonequation pairs. ̸= | be the number Let ci := |Ci | be the number of vertices in Ci and vi,j = |Ei,j of ̸=-labeled edges connecting a vertex in Ci and a vertex in Cj . We write Ni,j to denote the set of constants representing the non-equations between Vi and Vj for i ̸= j: {λi (W ) ⊕ λj (W ′ )}(W,W ′ )∈Ei,j ∪ {µ ⊕ λi (V ) ⊕ λj (V ′ )}(V,V ′ )∈Ti,j :[V ⊕V ′ ̸=µ]∈Γ ̸= , where the assignments of Vi and Vj must obey the condition Vi ∈ / Ni,j ⊕ Vj . Note that the size of Ni,j is computed by ci cj + vi,j because we assume that the graph does not have trivial or equivalent non-equations. Define a set Ni := ∪j<i Ni,j , We say that Γ (and G(Γ )) is nice if G = is a non-degenerated acyclic bipartite graph, and for any (λ, ̸=)-labeled edge between (P, Q), there is no λ-labeled trail between P and Q in G = . Counting the number of solutions. For the system Γ with its associated graph G = G(Γ ), we write the set of the solutions, or the valid assignments to {V1 , ..., Vα+β } ∪ D, of G by S(G), and denote the number of solutions by h(G) = |S(G)|. We use the following notations in the analysis. – For a set I ⊂ [α + β], SI denotes the set of partial assignments to {Vi }i∈I that satisfying all the conditions, or solutions, and hI := |SI | be the number of solutions for {Vi }i∈I . If I = [i] for some i ≤ α + β, we simply use Si and hi instead of SI and hI , respectively. – Recall that vi,j denotes the number of ̸=-labeled edges between Ci and Cj . Let vi be the number of ̸=-labeled edges connecting a vertex in Ci and Cj P for some j < i, so that vi = j<i vi,j . Let vj,I be the number of ̸=-labeled edges connecting Cj and Ci for some i ∈ I. For the set Ni,j of constants representing the non-equations between Vi , Vj , define Ni,j (Vj ) = Ni,j ⊕ Vj . – For a set I ⊂ [α + β], we write CI to denote the set of vertices ∪i∈I Ci . The number of vertices are denoted by ci = |Ci | and CI = |CI |. When I = [i], we simply write Ci instead of CI . Let ξmax := maxi {ci }. We also define the following sets for i ∈ [α + β]: def Ri = (V1 , V1′ , V2 , V2′ ) ∈ Ci ∗2 × Cj ∗2 j < i and λ(V1 , V1′ ) = λ(V2 , V2′ ) . (6) Theorem 3 (Mirror Theory for ξmax > 2). Let G ge a nice graph, let q denote the number of edges of G, and qc denote the number of edges of C1 ⊔· · ·⊔Cα . When q ≤ 4ξN and 0 < qc ≤ q, it holds that max   18v + 2 h(G)(N − 1)q  − 1 ≤ exp  (N )|V|  α+β P |Ri | + 2 i=1 α P i=1 N 17 c2i 31qc q 2 + 2qc2 + α P i=1 N2  c2i +  20q 4   − 1. 3 N  In particular, if 18v + 2 α+β P |Ri | + 2 i=1 α P i=1 31qc q 2 + 2qc2 c2i i=1 + N α P N2 c2i + 20q 4 ≤1 N3 we have q h(G)(N − 1) −1 ≤ (N )|V| 36v + 4 α+β P |Ri | + 4 i=1 α P i=1 c2i 62qc q 2 + 4qc2 i=1 + N α P c2i N2 + 40q 4 . N3 This theorem combines Theorem 5 and Theorem 7 — the lower and upper bounds of h(G)(N − 1)q (N )|V| where the statements and proofs are in Sections 4.1 and 4.3 and the final statement is from ex ≤ 1 + 2x for x ≤ 1. Below, we give a Mirror theory for equations systems with all component sizes 2. Theorem 4 is used in a multi-user security proof of XoP. N Theorem 4 (Mirror Theory with ξmax = 2). Let q ≤ 13 and qc = 0. Then, it holds Pq 3 i=1 |Ri | 3q 2 h(G)(N − 1)q 10(n + 1)2 − 1 ≤ exp + 2 + − 1. (N )Cα+β N N N Further, if 3 Pq i=1 |Ri | N + 3q 2 N2 + 10(n+1)2 N 6 h(G)(N − 1)q −1 ≤ (N )Cα+β Pq i=1 N < 1, it holds |Ri | + 6q 2 20(n + 1)2 + . 2 N N Proof. By Theorem 8 (see Section 4.4 for details), we have Pq 3 i=1 |Ri | 147q 3 h(G)(N − 1)q 10(n + 1)2 − 1 ≤ exp + + − 1, (N )Cα+β N N3 N and by Theorem 6 (deferred to the end of this section), we have 1− 128q 3 N3 h(G)(N − 1)q 2q 2 128q 3 8(n + 1)3 ≤ 2 + + 3 (N )Cα+β N N 3N 2 2 2q 128q 3 8(n + 1)3 ≤ exp + + − 1. N2 N3 3N 2 147q 3 N3 2q 2 N2 , 3 2 and 8(n+1) ≤ 10(n+1) , we conclude with 3N 2 N P q 3 i=1 |Ri | 3q 2 h(G)(N − 1)q 10(n + 1)2 − 1 ≤ exp + 2 + − 1. (N )Cα+β N N N Since ≤ ≤ The last statement can be proved using the fact exp (X)−1 ≤ 2X for X < 1. ⊔ ⊓ 18 Theorems 5 and 6 are Mirror theory lower bounds for equations systems with all component sizes being 2, and with all component sizes larger or equal to 2, separately. They are used in the multi-user security proof of DbHtS discussed in Section 7. Specifically, Theorem 6 is used in the proof of Theorem 12 and Theorem 5 is used in the proof of Theorem 13. Theorem 5 (Lower Bound Mirror Theory for ξmax > 2). Assume that 8q ≤ N . It holds that P 9qc2 1≤i≤α c2i 31qc q 2 16q 4 18v h(G)(N − 1)q ≥1− − − − . 2 2 (N )|V| 8N N N3 N The proof of this theorem is deferred to Section 4.1. Theorem 6 (Lower Bound Mirror Theory for ξmax = 2). Let q ≤ qc = 0. Then, it holds that N 13 and h(G)(N − 1)q 2q 2 128q 3 8(n + 1)3 ≥1− 2 − − . (N )Cα+β N N3 3N 2 The proof of this theorem is deferred to Section 4.2. 4.1 Proof of Mirror Theory - Lower Bound for ξmax > 2 Below, we describe the proof of Theorem 5, which is a Mirror theory lower bound for equations systems with all component sizes larger or equal to 2. Many parts of the proof are adapted from [16] while we modified some parts for our purpose. The following simple bounds of hI∪{j} in terms of hI for j ∈ / I will be useful. Lemma 7. Recall hI for I ⊂ [α + β] is the number of the valid assignments of {Vi }i∈I . For I ⊊ [α + β] and j ∈ [α + β] \ I, it holds that (N − cj CI − vj,I )hI ≤ hI∪{j} ≤ N hI . In particular, the following inequality holds (N − ci+1 Ci − vi+1 )hi ≤ hi+1 ≤ N hi . Proof. The upper bound is clear because Vj can take one of [N ] values. For the lower bound, fix an assignment VI = {Vi }i∈I ∈ SI . The assignment to Vj cannot take the values inP∪i∈I Ni,j (Vi ). By the union bound, the size of this set is bounded above by i ci cj + vj,I = CI cj + vj,I , and Vj can take at least (N − cj CI − vj,I ) different values for each solution VI . ⊔ ⊓ Components of size > 2. The following lemma shows a rudimentary Mirror Pα lower bound of h(G) for the components of size > 2. Let v (≥3) = i=1 vi . Lemma 8. It holds that Cα2 hα (N − 1)qc ≥1− (N )Cα 19 2 1≤i≤α ci N2 P − 2v (≥3) . N Proof. We first prove the following claim. Claim. For each 0 ≤ i < α such that ci+1 Ci ≤ N , it holds that hi+1 (N − 1)ci+1 −1 ≥1− hi (N − Ci )ci+1 ci+1 Ci N 2 − 2vi+1 . N Proof (of claim). By applying Lemma 7 to hi+1 , we obtain N − ci+1 Ci − vi+1 N (N − 1)ci+1 −1 hi+1 (N − 1)ci+1 −1 ≥ . · hi (N − Ci )ci+1 N (N − Ci )ci+1 The second term is bounded below by ci+1 −1 N (N − 1)ci+1 −1 Ci Ci = 1+ · 1+ (N − Ci )ci+1 N − Ci N − Ci − 1 ci+1 ci+1 Ci Ci ≥1+ , ≥ 1+ N N which gives the overall lower bound 2 ci+1 Ci − 2vNi+1 as we wanted. N 1− ci+1 Ci N − vi+1 N · 1+ ci+1 Ci N ≥ 1− ⊔ ⊓ Now we return to the original proof. If there exists i such that ci+1 Ci ≥ N, the right-hand side is less than 0 as follows so that the inequality becomes obvious: (Cα ci+1 )2 ≥ (Ci ci+1 )2 ≥ N 2 . When ci+1 Ci ≤ N holds for all i ≤ α − 1, we obtain the desired result by multiplying the inequalities from the claim for i = 1, ..., α−1 and using Inequality (2), and the fact that Ci ≤ Cα for i ≤ α. ⊔ ⊓ Components of size 2. The following lemma is for the components of size 2. Pα+β Let v (2) = i=α+1 vi . Lemma 9. Suppose that 4Cα+β + 2 ≤ N . Then it holds that hα+β (N − 1)β 4Cα2 β 4Cα β 2 22β 2 32Cα β 3 16β 4 18v (2) ≥1− − − − − − . 2 2 2 3 3 hα (N − Cα )2β N N N 3N N N Proof. We use the following claim. Claim. For each 0 ≤ i < β such that 4Cα+i + 2 ≤ N , it holds that 4C 2 8Cα i 44i 32Cα i2 64i3 2vi+1 16v (2) hα+i+1 (N − 1) ≥ 1 − 2α − − 2− − 3 − − . hα+i (N − Cα+i )2 N N2 N N3 N N N2 20 Proof (of claim). We adapt the inequality bottom of [16, page 14]. 2 (2) (1 − (N − 1) N − 2Cα+i − vα+i+1 + 4i −16i−8v N hα+i+1 (N − 1) ≥ hα+i (N − Cα+i )2 (N − Cα+i )2 ≥ N 2 − (2Cα+i + 1)N − vα+i+1 N + (4i2 − 16i − 8v)(1 − N 2 − (2Cα+i + 1)N + Cα+i (Cα+i + 1) =1− 4Cα+i N ) 4Cα+i +1 ) N vα+i+1 N + Cα+i (Cα+i + 1) − (4i2 − 16i − 8v (2) )(1 − N 2 − (2Cα+i + 1)N + Cα+i (Cα+i + 1) 4Cα+i +1 ) N 8Cα i 36i 32Cα i2 64i3 8i2 2vα+i+1 4Cα2 16v (2) − − − − − − − N2 N2 N2 N3 N3 N3 N N2 2 2 3 (2) 4C 8Cα i 44i 32Cα i 64i 2vα+i+1 16v ≥ 1 − 2α − − 2− − 3 − − N N2 N N3 N N N2 ≥1− The first inequality is adapted from [16, Bottom of page 14], and the second inequality uses N − 1 ≤ N (in the third term) and (1 − x)(1 − y) ≥ 1 − x − y for x = 1/N and y = 4Cα+i /N (in the last term). In the third inequality, we use that the denominator is less than N 2 /2 because 2Cα+i + 1 ≤ N/2. The last inequality removes some non-dominating terms. ⊔ ⊓ By multiplying the above inequality for i = 0, ..., β − 1, we have: β−1 Y hα+i+1 (N − 1) hα+β (N − 1)β = hα (N − Cα )2β h (N − Cα+i )2 i=0 α+i β−1 Y 8Cα i 44i 32Cα i2 64i3 2vi+1 16v (2) 4C 2 − − − − − ≥ 1 − 2α − N N2 N2 N3 N3 N N2 i=0 β−1 X 4C 2 8Cα i 44i 32Cα i2 64i3 2vi+1 16v α ≥1− + + + + + + N2 N2 N2 N3 N3 N N2 i=0 ≥1− 4Cα β 2 22β 2 32Cα β 3 16β 4 18v (2) 4Cα2 β − − − − − . 2 2 2 3 3 N N N 3N N N The first inequality is from the claim, and the second one is Inequality (2). In the last inequality, we use Inequality (3) and β ≤ N . ⊔ ⊓ Isolated vertices. Finally, we need to exclude the solutions that violate some non-equations connected to D. Let vD be the number of such non-equations. Lemma 10. Suppose that Cα+β + |D| ≤ N/2. It holds that h(G) 2vD ≥1− . hα+β (N − Cα+β )|D| N Proof. For each solution to C1 ⊔ C2 ⊔ ... ⊔ Cα+β , there is (N − Cα+β )|D| valid assignments to the vertices in D ignoring the non-equations. Among them, at 21 most (N −Cα+β )|D|−1 assignments violate each non-equation. Therefore we have h(G) ≥ hα+β · (N − Cα+β )|D| − vD (N − Cα+β )|D|−1 , and the desired inequality follows from the condition Cα+β + |D| ≤ N/2. ⊔ ⊓ Proof of Theorem 5. Observe that 8q ≤ N implies the conditions of all lemmas. Applying Lemmas 8 to 10 in sequence, we have P Cα2 1≤i≤α c2i 4Cα2 β + 4Cα β 2 + 22β 2 h(G)(N − 1)qc +β ≥1− − 2 (N )Cα +β+|D| N N2 32Cα β 3 /3 + 16β 4 2v (≥3) + 18v (2) + 2vD − 3 N N P 2 9qc 1≤i≤α c2i 9qc2 β + 6qc β 2 + 22β 2 16qc β 3 + 16β 4 18v ≥1− − − − 2 2 3 4N N N N P 2 4 9qc2 1≤i≤α c2i 31qc q 16q 18v − − − . ≥1− 8N 2 N2 N3 N − We use 3qc ≥ 2Cα , and v = v (≥3) + v (2) + vD in the first Pα inequality. In the second inequality, we use qc + β = q so that β ≤ q and i=1 ci = Cα ≤ 3qc /2, and finally qc ≥ 1 to suppress the β 2 term. ⊔ ⊓ 4.2 Proof of Mirror Theory - Lower Bound for ξmax = 2 The following concepts and useful auxiliary lemma compute the more refined lower bound for mirror theory with ξmax = 2 —Theorem 6. For m ∈ {2, · · · , q}, let I = {i1 , · · · , im } ⊂ [q] be an index set such that |I| = m. We define def V[I] = {Pγi1 , Pγi′ , · · · , Pγim , Pγi′ }, m 1 def E[I] = {(Pγi1 , Pγi′ , λi1 ), · · · , (Pγim , Pγi′ , λim )}, m 1 def G[I] = (V[I], E[I]), where (Pγ , Pγ ′ , λ) ∈ E[I] represents an edge connecting Pγ and Pγ ′ with label λ. When I = [m], we will simply write Gm to denote G[I]. So Gq = G(Γ ), which is the graph representation of the equation system Γ. We also define def R[I]i = (V1 , V1′ , V2 , V2′ ) ∈ Ci ∗2 × Cj ∗2 i, j ∈ I and j < i and λ(V1 , V1′ ) = λ(V2 , V2′ ) . For k ∈ [m − 1], let J = (j1 , j2 , · · · , jk+1 ) ∈ I k+1 be a sequence of distinct indices in I, and let L = (L1 , · · · , Lk ) ∈ ({0, 1}n \ {0}n )k be a sequence of def n-bit weights. Then we define an edge set (a set of equations) F[J , L] = 22 {(Pγj1 , Pγj′ , L1 ), · · · , (Pγjk , Pγj′ 2 k+1 , Lk )} and a weighted graph (an equation sys- def tem) G[I, J , L] = G[I] ∪ F[J , L]. When h(G[I] ∪ F[J , L]) > 0, we say that G[I] ∪ F[J , L] is valid. We also define subgraphs of G[I, J , L] as follows def G −+ [I, J , L] = G[I] ∪ (F[J , L] \ {(Pγj1 , Pγj′ , L1 )}), 2 G +− def [I, J , L] = G[I \ {jk+1 }] ∪ (F[J , L] \ {(Pγjk , Pγj′ k+1 , Lk )}), def G −− [I, J , L] = G[I \ {jk+1 }] ∪ (F[J , L] \ {(Pγj1 , Pγj′ , L1 ), (Pγjk , Pγj′ 2 k+1 , Lk )}). When I, J , L are clear from the context, we will simply write G ++ = G[I, J , L], G −+ = G −+ [I, J , L], G +− = G +− [I, J , L], G −− = G −− [I, J , L]. Lemma 11 (Orange Equation). Let α = 0. For any positive integer t ∈ {1, · · · , q}, it holds X ht = (N − 2Ct−1 + |Rt |)ht−1 + h(Gt−1 ∪ E), E∈L[Gt ] where L[Gt ] = {(V, V ′ , λt )|0 ≤ i < j < t, V ∈ Ci , V ′ ∈ Cj , h(Gt−1 ∪ (V, V ′ , λt )) > 0}. Proof. For t = 1, · · · q, recall the component Ct has only two vertices and one F def edge and λt be the label of the edge in Ct . Define the set Λt = ( Ci ). We thus i∈[t] have X ht = N − |Λt−1 N − |Λt−1 | − |Λt−1 ⊕ λt | + |Λt−1 [ (Λt−1 ⊕ λt )| (V1 ,...,Vt−1 )∈St−1 X = \ (Λt−1 ⊕ λt )| (V1 ,...,Vt−1 )∈St−1 = (N − 2Ct−1 )ht−1 + X |Λt−1 \ (Λt−1 ⊕ λt )|, (7) (V1 ,...,Vt−1 )∈St−1 where St−1 is the set of solutions to Gt−1 . In particular, we have X \ X X |Λt−1 (Λt−1 ⊕ λt )| = 1(V ⊕ V ′ = λt ). (V1 ,...,Vt−1 )∈St−1 (V1 ,...,Vt−1 )∈St−1 V,V ′ ∈Λt−1 Let us consider following cases for a fixed pair of (V, V ′ ): 1. For each (W, W ′ , V, V ′ ) ∈ Rt , we have X 1(V ⊕ V ′ = λt ) = (V1 ,...,Vt−1 )∈St−1 X (V1 ,...,Vt−1 )∈St−1 23 1 = ht−1 . 2. If V ∈ Ci , V ′ ∈ Cj , i < j < t, then we have 1(V ⊕ V ′ = λt ) = h(Gt−1 ∪ {(V, V ′ , λt )}). X (V1 ,...,Vt−1 )∈St−1 This leads to X 1(V ⊕ V ′ = λt ) X (V1 ,...,Vt−1 )∈St−1 V,V ′ ∈Λt−1   X = X  (V1 ,...,Vt−1 )∈St−1 = |Rt | ht−1 + 1(V ⊕ V ′ = λt ) + (W,W ′ ,V,V ′ )∈Rt X X V ∈Ci ,V 1(V ⊕ V ′ = λt ) ′ ∈C ,i<j<t j h(Gt−1 ∪ E). (8) E∈L[Gt ] ⊔ ⊓ Lemma 11 follows from Equations (7) and (8). Lemma 12 (Purple Equation). Let α = 0. Fix integers m, k such that 1 ≤ k < m ≤ q, an index set I ⊂ [q] such that |I| = m, a sequence of distinct indices J = (j1 , · · · , jk+1 ) ∈ I k+1 and a sequence of labels L = (L1 , · · · , Lk ) ∈ ({0, 1} \ {0n })k . If G ++ (G[I, J , L]) is valid, then it holds X X h(G ++ ) = h(G +− ) − h(G +− ∪ {E}) + h(G +− ∪ {E, E ′ }), {E,E ′ }∈N[G ++ ] E∈M[G ++ ] where M[G ++ ] = {E = (Pγjk , V, Lk ⊕ λjk+1 ⊕ λa ) : V ′ ∈ V[I] \ V[J ], V, V ′ ∈ Ca , h(G +− ∪ {E}) > 0} ∪ {E = (Pγjk , V, Lk ⊕ λa ) : V ′ ∈ V[I] \ V[J ], V, V ′ ∈ Ca , h(G +− ∪ {E}) > 0} N[G ++ ] = {{E, E ′ } = {(Pγjk , V, Lk ⊕ λjk+1 ⊕ λa ), (V ′ , W, λjk+1 } : W, V ′ ∈ V[I] \ V[J ], W ̸= V ′ , V, V ′ ∈ Ca , h(G +− ∪ {E, E ′ }) > 0}. Proof. Without loss of generality, we assume that I = [m], J = {m − k, m − k + 1, · · · , m}. Let S ⊂ ({0, 1}n )2m and S ′ ⊂ ({0, 1}n )2m−2 be the sets of solutions to ′ G ++ and G +− , respectively. For each solution (Pγ1 , Pγ1′ , · · · , Pγm−1 , Pγm−1 ) ∈ S ′, let Pγm = Pγm−1 ⊕ Lk ⊕ λm ′ = Pγ Pγ m m−1 ⊕ Lk . ++ ′ ) is a solution to G Then (Pγ1 , Pγ1′ , · · · , Pγm , Pγm if and only all 2m variables have distinct values. Formally, it requires for any vertex V ∈ Λm−1 , Pγm ̸= V ⇔ Pγm−1 ⊕ Lk ⊕ λm ̸= V ⇔ Pγm−1 ̸= V ⊕ Lk ⊕ λm ′ ̸= V ⇔ Pγ Pγ m ⊕ Lk ̸= V ⇔ Pγm−1 ̸= V ⊕ Lk . m−1 24 Therefore we have X (1 − 1(Pγm−1 ∈ (Λm−1 ⊕ Lk ) ∪ (Λm−1 ⊕ Lk ⊕ λm ))) h(G ++ ) = S∈S ′ = h(G +− ) − X 1(Pγm−1 ∈ (Λm−1 ⊕ Lk )) − S∈S ′ + X X 1(Pγm−1 ∈ (Λm−1 ⊕ Lk ⊕ λm )) S∈S ′ 1(Pγm−1 ∈ (Λm−1 ⊕ Lk ) ∩ (Λm−1 ⊕ Lk ⊕ λm ))). S∈S ′ When Pγm−1 ∈ (Λm−1 ⊕ Lk ⊕ λm ), we know that the vertex V = Pγm must satisfy V ∈ Λm−1 \ V[J ]. Otherwise there exists a trail such that λ(V, Pγm ) = 0 in G ++ , which means G ++ has a circle, invalid, a contradiction. Therefore this solution to G +− is also a solution to G +− ∪ {(Pγm−1 , V ′ , Lk ⊕ λm ⊕ λ(V, V ′ ))}, where V, V ′ are in the same component C. Similarly, when Pγm−1 ∈ (Λm−1 ⊕Lk ), we know there exists a vertex V = Pγm must satisfy V ∈ Λm−1 \ V[J ], and this solution to G +− is also a solution to G +− ∪ {(Pγm−1 , V ′ , Lk ⊕ λ(V, V ′ ))}, where V and V ′ are in the same component C. To summarize, we have X X 1(Pγm−1 ∈ (Λm−1 ⊕ Lk ⊕ λm )) = h(G +− ∪ {E}), S∈S ′ E∈M1 X 1(Pγm−1 ∈ (Λm−1 ⊕ Lk )) = S∈S ′ X h(G +− ∪ {E}), E∈M2 where def M1 = {(Pγm−1 , V, Lk ⊕ λm ⊕ λa ) : V ′ ∈ Λm−1 \ V[J ], V, V ′ ∈ Ca }, def M2 = {(Pγm−1 , V, Lk ⊕ λa ) : V ′ ∈ Λm−1 \ V[J ], V, V ′ ∈ Ca }. When Pγm−1 ∈ (Λm−1 ⊕ Lk ) ∩ (Λm−1 ⊕ Lk ⊕ λm ), we know there exists two distinct vertecies V ′ , W ∈ Λm−1 \ V[J ] such that Pγm−1 = V ′ ⊕ Lk ⊕ λm = W ⊕ Lk . Equivalently, for V such that V ∈ Ca , we have Pγm−1 = V ⊕ Lk ⊕ λm ⊕ λa = V ′ ⊕ Lk ⊕ λm = W ⊕ Lk . And this solution to G +− is also a solution to G +− ∪ {(Pγm−1 , V, Lk ⊕ λm ⊕ λa ), (V ′ , W, λm )}. Therefore, we have X X 1(Pγm−1 ∈ (Λm−1 ⊕ Lk ) ∩ (Λm−1 ⊕ Lk ⊕ λm ))) = h(G +− ∪ {E, E ′ }), S∈S ′ {E,E ′ }∈N[G ++ ] where def N[G ++ ] = {{(Pγm−1 , V, Lk ⊕ λm ⊕ λa ), (V ′ , W, λm )} : W, V ′ ∈ Λm−1 \ V[J ], W ̸= V ′ , V, V ′ ∈ Ca }. ⊔ ⊓ This concludes the proof. Lemma 13 estimates the size of sets L[Gm ], M[G ++ ], and N[G ++ ] using in Lemma 11 and 12. In order to state Lemma 13, we need to reorder the indices of Gq ; note 25 that any reordering of the indices does not affect the number of solutions to Gq . For the edge set {(Pγ1 , Pγ1′ , λ1 ), · · · , (Pγq , Pγq′ , λq )}, we choose as many different label λ as possible, put them in a separate list, remove them from the edge set, and perform the same procedure recursively for the remaining elements. This procedure defines a reordering of the edges (indices) and with it, we have max λ∈{0,1}n \{0n } {|{k ≤ m : λk = λ}|} ≤ |Rm+1 | . (9) Lemma 13 (Size Lemma). Fix integer m, k, n such that 2 ≤ k < m ≤ t ≤ q. Then, it holds that |L[Gm ]| = (m − 1 − |Rm |)(m − 2 − |Rm |). For an index set I ⊂ [t] such that |I| = m, a sequence of distinct indices J = (j1 , · · · , jk+1 ) ∈ I k+1 and a sequence of labels L = (L1 , · · · , Lk ) ∈ ({0, 1} \ {0n })k . If G ++ (G[I, J , L]) is valid, then it holds M[G −+ ] − 4(|Rt+1 | + 1) ≤ M[G ++ ] ≤ 2r, N[G −+ ] − 4r(|Rt+1 | + 1) ≤ N[G ++ ] ≤ r2 . When k = 1, it holds 2m − |R[I]m | − 4(|Rt+1 | + 1) ≤ M[G ++ ] ≤ 2r, L[G −+ ] − 4r(|Rt+1 | + 1) ≤ N[G ++ ] ≤ r2 . Proof. 1. For the first equality, we first recall the definition of L[Gi ] = {(V, V ′ , λm )|0 ≤ j1 < j2 < m, V ∈ Cj1 , V ′ ∈ Cj2 , h(Gi−1 ∪ (V, V ′ , λm )) > 0}. Since λj1 ̸= λm and λj2 ̸= λm otherwise the resulting graph is invalid. The number of such edge is (m − 1 − |Rm |)(m − 1 − |Rm | − 1), (10) which proves the statement. 2. We then prove the second inequality. Note that M[G ++ ] ⊂ M[G −+ ] when k ≥ 2. We consider the edge in M[G −+ ] \ M[G ++ ], which is of the form either (Pγjk , V, Lk ⊕ λjk+1 ⊕ λa ) or (Pγjk , V, Lk ⊕ λa ) for V ′ ∈ (V[I] \ V[J ]) ∪ Cj1 and V, V ′ ∈ Ca . Such an edge falls into at least one of the following three cases. (a) V ∈ Cj1 . At most four edges fall into this case since |Cj1 | = 2 and V has at most two possible assigned values. (b) E = (Pγjk , V, Lk ⊕ λjk+1 ⊕ λa ) for V ′ ∈ V[I] \ V[J ] and V, V ′ ∈ Ca . Since E ∈ M[G −+ ] \ M[G ++ ], by M’s definition, we know G ++ and G −− ∪ {E} are valid, while G +− ∪ {E} is invalid. This means λ(V, Pγj1 ) = 0 or λ(V, Pγj′ ) = 0. For the case λ(V, Pγj1 ) = 0, we have 1 def λa = L1 ⊕ · · · ⊕ Lk ⊕ λγj2 ⊕ · · · ⊕ λγjk+1 ( = λ). 26 The number of such edges E is at most |{a ≤ t : λa = λ}| where by Equation 9, |{a ≤ t : λa = λ}| ≤ |Rt+1 | . Similarly, the number of edges satisfying λ(V, Pγj′ ) = 0 is at most |Rt+1 | . 1 (c) E = (Pγjk , V, Lk ⊕ λa ) for V ′ ∈ V[I] \ V[J ] and V, V ′ ∈ Ca . Similarly to Case 2, the total number of edges of this type is at most 2 |Rt+1 |. Moreover, |M[G ++ ]| ≤ 2r. We conclude that M[G −+ ] − 4(|Rt+1 | + 1) ≤ M[G ++ ] ≤ 2r. (11) 3. We then prove the third inequality. Note that N[G ++ ] ⊂ N[G −+ ] when k ≥ 2. We consider the pair of edges in N[G −+ ] \ N[G ++ ], where the edge E is of the form (Pγjk , V, Lk ⊕ λjk+1 ⊕ λa ) for V ′ ∈ (V[I] \ V[J ]) ∪ Cj1 and V, V ′ ∈ Ca and the edge E ′ is of the form (V ′ , W, λjk+1 ) for W ∈ (V[I] \ V[J ]) ∪ Cj1 , W ̸= V ′ . Such a pair {E, E ′ } falls into at least one of the following three cases. (a) V ′ ∈ (V[I] \ V[J ]) ∪ Cj1 and W ∈ Cj1 . Since |(V[I] \ V[J ]) ∪ Cj1 | ≤ r, the number of pairs of edges is at most 2r. (b) W ∈ (V[I] \ V[J ]) ∪ Cj1 and V ′ ∈ Cj1 . Similarly to case 1, the number of such pairs of edges is at most 2r. (c) V ′ , W ∈ (V[I]\V[J ]). By N’s definition, we know G ++ and G −− ∪{E, E ′ } are valid, while G +− ∪ {E, E ′ } is invalid. This means λ(V, Pγj1 ) = 0 or λ(V, Pγj′ ) = 0 or λ(W, Pγj1 ) = 0 or λ(W, Pγj′ ) = 0. For the case 1 1 λ(V, Pγj1 ) = 0, we have def λa = L1 ⊕ · · · ⊕ Lk ⊕ λγj2 ⊕ · · · ⊕ λγjk+1 ( = λ). The number of such edges E is at most |{a ≤ t : λa = λ}| where by Equation 9 |{a ≤ t : λa = λ}| ≤ |Rt+1 | . So the number of edge pair {E, E ′ } of this type is at most |Rt+1 | r. The number of edge pairs for the other three cases follows the same upper bound. Moreover, |N[G ++ ]| ≤ r2 . We conclude that N[G −+ ] − 4r(|Rt+1 | + 1) ≤ N[G ++ ] ≤ r2 . (12) 4. We then turn to the fourth inequality. When k = 1, we define the edge set M′ whose edge is of the form either (Pγj1 , V, L1 ⊕ λj2 ⊕ λa ) or (Pγj1 , V, L1 ⊕ λa ) for V ′ ∈ (V[I] \ V[J ]) ∪ Cj1 and V, V ′ ∈ Ca . Note that |M′ | = 2m − |R[I]j2 | ≥ 2m−|R[I]m | and M[G ++ ] ⊂ M′ . We then follow a similar analysis procedure as that in the proof of the second inequality and can conclude that 2m − |R[I]m | − 4(|Rt+1 | + 1) ≤ M[G ++ ] ≤ 2r. 27 (13) 5. We finally turn to the fifth inequality. When k = 1, we define the pairs of edges set N′ where E = (Pγj1 , V, L1 ⊕ λj2 ⊕ λa ) and E ′ = (V ′ , W, λj2 ) such that W, V ′ ∈ V[I] \ V[J ], W ̸= V ′ , V, V ′ ∈ Ca , h(G +− ∪ {E ′ }) > 0. Then we have N[G ++ ] ⊂ N′ and |N′ | = |L[G −+ ]| since L[G −+ ] is obtained by collecting E ′ for all {E, E ′ } ∈ N′ . We then follow a similar analysis procedure as that in the proof of the third inequality and can conclude that L[G −+ ] − 4r(|Rt+1 | + 1) ≤ N[G ++ ] ≤ r2 . (14) By Equation (10) and inequalities (11) to (14), the proof is completed. ⊔ ⊓ The following combinatorial lemma proved by [17] is used in our Mirror theory statement. Lemma 14. Let t be a positive integer, and let (Dm,k )m,k be a two-dimensional sequence of non-negative numbers, where 1 ≤ m ≤ t and k ≤ m − 1. If Dm,k = 0 for k ≤ 0, and Dm,k ≤ Dm−1,k−1 + 2A · Dm−1,k + A2 · Dm−1,k+1 + C , (N − 2A)t−m+k for 2 ≤ m ≤ t and k ≤ m − 3, where A, C are positive constants and A < 2n−1 . Then, for any integer c such that 1 ≤ c ≤ m 2 − 1, it holds Dm,1 ≤ 2c X 2c i=c i i A Dm−c,1−c+i + 2j c−1 X X 2j j=0 i=j i Ai C . (N − 2A)t−m+1+i t We define the following two-dimensional sequence Dm,k where t is a fixed positive integer such that t ≤ q, 1 ≤ m ≤ t and k is an integer – When 1 ≤ k ≤ m − 1, t Dm,k = max I,J ,L h(G −+ [I, J , L]) − h(G[I, J , L]) N , where the maximum is taken over all possible index sets I ⊂ [t] such that |I| = m, sequence of distinct indices J ∈ I k+1 , and sequence of labels L ∈ ({0, 1}n \ {0n })k such that G[I, J , L] is valid. t – When k ≤ 0, Dm,k = 0. t In order to upper bound Dm,k , we begin with the following lemma. Lemma 15. For any I ⊂ [t], J ∈ I k+1 , L ∈ ({0, 1}n \ {0n })k such that |I| = m and G[I, J , L] is valid, one has h(G[I, J , L]) ≤ h(Gt ) . (N − 2r)t−m+k 28 Proof. Without loss of generality, let I = [m] and S be the set of solution to Gm . ′ ′ ) ∈ S, (Pγ , Pγ ′ , · · · , Pγ For each solution (Pγ1 , Pγ1′ , · · · , Pγm , Pγm , Pγm+1 ) is 1 m+1 1 ′ a solution to Gm+1 if for all V ∈ Λm , Pγm+1 ̸= V, Pγm+1 ̸= V. Therefore, we have X ′ }) h(Gm+1 ) ≥ (N − {V ∈ Λm : V = Pγm+1 } ∪ {V ∈ Λm : V = Pγm+1 S∈S ≥ (N − 2r)h(Gm ). By repeatedly applying the above inequality, we have h(Gm ) ≤ h(Gt ) , (N − 2r)t−m (15) which completes the statement when k = 0. When k ≥ 1, let L ∈ ({0, 1}n \ {0n })k and without loss of generality let ′ ) to L = {m − k, m − k + 1, · · · , m}. For each solution (Pγ1 , Pγ1′ , · · · , Pγm , Pγm ′ ′ ) is a G ++ (let its solution set be S ′ ), (Pγ1 , Pγ1′ , · · · , Pγm−k , Pγm−k , · · · , Pγm , Pγm ′ solution to G −+ if for all V ∈ Λm \ Cm−k , Pγm−k ̸= V, Pγm−k ̸= V. Therefore, we have X ′ h(G −+ ) ≥ (N − {V ∈ Λm \ Cm−k : V = Pγm−k } ∪ {V ∈ Λm \ Cm−k : V = Pγm−k }) S∈S ′ ≥ (N − 2r)h(G ++ ). By repeatedly applying the above inequality, we have h(G ++ ) ≤ h(Gm ) . (N − 2r)k (16) Combining Equation (15) and (16), we complete the proof. ⊔ ⊓ By lemma 15, for G ++ (= G[I, J , L]), h(Gt ) h(Gt ) h(G −+ ) ≤ ≤ . N (N − 2r)t−m+k−1 (N − 2r)t−m+k t Therefore, using the above inequality and Dm,k ’s definition, t Dm,k ≤ max{ h(Gt ) h(G −+ ) , h(G ++ )} ≤ . N (N − 2r)t−m+k (17) t Lemma 16 shows that our constructed two-dimensional sequence Dm,k satisfies the condition required for using combinatorial Lemma 14. This proof is based on purple equation (Lemma 12), size lemma (Lemma 13) and Lemma 15. Lemma 16. For 2 ≤ m ≤ t, and k ≤ m − 3, it holds that t t t t Dm,k ≤ Dm−1,k−1 + 2r · Dm−1,k + r2 · Dm−1,k+1 + where def C = (6 |Rt+1 | + 6)h(Gt ) . N 29 C , (N − 2r)t−m+k t Proof. When m = 2 or 3, it is easy to see the statement holds since by Dm,k ’s t definition Dm,k = 0 when k ≤ 0. When m ≥ 4 and 2 ≤ k ≤ m − 3, for any G[I, J , L] such that |I| = m, J ∈ I k+1 , and L ∈ ({0, 1}n \ {0n })k , by purple equation (Lemma 12), we have X X h(G ++ ) = h(G +− ) − h(G +− ∪ {E}) + h(G +− ∪ {E, E ′ }), {E,E ′ }∈N[G ++ ] E∈M[G ++ ] (18) h(G −+ ) = h(G −− X )− h(G −− X ∪ {E}) + E∈M[G −+ ] h(G −− ∪ {E, E ′ }). {E,E ′ }∈N[G −+ ] (19) t By Dm,k ’s definition and since G −− = (G +− )−+ , we have h(G −− ) t . − h(G +− ) ≤ Dm−1,k−1 N t For each edge E ∈ M[G ++ ], by Dm,k ’s definition, we have h(G −− ∪ {E}) t . − h(G +− ∪ {E}) ≤ Dm−1,k N Using the above inequality, we have X E∈M[G −+ ] ≤ h(G −− ∪ {E}) − N h(G X E∈M[G ++ ] h(G +− ∪ {E}) E∈M[G ++ ] −− ∪ {E}) − h(G +− ∪ {E}) + N t ≤ 2r · Dm−1,k + 4(|Rt+1 | + 1) t ≤ 2r · Dm−1,k + X h(G X E∈M[G −+ ]\M[G ++ ] h(G −− ∪ {E}) N −− ∪ {E}) N (by size lemma (Lemma 13)) 4(|Rt+1 | + 1)h(Gt ) . N (N − 2r)t−m+k (by Lemma 15) For each pair of edge {E, E ′ } ∈ N[G ++ ], since G −− ∪{E, E ′ } = (G +− ∪{E, E ′ })−+ , we have h(G −− ∪ {E, E ′ }) t − h(G +− ∪ {E, E ′ }) ≤ Dm−1,k+1 . N Using the above inequality, Lemma 13 and 15, we have X E∈N[G −+ ] h(G −− ∪ {E, E ′ }) − N t ≤ r2 Dm−1,k+1 + X E∈N[G ++ ] 4r(|Rt+1 | + 1)h(Gt ) . N (N − 2r)t−m+k+1 30 h(G +− ∪ {E, E ′ }) By subtracting Equation (18) from above, we have 1 N× Equation (19) and combine everything h(G −+ ) − h(G ++ ) N 4(|Rt+1 | + 1)h(Gt ) 4r(|Rt+1 | + 1)h(Gt ) t + r2 · Dm−1,k+1 + t−m+k N (N − 2r) N (N − 2r)t−m+k+1 (6 |Rt+1 | + 6)h(Gt ) t t t . ≤ Dm−1,k−1 + 2r · Dm−1,k + r2 · Dm−1,k+1 + N (N − 2r)t−m+k (∵ N 2r −2r ≤ 1) t t ≤ Dm−1,k−1 + 2r · Dm−1,k + When m ≥ 4 and k = m − 3 = 1, for any G[I, J , L] such that |I| = m, J = j1 , j2 ∈ I 2 and L ∈ {0, 1}n \ {0n }. By Purple equation (Lemma 12) and Orange equation (Lemma 11), respectively, we have X X h(G ++ ) = h(G +− ) − h(G +− ∪ {E}) + h(G +− ∪ {E, E ′ }), {E,E ′ }∈N[G ++ ] E∈M[G ++ ] (20) h(G −+ ) = h(G −− ) − (2m − 2 − |R[I]m |)h(G −− ) + X h(G −− ∪ E). (21) E∈L[G −+ ] Since G +− = G −− , we have h(G −− ) − h(G +− ) = 0. For each edge E ∈ M[G ++ ], t by Dm,k ’s definition, we have h(G −− ) t . − h(G +− ∪ {E}) ≤ Dm−1,1 N Using the above inequality, we have (2m − 2 − |R[I]m |) ≤ X E∈M[G ++ ] h(G −− ) − N X h(G +− ∪ {E}) E∈M[G ++ ] h(G −− ) h(G ) − h(G +− ∪ {E}) + 2m − 2 − |R[I]m | − M[G ++ ] N N −− h(G −− ) N 4(|R | + 1)h(G t+1 t) t ≤ 2r · Dm−1,1 + . N (N − 2r)t−m+1 t ≤ 2r · Dm−1,1 + 4(|Rt+1 | + 1) (by Lemma 13) For each pair of edge {E, E ′ } ∈ N[G ++ ], since each edge E uniquely determines an edge E ′ , we have h(G −− ∪ {E}) t − h(G +− ∪ {E, E ′ }) ≤ Dm−1,2 . N 31 It implies that X E∈L[G −+ ] h(G −− ∪ {E}) − N t ≤ r2 · Dm−1,2 + X h(G +− ∪ {E, E ′ }) {E,E ′ }∈N[G ++ ] 4r(|Rt+1 | + 1)h(Gt ) . N (N − 2r)t−m+2 By subtracting Equation (20) from we have 1 N× Equation (21) and combining the above, h(G −+ ) − h(G ++ ) N t Dm,1 = max I,J ,L t t ≤ 2r · Dm−1,1 + r2 · Dm−1,2 + (6 |Rt+1 | + 6)h(Gt ) . N (N − 2r)t−m+1 ⊔ ⊓ This concludes the proof. t When k = 1, Lemma 17 gives a sharper upper bound on Dt,1 . The proof can be derived from Lemma 16 and 14. Lemma 17. If 2n + 2 ≤ t < q and r ≤ t Dt,1 ≤ N 13 , then it holds that (29 |Rt+1 | + 31)h(Gt ) . N2 32 t Proof. Since the two-dimensional sequence Dm,k satisfies Lemma 16, let n ≤ m − 1 (the c in Lemma 14), then we can apply Lemma 14 to obtain 2 t Dm,1 2j 2n n−1 X XX 2n i t 2j ri C ≤ r Dm−n,1−n+i + i i (N − 2r)t−m+1+i i=n j=0 i=j 2j 2n n−1 X XX 2j ri C t (2e)i ri Dm−n,1−n+i + i (N − 2r)t−m+1+i i=n j=0 i=j 2ne i i ( 2n i ≤ ( i ) ≤ (2e) when n ≤ i ≤ 2n) 2j n−1 2n XX X 2j ri C h(Gt ) i + ≤ (2er) (N − 2r)t−m+1+i j=0 i=j i (N − 2r)t−m+1+i i=n ≤ (by Inequality 17) = ≤ ≤ ≤ h(Gt ) (N − 2r)t−m+1 h(Gt ) (N − 2r)t−m+1 2n X i=n ∞ X ( 2j n−1 XX 2er i ) + N − 2r j=0 (1/2)i + i=n j=0 i=j 2j n−1 XX 1 2h(Gt ) + (N − 2r)t−m+1 2n j=0 i=j 2j n−1 XX i=j 2j ri C i (N − 2r)t−m+1+i ri C 2j i (N − 2r)t−m+1+i (r ≤ N 13 ) 2j ri C i (N − 2r)t−m+1+i 1 4C 2h(Gt ) + . (N − 2r)t−m+1 2n (N − 2r)t−m+1 Now, plug in m = t and have 2h(Gt ) 1 4C + (N − 2r) 2n (N − 2r) 26 13 h(Gt ) (24 |Rt+1 | + 24)h(Gt ) ≤ 11 2 + 11 N N2 N , substitute C defined in Lemma 16) (∵ r ≤ 13 t Dt,1 ≤ = (29 |Rt+1 | + 31)h(Gt ) . N2 ⊔ ⊓ This concludes the proof. Finally, using the above, we can prove Theorem 6 as follows: Proof (of Theorem 6). Recall when qc = 0, we have α = 0. We also know in this equation system Ci = 2i, since each component has size only 2. To recursively −1)q i+1 (N −1) , we first lower bound hi (Nh−C compute the lower bound for h(G)(N (N )C i )(N −Ci −1) β for i = 0, · · · , 2n + 1 and i = 2n + 2, · · · , q, separately. i+1 (N −1) To lower bound each term of hi (Nh−C , we first lower bound hi+1 i )(N −Ci −1) by hi for i = 0, · · · , 2n + 1 and i = 2n + 2, · · · , q − 1, separately. By lemma 7, we 33 simply have (N − ci+1 Ci )hi ≤ hi+1 for i = 0, · · · , 2n + 1 as vi+1 = 0 in graph G, which represents the equation system Γ. For i ≥ 2n + 2, we first replicate part of the proof of Lemma 11 and have X hi+1 = (N − 2Ci )hi + |Ri+1 |hi + {V,V ′ }∈L h′ (V, V ′ ), (22) i+1 where recall h′ (V, V ′ ) denote the number of solutions to Λi such that V ⊕ V ′ = def λ(V, V ′ ) = ⊥ . We also have λi+1 for V, V ′ ∈ Λi , and Li+1 = {V, V ′ } ∈ Λ∗2 i 2 |Li+1 | = Ci (Ci − 2) = 4i − 4i. Then by Lemma 17, we have hi h (V, V ) ≥ N ′ ′ (29 |Ri+1 | + 31) 1− N . Plugging in Equation (22), we have hi+1 ≥ 4i2 − 4i 116 |Ri+1 | i2 − 116 |Ri+1 | i + 124i2 − 124i N − 4i + |Ri+1 | + − N N2 hi . For i = 2n + 2, · · · , q, plugging in the above inequality, we have hi+1 (N − 1) ≥ hi (N − Ci )(N − Ci − 1) (N − 1) N − 4i + |Ri+1 | + 4i2 −4i N − 116|Ri+1 |i2 −116|Ri+1 |i+124i2 −124i N2 N 2 − (4i + 1)N + 4i2 + 2i 2 N 2 − (4i + 1)N + 4i2 + 128i−128i + (N −1)N |Ri+1N|−116i N ≥ N 2 − (4i + 1)N + 4i2 + 2i ≥1+ ≥1− −2i + 128i−128i2 N N2 (∵ i ≤ q ≤ 2q 128q 2 − . 2 N N3 For i = 1, · · · , 2n + 1, with (N − ci+1 Ci )hi ≤ hi+1 , we have hi+1 (N − 1) (N − ci+1 Ci )(N − 1) ≥ hi (N − Ci )(N − Ci − 1) (N − Ci )(N − Ci − 1) N 2 − (4i + 1)N + 4i = 2 N − (4i + 1)N + 4i2 + 2i 4i2 ≥ 1 − 2. N 34 N 13 ) 2 |Ri+1 | By using the above inequalities, then we have q−1 2n+1 Y Y h(G)(N − 1)q hi+1 (N − 1) hi+1 (N − 1) = × (N )Cβ h (N − C )(N − C + 1) h (N − Ci )(N − Ci + 1) i i i i=0 i=2n+2 i 2n+1 Y q−1 Y 2q 128q 2 ≥ × 1− 2 − N N3 i=2n+2 i=0 4n(n + 1)(2n + 1) 2q 2 128q 3 ≥ 1− 1− 2 − 6N 2 N N3 2 3 3 128q 8(n + 1) 2q ≥1− 2 − − , N N3 3N 2 4i2 1− 2 N ⊔ ⊓ which completes the proof. 4.3 Proof of Mirror Theory - Upper Bound for ξmax > 2 Theorem 7 (Upper Bound Mirror Theory for ξmax > 2). Let G be a nice graph, q denote the number of edges of G, and qc denote the number of edges of C1 ⊔ · · · ⊔ Cα . When q ≤ 4ξN and 0 < qc ≤ q, then it holds that max  2 h(G)(N − 1)q  ≤ exp  (N )Cα+β  α+β P |Ri | + 2 i=1 α P i=1 c2i N 2qc2 + α P i=1  c2i + 4qc q 2 + N2 20q 4   . N3  The proof of Theorem 7 is deferred to the end of this section. Before proving it, we introduce essential lemmas first. Lemma 18. When q ≤ N 4ξmax hi+1 ≤ and 0 < qc ≤ q, for i = 0, · · · , α − 1, it holds that 2(ci+1 )2 qc2 N − ci+1 Ci + |Ri+1 | + N hi . Proof. For a vertex V ∈ Ci+1 , denote the set ΛV = (C1 ⊔ · · · ⊔ Ci ) ⊕ λi+1 (V ). Recall that Si is the set of solutions to (V1 , . . . , Vi ). By Fixing Si and assigning any value to V ∗ ∈ Ci+1 , the other unknowns in Ci+1 are uniquely determined. Hence a solution to hi+1 after fixing Si can be identified to choose a solution to V ∗ from [ {0, 1}n \ ΛV . V ∈Ci+1 35 We thus have an upper bound of hi+1 as follows:   [ X N − | ΛV | (count for every fixed solution in Si ) V ∈Ci+1 (V1 ,...,Vi )∈Si   ≤ X N − X |ΛV | + V ∈Ci+1 (V1 ,...,Vi )∈Si X V,V ′ ∈C  ≤ X (Lemma 4) i+1  N − ci+1 Ci + X V,V (V1 ,...,Vi )∈Si = (N − ci+1 Ci )hi + |ΛV ∩ ΛV ′ | X |ΛV ∩ ΛV ′ | ′ ∈C i+1 X |ΛV ∩ ΛV ′ |. (V1 ,...,Vi )∈Si V,V ′ ∈Ci+1 For V1 , V1′ ∈ Ci+1 , V2 , V2′ ∈ C1 ⊔ · · · ⊔ Ci , let h′ (V1 , V1′ , V2 , V2′ ) denote the number of solutions to C1 ⊔ · · · ⊔ Ci such that V2 ⊕ V2′ = λi+1 (V1 ) ⊕ λi+1 (V1′ ). Let n o def ∗2 Li+1 = {V1 , V1′ }, {V2 , V2′ } ∈ Ci+1 ∗2 × (C1 ⊔ · · · ⊔ Ci ) λ(V2 , V2′ ) = ⊥ . P P Then the summation (V1 ,...,Vi )∈Si V,V ′ ∈Ci+1 |ΛV ∩ ΛV ′ | can be computed by X |Ri+1 |hi + h′ (V1 , V1′ , V2 , V2′ ). ({V1 ,V1′ },{V2 ,V2′ })∈Li+1 This is because the constant in ΛV1 ∩ ΛV1′ satisfies that there exists V2 , V2′ ∈ C1 ⊔ · · · ⊔ Ci such that V2 ⊕ V2′ = λi+1 (V1 ) ⊕ λi+1 (V1′ ). We count the number of such constants by considering two cases: V2 , V2′ are in the same component (the first term) or not (the second term). Let h′′ (V, V ′ ) denote the number of solutions to (C1 ⊔ · · · ⊔ Ci ) \ (CV ⊔ CV ′ ) where V ∈ CV and V ′ ∈ CV ′ . For ({V1 , V1′ }, {V2 , V2′ }) ∈ Li+1 , we have: h′ (V1 , V1′ , V2 , V2′ ) ≤ N · h′′ (V2 , V2′ ) (Upper bound of Lemma 7) N hi ≤ (Lower bound of Lemma 7) (N − ξmax Ci )2 hi 2N ξmax Ci ≤ 1+ N (N − ξmax Ci )2 hi 192ξmax qc ≤ 1+ N 25N 73hi , ≤ 25N where the last two steps are because Ci ≤ 3q2c and qc ≤ q ≤ 4ξN . We also max compute 9(ci+1 )2 qc2 ci+1 Ci (ci+1 )2 Ci2 ≤ . |Li+1 | ≤ ≤ 2 2 4 16 36 Combining all together, we have 2(ci+1 )2 qc2 hi+1 ≤ N − ci+1 Ci + |Ri+1 | + hi . N ⊔ ⊓ This concludes the proof. Lemma 19. For α > 0 and i = α, · · · , α + β − 1, it holds that 3qc q 16q 3 C2 hi + hi+1 ≤ N − 2Ci + |Ri+1 | + i + N N N2 Proof. For i = α, · · · , α + β − 1, recall the component Ci has only two vertices and one edge. Let λi+1 be the label of the edge in Ci+1 for such i in the proof’s def F context. Denote the set Λi = Ci for i = α, · · · , α + β − 1. We thus have j∈[i] X hi+1 = N − |Λi N − |Λi | − |Λi ⊕ λi+1 | + |Λi [ (Λi ⊕ λi+1 )| (V1 ,...,Vi )∈Si X = \ (Λi ⊕ λi+1 )| (V1 ,...,Vi )∈Si = (N − 2Ci )hi + X |Λi \ (Λi ⊕ λi+1 )|. (23) (V1 ,...,Vi )∈Si For V, V ′ ∈ Λi , let h′ (V, V ′ ) denote the number of solutions to Λi such that V ⊕ V ′ = λi+1 . Let def Mi+1 = Then we have X |Λi \ {V, V ′ } ∈ Λ∗2 λ(V, V ′ ) = ⊥ . i (Λi ⊕ λi+1 )| = |Ri+1 |hi + X h′ (V, V ′ ). (24) {V,V ′ }∈Mi+1 (V1 ,...,Vi )∈Si Let h′′ (V, V ′ ) denote the number of solution to Λi \ (CV ⊔ CV ′ ) where V ∈ CV and V ′ ∈ CV ′ . Suppose that V ∈ Cj , V ′ ∈ Ck for j, k ≤ i. Applying Lemma 7, we have h′ (V, V ′ ) ≤ N · h′′ (V, V ′ ) N hi ≤ (N − cj Ci )(N − ck Ci ) hi cj Ci ck C i = 1+ 1+ N N − cj Ci N − ck Ci 2cj Ci 2ck Ci hi 1+ + , ≤ N N − cj Ci N − ck Ci 37 where we used cj Ci ≤ ξmax q ≤ N/4 and (1 + x)(1 + y) ≤ 1 + 2(x + y) for x, y ≤ 1. Since Cj has cj vertices, the term related to j is added at most cj Ci times. By cj Ci ≤ N − cj Ci , it holds that (cj Ci )2 (cj Ci )2 2(cj Ci )2 ≤ cj Ci , and ≤ . N − cj Ci N − cj Ci N Summing up over all (V, V ′ ), we have ! Pi 2 Ci2 j=1 2(cj Ci ) hi hi+1 ≤ N − 2Ci + |Ri+1 | + + N N (N − cj Ci )   α i 2 2 X X C 2cj Ci 4(cj Ci )  ≤ N − 2Ci + |Ri+1 | + i + + hi N N N2 j=1 j=α+1 C2 3qc q 16q 3 ≤ N − 2Ci + |Ri+1 | + i + + hi , N N N2 Pα where we use i=1 ci = Cα ≤ 3qc /2, Ci ≤ q, i ≤ β ≤ q and cj = 2 for all j ≥ α + 1 for proving the last inequality. Now we prove the second part of the statement, when α = 0, which means there are no components with a size larger than 2 in the graph. For V, V ′ ∈ Li+1 , if i ≥ 2n + 2, by Lemma 17, then we have (29 |Ri+1 | + 31)hi hi − h′ (V, V ′ ) ≤ , N N2 equivalently, we have hi h (V, V ) ≤ N ′ ′ 29 |Ri+1 | + 31 1+ N Plugging in Equation (24), we have X \ |Λi (Λi ⊕ λi )| = |Ri+1 |hi + X h′ (V, V ′ ) {V,V ′ }∈Li+1 (V1 ,...,Vi )∈Si ≤ C2 |Ri+1 | + i N 29 |Ri+1 | + 31 1+ N hi . Plugging the above inequality into Equation (23), we have 116q 2 124q 2 Ci2 hi+1 ≤ N − 2Ci + |Ri+1 | 1 + + + hi . N2 N N2 ⊔ ⊓ This concludes the proof. Using the above lemmas, we can prove Theorem 7 as follows. 38 Proof (of Theorem 7). We start by finding a relation between hi and hi+1 . Lemma 7 has already shown hi+1 ≤ N hi , while Lemmas 18 and 19 gives us a tighter upper bound of hi+1 using hi , of which the proofs are deferred to the end of this section. We first observe that the non-equations only decrease the number of solutions. So, in the following, we only consider a system of equations Γ = Γ =. N by the constraints 4qξmax ≤ N . To recurNote that ξmax ≥ 3, hence q ≤ 12 sively compute the upper bound for h(G)(N −1)q (N )Cα+β , we first upper bound hi+1 (N −1)ci+1 −1 hi (N −Ci )ci+1 , for i = 0, · · · , α − 1. To do so, we observe (N − Ci )ci+1 ≥ N ci+1 −1 (N − ci+1 Ci+1 ), (25) which is simply because by dividing N ci+1 from both side it is true that c Ci Ci + ci+1 − 1 Ci + ci+1 i+1 ci+1 Ci+1 1− × ··· × 1 − ≥ 1− ≥ 1− . N N N N So we have hi+1 (N − 1)ci+1 −1 hi (N − Ci )ci+1 2(c )2 qc2 (N − 1)ci+1 −1 N − ci+1 Ci + |Ri+1 | + i+1 N ≤ c −1 i+1 N (N − ci+1 Ci+1 ) (by Lemma 18 and Equation (25)) ci+1 Ci+1 − ci+1 Ci |Ri+1 | 2(ci+1 )2 qc2 + + N − ci+1 Ci+1 N − ci+1 Ci+1 N (N − ci+1 Ci+1 ) c2i+1 |Ri+1 | 2(ci+1 )2 qc2 ≤1+ + + N − ci+1 Ci+1 N − ci+1 Ci+1 N (N − ci+1 Ci+1 ) 2 2 2c 2|Ri+1 | 4(ci+1 )2 qc ≤ 1 + i+1 + + . (ci+1 Ci+1 ≤ N2 by 4qξmax ≤ N ) N N N2 ≤1+ Now we can compute α−1 Y hi+1 (N − 1)ci+1 −1 h(Gα )(N − 1)qc = (N )Cα hi (N − Ci )ci+1 i=0 α−1 Y 4(ci+1 )2 qc2 2|Ri+1 | 2c2i+1 + + ≤ 1+ N N N2 i=0 Pα 2 Pα Pα 2 i=1 |Ri | + 2( i=1 ci ) 2qc2 ( i=1 c2i ) ≤ exp + , N N2 39 where we use 1 + x ≤ ex . On the other hand, for i = α, · · · , α + β − 1, hi+1 (N − 1) hi (N − Ci )(N − Ci − 1) C2 (N − 1) N − 2Ci + |Ri+1 | + Ni + ≤ (N − Ci )(N − Ci − 1) 3qc q N + 16q 3 N2 (by Lemma 19) 3 N 2 − (2Ci + 1)N + Ci2 + |Ri+1 | N + 3qc q + 16 qN ≤ N 2 − (2Ci + 1)N + Ci2 3 |Ri+1 | N + 3qc q + 16 qN ≤1+ 2 N − (2Ci + 1)N + Ci2 6 |Ri+1 | 18qc q 96q 3 ≤1+ + + . 5N 5N 2 5N 3 (2(Ci + 1) ≤ 2q ≤ N 6) Now we can compute h(G)(N − 1)q (N )Cα+β α+β−1 Y hi+1 (N − 1)ci+1 −1 = hi (N − Ci )ci+1 i=0 α+β−1 hi+1 (N − 1) h(Gα )(N − 1)qc Y = (N )Cα hi (N − Ci )(N − Ci − 1) i=α α+β−1 Y 6 |Ri+1 | 18qc q 96q 3 ≤ exp (δ1 ) + + 1+ 5N 5N 2 5N 3 i=α ! Pα+β 2 i=α |Ri+1 | 4qc q 2 20q 4 + + ≤ exp (δ1 ) exp N N2 N3 ≤ exp (δ1 + δ2 ) , for δ1 = 2 Pα i=1 and |Ri | + 2( N Pα+β Pα 2 i=1 ci ) Pα 2qc2 ( i=1 c2i ) + N2 4qc q 2 20q 4 + , 2 N N N3 where we use 1+x ≤ ex , β ≤ q, and choose some integer upper bounds. Therefore, we have δ2 = 2 δ = δ1 + δ2 = α+β P 2 i=α+1 |Ri | i=1 N + 2 |Ri | + Pα 2 i=1 ci N + 2qc2 ( Pα 2 i=1 ci ) 2 N + 4qc q 2 + 20q 4 . N3 ⊔ ⊓ This concludes the proof. 40 4.4 Proof of Mirror Theory - Upper Bound for ξmax = 2 Theorem 8 (Upper Bound Mirror Theory). Let G be a nice graph and q N denote the number of edges of G for q ≤ 13 . It holds that P q 3 i=1 |Ri | 72q 3 10(n + 1)2 h(G)(N − 1)q + ≤ exp + . (N )Cα+β N N3 N To prove this theorem, we first state the following lemma: Lemma 20. For i ∈ [2n + 2, β − 1], it holds that 116q 2 C2 124q 2 )+ i + )hi 2 N N N2 Proof. For V, V ′ ∈ Li+1 , if i ≥ 2n + 2, by Lemma 17, then we have hi+1 ≤ (N − 2Ci + |Ri+1 |(1 + (29 |Ri+1 | + 31)hi hi − h′ (V, V ′ ) ≤ , N N2 equivalently, we have hi h (V, V ) ≤ N ′ ′ 29 |Ri+1 | + 31 1+ N Plugging in Equation (24), we have X \ |Λi (Λi ⊕ λi )| = |Ri+1 |hi + X {V,V (V1 ,...,Vi )∈Si ≤ |Ri+1 | + Ci2 N h′ (V, V ′ ) ′ }∈L i+1 1+ 29 |Ri+1 | + 31 N hi . Plugging the above inequality into Equation (23), we have 116q 2 124q 2 Ci2 hi+1 ≤ N − 2Ci + |Ri+1 | 1 + + + hi . N2 N N2 ⊔ ⊓ This completes the proof. Using Lemma 20, we can prove Theorem 8 as follows. Proof (of Theorem 8). Recall when qc = 0, α = 0. To recursively compute −1)q i+1 (N −1) the upper bound for h(G)(N , we first upper bound hi (Nh−C for (N )C i )(N −Ci −1) β i = 2n + 2, · · · , q. For i = 2n + 2, · · · , q, using Lemma 20, we have 2 C2 i N 2 − 2Ci N + |Ri+1 |(N + 116q hi+1 (N − 1) N )+ 2 + ≤ hi (N − Ci )(N − Ci − 1) N 2 − (2Ci − 1)N + Ci2 + Ci 2 2 124q |Ri+1 |(N + 116q N )+ N ≤1+ 2 N − (2Ci − 1)N + Ci2 + Ci 2|Ri+1 | 138|Ri+1 |q 2 147q 2 ≤1+ + + 3 N N N3 2 3|Ri+1 | 147q ≤1+ + . N N3 41 124q 2 N By using the above inequality, we have q−1 2n+1 Y Y h(G)(N − 1)q hi+1 (N − 1) hi+1 (N − 1) = × (N )Cβ hi (N − Ci )(N − Ci − 1) i=2n+2 hi (N − Ci )(N − Ci − 1) i=0 2n+1 Y 2Ci N ≤ 1+ 2 N − (2Ci − 1)N + Ci2 + Ci i=0 q−1 Y 3|Ri+1 | 147q 2 × 1+ (∵ hi+1 ≤ N hi ) + N N3 i=2n+2 Pq q 2n+1 Y 3 i=1 |Ri | 147q 2 5i ≤ 1+ × 1+ + N Nq N3 i=0 (∵ Ci ≤ min{i, N/13} and by Jensen’s Inequality) P 3 10(n + 1)2 3 qi=1 |Ri | eδ1 (substitute δ1 = + 147q ≤ 1+ 3 ) N N N ≤ eδ2 +δ1 , (substitute δ2 = ⊔ ⊓ which completes the proof. 5 10(n+1)2 ) N Multi-User Security of XoP This section proves the multi-user PRF∗ security of XoP such that XoP[P](x) := P (0∥x) ⊕ P (1∥x) where P is an n-bit random permutation. The first theorem follows the paradigm of [13], the Chi-squared method. We slightly adapted the proof because we are not considering the truncation. This is relatively weaker than the result obtained from the Squared-ratio method described below, but effectively exemplifies the power of our new ideal world without outputting zero. Theorem 9. Let n, u, and qm be positive integers such that qm ≤ 2n−3 . Then, it holds 12 3 6uqm mu-prf ∗ AdvXoP (u, qm ) ≤ . (2n − 1)3 The second result is obtained by following the paradigm of [11]. We stress that we have no intention to optimize the constant factors, and there is significant room for improvement on them. Theorem 10. Let n, u, and qm be positive integers such that n > 12 and qm ≤ 2n 4n . Then, it holds 1 -prf (u, q ) ≤ Advmu m XoP ∗ 1 2 26u 2 qm 49u 2 (n + 1)2 + . 22n 2n 42 The remainder of this section is organized as follows. We first prove Theorem 9 in Section 5.1 using the Chi-squared method. In Section 5.2, we prove Theorem 10 using the Squared-ratio method. 5.1 Proof of Multi-User Security of XoP via the Chi-Squared Method Before proving the security of XoP, we define multiple experiments. First, let S0 be the fine-tuned ideal world where each user interacts with the random function sampled from Func∗ (n − 1, n). The world S1 is defined similarly, but each user interacts with different XoP constructions. An adversary makes qm queries to each user interface, a total of q = uqm queries. Without loss of generality, we assume an information-theoretic adversary is deterministic and does not make any redundant query; any redundant query only degrades the adversary’s ability. We consider the experiments in Algorithm 1 that are essentially identical to the original worlds but with lazily sampling the queries; each answer for the oracle queries of the j-th user is replaced by the output Zj . This change cannot be observed from the adversary’s view. Thus, we have ∥pS0 (·) − pS1 (·)∥ = ∥pB0 (·) − pB1 (·)∥. Algorithm 1 Ideal/Real experiments for XoP Experiment B0 1: for j ← 1 to u do 2: for i ← 1 to qm do 3: yij ←$ {0, 1}n \ {0} j 4: Z ← (y1j , . . . , yqjm ) 5: return (Z1 , . . . , Zu ) Experiment B1 1: for j ← 1 to u do 2: R ← {0, 1}n 3: for i ← 1 to qm do 4: uj2i−1 ←$ R, R ← R \ {uj2i−1 } 5: uj2i ←$ R, R ← R \ {uj2i } 6: yij ← uj2i−1 ⊕ uj2i 7: Zj ← (y1j , . . . , yqjm ) 8: return (Z1 , . . . , Zu ) We then consider the intermediate worlds in Algorithm 2 for applying the Chi-squared method. In the world C0 , the oracle pretends to the answer y is of the form u ⊕ u′ for u = P (0∥x) and u′ = P (1∥x) for the given permutation P 43 and an input x, and returns z = (u, u′ ). When the adversary processes it as if the oracle outputs y = u ⊕ u′ , and as long as the oracle does not return (⊥, ⊥), this world is identical to B0 in the adversary’s view. In the world C1 , everything is the same with B1 , but the oracle returns (u, u′ ) as in C0 . The following lemma Algorithm 2 Intermediate experiments for XoP Experiment C0 1: for j ← 1 to u do 2: R ← {0, 1}n 3: for i ← 1 to qm do 4: yij ←$ {0, 1}n \ {0} 5: Tij (yij ) ← {(u, v) : u, v ∈ R, u ̸= v, u ⊕ v = yij } 6: if Tij (yij ) > 0 then 7: (uj2i−1 , uj2i ) ←$ Tij (yij ) 8: else 9: (uj2i−1 , uj2i ) ← (⊥, ⊥) 10: 11: 12: R ← R \ {uj2i−1 , uj2i } zij ← (uj2i−1 , uj2i ) Zj ← (z1j , . . . , zqjm ) 13: return (Z1 , . . . , Zu ) Experiment C1 1: for j ← 1 to u do 2: R ← {0, 1}n 3: for i ← 1 to qm do 4: uj2i−1 ←$ R, R ← R \ {uj2i−1 } 5: uj2i ←$ R, R ← R \ {uj2i } 6: zij ← (uj2i−1 , uj2i ) 7: Zj ← (z1j , . . . , zqjm ) 8: return (Z1 , . . . , Zu ) holds for Experiment C0 in Algorithm 2. Lemma 21. If qm ≤ 2n−3 , Experiment C0 in Algorithm 2 never returns (⊥, ⊥). Proof. We suppose any j ∈ [u] and omit y for simplicity. If i = 1, it is trivial that |Ti (yi )| = 2n > 0. For 2 ≤ i ≤ qm , we have |R| = 2n − 2(i − 1). Note that (u, v) ∈ Ti (yi ) ⇒ (v, u) ∈ Ti (yi ). Therefore |Ti (yi )| ≥ 2n − 4(i − 1) > 2n−1 > 0 since i ≤ qm ≤ 2n−3 . ⊔ ⊓ As described above, we can regard an adversary in B0 and B1 as special cases of C0 and C1 if C0 never returns (⊥, ⊥), we have the following inequality: ∥pS0 (·) − pS1 (·)∥ = ∥pB0 (·) − pB1 (·)∥ ≤ ∥pC0 (·) − pC1 (·)∥. 44 (26) Concluding the proof with the Chi-squared method. Without loss of the generality, assume that each user makes qm queries. For i ∈ [q] where i = (j − 1)qm + k such that j ∈ [u] and k ∈ [qm ], the response of the i-th query is seen as zi = zkj . We can easily check that the support of pi−1 C1 (·) is contained in the support of pi−1 (·) for i = 1, . . . , q, allowing us to use the Chi-squared C0 method. Let Ω = {0, 1}n × {0, 1}n . For a fixed i ∈ {1, . . . , q}, let i ∈ [q] where i = (j − 1)qm + k such that j ∈ [u] and k ∈ [qm ]. Fix z ∈ Ω i−1 such that pi−1 C1 (z) > 0. We will compute 2 pzC1 ,i (z) − pzC0 ,i (z) . pzC0 ,i (z) X 2 χ (z) = z∈Ω such that pzC ,i (z)>0 0 Note that zi is independent with Z(j−1)qm . For y ∈ {0, 1}n , let Tkj (y) = Tkj (y) . Also note that Tkj (y) can be defined in both C0 and C1 . From the proof of Lemma 21, we have Tkj (y) ≥ 2n − 4(k − 1). Moreover, we see that pzC0 ,i (u, v) = 1 , − 1)Tkj (y) 1 1 = n pzC1 ,i (u, v) = n . (2 − 2k + 2)(2n − 2k + 1) (2 − 2k + 2)2 (2n Therefore, χ2 (z) = (2n − 1) Tkj (y) − X (u,v)∈Ω such that pzC0 ,i (u,v)>0 (2n −2k+2)2 2n −1 2 Tkj (y)(2n − 2k + 2)2 (2n − 2k + 1)2 2n − 1 32 · 3n n ≤ 9 2 (2 − 2k + 2)2 X Tkj (y) (u,v)∈Ω such that pzC ,i (u,v)>0 (2n − 2k + 2)2 − 2n − 1 2 0 n = 32 2 −1 · 9 22n (2n − 2k + 2)2 X y∈{0,1}n Tkj (y) − (2n − 2k + 2)2 2n − 1 2 . (27) since k ≤ qm ≤ 2n−3 . We claim the following lemma proved in Section 5.1.1. Lemma 22. It holds that h i (2n − 2k + 2) 2 Ex Tkj (y) = , 2n − 1 h i 2(2k − 2)(2k − 3)(2n − 2k + 2) 2 Var Tkj (y) = (2n − 1)2 (2n − 3) where the expectation and variance are taken over from the distribution of C1 . 45 From Equation (27) and Lemma 22, it follows that   2 n n X 2 32 2 −1 (2 − 2k + 2)2  Ex χ (z) ≤ · Ex  Tkj (y) − 9 22n (2n − 2k + 2)2 2n − 1 y∈{0,1}n h i 32 2n − 1 = · n n Var Tkj (y) 9 2 (2 − 2k + 2)2 32 2(2k − 2)(2k − 3) 32(k − 1)2 = · n n ≤ 9 2 (2 − 1)(2n − 3) (2n − 1)3 and finally, we have the following inequality, which concludes the proof:  12  !1 q qm u X X 2 2 2 1X 1 Ex χ (z) Ex χ (z)  = 2 i=1 2 j=1 ∥pC0 (·) − pC1 (·)∥ ≤ k=1  1 ≤ 2 qm u X X 32(k − 1)2 j=1 k=1 (2n − 1)3  21  ≤ 3 6uqm (2n − 1)3 12 . 5.1.1 Proof of Lemma 22 Let Ψ = {0, 1}n and fix j and k. Let Iψ where ψ ∈ Ψ be an indicator variable Iψ = 1 ⇔ ψ, ψ ⊕ y ∈ {0, 1}n \ {ujℓ }ℓ∈[2k−2] . Note that the variables {uℓj } are uniformly randomly sampled from {0, 1}n without replacement. Observe that X Tkj (y) = Iψ ψ∈Ψ and Ex [Iψ ] = (2n − 2k + 2)(2n − 2k + 1) . 2n (2n − 1) Thus, we have h i X (2n − 2k + 2)(2n − 2k + 1) Ex Tkj (y) = Ex [Iψ ] = . 2n − 1 ψ∈Ψ Now, we compute the following expectation  2   2 X   j Ex Tk (y) = Ex  Iψ   = Ex   X (ψ,ψ ′ )∈Ψ 2 ψ∈Ψ 46 Iψ Iψ ′  . (28) For ψ and ψ ′ , let r be the size of the following set {ψ, ψ ′ , ψ ⊕ y, ψ ′ ⊕ y}. We see that r ∈ {2, 4} since y ̸= 0; moreover, (2n − 2k + 2)r . (2n )r Ex [Iψ Iψ′ ] = For a fixed ψ ∈ Ψ , we have |{ψ ′ ∈ Ψ | r = 2}| = 2, |{ψ ′ ∈ Ψ | r = 4}| = 2n − 2. It follows that X 2(2n − 2k + 2)2 , Ex [Iψ Iψ′ ] = (2n )2 ′ ψ ∈Ψ, r=2 X ψ ′ ∈Ψ, r=4 2k − 2 Ex [Iψ Iψ′ ] = (2 − 2) 1 − n 2 −2 n 2k − 2 (2n − 2k + 2)2 1− n . 2 −3 (2n )2 hP i P P P ′ As Ex = (ψ,ψ′ )∈Ψ 2 Ex [Iψ Iψ′ ] = ψ∈Ψ ψ′ ∈Ψ Ex [Iψ Iψ′ ] (ψ,ψ ′ )∈Ψ 2 Iψ Iψ and the sum is divided into two cases according to the value of r, the sum of the expectations is given as   ! X X X n Ex  Iψ Iψ′  = 2 Ex [Iα Iψ′ ] + Ex [Iα Iψ′ ] . (29) ψ ′ ∈Ψ, r=2 (ψ,ψ ′ )∈Ψ 2 ψ ′ ∈Ψ, r=4 for an arbitrary constant α ∈ Ψ . Therefore, by Equation (29), we have   X 1 (2k − 2)2 (2n − 2k + 2)2 n 2 − (2k − 2) 2 + + Ex  Iψ Iψ′  = 2n − 1 2n − 3 2n − 3 (ψ,ψ ′ )∈Ψ 2 (2k − 2)2 (2n − 2k + 2)2 n 2 − 4k + 4 + n . = 2n − 1 2 −3 for a fixed ψ. By Equation (28), conclude that     2 h i X X j Var Tk (y) = Ex  Iψ Iψ′  − Ex  Iψ  (ψ,ψ ′ )∈Ψ 2 ψ∈Ψ (2n − 2k + 2)2 (2k − 2)2 (2n − 2k + 2)2 n = 2 − (4k − 4) + − 2n − 1 2n − 3 2n − 1 n (2 − 2k + 2)2 2(2k − 2)(2k − 3) = · n 2n − 1 (2 − 1)(2n − 3) 2(2k − 2)(2k − 3)(2n − 2k + 2)2 = (30) (2n − 1)2 (2n − 3) 47 By Equations (28) and (30), the proof completes. 5.2 Proof of Multi-User Security of XoP via the Squared-Ratio Method Thanks to the squared-ratio method (Theorem 2), considering an adversary D making at most qm queries in the information-theoretic setting suffices. The queries made by D are x1 , ..., xqm ∈ {0, 1}n−1 , which are assumed to be all different without loss of the generality. In this way, D obtains a transcript τ = ((x1 , λi ), · · · , (xqm , λqm )). def In the real world, XoP[P](xi ) = Pγi ⊕ Pγi′ , where P is a given n-bit (keyed) PRP function, and {Pγ1 , Pγ1′ , · · · , Pγn , Pγn′ } should be a solution to the following equation system  Pγ1 ⊕ Pγ1′ = λ1 ,    Pγ2 ⊕ Pγ ′ = λ2 , 2 Γ =: ..   .    Pγqm ⊕ Pγq′ m = λqm . This induces the transcript graph G(τ ) in the real world. Bad transcript analysis. Recall Theorem 4. To upper bound |Ri | corresponding to this system in the ideal world, where each λj takes an independent random value sampled from {0, 1}n \ {0}, we define a bad event as follows: bad: ∃(i1 , · · · , in ) ∈ [qm ]∗n such that λi1 = · · · = λin . We have Pr[bad] = qm n (2n − 1)n−1 ≤ n!(2n q n n qm m ≤ . n−1 − 1) 2n because n! ≥ 2n+1 and 2 · (2n − 1)n−1 ≥ (2n )n for n > 12. We say that the transcript is good if it is not bad. Good transcript analysis. Now we focus on the good transcript. Let Tid and Tre be random variables following the distribution of the transcripts in the real world and the ideal world, respectively. Then, we have Pr[Tre = τ ] h(G)(2n − 1)qm = , Pr[Tid = τ ] (2n )2qm because in the real world, the transcript can occur with a probability proportional to the number of solutions h(G) over the choices of Pγi , Pγi′ . On the other hand, in the ideal world, every transcript occurs equally, i.e., with probability 1 (2n −1)qm . Recall the following definition of Ri , where |Ri | ≤ n for all i ∈ [qm ] because of ¬bad in our case: Ri = (V1 , V1′ , V2 , V2′ ) ∈ Ci ∗2 × Cj ∗2 j < i and λ(V1 , V1′ ) = λ(V2 , V2′ ) . 48 Let Ij,k be P a random variable that equals 1 if λi = λj and 0 otherwise. Then it P qm |Ri | = j,k Ij,k as a random variable, which will be used later. holds that i=1 The bound of ¬bad implies that Pqm 2 2 |Ri | 3qm 3 i=1 10(n + 1)2 3nqm 3qm 10(n + 1)2 + + ≤ + + ≤1 2n 22n 2n 2n 22n 2n for n > 12 and qm ≤ 2n 4n . Therefore, by Theorem 4, Pqm |Ri | 6qm 2 6 i=1 h(G)(2n − 1)qm 20(n + 1)2 ≤ + 2n + . − 1 n n (2 )2qm 2 2 2n n Conclude the proof. Define ϵ2 = q2m and n Pqm 6 i=1 |Ri | 6qm 2 20(n + 1)2 ϵ1 (τ ) = + + . 2n 22n 2n To apply Theorem 2, we need to bound the expectation of ϵ1 (τ )2 where the randomness is taken over the distribution of the ideal world. We apply Lemma 6, and using Lemma 5 for bounding the expectations relevant to Ri , we have hP i 2 qm 108Ex ( |R |) 4 i i=1 108qm 1200(n + 1)4 Ex ϵ1 (τ )2 ≤ + + 22n 24n 22n 2 4 4 108qm 108qm 108qm 1200(n + 1)4 ≤ 2n n + 2n n + + 2 (2 − 1) 2 (2 − 1)2 24n 22n 4 4 1200(n + 1) 318qm + . ≤ 4n 2 22n Applying Theorem 2 and plugging in the above inequality, we have -prf (u, q ) ≤ Advmu m XoP ∗ q 1 2uEx [ϵ21 ] + 2uϵ2 ≤ 1 2 49u 2 (n + 1)2 26u 2 qm + . 2n 2 2n ⊔ ⊓ This completes the proof. 6 Multi-User Security of nEHtM This section proves the multi-user MAC security of the nonce-based Enhanced Hash-then-mask (nEHtM) scheme proposed by [23]. Let H be a (n−1)-bit output δ-AXU hash function and let P be an n-bit permutation. For given inputs a message M and an (n − 1)-bit nonce N , nEHtM = nEHtM[H, P] outputs a tag T defined as follows: T = nEHtM(N, M ) := P(0∥N ) ⊕ P(1∥HKh (M ) ⊕ N ). An adversary A for the nEHtM makes two types of queries: MAC queries that compute the tags given inputs messages and nonces, and verification queries 49 that take a tuple of a nonce, a message, and a candidate tag (N ′ , M ′ , T ′ ) as inputs and is returned b ∈ {0, 1}, where b = 1 if and only if the equation nEHtM(N ′ , M ′ ) = T ′ holds. The main result of this section is summarized as follows. Theorem 11. Let n ≥ 20 be a positive integer. Let δ > 0 and H : K × M → {0, 1}n−1 be a δ-AXU hash function family. Let u, qm , vm , and µm be positive mu−mac 2 2 integers such that uqm δ ≤ 1 and 32µm qm ≤ 1. Then, AdvnEHtM (u, µm , qm , vm ) is bounded by 4 12 2 2 3 12 √ uqm δ n uµm qm δ 140n uµ2m + 129uvm δ + 149 · + 80 · + 2n 22n 22n 1 1 1 2 2 5 2 3 3 2 n2 u2 µ2m qm δ 3 n2 u2 qm δ u µm qm + 153 · + 155 · + 12 · 23n 22n 22n 72uµ2m δ Assuming δ = ℓ 2n for some ℓ ≥ 1, we have the following asymptotic bound: 1 O 3 1 1 2 2 ℓu(nµ2m + vm ) ℓ 2 nuµm qm ℓ 2 u 2 qm + + + 3n 3n n 2 22 22 2 ℓn2 u2 µ2m qm 3n 2 13 + 5 ℓ2 n2 u2 qm 4n 2 31 ! Plugging µm = 0, vm = 0 in this bound matches the nonce-respecting security bound, resulting in the asymptotic bound 4 12 2 5 13 ! uqm u qm Õ + , 23n 24n ignoring small factors, which is more carefully dealt in Section 6.4. Figure 2 shows the graphical comparison between our bounds and the previous bounds [11, 16] in this setting. We further explore the multi-user security of nEHtM with hash functions with a stronger property, dubbed a pairwise δ-almost XOR universal: for any M1 ̸= M1′ and M2 ̸= M2′ in M such that {M1 , M1′ } = ̸ {M2 , M2′ } and X1 , X2 ∈ X , it holds that Pr [HKh (M1 ) ⊕ HKh (M1′ ) = X1 ∧ HKh (M2 ) ⊕ HKh (M2′ ) = X2 ] ≤ δ 2 . $ K← −K mu−mac In this setting, we obtain a much better bound on AdvnEHtM (u, µm , qm , vm ) of 2 2 2 1/3 2 2/3 2 6 1/3 ! √ 4 uµ2m + uvm uqm u µm qm uqm u qm Õ + 3n + + + n 3n 2n 2 2 2 2 25n ignoring polynomial factors of ℓ and n for δ = O(ℓ/2n ). For the mu PRF security, we have the following security bound assuming the strong hash functions: 2 2/3 2 6 1/3 ! √ 4 uqm uqm u qm mu-prf ∗ AdvnEHtM (u, qm ) = Õ + + . 23n 22n 25n 50 . log2 qmax log2 qmax 3n 0.7n 4 0.75n 2n 3 0.5n 0.5n log2 u 0 n 0.5n 3 n 2n 2n 5 0 0.4n log u 2n 2 n Fig. 2: Comparison of the multi-user security bounds (in terms of the threshold number of queries per user) as functions of log2 u. We neglect the polynomial terms of ℓ and log n in the graphs. We assume vm = µm = 0 for a fair comparison. The solid line represents our bounds in both graphs. In the left graph, the blue dashed line (resp. the red dash-dotted line) represents the security bound obtained by the hybrid argument where q = qm (resp. q = uqm ). On the other hand, in the right graph, the blue dashed line corresponds to the result of [11] with our correction in Section 6.5. The red dash-dotted line in the right graph corresponds to the claimed security bound in [11], which was buggy. Assuming δ-AXU(2) , the dash-dotted line is recovered, while the densely dotted line can be proven with the method in this paper. In the remainder of this section, we prove Theorem 11 using Theorems 2 and 3. The proof sketch of the nonce-respecting setting can be found in Section 6.4. The stronger bound with a pairwise δ-AXU and the discussion on the previous nEHtM2 security proof [11] is placed in Section 6.5. Before starting the proof, some observations are in order. First, we always 3n/4 n 2n 2n 20.5n √ , 2 , nuqm δ < 2n , and vm ≤ 128 , µm ≤ 12 , assume that qm ≤ 2 8 ≤ 256 n 32qm otherwise the right hand side of the advantage becomes ≥ 1, and nothing to prove. Second, we do not intend to optimize the constant factors in the proof and sometimes even give up on optimizing the small factors ℓ and n. The constants between the inequalities may be chosen as a rough upper bound. 6.1 Bad and Good Transcripts The queries of the adversary can be represented by the MAC queries and the verification queries as follows τm = (Ni , Mi , Ti )1≤i≤qm , τv = Nj′ , Mj′ , Tj′ , b′j 1≤j≤v m where Ti = nEHtM(Ni , Mi ) and b′j = 1 if and only if Tj′ = nEHtM(Nj′ , Mj′ ). The overall transcript is τ = (τm , τv , K) where we assume that the key K is given at the end of the attack for free, which only makes the adversary stronger. We additionally define Xj′ := HKh (Mj′ ) ⊕ Nj′ Xi := HKh (Mi ) ⊕ Ni , 51 for i = 1, ..., qm and j = 1, ..., vm . In the real world, these values should obey the following system of equations when the adversary fails to forge the MAC:   P(0∥N1 ) ⊕ P(1∥X1 ) = T1 , P(0∥N1′ ) ⊕ P(1∥X1′ ) ̸= T1′ ,         P(0∥N2 ) ⊕ P(1∥X2 ) = T2 , P(0∥N2′ ) ⊕ P(1∥X2′ ) ̸= T2′ , and .. ..     . .       ′ P(0∥Nqm ) ⊕ P(1∥Xqm ) = Tqm , P(0∥Nvm ) ⊕ P(1∥Xv′ m ) ̸= Tv′ m . We identify {P(0∥Ni )}i ∪ {P(0∥Nj′ )}j with a set of unknowns P = {P1 , ..., Pq1 } for q1 ≤ qm + vm and similarly identify {P(1∥Xi )}i ∪ {P(1∥Xj′ )}j with a set of unknowns Q = {Q1 , ..., Qq2 } for some q2 ≤ qm + vm . We define the corresponding transcript graph G(τ ) = (V, E) for V = P ⊔ Q. Here the set E includes the following edges: For i = 1, ..., qm , P(0∥Ni ) ∈ P and P(1∥Xi ) ∈ Q are connected with a (Ti , =)-labeled edge. Similarly, for i = 1, ..., vm , P(0∥Ni′ ) ∈ P and P(1∥Xi′ ) ∈ Q are connected with (Ti′ , ̸=)-labeled edge. Therefore, the transcript graph G(τ ) is a connected bipartite graph with two independent sets P and Q. In the ideal world, the tags Ti should be a uniformly random element in {0, 1}n \{0} and independent from each other; we again stress that the punctured point 0 is important of our argument. On the other hand, the candidate tags Tj′ are arbitrarily chosen by the adversary from {0, 1}n \{0} even in the ideal world.4 We will compare the difference between the real and ideal worlds regarding the transcript graph G(τ ). Notations. Fix a transcript τ so that each Ni , Xi is determined. In the graph G = (τ ), for each (n − 1)-bit string X ∈ {0, 1}n−1 , we define the degree of X, denoted by dX , by the number of i ∈ [qm ] such that Xi = X. We call (i1 , i2 , ...) ∈ [qm ]∗j for some j by a length-j X-trail, which means that it starts from a vertex corresponding to X (see Equation (31)), if (Ni1 = Ni2 ) ∧ (Xi2 = Xi3 ) ∧ ... holds. An X-trail can be interpreted as a trail of P(1∥Xi1 ) − P(0∥Ni1 ) = P(0∥Ni2 ) − ..., or Xi1 − Ni1 = Ni2 − ... (31) and similarly define N -trails. (A trail can be both X- and N -trail.) We ambiguously call them trails. Note that a trail (i, j) satisfies Ni = Nj or Xi = Xj , and 4 The adversary can choose Tj′ = 0. However, the verification query always rejects such a choice, so we ignore this case. 52 is of length-2. For a trail γ = (i1 , ..., ij ), the label of γ is defined by M λ(γ) = Tik , k∈[j] which is equal to λ(V0 , Vℓ ) for the first and last vertices of γ in the mirror theory. If λ(γ) = λ(γ ′ ), we say that two trails γ, γ ′ are a collision pair. A set of trails {γ1 , ..., γk } is called by a k-collision if all λ(γi ) are equal for all i ∈ [k]. If λ(γ) = 0, γ is called by a null trail. Bad Transcripts. We first define bad transcripts. Let L1 , L2 ≥ 2 be fixed positive integers. Recall qc denotes the number of edges included in the components of size ≥ 3, and dX for X ∈ {0, 1}n−1 denotes the number of i ∈ [qm ] such that Xi = X. We say that the transcript τ is bad if any of following conditions the 20.5n 2n holds. We will choose constants so that L1 , L2 ≤ min 32qm , 24√n . – bad1 : ∃(i, j) ∈ [qm ]∗2 such that for some k, ℓ ∈ [qm ]2 with k ̸= i, ℓ ̸= j: (Nk = Ni ) ∧ (Xi = Xj ) ∧ (Nj = Nℓ ). – bad2 = bad2a ∨ bad2b , where: • bad2a : P |{i ∈ [qm ] : Xi = Xj ∧ Nj = Nk for some j ̸= i, k ̸= j}| ≥ L1 • bad2b : X∈{0,1}n−1 ,dX >1 d2X ≥ L22 . – bad3 = bad3a ∨ bad3b ∨ bad3c , where: • bad3a : ∃ a null trail (i, j) ∈ [qm ]∗2 of length 2, i.e., Ti ⊕ Tj = 0. • bad3b : ∃ a null trail of length 3. • bad3c : ∃ a null trail of length 4. – bad4 = bad4a ∨ bad4b , where • bad4a : ∃(i, j) ∈ [qm ] × [vm ] such that (Ni , Xi , Ti ) = (Nj′ , Xj′ , Tj′ ). • bad4b : ∃(i, j, k, ℓ) ∈ [qm ]∗3 × [vm ] such that (i, j, k) is an N -trail and (Xk = Xℓ′ ) ∧ (Nℓ′ = Ni ) ∧ (Ti ⊕ Tj ⊕ Tk ⊕ Tℓ′ = 0). – bad5 = bad5a ∨ bad5b ∨ bad5c ∨ bad5d , where: • bad5a : ∃ a n-collision of length 1 trails. • bad5b : ∃ a n-collision of length 2 N -trails. • bad5c : ∃ a n-collision of length 2 X-trails. • bad5d : ∃ a n-collision of length ≥ 3 trails. 22n – bad6 : qc ≥ 186q 2 . m Interpretations of bad events. We make the following interpretations and implications of the bad events, which are used in the analysis multiple times. The detailed description and analysis are deferred to the end of Section 6.3. Fact 1 If ¬bad1 , then it holds that 1. every length-4 trail is N -trail, 2. ∄ length-5 trail, 53 3. ∄ cycles in G = (τ ), 4. ∄(i, j) ∈ [qm ]∗2 s.t. (Ni = Nj ) ∧ (Xi = Xj ). Furthermore, each component C of G = (τ ) of size ≥ 3 can be understood as a tree, which we call tree≥3 , (See Figure 3) with a special vertex N0 called as a root. Every vertices with degree 1 in the tree is called by a leaf. N1 T1 X1 = X2 X3 = X4 = X5 T2 X6 T6 T9 N8 T8 X8 = X9 N3 T4 T5 N0 T3 X7 T7 N4 T11 X10 = X11 T10 N10 Fig. 3: An example of tree≥3 . In each edge, the tag Ti corresponds to the query P(0∥Ni ) ⊕ P(1∥Xi ), where Xi and Ni are written in each vertex. The root is N0 , which is equal to N2 , N3 , N6 , N7 , N9 , and N11 . N1 , N3 , N4 , X6 , X7 , N8 , and N10 are leaves. Fact 2 If ¬bad1 , ¬bad3 and ¬bad4 , then G(τ ) is nice. Fact 3 If ¬bad1 and ¬bad2 , the following upper bounds hold: – The number of all vertices in all tree≥3 is less than or equal to 3L1 + µm . – dX ≤ L2 for all X ∈ {0, 1}n−1 and ξmax ≤ 2L1 + 2L2 + µm . Furthermore, n 2n ξmax qm ≤ 5·2 32 ≤ 4 holds. – The number of length-2 N -trails is bounded by L22 /2. – The number of length-2 X-trails is bounded by 2µ2m (regardless of bad2 ). – Recall the notations from eq. (4). The number P of trails in C1 , ..., Cα is bounded α by 2µ2m + 9L21 + 0.5L22 . Further, it holds that i=1 c2i ≤ 18L21 + L22 + 4µ2m ≤ n 2 2 min 4.5ξmax , 16n . Fact 4 If ¬bad1 and ¬bad3 , a collision pair of two trails does not start from the same vertex. More strongly, for a ℓ-collision {γ1 , ..., γℓ }, there exist a set of indices {i1 , ..., iℓ } such that for each j, ij is included in γj but not included in γk for all k < j. 2 Fact 5 If H is a δ-AXU hash function, Ex [qc ] ≤ 2µm + qm δ and Ex qc2 ≤ 3 8µ2m + 2qm δ. Bad transcript analysis. The probability Pr[bad] is bounded as follows: ϵ2 := 2 4 ℓ(7µ2m + 2vm ) 3ℓqm L1 3ℓµm qm 3ℓq 2 372ℓqm + + n + n m2 + . n 2n 3n 2 2 2 L1 2 L2 2 54 (32) The detailed analysis is deferred to Section 6.3. Good transcript analysis. We now assume that the transcript is good, i.e., no bad events occur. Recall Ri is defined as follows for i ∈ [α + β]: Ri = (V1 , V1′ , V2 , V2′ ) ∈ Ci ∗2 × Cj ∗2 j < i and λ(V1 , V1′ ) = λ(V2 , V2′ ) , which is a missing term in the above analysis. We divide it into two sets: n o Si := (V1 , V1′ , V2 , V2′ ) ∈ Ri V1 V1′ , V2 V2′ ∈ E , Di = Ri \ Si . Pα+β Let S := i=1 |Si |. Since ∪i∈[α+β] Si is the number of collisions of the independent uniform random tags over {0, 1}n \{0} among edges, we can invoke Lemma 5 to obtain ( 2 qm q2 2 2 if 2m < B, qm B Ex [S] ≤ , Ex S ≤ q4 (33) m 2B otherwise, 2 2B n where B = 2 P − 1. Also, by ¬bad5 , it holds that S ≤ nqm . α Let C := i=1 c2i . Consider Di for i ≤ α. For each (V1 , V1′ ) ∈ Ci∗2 , ¬bad5 asserts that there are at most 4n different (V2 , V2′ ) ∈ V ∗2 with the same label with λ(V1 , V1′ ). On the other hand, for i > α, a pair of vertices (V2 , V2′ ) such that (V1 , V1′ , V2 , V2′ ) ∈ Di must be included in Cj for j ≤ α, because it is included in Si otherwise. For each (V2 , V2′ ) ∈ Cj∗2 for j ≤ α, there are at most n different i > α such that (V1 , V1′ , V2 , V2′ ) ∈ Di for some V1 , V2 . Therefore, we have α+β X i=1 |Di | ≤ 5n α X ci i=1 2 ≤ 3nC. We consider the following upper bound before invoking Theorem 3. From now on, we occasionally give colors on some terms to denote the corresponding upper (or lower) bounds in the following (in)equalities, for making one easily chase the transitions of the terms. Pα+β 2 4 2S + 2( i=1 |Di |) + 2C + 18vm 2Cqc2 + 31qc qm 20qm + + 2n 22n 23n 2 2 2 4 9ξ · q + 31qc qm 20q 2nqm + 7nC + 18vm ≤ + max m2n + 3nm 2n 2 2 2n 2 2 2 1 7 18 225 20 31qm 186qm ≤ + + + 2 + 4 + 2n 128 16 128 32 8 2 ≤1 where we use the inequalities from Fact 3 and the upper bounds of qc from 3n/4 n ¬bad6 , qm ≤ 2 8 ≤ 212 . By Theorem 3, it holds that 2 4 h(G)(2n − 1)q 4S + 14nC + 36vm 4Cqc2 + 62qc qm 40qm − 1 ≤ + + =: ϵ1 (τ ). (2n )|V| 2n 22n 23n (34) 55 6.2 Proof of Theorem 11 We will use Theorem 2 to prove the main theorem in this section given Equations (32) and (34). The remaining part is to give an upper bound of ϵ1 (τ )2 to prove the main theorem and optimize the parameters L1 , L2 appropriately. By ¬bad6 , Fact 5, Equation (33), and the assumptions on the parameters, especially all terms are less than 1, the expectations of the squared terms can be bound as follows: " 2 # 2 4 4 qm qm qm + ≤ 3n 2n 2 2n B·2 2B · 2 2 " " 2 # 2 # 3 δ) nC nξmax qc (n(2L1 + 2L2 + µm ))2 (8µ2m + 2qm Ex ≤ Ex ≤ n n 2n 2 2 2 " # 2 3 4Cqc2 4Cqc2 4(5L1 + L2 + 2µm )2 (8µ2m + 2qm δ) Ex ≤ Ex ≤ 2n 2n 2n 2 2 2 " 2 # 2 2 4 4 62qc qm 62 · 8µ2m qm 622 · 2qx2 qm Ex ≤ Ex + 22n 24n 24n 2 4 4 42qx qm 53qm 11qm + ≤ ≤ Ex 23n 22n 23n Ex S 2n ≤ In the last inequality, we invoke the notation used in the proof of Fact 5, where 2n 20.5n 2 √ and qx ≤ qc ≤ 2 2 qc ≤ 2µm + qx and Ex [qx ] ≤ qm δ. We also use µm ≤ 12 186qm n p by ¬bad6 . We derive an upper bound of 2Ex [ϵ1 (τ )2 ] using Lemma 6 as follows: 3 2 70n(2L1 + 2L2 + µm )(4µ2m + qm δ)0.5 125vm 139q 4 29qm + + + 3nm 1.5n n n 2 2 2 2 2 1.5 3 125vm + 140nµ2m 32qm + 70ℓ0.5 nµm qm 140n(L1 + L2 )(4µ2m + qm δ)0.5 ≤ + + n 1.5n n 2 2 2 where we use qm ≤ 23n/4 8 , 2n/8 > n. 56 Combining with Equation (32), the overall security bound 2uϵ2 from Theorem 2 is given by p 2uEx [ϵ1 (τ )2 ] + 2 4 6ℓuqm L1 6ℓuµm qm 744ℓuqm 14ℓuµ2m + 4ℓuvm 6ℓuq 2 + + + + n m 2 n 2n n 3n 2 2 2 L1 2 L2 2 √ 2 √ 2 √ √ 0.5 1.5 32 uqm + 70ℓ n uµm qm 125 uvm + 140n uµm + + 1.5n 2n 2 √ 1.5 0.5 314n(L1 + L2 ) u max µm , qm δ + n √ √ 22 √ 2 1.5 60ℓ0.5 uqm 70ℓ0.5 n uµm qm (14ℓu + 140n u)µm + 129ℓuvm + + ≤ n 1.5n 1.5n 2 2 2 √ 1.5 0.5 2 δ 314nL1 u max µm , qm 6ℓuqm L1 6ℓuµm qm + + + 22n 2n 2n L1 √ 1.5 0.5 314nL2 u max µm , qm δ 6ℓuq 2 + + n m 2 2 L2 2n 744ℓuq 4 where we use 23n m ≤ 1. We balance the last equation by choosing L21 = 3ℓuµm qm , 1.5 δ, 3q 2 δ) max(157nu0.5 µm , 157nu0.5 qm m (35) and L2 =   2 3ℓu0.5 qm 1.5 δ) 157n max(µm ,qm  2n 32qm 13 3 if qm < 22n , (36) 3 if qm ≥ 22n . 3 We consider two cases separately. If qm < 22n , this gives the final advantage upper bound by √ √ 2 √ 1.5 (14ℓu + 140n u)µ2m + 129ℓuvm 60ℓ0.5 uqm 70ℓ0.5 n uµm qm + + 2n 21.5n 21.5n + 1 1 3 1 2 2 3 1 5 3 0.5 1.5 4 87ℓ 2 n 2 u 4 µm qm 87ℓ 4 n 2 u 4 µ0.5 12uµ0.5 m qm m qm + + 5n n 1.5n 2 2 24 2 2 2 2 5 2 3 3 3 87ℓ 3 n 3 u 3 µm qm 87ℓ 3 n 3 u 3 qm + + 4n n 2 23 √ √ 2 √ 1.5 (72ℓu + 140n u)µ2m + 129ℓuvm 61ℓ0.5 uqm 70ℓ0.5 n uµm qm ≤ + + 2n 21.5n 21.5n 1 2 2 2 2 2 2 2 5 1.5 3 3 3 12uµ0.5 153ℓ 3 n 3 u 3 µm qm 158ℓ 3 n 3 u 3 qm m qm + + + 4n 1.5n n 2 2 23 57 11.1ℓnuq 2.5 222ℓuq 2.5 m ≤ where we use 22n m ≤ 22n ity to suppress some terms as follows: 1 2 2 2 2 2 2 5 2.5 11.1ℓnuqm 22n 2 1 2/3 1 and the AM-GM inequal1 3 3 3 2 ℓuµ2m 3ℓ 3 n 3 u 3 µm qm 4ℓ 2 n 2 u 4 µm qm + ≥ , 2n 2n 2n 3 1 3 1 5 3 2 4 ℓuµ2m 3ℓ 3 n 3 u 3 qm 4ℓ 4 n 2 u 4 µm qm + ≥ . 4n 5n n 2 23 24 q 1.5 3 Now we consider the case qm ≥ 22n . First, observe that 1 ≤ 2mn . The overall advantage upper bound becomes: √ √ 2 √ 1.5 (14ℓu + 140n u)µ2m + 129ℓuvm 60ℓ0.5 uqm 70ℓ0.5 n uµm qm + + n 1.5n 1.5n 2 2 2 1 1 3 1 2 2 1 3 5 3 0.5 1.5 4 87ℓ 2 n 2 u 4 µm qm 87ℓ 4 n 2 u 4 µ0.5 12uµ0.5 m qm m qm + + 5n n 1.5n 2 2 24 √ √ 0.5 0.5 4 10n uµm 10ℓ n uqm 6144ℓuqm + + + 23n qm 20.5n √ √ 2 √ 1.5 (72ℓu + 140n u)µ2m + 129ℓuvm 149ℓ0.5 uqm 80ℓ0.5 n uµm qm ≤ + + n 1.5n 1.5n 2 2 2 + + 2 2 2 2 5 2 0.5 1.5 3 3 3 qm 12uµm qm 66ℓ 3 n 3 u 3 qm 66ℓ 3 n 3 u 3 µm + + 4n n 1.5n 2 2 3 2 where we use the above application of the AM-GM inequality and √ √ √ 0.5 1.5 n uµm qm n uµm qm n uµm ≤ ≤ qm 2n 21.5n √ 0.5 √ 2 0.5 0.5 ℓ n uqm ℓ n uqm ≤ 20.5n 21.5n Taking the maximum of both, we have the advantage upper bound as follows: √ √ 2 √ 1.5 (72ℓu + 140n u)µ2m + 129ℓuvm 149ℓ0.5 uqm 80ℓ0.5 n uµm qm + + n 1.5n 1.5n 2 2 2 1 + 2 2 2 2 2 2 2 5 1.5 3 3 3 qm 153ℓ 3 n 3 u 3 qm 153ℓ 3 n 3 u 3 µm 12uµ0.5 m qm + + 4n n 1.5n 2 2 23 This concludes the concrete security of the main theorem. Sanity check. Our choices 1, L2 for the optimizations should obey the q of L 2n 2n conditions L1 , L2 ≪ min n , qm . Recall we choose them according to Equa 1 tions (35) and (36). Since max(x,y,z) ≤ min x1 , y1 , z1 , it suffices to check one of the choice make L1 , L2 satisfy the condition. For L1 , choosing nu0.5 µm among three choices for the maximum gives 0.5 ℓu qm L21 = O n 58 n 2 which is much smaller than 2n because of the assumption uqm ≤ (2n /ℓ)2 . Also, 0.5 1.5 choosing m = nu qm δ gives L21 = O ℓ0.5 u0.5 20.5n 0.5 nqm n 2 3 which is smaller than q2m . This is because it is equivalent to ℓuµ2m qm ≪ n2 23n , which is true because of the condition µm qm ≪ 2n . 3 1.5 0.5 For L2 , if qm < 22n , choosing qm δ for the maximum gives L32 = O 2 ℓu0.5 qm 1.5 0.5 nqm δ =O 0.5 0.5n ℓ0.5 u0.5 qm 2 n 1.5n which is smaller than 2n1.5 because it is equivalent to ℓnuqm ≪ 22n . This is also n 3 7 ≪ n2 25n . This is true because smaller than q2m , which is equivalent to ℓuqm 2n qm 4 3 of uqm ≪ 23n . If qm ≥ 22n , 6.3 ≪ 0.5n 2√ n apparently holds. Bad Transcript Analysis and Interpretations We give an upper bound of the probability that the event bad = bad1 ∨ bad2 ∨ bad3 ∨ bad4 ∨ bad5 occurs in the ideal world. Recall that µm is the upper bound of the number of faulty queries, and H is a δ-AXU hash function for δ = ℓ/2n and B = 2n − 1. The following fact can be easily shown by an inductive argument: for k ≥ 1 and uniform and independent random variables T1 , ..., Tk sampled from {0, 1}n \ {0}, it holds that for any K ∈ {0, 1}n  Pr   M Ti = K  ≤ i∈[k] 1 1 = . 2n − 1 B (37) We now analyze the probability that each bad event occurs. We assume that n ≥ 20, and µm , L1 , L2 ≥ 1. The detailed conditions on the parameters will be explicitly described after analyzing each bad event. bad1 : The number of indices i ∈ [qm ] such that there exists k(̸= i) ∈ [qm ] such that Ni = Nk is bounded by 2µm . Thus, there are at most 4µ2m pairs of (i, j) ∈ [qm ]∗2 satisfying the condition. For each (i, j), the probability that Xi = Xj , or equivalently HKh (Mi ) ⊕ HKh (Mj ) = Ni ⊕ Nj is at most δ because H is a δ-AXU. By the union bound, we have Pr[bad1 ] ≤ 4µ2m δ. 59 bad2a : Fix i ∈ [qm ]. There are at most 2µm choices of j since it is a repeated nonce. For each j, the probability that Xi = Xj is at most δ, and the probability that i satisfies the condition is at most 2µm δ. Therefore, the expected size of the given set is at most 2µm qm δ, and by Markov’s inequality, we have Pr[bad2a ] ≤ 2µm qm δ . L1 bad2b : Recall the graph-theoretic interpretation; dX is the number of indices i ∈ [qm ] such that Xi = X. Let Col be the number of i < j such that Xi = Xj , 2 whose expectation is less than qm δ/2 because of the δ-AXU property of H. On the other hand, it holds that X X dX Col = ≥ d2X /4, 2 n−1 X:dX >1 X∈{0,1} where we use dX − 1 ≥ dX /2 for dX > 1. By Markov’s inequality, we have " # X 2q 2 δ 2 2 Pr[bad2b ] = Pr dX ≥ L2 ≤ Pr[4Col ≥ L22 ] ≤ m2 . L2 X:dX >1 bad3a : Assume that ¬bad1 so that i ̸= j satisfies at most one of Ni = Nj or Xi = Xj . We consider the following two cases: 1) Ti = Tj and Ni = Nj : The number of pairs (i, j) such that Ni = Nj is at most 2µ2m (Fact 3), and the probability that Ti = Tj is 1/B. 2) Ti = Tj and Xi = Xj : For each i, j, the probability that Xi = Xj is bounded by δ, and Ti = Tj is 1/B and two events are independent. By the union bound, we have Pr[bad3a |¬bad1 ] ≤ 2 2µ2m + δqm . B bad3b : Suppose that there exist indices (i, j, k) such that Ni = Nj and Xj = Xk . The number of (i, j) such that Ni = Nj is at most 2µ2m . For each k, the two events that Xj = Xk and Tk = Ti ⊕ Tj are independent, and the probabilities for them are bounded by δ and 1/B. By the union bound, we have Pr[bad3b ] ≤ 2δµ2m qm . B bad3c : Assume that ¬bad1 and ¬bad2a . By Fact 1, the length-4 trail (i, j, k, ℓ) must be N -trail, i.e., Xi = Xj , Nj = Nk , and Xk = Xℓ holds. For each pair (i, j) ∈ [qm ]∗2 , the probability that Xi = Xj is bounded above by δ due to H. Observe that (ℓ, k, j) satisfies Xℓ = Xk and Nk = Nj , which makes at most L1 different choices of ℓ. Because of the structure of the graph (Figure 3), k is deterministic for given (i, j, ℓ). Since the probability that (i, j, k, ℓ) becomes a null trail is at most 1/B, by the union bound, we have Pr[bad3c ] ≤ 60 2 qm δL1 . B bad4a : Assume that ¬bad3a . It implies that there is no i ̸= k such that Ni = Nk and Ti = Tk . For each verification query (Nj′ , Mj′ , Tj′ ), there is at most one MAC query (Ni , Mi , Ti ) such that Ni = Nj′ and Ti = Tj′ holds. For such a pair (i, j), the probability that Xi = Xj′ is bounded above by δ because of H. By the union bound, we have Pr[bad4a |¬bad3a ] ≤ vm δ. bad4b : We assume ¬bad1 and ¬bad2 . Let (Nℓ′ , Mℓ′ , Tℓ′ ) be a verification query included in a length-4 cycle described in the condition. For any (i, j, k, ℓ) satisfying the condition, (i, j, k) should be a length-3 N -trail and Ni should be a leaf with the root Nj because of Fact 1. Thus, given a fixed ℓ, there is a unique pair (i, j) ∈ [qm ]∗2 such that Ni = Nℓ′ and Xi = Xj holds and (i, j, k ∗ ) becomes a trail for some k ∗ (otherwise violating Fact 1). Fix (ℓ, i, j). For each k ∈ [qm ], the probability that Xk = Xℓ′ and Ti ⊕Tj ⊕Tk ⊕Tℓ′ = 0n are independent and at most δ and 1/B, respectively. Therefore, regardless of Nj = Nk , we have the following bound using the union bound: Pr[bad4b |(¬bad1 ) ∧ (¬bad2 )] ≤ vm q m δ . B bad5a : Since the values Ti are independent of each other in the ideal world and there are qnm different pairs (i1 , ..., in ), we have qm q n m n Pr[bad5a ] ≤ n−1 ≤ B B where we used n! ≥ 2n − 1 for n ≥ 4 in the middle. bad5b : Assume that ¬bad1 , ¬bad2 and ¬bad3 . Define B be a set of all collections of different n trails of length 2. By Fact 3, we have 2 L2 /2 L2n |B| ≤ ≤ 2 B n and we can show that Pr[Tij ⊕ Ti′j = Ti1 ⊕ Ti′1 ] ≤ 1/B for each (j, j ′ ) by Equation (37) and Fact 4. It gives 2 n |B| L2 Pr[bad5b |¬bad1,2,3 ] ≤ n−1 ≤ . B B bad5c : Assume that ¬bad1 ,¬bad2 and ¬bad3 . A similar argument shows that 2 n 2µm . Pr[bad5c |¬bad1,2,3 ] ≤ B bad5d : Assume that ¬bad1 ,¬bad2 , and ¬bad3 . Each trail of length ≥ 3 should be included in C1 , ..., Cα . Using Fact 3, A similar argument shows that 2 n 2µm + 9L21 + 0.5L22 Pr[bad5d |¬bad1,2,3 ] ≤ . B 61 bad6 : Recall Fact 5. For t > 2µm , by Markov’s inequality, it holds that Pr[qc ≥ 2t] ≤ Pr[qc − 2µm ≥ t] ≤ By setting t = 22n 2 , 372qm 2 qm δ . t we have 4 372qm δ . 22n n 3n/4 √ 2n 2 2n Summary. Recall that qm ≤ min 12n , vm ≤ 127 holds. ,24 , and µm ≤ 12√ n n 0.5n 2 We will choose L1 , L2 ≤ min 32q , 2 √ . This setting makes Pr[bad5 ] ≤ 1/2n m 24 n Pr[bad6 ] ≤ and the condition of bad6 holds. The overall upper bound of Pr[bad] is as follows: 2µm qm δ 2µ2 + (qm + 2µ2m + vm )qm δ 2q 2 δ + m2 + m L1 L2 B 2 4 δL1 δ qm 1 372qm + vm δ + + n+ . B 2 22n 4µ2m δ + Using δ = ℓ/2n for ℓ ≥ 1, B = 2n − 1 ≥ 1.0001 · 2n , and qm ≤ 0.01 · 2n , 23n/4 /8, we derive the following simplified upper bound: 2 4 2 ℓ(7µ2m + 2vm ) 3ℓqm L1 3ℓµm qm 372ℓqm 3ℓqm + + + . + 2n 22n 2n L1 2n L22 23n Analysis of Interpretations. We give detailed descriptions for the interpretations of the bad events. Remind that G is a bipartite graph. Fact 1 Suppose that (i, j, k, ℓ) is a length-4 X-trail. Then Ni = Nj , Xj = Xk , Nk = Nℓ holds, which directly violates ¬bad1 . Since a length-5 trail must contain a length-4 X-trail, the second item follows. By this observation, a cycle in G = (τ ) must be of length 4, which again violates ¬bad1 if it exists. The final item is just ¬bad1 with k = j, ℓ = i. The structure of the graph directly follows. Fact 2 ¬bad1 and ¬bad3 implies that G = is acyclic and non-degenerated, respectively. The consistency between G = and G ̸= is due to ¬bad4 , because ¬bad1 already rules out the other cases. Fact 3 ¬bad1 imposes the structure as in Figure 3. The number of vertices included in the trail of length two from the root to the leaf is bounded by 3L1 because of ¬bad2 ; in particular, if every length of two trails is in the same component, we can count it as 2L1 + 1. The indices corresponding to the other vertices must induce nonce collisions, thus the number of the other vertices is bounded above by µm . For the second item, dX ≤ L2 is obvious from ¬bad2 . For giving an upper bound of ξmax , let us consider the component with a trail of length three. Then, 62 as in the above argument, this component has at most 2L1 +µm +1 vertices. On the other hand, for the component without such a trail, it must be a star-shape with indices i1 , ..., ik such that Ni1 = ... = Nik or Xi1 = ... = Xik . In any case, the number of vertices in this component is bounded by max(µm , L2 ) + 1, all of which are less P than 2L1 + 2L2 + µm . For the number of length-2 N -trails, it is at most dX ≥2 d2X ≤ L22 /2. The number of i ∈ [qm ] with Nj = Ni for some j ̸= i is at most 2µm and the number of such j is at most µm , thus the total number is at most 2µ2m . m Finally, the number of all trails in C1 , ..., Cα is P bounded by 3L1 +µ + L22 /2 2 α 2 following the above facts. The upper bound of i=1 ci is obvious. Fact 4 Given a collision pair shares the same starting vertex, we can construct a null trail by combining them and removing the intersection, violating ¬bad3 . We can choose each starting vertex for the second statement as a unique index. Fact 5 For each i ∈ [qm ], let Ii equal 1 if there exists j ∈ [qm ] such that i ̸= j and Xi = Xj , and P zero otherwise. It holds that Ex [Ii ] ≤ qm δ by the union bound. Let qx = i∈[qm ] Ii ≤ qm . It holds that qc ≤ 2µ + qx , where 2µ is from the faulty queries, and qc2 ≤ (2µ + qx )2 ≤ 8µ2m + 2qx2 ≤ 8µ2m + 2qc qx . We have Ex [qc ] ≤ 2µ + Ex [qx ] ≤ 2µ + X 2 Ex [Ii ] ≤ 2µ + qm δ, i∈[qm ] 3 Ex qc2 ≤ Ex 8µ2m + 2qc qx = 8µ2m + 2qm Ex [qx ] ≤ 8µ2m + 2qm δ. 6.4 Nonce-respecting Setting We roughly sketch the security analysis of nEHtM when we only consider the nonce-respecting setting; every constant factor is ignored here. In this case, we can ignore the events bad1 , bad2a , most of the cases of bad3 (except the length-2 null trail with the Xi = Xj case), bad5c and bad5d . Asymptotically, the remaining probability of the bad events is 2 4 ℓqm ℓqm ℓvm + + , ϵ2 = O 2n 2n L22 23n where we ignore the probability of bad5 , which can be made less than 1/23n . Since there is no faulty query, the parameter L2 provides an upper bound of ξmax = O(L2 ) and C = O(L22 ) as well. We have Pα+β 2 4 |Di |) + C + vm Cqc2 + qc qm qm + + 3n n 2n 2 2 2 2 4 nqm + nL22 + vm L22 qc2 + qc qm qm = + + 3n = O(1). 2n 22n 2 S+( i=1 Let ϵ1 (τ ) = 2 4 Cqc2 + qc qm qm S + nC + vm + + . 2n 22n 23n 63 Then, as in the original proof, we have 0.5 2 1.5 1 ℓ0.5 nL2 qm vm ℓ qm 2 2 + + n . Ex ϵ1 (τ ) =O 21.5n 21.5n 2 Therefore, by Theorem 2, we have the overall asymptotic advantage bound of 1 Ex 2uϵ1 (τ )2 2 + 2uϵ1 is given by 0.5 √ 2 √ 1.5 2 4 ℓ0.5 nL2 uqm ℓuvm ℓuqm ℓuqm ℓ uqm + + n + n 2 + 3n . O 21.5n 21.5n 2 2 L2 2 1 ℓuqm 2n 6 2n By choosing L2 = Θ min , we have , qm n2 O ℓuvm + 2n 4 ℓuqm 23n 12 + 5 ℓ2 n2 u2 qm 24n 13 ! , (38) where we suppress some terms, as in the main proof. The sanity check passes in exactly the same way. 6.5 Using Stronger Hash and Proofs in [11] We briefly analyze the security of nEHtM when H satisfies a stronger property as follows: For δ > 0, we say that H : K × M → X is a pairwise δ-almost XOR universal, denoted by δ-AXU(2) , hash function if it is a δ-AXU and additionally for any M1 ̸= M1′ and M2 ̸= M2′ in M such that {M1 , M1′ } = ̸ {M2 , M2′ } and X1 , X2 ∈ X , it holds that Pr [HKh (M1 ) ⊕ HKh (M1′ ) = X1 ∧ HKh (M2 ) ⊕ HKh (M2′ ) = X2 ] ≤ δ 2 . $ K← −K Note that a 4-wise ℓ/|X |-almost universal hash function for constant ℓ is a pairwise δ-AXU. We let δ = Õ(1/2n ) and ignore the small factors in the analysis for exhibiting the asymptotic behavior solely. This new property of hash functions allows multiple improvements in our analysis. We first see a variant of Fact 5, which was erroneously used in [11] without any clarification or explicitly using δ-AXU(2) . 4 q2 qm . Fact 6 If H is δ-AXU(2) , then Ex qc2 = O µ2m + 2m n + 22n Proof (sketch). Recall the definitions in the proof of Fact 5 (at the end of Section 6.3). Let Ii,j equal 1 if Xi = Xj and otherwise zero. Then Ii = ∨j̸=i Ii,j ≤ P n (2) property implies that Ex [Ii,j Ik,ℓ ] ≤ δ 2 = j Ii,j = Õ(qm /2 ). The δ-AXU hP i Õ(1/22n ). We can give an upper bound of Ex qx2 = Ex I I by i j i,j   " # 2 4 X X X qm qm   Ex Ii + Ex ( Ii,k )( Ij,ℓ ) = Õ + 2n . 2n 2 i i,k Finally, qc2 2 ≤ (2µm + qx ) ≤ 8µ2m + j̸=i,ℓ 2qx2 gives the desired result. 64 ⊔ ⊓ We take a similar approach whenever two XOR equations of H appear. qm – Applying Chebyshev inequality instead of Markov, Pr[bad2a ] = Õ µ2m . n L2 1 µm qm This requires L1 ≫ 2n , and our choice satisfies this constraint. – We can directly give a better bound Pr[bad3c ] = Õ 6 8 qm qm + 26m . – By Chebyshev, Pr[bad6 ] = O 25m 2 µ2m qm 23n . This gives the following upper bound of ϵ2 . 2 2 8 qm qm µm + vm µm qm + + Õ + , 2n L21 2n L22 2n 26m where we suppress the terms by the AM-GM inequality. For example, 4 qm 24n 8 qm 26n . 6 2qm 25n ≤ + We also have better bounds for computing Ex ϵ1 (τ )2 using Fact 6. " 2 # 2 nC Cqc Ex = Õ 2n 22n 2 2 4 2 4 µm qm qm µ4m µ2m qm µ2m qm = Õ (L1 + L2 )2 + + + + + , 22n 23n 24n 22n 23n 24n " 2 # 2 Cqc2 Cqc Ex = Õ , 2n 2 22n " 2 # 2 4 2 6 8 qc qm µm qm qm qm Ex = Õ + 5n + 6n . 22n 24n 2 2 1 We have an asymptotic upper bound of Ex ϵ1 (τ )2 2 by 2 2 4 2 µm µm µm qm µm qm qm qm qm Õ + + + + (L + L ) + + 1 2 2n 21.5n 22n 23n 2n 21.5n 22n 2 qm qm Let µ2m =: ν. The asymptotic advantage upper bound becomes n + 21.5n + 22n 2 8 uµ2m + uvm uµm qm uqm uqm + + + 2n L2 2n L2 2n 26m √ 2 √ 1 √2 √ 4 2 uµm uµm qm uµm qm uqm √ + + + + + u(L1 + L2 )ν n 1.5n 2n 2 2 2 23n √ √ uµ2m + uvm uq 4 uµm qm uq 2 ≲ + + 2 mn + 3nm + u(L1 + L2 )ν 2 n n 2 L1 2 L2 2 2 3 If qm < 22n , taking L31 = Õ u0.5 µm qm , L32 2n ν √ 4 uµ2m + uvm uqm + n 2 23n = 2 u0.5 qm 2n ν gives the asymptotic advantage: ! 4 2 2 2 2 3 u 3 µm qm u 3 qm u 3 qm + + 4n + 5n 2n 23 23 2 3 65 2 3 3 Otherwise, if qm ≤ 22n , we can choose L2 = Θ 2 4 3 2u 3 qm 4n 3 2 2 3 qm n 2 2 u3 2n qm instead, giving the same 2 2 u 3 qm 5n 2 3 upper bound. If µm > 0, ≤ + gives a simpler bound. When µm = vm = 0, we have the multi-user security of nEHtM as pseudorandom functions with the punctured codomain {0, 1}n \ {0} : 2 2/3 2 6 1/3 ! √ 4 u qm uqm uqm + . (39) + Õ 23n 22n 25n Recovering the result of [11] using δ-AXU(2) . We give the correct asymptotic bound of the multi-use nEHtM2 security following the proof of [11] assuming that H is δ-AXU(2) . Recall the following bound from [11, eprint version, page 27], which are based on the slightly worse version of Theorem 3 with a different bad event for bounding qc . ! Pα+β 2 4 |S | Lq Lq q Lq i c c m m i=1 ϵ1 (τ ) = Õ + n + 2n + 3n , 2n 2 2 2 2 q2 L2 q 8 qm + 2mn + 6nm , ϵ2 = Õ 2n 2 L 2 2 where Õ ignores the polynomial of n and ℓ. We also remove some terms in ϵ2 , which only makes the bound better. A straightforward computation using Fact 6 (i.e., assuming H is δ-AXU(2) ) gives the following bound. 4 1/2 Lqm Lqm Ex ϵ1 (τ )2 = Õ + . 21.5n 23n We suppress most of the terms using the AM-GM inequality and L ≥ 1. We stress that the red term in the above bound is missing in the final bound of [11, Theorem 5], which corresponds to √ 4 2u(n + 1)Lqm δ 1/2 2n in their notation and appeared in the second line of Advmu−prf nEHtM (u, qmax ) of page 28. This makes the actual security bound slightly worse than they claimed, even √ 22n 0.5n 3 qm , q2 gives assuming that H is a pairwise δ-AXU. Taking L = u min 2 m the final bound of 2 4 1/3 2 10 1/3 ! u qm u qm Õ + . 4n 2 27n The second term indeed appears in the original statement, and the first term uq 2 is larger than 22nm in the original bound. Also, the following inequality confirms 66 2 6 1/3 u q in the original bound is just hidden by the that the dominating term 25nm other terms; i.e., our analysis does not miss the term. 3 6 u2 qm 25n 1/3 ≤2 4 u2 qm 24n 1/3 + 10 u2 qm 27n 1/3 Finally, our new bound in Equation (39) with the same assumption is always tighter than this bound, because  1/3 2 4  qm  u 4n is the dominating term 2 2 10 1/3 2 6 1/3  qm u q  u 7n ≥ 25nm 2 2 if qm ≤ 2n , and 2 if qm ≥ 2n . Recovering the result of [11] without δ-AXU(2) . If we are willing to avoid δ-AXU(2) , we only can use Fact 5, and the probability for bad6 (denoted by bad5 in the original paper) becomes worse: – If we use Markov inequality as in our analysis, Pr[bad6 ] = Õ 2 7 L q – If we use Chebyshev inequality, Pr[bad6 ] = Õ 25nm . 4 Lqm 23n . Based on this, we can obtain the following asymptotic bounds using Fact 5 only assuming that H is δ-AXU: 2 1/2 Ex ϵ1 (τ ) = Õ 1.5 4 Lqm Lqm + 21.5n 23n , ϵ2 = Õ 2 2 qm qm + 2 n + Pr[bad6 ] , 22n L 2 (2) where the red terms are worse than the bound assuming √ δ-AXU √. 2n 2n 3 0.5n 0.5 If we use Markov inequality, we take L = min u2 qm , u2 , 2q2 to 2 qm m obtain the final bound of 2 5 1/3 3 10 1/3 ! u qm u qm Õ + . 24n 27n If we use Chebyshev inequality, we take L3 = min L4 = 4n 2 5 qm √ 0.5 u20.5n qm , , and obtain the final bound of Õ 5 u2 qm 24n 1/3 + 10 u2 qm 27n 1/3 4.5 uqm + 3n 2 ! In any case, the bound becomes worse than one using δ-AXU(2) . 67 √ u22n 2 qm or 7 Multi-User Security of DbHtS This section proves the multi-user MAC security of the Double-block Hash-thenSum (DbHtS) scheme proposed by [20] with the domain separation. Let M = {0, 1}∗ be a message space, Kh = {0, 1}k be a hash key space, and K = {0, 1}k be a block cipher key space. Note that we assume Kh = K for ease representation. Let H = (H1 , H2 ) : Kh × Kh × M → {0, 1}n−1 × {0, 1}n−1 be a hash function with (2n − 2)-bit outputs, which can be decomposed into two (n − 1)-bit hash functions H1 , H2 : Kh × M → {0, 1}n−1 so that HKh (M ) = (H1Kh (M ), H2Kh (M )) where Kh = (Kh1 , Kh2 ) ∈ Kh × Kh . Let E : K × {0, 1}n → 1 2 {0, 1}n be a block cipher modeled as an ideal cipher, i.e., keyed random permutations. We define the DbHtS constructions with the domain separation as follows: def DbHtS[H, E](Kh , K, M ) = EK (0∥H1Kh,1 (M )) ⊕ EK (1∥H2Kh,2 (M )). It is well-known that MAC security of deterministic MACs can be viewed as PRF security. Using the same reasoning, it is enough to show that PRF∗ security of DbHtS. We also introduce the additional parameter qm denoting the maximum number of queries each user makes and assume q = uqm for our security analysis; this does not lose the generality by making some redundant queries at the end. Theorem 12 shows the multi-user DbHtS security bound improved from [37] (Recall Figure 1 for the comparison). Following the original paper, we require the hash functions H1 , H2 used in DbHtS to be regular and AU, and E is implemented by the ideal cipher. The proof can be found in Section 7.1. Theorem 12 (Proof is in Section 7.1). Let n, k, u, p, l, and qm be positive integers such that p + uqm l ≤ 2n−2 . Let hash functions H1 , H2 : {0, 1}k × M → {0, 1}n−1 are δ1 -regular and δ2 -AU. Let the block cipher E : {0, 1}k × {0, 1}n → {0, 1}n be modeled as an ideal cipher. Let l be the maximum block length among all construction queries. Then, it holds that -prf Advmu DbHtS (u, qm , p) ≤ ∗ 2 3 4u2 qm lδ1 8uqm (δ1 + δ2 ) 2uqm pl 2u 2uqm pδ1 + + + + k+n k k k 2 2 2 2n 2 8uqm p u2 u(3u + p)(6u + 2p) 3 2 + k+n + k+n + + 3uqm δ2 2 2 22k 2 3 3(n + 1)3 u 2n2 uqm 128n3 uqm + + . + 22n 22n 23n Theorem 13 shows an improved result from the previous work [21]. We require the hash function to satisfy an additional assumption called δ-AU(2) : For any M1 ̸= M1′ and M2 ̸= M2′ in M, H is δ-AU(2) if: Pr [HK (M1 ) = HK (M1′ ) ∧ HK (M2 ) = HK (M2′ )] ≤ δ 2 . $ K← −K We remark that the cross-collision resistance between H1 , H2 that are originally required in [21] automatically holds by the domain separation. We defer the proof to Section 7.2. 68 Theorem 13 (Proof is in Section 7.2). Let n, k, u, p and qm be positive integers. Let hash functions H1 , H2 be δ-regular, δ-AU and δ-AU(2) . Then, it holds that 1 -prf Advmu DbHtS (u, qm , p) ≤ ∗ 7.1 2 3 4 2u2 qm δ 3upqm 47uqm δ 2upqm δ 2u2 2 32 + + 10uq + + . δ + n m +k 2k 2n 2 2k 2k 2 2 2 Proof of Theorem 12 Transcript from the ideal and real world. We consider an arbitrary distinguisher D in the information-theoretic setting. Whenever the distinguisher makes a query, it obtains two types of information depending on the query, sometimes called entry, in both the ideal world and the real world: – Ideal-cipher queries: For each primitive query on ideal cipher E with input x, we associate it with an entry (prim, J, x, y, +) for J ∈ K and x, y ∈ {0, 1}n . For each primitive query on the inverse of ideal cipher E −1 with input y, we associate it with an entry (prim, J, x, y, −) for J ∈ K and x, y ∈ {0, 1}n . – Construction queries: For each construction query on DbHtS from user i with message M, we associate it with an entry (eval, i, M, T ). Let (eval, i, Mai , Tai ) be the entry obtained when D makes the a-th query to user i. Let lai be the number of blocks of Mai and let l be the maximal number of blocks among these uqm construction queries. During the computation of (eval, i, Mai , Tai ), let Σai , Ψai be the internal outputs of hash function H, namely 2 1 (Mai ), respectively. Let Uai , Qia be the outputs (Mai ) and Ψai = HK Σai = HK h,2 h,1 of ideal cipher E deployed in DbHtS with inputs Σai and Ψai , namely Uai = E(Ki , 0∥Σai ) and Qia = E(Ki , 1∥Ψai ), respectively. For a key J ∈ {0, 1}k , let P(J) be the set of entries (prim, J, x, y, ∗) associating with the primitive query on the ideal cipher E with key J. Let Q(J) be the set of entries (eval, i, Mai , Tai ) associating with the construction query on DbHtS with the key such that Ki = J. In the real world, after the distinguisher finishes all its queries, we will further give the following information to the distinguished: 1. the keys (Khi , Ki ) for each user i, and 2. the internal values Uai , Qia for each user i and its corresponding query a. In the ideal world, we will instead give the distinguisher (Khi , Ki ) ←$ {0, 1}2k × {0, 1}k , independent of its queries. 3. In addition, we will give the distinguisher dummy values Uai and Qia computed by the simulation oracle SIM(Q(J)) (the same as that in Fig.4 in [37]). Both a transcript in the ideal world and the real world consists of 1. the revealed key pair (Khi , Ki ) for each of u users, 2. the internal values Uai and Qia for each of u users and each of their qm construction queries, and 69 3. the p primitive queries and uqm construction queries. Bad Transcript Analysis and Interpretations. We now define bad transcripts and compute the probabilities that each bad event happens. Let Tid and Tre be random variables following the distribution of the transcripts in the real world and the ideal world, respectively. Let badi be the event that Tid satisfies the i-th bad event. We call the transcript bad if any bad events happen, and good otherwise. We refer [37, Section 3] for a more detailed description of most bad events and their analysis; the analysis for our cases is done analogously with small tweaks. The events [37, bad15 , bad16 ] are excluded by the fine-tuned ideal world (bad15 ) and the domain separation (bad16 ) in the analysis below. Instead, we consider new bad15 below for i i 1. There exists user i such that Ki = Kh,1 or Ki = Kh,2 . For each user, this 2 event happens with probability at most 2k . Then, we have by the union bound that Pr [Tid ∈ bad1 ] ≤ 2u . 2k i i 2. There exists user i such that both its key (Ki and (Kh,1 or Kh,2 )) has been used by other user j or queried by primitive query (prim, J, x, y, ∗). Then, we have Pr [Tid ∈ bad2 ] ≤ u(3u + p)(6u + 2p) . 22k Note that ¬bad1 ∧ ¬bad2 guarantees that any user i has at least one fresh key. Further, excluding bad3 defined below guarantees that the distinguisher cannot control the output of the hash function by issuing primitive queries that happen to use the same key and query a message block used in the hashing process. 3. There are two queries (eval, i, Mai , Tai ) and (prim, J, x, y, +) of D such that the hash key used in the construction query is the same as the key used i i in the primitive query, namely Kh,1 = J or Kh,2 = J; and one of the message block used in the construction query is queried by the primitive query, namely x ∈ M ′ , where M ′ might be a block in Mai or a proceed block during the hashing process depending on the construction of H. Then, if p + uqm l ≤ 2n−2 , we have Pr [Tid ∈ bad3 ] ≤ 2upqm l . 2k+n Excluding the following events, bad4 and bad5 , guarantees that neither the inputs nor outputs of EKi collide with those in the primitive queries when Ki = J. 4. There are two queries (eval, i, Mai , Tai ) and (prim, J, x, y, ∗) made by D such that Ki = J and either Σai = x or Ψai = x. Then, we have Pr [Tid ∈ bad4 |¬bad2 ] ≤ 70 2upqm δ1 . 2k 5. There are entries (eval, i, Mai , Tai ) and (prim, J, x, y, ∗) such that Ki = J and either Uai = y or Qia = y. The event Uai = y or Qia = y is the same as ΦKi (Uai ) = y or ΦKi (Qia ) = y (Recall Φ is the partial function used to simulate a random permutation and defined in Fig.4 [37]). Then, if p+uqm l ≤ 2n−2 , we have Pr [Tid ∈ bad5 |¬bad2 ] ≤ 8upqm . 2k+n Excluding the following two bad events bad6 and bad7 guarantees that if Ki = Kj then all the inputs of EKi are distinct from those of EKi . 6. There is a construction entry (eval, i, Mai , Tai ) such that Ki = Kj and Σai = Σbj for some entry (eval, j, Mbj , Tbj ). Then, we have Pr [Tid ∈ bad6 |¬bad2 ] ≤ 2 δ1 u2 qm . k 2 7. There is a construction entry (eval, i, Mai , Tai ) such that Ki = Kj and Ψai = Ψbj for some entry (eval, j, Mbj , Tbj ). Then, we have Pr [Tid ∈ bad7 |¬bad2 ] ≤ 2 δ1 u2 qm . 2k j Excluding the following bad event bad8 guarantees that if Ki = Kh,1 or Ki = j Kh,2 , then the inputs to EKi do not collide with the inputs to the hash part j j with key Kh,1 or Kh,2 , respectively. j and Σai = 8. There is a construction entry (eval, i, Mai , Tai ) such that Ki = Kh,1 ′ ′ ′ j Mbj , or Ki = Kh,2 and Ψai = Mbj for some entry (eval, i, Mbj , Tbj ), where Mbj is either one of message blocks of Mbj or a proceed block of Mbj during the hashing process depending on the construction of H. Then, we have Pr [Tid ∈ bad8 ] ≤ 2 lδ1 2u2 qm . k 2 Excluding the following bad event bad9 guarantees that for every pair of construction query (eval, i, Mai , Tai ) and (eval, i, Mbi , Tbi ) from each user, at least one of 0∥Σai and 1∥Ψai is fresh for the construction query (eval, i, Mai , Tai ). In other words, at least one of inputs of the ideal cipher EKi is fresh. 9. There is a construction entry (eval, i, Mai , Tai ) such that Σai = Σbi and Ψai = Ψbi for some entry (eval, i, Mbi , Tbi ). Then, we have 2 2 Pr [Tid ∈ bad9 ] ≤ uqm δ2 . Excluding the following two bad events bad10 and bad11 guarantees that the partial function (Def. in Fig.4 [37]) behaves indistinguishably from a random permutation for every pair of construction queries. 71 10. There is a construction entry (eval, i, Mai , Tai ) such that either Σai = Σbi or Σai = Ψbi , and either Uai = Ubi or Uai = Qib for some entry (eval, i, Mbi , Tbi ). Then, we have Pr [Tid ∈ bad10 ] ≤ 2 2uqm (δ1 + δ2 ) . 2n 11. There is a construction entry (eval, i, Mai , Tai ) such that either Ψai = Σbi or Ψai = Ψbi , and either Qia = Ubi or Qia = Qib for some entry (eval, i, Mbi , Tbi ). Then, we have Pr [Tid ∈ bad11 ] ≤ 2 2uqm (δ1 + δ2 ) . 2n Excluding the following bad event and event bad9 guarantees that for every triple of construction query (eval, i, Mai , Tai ), (eval, i, Mbi , Tbi ) and (eval, i, Mci , Tci ) from each user, at least one of 0∥Σai and 1∥Ψai is fresh for the construction query (eval, i, Mai , Tai ). In other words, at least one of inputs of the ideal cipher EKi is fresh. 12. There is a construction entry (eval, i, Mai , Tai ) such that i Σa = Σbi and Ψai = Ψci or Σai = Σci and Ψai = Ψbi for some entries (eval, i, Mbi , Tbi ) and (eval, i, Mci , Tci ). Then, we have 3 2 Pr [Tid ∈ bad12 ] ≤ 2uqm δ2 . Excluding the following two bad events bad13 , bad14 guarantees that the partial function (Def. in Fig.4 of [37]) behaves indistinguishably with a random permutation for every triple of construction query. 13. There is a construction entry (eval, i, Mai , Tai ) such that either Σai = Σbi or Σai = Ψbi , and either Uai = Uci or Uai = Qic for some entry (eval, i, Mbi , Tbi ) and (eval, i, Mci , Tci ). Then, we have Pr [Tid ∈ bad13 ] ≤ 3 2uqm (δ1 + δ2 ) . 2n 14. There is a construction entry (eval, i, Mai , Tai ) such that either Ψai = Σbi or Ψai = Ψbi , and either Qia = Uci or Qia = Qic for some entry (eval, i, Mbi , Tbi ) and (eval, i, Mci , Tci ). Then, we have Pr [Tid ∈ bad14 ] ≤ 3 2uqm (δ1 + δ2 ) . 2n Excluding the following bad event guarantees that there are no more than n users who share the same ideal cipher key. 72 15. There are no i1 , . . . , in ∈ [u] such that Ki1 = · · · = Kin . Then we have u n 2k(n−1) Pr [Tid ∈ bad15 ] = ≤ u2 2k+n . The overall probability of the bad event is bounded as follows: 2u u(3u + p)(6u + 2p) 2upqm l 2upqm δ1 + + k+n + 2k 22k 2 2k 2 2 3 8upqm 4u qm lδ1 8uqm (δ1 + δ2 ) u2 3 2 + k+n + + 3uq δ + + . (40) m 2 2 2k 2n 2k+n Pr [Tid ∈ bad] ≤ Good Transcript Analysis. The following analysis is built upon the analysis presented in [37], highlighting only the modifications relevant to the context of our paper. Let τ denote a good transcript and S(J), F (J), Q(J) are defined in [37, Figure 4] and g is defined in [37, page 15]. We first compute  1 Pr [Tid = τ ] = 2uk 2 (N − 1)uqm Y J∈{0,1} 1  1 · |S(J)| (N − 2 |F (J)|)g k |P(J)|−1 Y i=0  1 , N − 2 |F (J)| − g − i and  Pr [Tre = τ ] = 1 22uk Y  J∈{0,1}k 1 (N )|Q(J)|+|F (J)|+g 73 |P(J)|−1 Y i=0  1 . N − |Q(J)| − |F (J)| − g − i Then, we have Pr [Tre = τ ] ≥ (N − 1)uqm Pr [Tid = τ ] Y ≥ J∈{0,1}k ≥ Y J∈{0,1}k |S(J)| (N − 2 |F (J)|)g (N )|Q(J)|+|F (J)|+g (N − 1)|Q(J)| (N − 2 |F (J)|)g · |S(J)| (N )|Q(J)|+|F (J)|+g (N − 1)|Q(J)| (N − 2 |F (J)|)g (N )2|F (J)| (N )|Q(J)|+|F (J)|+g (N − 1)|F (J)| J∈{0,1}k ! 3 2 128 |F (J)| 8(n + 1)3 2 |F (J)| × 1− − − (by Theorem 6) N2 N3 3N 2 Y Y ≥ J∈{0,1}k (N − 1)|Q(J)|−|F (J)| (N − 2 |F (J)| − g)|Q(J)|−|F (J)| 3 2 2 |F (J)| 128 |F (J)| 8(n + 1)3 1− − − 2 3 N N 3N 2 × ≥ Y J∈{0,1}k ! (N − 1)|Q(J)|−|F (J)| (N − 2 |F (J)| − g)|Q(J)|−|F (J)| 2 3 2n2 qm 128n3 qm 8(n + 1)3 × 1− − − N2 N3 3N 2 (∵ |F (J)| ≤ nqm , guaranteed by ¬bad15 ) ≥1− 3 2 128un3 qm 8(n + 1)3 u 2un2 qm − − , 2 3 N N 3N 2 (41) where the last line is because there are at most u used J, so at most u product terms. Conclude the Proof. From Equations (40) and (41), define def ϵ1 = 2 3 128n3 uqm 8(n + 1)3 u 2n2 uqm + + 22n 23n 3 · 22n and def ϵ2 = 2 3 2u 2upqm δ1 4u2 qm lδ1 8uqm (δ1 + δ2 ) 2upqm l + + + + k+n k k k 2 2 2 2n 2 8upqm u2 u(3u + p)(6u + 2p) 3 2 + k+n + k+n + + 3uqm δ2 . 2 2 22k 74 Then by Lemma 1, we conclude that -prf Advmu DbHtS (u, qm , p) ≤ ∗ 7.2 2 3 4u2 qm lδ1 8uqm (δ1 + δ2 ) 2upqm l 2u 2upqm δ1 + + + + k+n 2k 2k 2k 2n 2 2 8upqm u u(3u + p)(6u + 2p) 3(n + 1)3 u 3 2 + k+n + k+n + + 3uqm δ2 + 2k 2 2 2 22n 2 2 3 3 2n uqm 128n uqm + + . 22n 23n Proof of Theorem 13 The general idea of the proof follows the proof of [21, Theorem 1]. The concrete proof diverges in threefold. First, we analyze the multi-user security of DbHtS in the fine-tuned ideal world, which excludes the bad event Bad-Tag unavoidable in the analysis of [21]. Second, our DbHtS construction also assumes the hash function is δ-AU(2) . Third, we explicitly introduce qm instead of approximating it as q in the analysis. These variances not only modify the calculations associated with certain bad events presented in [21] but also lead to consequential shifts in the final statement (Theorem 13). For the sake of completeness, we give the full 2 32 proof in the following. Note that if qm δ ≥ 1, then Theorem 13 trivially holds. 2 32 Hence in the following, we prove Theorem 13 for the case of qm δ < 1. Transcript From the Ideal and Real World. We consider an arbitrary distinguisher D in the information-theoretic setting. After the distinguisher finishes querying, it obtains two types of information in both of the ideal world and the real world – Ideal-cipher queries: for each primitive query on ideal cipher E with input x, we associate it with an entry (prim, J, x, y, +) for J ∈ K and x, y ∈ {0, 1}n . For each primitive query on the inverse of ideal cipher E −1 with input y, we associate it with an entry (prim, J, x, y, −) for J ∈ K and x, y ∈ {0, 1}n . – Construction queries: for each construction query on DbHtS from user i with message M, we associate it with an entry (eval, i, M, T ). Let (eval, i, Mai , Tai ) be the entry obtained when D makes the a-th query to user i. During the computation of (eval, i, Mai , Tai ), let Σai , Ψai be the internal 1 2 outputs of hash function H, namely Σai = HK (Mai ) and Ψai = HK (Mai ), h,1 h,2 i i respectively. Let Ua , Qa be the outputs of ideal cipher E deployed in DbHtS with inputs Σai and Ψai , namely Uai = E(Ki , 0∥Σai ) and Qia = E(Ki , 1∥Ψai ), respectively. To make it a bit easy to read, we use the term "block cipher key" to refer to the key Ki for user i and "ideal cipher key" to refer to the key J used in an Ideal-cipher (primitive) query. Let s denote the total number of distinct ideal cipher key used during the evaluation of primitive queries. Let r denote the total number of distinct block cipher keys that collide with ideal cipher key used during the evaluation of primitive queries. In the real world, after the distinguisher finishes all its queries, we will further give it: 1) the keys (Khi , Ki ) for each user i and 2) the internal values 75 (Σai , Ψai , Uai , Qia ) for each user i and its corresponding query a. In the ideal world, we will instead give the distinguisher (Khi , Ki ) ←$ {0, 1}2k × {0, 1}k , independent of its queries. In addition, we will give the distinguisher dummy values (Σai , Ψai , Uai , Qia ). All these values are computed by the simulation oracle shown in Algorithm 3. Note that we remove Bad-Tag event since we are assuming the fine-tuned ideal world while there is an additional bad event, dubbed Bad5, to achieve better security. Both a transcript in the ideal world and the real world consists of 1. the revealed keys (Khi , Ki ) for each of u users, 2. the internal values (Σai , Ψai , Uai , Qia ) for each of u users and each of their qm construction queries, 3. and the p primitive queries and uqm construction queries. Algorithm 3 Offline oracle in the ideal world i i )i∈[u] ←$ Kh × Kh , Kh,2 1: (Kh,1 2: (Ki )i∈[u] ←$ {0, 1}k 1 2 3: (Σai , Ψai )(i,a)∈[u]×[qm ] ← (HK (Mai ), HK (Mai ))(i,a)∈[u]×[qm ] h,1 h,2 4: if BadK = 1 ∨ Bad1 = 1 ∨ Bad2 = 1 ∨ Bad3 = 1 ∨ Bad4 = 1 ∨ Bad5 = 1 then aborts def 5: Q= = {(i, a) ∈ [u] × [qm ] : ∃(prim, Ki , x, y, ∗); ∀(prim, Ki , x, y, ∗), x ̸= Σai , x ̸= Ψai } def = ▷ i ∈ I= 6: I= = {i ∈ [u] : (i, ∗) ∈ Q= } = I= i1 ⊔ · · · ⊔ Iir ij if Kij is used in primitive queries, where ij ∈ [s] as there are s distinct ideal-cipher key 7: for j ← 1 to r do i i i 8: ∀i ∈ I= ij let Σa be not fresh in (Σ1 , . . . , Σqm ) for some construction query i i (eval, i, Ma , Ta ) def def Let Dom(Kij ) = {x : (prim, Kij , x, y, ∗)} and Ran(Kij ) = {y : (prim, Kij , x, y, ∗)} 10: if 0∥Σai ∈ / Dom(Kij ) then Pij (Σai ) ← Uai ←$ {0, 1}n \ Ran(Kij ), Qia ← Uai ⊕ Tai i 11: else Ua ← Pij (Σai ), Qia ← Uai ⊕ Tai 9: if Qia ∈ Ran(Kij ) then Bad-Samp ← 1 , aborts S S else Dom(Kij ) ← Dom(Kij ) {0∥Σai , 1∥Ψai }, Ran(Kij ) ← Ran(Kij ) {Uai , Qia } 12: 13: def 14: Q̸= = {(i, a) ∈ [u] × [qm ] : ∀(prim, J, x, y, ∗), J ̸= Ki } ̸= def = = {i ∈ [u] : (i, ∗) ∈ Q̸= } = I̸= ▷ i, j ∈ I̸= i1 ⊔ · · · ⊔ Ii′r ij if Ki = Kj S S ′ i i i i g g ij ij 16: ∀j ∈ [r ] : Σ = i∈I̸= {Σ1 , . . . , Σqm }, Ψ = i∈I̸= {Ψ1 , . . . , Ψqm } ij ij def S i i 17: ∀j ∈ [r′ ] : (Uai , Qia )i∈I̸= ,a∈[qm ] ←$ Sij where Sij = { i∈I̸= ,a∈[qm ] {Za,1 , Za,2 } ∈ 15: I ij ij g g i i n |Σ j |+|Ψ j | ({0, 1} ) return i i : Za,1 ⊕ Za,2 = Tai } (Σai , Ψai , Uai , Qia )(i,a)∈[u]×[qm ] Bad Transcript Analysis and Interpretations. We now give the definition of bad transcripts and compute their corresponding probabilities. Let Tid 76 and Tre be random variables following the distribution of the transcripts in the real world and the ideal world, respectively. We define and calculate the following bad events that are with non-zero probability in Tid . BadK: There exists distinct user i1 and i2 such that both block-cipher key and i1 i2 i1 i2 (one of) hash key collide, i.e., Ki1 = Ki2 ∧ (Kh,1 = Kh,1 ∨ Kh,2 = Kh,2 ). Then, we have Pr [Tid ∈ BadK] ≤ 2u2 . 22k Bad1: There exists a construction query (eval, i, Mai , Tai ) such that its corresponding block-cipher key and (one of) hash output collide with a primitive query (prim, J, x, y, ∗). Then, we have Pr [Tid ∈ Bad1] ≤ 2upqm δ . 2k Bad2: Either B.21 or B.22 happens, so we have Pr [Tid ∈ Bad2|¬BadK] ≤ 2 2 uqm δ 2u2 qm δ + . n k 2 2 B.21: There exists a user i whose two construction queries both collide on tag and (one of) hash output. That is, there exists (eval, i, Mai , Tai ) and (eval, i, Mbi , Tbi ) such that Tai = Tbi ∧ (Σai = Σbi ∨ Ψai = Ψbi ). Then, we have Pr [Tid ∈ B.21|¬BadK] ≤ 2 uqm δ . n 2 B.22: There exists users i1 and i2 whose construction queries both collide on tag and (one of) hash output. That is, there exists (eval, i1 , Mai1 , Tai1 ) and (eval, i2 , Mbi2 , Tbi2 ) such that Tai1 = Tbi2 ∧ (Σai1 = Σbi2 ∨ Ψai1 = Ψbi2 ). Then, we have Pr [Tid ∈ B.22|¬BadK] ≤ 2 δ 2u2 qm . k 2 Bad3: Either B.31 or B.32 happens, so we have 3 2 2 Pr [Tid ∈ Bad3] ≤ 2uqm δ . B.31: There exists user i such that {(a, b) : a < b ∧ Σai = Σbi } ≥ L1 . Let L1 = 2 qm δ 2 1 4 + qm2δ . Let Iia,b be the indicator random variable which takes the P value 1 if Σai = Σbi ; 0 otherwise. Let Ii = a<b Iia,b . We calculate X i q2 δ Ex Ii = Pr Σa = Σbi ≤ m , 2 a<b 77 and, by assuming the hash function is δ-AU(2) , " ! !# X X i i 2 q4 δ2 i i Var I ≤ Ex (I ) = Ex Ia,b Ic,d ≤ m . 4 a<b c<d Then, by Lemma 3, we have Pr [Tid ∈ B.31] ≤ X Pr Ii ≥ L1 i∈[u] ≤ = 4 2 uqm δ 2 δ)2 (2L1 − qm 4 2 uqm δ (∵ Plug in L1 ) 1 2 δ2 qm 3 2 2 ≤ uqm δ . B.32: There exists user i such that {(a, b) : a < b ∧ Ψai = Ψbi } ≥ L1 . Similarly, we have 3 2 2 Pr [Tid ∈ B.32] ≤ uqm δ . Bad4: One of B.41, B.42, and B.43 happens, so we have 3 2 2 2 2 uqm δ 9uqm δ 2 32 + 4uqm δ ≤ . 2 2 B.41: There exists a user i whose two construction queries both two hash outputs collide. That is, there exists (eval, i, Mai , Tai ) and (eval, i, Mbi , Tbi ) such that Σai = Σbi ∧ Ψai = Ψbi . Then, we have Pr [Tid ∈ Bad4] ≤ 2 2 δ uqm . 2 B.42: There exists a user i and its tuple of four construction queries indices a, b, c, d such that Σai = Σbi ∧ Ψbi = Ψci ∧ Σci = Σdi . Then, we have X Pr [Tid ∈ B.42|¬B.31] ≤ L21 δ Pr [Tid ∈ B.41] ≤ i∈[u] 1 2 qm δ qm δ 4 + 2 2 = !2 1 4 3 1 2 3 q δ + qm δ 2 2 m 2 2 32 ≤ qm δ . ≤ δ (∵ Plug in L1 ) (by Lemma 6) Further, Pr [Tid ∈ B.42] ≤ Pr [Tid ∈ B.42|¬B.31] + Pr [Tid ∈ B.31] 3 2 2 ≤ 2qm δ . 78 3 2 2 (∵ qm δ < 1) Note that the summation of Pr [Tid ∈ B.31] will happen again when we compute the total probability of bad events, so this is indeed redundant computation, but we leave it for simplicity of the proof. B.43: There exists a user i and its tuple of four construction queries indices a, b, c, d such that Ψai = Ψbi ∧ Σbi = Σci ∧ Ψci = Ψdi . Similarly, we have 3 2 2 Pr [Tid ∈ B.43] ≤ 2qm δ . Bad5: There exists user i such that {(a, b, c) ∈ [qm ]∗3 : Σai = Σbi ∧ Ψbi = Ψci } ≥ L2 . Let Iia,b,c be the indicator random variable which takes the value 1 if Σai = Σbi ∧ Ψbi = Ψci ; 0 otherwise. Let X Iia,b,c Ii = (a,b,c)∈[qm ]∗3 1 1 2 3 and L2 = qm δ 2 2 3 n . Since we assume δ-AU(2) , X 3 2 Ex Ii = Pr Σai = Σbi ∧ Ψbi = Ψci ≤ qm δ , (a,b,c)∈[qm ]∗3 Then, by Lemma 2, we have Pr [Tid ∈ Bad5] ≤ X Pr Ii ≥ L2 i∈[u] ≤ 3 2 δ uqm L2 8 = 3 3 δ2 uqm . (∵ Plug in L2 ) 2 23n Bad-Samp: Bad-Samp happens if in the simulation oracle [line 12, Algorithm 3], the simulated output Qia collides with any primitive query (prim, Kij , x, y, ∗) or previous simulated output. We bound Bad-Samp by the union of events BS1 and BS2. We assume q = uqm . BS1: There exists a primitive query (prim, J, x, y, ∗) and a user i ∈ I= ij for some i j ∈ [r] such that both its output y collides with Qa and its ideal-cipher key J equals to the block-cipher key Kij . Then we have we have 2pq 2upqm = n+k . n+k 2 2 BS2: There exists a primitive query (prim, J, x, y, ∗) and two users i, i′ ∈ I= ij Pr [Tid ∈ BS1] ≤ ′ for some j ∈ [r] such that both Qia collides with Qia and its ideal-cipher key J equals to the block-cipher key Kij . For BS2, we have Pr [Tid ∈ BS2] ≤ u 2 2 X pqm upqm pqqm = = n+k . n+k n+k 2 2 2 i=1 79 We consider two cases. If q ≤ 2n/2 , then we have pqm pqqm upqm ≤ n +k ≤ n +k . 2 2n+k 2 22 On the other hand, if q > 2n/2 , we start with considering the first qn2 2 users. Similarly to [21, Inequality (25)], we define an event Aux as follows: if the key for any of first qn2 users collide with a primitive query key, we 2 call Aux occurs. We can see q p n upqm 2 Pr [Tid ∈ Aux] ≤ 2 k ≤ n +k . 2 22 If u ≤ q n 22 , it is an upper bound of Pr [Tid ∈ BS2] ≤ q = uqm ≥ q n 22 upqm . n 2 2 +k Otherwise, qm , which says qm ≤ 2n/2 . Then we have pqqm pq upqm ≤ n +k = n +k , n+k 2 22 22 and Pr [Tid ∈ BS2] ≤ upqm . n 2 2 +k To conclude, we have Pr [Tid ∈ Bad-Samp] ≤ 2upqm 3upqm upqm + n +k ≤ n +k . 2n+k 22 22 Summing up the probability of the bad events and define bad = BadK ∨ Bad1 ∨ Bad2 ∨ Bad3 ∨ Bad4 ∨ Bad5 ∨ Bad-Samp, we have Pr [Tid ∈ bad] ≤ 2 2u2 δ 2upqm δ 2u2 qm 3upqm 2 32 + + + 8uqm δ + n +k . 22k 2k 2k 22 (42) Good Transcript Analysis. The following analysis aims to compute a lower re =τ ] bound for the ratio Pr[T Pr[Tid =τ ] on a good transcript. We first consider the transcript for the construction query indexed by Q= . Recall def Q= = (i, a) ∈ [u] × [qm ] : ∃(prim, Ki , x, y, ∗); ∀(prim, Ki , x, y, ∗), x ̸= Σai , x ̸= Ψai as defined in Algorithm 3. For each j ∈ [r] and each i ∈ I= ij , we consider the internal value sequence (U1i , . . . , Uqim ), (Qi1 , . . . , Qiqm ). From this sequence, we construct a bipartite graph Gi , where the nodes in one partition represent values Uai and the nodes in the other represent Qia . We connect the node representing Uai and Qia with an edge labeled with Tai , where Uai ⊕ Qia = Tai . If Uai = Ubi where a ̸= b, then we merge the corresponding nodes into a single one. We do the same thing if Qia = Qib where a ̸= b. 80 Since the transcript is good, we know that each component of Gi is acyclic, which is guaranteed by ¬B.41. Guaranteed by ¬B.42 ∧ ¬B.43, each component contains a path of length at most 3. Also, guaranteed by ¬B.31 ∧ ¬B.32, the size q2 δ 1 4 of each component is restricted up to L1 = m2 + qm2δ . Furthermore, guaranteed by ¬Bad1, the value of each vertex of the graph Gi is distinct from the input of any primitive query. Guaranteed by ¬B.21, if two nodes are connected in Gi the label of their path cannot be zero. Guaranteed by ¬B.22, if two distinct users i1 , i2 whose keys collide, then their corresponding graph Gi1 and Gi2 are distinct. We use vi to denote the size of the graph Gi , and wi to denote the number of components of Gi . We then consider the transcript for the construction query indexed by Q̸= . def = Recall I̸= = {i ∈ [u] : (i, ∗) ∈ Q̸= } = I̸= i1 ⊔ · · · ⊔ Ii′r as defined in Algorithm 3. ′ = For each j ∈ [r ] and each i ∈ Iij , we consider the internal value sequence (U1i , . . . , Uqim ), (Qi1 , . . . , Qiqm ). Similarly, we can construct a bipartite graph Hi . We use vi′ to denote the size of the graph Hi , and wi′ to denote the number of components of Hi . We now are ready to compute Pr [Tre = τ ], the probability of real-world hits a good transcript τ. Let pj be the number of primitive query use the j-th idealcipher key. We have   r 1 1  Y · Pr [Tre = τ ] =   3k n  2 (2 ) i=1 j=1  u Y pj + P i∈I= ij      Y 1   n  (2 )pj j∈[s]\{i1 ,...,ir }  vi    ′ r Y 1    n  (2 ) j=1  P    ̸= i∈I ij     .     v′   i And for Pr [Tid = τ ], we have  Pr [Tid = τ ] = 1 2nuqm u Y 1 23k i=1   r Y 1  ·  n  (2 ) j=1  pj + P i∈I= ij 81  ′   r  Y Y 1 1   .   (2n )pj j=1 Sij j∈[s]\{i1 ,...,ir }  wi  Plugging in the above two expressions, we have (2n )      P    r wi    ′ pj + r   i∈I= Y ij Sij Pr [Tre = τ ]   Y · = 2nuqm    n ) n )    Pr [Tid = τ ] (2 (2 j=1 j=1 P     P vi   pj + i∈I= ij  ̸= i∈I ij         ′  v i           n      1 − δij · (2 )       P     ′   v  r   r′ i   ̸= Y  Y i∈I  ij 1   nuqm  ,     ≥2  ·       P P j=1  n  j=1  ′ −w ′ )  n  (v    2 − pj − wi  i i     ̸=    i∈I  i∈I= i P   ij j n       2 · (2 ) (vi −wi )    i∈I=   P ′  ij   v   ̸= i∈I ij (∵ Plug in Theorem 5) where def δ ij = 2 X 9qc,i i∈I̸= i 2 1≤k≤αi ck · 22n P 8 + 31qc,i qi2 16q 4 + 3ni 2n 2 2 j ≤ 2 2 4 9qc,i L22 31qc,i qm 16qm + + . 8 · 22n 22n 23n (∵ ¬Bad5) We here explain more on the definition of δij . i ∈ I̸= ij is the user index and Hi is the bipartite graph constructed mentioned above. We use qi to denote the total number of edges in Hi and qc,i to denote the total number of edges of components in Hi with size larger than 2. We use αi to denote the total number of components in Hi with size larger than 2 and ck to denote the size of k-th component in the graph Hi . 82 i We further lower bound the expression of      r nq |I= | Y 2 m ij Pr [Tre = τ ]    ≥   Pr [Tid = τ ] P j=1  n  2 − pj − wi   i∈I= ij    Pr[Tre =τ ] Pr[Tid =τ ] P i∈I= ij in the following           ′ nq |I̸= |  r  Y 2 m ij · 1 − δij       ·      P  j=1  ′ −w ′ )  n (v   i  i  ̸=  i∈I ij  2 (vi −wi ) ′ ≥1− r X δ ij j=1 ′ ≥1− r X X 2 2 4 9qc,i L22 31qc,i qm 16qm + + 8 · 22n 22n 23n j=1 i∈I̸= ! ij ′ ≥1− 8 r X X 3 3 δ2 9qm 2 8 · 23n j=1 i∈I̸= 1 3 4 4 4 δ δ + 31qm 16qm 31qm + + 2 · 22n 23n ! ij (∵ ¬Bad3 ∧ ¬Bad5) 8 3 ≥1− 9uqm δ 8·2 3 2 2 3n + 3 47uqm δ 22n 1 4 ! (43) Conclude the Proof. From Equations (42) and (43), define 8 def ϵ1 = 1 3 3 δ2 9uqm 2 8 · 23n + 3 4 47uqm δ 2n 2 and def ϵ2 = 2 2u2 2upqm δ 2u2 qm δ 3upqm 2 32 + + + 8uqm δ + n +k . 2k k k 2 2 2 22 Then by Lemma 1, we conclude that 1 -prf Advmu DbHtS (u, qm , p) ≤ ∗ 2 3 4 3upqm 2u2 2upqm δ 2u2 qm δ 47uqm δ 2 32 + + 10uqm δ + n +k + 2k + . k k 2n 2 2 2 2 22 References [1] Bellare, M., Impagliazzo, R.: A tool for obtaining tighter security analyses of pseudorandom function based constructions, with applications to PRP to PRF conversion. Cryptology ePrint Archive, Report 1999/024 (1999), https://eprint. iacr.org/1999/024 3 83 [2] Bellare, M., Desai, A., Jokipii, E., Rogaway, P.: A concrete security treatment of symmetric encryption. In: 38th FOCS. pp. 394–403 3 [3] Bellare, M., Guérin, R., Rogaway, P.: XOR MACs: New methods for message authentication using finite pseudorandom functions. In: CRYPTO’95. LNCS, vol. 963, pp. 15–28 3 [4] Bellare, M., Kilian, J., Rogaway, P.: The security of cipher block chaining. In: CRYPTO’94. LNCS, vol. 839, pp. 341–358 3 [5] Bellare, M., Krovetz, T., Rogaway, P.: Luby-Rackoff backwards: Increasing security by making block ciphers non-invertible. In: EUROCRYPT’98. LNCS, vol. 1403, pp. 266–280 3 [6] Bellare, M., Rogaway, P.: The security of triple encryption and a framework for code-based game-playing proofs. In: EUROCRYPT 2006. LNCS, vol. 4004, pp. 409–426 3 [7] Bernstein, D.J.: How to stretch random functions: The security of protected counter sums. Journal of Cryptology 12(3), 185–192 3 [8] Bhattacharya, S., Nandi, M.: Full indifferentiable security of the xor of two or more random permutations using the χ2 method. In: EUROCRYPT 2018, Part I. LNCS, vol. 10820, pp. 387–412 3 [9] Chang, D., Nandi, M.: A short proof of the PRP/PRF switching lemma. Cryptology ePrint Archive, Report 2008/078 (2008), https://eprint.iacr.org/2008/ 078 3 [10] Chen, S., Steinberger, J.P.: Tight security bounds for key-alternating ciphers. In: EUROCRYPT 2014. LNCS, vol. 8441, pp. 327–350 11 [11] Chen, Y.L., Choi, W., Lee, C.: Improved multi-user security using the squaredratio method. In: CRYPTO 2023, to apear 2, 3, 4, 5, 6, 8, 9, 11, 42, 50, 51, 64, 66, 67 [12] Chen, Y.L., Mennink, B., Preneel, B.: Categorization of faulty nonce misuse resistant message authentication. In: ASIACRYPT 2021, Part III. LNCS, vol. 13092, pp. 520–550 4, 8 [13] Choi, W., Kim, H., Lee, J., Lee, Y.: Multi-user security of the sum of truncated random permutations. In: ASIACRYPT 2022, Part II. LNCS, vol. 13792, pp. 682– 710 3, 6, 42 [14] Choi, W., Lee, B., Lee, J.: Indifferentiability of truncated random permutations. In: ASIACRYPT 2019, Part I. LNCS, vol. 11921, pp. 175–195 3 [15] Choi, W., Lee, B., Lee, J., Lee, Y.: Toward a fully secure authenticated encryption scheme from a pseudorandom permutation. In: ASIACRYPT 2021, Part III. LNCS, vol. 13092, pp. 407–434 3 [16] Choi, W., Lee, B., Lee, Y., Lee, J.: Improved security analysis for nonce-based enhanced hash-then-mask MACs. In: ASIACRYPT 2020, Part I. LNCS, vol. 12491, pp. 697–723 4, 5, 6, 8, 9, 19, 21, 50 [17] Choi, W., Lee, J., Lee, Y.: Building prfs from tprps: Beyond the block and the tweak length bounds. IACR Cryptol. ePrint Arch. p. 918 28 [18] Cogliati, B., Lampe, R., Patarin, J.: The indistinguishability of the XOR of k permutations. In: FSE 2014. LNCS, vol. 8540, pp. 285–302 3 [19] Dai, W., Hoang, V.T., Tessaro, S.: Information-theoretic indistinguishability via the chi-squared method. In: CRYPTO 2017, Part III. LNCS, vol. 10403, pp. 497– 523 3, 5, 6, 10 [20] Datta, N., Dutta, A., Nandi, M., Paul, G.: Double-block hash-then-sum: A paradigm for constructing BBB secure PRF. IACR Trans. Symm. Cryptol. 2018(3), 36–92 4, 5, 68 84 [21] Datta, N., Dutta, A., Nandi, M., Talnikar, S.: Tight multi-user security bound of dbhts. IACR Trans. Symmetric Cryptol. 2023(1), 192–223, https://doi.org/ 10.46586/tosc.v2023.i1.192-223 5, 6, 7, 9, 68, 75, 80 [22] Dutta, A., Nandi, M., Saha, A.: Proof of mirror theory for ξ max = 2. IEEE Trans. Inf. Theory 68(9), 6218–6232 3 [23] Dutta, A., Nandi, M., Talnikar, S.: Beyond birthday bound secure MAC in faulty nonce model. In: EUROCRYPT 2019, Part I. LNCS, vol. 11476, pp. 437–466 4, 5, 49 [24] Gilboa, S., Gueron, S., Morris, B.: How many queries are needed to distinguish a truncated random permutation from a random function? Journal of Cryptology 31(1), 162–171 3 [25] Gunsing, A., Mennink, B.: The summation-truncation hybrid: Reusing discarded bits for free. In: CRYPTO 2020, Part I. LNCS, vol. 12170, pp. 187–217 3 [26] Hall, C., Wagner, D., Kelsey, J., Schneier, B.: Building PRFs from PRPs. In: CRYPTO’98. LNCS, vol. 1462, pp. 370–389 3 [27] Impagliazzo, R., Rudich, S.: Limits on the provable consequences of one-way permutations. In: CRYPTO’88. LNCS, vol. 403, pp. 8–26 3 [28] Jha, A., Nandi, M.: Tight security of cascaded LRW2. J. Cryptol. 33(3), 1272– 1317, https://doi.org/10.1007/s00145-020-09347-y 7 [29] Kim, S., Lee, B., Lee, J.: Tight security bounds for double-block hash-then-sum MACs. In: EUROCRYPT 2020, Part I. LNCS, vol. 12105, pp. 435–465 5 [30] Landecker, W., Shrimpton, T., Terashima, R.S.: Tweakable blockciphers with beyond birthday-bound security. In: CRYPTO 2012. LNCS, vol. 7417, pp. 14–30 7 [31] Lee, J.: Indifferentiability of the sum of random permutations toward optimal security. IEEE Trans. Inf. Theory 63(6), 4050–4054, https://doi.org/10.1109/ TIT.2017.2679757 3 [32] Lucks, S.: The sum of PRPs is a secure PRF. In: EUROCRYPT 2000. LNCS, vol. 1807, pp. 470–484 3 [33] Mennink, B.: Towards tight security of cascaded LRW2. In: Theory of Cryptography - 16th International Conference, TCC 2018, Panaji, India, November 11-14, 2018, Proceedings, Part II. Lecture Notes in Computer Science, vol. 11240, pp. 192–222. https://doi.org/10.1007/978-3-030-03810-6_8 7 [34] Patarin, J.: A proof of security in o(2n) for the xor of two random permutations. In: ICITS 2008. LNCS, vol. 5155, pp. 232–248 3 [35] Patarin, J.: A proof of security in O(2n ) for the xor of two random permutations — proof with the “Hσ technique” —. Cryptology ePrint Archive, Report 2008/010 (2008), https://eprint.iacr.org/2008/010 3 [36] Patarin, J.: Introduction to mirror theory: Analysis of systems of linear equalities and linear non equalities for cryptography. Cryptology ePrint Archive, Report 2010/287 (2010), https://eprint.iacr.org/2010/287 3 [37] Shen, Y., Wang, L., Gu, D., Weng, J.: Revisiting the security of DbHtS MACs: Beyond-birthday-bound in the multi-user setting. In: CRYPTO 2021, Part III. LNCS, vol. 12827, pp. 309–336 5, 6, 7, 68, 69, 70, 71, 72, 73 85

Fine-Tuning XoP Security: Ideal Worlds & Permutation Outputs

Related documents

Products

Support

Fine-Tuning XoP Security: Ideal Worlds & Permutation Outputs

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib