Submitted to the Annals of Statistics
arXiv: arXiv:1312.2645

SUPPLEMENT TO "SUBSAMPLING BOOTSTRAP OF COUNT FEATURES OF NETWORKS"

By Sharmodeep Bhattacharyya∗,† and Peter J. Bickel∗

University of California, Berkeley∗ and Oregon State University†

Appendix.

A1. Variance of $\hat{P}_{b1}(R)$. The variance of $\hat{P}_{b1}(R)$ is
\[
\operatorname{Var}_b\Bigg[\frac{1}{\binom{m}{p}|\mathrm{Iso}(R)|}\sum_{S\subseteq K_m,\,S\cong R}\mathbf{1}(S\subseteq H)\,\Bigg|\,G\Bigg]
=\Bigg(\frac{1}{\binom{m}{p}|\mathrm{Iso}(R)|}\Bigg)^{2}\operatorname{Var}_b\Bigg[\sum_{S\subseteq K_m,\,S\cong R}\mathbf{1}(S\subseteq H)\,\Bigg|\,G\Bigg],
\]
with
\[
\operatorname{Var}_b\Bigg[\sum_{S\subseteq K_m,\,S\cong R}\mathbf{1}(S\subseteq H)\,\Bigg|\,G\Bigg]
=\mathrm{E}_b\Bigg[\Bigg(\sum_{S\subseteq K_m,\,S\cong R}\mathbf{1}(S\subseteq H)\Bigg)^{2}\,\Bigg|\,G\Bigg]
-\Bigg(\mathrm{E}_b\Bigg[\sum_{S\subseteq K_m,\,S\cong R}\mathbf{1}(S\subseteq H)\,\Bigg|\,G\Bigg]\Bigg)^{2}.
\]
Expanding the square into diagonal and off-diagonal terms,
\[
\mathrm{E}_b\Bigg[\Bigg(\sum_{S\subseteq K_m,\,S\cong R}\mathbf{1}(S\subseteq H)\Bigg)^{2}\,\Bigg|\,G\Bigg]
=\mathrm{E}_b\Bigg[\sum_{S\subseteq K_m,\,S\cong R}\mathbf{1}(S\subseteq H)\,\Bigg|\,G\Bigg]
+\mathrm{E}_b\Bigg[\sum_{\substack{S,T\subseteq K_m\\ S,T\cong R,\,S\neq T}}\mathbf{1}(S,T\subseteq H)\,\Bigg|\,G\Bigg]
= I + II \quad\text{(say)}.
\]
Thus,
\[
I=\sum_{S\subseteq K_n,\,S\cong R}\frac{\binom{n-p}{m-p}}{\binom{n}{m}}\mathbf{1}(S\subseteq G).
\]
Now, a host of subgraphs can be formed by the intersection of two copies of $R$; the number of intersected vertices can range from $0$ to $p-1$. Suppose that, when the intersection has $k$ vertices ($k=0,1,\dots,p-1$), the number of graph structures that can be formed is $g_k$, and denote these structures by $W_{jk}$, $j=1,\dots,g_k$. Thus,
\[
II=\sum_{k=0}^{p-1}\sum_{j=1}^{g_k}\ \sum_{S\subseteq K_n,\,S\cong W_{jk}}\frac{\binom{n-(2p-k)}{m-(2p-k)}}{\binom{n}{m}}\mathbf{1}(S\subseteq G).
\]
So,
\[
\operatorname{Var}_b\Bigg[\sum_{S\subseteq K_m,\,S\cong R}\mathbf{1}(S\subseteq H)\,\Bigg|\,G\Bigg]
=\sum_{S\subseteq K_n,\,S\cong R}\frac{\binom{n-p}{m-p}}{\binom{n}{m}}\mathbf{1}(S\subseteq G)
+\sum_{k=0}^{p-1}\sum_{j=1}^{g_k}\ \sum_{S\subseteq K_n,\,S\cong W_{jk}}\frac{\binom{n-(2p-k)}{m-(2p-k)}}{\binom{n}{m}}\mathbf{1}(S\subseteq G)
-\Bigg(\sum_{S\subseteq K_n,\,S\cong R}\frac{\binom{n-p}{m-p}}{\binom{n}{m}}\mathbf{1}(S\subseteq G)\Bigg)^{2},
\]
and therefore
\[
\operatorname{Var}_b\big[\hat{P}_{b1}(R)\big]
=\Bigg(\frac{1}{\binom{m}{p}|\mathrm{Iso}(R)|}\Bigg)^{2}\Bigg[\sum_{S\subseteq K_n,\,S\cong R}\frac{\binom{n-p}{m-p}}{\binom{n}{m}}\mathbf{1}(S\subseteq G)
+\sum_{k=0}^{p-1}\sum_{j=1}^{g_k}\ \sum_{S\subseteq K_n,\,S\cong W_{jk}}\frac{\binom{n-(2p-k)}{m-(2p-k)}}{\binom{n}{m}}\mathbf{1}(S\subseteq G)\Bigg]
-\Bigg(\frac{1}{\binom{m}{p}|\mathrm{Iso}(R)|}\sum_{S\subseteq K_n,\,S\cong R}\frac{\binom{n-p}{m-p}}{\binom{n}{m}}\mathbf{1}(S\subseteq G)\Bigg)^{2}.
\]

A2.
Proof of Theorem 3.1.

Proof. (i) Let us first find the expectation of $\hat{P}_{b1}(R)$ under the sampling distribution, conditional on the given data $G$:
\begin{align*}
\mathrm{E}_b\Bigg[\frac{1}{\binom{m}{p}|\mathrm{Iso}(R)|}\sum_{S\subseteq K_m,\,S\cong R}\mathbf{1}(S\subseteq H)\,\Bigg|\,G\Bigg]
&=\frac{1}{\binom{m}{p}|\mathrm{Iso}(R)|}\sum_{S\subseteq K_m,\,S\cong R}\mathrm{E}_b\big[\mathbf{1}(S\subseteq H)\,\big|\,G\big]\\
&=\frac{1}{\binom{m}{p}|\mathrm{Iso}(R)|}\sum_{S\subseteq K_m,\,S\cong R}\frac{1}{\binom{n}{m}}\sum_{H\subseteq G,\,|H|=m}\mathbf{1}(S\subseteq H)\\
&=\frac{1}{\binom{m}{p}|\mathrm{Iso}(R)|}\sum_{\substack{S\subseteq K_n\\ S\cong R}}\frac{1}{\binom{n}{m}}\sum_{\substack{H\supseteq S,\,H\subseteq G\\ |H|=m}}\mathbf{1}(S\subseteq G)\\
&=\frac{1}{\binom{m}{p}|\mathrm{Iso}(R)|}\sum_{S\subseteq K_n,\,S\cong R}\frac{\binom{n-p}{m-p}}{\binom{n}{m}}\mathbf{1}(S\subseteq G)\\
&=\frac{1}{\binom{n}{p}|\mathrm{Iso}(R)|}\sum_{S\subseteq K_n,\,S\cong R}\mathbf{1}(S\subseteq G),
\end{align*}
where the last step uses the identity $\binom{n}{m}\binom{m}{p}=\binom{n}{p}\binom{n-p}{m-p}$. So we have
\[
\mathrm{E}_b[\bar{P}_{B1}(R)\,|\,G]=\mathrm{E}_b[\hat{P}_{b1}(R)\,|\,G]=\hat{P}(R).
\]

(ii) Given $G$,
\[
\operatorname{Var}_b[\rho^{-e}\bar{P}_{B1}(R)\,|\,G]
=\rho^{-2e}\frac{1}{B^{2}}\Bigg(\sum_{b=1}^{B}\operatorname{Var}_b[\hat{P}_{b1}(R)]
+\sum_{\substack{b,b'=1\\ b\neq b'}}^{B}\operatorname{Cov}_b\big(\hat{P}_{b1}(R),\hat{P}_{b'1}(R)\big)\Bigg).
\]
Now, from the formula for $\operatorname{Var}_b[\hat{P}_{b1}(R)]$ in A1, we get
\[
\operatorname{Var}_b\big[\rho_n^{-e}\hat{P}_{b1}(R)\big]
=O\Bigg(\frac{1}{m^{p}\rho_n^{e}}\Bigg)+O\Bigg(\frac{\rho_n^{e_W}m^{p_W}}{\rho_n^{2e}m^{2p}}\Bigg)
=O\Bigg(\frac{1}{m^{p}\rho_n^{e}}\Bigg)+O\Big(\frac{1}{m}\Big),
\]
where $W=S\cup S'$, $p_W=|V(W)|$ and $e_W=|E(W)|$.

(iii) Here, we use properties of the underlying model. Let us condition on $\xi=\{\xi_1,\dots,\xi_n\}$ and on the whole graph $G$ separately. Conditioning on $\xi$, we get the main term of $\hat{P}(R)$ to be
\[
(0.1)\qquad \mathrm{E}\big(\rho^{-e}\hat{P}(R)\,\big|\,\xi\big)
=\frac{1}{\binom{n}{p}|\mathrm{Iso}(R)|}\sum_{S\subseteq K_n,\,S\cong R}\ \prod_{(i,j)\in E(S)}w(\xi_i,\xi_j)+O(n^{-1}\lambda_n).
\]
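The conditional unbiasedness in part (i) can be checked exhaustively on a toy graph. The sketch below is not part of the original supplement; the graph, the sizes $n=6$, $m=4$, and the helper name `p_hat` are all illustrative. It takes $R$ to be a single edge (so $p=2$ and $|\mathrm{Iso}(R)|=1$) and verifies that averaging the edge density over every induced $m$-vertex subgraph $H$ reproduces $\hat{P}(R)$ computed on $G$:

```python
from itertools import combinations
from math import comb

# Toy check that the exhaustive subsample average of the edge density
# equals the full-graph edge density P-hat(R), for R = a single edge.
n, m, p = 6, 4, 2
edges = {(0, 1), (1, 2), (2, 3), (3, 4), (0, 5), (2, 5)}  # hypothetical G

def p_hat(vertices):
    """Density of R = edge among the p-subsets of the given vertex set."""
    k = len(list(vertices))
    count = sum(1 for s in combinations(sorted(vertices), p) if s in edges)
    return count / comb(k, p)

# E_b[P-hat_b1(R) | G]: average over every induced subgraph H on m vertices.
exhaustive_mean = sum(p_hat(H) for H in combinations(range(n), m)) / comb(n, m)

print(exhaustive_mean, p_hat(range(n)))  # the two densities agree
```

The agreement is exact (up to floating-point rounding) because each copy of $R$ lies in exactly $\binom{n-p}{m-p}$ of the $\binom{n}{m}$ subsamples, which is the hypergeometric factor appearing in the derivation above.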
We shall use the same decomposition of $(\rho_n^{-e}\bar{P}_{B1}(R)-\tilde{P}(R))$ as used in [1]:
\[
\rho_n^{-e}\bar{P}_{B1}(R)-\tilde{P}(R)
=\rho_n^{-e}\big(\bar{P}_{B1}-\mathrm{E}_b[\hat{P}_{b1}(R)\,|\,G]\big)
+\rho_n^{-e}\big(\hat{P}(R)-\mathrm{E}(\hat{P}(R)\,|\,\xi)\big)
+\big(\mathrm{E}(\hat{P}(R)\,|\,\xi)\rho_n^{-e}-\tilde{P}(R)\big).
\]
Let us define
\begin{align*}
U_1&=\rho_n^{-e}\big(\bar{P}_{B1}-\mathrm{E}_b[\hat{P}_{b1}(R)\,|\,G]\big),\\
U_2&=\rho_n^{-e}\big(\hat{P}(R)-\mathrm{E}(\hat{P}(R)\,|\,\xi)\big),\\
U_3&=\mathrm{E}(\hat{P}(R)\,|\,\xi)\rho_n^{-e}-\tilde{P}(R).
\end{align*}
Now, it is easy to see that
\begin{align*}
\operatorname{Var}\big(\rho^{-e}\bar{P}_{B1}(R)\big)
&=\mathrm{E}\big(\operatorname{Var}(\rho^{-e}\bar{P}_{B1}(R)\,|\,G)\big)+\operatorname{Var}\big(\mathrm{E}(\rho^{-e}\bar{P}_{B1}(R)\,|\,G)\big)\\
&=\mathrm{E}\big(\operatorname{Var}(\rho^{-e}(\bar{P}_{B1}(R)-\hat{P}(R))\,|\,G)\big)+\operatorname{Var}\big(\rho^{-e}\hat{P}(R)\big)\\
&=\mathrm{E}(\operatorname{Var}(U_1\,|\,G))+\mathrm{E}\big(\operatorname{Var}(\rho^{-e}\hat{P}(R)\,|\,\xi)\big)+\operatorname{Var}\big(\mathrm{E}(\rho^{-e}\hat{P}(R)\,|\,\xi)\big)\\
&=\mathrm{E}(\operatorname{Var}(U_1\,|\,G))+\mathrm{E}(\operatorname{Var}(U_2\,|\,\xi))+\operatorname{Var}(U_3).
\end{align*}
We shall now examine the behavior of $\operatorname{Var}(U_1\,|\,G)=\operatorname{Var}_b[\rho^{-e}\bar{P}_{B1}(R)\,|\,G]$. From (ii) we get that $\operatorname{Var}_b[\rho_n^{-e}\hat{P}_{b1}(R)]=O\big(\frac{1}{m^{p}\rho_n^{e}}\vee\frac{1}{m}\big)$. Similarly, following steps like those of the variance calculation in Appendix A1, $\operatorname{Cov}_b[\rho_n^{-e}\hat{P}_{b1}(R),\rho_n^{-e}\hat{P}_{b'1}(R)]=O(1/m)$ for acyclic and $k$-cycle $R$. If we take the uniform selection probability for the bootstrap to be $\gamma$, then $B=O(\gamma n^{m})$. Note that if $E(H_b)\cap E(H_{b'})=\emptyset$, then $\operatorname{Cov}_b(\hat{P}_{b1}(R),\hat{P}_{b'1}(R))=0$, and the number of pairs such that $E(H_b)\cap E(H_{b'})\neq\emptyset$ is $O(m^{2}\gamma^{2}n^{2m-2})$. Also, the number of edges in the leading term of the covariance is at least $2e$. So,
\[
\mathrm{E}\big(\operatorname{Var}_b[\rho^{-e}\bar{P}_{B1}(R)\,|\,G]\big)
=O\Bigg(\frac{1}{B(m^{p}\rho_n^{e}\wedge m)}\Bigg)+O\Bigg(\frac{m^{2}\gamma^{2}n^{2m-2}}{m\gamma^{2}n^{2m}}\Bigg)
=O\Bigg(\frac{1}{B(m^{p}\rho_n^{e}\wedge m)}+\frac{m}{n^{2}}\Bigg)
=O\Bigg(\frac{1}{B(m^{p}\rho_n^{e}\wedge m)}\Bigg)+o(n^{-1}).
\]
The last equality follows since $m/n\to 0$ as $n\to\infty$. So, since $B(m^{p}\rho_n^{e}\wedge m)>O(n)$, we have $\mathrm{E}(\operatorname{Var}(U_1\,|\,G))=o(n^{-1})$. Now, by the proof of Theorem 1 in [1], we have
\[
\operatorname{Var}(U_2)=o(n^{-1}),\qquad \operatorname{Var}(U_3)=o(n^{-1}).
\]
So we get $\operatorname{Var}(\rho^{-e}\bar{P}_{B1}(R))=o(n^{-1})$. Since we already know the $\sqrt{n}$-consistency of $\rho_n^{-e}\big(\hat{P}(R)-\tilde{P}(R)\big)$, this proves the $\sqrt{n}$-consistency of $\rho_n^{-e}\bar{P}_{B1}(R)$ for $\rho_n^{-e}\hat{P}(R)$.

A3. Proof of Theorem 3.2. For the variance calculation, we also need the joint inclusion probability of two items $S,S'\in\mathcal{S}_p$, which are subgraphs of $G$ induced by the vertex sets $\{w_1,\dots,w_p\}$ and $\{w'_1,\dots,w'_p\}$ respectively, where we take $w_{i+1}\sim w_i$ and $w'_{i+1}\sim w'_i$, $i=1,\dots,p-1$. So,
\begin{align*}
\pi_{SS'}&\equiv\text{inclusion probability of $S$ and $S'$}\\
&=\mathbb{P}\big[(w_1,\dots,w_p)\text{ is selected and }(w'_1,\dots,w'_p)\text{ is selected}\big]
=\prod_{d=1}^{p}(q_d)^{z_{1d}}\prod_{d=1}^{p}\big(q_d^{2}\big)^{z_{2d}},
\end{align*}
where
\[
z_{1d}=\begin{cases}\mathbf{1}(w_d=w'_d),&d=1,\\[2pt] \mathbf{1}\big((w_d,w_{d-1})=(w'_d,w'_{d-1})\big),&d=2,\dots,p,\end{cases}
\qquad
z_{2d}=\begin{cases}\mathbf{1}(w_d\neq w'_d),&d=1,\\[2pt] \mathbf{1}\big((w_d,w_{d-1})\neq(w'_d,w'_{d-1})\big),&d=2,\dots,p.\end{cases}
\]

(i) We know that $\hat{P}_{b2}(R)$ is a Horvitz–Thompson estimator, with inclusion probability of each population unit $\pi=\prod_{d=1}^{p}q_d$. So, according to sampling theory [2], $\hat{P}_{b2}(R)$ is an unbiased estimator of $\hat{P}(R)$ given the network $G$, provided $\mathbb{P}(\hat{P}_{b2}(R)=0\,|\,\hat{P}(R))\to 0$ as $n\to\infty$. Now, $\mathbb{P}(\hat{P}_{b2}(R)=0\,|\,\hat{P}(R))\leq(1-q_d)^{\lambda_n}$ for all $d=1,\dots,p$, and $(1-q_d)^{\lambda_n}\to 0$ if $\lambda_n q_d\to\infty$. So, under the conditions $\lambda_n q_d\to\infty$ and $q_d\to 0$ as $n\to\infty$, $\hat{P}_{b2}(R)$ is an asymptotically unbiased estimator of $\hat{P}(R)$.

(ii) The variance of $\hat{P}_{b2}(R)$ coming from the bootstrap sampling alone is given by
\[
\operatorname{Var}_b\big[\rho_n^{-e}\hat{P}_{b2}(R)\big]
=\frac{1}{N^{2}}\Bigg[\frac{1-\pi}{\pi}\sum_{S\in\mathcal{S}_p}\mathbf{1}(S\cong R)
+\sum_{\substack{S,S'\in\mathcal{S}_p\\ S\neq S'}}\frac{\pi_{SS'}-\pi^{2}}{\pi^{2}}\mathbf{1}(S\cong R,\,S'\cong R)\Bigg],
\qquad\text{where } N=\rho_n^{e}\binom{n}{p}|\mathrm{Iso}(R)|.
\]
From the formula for $\operatorname{Var}_b[\hat{P}_{b2}(R)\,|\,G]$, we see that the covariance terms vanish when $\pi_{SS'}=\pi^{2}$. Now, if $q_1=1$, then $\pi_{SS'}=\pi^{2}$ whenever $E(S)\cap E(S')=\emptyset$, and the number of pairs such that $E(S)\cap E(S')\neq\emptyset$ is $O(p^{2}n^{2p-2})$. The condition $q_1=1$ is a bit restrictive. If, instead, we have $q_1\to 1$ as $n\to\infty$, then the highest-order covariance term comes from the case where $E(S)\cap E(S')\neq\emptyset$ but the root nodes are the same, that is, $w_1=w'_1$. So, for some constant $C>0$,
\begin{align*}
\frac{1}{N^{2}}\sum_{\substack{S,S'\in\mathcal{S}_p\\ S\neq S'}}\frac{\pi_{SS'}-\pi^{2}}{\pi^{2}}\mathbf{1}(S\cong R,\,S'\cong R)
&\leq\frac{C}{N^{2}}\sum_{\substack{S,S'\in\mathcal{S}_p\\ S\neq S'}}\frac{q_1-q_1^{2}}{q_1^{2}}\mathbf{1}(S\cong R,\,S'\cong R)\\
&=O\Bigg(\Big(\frac{1}{q_1}-1\Big)\frac{n^{2p-1}}{n^{2p}}\Bigg)
=O\Bigg(\Big(\frac{1}{q_1}-1\Big)\frac{1}{n}\Bigg).
\end{align*}
Now, for the variance term to vanish we need the conditions $q_1=1$ (or $q_1\to 1$), and $q_d\to 0$ and $\lambda_n q_d\to\infty$ for $d=2,\dots,p$, as $n\to\infty$.
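Part (i) above rests on the design-unbiasedness of the Horvitz–Thompson estimator. As a minimal illustration, the sketch below is not the paper's sequential vertex-sampling scheme: it assumes, for simplicity, that every population unit is included independently with the same probability $\pi$, and the population values are made up. The unbiasedness $\mathrm{E}[\hat{T}_{HT}]=\sum_i v_i$ can then be verified by exhaustive enumeration of inclusion patterns:

```python
from itertools import product

# Illustrative Horvitz-Thompson sketch with independent Bernoulli(pi)
# inclusion for each unit (a simplification of the sequential scheme).
values = [3.0, 1.0, 4.0, 1.0, 5.0]   # hypothetical population units
pi = 0.3                              # common inclusion probability

def ht_estimate(included):
    # Horvitz-Thompson: sum over sampled units, each weighted by 1/pi.
    return sum(v / pi for v, flag in zip(values, included) if flag)

# Exact expectation: enumerate all 2^5 inclusion patterns with their
# Bernoulli probabilities and average the estimator.
expectation = 0.0
for pattern in product([0, 1], repeat=len(values)):
    prob = 1.0
    for flag in pattern:
        prob *= pi if flag else (1 - pi)
    expectation += prob * ht_estimate(pattern)

print(expectation, sum(values))  # the expectation matches the total
```

In the paper's scheme the inclusion probability is $\pi=\prod_d q_d$ and the weights $1/\pi$ play the same role; the covariance terms studied in (ii) arise precisely because the joint inclusion probability $\pi_{SS'}$ need not factor as $\pi^2$.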
Since we know that $O(1)\leq\lambda_n\leq O(n)$, we get $nq_d\to\infty$ for $d=2,\dots,p$ as $n\to\infty$. So we have
\[
\frac{1}{N^{2}}\,\frac{1-\pi}{\pi}\sum_{S\in\mathcal{S}_p}\mathbf{1}(S\cong R)
=\Big(\frac{1}{\pi}-1\Big)O\Big(\frac{1}{n^{p}\rho_n^{e}}\Big)
=O\Big(\frac{1}{n^{p}\rho_n^{e}\pi}\Big)
=O\Bigg(\frac{1}{n\rho_n^{e-p+1}}\prod_{d=2}^{p}\frac{1}{\lambda_n q_d}\Bigg).
\]
So,
\[
\mathrm{E}\Big(\operatorname{Var}_b\big[\rho_n^{-e}\hat{P}_{b2}(R)\big]\Big)
=O\Bigg(\Big(\frac{1}{q_1}-1\Big)\frac{1}{n}\Bigg)
+O\Bigg(\frac{1}{n\rho_n^{e-p+1}}\prod_{d=2}^{p}\frac{1}{\lambda_n q_d}\Bigg).
\]

(iii) We shall use the same decomposition of $(\rho_n^{-e}\bar{P}_{B2}(R)-\tilde{P}(R))$ as used in [1]:
\[
\rho_n^{-e}\bar{P}_{B2}(R)-\tilde{P}(R)
=\rho_n^{-e}\big(\bar{P}_{B2}-\mathrm{E}_b[\hat{P}_{b2}(R)\,|\,G]\big)
+\rho_n^{-e}\big(\hat{P}(R)-\mathrm{E}(\hat{P}(R)\,|\,\xi)\big)
+\big(\mathrm{E}(\hat{P}(R)\,|\,\xi)\rho_n^{-e}-\tilde{P}(R)\big).
\]
Let us define
\begin{align*}
U_1&=\rho_n^{-e}\big(\bar{P}_{B2}-\mathrm{E}_b[\hat{P}_{b2}(R)\,|\,G]\big),\\
U_2&=\rho_n^{-e}\big(\hat{P}(R)-\mathrm{E}(\hat{P}(R)\,|\,\xi)\big),\\
U_3&=\mathrm{E}(\hat{P}(R)\,|\,\xi)\rho_n^{-e}-\tilde{P}(R).
\end{align*}
Now, it is easy to see that
\begin{align*}
\operatorname{Var}\big(\rho^{-e}\bar{P}_{B2}(R)\big)
&=\mathrm{E}\big(\operatorname{Var}(\rho^{-e}\bar{P}_{B2}(R)\,|\,G)\big)+\operatorname{Var}\big(\mathrm{E}(\rho^{-e}\bar{P}_{B2}(R)\,|\,G)\big)\\
&=\mathrm{E}(\operatorname{Var}(U_1\,|\,G))+\mathrm{E}(\operatorname{Var}(U_2\,|\,\xi))+\operatorname{Var}(U_3).
\end{align*}
We shall examine the behavior of $\operatorname{Var}_b[\rho_n^{-e}\hat{P}_{b2}(R)\,|\,G]$; from (ii),
\[
\mathrm{E}\Big(\operatorname{Var}_b\big[\rho_n^{-e}\hat{P}_{b2}(R)\big]\Big)
=O\Bigg(\Big(\frac{1}{q_1}-1\Big)\frac{1}{n}\Bigg)
+O\Bigg(\frac{1}{n\rho_n^{e-p+1}}\prod_{d=2}^{p}\frac{1}{\lambda_n q_d}\Bigg).
\]
Now, since the bootstrap samples for subgraph sampling are selected independently, we have
\[
\mathrm{E}\Big(\operatorname{Var}_b\big[\rho_n^{-e}\bar{P}_{B2}(R)\big]\Big)
=O\Bigg(\Big(\frac{1}{q_1}-1\Big)\frac{1}{nB}\Bigg)
+O\Bigg(\frac{1}{Bn\rho_n^{e-p+1}}\prod_{d=2}^{p}\frac{1}{\lambda_n q_d}\Bigg).
\]
Now, under the conditions $\frac{1}{B}\big(\frac{1}{q_1}-1\big)\to 0$, $q_d\to 0$ for all $d=1,\dots,p$, and $B\prod_{d=2}^{p}q_d\geq\frac{1}{n^{p-1}\rho_n^{e}}$, we have
\[
\mathrm{E}(\operatorname{Var}(U_1\,|\,G))=\mathrm{E}\big(\operatorname{Var}_b[\rho_n^{-e}\hat{P}_{b2}(R)\,|\,G]\big)=o(n^{-1}).
\]
Now, by the proof of Theorem 1 in [1], we have $\operatorname{Var}(U_2)=o(n^{-1})$ and $\operatorname{Var}(U_3)=o(n^{-1})$. So we get $\operatorname{Var}(\rho^{-e}\bar{P}_{B2}(R))=o(n^{-1})$. Since we already know the $\sqrt{n}$-consistency of $\rho_n^{-e}\big(\hat{P}(R)-\tilde{P}(R)\big)$, this proves the $\sqrt{n}$-consistency of $\rho_n^{-e}\bar{P}_{B2}(R)$ for $\rho_n^{-e}\hat{P}(R)$.

B1. Proof of Proposition 6.
\begin{align*}
\sigma^{2}(R;\rho)=\operatorname{Var}\big[\rho^{-e}\hat{P}(R)\big]
&=\operatorname{Var}\Bigg[\frac{1}{\rho^{e}\binom{n}{p}|\mathrm{Iso}(R)|}\sum_{S\subseteq K_n,\,S\cong R}\mathbf{1}(S\subseteq G)\Bigg]\\
&=\frac{1}{\big(\rho^{e}\binom{n}{p}|\mathrm{Iso}(R)|\big)^{2}}\,\mathrm{E}\Bigg[\Bigg(\sum_{S\subseteq K_n,\,S\cong R}\mathbf{1}(S\subseteq G)\Bigg)^{2}\Bigg]-\tilde{P}(R)^{2}\\
&=\frac{1}{\big(\rho^{e}\binom{n}{p}|\mathrm{Iso}(R)|\big)^{2}}\,\mathrm{E}\sum_{\substack{S,T\subseteq K_n\\ S,T\cong R,\,S\cap T\neq\emptyset}}\mathbf{1}(S,T\subseteq G)
-\Bigg(1-\frac{\binom{n-p}{p}}{\binom{n}{p}}\Bigg)\tilde{P}(R)^{2}.
\end{align*}
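The pair-counting step above can be sanity-checked numerically for the simplest case, $R$ a single edge ($p=2$, $e=1$), under the illustrative assumption (not the paper's general model) that edges are independent with common probability $\rho$. Computing $\mathrm{E}\big[(\sum_S \mathbf{1}(S\subseteq G))^2\big]$ by enumerating ordered pairs $(S,T)$ and subtracting the squared mean recovers the classical binomial variance:

```python
from itertools import combinations
from math import comb

# Sanity check of the second-moment pair decomposition for R = an edge,
# assuming an Erdos-Renyi G(n, rho) with independent edges (illustration only).
n, rho = 6, 0.2
pairs = list(combinations(range(n), 2))   # all candidate copies S of R

# E[sum over ordered pairs (S, T) of 1(S, T in G)], using
# P(S, T in G) = rho ** (number of distinct edges in S union T).
second_moment = 0.0
for S in pairs:
    for T in pairs:
        union_edges = len({S, T})         # 1 if S == T, else 2
        second_moment += rho ** union_edges

mean = comb(n, 2) * rho
variance = second_moment - mean ** 2
print(variance, comb(n, 2) * rho * (1 - rho))  # both equal C(n,2) rho (1-rho)
```

Under edge-independence every pair with $S\neq T$ contributes $\rho^{2}$ whether or not $S$ and $T$ share a vertex, so only the diagonal $S=T$ survives the subtraction; in the general model, the overlapping pairs are exactly the terms that do not cancel, which is what the display above isolates.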
If $R$ is a connected subgraph, then we can write
\[
\frac{1}{\big(\rho^{e}\binom{n}{p}|\mathrm{Iso}(R)|\big)^{2}}\,\mathrm{E}\sum_{\substack{S,T\subseteq K_n\\ S,T\cong R,\,S\cap T\neq\emptyset}}\mathbf{1}(S,T\subseteq G)
=\frac{1}{\big(\rho^{e}\binom{n}{p}|\mathrm{Iso}(R)|\big)^{2}}\sum_{\substack{W:\,W=S\cup T\\ S,T\cong R,\,S\cap T\neq\emptyset}}\mathrm{E}\sum_{W\subseteq K_n}\mathbf{1}(W\subseteq G).
\]
Similarly,
\begin{align*}
\sigma(R_1,R_2;\rho)&=\operatorname{Cov}\big(\rho^{-e_1}\hat{P}(R_1),\,\rho^{-e_2}\hat{P}(R_2)\big)\\
&=\frac{1}{\rho^{e_1+e_2}\binom{n}{p_1}\binom{n}{p_2}|\mathrm{Iso}(R_1)||\mathrm{Iso}(R_2)|}\,
\mathrm{E}\Bigg[\Bigg(\sum_{\substack{S\subseteq K_n\\ S\cong R_1}}\mathbf{1}(S\subseteq G)\Bigg)\Bigg(\sum_{\substack{S\subseteq K_n\\ S\cong R_2}}\mathbf{1}(S\subseteq G)\Bigg)\Bigg]-\tilde{P}(R_1)\tilde{P}(R_2)\\
&=\frac{1}{\rho^{e_1+e_2}\binom{n}{p_1}\binom{n}{p_2}|\mathrm{Iso}(R_1)||\mathrm{Iso}(R_2)|}\,
\mathrm{E}\sum_{\substack{S,T\subseteq K_n\\ S\cong R_1,\,T\cong R_2,\,S\cap T\neq\emptyset}}\mathbf{1}(S,T\subseteq G)
-\Bigg(1-\frac{\binom{n-p_1}{p_2}}{\binom{n}{p_2}}\Bigg)\tilde{P}(R_1)\tilde{P}(R_2).
\end{align*}
If $R_1$ and $R_2$ are connected subgraphs, then we can write
\[
\frac{\mathrm{E}\sum_{\substack{S,T\subseteq K_n\\ S\cong R_1,\,T\cong R_2,\,S\cap T\neq\emptyset}}\mathbf{1}(S,T\subseteq G)}{\rho^{e_1+e_2}\binom{n}{p_1}\binom{n}{p_2}|\mathrm{Iso}(R_1)||\mathrm{Iso}(R_2)|}
=\frac{\sum_{\substack{W:\,W=S\cup T,\,S\cong R_1\\ T\cong R_2,\,S\cap T\neq\emptyset}}\mathrm{E}\sum_{W\subseteq K_n}\mathbf{1}(W\subseteq G)}{\rho^{e_1+e_2}\binom{n}{p_1}\binom{n}{p_2}|\mathrm{Iso}(R_1)||\mathrm{Iso}(R_2)|}.
\]

B2. Proof of Lemma 7. Let us define
\[
\tilde{\sigma}^{2}(R)=\frac{1/(1-x)}{\big(\rho_n^{e}\binom{n}{p}|\mathrm{Iso}(R)|\big)^{2}}
\sum_{\substack{W\subseteq K_n:\,W=S\cup T\\ S,T\cong R,\,|S\cap T|=1,p}}\mathbf{1}(W\subseteq G)
-\frac{x\rho_n^{-2e}\hat{P}(R)^{2}}{1-x},
\qquad\text{where } x=1-\frac{((n-p)!)^{2}}{n!\,(n-2p)!},
\]
and
\[
\tilde{\sigma}(R_1,R_2)=\frac{1/(1-y)}{\rho_n^{e_1+e_2}\binom{n}{p_1}\binom{n}{p_2}|\mathrm{Iso}(R_1)||\mathrm{Iso}(R_2)|}
\sum_{\substack{W\subseteq K_n,\,W=S\cup T\\ S\cong R_1,\,T\cong R_2,\,|S\cap T|=1}}\mathbf{1}(W\subseteq G)
-\frac{y\rho_n^{-(e_1+e_2)}\hat{P}(R_1)\hat{P}(R_2)}{1-y},
\]
where $y=1-\frac{(n-p_1)!\,(n-p_2)!}{n!\,(n-p_1-p_2)!}$. Now,
\begin{align*}
\mathrm{E}\big[\tilde{\sigma}^{2}(R)\big]
&=\frac{1/(1-x)}{\big(\rho_n^{e}\binom{n}{p}|\mathrm{Iso}(R)|\big)^{2}}\,
\mathrm{E}\sum_{\substack{W\subseteq K_n:\,W=S\cup T\\ S,T\cong R,\,|S\cap T|=1,p}}\mathbf{1}(W\subseteq G)
-\frac{x\rho_n^{-2e}\,\mathrm{E}\big[\hat{P}(R)^{2}\big]}{1-x}\\
&=\frac{1/(1-x)}{\big(\rho_n^{e}\binom{n}{p}|\mathrm{Iso}(R)|\big)^{2}}\,
\mathrm{E}\sum_{\substack{W\subseteq K_n:\,W=S\cup T\\ S,T\cong R,\,|S\cap T|=1,p}}\mathbf{1}(W\subseteq G)
-\frac{x\rho_n^{-2e}\big(\operatorname{Var}(\hat{P}(R))+P(R)^{2}\big)}{1-x}\\
&=\frac{1/(1-x)}{\big(\rho_n^{e}\binom{n}{p}|\mathrm{Iso}(R)|\big)^{2}}
\Bigg[\mathrm{E}\sum_{\substack{W=S\cup T\\ S,T\cong R,\,S\cap T\neq\emptyset}}\mathbf{1}(W\subseteq G)
-\mathrm{E}\sum_{\substack{W=S\cup T\\ S,T\cong R,\,1<|S\cap T|<p}}\mathbf{1}(W\subseteq G)\Bigg]
-\frac{x\tilde{P}(R)^{2}}{1-x}-\frac{x\rho_n^{-2e}\operatorname{Var}\big(\hat{P}(R)\big)}{1-x}\\
&=\frac{1}{1-x}\operatorname{Var}\big(\rho_n^{-e}\hat{P}(R)\big)
-\frac{x\rho_n^{-2e}\operatorname{Var}\big(\hat{P}(R)\big)}{1-x}
-o\Big(\operatorname{Var}\big(\rho_n^{-e}\hat{P}(R)\big)\Big)\\
&=\operatorname{Var}\big(\rho_n^{-e}\hat{P}(R)\big)-o\Big(\operatorname{Var}\big(\rho_n^{-e}\hat{P}(R)\big)\Big).
\end{align*}
Similarly, we get
\[
\mathrm{E}\big[\tilde{\sigma}(R_1,R_2)\big]
=\operatorname{Cov}\big(\rho_n^{-e_1}\hat{P}(R_1),\rho_n^{-e_2}\hat{P}(R_2)\big)
-o\Big(\operatorname{Cov}\big(\rho_n^{-e_1}\hat{P}(R_1),\rho_n^{-e_2}\hat{P}(R_2)\big)\Big).
\]
Now, from Theorem 1(a) in [1], we know that as $\lambda_n\to\infty$, if $\hat{\rho}_n=\bar{D}/(n-1)$ as defined in (2.6), then $\hat{\rho}_n/\rho_n\overset{P}{\to}1$. So, using the estimate $\hat{\rho}_n$, we get
\[
\frac{\hat{\sigma}^{2}(R)}{\sigma^{2}(R;\rho)}\overset{P}{\to}1,
\qquad
\frac{\hat{\sigma}(R_1,R_2)}{\sigma(R_1,R_2;\rho)}\overset{P}{\to}1.
\]

B3. Proof of Lemma 8. Given $G$,
\begin{align*}
\mathrm{E}\big[\hat{\sigma}^{2}_{Bi}(R)\,\big|\,G\big]
&=\sum_{\substack{W=S\cup T,\,S,T\cong R\\ |S\cap T|=1,p}}
\frac{\hat{\rho}_n^{e_W}\binom{n}{p_W}|\mathrm{Iso}(W)|}{(1-x)\big(\hat{\rho}_n^{e}\binom{n}{p}|\mathrm{Iso}(R)|\big)^{2}}\,
\mathrm{E}\big[\bar{P}_{Bi}(W)\,\big|\,G\big]
-\frac{x\,\mathrm{E}\big[\hat{\rho}_n^{-2e}\bar{P}_{Bi}(R)^{2}\,\big|\,G\big]}{1-x}\\
&=\sum_{\substack{W=S\cup T,\,S,T\cong R\\ |S\cap T|=1,p}}
\frac{\hat{\rho}_n^{e_W}\binom{n}{p_W}|\mathrm{Iso}(W)|}{(1-x)\big(\hat{\rho}_n^{e}\binom{n}{p}|\mathrm{Iso}(R)|\big)^{2}}\,\hat{P}(W)
-\frac{x\hat{\rho}_n^{-2e}\hat{P}(R)^{2}}{1-x}
-\frac{x}{1-x}\operatorname{Var}\big(\hat{T}_{Bi}(R)\big)\\
&=\hat{\sigma}^{2}(R)-\frac{x}{1-x}\operatorname{Var}\big(\hat{T}_{Bi}(R)\big)
=\hat{\sigma}^{2}(R)-o\big(\hat{\sigma}^{2}(R)\big),
\end{align*}
where the last equality follows since $x=O(1/n)$, using Theorem 3.1 and Theorem 3.2 for $i=1,2$. Similarly, we get
\[
\mathrm{E}\big[\hat{\sigma}_{Bi}(R_1,R_2)\,\big|\,G\big]=\hat{\sigma}(R_1,R_2)-o\big(\hat{\sigma}^{2}(R)\big).
\]
So, using Lemma 7, we have
\[
\frac{\hat{\sigma}^{2}_{Bi}(R)}{\sigma^{2}(R;\rho)}\overset{P}{\to}1,
\qquad
\frac{\hat{\sigma}_{Bi}(R_1,R_2)}{\sigma(R_1,R_2;\rho)}\overset{P}{\to}1
\qquad\text{for } i=1,2.
\]

References.

[1] Bickel, P. J., Chen, A. and Levina, E. (2011). The method of moments and degree distributions for network models. Ann. Statist. 39 2280–2301. MR2906868
[2] Thompson, S. K. (2012). Sampling, third ed.
Wiley Series in Probability and Statistics. John Wiley & Sons, Inc., Hoboken, NJ. MR2894042 (2012k:62024)

Department of Statistics
44 Kidder Hall
Corvallis, OR 97331
E-mail: bhattash@science.oregonstate.edu

Department of Statistics
367 Evans Hall
Berkeley, CA 94720
E-mail: bickel@stat.berkeley.edu