SUPPLEMENT TO “ESTIMATION AND MODEL SELECTION IN GENERALIZED ADDITIVE PARTIAL LINEAR MODELS FOR CORRELATED DATA WITH DIVERGING NUMBER OF COVARIATES”

By Li Wang, Lan Xue, Annie Qu, and Hua Liang

University of Georgia, Oregon State University, University of Illinois at Urbana-Champaign, and George Washington University

In this document, we collect a number of technical lemmas and their proofs. The technical lemmas are used in the proofs of Theorems 1-5 in the paper.

S.1. Preliminary lemmas. Let $B_l = (B_{l,1}, \ldots, B_{l,J_n})^T$ be a set of orthonormal bases of $\varphi_l^{0,n}$ with respect to the inner product defined in (A.1) of the main text, for $l = 1, \ldots, d_x$. Let $B = (B_1^T, \ldots, B_{d_x}^T)^T$; then $B$ is a set of bases for $M_n$. In the following, let $a$ be a unit $d_n$-vector unless otherwise specified. Recall $\rho_n = ((1-\delta)/2)^{(d_x-1)/2}$.

Lemma S.1. Under Condition (C1), there exists a constant $c > 0$ such that
\[
\|\alpha\|^2 \ge c\rho_n^2 \Big( \alpha_0^2 + \sum_{l=1}^{d_x} \|\alpha_l\|^2 \Big), \qquad \forall\, \alpha = \alpha_0 + \sum_{l=1}^{d_x} \alpha_l \in M.
\]

Proof. The result follows from an extension of Lemma 3 in [3].

Lemma S.2. Under Conditions (C2), (C3) and (C8), there exist constants $C \ge c > 0$ such that, as $n \to \infty$,
\[
c\rho_n^2 \le E\Big\{ a^T \Big( n^{-1} \sum_{i=1}^n D_i^T D_i \Big) a \Big\} \le C
\]
and
\[
c\rho_n^2 \le a^T \Big( n^{-1} \sum_{i=1}^n D_i^T D_i \Big) a \le C,
\]
except in an event whose probability tends to zero.

The proof of Lemma S.2 is analogous to that of Lemma A.3 given by [4], replacing $D_i$ by $B_i$ and applying Lemma S.1, and is thus omitted.

In the following, define a $d_n$-vector $G_{n,k}(\theta) = \frac{1}{n} \sum_{i=1}^n g_{ik}(\theta)$, for $k = 1, \ldots, K$, where
\[
g_{ik}(\theta) = g_{ik}(\mu_i(\theta)) = D_i^T \Delta_i A_i^{-1/2} M_k A_i^{-1/2} \{Y_i - \mu_i(\theta)\}. \tag{S.1}
\]
Then $G_n$ in (2.3) of the main text can be written as $(G_{n,1}^T(\theta), \ldots, G_{n,K}^T(\theta))^T$. Note that the regression function $\eta_i(\theta) \equiv D_i^T \theta$ can be decomposed as $\eta_i(\theta) = \eta_{0,i} + \zeta_i(\theta)$, where
\[
\zeta_i(\theta) = (\alpha_0 - \tilde\alpha)(X_i) + B_i(\tilde\gamma - \gamma) + Z_i(\beta_0 - \beta) \equiv \zeta_{i1}(\alpha) + \zeta_{i2}(\gamma) + \zeta_{i3}(\beta). \tag{S.2}
\]
Thus,
\[
\mu_i(\theta) = \mu\{\eta_{0,i} + \zeta_i(\theta)\}. \tag{S.3}
\]
Similar to (S.1), we express
\[
g_{ik}^0(\theta) = D_i^T \Delta_{0,i}^T V_{0,i}^{(k)} \big[ e_i + \Delta_{0,i} \{\zeta_{i1}(\alpha) + \zeta_{i2}(\gamma) + \zeta_{i3}(\beta)\} \big] \equiv g_{ik1}^0 + g_{ik2}^0(\alpha) + g_{ik3}^0(\gamma) + g_{ik4}^0(\beta),
\]
where
\[
g_{ik1}^0 = D_i^T \Delta_{0,i} V_{0,i}^{(k)} e_i, \quad g_{ik2}^0(\alpha) = D_i^T \Gamma_{0,i}^{(k)} \zeta_{i1}(\alpha), \quad g_{ik3}^0(\gamma) = D_i^T \Gamma_{0,i}^{(k)} \zeta_{i2}(\gamma), \quad g_{ik4}^0(\beta) = D_i^T \Gamma_{0,i}^{(k)} \zeta_{i3}(\beta).
\]
Let $g_{k1}^0 = \frac{1}{n} \sum_{i=1}^n g_{ik1}^0$, and define $g_{k2}^0(\alpha)$, $g_{k3}^0(\gamma)$, $g_{k4}^0(\beta)$ similarly. Next define $G_{n,k}^0(\theta) = \frac{1}{n} \sum_{i=1}^n g_{ik}^0(\theta)$, then define $G_n^0(\theta)$, $C_n^0(\theta)$, $Q_n^0(\theta)$ in the same way as $G_n(\theta)$, $C_n(\theta)$, $Q_n(\theta)$ but replacing $g_{ik}$ with $g_{ik}^0$. Furthermore, let
\[
\Theta_n(C_1, C_2) = \big\{ \theta = (\beta^T, \gamma^T)^T : \|\beta - \beta_0\| = C_1 n^{-1/2} d_n^{1/2},\ \|B^T(\gamma - \tilde\gamma)\|_n = C_2 n^{-1/2} d_n^{1/2} \big\}
\]
for positive constants $C_1$ and $C_2$.

Lemma S.3. Under the conditions of Theorem 1, $\|g_{k1}^0\| = O_P(n^{-1/2})$ and $\|g_{k2}^0(\alpha)\| = O_P(J_n^{-r} d_x^{1/2})$. Furthermore, there exist constants $0 < c_2 \le c_1$, $0 < c_4 \le c_3$ such that for any $\theta \in \Theta_n(C_1, C_2)$,
\[
c_2 C_1 \rho_n n^{-1/2} d_n^{1/2} \le \|g_{k3}^0(\gamma)\| \le c_1 C_1 n^{-1/2} d_n^{1/2},
\]
\[
c_4 C_2 \rho_n n^{-1/2} d_n^{1/2} \le \|g_{k4}^0(\beta)\| \le c_3 C_2 n^{-1/2} d_n^{1/2},
\]
except in an event whose probability goes to 0 as $n \to \infty$.

Proof. For any $k = 1, \ldots, K$, $a^T g_{k1}^0$ has mean 0, and by Conditions (C5), (C6) and Lemma S.2, there exists a constant $c > 0$ such that
\[
\mathrm{Var}\big( a^T g_{k1}^0 \big) = \frac{1}{n} E\, a^T g_{ik1}^0 g_{ik1}^{0T} a \le \frac{c}{n} a^T E(D_i^T D_i) a \le \frac{c}{n} a^T a = O(n^{-1}).
\]
Therefore $\|g_{k1}^0\| = \sup_a |a^T g_{k1}^0| = O_P(n^{-1/2})$. The rest of the results follow similarly from Lemma S.2 and Lemma B.8 of the Supplement of [5].

Lemma S.4. Under Conditions (C1)–(C8), there exist constants $0 < c_6 \le c_5$, $0 < c_8 \le c_7$, $0 < c_{10} \le c_9$ such that for any $\theta \in \Theta_n(C_1, C_2)$,
\[
c_6 (C_1 + C_2) n^{-1/2} d_n^{1/2} \rho_n \le \|G_n^0(\theta)\| \le c_5 (C_1 + C_2) n^{-1/2} d_n^{1/2}, \tag{S.4}
\]
\[
c_8 \rho_n \le \lambda_{\min}\{C_n^0(\theta)\} < \lambda_{\max}\{C_n^0(\theta)\} \le c_7, \tag{S.5}
\]
\[
c_{10} (C_1 + C_2) n^{-1} d_n \le Q_n^0(\theta) \le c_9 (C_1 + C_2) \rho_n^{-1} n^{-1} d_n, \tag{S.6}
\]
except in an event whose probability goes to 0 as $n \to \infty$.

Proof. Equation (S.4) follows immediately from Lemma S.3.
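The step from Lemma S.3 to the upper bound in (S.4) can be spelled out with the triangle inequality applied to the four-term decomposition of $g_{ik}^0$. The sketch below is only an illustration: it assumes, which is not stated in this supplement, that the approximation term $\|g_{k2}^0(\alpha)\|$ is of order no larger than $n^{-1/2} d_n^{1/2}$ under the conditions of Theorem 1. It does not cover the lower bound in (S.4).

```latex
% Upper bound for one block G^0_{n,k}(theta), theta in Theta_n(C_1, C_2).
\begin{align*}
\|G_{n,k}^0(\theta)\|
  &= \big\| g_{k1}^0 + g_{k2}^0(\alpha) + g_{k3}^0(\gamma) + g_{k4}^0(\beta) \big\| \\
  &\le \|g_{k1}^0\| + \|g_{k2}^0(\alpha)\| + \|g_{k3}^0(\gamma)\| + \|g_{k4}^0(\beta)\| \\
  &\le O_P(n^{-1/2}) + \|g_{k2}^0(\alpha)\| + (c_1 C_1 + c_3 C_2)\, n^{-1/2} d_n^{1/2}.
\end{align*}
% Under the stated assumption the first two terms are O_P(n^{-1/2} d_n^{1/2}),
% so they can be absorbed into the last one when C_1 + C_2 is large enough.
% Stacking the K blocks (K fixed) then gives the right-hand side of (S.4).
```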
Next let $\tilde D_i = I_K \otimes D_i$, where $I_K$ is the identity matrix of order $K$, and $\otimes$ denotes the Kronecker product. Furthermore, let $M_{\mathrm{diag}} = \mathrm{diag}(M_1, \ldots, M_K)$. To show (S.5), Conditions (C5) and (C6) ensure that it is sufficient to show that there exist constants $c_*$ and $C^*$ such that when $n$ is large enough,
\[
c_* \rho_n \le \frac{1}{n} \sum_{i=1}^n a^T \tilde D_i^T M_{\mathrm{diag}}^T M_{\mathrm{diag}} \tilde D_i a \le C^*, \tag{S.7}
\]
for any $a = (a_1^T, \ldots, a_K^T)^T$ with each $a_k \in R^{d_n}$ and $a^T a = 1$. The eigenvalues of $M_{\mathrm{diag}}^T M_{\mathrm{diag}}$ are the squares of the singular values of $M_{\mathrm{diag}}$, which are bounded away from 0 and $+\infty$ by Condition (C7). Then
\[
\frac{1}{n} \sum_{i=1}^n a^T \tilde D_i^T M_{\mathrm{diag}}^T M_{\mathrm{diag}} \tilde D_i a \asymp \frac{1}{n} \sum_{i=1}^n a^T \tilde D_i^T \tilde D_i a.
\]
By Lemma S.2,
\[
c\rho_n = c\rho_n \sum_{k=1}^K a_k^T a_k \le \frac{1}{n} \sum_{i=1}^n a^T \tilde D_i^T \tilde D_i a = \frac{1}{n} \sum_{k=1}^K \sum_{i=1}^n a_k^T D_i^T D_i a_k \le C \sum_{k=1}^K a_k^T a_k = C.
\]
Therefore (S.7) holds. Observing that
\[
\lambda_{\max}^{-1}\{C_n^0(\theta)\} \|G_n^0(\theta)\|^2 \le Q_n^0(\theta) = G_n^{0T}(\theta) (C_n^0(\theta))^{-1} G_n^0(\theta) \le \lambda_{\min}^{-1}\{C_n^0(\theta)\} \|G_n^0(\theta)\|^2,
\]
(S.6) follows from Lemma S.2 and (S.4).

Lemma S.5. Under Conditions (C1)–(C8), for some $C_1$ and $C_2$ sufficiently large,
\[
\sup_{\theta \in \Theta_n(C_1, C_2)} \|G_n(\theta) - G_n^0(\theta)\| = O_P\big( J_n^{-2r} d_n^{1/2} d_x + n^{-1} d_n^{3/2} + n^{-1/2} J_n^{-r} d_n d_x^{1/2} \big), \tag{S.8}
\]
\[
\sup_{\theta \in \Theta_n(C_1, C_2)} \|C_n(\theta) - C_n^0(\theta)\| = O_P(n^{-1/2} d_n), \tag{S.9}
\]
\[
\sup_{\theta \in \Theta_n(C_1, C_2)} \big| Q_n(\theta) - Q_n^0(\theta) \big| = O_P\big\{ \rho_n^{-1} n J_n^{-4r} d_n d_x^2 + \rho_n^{-1} \big( n^{-1/2} d_n^3 + J_n^{-2r} d_n^2 d_x \big) \big\}. \tag{S.10}
\]

Proof. Similar to [5], it is sufficient to show that (S.8) holds for each of its components $G_{n,k}(\theta) - G_{n,k}^0(\theta)$ with $k = 1, \ldots, K$.
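The reduction to components uses only the fact that $G_n$ stacks the $K$ blocks $G_{n,k}$, as stated after (S.1), and likewise for $G_n^0$. A short derivation, treating $K$ as fixed (our reading of the setup, since $K$ indexes the matrices $M_1, \ldots, M_K$), is:

```latex
\begin{align*}
\|G_n(\theta) - G_n^0(\theta)\|^2
  &= \sum_{k=1}^{K} \|G_{n,k}(\theta) - G_{n,k}^0(\theta)\|^2
   \le K \max_{1 \le k \le K} \|G_{n,k}(\theta) - G_{n,k}^0(\theta)\|^2, \\
\sup_{\theta} \|G_n(\theta) - G_n^0(\theta)\|
  &\le K^{1/2} \max_{1 \le k \le K} \sup_{\theta} \|G_{n,k}(\theta) - G_{n,k}^0(\theta)\|.
\end{align*}
```

Hence a rate established uniformly over $\theta$ for every block carries over to the stacked vector, up to the constant factor $K^{1/2}$.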
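The Taylor expansion of $a^T G_{n,k}(\theta)$, (S.2) and (S.3) imply that
\begin{align*}
a^T G_{n,k}(\theta) &= a^T G_{n,k}(\mu(\theta)) \\
&= \frac{1}{n} \sum_{i=1}^n a^T D_i^T \Delta_i\{\mu_i(\eta_{0,i} + \zeta_i(\theta))\} A_i^{-1/2}\{\mu_i(\eta_{0,i} + \zeta_i(\theta))\} \\
&\qquad \times M_k A_i^{-1/2}\{\mu_i(\eta_{0,i} + \zeta_i(\theta))\} \{y_i - \mu_i(\eta_{0,i} + \zeta_i(\theta))\} \\
&= \frac{1}{n} \sum_{i=1}^n a^T D_i^T \Delta_{0,i} V_{0,i}^{(k)} e_i + \frac{1}{n} \sum_{i=1}^n a^T D_i^T \Gamma_{0,i}^{(k)} \zeta_i(\theta) \\
&\qquad + \frac{1}{n} \sum_{i=1}^n \zeta_i^T(\theta) \Delta_{0,i} \frac{\partial\, a^T D_i^T \Delta_{0,i} A_{0,i}^{-1/2} M_k A_{0,i}^{-1/2}}{\partial \mu_i} e_i + R_n^*(\mu^*),
\end{align*}
where $R_n^*(\mu^*) = \frac{1}{n} \sum_{i=1}^n R_{ni}^*(\mu_i^*)$, with
\[
R_{ni}^*(\mu_i^*) = \frac{1}{2} \zeta_i^T \Delta_i \frac{\partial^2\, a^T D_i^T \Delta_i(\mu_i) A_i^{-1/2}(\mu_i) M_k A_i^{-1/2}(\mu_i) (y_i - \mu_i)}{\partial \mu_i\, \partial \mu_i^T} \Delta_i \zeta_i
\]
evaluated at $\mu_i^* = g^{-1}(\eta_{0,i} + \tau_i \zeta_i)$ with $0 < \tau_i < 1$. Then one has
\[
a^T \{G_{n,k}(\theta) - G_{n,k}^0(\theta)\} = \frac{1}{n} \sum_{i=1}^n \zeta_i^T(\theta) \Delta_{0,i} \frac{\partial\, a^T D_i^T \Delta_{0,i} A_{0,i}^{-1/2} M_k A_{0,i}^{-1/2}}{\partial \mu_i} e_i + R_n^*(\mu^*) = I_k(\theta) + R_n^*(\mu^*).
\]
Let
\[
L_i^{(k)} = \Delta_i \frac{\partial\, a^T D_i^T \Delta_i A_i^{-1/2} M_k A_i^{-1/2}}{\partial \mu_i}. \tag{S.11}
\]
For $I_k(\theta)$, one can write
\[
I_k(\theta) = \frac{1}{n} \sum_{i=1}^n \zeta_{i1}^T(\alpha) L_{0,i}^{(k)} e_i + \frac{1}{n} \sum_{i=1}^n \zeta_{i2}^T(\gamma) L_{0,i}^{(k)} e_i + \frac{1}{n} \sum_{i=1}^n \zeta_{i3}^T(\beta) L_{0,i}^{(k)} e_i = I_{k1}(\alpha) + I_{k2}(\gamma) + I_{k3}(\beta),
\]
where $L_{0,i}^{(k)}$ is the value of $L_i^{(k)}$ at $\mu_i = \mu_{0,i}$. Conditions (C5), (C6) and Lemma 6.2 of [1] indicate that $\sup_{1 \le i \le n,\, a} \|L_{0,i}^{(k)}\| = O_P(d_n^{1/2})$. Furthermore, $I_k(\theta)$ has mean 0, and Condition (C3) and Lemma S.2 ensure that
\[
\mathrm{Var}(I_{k1}) \le \frac{c d_n}{n} E\{\zeta_{i1}^T e_i e_i^T \zeta_{i1}\} \le \frac{c d_n}{n} E(\zeta_{i1}^T \zeta_{i1}) = O\big( n^{-1} J_n^{-2r} d_n d_x \big).
\]
Therefore, $\sup_{a,\, \theta \in \Theta_n(C_1, C_2)} |I_{k1}(\alpha)| = O_P(n^{-1/2} J_n^{-r} d_n^{1/2} d_x^{1/2})$. Similarly, we can show that
\[
\sup_{a,\, \theta \in \Theta_n(C_1, C_2)} |I_{k2}(\gamma) + I_{k3}(\beta)| = O_P(n^{-1} d_n).
\]
Let
\[
F_i^* = \Delta_i^T \bigg\{ \frac{\partial^2\, a^T D_i^T \Delta_i^T(\mu_i) A_i^{-1/2}(\mu_i) M_k A_i^{-1/2}(\mu_i) (y_i - \mu_i)}{\partial \mu_i\, \partial \mu_i^T} \bigg\} \bigg|_{\mu_i = \mu_i^*} \Delta_i.
\]
By [5], $\sup_{1 \le i \le n,\, a} \|F_i^*\| = O_P(d_n^{1/2})$. Then
\begin{align*}
R_n^*(\mu^*) &= \frac{1}{2n} \sum_{i=1}^n \zeta_{i1}^T(\alpha) F_i^* \zeta_{i1}(\alpha) + \frac{1}{2n} \sum_{i=1}^n \zeta_{i2}^T(\gamma) F_i^* \zeta_{i2}(\gamma) + \frac{1}{2n} \sum_{i=1}^n \zeta_{i3}^T(\beta) F_i^* \zeta_{i3}(\beta) \\
&\quad + \frac{1}{n} \sum_{i=1}^n \zeta_{i1}^T(\alpha) F_i^* \zeta_{i2}(\gamma) + \frac{1}{n} \sum_{i=1}^n \zeta_{i1}^T(\alpha) F_i^* \zeta_{i3}(\beta) + \frac{1}{n} \sum_{i=1}^n \zeta_{i2}^T(\gamma) F_i^* \zeta_{i3}(\beta) \\
&= O_P\big( J_n^{-2r} d_n^{1/2} d_x + n^{-1} d_n^{3/2} + n^{-1/2} J_n^{-r} d_n d_x^{1/2} \big).
\end{align*}
Thus,
\begin{align*}
\sup_{\theta \in \Theta_n(C_1, C_2)} \|G_{n,k}(\theta) - G_{n,k}^0(\theta)\| &= \sup_{a,\, \theta \in \Theta_n(C_1, C_2)} \big| a^T \{G_{n,k}(\theta) - G_{n,k}^0(\theta)\} \big| \\
&= O_P\big( J_n^{-2r} d_n^{1/2} d_x + n^{-1} d_n^{3/2} + n^{-1/2} J_n^{-r} d_n d_x^{1/2} \big).
\end{align*}
Next, for any $a = (a_1^T, \ldots, a_K^T)^T$ with each $a_k \in R^{d_n}$ and $a^T a = 1$,
\begin{align*}
a^T \{C_n(\theta) - C_n^0(\theta)\} a &= \frac{1}{n} \sum_{i=1}^n a^T \{g_i(\theta) g_i^T(\theta) - g_i^0(\theta) g_i^{0T}(\theta)\} a \\
&= \sum_{k,k'=1}^K \frac{1}{n} \sum_{i=1}^n a_k^T \{g_{i,k}(\theta) g_{i,k'}^T(\theta) - g_{i,k}^0(\theta) g_{i,k'}^{0T}(\theta)\} a_{k'}.
\end{align*}
For any $k, k'$,
\begin{align*}
&\frac{1}{n} \sum_{i=1}^n a_k^T \{g_{i,k}(\theta) g_{i,k'}^T(\theta) - g_{i,k}^0(\theta) g_{i,k'}^{0T}(\theta)\} a_{k'} \\
&\quad = \frac{1}{n} \sum_{i=1}^n \big\{ \zeta_i^T(\theta) \Delta_{0,i} L_{0,i}^{(k)} e_i + R_{ni}^{(k)}(\mu^*) \big\} \big\{ 2 a_{k'}^T D_i^T \Delta_{0,i} V_{0,i}^{(k')} e_i \\
&\qquad\quad + 2 a_{k'}^T D_i^T \Gamma_{0,i}^{(k')} \zeta_i(\theta) + \zeta_i^T(\theta) \Delta_{0,i} L_{0,i}^{(k')} e_i + R_{ni}^{(k')}(\mu^{**}) \big\}^T \\
&\quad = O_P(n^{-1/2} d_n).
\end{align*}
Finally, note that
\begin{align*}
\frac{1}{n} \big| Q_n(\theta) - Q_n^0(\theta) \big| &= \big| G_n^T(\theta) C_n^{-1}(\theta) G_n(\theta) - \{G_n^0(\theta)\}^T \{C_n^0(\theta)\}^{-1} G_n^0(\theta) \big| \\
&\le \big| (G_n - G_n^0)^T(\theta) C_n^{-1}(\theta) (G_n - G_n^0)(\theta) \big| \\
&\quad + 2 \big| \{G_n^0(\theta)\}^T \{C_n^0(\theta)\}^{-1} (G_n - G_n^0)(\theta) \big| \\
&\quad + \big| \{G_n^0(\theta)\}^T C_n^{-1}(\theta) \big[ C_n(\theta) - C_n^0(\theta) \big] \{C_n^0(\theta)\}^{-1} G_n^0(\theta) \big| \\
&\le \lambda_{\min}^{-1}\{C_n(\theta)\} \|(G_n - G_n^0)(\theta)\|^2 \\
&\quad + 2 \lambda_{\min}^{-1}\{C_n^0(\theta)\} \|G_n^0(\theta)\| \|(G_n - G_n^0)(\theta)\| \\
&\quad + \lambda_{\min}^{-1}\{C_n(\theta)\} \lambda_{\min}^{-1}\{C_n^0(\theta)\} \lambda_{\max}\{C_n(\theta) - C_n^0(\theta)\} \|G_n^0(\theta)\|^2.
\end{align*}
Thus, (S.10) follows.

Next let
\[
J_{DD}^{(k)} = \frac{1}{n} \sum_{i=1}^n D_i^T \Gamma_{0,i}^{(k)} D_i, \qquad J_{DD} = \{(J_{DD}^{(1)})^T, \ldots, (J_{DD}^{(K)})^T\}^T, \tag{S.12}
\]
\[
J_{DZ}^{(k)} = \frac{1}{n} \sum_{i=1}^n D_i^T \Gamma_{0,i}^{(k)} Z_i, \qquad J_{DZ} = \{(J_{DZ}^{(1)})^T, \ldots, (J_{DZ}^{(K)})^T\}^T, \tag{S.13}
\]
\[
J_{DB}^{(k)} = \frac{1}{n} \sum_{i=1}^n D_i^T \Gamma_{0,i}^{(k)} B_i, \qquad J_{DB} = \{(J_{DB}^{(1)})^T, \ldots, (J_{DB}^{(K)})^T\}^T. \tag{S.14}
\]

Lemma S.6. Under Conditions (C1)–(C8), for some $C_1$ and $C_2$ sufficiently large,
\[
\sup_{\theta \in \Theta(C_1, C_2)} \big\{ \|\dot G_n(\theta) - J_{DD}\| + \|\dot G_\beta(\theta) - J_{DZ}\| + \|\dot G_\gamma(\theta) - J_{DB}\| \big\} = O_P\big( \sqrt{d_n^3/n} \big).
\]

Proof. We only prove
\[
\sup_{\theta \in \Theta(C_1, C_2)} \|\dot G_n(\theta) - J_{DD}\| = O_P\big( \sqrt{d_n^3/n} \big), \tag{S.15}
\]
and the proofs of the other two terms are similar. Note that
\begin{align*}
n a^T \big\{ \dot G_{n,k}(\theta) - J_{DD}^{(k)} \big\} &= \sum_{i=1}^n a^T D_i^T \big( \Gamma_i^{(k)}(\theta) - \Gamma_{0,i}^{(k)} \big) D_i \\
&\quad + \sum_{i=1}^n \Big[ D_i^T \Delta_i^T(\mu_i) \Big\{ \frac{\partial}{\partial \mu_i} a^T D_i^T \Delta_i(\mu_i) A_i^{-1/2}(\mu_i) M_k A_i^{-1/2}(\mu_i) \Big\} (Y_i - \mu_i) \Big]^T.
\end{align*}
Using the notation $L_i^{(k)}$ defined in (S.11), we can write the transpose of the second term as
\[
\frac{1}{n} \sum_{i=1}^n D_i^T L_i^{(k)}(\mu_i) (Y_i - \mu_i) = \frac{1}{n} \sum_{i=1}^n D_i^T L_i^{(k)}(\mu_i) \big[ e_i + \Delta_{0,i} \{\zeta_{i1}(\alpha) + \zeta_{i2}(\gamma) + \zeta_{i3}(\beta)\} \big].
\]
Next, one has
\[
\sup_{a,\, \theta \in \Theta(C_1, C_2)} \Big\| \frac{1}{n} \sum_{i=1}^n D_i^T L_i^{(k)}(\mu_i) (Y_i - \mu_i) \Big\| = O_P\big( n^{-1/2} d_n^{3/2} + d_x^{1/2} d_n J_n^{-r} \big).
\]
On the other hand,
\begin{align*}
\sup_{a,\, \theta \in \Theta(C_1, C_2)} \Big\| \frac{1}{n} \sum_{i=1}^n a^T D_i^T \big\{ \Gamma_i^{(k)}(\theta) - \Gamma_{0,i}^{(k)} \big\} D_i \Big\| &= \sup_{a,\, \theta \in \Theta(C_1, C_2)} \Big\| \frac{1}{n} \sum_{i=1}^n (\mu_i - \mu_{0,i})^T \frac{\partial}{\partial \mu_i} a^T D_i^T \Gamma_i^{(k)}(\mu_i) D_i \Big\| \\
&= O_P\big( n^{-1/2} d_n + d_x^{1/2} d_n^{1/2} J_n^{-r} \big).
\end{align*}
Hence (S.15) holds.

Lemma S.7. Under Conditions (C1)–(C8), one has
\[
\sup_{\theta \in \Theta(C_1, C_2)} \|S_n(\theta)\| = O_P\big( \rho_n^{-1} n^{-1/2} d_n^{1/2} \big).
\]

Proof. According to Lemmas S.5, S.6 and the definition of $J_{DD}$ given in (S.12), one has
\begin{align*}
S_n(\theta) &= \dot G_n^T(\theta) C_n^{-1}(\theta) G_n(\theta) \\
&= \big\{ J_{DD} + O_P\big( \sqrt{d_n^3/n} \big) \big\}^T C_n^{-1}(\theta) \big\{ G_n^0(\theta) + O_P(J_n^{-2r} d_n^{1/2} d_x) + O_P(n^{-1} d_n^{3/2}) + O_P(n^{-1/2} J_n^{-r} d_n d_x^{1/2}) \big\} \\
&= \big\{ J_{DD} + O_P\big( \sqrt{d_n^3/n} \big) \big\}^T \{C_n^0(\theta)\}^{-1} \big\{ G_n^0(\theta) + O_P(J_n^{-2r} d_n^{1/2} d_x) + O_P(n^{-1} d_n^{3/2}) + O_P(n^{-1/2} J_n^{-r} d_n d_x^{1/2}) \big\} \\
&\quad + \big\{ J_{DD} + O_P\big( \sqrt{d_n^3/n} \big) \big\}^T C_n^{-1}(\theta) \big\{ C_n^0(\theta) - C_n(\theta) \big\} \{C_n^0(\theta)\}^{-1} \\
&\qquad \times \big\{ G_n^0(\theta) + O_P\big( J_n^{-2r} d_n^{1/2} d_x + n^{-1} d_n^{3/2} + n^{-1/2} J_n^{-r} d_n d_x^{1/2} \big) \big\}.
\end{align*}
Lemma S.5 implies that the order of $\sup_{\theta \in \Theta(C_1, C_2)} \|S_n(\theta)\|$ is the same as that of $\sup_{\theta \in \Theta(C_1, C_2)} \|J_{DD}\| \|\{C_n^0(\theta)\}^{-1}\| \|G_n^0(\theta)\|$. By Lemma S.4,
\[
\sup_{\theta \in \Theta(C_1, C_2)} \|G_n^0(\theta)\| = O_P(n^{-1/2} d_n^{1/2}), \qquad \inf_{\theta \in \Theta(C_1, C_2)} \lambda_{\min}\{C_n^0(\theta)\} \ge c\rho_n.
\]
In addition, Lemma S.2 implies that $a^T J_{DD} a \le C$, except in an event whose probability tends to zero. Therefore, one has
\[
\sup_{\theta \in \Theta(C_1, C_2)} \|J_{DD}\| \|\{C_n^0(\theta)\}^{-1}\| \|G_n^0(\theta)\| = O_P\big( \rho_n^{-1} n^{-1/2} d_n^{1/2} \big).
\]
The desired result follows.

Lemma S.8. Under Conditions (C1)–(C8), there exist constants $0 < c \le C$ such that, for any vector $a_2$ of length $d_x J_n$ and any $\theta \in \Theta(C_1, C_2)$, as $n \to \infty$,
\[
c\rho_n \|a_2\|_2^2 \le a_2^T H_{\gamma\gamma}(\theta) a_2 \le C \rho_n^{-1} \|a_2\|_2^2,
\]
except in an event whose probability tends to zero.

Proof.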
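Note that
\[
a_2^T H_{\gamma\gamma}(\theta) a_2 = a_2^T \dot G_\gamma^T(\theta) C_n^{-1}(\theta) \dot G_\gamma(\theta) a_2,
\]
and according to Lemma S.6, $\|\dot G_\gamma(\theta) - J_{DB}\| = O_P(\sqrt{d_n^3/n})$, where $J_{DB}$ is given in (S.14). Therefore,
\begin{align*}
a_2^T H_{\gamma\gamma}(\theta) a_2 &= a_2^T \big\{ J_{DB} + O_P\big( \sqrt{d_n^3/n} \big) \big\}^T \big\{ C_n^0(\theta) \big\}^{-1} \big\{ J_{DB} + O_P\big( \sqrt{d_n^3/n} \big) \big\} a_2 \\
&\quad + a_2^T \big\{ J_{DB} + O_P\big( \sqrt{d_n^3/n} \big) \big\}^T C_n^{-1}(\theta) \big\{ C_n^0(\theta) - C_n(\theta) \big\} \\
&\qquad \times \{C_n^0(\theta)\}^{-1} \big\{ J_{DB} + O_P\big( \sqrt{d_n^3/n} \big) \big\} a_2 \\
&= a_2^T J_{DB}^T \big\{ C_n^0(\theta) \big\}^{-1} J_{DB} a_2 \{1 + o_P(1)\}.
\end{align*}
According to (S.5), one has
\[
c\, a_2^T J_{DB}^T J_{DB} a_2 \le a_2^T J_{DB}^T \big\{ C_n^0(\theta) \big\}^{-1} J_{DB} a_2 \le C \rho_n^{-1} a_2^T J_{DB}^T J_{DB} a_2.
\]
Thus the lemma follows from Lemma S.2.

Lemma S.9. Under Conditions (C1)–(C8), there exist constants $0 < c \le C$ such that, for any $\theta \in \Theta(C_1, C_2)$, as $n \to \infty$,
\[
c\rho_n \le a^T H_n(\theta) a \le C \rho_n^{-1},
\]
except in an event whose probability tends to zero.

Proof. We write $a = (a_1^T, a_2^T)^T$, where $a_1 \in R^{d_z}$ and $a_2 \in R^{d_x J_n}$. Note that $a^T H_n(\theta) a = a_1^T H_{\beta\beta}(\theta) a_1 + a_2^T H_{\gamma\gamma}(\theta) a_2 + 2 a_1^T H_{\beta\gamma}(\theta) a_2$. According to Lemma S.8, $c\rho_n \|a_2\|_2^2 \le a_2^T H_{\gamma\gamma}(\theta) a_2 \le C \rho_n^{-1} \|a_2\|_2^2$, a.s. Similar to the proof of Lemma S.8, we have $c \|a_1\|_2^2 \le a_1^T H_{\beta\beta}(\theta) a_1 \le C \|a_1\|_2^2$, a.s. Note that
\[
a_1^T H_{\beta\gamma}(\theta) a_2 = a_1^T J_{DZ}^T (C_n^0)^{-1} J_{DB} a_2 \{1 + o_P(1)\},
\]
and
\[
a_1^T J_{DZ}^T J_{DB} a_2 = \sum_{k=1}^K a_1^T (J_{DZ}^{(k)})^T J_{DB}^{(k)} a_2 = \sum_{k=1}^K \frac{1}{n^2} \sum_{i,i'=1}^n a_1^T Z_i^T \Gamma_{0,i}^{(k)} D_i D_{i'}^T \Gamma_{0,i'}^{(k)} B_{i'} a_2,
\]
\[
a_1^T Z_i^T \Gamma_{0,i}^{(k)} D_i D_{i'}^T \Gamma_{0,i'}^{(k)} B_{i'} a_2 = a_1^T Z_i^T \Gamma_{0,i}^{(k)} Z_i Z_{i'}^T \Gamma_{0,i'}^{(k)} B_{i'} a_2 + a_1^T Z_i^T \Gamma_{0,i}^{(k)} B_i B_{i'}^T \Gamma_{0,i'}^{(k)} B_{i'} a_2,
\]
thus $|a_1^T H_{\beta\gamma}(\theta) a_2| = o(\|a_1\| \|a_2\|)$, a.s. Hence, $c\rho_n \le a^T H_n(\theta) a \le C \rho_n^{-1}$, a.s.

S.2. Asymptotic results with best approximation. The following lemma shows the consistency of $\hat\theta^{QIF}$.

Lemma S.10. Under Conditions (C1)–(C8), $\|\hat\theta^{QIF} - \tilde\theta_0\| = O_P(n^{-1/2} d_n^{1/2})$.

Proof. We first show that for any $\varepsilon > 0$, there exists a $C = C(\varepsilon)$ such that
\[
P\Big\{ \inf_{\|\theta - \tilde\theta_0\| = C n^{-1/2} d_n^{1/2}} Q_n^0(\theta) > Q_n^0(\tilde\theta_0) \Big\} > 1 - \varepsilon, \tag{S.16}
\]
as $n \to \infty$. Let $U_n(\theta) = G_n^0(\theta) - G_n^0(\tilde\theta_0)$. Then
\begin{align*}
n^{-1} \{Q_n^0(\theta) - Q_n^0(\tilde\theta_0)\} &= U_n^T(\theta) (C_n^0)^{-1} U_n(\theta) + 2 U_n^T(\theta) (C_n^0)^{-1} G_n^0(\tilde\theta_0) \\
&\quad + \{G_n^0(\theta)\}^T (C_n^0(\theta))^{-1} \{C_n^0 - C_n^0(\theta)\} (C_n^0)^{-1} G_n^0(\theta).
\end{align*}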
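The three-term decomposition above is a purely algebraic identity: it combines the expansion of a quadratic form around $G_n^0(\tilde\theta_0)$ with the resolvent-type identity $A^{-1} - B^{-1} = A^{-1}(B - A)B^{-1}$. The following sketch checks it numerically on a toy example. Every number is a made-up two-dimensional stand-in rather than a quantity from the paper, and we read $C_n^0$ without an argument as the symmetric matrix paired with $G_n^0(\tilde\theta_0)$ in $n^{-1} Q_n^0(\tilde\theta_0)$, which is our interpretation of the notation.

```python
# Toy check of the decomposition of n^{-1}{Q_n^0(theta) - Q_n^0(theta_tilde_0)}.
# All values are hypothetical 2-dimensional stand-ins, not quantities from the paper.

def inv2(m):
    """Inverse of a 2x2 matrix stored as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def quad(u, m, v):
    """Bilinear form u^T m v for 2-vectors u, v and a 2x2 matrix m."""
    return sum(u[i] * m[i][j] * v[j] for i in range(2) for j in range(2))

C0 = [[2.0, 0.3], [0.3, 1.5]]     # hypothetical C_n^0 (symmetric, positive definite)
Ct = [[1.8, -0.2], [-0.2, 2.2]]   # hypothetical C_n^0(theta)
G0 = [0.4, -0.1]                  # hypothetical G_n^0(theta_tilde_0)
Gt = [0.9, 0.5]                   # hypothetical G_n^0(theta)

U = [Gt[i] - G0[i] for i in range(2)]                             # U_n(theta)
dC = [[C0[i][j] - Ct[i][j] for j in range(2)] for i in range(2)]  # C_n^0 - C_n^0(theta)

# Left-hand side: difference of the two quadratic forms.
lhs = quad(Gt, inv2(Ct), Gt) - quad(G0, inv2(C0), G0)

# Right-hand side: the three terms of the decomposition.
term1 = quad(U, inv2(C0), U)
term2 = 2.0 * quad(U, inv2(C0), G0)
term3 = quad(Gt, matmul(matmul(inv2(Ct), dC), inv2(C0)), Gt)
rhs = term1 + term2 + term3

print(abs(lhs - rhs) < 1e-12)
```

The symmetry of the stand-in for $C_n^0$ matters: it is what allows the two cross terms to be merged into the single term with the factor 2. The first term is a positive-definite quadratic form in $U_n(\theta)$, which is the term the remainder of the proof bounds from below.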
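By Lemma S.3, there exists a constant $c_{11} > 0$ such that for any $\theta \in \Theta_n(C)$,
\begin{align*}
\big| U_n^T(\theta) (C_n^0)^{-1} G_n^0(\tilde\theta_0) \big| &\le c_{11} \rho_n^{-1} \sum_{k=1}^K \big| \{g_{k3}^0(\gamma) + g_{k4}^0(\beta)\}^T \{g_{k1}^0 + g_{k2}^0(\alpha)\} \big| \\
&\le c_{11} \rho_n^{-1} \sum_{k=1}^K \big( \|g_{k3}^0(\gamma)\| + \|g_{k4}^0(\beta)\| \big) \big( \|g_{k1}^0\| + \|g_{k2}^0(\alpha)\| \big).
\end{align*}
Furthermore, Lemma S.3 entails that there exists a constant $C_1 > 0$ such that
\[
\big| U_n^T(\theta) (C_n^0)^{-1} G_n^0(\tilde\theta_0) \big| \le C C_1 \big( d_x^{1/2} h^r + 1/\sqrt{n} \big) \rho_n^{-1} n^{-1/2} d_n^{1/2}
\]
when $n$ is large enough. On the other hand, Condition (C7) entails that there exists a $c_{12} > 0$ such that
\[
U_n^T(\theta) (C_n^0)^{-1} U_n(\theta) \ge c_{12} \sum_{k=1}^K \|g_{k3}^0(\gamma) + g_{k4}^0(\beta)\|^2,
\]
and Lemma S.3 indicates that there exists a constant $C_2 > 0$ such that $U_n^T(\theta) (C_n^0)^{-1} U_n(\theta) \ge C^2 C_2 n^{-1} d_n$. Therefore we prove (S.16) by choosing $C$ sufficiently large such that $C > C_1/C_2$. Equations (S.10) and (S.16) together imply that
\[
P\Big\{ \inf_{\|\theta - \tilde\theta_0\| = C n^{-1/2} d_n^{1/2}} Q_n(\theta) > Q_n(\tilde\theta_0) \Big\} > 1 - \varepsilon.
\]
Next,
\[
P\big\{ \|\hat\theta^{QIF} - \tilde\theta_0\| \le C n^{-1/2} d_n^{1/2} \big\} \ge P\Big\{ \inf_{\|\theta - \tilde\theta_0\| = C n^{-1/2} d_n^{1/2}} Q_n(\theta) > Q_n(\tilde\theta_0) \Big\} > 1 - \varepsilon,
\]
which entails that $\|\hat\theta^{QIF} - \tilde\theta_0\| = O_P(n^{-1/2} d_n^{1/2})$.

Lemma S.11. Under Conditions (C1)–(C8),
\[
H_{\beta\beta}(\tilde\theta_0) - H_{\beta\gamma}(\tilde\theta_0) \{H_{\gamma\gamma}(\tilde\theta_0)\}^{-1} H_{\gamma\beta}(\tilde\theta_0) = \hat J_{DZ}^T (C_n^0)^{-1} \hat J_{DZ} \{1 + o_P(1)\},
\]
where $\tilde\theta_0 = (\beta_0^T, \tilde\gamma^T)^T$ and $\hat J_{DZ}$ is given in (2.7).

Proof. According to the notation in (A.7), one has
\begin{align*}
H_{\beta\beta}(\theta) - H_{\beta\gamma}(\theta) \{H_{\gamma\gamma}(\theta)\}^{-1} H_{\gamma\beta}(\theta) &= \dot G_\beta^T(\theta) C_n^{-1}(\theta) \dot G_\beta(\theta) \\
&\quad - \dot G_\beta^T(\theta) C_n^{-1}(\theta) \dot G_\gamma(\theta) \big\{ \dot G_\gamma^T(\theta) C_n^{-1}(\theta) \dot G_\gamma(\theta) \big\}^{-1} \dot G_\gamma^T(\theta) C_n^{-1}(\theta) \dot G_\beta(\theta).
\end{align*}
Lemma S.6 implies that
\[
H_{\beta\beta}(\tilde\theta_0) = \dot G_\beta^T(\tilde\theta_0) (C_n^0)^{-1} \dot G_\beta(\tilde\theta_0) \{1 + o_P(1)\} = J_{DZ}^T (C_n^0)^{-1} J_{DZ} \{1 + o_P(1)\},
\]
where $J_{DZ}$ is given in (S.13), i.e.,
\[
H_{\beta\beta}(\tilde\theta_0) = \frac{1}{n^2} \Bigg\{ \sum_{i=1}^n \begin{pmatrix} D_i^T \Gamma_{0,i}^{(1)} Z_i \\ \vdots \\ D_i^T \Gamma_{0,i}^{(K)} Z_i \end{pmatrix} \Bigg\}^T (C_n^0)^{-1} \Bigg\{ \sum_{i=1}^n \begin{pmatrix} D_i^T \Gamma_{0,i}^{(1)} Z_i \\ \vdots \\ D_i^T \Gamma_{0,i}^{(K)} Z_i \end{pmatrix} \Bigg\} \{1 + o_P(1)\}.
\]
Similarly, Lemma S.6 implies that
\[
H_{\beta\gamma}(\tilde\theta_0) = \dot G_\beta^T(\tilde\theta_0) (C_n^0)^{-1} \dot G_\gamma(\tilde\theta_0) \{1 + o_P(1)\} = J_{DZ}^T (C_n^0)^{-1} J_{DB} \{1 + o_P(1)\},
\]
where $J_{DB}$ is given in (S.14), i.e.,
\[
H_{\beta\gamma}(\tilde\theta_0) = \frac{1}{n^2} \Bigg\{ \sum_{i=1}^n \begin{pmatrix} D_i^T \Gamma_{0,i}^{(1)} Z_i \\ \vdots \\ D_i^T \Gamma_{0,i}^{(K)} Z_i \end{pmatrix} \Bigg\}^T (C_n^0)^{-1} \Bigg\{ \sum_{i=1}^n \begin{pmatrix} D_i^T \Gamma_{0,i}^{(1)} B_i \\ \vdots \\ D_i^T \Gamma_{0,i}^{(K)} B_i \end{pmatrix} \Bigg\} \{1 + o_P(1)\}.
\]
Therefore, letting
\[
\hat Z_i = Z_i - B_i \big\{ J_{DB}^T (C_n^0)^{-1} J_{DB} \big\}^{-1} \big\{ J_{DZ}^T (C_n^0)^{-1} J_{DB} \big\}^T, \tag{S.17}
\]
we have
\begin{align*}
&H_{\beta\beta}(\tilde\theta_0) - H_{\beta\gamma}(\tilde\theta_0) \big\{ H_{\gamma\gamma}(\tilde\theta_0) \big\}^{-1} H_{\gamma\beta}(\tilde\theta_0) \\
&\quad = \frac{1}{n^2} \Bigg\{ \sum_{i=1}^n \begin{pmatrix} D_i^T \Gamma_{0,i}^{(1)} \hat Z_i \\ \vdots \\ D_i^T \Gamma_{0,i}^{(K)} \hat Z_i \end{pmatrix} \Bigg\}^T (C_n^0)^{-1} \Bigg\{ \sum_{i=1}^n \begin{pmatrix} D_i^T \Gamma_{0,i}^{(1)} \hat Z_i \\ \vdots \\ D_i^T \Gamma_{0,i}^{(K)} \hat Z_i \end{pmatrix} \Bigg\} \{1 + o_P(1)\}.
\end{align*}
The result follows from (S.13).

In the following, let $X$ and $Z$ be the collections of all $X_{it}$'s and $Z_{it}$'s, respectively, i.e., $X = (X_1^T, \ldots, X_n^T)$, $Z = (Z_1^T, \ldots, Z_n^T)$.

Lemma S.12. Under Conditions (C1)–(C8),
\[
n^{1/2} A_n \Sigma_n^{-1/2} \Psi_n^{-1} \big( \dot G_\beta^T(\tilde\theta_0) - H_{\beta\gamma}(\tilde\theta_0) H_{\gamma\gamma}^{-1}(\tilde\theta_0) \dot G_\gamma^T(\tilde\theta_0) \big) C_n^{-1}(\tilde\theta_0) G_n(\tilde\theta_0) \xrightarrow{D} N(0, \Sigma_A),
\]
where $A_n$ is any $q \times d_z$ matrix with a finite $q$ such that $\Sigma_A = \lim_{n \to \infty} A_n^{\otimes 2}$, and $\Sigma_n = \hat\Psi_n^{-1} \hat\Omega_n \hat\Psi_n^{-1}$ with $\hat\Omega_n$ and $\hat\Psi_n$ given in (2.9).

Proof. According to Lemma S.6, we have $\|\dot G_\beta(\tilde\theta_0) - J_{DZ}\| = O_P(\sqrt{d_n^3/n})$ and $\|\dot G_\gamma(\tilde\theta_0) - J_{DB}\| = O_P(\sqrt{d_n^3/n})$. Hence,
\[
\dot G_\beta^T(\tilde\theta_0) - H_{\beta\gamma}(\tilde\theta_0) H_{\gamma\gamma}^{-1}(\tilde\theta_0) \dot G_\gamma^T(\tilde\theta_0) = \hat J_{DZ}^T + O_P\big( \sqrt{d_n^3/n} \big),
\]
where $\hat J_{DZ}$ is given in (2.7). Using arguments similar to those in the proof of Lemma S.5, we can show that $\|G_n(\tilde\theta_0) - G_n^0(\tilde\theta_0)\| = o_P(h^r)$. Thus,
\begin{align*}
&\sqrt{n} A_n \Sigma_n^{-1/2} \Psi_n^{-1} \big\{ \dot G_\beta^T(\tilde\theta_0) - H_{\beta\gamma}(\tilde\theta_0) H_{\gamma\gamma}^{-1}(\tilde\theta_0) \dot G_\gamma^T(\tilde\theta_0) \big\} C_n^{-1}(\tilde\theta_0) G_n(\tilde\theta_0) \\
&\quad = \sqrt{n} A_n \Sigma_n^{-1/2} \Psi_n^{-1} \big\{ \hat J_{DZ} + O_P\big( \sqrt{d_n^3/n} \big) \big\}^T (C_n^0)^{-1} \big\{ G_n^0 + O_P\big( \sqrt{d_x} J_n^{-r} \big) \big\} \\
&\qquad + \sqrt{n} A_n \Sigma_n^{-1/2} \Psi_n^{-1} \big\{ \hat J_{DZ} + O_P\big( \sqrt{d_n^3/n} \big) \big\}^T (C_n^0(\tilde\theta_0))^{-1} \\
&\qquad\quad \times \{C_n^0 - C_n^0(\tilde\theta_0)\} (C_n^0)^{-1} \big\{ G_n^0 + O_P(d_x^{1/2} J_n^{-r}) \big\} \\
&\quad = \sqrt{n} A_n \Sigma_n^{-1/2} \Psi_n^{-1} \big\{ \hat J_{DZ}^T (C_n^0)^{-1} G_n^0 + O_P(n^{-1} d_n^2 \rho_n^{-1}) + O_P(n^{-1/2} d_n^{3/2} \rho_n^{-1} d_x J_n^{-r}) \big\} \\
&\quad = \sqrt{n} A_n \Sigma_n^{-1/2} \Psi_n^{-1} \big\{ \hat J_{DZ}^T (C_n^0)^{-1} G_n^0 + o_P(n^{-1/2}) \big\}.
\end{align*}
Next we write $(C_n^0)^{-1} = (C_{kk'}^0)_{1 \le k, k' \le K}$, where $C_{kk'}^0$ is a submatrix of dimension $d_n \times d_n$, for any $1 \le k, k' \le K$. Thus,
\begin{align*}
\hat J_{DZ}^T (C_n^0)^{-1} G_n^0 &= \frac{1}{n^2} \Bigg\{ \sum_{i=1}^n \begin{pmatrix} D_i^T \Gamma_{0,i}^{(1)} \hat Z_i \\ \vdots \\ D_i^T \Gamma_{0,i}^{(K)} \hat Z_i \end{pmatrix} \Bigg\}^T (C_n^0)^{-1} \Bigg\{ \sum_{i=1}^n \begin{pmatrix} D_i^T \Delta_{0,i} V_{0,i}^{(1)} e_i \\ \vdots \\ D_i^T \Delta_{0,i} V_{0,i}^{(K)} e_i \end{pmatrix} \Bigg\} \\
&= \frac{1}{n^2} \sum_{1 \le k, k' \le K} \Big\{ \sum_{i=1}^n D_i^T \Gamma_{0,i}^{(k)} \hat Z_i \Big\}^T C_{kk'}^0 \Big\{ \sum_{i=1}^n D_i^T \Delta_{0,i} V_{0,i}^{(k')} e_i \Big\} \\
&= \frac{1}{n^2} \sum_{1 \le i, i' \le n}\ \sum_{1 \le k, k' \le K} \hat Z_{i'}^T \Gamma_{0,i'}^{(k)} D_{i'} C_{kk'}^0 D_i^T \Delta_{0,i} V_{0,i}^{(k')} e_i.
\end{align*}
For any $c \in R^{d_z}$ and $\|c\| = 1$, we write
\[
n^{1/2} c^T A_n \Sigma_n^{-1/2} \Psi_n^{-1} \big\{ \dot G_\beta^T(\tilde\theta_0) - H_{\beta\gamma} H_{\gamma\gamma}^{-1} \dot G_\gamma^T(\tilde\theta_0) \big\} (C_n^0)^{-1} G_n^0 \asymp \sum_{i=1}^n a_i \epsilon_i,
\]
where
\[
a_i^2 = n^{-1} \Big\{ c^T A_n \Sigma_n^{-1/2} \Psi_n^{-1} \frac{1}{n} \sum_{i'=1}^n \sum_{k,k'=1}^K \hat Z_{i'}^T \Gamma_{0,i'}^{(k)} D_{i'} C_{kk'}^0 D_i^T \Gamma_{0,i}^{(k')} \Big\}^{\otimes 2}_{\Sigma_i}.
\]
Note that, conditioning on $(X, Z)$, the $\epsilon_i$ are independent. In addition, we have $\max_i a_i^2 = O_P(\rho_n^{-5} d_n)$ and
\begin{align*}
\sum_{i=1}^n a_i^2 &= \mathrm{Var}\big\{ \sqrt{n} c^T A_n \Sigma_n^{-1/2} \Psi_n^{-1} \hat J_{DZ}^T (C_n^0)^{-1} G_n^0 \,\big|\, X, Z \big\} \\
&= \frac{1}{n} \sum_{i=1}^n \Bigg\{ c^T A_n \Sigma_n^{-1/2} \Psi_n^{-1} \hat J_{DZ}^T (C_n^0)^{-1} \begin{pmatrix} D_i^T \Delta_{0,i} V_{0,i}^{(1)} \Sigma_i^{1/2} \\ \vdots \\ D_i^T \Delta_{0,i} V_{0,i}^{(K)} \Sigma_i^{1/2} \end{pmatrix} \Bigg\}^{\otimes 2}.
\end{align*}
Thus, one has $\max_i a_i^2 / \sum_{i=1}^n a_i^2 = O_P(\rho_n^{-5} n^{-1} d_n) = o_P(1)$. By the Lindeberg-Feller central limit theorem, one has $\sum_{i=1}^n a_i \epsilon_i / (\sum_{i=1}^n a_i^2)^{1/2} \to N(0, 1)$. The desired result follows from the Cramér-Wold device.

Similar to (A.2), we define
\[
\tilde S_n(\beta) = \dot{\tilde G}_n^T(\beta) \tilde C_n^{-1}(\beta) \tilde G_n(\beta), \tag{S.18}
\]
\[
\tilde H_n(\beta) = \dot{\tilde G}_n^T(\beta) \tilde C_n^{-1}(\beta) \dot{\tilde G}_n(\beta). \tag{S.19}
\]

Lemma S.13. Under Conditions (C2)–(C8), as $n \to \infty$,
\[
\sqrt{n} \tilde A_n \tilde\Sigma_{QIF}^{-1/2} (\tilde\beta^{QIF} - \beta_0) \longrightarrow N(0, \tilde\Sigma_A),
\]
where $\tilde A_n$ is any $q \times d_z$ matrix with a finite $q$ such that $\tilde A_n^{\otimes 2}$ converges to a nonnegative symmetric matrix $\tilde\Sigma_A$, and $\tilde\Sigma_{QIF} = \lim_{n \to \infty} \Psi_n^{-1} \Omega_n \Psi_n^{-1}$ with $\Psi_n = J_{DZ}^T (C_n^0)^{-1} J_{DZ}$ and $\Omega_n = \frac{1}{n} \sum_{i=1}^n \{ J_{DZ}^T (C_n^0)^{-1} W_i \}^{\otimes 2}$ for $W_i$ in (2.8).

Proof. Let $\dot{\tilde Q}_n$ and $\ddot{\tilde Q}_n$ be the gradient vector and Hessian matrix of $\tilde Q_n$. By Taylor's expansion,
\[
\dot{\tilde Q}_n(\tilde\beta^{QIF}) - \dot{\tilde Q}_n(\beta_0) = \ddot{\tilde Q}_n(\beta_0) (\tilde\beta^{QIF} - \beta_0) + \frac{1}{2} (\tilde\beta^{QIF} - \beta_0)^T \frac{\partial^2 \dot{\tilde Q}_n(\beta)}{\partial \beta\, \partial \beta^T} \bigg|_{\beta = \beta^*} (\tilde\beta^{QIF} - \beta_0),
\]
where $\beta^* = t \tilde\beta^{QIF} + (1 - t) \beta_0$ for some $t \in [0, 1]$. Note that $\dot{\tilde Q}_n(\tilde\beta^{QIF}) = 0$ since $\tilde\beta^{QIF}$ is the minimizer of $\tilde Q_n(\beta)$; thus we have
\[
-n^{-1} \dot{\tilde Q}_n(\beta_0) = n^{-1} \ddot{\tilde Q}_n(\beta_0) (\tilde\beta^{QIF} - \beta_0) + \frac{1}{2} n^{-1} (\tilde\beta^{QIF} - \beta_0)^T \frac{\partial^2 \dot{\tilde Q}_n(\beta)}{\partial \beta\, \partial \beta^T} \bigg|_{\beta = \beta^*} (\tilde\beta^{QIF} - \beta_0).
\]
According to the Cauchy-Schwarz inequality, one has
\[
\Big\| n^{-1} (\tilde\beta^{QIF} - \beta_0)^T \frac{\partial^2 \dot{\tilde Q}_n(\beta)}{\partial \beta\, \partial \beta^T} (\tilde\beta^{QIF} - \beta_0) \Big\|^2 = O_P(d_n^2/n^2)\, O_P(d_n^3) = o_P(n^{-1}).
\]
By [2],
\[
n^{-1} \dot{\tilde Q}_n(\beta) = 2 \tilde S_n(\beta) + O_P\big( n^{-1} \rho_n^{-1} d_n \big), \qquad n^{-1} \ddot{\tilde Q}_n(\beta) = 2 \tilde H_n(\beta) + o_P(1),
\]
where $\tilde S_n(\beta)$ and $\tilde H_n(\beta)$ are defined in (S.18) and (S.19). Thus,
\[
-\big\{ 2 \tilde S_n(\beta_0) + O_P\big( n^{-1} \rho_n^{-1} d_n \big) \big\} = \big\{ 2 \tilde H_n(\beta_0) + o_P(1) \big\} (\tilde\beta^{QIF} - \beta_0) + o_P(n^{-1/2}).
\]
By Lemma S.10, the asymptotic distribution of $\sqrt{n} \tilde A_n \tilde\Sigma_{QIF}^{-1/2} (\tilde\beta^{QIF} - \beta_0)$ is the same as the asymptotic distribution of $\sqrt{n} \tilde A_n \tilde\Sigma_{QIF}^{-1/2} \tilde H_n^{-1}(\beta_0) \tilde S_n(\beta_0)$. Similar to the proof of Lemma S.12, here we derive the asymptotic normality of $\sqrt{n} \tilde A_n \tilde\Sigma_{QIF}^{-1/2} \tilde H_n^{-1}(\beta_0) \tilde S_n(\beta_0)$ by using the Cramér-Wold device and checking the Lindeberg condition. Note that $\tilde H_n(\beta_0) = \Psi_n + O_P(\sqrt{d_n^3/n})$ by Lemma S.6, where $\Psi_n = J_{DZ}^T (C_n^0)^{-1} J_{DZ}$, and
\begin{align*}
&\sqrt{n} \tilde A_n \tilde\Sigma_{QIF}^{-1/2} \tilde H_n^{-1}(\beta_0) \dot{\tilde G}_n^T(\beta_0) \tilde C_n^{-1}(\beta_0) \tilde G_n(\beta_0) \\
&\quad = \sqrt{n} \tilde A_n \tilde\Sigma_{QIF}^{-1/2} \tilde H_n^{-1}(\beta_0) \dot{\tilde G}_n^T(\beta_0) (C_n^0)^{-1} \tilde G_n(\beta_0) \\
&\qquad + \sqrt{n} \tilde A_n \tilde\Sigma_{QIF}^{-1/2} \tilde H_n^{-1}(\beta_0) \dot{\tilde G}_n^T(\beta_0) \tilde C_n^{-1}(\beta_0) \{C_n^0 - \tilde C_n(\beta_0)\} (C_n^0)^{-1} \tilde G_n(\beta_0) \\
&\quad = \sqrt{n} \tilde A_n \tilde\Sigma_{QIF}^{-1/2} \tilde H_n^{-1}(\beta_0) \big\{ J_{DZ} + O_P\big( J_n^{1-r} \big) \big\}^T \big\{ (C_n^0)^{-1} + o_P(\rho_n^{-2} d_x J_n^{r}) \big\} \big\{ \tilde G_n^0 + O_P\big( J_n^{-r} \big) \big\} \\
&\quad = \sqrt{n} \tilde A_n \tilde\Sigma_{QIF}^{-1/2} \Psi_n^{-1} \big\{ J_{DZ}^T (C_n^0)^{-1} \tilde G_n^0 \big\} \big\{ 1 + o_P\big( 1/\sqrt{n} \big) \big\}.
\end{align*}
For any $c \in R^{d_z}$ and $\|c\| = 1$,
\begin{align*}
\sqrt{n} c^T \tilde A_n \tilde\Sigma_{QIF}^{-1/2} \Psi_n^{-1} \big\{ J_{DZ}^T (C_n^0)^{-1} \tilde G_n^0 \big\} &= \sqrt{n} c^T \tilde A_n \tilde\Sigma_{QIF}^{-1/2} \Psi_n^{-1} \frac{1}{n^2} \sum_{i=1}^n \sum_{i'=1}^n \sum_{k,k'=1}^K Z_{i'}^T \Gamma_{0,i'}^{(k)} D_{i'} C_{kk'}^0 D_i^T \Gamma_{0,i}^{(k')} e_i \\
&= \sum_{i=1}^n b_i \epsilon_i,
\end{align*}
where
\[
b_i^2 = n^{-1} \Big\{ c^T \tilde A_n \tilde\Sigma_{QIF}^{-1/2} \Psi_n^{-1} \frac{1}{n} \sum_{i'=1}^n \sum_{k,k'=1}^K Z_{i'}^T \Gamma_{0,i'}^{(k)} D_{i'} C_{kk'}^0 D_i^T \Gamma_{0,i}^{(k')} \Big\}^{\otimes 2}_{\Sigma_i}.
\]
Note that $\max_i b_i^2 = O_P(\rho_n^{-5} d_n)$ and
\begin{align*}
\sum_{i=1}^n b_i^2 &= \mathrm{Var}\big\{ \sqrt{n} c^T \tilde A_n \tilde\Sigma_{QIF}^{-1/2} \Psi_n^{-1} \big\{ J_{DZ}^T (C_n^0)^{-1} \tilde G_n^0 \big\} \,\big|\, X, Z \big\} \\
&= \frac{1}{n} \sum_{i=1}^n \big\{ c^T \tilde A_n \tilde\Sigma_{QIF}^{-1/2} \Psi_n^{-1} J_{DZ}^T (C_n^0)^{-1} W_i \big\}^{\otimes 2}.
\end{align*}
Thus, one has $\max_i b_i^2 / \sum_{i=1}^n b_i^2 = O_P(\rho_n^{-5} n^{-1} d_n) = o_P(1)$. The Lindeberg-Feller central limit theorem implies that $\sum_{i=1}^n b_i \epsilon_i / (\sum_{i=1}^n b_i^2)^{1/2} \to N(0, 1)$. The proof is complete.

References.

[1] He, X. and Shi, P. (1996). Bivariate tensor-product B-splines in a partly linear model. J. Multivariate Anal. 58 162–181. MR1405586
[2] Qu, A., Lindsay, B. G. and Li, B. (2000). Improving generalised estimating equations using quadratic inference functions. Biometrika 87 823–836. MR1813977
[3] Stone, C. J. (1985). Additive regression and other nonparametric models. Ann. Statist. 13 689–705. MR0790566
[4] Wang, L., Liu, X., Liang, H. and Carroll, R. (2011). Estimation and variable selection for generalized additive partial linear models. Ann. Statist. 39 1827–1851. MR2893854
[5] Xue, L., Qu, A. and Zhou, J. (2010). Consistent model selection for marginal generalized additive model for correlated data. J. Amer. Statist. Assoc. 105 1518–1530. MR2796568

Li Wang
Department of Statistics
University of Georgia
Athens, GA 30602, USA
E-mail: lilywang@uga.edu

Lan Xue
Department of Statistics
Oregon State University
Corvallis, OR 97331, USA
E-mail: xuel@stat.oregonstate.edu

Annie Qu
Department of Statistics
University of Illinois at Urbana-Champaign
Champaign, IL 61820, USA
E-mail: anniequ@illinois.edu

Hua Liang
Department of Statistics
George Washington University
Washington, D.C. 20052, USA
E-mail: hliang@gwu.edu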