Covariate Adjusted Functional Principal Component Analysis (FPCA) for Longitudinal Data

Ci-Ren Jiang & Jane-Ling Wang
University of California, Davis
National Taiwan University, July 9, 2009

Ci-Ren Jiang, Ph.D. Candidate, UC Davis

Outline
- Introduction
- (Univariate) Covariate adjusted FPCA
- (Multivariate) Covariate adjusted FPCA
- FPCA as a building block for modeling
- Application to PET data

1. Introduction
- Principal component analysis is a standard dimension reduction tool for multivariate data.
- It has been extended to functional data and termed functional principal component analysis (FPCA).
- Standard FPCA approaches treat functional data as if they are from a single population.
- Our goal is to accommodate covariate information in the framework of FPCA for longitudinal data.

Functional vs. Longitudinal Data
- A sample of curves, with one curve, $X(t)$, per subject.
  - These curves are usually considered realizations of a stochastic process in $L^2(I)$.
  - Infinite-dimensional.
- Functional data: in reality, $X(t)$ is recorded at a regular and dense time grid, hence high-dimensional.
- Longitudinal data: irregularly sampled $X(t)$; often sparse, as in medical follow-up studies.

Longitudinal AIDS Data
- CD4 counts of 369 patients were recorded.
- The number $n_i$ of repeated measurements for subject $i$ varies, with an average of 6.44.
- This resulted in longitudinal data with uneven numbers of measurements at irregular time points.

[Figure: CD4 counts of the first 25 patients; CD4 count vs. time since seroconversion]

Review of FPCA
- Assume the data originate from a random function $X(t)$ with mean $\mu(t)$ and covariance function $\Gamma(s,t) = \mathrm{Cov}(X(s), X(t))$, where $s$ and $t$ lie in a compact interval.
- FPCA corresponds to a spectral decomposition of the covariance $\Gamma(s,t)$, which leads to the Karhunen-Loève decomposition of the random function:
$$X(t) = \mu(t) + \sum_k A_k \phi_k(t),$$
where $\mathrm{var}(A_k) = \lambda_k$, $\lambda_k$ and $\phi_k(t)$ are the eigenvalues and eigenfunctions of $\Gamma(s,t)$, and the scores $A_k = \int \{X(t) - \mu(t)\}\phi_k(t)\,dt$ are orthogonal.

Review of FPCA
- Both longitudinal and functional data may be observed with noise (measurement errors).
- The observed data for subject $i$ might be
$$Y_{ij} = Y_i(t_{ij}) = \mu(t_{ij}) + \sum_{k=1}^{\infty} A_{ik}\phi_k(t_{ij}) + e_i(t_{ij}).$$

Review of FPCA
- Functional data: Dauxois, Pousse & Romain (1982); Rice & Silverman (1991); Cardot (2000); Hall & Hosseini-Nasab (2006)
- Longitudinal data: Shi, Weiss & Taylor (1996); James, Sugar & Hastie (2000); Rice & Wu (2001); Yao, Müller & Wang (2005)

Steps to FPCA
1. Estimate the mean $\mu(t)$ and covariance $\Gamma(s,t)$. (This usually involves smoothing.)
2. Estimate the eigenvalues and eigenfunctions of $\Gamma(s,t)$.
3. Estimate the PC scores $A_{ik} = \int (X_i(t) - \mu(t))\phi_k(t)\,dt$.
- When functional data are observed at few and irregular time points, the functional PC scores cannot be estimated by the integration method.
- Yao et al. (2005) proposed PACE to resolve this issue:
$$\hat A_{ik} = \hat E(A_{ik} \mid Y_i) = \hat\lambda_k \hat\phi_k^T \hat\Sigma_{Y_i}^{-1}(Y_i - \mu_i),$$
where $\hat\phi_k$ and $\mu_i$ are evaluated at the observation times of subject $i$.

Estimation of Mean Function

[Photo: Taipei 101]
[Figure: CD4 counts of the first 25 patients; CD4 count vs. time since seroconversion]
[Figure: Mean curve: CD4 counts of all patients; CD4 count vs. time since seroconversion]

Estimation of Covariance Function
- Raw covariance plot: $[Y(t_{ij}) - \hat\mu(t_{ij})][Y(t_{ik}) - \hat\mu(t_{ik})]$, $j \neq k$
- $Y(t) = X(t) + e(t)$
- $\mathrm{Cov}(Y(s), Y(t)) = \mathrm{Cov}(X(s), X(t))$ if $s \neq t$, but $\mathrm{var}(Y(t)) = \mathrm{var}(X(t)) + \sigma^2$.
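The PACE formula from the "Steps to FPCA" slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `pace_scores` and its argument layout are hypothetical, and all inputs (mean, covariance, eigenfunctions evaluated at one subject's observation times) are assumed to have been estimated already.

```python
import numpy as np

def pace_scores(y, mu, gamma, phi, lam, sigma2):
    """PACE-style conditional-expectation scores for one subject (illustrative).

    y      : (N_i,)      noisy observations of the subject
    mu     : (N_i,)      mean function at the subject's time points
    gamma  : (N_i, N_i)  covariance of X at all pairs of time points
    phi    : (N_i, K)    first K eigenfunctions at the time points
    lam    : (K,)        corresponding eigenvalues
    sigma2 : float       measurement-error variance
    """
    # Covariance of the noisy vector Y_i: Gamma on all pairs plus sigma^2 on the diagonal.
    sigma_y = gamma + sigma2 * np.eye(len(y))
    # Solve Sigma_Y w = (Y_i - mu_i) instead of forming the inverse explicitly.
    w = np.linalg.solve(sigma_y, y - mu)
    # Score k is lambda_k * phi_k^T * Sigma_Y^{-1} (Y_i - mu_i).
    return lam * (phi.T @ w)

# Toy check with two time points and "eigenfunctions" equal to the identity.
phi = np.eye(2)
lam = np.array([2.0, 1.0])
gamma = phi @ np.diag(lam) @ phi.T
scores = pace_scores(np.array([3.0, 2.0]), np.zeros(2), gamma, phi, lam, sigma2=1.0)
```

In the toy case each score is the centered observation shrunk by $\lambda_k / (\lambda_k + \sigma^2)$, which shows how the measurement error pulls the predicted scores toward zero.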
[Figure: Covariance & Variance]

References
- Yao, Müller and Wang (2005, JASA): methods and theory for the mean and covariance functions.
- Hall, Müller and Wang (2006, AOS): theory on eigenfunctions and eigenvalues.

End of Introduction

2. Covariate adjusted FPCA: Univariate Z
- For dense functional data: Chiou, Müller & Wang (2003); Cardot (2006).
- Their methods do not work for sparse data.
- We propose two ways: fFPCA & mFPCA.

Covariate adjusted FPCA: Longitudinal Data
Suppose the data originate from a random function $X(t,z)$ with mean $\mu(t,z)$ and covariance function $\Gamma(s,t,z)$, where $z$ is the value of a covariate $Z$, and $s$ and $t$ are in a compact time interval.

Fully Adjusted FPCA (fFPCA)
- This approach assumes that the covariance function $\Gamma(s,t,z)$ varies with $z$, so that the corresponding eigenfunctions $\phi_k(t,z)$ and eigenvalues $\lambda_k(z)$ vary with $z$:
$$\Gamma(s,t,z) = \sum_k \lambda_k(z)\phi_k(s,z)\phi_k(t,z).$$
- The Karhunen-Loève expansion implies that the random trajectory $X(t,z)$ can be represented as
$$X(t,z) = \mu(t,z) + \sum_k A_k(z)\phi_k(t,z).$$

Mean Adjusted FPCA (mFPCA)
- The second approach takes the view that the covariate $Z$ is a random variable. If we pool all the subjects together after centering each individual curve to zero, we have a pooled covariance function
$$\Gamma^*(s,t) = \sum_k \lambda_k^*\phi_k^*(s)\phi_k^*(t).$$
- The Karhunen-Loève expansion thus implies that the random trajectory $X(t,z)$ can be represented as
$$X(t,z) = \mu(t,z) + \sum_k A_k^*\phi_k^*(t).$$

Estimation: Mean Function
The mean functions for fFPCA and mFPCA are the same and can be estimated using any two-dimensional scatter-plot smoother of $Y_{ij}$ on $(T_{ij}, Z_i)$. For example:

Nadaraya-Watson kernel estimator:
$$\hat\mu_{NW}(t,z) = \frac{\sum_{i=1}^n \sum_{j=1}^{N_i} K_2\!\left(\frac{t - T_{ij}}{h_{\mu,t}}, \frac{z - Z_i}{h_{\mu,z}}\right) Y_{ij}}{\sum_{i=1}^n \sum_{j=1}^{N_i} K_2\!\left(\frac{t - T_{ij}}{h_{\mu,t}}, \frac{z - Z_i}{h_{\mu,z}}\right)}$$

Estimation: Mean Function
Local linear estimator: $\hat\mu_L(t,z) = \hat\beta_0$, where for $\beta = (\beta_0, \beta_1, \beta_2)$
$$\hat\beta = \arg\min_\beta \sum_{i=1}^n \sum_{j=1}^{N_i} K_2\!\left(\frac{t - T_{ij}}{h_{\mu,t}}, \frac{z - Z_i}{h_{\mu,z}}\right) \left[Y_{ij} - \beta_0 - \beta_1(T_{ij} - t) - \beta_2(Z_i - z)\right]^2.$$

[Figure: CD4 counts of all patients and mean curve; CD4 count vs. time since seroconversion]
[Figure: AIDS CD4: estimated mean]

Estimation: Covariance Function
The covariance estimators can also be expressed as scatter-plot smoothers of the so-called "raw covariances", defined as
$$C_{ijk} = (Y_{ij} - \hat\mu(T_{ij}, Z_i))(Y_{ik} - \hat\mu(T_{ik}, Z_i)).$$
- fFPCA: three-dimensional smoother of $C_{ijk}$ on $(T_{ij}, T_{ik}, Z_i)$
- mFPCA: two-dimensional smoother of $C_{ijk}$ on $(T_{ij}, T_{ik})$

Estimation: Covariance Function
Since
$$\mathrm{cov}(Y_{ij}, Y_{ik} \mid T_{ij}, T_{ik}, Z_i) = \mathrm{cov}(X(T_{ij}, Z_i), X(T_{ik}, Z_i)) + \sigma^2\delta_{jk},$$
where $\delta_{jk}$ is 1 if $j = k$ and 0 otherwise, the diagonal of the "raw" covariances $C_{ijk}$ should not be included in the covariance function smoothing step.
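The local linear mean estimator and the off-diagonal raw covariances described above can be sketched as follows. This is an illustration under stated assumptions: the Gaussian product kernel stands in for the unspecified $K_2$, and the function names `local_linear_mean` and `raw_covariances` are hypothetical, not taken from the authors' software.

```python
import numpy as np

def local_linear_mean(t, z, T, Z, Y, h_t, h_z):
    """Local linear estimate of mu(t, z) from pooled (T_ij, Z_i, Y_ij) triples."""
    # Assumed kernel: Gaussian product kernel in time and covariate.
    w = np.exp(-0.5 * (((T - t) / h_t) ** 2 + ((Z - z) / h_z) ** 2))
    # Design matrix for beta = (beta_0, beta_1, beta_2), centered at (t, z).
    X = np.column_stack([np.ones_like(T), T - t, Z - z])
    # Weighted least squares via square-root weights.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], Y * sw, rcond=None)
    return beta[0]  # the intercept is the fitted mean at (t, z)

def raw_covariances(times, resid):
    """Off-diagonal raw covariances C_ijk (j != k) for one subject.

    resid holds Y_ij minus the fitted mean; the diagonal j == k is dropped
    because it is inflated by the measurement-error variance.
    """
    j, k = np.nonzero(~np.eye(len(resid), dtype=bool))
    return times[j], times[k], resid[j] * resid[k]

# Toy check: a local linear fit reproduces an exactly linear surface.
rng = np.random.default_rng(0)
T, Z = rng.uniform(0, 1, 200), rng.uniform(0, 1, 200)
Y = 1.0 + 2.0 * T + 3.0 * Z
mu_hat = local_linear_mean(0.3, 0.7, T, Z, Y, h_t=0.2, h_z=0.2)
```

The pairs returned by `raw_covariances` are the inputs to the two- or three-dimensional covariance smoother; a subject with $N_i$ observations contributes $N_i(N_i - 1)$ of them.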
Example of Covariance Estimates
- Local linear smoother for fFPCA: $\hat\Gamma_L(t,s,z) = \hat\beta_0$, where
$$\hat\beta = \arg\min_\beta \Big\{ \sum_{i=1}^n \sum_{1 \le j \ne k \le N_i} K_3\!\left(\frac{t - T_{ij}}{h_{G,t}}, \frac{s - T_{ik}}{h_{G,t}}, \frac{z - Z_i}{h_{G,z}}\right) \left[C_{ijk} - (\beta_0 + \beta_1(T_{ij} - t) + \beta_2(T_{ik} - s) + \beta_3(Z_i - z))\right]^2 \Big\}.$$
- Local linear smoother for mFPCA: $\hat\Gamma^*(t,s) = \hat\beta_0$, where
$$\hat\beta = \arg\min_\beta \Big\{ \sum_{i=1}^n \sum_{1 \le j \ne k \le N_i} K_1\!\left(\frac{t - T_{ij}}{h_{G^*}}\right) K_1\!\left(\frac{s - T_{ik}}{h_{G^*}}\right) \left[C_{ijk} - (\beta_0 + \beta_1(T_{ij} - t) + \beta_2(T_{ik} - s))\right]^2 \Big\}.$$

[Figure: AIDS CD4: estimated covariance]

Estimation: Variance of Measurement Errors
The variance of $Y(t)$ for a given $z$ is $V(t,z) = \Gamma(t,t,z) + \sigma^2$. It is estimated by $\hat V(t,z) = \hat\beta_0$, where
$$\hat\beta = \arg\min_\beta \sum_{i=1}^n \sum_{j=1}^{N_i} K_2\!\left(\frac{t - T_{ij}}{h_{V,t}}, \frac{z - Z_i}{h_{V,z}}\right) \left[C_{ijj} - \beta_0 - \beta_1(T_{ij} - t) - \beta_2(Z_i - z)\right]^2.$$

Estimation: Variance of Measurement Errors
For stability,
$$\hat\sigma^2 = \frac{2}{|\mathcal{T}||\mathcal{Z}|} \int_{\mathcal{Z}} \int_{\mathcal{T}_1} \{\hat V(t,z) - \hat\Gamma_L(t,t,z)\}\,dt\,dz,$$
where $\mathcal{T}_1 = [\inf\{t : t \in \mathcal{T}\} + |\mathcal{T}|/4,\ \sup\{t : t \in \mathcal{T}\} - |\mathcal{T}|/4]$.

[Figure: AIDS: estimated covariance + measurement error]

Estimation: Eigenvalues and Eigenfunctions
- fFPCA: the solutions of the eigen-equations
$$\int \hat\Gamma_L(t,s,z)\hat\phi_k(s,z)\,ds = \hat\lambda_k(z)\hat\phi_k(t,z),$$
where $\hat\phi_k(t,z)$ satisfies $\int \hat\phi_k^2(t,z)\,dt = 1$ and $\int \hat\phi_k(t,z)\hat\phi_m(t,z)\,dt = 0$ for $m < k$.
- mFPCA: the solutions of the eigen-equations
$$\int \hat\Gamma^*(t,s)\hat\phi_k^*(s)\,ds = \hat\lambda_k^*\hat\phi_k^*(t),$$
where $\hat\phi_k^*(t)$ satisfies $\int (\hat\phi_k^*(t))^2\,dt = 1$ and $\int \hat\phi_k^*(t)\hat\phi_m^*(t)\,dt = 0$ for $m < k$.

Estimation: Principal Component Scores
- fFPCA: use the conditional expectation (PACE) $E(A_{ik}(Z_i) \mid \tilde Y_i)$ to estimate the principal component scores, where $\tilde Y_i = (Y_{i1}, \ldots, Y_{iN_i})^T$.
- Under the assumption that $\tilde Y_i$ is multivariate normal:
$$\hat A_{ik}(Z_i) = \hat\lambda_k \hat\phi_{ik}^T \hat\Sigma_{\tilde Y_i}^{-1}(\tilde Y_i - \hat\mu_i),$$
where
$$\hat\mu_i = (\hat\mu(T_{i1}, Z_i), \ldots, \hat\mu(T_{iN_i}, Z_i))^T,$$
$$(\hat\Sigma_{\tilde Y_i})_{j,k} = \hat\Gamma_L(T_{ij}, T_{ik}, Z_i) + \hat\sigma^2\delta_{jk},$$
$$\hat\phi_{ik} = (\hat\phi_k(T_{i1}, Z_i), \ldots, \hat\phi_k(T_{iN_i}, Z_i))^T.$$

Estimation: Principal Component Scores
The prediction of principal component scores in mFPCA is similar.

Theoretical Results
Definition: A real function $f(x,y): R^{n+m} \to R$ is continuous on $A \subseteq R^n$ uniformly in $y \in R^m$ if, given any $x \in A$ and $\varepsilon > 0$, there exists a neighborhood of $x$ not depending on $y$, say $U(x)$, such that $|f(x',y) - f(x,y)| < \varepsilon$ for all $x' \in U(x)$ and $y \in R^m$.

Given an integer $Q \ge 1$ and for $q = 1, \ldots, Q$, let $\psi_q: R^3 \to R$ satisfy:
- C.1 The $\psi_q(t,z,y)$ are continuous on $U(\{t,z\})$ uniformly in $y \in R$.
- C.2 The functions $\frac{\partial^p}{\partial t^{p_1}\partial z^{p_2}}\psi_q(t,z,y)$ exist for all arguments $(t,z,y)$ and are continuous on $U(\{t,z\})$ uniformly in $y \in R$, for $p_1 + p_2 = p$ and $0 \le p_1, p_2 \le p$.

Notations: 2D Smoothers
The kernel-weighted averages for two-dimensional smoothers are defined as
$$\Psi_{qn} = \frac{1}{n E(N) h_{\mu,t}^{\nu_1+1} h_{\mu,z}^{\nu_2+1}} \sum_{i=1}^n \sum_{j=1}^{N_i} \psi_q(T_{ij}, Z_i, Y_{ij}) K_2\!\left(\frac{t - T_{ij}}{h_{\mu,t}}, \frac{z - Z_i}{h_{\mu,z}}\right).$$
Let
$$\alpha_q(t,z) = \frac{\partial^{|\nu|}}{\partial t^{\nu_1}\partial z^{\nu_2}} \int \psi_q(t,z,y) f_3(t,z,y)\,dy$$
and
$$\sigma_{qr}(t,z) = \int \psi_q(t,z,y)\psi_r(t,z,y) f_3(t,z,y)\,dy\ \|K_2\|^2,$$
where $f_3(t,z,y)$ is the joint density of $(T, Z, Y)$, $\|K_2\|^2 = \int K_2^2$, and $1 \le q, r \le Q$.

Theoretical Results: 2D Smoothers
Theorem 1. Let $H: R^Q \to R$ be a function with continuous first order derivatives, $DH(v) = \left(\frac{\partial}{\partial x_1}H(v), \ldots, \frac{\partial}{\partial x_Q}H(v)\right)^T$, and $\bar N = \frac{1}{n}\sum_{i=1}^n N_i$. Under suitable assumptions, and assuming $h_{\mu,z}/h_{\mu,t} \to \rho_\mu$ and $n E(N) h_{\mu,t}^{2|\kappa|+2} \to \tau_\mu^2$ for some $0 < \rho_\mu, \tau_\mu < \infty$, we obtain
$$\sqrt{n \bar N h_{\mu,t}^{2\nu_1+1} h_{\mu,z}^{2\nu_2+1}}\,[H(\Psi_{1n}, \ldots, \Psi_{Qn}) - H(\alpha_1, \ldots, \alpha_Q)] \xrightarrow{D} N(\beta_H, [DH(\alpha_1, \ldots, \alpha_Q)]^T \Sigma\, [DH(\alpha_1, \ldots, \alpha_Q)]),$$

Theoretical Results: 2D Smoothers (cont'd)
where $\Sigma = (\sigma_{qr})_{1 \le q, r \le Q}$ and
$$\beta_H = \sum_{k_1+k_2=|\kappa|} \frac{(-1)^{|\kappa|}}{k_1!\,k_2!} \left[\int s_1^{k_1} s_2^{k_2} K_2(s_1,s_2)\,ds_1 ds_2\right] \times \Big\{ \sum_{q=1}^Q \frac{\partial H}{\partial\alpha_q}[(\alpha_1, \ldots, \alpha_Q)^T]\, \frac{\partial^{k_1+k_2-\nu_1-\nu_2}}{\partial t^{k_1-\nu_1}\partial z^{k_2-\nu_2}}\alpha_q(t,z) \Big\}\, \tau_\mu \sqrt{\rho_\mu^{2k_2+1}}.$$

Mean Function: Nadaraya-Watson Est.
Corollary 1. Under suitable assumptions, and assuming $h_{\mu,z}/h_{\mu,t} \to \rho_\mu$ and $n E(N) h_{\mu,t}^6 \to \tau_\mu^2$ for some $0 < \rho_\mu, \tau_\mu < \infty$:
$$\sqrt{n \bar N h_{\mu,t} h_{\mu,z}}\,[\hat\mu_{NW}(t,z) - \mu(t,z)] \xrightarrow{D} N(\beta_{NW}, \Sigma_{NW}),$$
where
$$\beta_{NW} = \sum_{k_1+k_2=2} \frac{1}{k_1!\,k_2!} \left[\int s_1^{k_1} s_2^{k_2} K_2(s_1,s_2)\,ds_1 ds_2\right] \tau_\mu \sqrt{\rho_\mu^{2k_2+1}} \times \Big\{ \frac{1}{f_2(t,z)} \frac{\partial^2}{\partial t^{k_1}\partial z^{k_2}}\alpha_1(t,z) - \frac{\mu(t,z)}{f_2(t,z)} \frac{\partial^2}{\partial t^{k_1}\partial z^{k_2}} f_2(t,z) \Big\},$$
$$\Sigma_{NW} = \frac{\mathrm{Var}(Y \mid t,z)\,\|K_2\|^2}{f_2(t,z)},$$
$\alpha_1(t,z) = \mu(t,z) f_2(t,z)$, and $f_2(t,z)$ is the joint density of $(T, Z)$.

Mean Function: Local Linear Est.
Corollary. Under suitable assumptions, and assuming $h_{\mu,z}/h_{\mu,t} \to \rho_\mu$ and $n E(N) h_{\mu,t}^6 \to \tau_\mu^2$ for some $0 < \rho_\mu, \tau_\mu < \infty$:
$$\sqrt{n \bar N h_{\mu,t} h_{\mu,z}}\,[\hat\mu_L(t,z) - \mu(t,z)] \xrightarrow{D} N(\beta_L, \Sigma_L),$$
where
$$\beta_L = \sum_{k_1+k_2=2} \frac{1}{k_1!\,k_2!} \left[\int s_1^{k_1} s_2^{k_2} K_2(s_1,s_2)\,ds_1 ds_2\right] \frac{\partial^2}{\partial t^{k_1}\partial z^{k_2}}\mu(t,z)\, \tau_\mu \sqrt{\rho_\mu^{2k_2+1}},$$
$$\Sigma_L = \frac{\mathrm{Var}(Y \mid t,z)\,\|K_2\|^2}{f_2(t,z)},$$
and $f_2(t,z)$ is the joint density of $(T, Z)$.

Rate of Convergence
- If $E(N) < \infty$, the rate of convergence for the 2D mean and covariance functions is $n^{1/3}$. This is the optimal rate of convergence for 2D smoothers with independent data.
- If $E(N) \to \infty$, the rate of convergence can be as close to $n^{2/5}$ as possible, but not equal to $n^{2/5}$.
- If $N_i$ [...], the convergence rate is $\sqrt{n}$.
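As a quick check of where the $n^{1/3}$ rate comes from, here is a short sketch under the simplifying assumptions that $E(N)$ is bounded and that both bandwidths are proportional to a common $h$ (constants are ignored). The bandwidth condition in the mean-function corollaries fixes the order of $h$, and substituting it into the normalizing factor gives the rate:

$$n E(N) h^6 \to \tau_\mu^2 \ \Rightarrow\ h \asymp n^{-1/6}, \qquad \sqrt{n \bar N h_{\mu,t} h_{\mu,z}} \asymp \sqrt{n h^2} = \sqrt{n \cdot n^{-1/3}} = n^{1/3}.$$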
Notations: 3D Smoothers
The technique of kernel-weighted averages can be extended to three-dimensional smoothers to obtain their asymptotic normality. Given an integer $Q \ge 1$, let $\vartheta_q: R^5 \to R$ for $q = 1, \ldots, Q$ satisfy:
- D.1 The $\vartheta_q(t,s,z,y_1,y_2)$ are continuous on $U(\{t,s,z\})$ uniformly in $(y_1,y_2) \in R^2$.
- D.2 The functions $\frac{\partial^p}{\partial t^{p_1}\partial s^{p_2}\partial z^{p_3}}\vartheta_q(t,s,z,y_1,y_2)$ exist for all arguments $(t,s,z,y_1,y_2)$ and are continuous on $U(\{t,s,z\})$ uniformly in $(y_1,y_2) \in R^2$, for $p_1 + p_2 + p_3 = p$ and $0 \le p_1, p_2, p_3 \le p$.

Notations: 3D Smoothers (cont'd)
The general weighted averages of three-dimensional smoothing methods are defined as
$$\Theta_{qn}(t,s,z) = \frac{1}{n E(N(N-1)) h_{G,t}^{\nu_1+\nu_2+2} h_{G,z}^{\nu_3+1}} \sum_{i=1}^n \sum_{1 \le j \ne k \le N_i} \vartheta_q(T_{ij}, T_{ik}, Z_i, Y_{ij}, Y_{ik})\, K_3\!\left(\frac{t - T_{ij}}{h_{G,t}}, \frac{s - T_{ik}}{h_{G,t}}, \frac{z - Z_i}{h_{G,z}}\right).$$

Notations: 3D Smoothers (cont'd)
Let
$$\xi_q(t,s,z) = \frac{\partial^{|\nu|}}{\partial t^{\nu_1}\partial s^{\nu_2}\partial z^{\nu_3}} \int \vartheta_q(t,s,z,y_1,y_2) f_5(t,s,z,y_1,y_2)\,dy_1 dy_2,$$
$$\omega_{qr} = \int \vartheta_q(t,s,z,y_1,y_2)\vartheta_r(t,s,z,y_1,y_2) f_5(t,s,z,y_1,y_2)\,dy_1 dy_2\ \|K_3\|^2,$$
where $f_5(t,s,z,y_1,y_2)$ is the joint density of $(T_1, T_2, Z, Y_1, Y_2)$, $\|K_3\|^2 = \int K_3^2$, and $1 \le q, r \le Q$.

Theoretical Results: 3D Smoothers
Theorem. Let $H: R^Q \to R$ be a function with continuous first order derivatives, $DH(v) = \left(\frac{\partial}{\partial x_1}H(v), \ldots, \frac{\partial}{\partial x_Q}H(v)\right)^T$, and $\bar N = \frac{1}{n}\sum_{i=1}^n N_i$. Under suitable assumptions, $h_{G,z}/h_{G,t} \to \rho_G$ and $n E(N(N-1)) h_{G,t}^{2|\kappa|+3} \to \tau_G^2$ for some $0 < \rho_G, \tau_G < \infty$:
$$\sqrt{n \bar N(\bar N - 1) h_{G,t}^{2\nu_1+2\nu_2+2} h_{G,z}^{2\nu_3+1}}\,\{H(\Theta_{1n}, \ldots, \Theta_{Qn}) - H(\xi_1, \ldots, \xi_Q)\} \xrightarrow{D} N(\gamma_H, [DH(\xi_1, \ldots, \xi_Q)]^T \Omega\, [DH(\xi_1, \ldots, \xi_Q)]),$$

Theoretical Results: 3D Smoothers (cont'd)
where $\Omega = (\omega_{qr})_{1 \le q, r \le Q}$ and
$$\gamma_H = \sum_{q=1}^Q \sum_{\kappa_1+\kappa_2+\kappa_3=|\kappa|} \Big\{ \frac{(-1)^{|\kappa|}}{|\kappa|!} \int u_1^{\kappa_1} u_2^{\kappa_2} u_3^{\kappa_3} K_3(u_1,u_2,u_3)\,du_1 du_2 du_3 \Big\} \times \frac{d^{|\kappa|}}{dt^{\kappa_1} ds^{\kappa_2} dz^{\kappa_3}} \int \vartheta_q(t,s,z,y_1,y_2) f_5(t,s,z,y_1,y_2)\,dy_1 dy_2 \times \frac{\partial H}{\partial\xi_q}(\xi_1, \ldots, \xi_Q)^T\, \tau_G \sqrt{\rho_G^{2\kappa_3+1}}.$$

Covariance in fFPCA: Nadaraya-Watson Est.
Corollary. Under suitable assumptions, and assuming $h_{G,z}/h_{G,t} \to \rho_G$ and $n E(N(N-1)) h_{G,t}^7 \to \tau_G^2$ for some $0 < \rho_G, \tau_G < \infty$:
$$\sqrt{n \bar N(\bar N - 1) h_{G,t}^2 h_{G,z}}\,\{\hat\Gamma_{NW}(t,s,z) - \Gamma(t,s,z)\} \xrightarrow{D} N(\gamma_{NW}, \Omega_{NW}),$$

Covariance in fFPCA: Nadaraya-Watson (cont'd)
where
$$\gamma_{NW} = \frac{1}{2}\Big\{\sigma_1^2\tau_1 \frac{d^2}{dt^2}\Gamma(t,s,z) + \sigma_2^2\tau_1 \frac{d^2}{ds^2}\Gamma(t,s,z) + \sigma_3^2\tau_2 \frac{d^2}{dz^2}\Gamma(t,s,z)\Big\} + \Big\{\sigma_1^2\tau_1 \Big(\frac{d}{dt}\Gamma(t,s,z)\Big)\Big(\frac{d}{dt}g_3(t,s,z)\Big) + \sigma_2^2\tau_1 \Big(\frac{d}{ds}\Gamma(t,s,z)\Big)\Big(\frac{d}{ds}g_3(t,s,z)\Big) + \sigma_3^2\tau_2 \Big(\frac{d}{dz}\Gamma(t,s,z)\Big)\Big(\frac{d}{dz}g_3(t,s,z)\Big)\Big\} \Big/ g_3(t,s,z),$$
$$\Omega_{NW} = \frac{\upsilon_3(t,s,z)\,\|K_3\|^2}{g_3(t,s,z)},$$
and $g_3(t,s,z)$ is the joint density of $(T_1, T_2, Z)$.

Covariance in fFPCA: Local Linear Smoothers
Corollary. Under suitable assumptions, assuming $h_{G,z}/h_{G,t} \to \rho_G$ and $n E(N(N-1)) h_{G,t}^7 \to \tau_G^2$ for some $0 < \rho_G, \tau_G < \infty$:
$$\sqrt{n \bar N(\bar N - 1) h_{G,t}^2 h_{G,z}}\,\{\hat\Gamma_L(t,s,z) - \Gamma(t,s,z)\} \xrightarrow{D} N(\gamma_L, \Omega_L),$$
where
$$\gamma_L = \frac{1}{2}\Big\{\sigma_1^2\tau_1 \frac{d^2}{dt^2}\Gamma(t,s,z) + \sigma_2^2\tau_1 \frac{d^2}{ds^2}\Gamma(t,s,z) + \sigma_3^2\tau_2 \frac{d^2}{dz^2}\Gamma(t,s,z)\Big\},$$
$$\Omega_L = \frac{\upsilon_3(t,s,z)\,\|K_3\|^2}{g_3(t,s,z)},$$
and $g_3(t,s,z)$ is the joint density of $(T_1, T_2, Z)$.

Covariance in mFPCA: Local Linear Smoothers
Corollary. Under suitable assumptions, $h_{G^*} \to 0$, $n E(N^2) h_{G^*}^2 \to \infty$, $h_{G^*} E(N^3) \to 0$, and $n E(N(N-1)) h_{G^*}^6 \to \tau^2$ for some $0 \le \tau < \infty$, we obtain
$$\sqrt{n \bar N(\bar N - 1) h_{G^*}^2}\,\{\hat\Gamma^*(t,s) - \Gamma^*(t,s)\} \xrightarrow{D} N(\gamma^*, \Omega^*),$$
where
$$\gamma^* = \frac{\tau}{2} \int u^2 K_1(u)\,du\ \Big\{\frac{d^2}{dt^2}\Gamma^*(t,s) + \frac{d^2}{ds^2}\Gamma^*(t,s)\Big\},$$
$$\Omega^* = \frac{\upsilon_2(t,s)\,\|K_1\|^4}{g_2(t,s)},$$
$$\upsilon_2(t,s) = \mathrm{Var}\big((Y_1 - \mu(T_1, Z))(Y_2 - \mu(T_2, Z)) \mid T_1 = t, T_2 = s\big),$$
and $g_2(t,s)$ is the joint density of $(T_1, T_2)$.
Rates of Convergence
- If $E(N) < \infty$, the rate of convergence for the 3D covariance is $n^{2/7}$, which is the optimal rate of convergence for independent data.
- If $E(N) \to \infty$, the rate of convergence can be as close to $n^{2/5}$ as possible, but not equal to it.
- If $N_i$ [...], the convergence rate should be $\sqrt{n}$.

Theorem 3: Eigenvalues and Eigenfunctions in mFPCA
Theorem. Let $n^{\eta - 1/3} \le h_\mu = o(1)$ for some $\eta > 0$, and assume that for an integer $j_0 \ge 1$ there are no ties among the $(j_0 + 1)$ largest eigenvalues of $\Gamma^*(t,s)$; that
- (i) $n^{\eta_1 - 1/3} \le h_{G^*}$ for some $\eta_1 > 0$, $h_\mu = o(h_{G^*})$, $\max(n^{-1/3} h_{G^*}^{2/3},\ n^{-1} h_{G^*}^{-8/3}) = o(h_\mu)$, and $h_{G^*} = o(1)$;
- (ii) $n^{\eta - 3/8} \le h_{G^*}$, and $h_{G^*} + h_\mu = o(n^{-1/4})$.

Also, let $\Lambda = (\lambda_1, \ldots, \lambda_{j_0})^T$ and $\hat\Lambda = (\hat\lambda_1, \ldots, \hat\lambda_{j_0})^T$.

Theorem 3 (cont'd)
Let $N^* = \sum_{i=1}^n N_i(N_i - 1)$. Under assumptions (i),
$$\|\hat\phi_j - \phi_j\|^2 = \frac{C_{1j}}{N^* h_{G^*}} + C_{2j} h_{G^*}^4 + o_p\{(n h_{G^*})^{-1} + h_{G^*}^4\},$$
and under assumptions (ii), $\sqrt{n}(\hat\Lambda - \Lambda)$ is asymptotically multivariate normal with mean 0 and covariance matrix $\Sigma$.

Optimal Rates of Convergence
- The first $k$ eigenfunctions can be estimated at the same optimal rate as a one-dimensional nonparametric regression function.
- The largest $k$ eigenvalues can be estimated at the $\sqrt{n}$ rate.

Bandwidth Selection
- Mean function $\mu(t,z)$ and covariance $\Gamma^*(s,t)$: leave-one-subject-out cross-validation.
- Covariance function $\Gamma(s,t,z)$: k-fold cross-validation. Suppose that the subjects are randomly assigned to $k$ sets $(S_1, S_2, \ldots, S_k)$. Then
$$h = \arg\min_h \sum_{l=1}^k \sum_{i \in S_l} \sum_{1 \le j \ne m \le N_i} \{C_{ijm} - \hat\Gamma^{(-S_l)}(T_{ij}, T_{im}, z_i)\}^2,$$
where $\hat\Gamma^{(-S_l)}(T_{ij}, T_{im}, z_i)$ is the estimated covariance function at $(T_{ij}, T_{im}, z_i)$ when the subjects in $S_l$ are not used to estimate $\Gamma(t,s,z)$.
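The k-fold criterion above can be sketched as follows. This is only an illustration: for brevity it swaps in a Nadaraya-Watson smoother with a Gaussian product kernel where the slides use a local linear smoother, and the names `nw_cov`, `cv_score`, and `select_bandwidth` are hypothetical. The point it shows is that folds are formed over subjects, so all raw covariances of a held-out subject are predicted without that subject's data.

```python
import numpy as np

def nw_cov(t, s, z, T1, T2, Z, C, h_t, h_z):
    """Nadaraya-Watson smooth of raw covariances C at (t, s, z) (assumed Gaussian kernel)."""
    w = np.exp(-0.5 * (((t - T1) / h_t) ** 2 + ((s - T2) / h_t) ** 2
                       + ((z - Z) / h_z) ** 2))
    return np.sum(w * C) / np.sum(w)

def cv_score(subj, T1, T2, Z, C, h_t, h_z, k=5, seed=0):
    """Sum of squared prediction errors of the raw covariances over k subject folds."""
    rng = np.random.default_rng(seed)
    # Randomly assign subjects (not individual observations) to k sets.
    folds = np.array_split(rng.permutation(np.unique(subj)), k)
    score = 0.0
    for fold in folds:
        held_out = np.isin(subj, fold)
        keep = ~held_out
        for a in np.flatnonzero(held_out):
            pred = nw_cov(T1[a], T2[a], Z[a], T1[keep], T2[keep], Z[keep],
                          C[keep], h_t, h_z)
            score += (C[a] - pred) ** 2
    return score

def select_bandwidth(subj, T1, T2, Z, C, candidates, k=5):
    """Return the (h_t, h_z) pair from `candidates` with the smallest CV score."""
    scores = [cv_score(subj, T1, T2, Z, C, h_t, h_z, k=k) for h_t, h_z in candidates]
    return candidates[int(np.argmin(scores))]
```

Here `subj`, `T1`, `T2`, `Z`, and `C` are flat arrays with one entry per off-diagonal pair $(i, j, m)$. A call such as `select_bandwidth(subj, T1, T2, Z, C, [(0.2, 0.2), (0.4, 0.4)])` compares two candidate bandwidth pairs.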
Number of Eigenfunctions
We used three methods:
- AIC
- BIC
- FVE: the minimum number of eigen-components needed to explain at least a specified fraction of the total variation.

Predicted Trajectory for X(t)
Suppose that the first $K$ eigenfunctions are used to predict the trajectories. Given $t \in \mathcal{T}$ and $z \in \mathcal{Z}$, the predicted trajectory of $X_i(t,z)$ based on the first $K$ eigenfunctions is
$$\hat X_i^K(t,z) = \hat\mu_L(t,z) + \sum_{k=1}^K \hat A_{ik}(z)\hat\phi_k(t,z) \quad \text{(fFPCA)}$$
$$\hat X_i^K(t,z) = \hat\mu_L(t,z) + \sum_{k=1}^K \hat A_{ik}^*\hat\phi_k^*(t) \quad \text{(mFPCA)}$$

Simulation Study
- Covariate $Z \sim U(0,1)$
- $\mu(t,z) = t + z\sin(t) + (1 - z)\cos(t)$
- $\phi_1(t,z) = -\cos(\pi(t + z/2))\sqrt{2}$ and $\phi_2(t,z) = \sin(\pi(t + z/2))\sqrt{2}$
- $\lambda_1(z) = z/9$, $\lambda_2(z) = z/36$, and $\lambda_k(z) = 0$ for $k \ge 3$
- $A_{ik} \sim N(0, \lambda_k(z))$
- Measurement errors $\sim N(0, 0.05^2)$

The simulation consists of 100 runs. The number of subjects is 100 in each run.

[Figure: True and estimated mean surface]
[Figure: Simulation: estimated eigenfunctions (mFPCA)]
[Figure: Simulation: estimated eigenfunctions (fFPCA)]

Simulation Study

| covariate $z$ | 0.1 | 0.3 | 0.5 | 0.7 | 0.9 |
|---|---|---|---|---|---|
| ISE of $\hat\Gamma_L$ | 0.00015 | 0.00025 | 0.00071 | 0.0014 | 0.0030 |
| ISE of $\hat\phi_1(t,z)$ | 0.0294 | 0.0076 | 0.0071 | 0.0074 | 0.0112 |
| ISE of $\hat\phi_2(t,z)$ | 0.2720 | 0.0305 | 0.0242 | 0.0179 | 0.0300 |
| $\hat\lambda_1(z)$ | 0.0047 (0.0073) | -0.0041 (0.0106) | -0.0113 (0.0181) | -0.0202 (0.0205) | -0.0242 (0.0333) |
| $\hat\lambda_2(z)$ | 0.0034 (0.0045) | 0.0001 (0.0039) | 0.0005 (0.0057) | -0.0002 (0.0077) | -0.0037 (0.0094) |

Simulation results of fFPCA. The three rows corresponding to ISE are the average integrated squared errors over the 100 simulations, and the rows corresponding to $\hat\lambda_i$ give the bias and standard deviation (in brackets).

Simulation Study

| Criterion | Selection | uFPCA | mFPCA | fFPCA |
|---|---|---|---|---|
| MISE | FVE | 0.0325 | 0.0103 | 0.0085 |
| MISE | AIC | 0.0198 | 0.0063 | 0.0077 |
| MISE | BIC | 0.0197 | 0.0063 | 0.0077 |
| MSFE | FVE | 0.0067 | 0.0050 | 0.0022 |
| MSFE | AIC | 0.0065 | 0.0017 | 0.0015 |
| MSFE | BIC | 0.0065 | 0.0017 | 0.0015 |

$$\mathrm{MISE} = \frac{1}{n}\sum_{i=1}^n \int_0^1 (X_i(t, z_i) - \hat X_i^K(t, z_i))^2\,dt, \qquad \mathrm{MSFE} = \frac{1}{n}\sum_{i=1}^n \frac{1}{N_i}\sum_{j=1}^{N_i} (Y_{ij} - \hat Y_{ij})^2.$$

Conclusions
- Through simulations and data analysis, we have shown that current approaches for functional principal component analysis are no longer suitable for functional data when covariate information is available.
- Numerical evidence supports the simpler mean-adjusted approach, especially when the purpose is to predict the trajectories $Y(t)$.
- The catch is the high-dimensional smoothing involved with a vector $Z$. Some dimension reduction on $Z$ will be needed for practical implementation, and this will be a future research project.

End of Single Covariate

* Multidimensional Covariates
Assume that $Z \in R^p$ and that only the mean function depends on $Z$:
- $\mu(t,z) = \mu(t, \beta^T z)$ (single index), or
- $\mu(t,z) = \mu(t, \beta_1^T z, \beta_2^T z, \ldots, \beta_k^T z)$, $k < p$ (multiple indices).

Dimension Reduction Models
- There are many ways to estimate the indices for independent data, i.e. when there is no $t$: $Y = \mu(\beta_1^T z, \beta_2^T z, \ldots, \beta_k^T z) + \varepsilon$.
- Few have been extended to longitudinal or functional data, and none to the multi-index model $Y(t) = \mu(t, \beta_1^T z, \beta_2^T z, \ldots, \beta_k^T z) + \varepsilon(t)$.
- We choose the "MAVE" approach of Xia et al. (2002) and extend it to longitudinal data.

$\sqrt{n}$-convergence of $\hat\beta$
Theorem. Let $\hat\beta$ be the estimator of $\beta_0$ in the algorithm. Under some regularity conditions, we have
$$\sqrt{n}(\hat\beta - \beta_0) \xrightarrow{D} N(0, \Sigma),$$
where
$$\Sigma = [E(G(T,Z))]^+\, \Sigma^*\, [E(G(T,Z))]^+,$$
$$G(t,z) = \left(\frac{d\mu(t, \beta_0^T z)}{d(\beta_0^T z)}\right)^2 \left(z z^T - m(t,z)\, m(t,z)^T\right),$$
$$G_0(t,z) = \left(\frac{d\mu(t, \beta_0^T z)}{d(\beta_0^T z)}\right)(z - m(t,z)),$$
$$m(t,z) = E(Z \mid T = t, \beta_0^T Z = \beta_0^T z),$$
and $A^+$ is the Moore-Penrose inverse of the matrix $A$.
$\sqrt{n}$-convergence of $\hat\beta$ (cont'd)
$$\Sigma^* = \frac{EN - 1}{EN}\, E\big(\{G_0(T,Z)\varepsilon\}\{G_0(T,Z)\varepsilon\}^T\big) + \frac{1}{EN}\, E\big(\{G_0(T,Z)\varepsilon\}\{G_0(T,Z)\varepsilon\}^T\big)$$

[Figure: AIDS CD4: estimated mean]
[Figure: AIDS: estimated covariance + measurement error]

End of Multidimensional Covariates

3. What's Next After FPCA?
- FPCA can be the end product: to explore the covariate effects, to recover the trajectories of each subject, to explore the modes of variation, etc.
- FPCA can help to find a more parsimonious model.

[Figure: AIDS CD4: estimated mean]

AIDS CD4 Data
- This suggests the possibility of a more parsimonious model with multiplicative covariate effects: $Y(t)$ = (function of $t$) × (function of $z$) + $e(t)$.
- The function of $t$ could be parametric, e.g. a polynomial.
- Common marginal models for longitudinal data take the additive form and employ parametric models for both the mean and the covariance function. Both parametric forms are difficult to detect for sparse and noisy longitudinal data.

[Figure: AIDS CD4: estimated covariance]
[Figure: AIDS CD4: estimated eigenfunctions]

MSE:

| Selection | MSE | K |
|---|---|---|
| FVE | 0.1154 | 1 |
| AIC (BIC) | 0.0937 | 3 |

Adding Random Effects
- FPCA helps to identify the form of the random effects: $Y(t)$ = (function of $t$) × (function of $z$) + $a + bt$ + $e(t)$, where $a + bt$ are the random effects.

Semiparametric Product Model
If we assume that the first eigenfunction is proportional to the population mean function $\mu(t,Z)$ and discard the remaining eigenfunctions, we arrive at the following multiplicative random effects model:
$$Y(t) = \mu(t,z) + A\,\mu(t,z) + e(t) = b\,\mu(t,z) + e(t),$$
where $A$ and $b$ are random effects.

[Figure: PET data: first eigenfunction vs. time]

4. Dynamic Positron Emission Tomography (PET) Time Course Data
Joint work with Ci-Ren Jiang (UC Davis) and John Aston (Academia Sinica & Univ. of Warwick)

Dynamic PET Time-Course Data
- PET is a nuclear medicine imaging technique which produces a three-dimensional image or picture of functional processes in the body.
- A PET scan measures important body functions, such as blood flow, oxygen use, and sugar (glucose) metabolism, to help doctors evaluate how well organs and tissues are functioning.

Measured 11C-Diprenorphine Data
- The scan is an experiment on epilepsy.
- The chemical compound diprenorphine measures the concentration of opioid (pain) receptors in the brain.
- The idea of the overall experiment was to see whether the concentration of receptors differs between epileptic and normal subjects.
- However, the hypothesized changes were very small, so it was important that the experiment yield measurements that are as accurate as possible.

Measured 11C-Diprenorphine Data
- A dynamic scan from a measured 11C-diprenorphine study of a normal subject was analyzed.
- Four-dimensional data: three spatial + one temporal, 128 × 128 × 95 × 32.
- Voxel sizes were 2.096 mm × 2.096 mm × 2.43 mm.
- Scans were rebinned into 32 time frames of increasing duration: t = 27.5, 60, 70, 80, 100, 130, ..., 1075, 1195, 1315, ..., 4435, 5035 seconds.

[Figure: Example of analysis for five voxels]

Motivation
- Due to experimental constraints, the time course measurements are often fairly noisy.
- The Spectral Analysis method (Cunningham and Jones, 1993) is well known to be sensitive to noise, with the bias being highly dependent on the level of noise present.
- By borrowing information across space through the use of a non-parametric covariate adjustment, it is possible to reconstruct the PET time course data and thus reduce noise.

Motivation
- Many of the processes present in the PET time course data have chemical rates associated with them.
- These rates depend on a large number of biological factors, too numerous and complex to be exhaustively represented or identified in the discretely and noisily measured data.
- However, if we take the alternative viewpoint that the rates are random variables, then a small additive random change in one rate will lead to a multiplicative change in the time course.
Multiplicative Nonparametric Random Effects Model
Since $\phi_1(t,z) \propto \mu(t,z)$, from the random curve viewpoint the concentration curve of voxel $i$ with covariate $z_i$ can be represented as
$$X_i(t, z_i) = \mu(t, z_i) + \sum_{k=1}^{\infty} A_{ik}\phi_k(t, z_i) = B_i\,\mu(t, z_i) + \sum_{k=2}^{\infty} A_{ik}\phi_k(t, z_i),$$
where $B_i = 1 + \alpha A_{i1}$ and $\phi_1(t,z) = \alpha\mu(t,z)$.

Estimation Procedures
- Determine in-brain voxels.
- Apply a 2D smoother to the set $\{(Y_{ij}, t_j, z_i) \mid i = 1, \ldots, n;\ 1 \le j \le p\}$ to estimate $\mu(t,z)$.
- Apply the least squares method to estimate $B_i$.
- Apply a 3D smoother to the set $\{(G_{ijk}, t_j, t_k, z_i) \mid i = 1, \ldots, n;\ 1 \le j \ne k \le p\}$ to estimate $\Gamma(t,s,z)$, where $G_{ijk} = (Y_{ij} - \hat B_i\hat\mu(t_j, z_i))(Y_{ik} - \hat B_i\hat\mu(t_k, z_i))$.

Estimation Procedures
- Estimate $\lambda_k(z)$ and $\phi_k(t,z)$, and apply FVE to choose the number of eigenfunctions.
- Apply the integration method to estimate the principal component scores $A_{ik}(z_i)$.
- Reconstruct the random curve for each voxel: $\hat X_i^K(t) = \hat B_i\hat\mu(t, z_i) + \sum_{k=2}^K \hat A_{ik}(z_i)\hat\phi_k(t, z_i)$.
- Apply a parametric method to the reconstructed curves.

Variable Bandwidth b(t)
- A global bandwidth is appropriate along the covariate coordinate, but not desirable along the time coordinate: the measurement schedule is denser at the beginning, and there is a sharp peak near the left boundary.
- Smaller bandwidths are preferred near the peak, while larger bandwidths are used near the right boundary.

Choose Variable Bandwidth b(t)
- Choose 13 time locations.
- $[t - b(t), t + b(t)]$ includes at least four observations.
- Apply a boundary correction to ensure a positive bandwidth.
- To ensure a smooth outcome, fit a polynomial of order 4 to the pairs $(t_j, b(t_j))$.
- The resulting $b(t)$ is further multiplied by a constant $\alpha$, determined by a cross-validation step.
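The variable-bandwidth recipe above can be sketched as follows. Several details are assumptions made for illustration, since the slides do not specify them: the 13 locations are taken equally spaced, "order 4" is read as polynomial degree 4, positivity is enforced by flooring the fitted curve at the smallest raw bandwidth, and the names `min_bandwidth` and `variable_bandwidth` are hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial

def min_bandwidth(t0, obs_times, min_obs=4):
    """Smallest b such that [t0 - b, t0 + b] contains at least `min_obs` observation times."""
    dist = np.sort(np.abs(obs_times - t0))
    return dist[min_obs - 1]

def variable_bandwidth(obs_times, n_loc=13, min_obs=4, degree=4, alpha=1.0):
    """Return a smooth bandwidth function b(t) built from raw local bandwidths."""
    # Assumption: equally spaced locations over the observed time range.
    locs = np.linspace(obs_times.min(), obs_times.max(), n_loc)
    raw = np.array([min_bandwidth(t, obs_times, min_obs) for t in locs])
    # Polynomial.fit rescales the time axis internally, which keeps the fit well conditioned.
    poly = Polynomial.fit(locs, raw, deg=degree)
    # Assumption: floor the fitted curve so the bandwidth stays positive.
    floor = raw.min()
    return lambda t: alpha * np.maximum(poly(t), floor)

# Hypothetical schedule with frames of increasing duration (not the study's actual grid).
obs_times = np.cumsum(np.linspace(5.0, 300.0, 32))
b = variable_bandwidth(obs_times)
```

In practice `alpha` would be chosen by a cross-validation step, as on the slide; here it is simply passed in as a parameter.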
[Figure: Example of analysis for five voxels]

Spectral Analysis (Cunningham and Jones, 1993)
Spectral Analysis does not assume a known compartmental structure, but rather performs a model selection through a non-negativity constraint on the parameters. In particular, the concentration curve $X(t)$ is parameterized by
$$X(t) = I(t) \otimes \sum_{j=1}^K \alpha_j \exp(-\beta_j t),$$
where $I(t)$ is a known input function and $\alpha_j$ and $\beta_j$ are the non-negative parameters to be estimated. The parameter of interest $V_T$ is the integral of the impulse response function:
$$V_T = \int_0^\infty \sum_{j=1}^K \alpha_j \exp(-\beta_j t)\,dt = \sum_{j=1}^K \frac{\alpha_j}{\beta_j}.$$

Conclusions
- This is consistent with the knowledge that Spectral Analysis has high positive bias at voxel noise levels of around 5%.
- By reconstructing the data through fFPCA, the noise level is reduced, and thus the level of bias is also reduced. (Overall mean squared residuals are reduced by 71.82%.)
- The covariate adjusted FPCA can be applied in practice to measured PET data using spatially pooled information.

Thank You
The End

[Photo: Sedona, Arizona, 2006 IMS WNAR]

Covariate adjusted FPCA: Longitudinal Data
- We propose two ways: fFPCA & mFPCA.
- Both consist of two parts: a systematic part for the mean function and a stochastic part for the covariance function.
- Difference: the handling of the covariance structure.