Covariate adjusted FPCA

Covariate Adjusted
Functional Principal Component Analysis
(FPCA) for Longitudinal Data
Ci-Ren Jiang & Jane-Ling Wang
University of California, Davis
National Taiwan University
July 9, 2009
Ci-Ren Jiang
Ph. D. Candidate, UC Davis
Outline
• Introduction
• (Univariate) Covariate adjusted FPCA
• (Multivariate) Covariate adjusted FPCA
• FPCA as a building block for modeling
• Application to PET data
1. Introduction
 Principal Component analysis is a standard
dimension reduction tool for multivariate data. It
has been extended to functional data and termed
functional principal component analysis (FPCA).
 Standard FPCA approaches treat functional data
as if they are from a single population.
 Our goal is to accommodate covariate information
in the framework of FPCA for longitudinal data.
Functional vs. Longitudinal Data
• A sample of curves, with one curve, X(t), per subject.
  - These curves are usually considered realizations of a stochastic process in $L^2(I)$.
  - Such data are infinite-dimensional.
• Functional data: in practice, X(t) is recorded on a regular, dense time grid, so the observed data are high-dimensional.
• Longitudinal data: X(t) is sampled irregularly.
  - Often sparse, as in medical follow-up studies.
Longitudinal AIDS Data
• CD4 counts of 369 patients were recorded. The number of repeated measurements, $n_i$, for subject i varies, with an average of 6.44.
• This resulted in longitudinal data with uneven numbers of measurements at irregular time points.
[Figure: CD4 counts of the first 25 patients. x-axis: time since seroconversion (-3 to 6); y-axis: CD4 count (0 to 3500).]
Review of FPCA
• Assume the data originate from a random function $X(t)$ with mean $\mu(t)$ and covariance function $\Gamma(s,t) = \mathrm{Cov}(X(s), X(t))$, where $s$ and $t$ lie in a compact interval.
• FPCA corresponds to a spectral decomposition of the covariance $\Gamma(s,t)$, which leads to the Karhunen-Loève decomposition of the random function:
\[ X(t) = \mu(t) + \sum_k A_k\,\phi_k(t), \]
where $\lambda_k = \mathrm{var}(A_k)$ and $\phi_k(t)$ are the eigenvalues and eigenfunctions of $\Gamma(s,t)$, and the scores $A_k = \int \{X(t) - \mu(t)\}\,\phi_k(t)\,dt$ are orthogonal.
Review of FPCA
• Both longitudinal and functional data may be observed with noise (measurement errors).
• The observed data for subject i might be
\[ Y_{ij} = Y_i(t_{ij}) = \mu(t_{ij}) + \sum_{k=1}^{\infty} A_{ik}\,\phi_k(t_{ij}) + e_i(t_{ij}). \]
Review of FPCA
Functional Data
Dauxois, Pousse & Romain (1982)
Rice & Silverman (1991)
Cardot (2000)
Hall & Hosseini-Nasab (2006)
Longitudinal Data
Shi, Weiss & Taylor(1996)
James, Sugar & Hastie(2000)
Rice & Wu (2001)
Yao, Müller & Wang (2005)
Steps to FPCA
1. Estimate the mean $\mu(t)$ and covariance $\Gamma(s,t)$. (This usually involves smoothing.)
2. Estimate the eigenvalues and eigenfunctions of $\Gamma(s,t)$.
3. Estimate the PC scores $A_{ik} = \int (X_i(t) - \mu(t))\,\phi_k(t)\,dt$.
• When functional data are observed at few, irregular time points, the functional PC scores cannot be estimated by the integration method.
• Yao et al. (2005) proposed PACE to resolve this issue:
\[ \hat A_{ik} = \hat E(A_{ik} \mid Y_i) = \hat\lambda_k\,\hat\phi_k^{T}\,\hat\Sigma_{Y_i}^{-1}\,(Y_i - \mu_i). \]
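The PACE formula can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the function name `pace_scores` and its argument layout are hypothetical, and it assumes the mean, eigenfunctions, covariance and error variance have already been estimated and evaluated at one subject's observation times.

```python
import numpy as np

def pace_scores(y, mu, phi, lam, cov_x, sigma2):
    """Conditional-expectation (PACE-type) score estimates for one subject.

    y      : (m,) observations Y_i at the subject's time points
    mu     : (m,) mean function at those time points
    phi    : (m, K) eigenfunctions at those time points
    lam    : (K,) eigenvalues
    cov_x  : (m, m) covariance of X at those time points
    sigma2 : measurement-error variance
    """
    sigma_y = cov_x + sigma2 * np.eye(len(y))    # Sigma_{Y_i} = Cov(X) + sigma^2 I
    resid = np.linalg.solve(sigma_y, y - mu)     # Sigma_{Y_i}^{-1} (Y_i - mu_i)
    return lam * (phi.T @ resid)                 # lambda_k * phi_k^T * (...)
```

Solving the linear system instead of inverting $\hat\Sigma_{Y_i}$ explicitly is a numerical convenience; the result is the same quantity as in the formula.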
Estimation of Mean Function
[Figure: CD4 counts of the first 25 patients. x-axis: time since seroconversion; y-axis: CD4 count.]
[Figure: Mean curve, CD4 counts of all patients. x-axis: time since seroconversion; y-axis: CD4 count.]
Estimation of Covariance Function
Raw covariance plot: $[Y(t_{ij}) - \hat\mu(t_{ij})]\,[Y(t_{ik}) - \hat\mu(t_{ik})]$, j, k
With $Y(t) = X(t) + e(t)$,
\[ \mathrm{Cov}(Y(s), Y(t)) = \mathrm{Cov}(X(s), X(t)) \quad \text{if } s \ne t, \]
but
\[ \mathrm{var}(Y(t)) = \mathrm{var}(X(t)) + \sigma^2. \]
Covariance & Variance
References
 Yao, Müller and Wang (2005, JASA)
Methods and theory for the mean and
covariance functions.
 Hall, Müller and Wang (2006, AOS)
Theory on eigenfunctions and eigenvalues.
End of Introduction
2. Covariate adjusted FPCA – Univariate Z
• For dense functional data: Chiou, Müller & Wang (2003); Cardot (2006).
• Their methods do not work for sparse data.
• We propose two approaches: fFPCA and mFPCA.
Covariate adjusted FPCA: Longitudinal Data
Suppose the data originate from a random function $X(t,z)$ with mean $\mu(t,z)$ and covariance function $\Gamma(s,t,z)$, where $z$ is the value of a covariate $Z$, and $s$ and $t$ lie in a compact time interval.
Fully Adjusted FPCA (fFPCA)
• This approach assumes that the covariance function $\Gamma(s,t,z)$ varies with $z$, so that the corresponding eigenfunctions $\phi_k(t,z)$ and eigenvalues $\lambda_k(z)$ vary with $z$:
\[ \Gamma(s,t,z) = \sum_k \lambda_k(z)\,\phi_k(s,z)\,\phi_k(t,z). \]
• The Karhunen-Loève expansion implies that the random trajectory $X(t,z)$ can be represented as
\[ X(t,z) = \mu(t,z) + \sum_k A_k(z)\,\phi_k(t,z). \]
Mean Adjusted FPCA (mFPCA)
• The second approach takes the view that the covariate $Z$ is a random variable. If we pool all the subjects together after centering each individual curve to zero, we obtain a pooled covariance function
\[ \Gamma^*(s,t) = \sum_k \lambda_k^*\,\phi_k^*(s)\,\phi_k^*(t). \]
• The Karhunen-Loève expansion thus implies that the random trajectory $X(t,z)$ can be represented as
\[ X(t,z) = \mu(t,z) + \sum_k A_k^*\,\phi_k^*(t). \]
Estimation: Mean Function
The mean functions for fFPCA and mFPCA are the same and can be estimated with any two-dimensional scatter-plot smoother of $Y_{ij}$ on $(T_{ij}, Z_i)$.
For example, the Nadaraya-Watson kernel estimator:
\[ \hat\mu_{NW}(t,z) = \frac{\sum_{i=1}^{n}\sum_{j=1}^{N_i} K_2\!\left(\frac{t-T_{ij}}{h_{\mu,t}}, \frac{z-Z_i}{h_{\mu,z}}\right) Y_{ij}}{\sum_{i=1}^{n}\sum_{j=1}^{N_i} K_2\!\left(\frac{t-T_{ij}}{h_{\mu,t}}, \frac{z-Z_i}{h_{\mu,z}}\right)}. \]
Estimation: Mean Function
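Local linear estimator: $\hat\mu_L(t,z) = \hat\beta_0$, where for $\beta = (\beta_0, \beta_1, \beta_2)$,
\[ \hat\beta = \arg\min_{\beta} \sum_{i=1}^{n}\sum_{j=1}^{N_i} K_2\!\left(\frac{t-T_{ij}}{h_{\mu,t}}, \frac{z-Z_i}{h_{\mu,z}}\right) \left[Y_{ij} - \beta_0 - \beta_1 (T_{ij}-t) - \beta_2 (Z_i - z)\right]^2. \]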
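The local linear criterion is a weighted least-squares problem at each evaluation point $(t, z)$. A minimal sketch follows, under stated assumptions: $K_2$ is taken to be a product of two Epanechnikov kernels (the slides do not specify the kernel), and the names `epanechnikov` and `local_linear_mean` are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """Univariate Epanechnikov kernel (an assumed choice of kernel)."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def local_linear_mean(t0, z0, T, Z, Y, h_t, h_z):
    """Local linear estimate of mu(t0, z0) from pooled arrays (T_ij, Z_i, Y_ij)."""
    # Product kernel K_2 evaluated at the scaled distances (assumption).
    w = epanechnikov((t0 - T) / h_t) * epanechnikov((z0 - Z) / h_z)
    # Design matrix for (beta_0, beta_1, beta_2), centred at (t0, z0).
    X = np.column_stack([np.ones_like(T), T - t0, Z - z0])
    sw = np.sqrt(w)
    # Weighted least squares via ordinary least squares on sqrt-weighted rows.
    beta, *_ = np.linalg.lstsq(X * sw[:, None], Y * sw, rcond=None)
    return beta[0]  # the intercept is mu_hat(t0, z0)
```

Because the fit is linear in $(T_{ij} - t, Z_i - z)$, the estimator reproduces an exactly linear mean surface without bias, which makes a convenient sanity check.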
[Figure: CD4 counts of all patients with the mean curve. x-axis: time since seroconversion; y-axis: CD4 count.]
AIDS CD4: Estimated Mean
Estimation: Covariance Function
The covariance estimators can also be expressed as scatter-plot smoothers of the so-called "raw covariances", defined as
\[ C_{ijk} = (Y_{ij} - \hat\mu(T_{ij}, Z_i))\,(Y_{ik} - \hat\mu(T_{ik}, Z_i)). \]
• fFPCA: three-dimensional smoother of $C_{ijk}$ on $(T_{ij}, T_{ik}, Z_i)$.
• mFPCA: two-dimensional smoother of $C_{ijk}$ on $(T_{ij}, T_{ik})$.
Estimation: Covariance Function
Since
\[ \mathrm{cov}(Y_{ij}, Y_{ik} \mid T_{ij}, T_{ik}, Z_i) = \mathrm{cov}(X(T_{ij}, Z_i), X(T_{ik}, Z_i)) + \sigma^2 \delta_{jk}, \]
where $\delta_{jk}$ is 1 if $j = k$ and 0 otherwise, the diagonal of the "raw" covariances $C_{ijk}$ should not be included in the covariance function smoothing step.
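To make the role of the diagonal concrete, here is a small sketch that forms the raw covariances for one subject and keeps the diagonal terms apart from the off-diagonal ones. The function name `raw_covariances` and the list-based data layout are illustrative assumptions, not part of the original slides.

```python
def raw_covariances(times, y, mu_hat):
    """Raw covariances C_ijk for one subject, split by j != k and j == k.

    times  : observation times T_ij
    y      : observations Y_ij
    mu_hat : fitted means mu_hat(T_ij, Z_i)
    """
    resid = [yj - mj for yj, mj in zip(y, mu_hat)]
    off_diag, diag = [], []
    for j, (tj, rj) in enumerate(zip(times, resid)):
        for k, (tk, rk) in enumerate(zip(times, resid)):
            if j == k:
                # Inflated by sigma^2: kept aside for the variance smoother.
                diag.append((tj, rj * rk))
            else:
                # Only these pairs feed the covariance smoother.
                off_diag.append((tj, tk, rj * rk))
    return off_diag, diag
```

The off-diagonal triples are the inputs to the covariance smoother, while the diagonal pairs are the inputs one would use when smoothing the variance of $Y(t)$.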
Example of Covariance Estimates
• Local linear smoother for fFPCA: $\hat\Gamma_L(t,s,z) = \hat\beta_0$, where
\[ \hat\beta = \arg\min_{\beta} \sum_{i=1}^{n} \sum_{1 \le j \ne k \le N_i} K_3\!\left(\frac{t-T_{ij}}{h_{G,t}}, \frac{s-T_{ik}}{h_{G,t}}, \frac{z-Z_i}{h_{G,z}}\right) \left[C_{ijk} - \left(\beta_0 + \beta_1 (T_{ij}-t) + \beta_2 (T_{ik}-s) + \beta_3 (Z_i - z)\right)\right]^2. \]
• Local linear smoother for mFPCA: $\hat\Gamma^*(t,s) = \hat\beta_0$, where
\[ \hat\beta = \arg\min_{\beta} \sum_{i=1}^{n} \sum_{1 \le j \ne k \le N_i} K_1\!\left(\frac{t-T_{ij}}{h_{G^*}}\right) K_1\!\left(\frac{s-T_{ik}}{h_{G^*}}\right) \left[C_{ijk} - \left(\beta_0 + \beta_1 (T_{ij}-t) + \beta_2 (T_{ik}-s)\right)\right]^2. \]
AIDS CD4: Estimated Covariance
Estimation: Variance of Measurement Errors
The variance of $Y(t)$ for a given $z$ is
\[ V(t,z) = \Gamma(t,t,z) + \sigma^2. \]
It is estimated by $\hat V(t,z) = \hat\beta_0$, where
\[ \hat\beta = \arg\min_{\beta} \sum_{i=1}^{n}\sum_{j=1}^{N_i} K_2\!\left(\frac{t-T_{ij}}{h_{V,t}}, \frac{z-Z_i}{h_{V,z}}\right) \left[C_{ijj} - \beta_0 - \beta_1 (T_{ij}-t) - \beta_2 (Z_i - z)\right]^2. \]
Estimation: Variance of Measurement Errors
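For stability,
\[ \hat\sigma^2 = \frac{2}{|\mathcal{T}|\,|\mathcal{Z}|} \int_{\mathcal{Z}} \int_{\mathcal{T}_1} \{\hat V(t,z) - \hat\Gamma_L(t,t,z)\}\,dt\,dz, \]
where $\mathcal{T}_1 = [\inf\{t : t \in \mathcal{T}\} + |\mathcal{T}|/4,\ \sup\{t : t \in \mathcal{T}\} - |\mathcal{T}|/4]$.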
AIDS: Estimated Covariance + measurement error
Estimation: Eigenvalues and Eigenfunctions
• fFPCA: the estimates are the solutions of the eigen-equations
\[ \int \hat\Gamma_L(t,s,z)\,\hat\phi_k(s,z)\,ds = \hat\lambda_k(z)\,\hat\phi_k(t,z), \]
where the $\hat\phi_k(t,z)$ satisfy $\int \hat\phi_k^2(t,z)\,dt = 1$ and $\int \hat\phi_k(t,z)\,\hat\phi_m(t,z)\,dt = 0$ for $m < k$.
• mFPCA: the estimates are the solutions of the eigen-equations
\[ \int \hat\Gamma^*(t,s)\,\hat\phi_k^*(s)\,ds = \hat\lambda_k^*\,\hat\phi_k^*(t), \]
where the $\hat\phi_k^*(t)$ satisfy $\int (\hat\phi_k^*(t))^2\,dt = 1$ and $\int \hat\phi_k^*(t)\,\hat\phi_m^*(t)\,dt = 0$ for $m < k$.
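In practice the eigen-equations are usually solved on a grid. The sketch below is one common discretization, not necessarily the authors' procedure: it assumes the smoothed covariance has been evaluated on an equally spaced time grid (for fFPCA, at one fixed value of $z$) and approximates the integral by a Riemann sum. The function name `eigen_from_covariance` is hypothetical.

```python
import numpy as np

def eigen_from_covariance(cov, grid, n_components):
    """Leading eigenvalues and eigenfunctions of a covariance surface on a grid."""
    dt = grid[1] - grid[0]                    # quadrature weight (equal spacing assumed)
    cov = (cov + cov.T) / 2.0                 # enforce symmetry of the smoothed surface
    vals, vecs = np.linalg.eigh(cov * dt)     # discretized integral operator
    order = np.argsort(vals)[::-1][:n_components]
    lam = vals[order]
    phi = vecs[:, order] / np.sqrt(dt)        # rescale so that sum(phi^2) * dt = 1
    return lam, phi
```

The rescaling by $1/\sqrt{dt}$ turns the unit-norm eigenvectors into functions whose squared integral is approximately one, matching the normalization in the eigen-equations.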
Estimation: Principal Component Scores
• fFPCA: use the conditional expectation (PACE) $E(A_{ik}(Z_i) \mid \tilde Y_i)$ to estimate the principal component scores, where $\tilde Y_i = (Y_{i1}, \ldots, Y_{iN_i})^T$.
• Under the assumption that $\tilde Y_i$ is multivariate normal,
\[ \hat A_{ik}(Z_i) = \hat\lambda_k\,\hat\phi_{ik}^{T}\,\hat\Sigma_{\tilde Y_i}^{-1}\,(\tilde Y_i - \hat\mu_i), \]
where
\[ \hat\mu_i = (\hat\mu(T_{i1}, Z_i), \ldots, \hat\mu(T_{iN_i}, Z_i))^T, \]
\[ (\hat\Sigma_{\tilde Y_i})_{j,k} = \hat\Gamma_L(T_{ij}, T_{ik}, Z_i) + \hat\sigma^2 \delta_{jk}, \]
\[ \hat\phi_{ik} = (\hat\phi_k(T_{i1}, Z_i), \ldots, \hat\phi_k(T_{iN_i}, Z_i))^T. \]
Estimation: Principal Component Scores
The prediction of principal component scores in mFPCA is similar.
Theoretical Results
Definition: A real function $f(x,y): \mathbb{R}^{n+m} \to \mathbb{R}$ is continuous on $A \subseteq \mathbb{R}^n$ uniformly in $y \in \mathbb{R}^m$ if, given any $x \in A$ and $\varepsilon > 0$, there exists a neighborhood of $x$ not depending on $y$, say $U(x)$, such that $|f(x',y) - f(x,y)| < \varepsilon$ for all $x' \in U(x)$ and $y \in \mathbb{R}^m$.
Given an integer $Q \ge 1$ and for $q = 1, \ldots, Q$, let $\psi_q: \mathbb{R}^3 \to \mathbb{R}$ satisfy:
C.1 The $\psi_q(t,z,y)$ are continuous on $U(\{t,z\})$ uniformly in $y \in \mathbb{R}$.
C.2 The functions $\frac{\partial^p}{\partial t^{p_1} \partial z^{p_2}} \psi_q(t,z,y)$ exist for all arguments $(t,z,y)$ and are continuous on $U(\{t,z\})$ uniformly in $y \in \mathbb{R}$, for $p_1 + p_2 = p$ and $0 \le p_1, p_2 \le p$.
Notations: 2D Smoothers
The kernel-weighted averages for two-dimensional smoothers are defined as
\[ \Psi_{qn} = \frac{1}{n\,E(N)\,h_{\mu,t}^{\nu_1+1}\,h_{\mu,z}^{\nu_2+1}} \sum_{i=1}^{n}\sum_{j=1}^{N_i} \psi_q(T_{ij}, Z_i, Y_{ij})\, K_2\!\left(\frac{t-T_{ij}}{h_{\mu,t}}, \frac{z-Z_i}{h_{\mu,z}}\right). \]
Let
\[ \alpha_q(t,z) = \frac{\partial^{|\nu|}}{\partial t^{\nu_1} \partial z^{\nu_2}} \int \psi_q(t,z,y)\, f_3(t,z,y)\,dy \]
and
\[ \sigma_{qr}(t,z) = \int \psi_q(t,z,y)\,\psi_r(t,z,y)\, f_3(t,z,y)\,dy\; \|K_2\|^2, \]
where $f_3(t,z,y)$ is the joint density of $(T, Z, Y)$, $\|K_2\|^2 = \int K_2^2$, and $1 \le q, r \le Q$.
Theoretical Results: 2D Smoothers
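Theorem 1. Let $H: \mathbb{R}^Q \to \mathbb{R}$ be a function with continuous first-order derivatives, $DH(v) = \left(\frac{\partial}{\partial x_1} H(v), \ldots, \frac{\partial}{\partial x_Q} H(v)\right)^T$, and $\bar N = \frac{1}{n}\sum_{i=1}^{n} N_i$. Under suitable assumptions, and assuming $h_{\mu,z}/h_{\mu,t} \to \rho_\mu$ and $n\,E(N)\,h_{\mu,t}^{2|\kappa|+2} \to \tau_\mu^2$ for some $0 < \rho_\mu, \tau_\mu < \infty$, we obtain
\[ \sqrt{n \bar N\, h_{\mu,t}^{2\nu_1+1}\, h_{\mu,z}^{2\nu_2+1}}\; \left[H(\Psi_{1n}, \ldots, \Psi_{Qn}) - H(\alpha_1, \ldots, \alpha_Q)\right] \xrightarrow{D} N\!\left(\beta_H,\ [DH(\alpha_1, \ldots, \alpha_Q)]^T\, \Sigma\, [DH(\alpha_1, \ldots, \alpha_Q)]\right), \]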
Theoretical Results: 2D Smoothers (cont’d)
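where $\Sigma = (\sigma_{qr})_{1 \le q, r \le Q}$ and
\[ \beta_H = \sum_{k_1 + k_2 = |\kappa|} \frac{(-1)^{|\kappa|}}{k_1!\,k_2!} \left[\int s_1^{k_1} s_2^{k_2} K_2(s_1, s_2)\,ds_1\,ds_2\right] \times \left\{\sum_{q=1}^{Q} \frac{\partial H}{\partial \alpha_q}\left[(\alpha_1, \ldots, \alpha_Q)^T\right] \frac{\partial^{k_1+k_2-\nu_1-\nu_2}}{\partial t^{k_1-\nu_1}\, \partial z^{k_2-\nu_2}}\, \alpha_q(t,z)\right\} \tau_\mu \sqrt{\rho_\mu^{2k_2+1}}. \]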
Mean Function: Nadaraya-Watson Est.
Corollary 1. Under suitable assumptions, and assuming $h_{\mu,z}/h_{\mu,t} \to \rho_\mu$ and $n\,E(N)\,h_{\mu,t}^{6} \to \tau_\mu^2$ for some $0 < \rho_\mu, \tau_\mu < \infty$:
\[ \sqrt{n \bar N\, h_{\mu,t}\, h_{\mu,z}}\; \left[\hat\mu_{NW}(t,z) - \mu(t,z)\right] \xrightarrow{D} N(\beta_{NW}, \Sigma_{NW}), \]
where
\[ \beta_{NW} = \sum_{k_1+k_2=2} \frac{1}{k_1!\,k_2!} \left[\int s_1^{k_1} s_2^{k_2} K_2(s_1,s_2)\,ds_1\,ds_2\right] \tau_\mu \sqrt{\rho_\mu^{2k_2+1}} \times \left\{\frac{1}{f_2(t,z)} \frac{\partial^2}{\partial t^{k_1} \partial z^{k_2}}\, \alpha_1(t,z) - \frac{\mu(t,z)}{f_2(t,z)} \frac{\partial^2}{\partial t^{k_1} \partial z^{k_2}}\, f_2(t,z)\right\}, \]
\[ \Sigma_{NW} = \frac{\mathrm{Var}(Y \mid t,z)\,\|K_2\|^2}{f_2(t,z)}, \qquad \alpha_1(t,z) = \mu(t,z)\, f_2(t,z), \]
and $f_2(t,z)$ is the joint density of $(T, Z)$.
Mean Function: Local Linear Est.
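Corollary. Under suitable assumptions, and assuming $h_{\mu,z}/h_{\mu,t} \to \rho_\mu$ and $n\,E(N)\,h_{\mu,t}^{6} \to \tau_\mu^2$ for some $0 < \rho_\mu, \tau_\mu < \infty$:
\[ \sqrt{n \bar N\, h_{\mu,t}\, h_{\mu,z}}\; \left[\hat\mu_{L}(t,z) - \mu(t,z)\right] \xrightarrow{D} N(\beta_{L}, \Sigma_{L}), \]
where
\[ \beta_{L} = \sum_{k_1+k_2=2} \frac{1}{k_1!\,k_2!} \left[\int s_1^{k_1} s_2^{k_2} K_2(s_1,s_2)\,ds_1\,ds_2\right] \frac{\partial^2}{\partial t^{k_1} \partial z^{k_2}}\, \mu(t,z)\; \tau_\mu \sqrt{\rho_\mu^{2k_2+1}}, \]
\[ \Sigma_{L} = \frac{\mathrm{Var}(Y \mid t,z)\,\|K_2\|^2}{f_2(t,z)}, \]
and $f_2(t,z)$ is the joint density of $(T, Z)$.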
Rate of Convergence
• If $E(N) < \infty$, the rate of convergence for the 2D mean and covariance function is $n^{1/3}$.
  - This is the optimal rate of convergence for 2D smoothers with independent data.
• If $E(N) \to \infty$, the rate of convergence can be as close to $n^{2/5}$ as possible but not equal to $n^{2/5}$.
• If $N_i \to \infty$, the convergence rate is $\sqrt{n}$.
Notations: 3D Smoothers
The technique of kernel-weighted averages can be extended to three-dimensional smoothers to obtain their asymptotic normality. Given an integer $Q \ge 1$, let $\vartheta_q: \mathbb{R}^5 \to \mathbb{R}$ for $q = 1, \ldots, Q$ satisfy:
D.1 The $\vartheta_q(t,s,z,y_1,y_2)$ are continuous on $U(\{t,s,z\})$ uniformly in $(y_1, y_2) \in \mathbb{R}^2$.
D.2 The functions $\frac{\partial^p}{\partial t^{p_1} \partial s^{p_2} \partial z^{p_3}} \vartheta_q(t,s,z,y_1,y_2)$ exist for all arguments $(t,s,z,y_1,y_2)$ and are continuous on $U(\{t,s,z\})$ uniformly in $(y_1,y_2) \in \mathbb{R}^2$, for $p_1 + p_2 + p_3 = p$ and $0 \le p_1, p_2, p_3 \le p$.
Notations: 3D Smoothers (cont’d)
The general weighted averages of three-dimensional smoothing methods are defined as
\[ \Theta_{qn}(t,s,z) = \frac{1}{n\,E(N(N-1))\,h_{G,t}^{\nu_1+\nu_2+2}\,h_{G,z}^{\nu_3+1}} \sum_{i=1}^{n} \sum_{1 \le j \ne k \le N_i} \vartheta_q(T_{ij}, T_{ik}, Z_i, Y_{ij}, Y_{ik})\, K_3\!\left(\frac{t-T_{ij}}{h_{G,t}}, \frac{s-T_{ik}}{h_{G,t}}, \frac{z-Z_i}{h_{G,z}}\right). \]
Notations: 3D Smoothers (cont’d)
Let
\[ \xi_q(t,s,z) = \frac{\partial^{|\nu|}}{\partial t^{\nu_1} \partial s^{\nu_2} \partial z^{\nu_3}} \int \vartheta_q(t,s,z,y_1,y_2)\, f_5(t,s,z,y_1,y_2)\,dy_1\,dy_2, \]
\[ \omega_{qr} = \int \vartheta_q(t,s,z,y_1,y_2)\,\vartheta_r(t,s,z,y_1,y_2)\, f_5(t,s,z,y_1,y_2)\,dy_1\,dy_2\; \|K_3\|^2, \]
where $f_5(t,s,z,y_1,y_2)$ is the joint density of $(T_1, T_2, Z, Y_1, Y_2)$, $\|K_3\|^2 = \int K_3^2$, and $1 \le q, r \le Q$.
Theoretical Results: 3D Smoothers
Theorem. Let $H: \mathbb{R}^Q \to \mathbb{R}$ be a function with continuous first-order derivatives, $DH(v) = \left(\frac{\partial}{\partial x_1} H(v), \ldots, \frac{\partial}{\partial x_Q} H(v)\right)^T$, and $\bar N = \frac{1}{n}\sum_{i=1}^{n} N_i$. Under suitable assumptions, and assuming $h_{G,z}/h_{G,t} \to \rho_G$ and $n\,E(N(N-1))\,h_{G,t}^{2|\kappa|+3} \to \tau_G^2$ for some $0 < \rho_G, \tau_G < \infty$:
\[ \sqrt{n \bar N(\bar N - 1)\, h_{G,t}^{2\nu_1+2\nu_2+2}\, h_{G,z}^{2\nu_3+1}}\; \{H(\Theta_{1n}, \ldots, \Theta_{Qn}) - H(\xi_1, \ldots, \xi_Q)\} \xrightarrow{D} N\!\left(\gamma_H,\ [DH(\xi_1, \ldots, \xi_Q)]^T\, \Omega\, [DH(\xi_1, \ldots, \xi_Q)]\right), \]
Theoretical Results: 3D Smoothers (cont’d )
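where $\Omega = (\omega_{qr})_{1 \le q, r \le Q}$ and
\[ \gamma_H = \sum_{q=1}^{Q} \sum_{\kappa_1+\kappa_2+\kappa_3 = |\kappa|} \left\{\frac{(-1)^{|\kappa|}}{|\kappa|!} \int u_1^{\kappa_1} u_2^{\kappa_2} u_3^{\kappa_3} K_3(u_1,u_2,u_3)\,du_1\,du_2\,du_3\right\} \times \frac{d^{|\kappa|}}{dt^{\kappa_1}\, ds^{\kappa_2}\, dz^{\kappa_3}} \int \vartheta_q(t,s,z,y_1,y_2)\, f_5(t,s,z,y_1,y_2)\,dy_1\,dy_2 \times \frac{\partial H}{\partial \xi_q}(\xi_1, \ldots, \xi_Q)^T\; \tau_G \sqrt{\rho_G^{2\kappa_3+1}}. \]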
Covariance in fFPCA: Nadaraya Watson Est.
Corollary. Under suitable assumptions, and assuming $h_{G,z}/h_{G,t} \to \rho_G$ and $n\,E(N(N-1))\,h_{G,t}^{7} \to \tau_G^2$ for some $0 < \rho_G, \tau_G < \infty$:
\[ \sqrt{n \bar N(\bar N - 1)\, h_{G,t}^{2}\, h_{G,z}}\; \{\hat\Gamma_{NW}(t,s,z) - \Gamma(t,s,z)\} \xrightarrow{D} N(\gamma_{NW}, \Omega_{NW}); \]
Covariance in fFPCA: Nadaraya-Watson cont’d
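where
\[ \gamma_{NW} = \frac{1}{2}\left\{\sigma_1^2 \tau_1 \frac{d^2}{dt^2}\Gamma(t,s,z) + \sigma_2^2 \tau_1 \frac{d^2}{ds^2}\Gamma(t,s,z) + \sigma_3^2 \tau_2 \frac{d^2}{dz^2}\Gamma(t,s,z)\right\} + \left\{\sigma_1^2 \tau_1 \left(\frac{d}{dt}\Gamma(t,s,z)\right)\left(\frac{d}{dt} g_3(t,s,z)\right) + \sigma_2^2 \tau_1 \left(\frac{d}{ds}\Gamma(t,s,z)\right)\left(\frac{d}{ds} g_3(t,s,z)\right) + \sigma_3^2 \tau_2 \left(\frac{d}{dz}\Gamma(t,s,z)\right)\left(\frac{d}{dz} g_3(t,s,z)\right)\right\} \Big/ g_3(t,s,z), \]
\[ \Omega_{NW} = \frac{\upsilon_3(t,s,z)\,\|K_3\|^2}{g_3(t,s,z)}, \]
and $g_3(t,s,z)$ is the joint density of $(T_1, T_2, Z)$.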
Covariance in fFPCA: Local Linear Smoothers
Corollary. Under suitable assumptions, and assuming $h_{G,z}/h_{G,t} \to \rho_G$ and $n\,E(N(N-1))\,h_{G,t}^{7} \to \tau_G^2$ for some $0 < \rho_G, \tau_G < \infty$:
\[ \sqrt{n \bar N(\bar N - 1)\, h_{G,t}^{2}\, h_{G,z}}\; \{\hat\Gamma_{L}(t,s,z) - \Gamma(t,s,z)\} \xrightarrow{D} N(\gamma_{L}, \Omega_{L}), \]
where
\[ \gamma_{L} = \frac{1}{2}\left\{\sigma_1^2 \tau_1 \frac{d^2}{dt^2}\Gamma(t,s,z) + \sigma_2^2 \tau_1 \frac{d^2}{ds^2}\Gamma(t,s,z) + \sigma_3^2 \tau_2 \frac{d^2}{dz^2}\Gamma(t,s,z)\right\}, \]
\[ \Omega_{L} = \frac{\upsilon_3(t,s,z)\,\|K_3\|^2}{g_3(t,s,z)}, \]
and $g_3(t,s,z)$ is the joint density of $(T_1, T_2, Z)$.
Covariance in mFPCA: Local Linear Smoothers
Corollary. Under suitable assumptions, with $h_{G^*} \to 0$, $n\,E(N^2)\,h_{G^*}^2 \to \infty$, $h_{G^*} E(N^3) \to 0$, and $n\,E(N(N-1))\,h_{G^*}^6 \to \tau^2$ for some $0 \le \tau < \infty$, we obtain
\[ \sqrt{n \bar N(\bar N - 1)\, h_{G^*}^{2}}\; \{\hat\Gamma^*(t,s) - \Gamma^*(t,s)\} \xrightarrow{D} N(\gamma^*, \Omega^*), \]
where
\[ \gamma^* = \frac{\tau}{2} \int u^2 K_1(u)\,du\, \left\{\frac{d^2}{dt^2}\Gamma^*(t,s) + \frac{d^2}{ds^2}\Gamma^*(t,s)\right\}, \]
\[ \Omega^* = \frac{\upsilon_2(t,s)\,\|K_1\|^4}{g_2(t,s)}, \]
\[ \upsilon_2(t,s) = \mathrm{Var}\big((Y_1 - \mu(T_1, Z))(Y_2 - \mu(T_2, Z)) \mid T_1 = t, T_2 = s\big), \]
and $g_2(t,s)$ is the joint density of $(T_1, T_2)$.
Rates of Convergence
• If $E(N) < \infty$, the rate of convergence for the 3D covariance is $n^{2/7}$, which is the optimal rate of convergence for independent data.
• If $E(N) \to \infty$, the rate of convergence can be as close to $n^{2/5}$ as possible, but not equal to it.
• If $N_i \to \infty$, the convergence rate should be $\sqrt{n}$.
Theorem 3: Eigen-values/functions in mFPCA
Theorem. Let $n^{\eta - 1/3} \le h_\mu = o(1)$ for some $\eta > 0$, and assume that for an integer $j_0 \ge 1$ there are no ties among the $(j_0 + 1)$ largest eigenvalues of $\Gamma^*(t,s)$; that
(i) $n^{\eta_1 - 1/3} \le h_{G^*}$ for some $\eta_1 > 0$, $h_\mu = o(h_{G^*})$, $\max(n^{-1/3} h_{G^*}^{2/3},\ n^{-1} h_{G^*}^{-8/3}) = o(h_\mu)$, and $h_{G^*} = o(1)$;
(ii) $n^{\eta - 3/8} \le h_{G^*}$, and $h_{G^*} + h_\mu = o(n^{-1/4})$.
Also, let $\Lambda = (\lambda_1, \ldots, \lambda_{j_0})^T$ and $\hat\Lambda = (\hat\lambda_1, \ldots, \hat\lambda_{j_0})^T$.
Theorem 3 (cont’d) :
Let $N^* = \frac{1}{2}\sum_{i=1}^{n} N_i (N_i - 1)$.
Under assumptions (i),
\[ \|\hat\phi_j - \phi_j\|^2 = \frac{C_{1j}}{N^* h_{G^*}} + C_{2j}\, h_{G^*}^4 + o_p\{(n h_{G^*})^{-1} + h_{G^*}^4\}, \]
and under assumptions (ii), $\sqrt{n}\,(\hat\Lambda - \Lambda)$ is asymptotically multivariate normal with mean 0 and covariance matrix $\Sigma$.
Optimal Rates of Convergence
• The first k eigenfunctions can be estimated at the same optimal rate as a one-dimensional nonparametric regression function.
• The largest k eigenvalues can be estimated at the $\sqrt{n}$ rate.
Bandwidth Selection
• Mean function $\mu(t,z)$ and covariance $\Gamma^*(s,t)$: leave-one-subject-out cross-validation.
• Covariance function $\Gamma(s,t,z)$: k-fold cross-validation. Suppose that the subjects are randomly assigned to $k$ sets $(S_1, S_2, \ldots, S_k)$. Then
\[ h = \arg\min_{h} \sum_{l=1}^{k} \sum_{i \in S_l} \sum_{1 \le j \ne m \le N_i} \{C_{ijm} - \hat\Gamma^{(-S_l)}(T_{ij}, T_{im}, z_i)\}^2, \]
where $\hat\Gamma^{(-S_l)}(T_{ij}, T_{im}, z_i)$ is the estimated covariance function at $(T_{ij}, T_{im}, z_i)$ when the subjects in $S_l$ are not used to estimate $\Gamma(t,s,z)$.
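The k-fold criterion can be sketched as follows. This is only an outline under assumptions: `fit_covariance` is a hypothetical callable standing in for whichever 3D smoother is used with a candidate bandwidth, and the dictionary layout of `raw_cov` is illustrative. The key point it shows is that whole subjects, not single observations, are held out.

```python
import random

def subject_folds(subject_ids, k, seed=0):
    """Randomly assign subjects (not single observations) to k folds."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    return [ids[l::k] for l in range(k)]

def cv_score(folds, raw_cov, fit_covariance):
    """Sum of squared prediction errors of held-out raw covariances.

    raw_cov[i]     : list of (t_j, t_m, z_i, C_ijm) for subject i, with j != m
    fit_covariance : hypothetical callable; given the training subject ids it
                     returns gamma_hat(t, s, z) fitted with a candidate bandwidth
    """
    total = 0.0
    for held_out in folds:
        train = [i for fold in folds if fold is not held_out for i in fold]
        gamma_hat = fit_covariance(train)      # fitted without the subjects in S_l
        for i in held_out:
            for t_j, t_m, z_i, c in raw_cov[i]:
                total += (c - gamma_hat(t_j, t_m, z_i)) ** 2
    return total
```

One would evaluate `cv_score` over a grid of candidate bandwidths and keep the minimizer.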
Number of Eigenfunctions
We used three methods:
AIC
BIC
FVE (fraction of variance explained):
the minimum number of eigen-components needed to explain at least a specified fraction of the total variation.
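The FVE rule is simple to state in code. In this sketch the function name `choose_k_by_fve` and the default threshold of 0.85 are assumptions for illustration; the slides do not state which fraction was used. The eigenvalues are assumed to be sorted in decreasing order with a positive sum.

```python
def choose_k_by_fve(eigenvalues, threshold=0.85):
    """Smallest K whose leading eigenvalues explain at least `threshold` of the variance."""
    lam = [max(v, 0.0) for v in eigenvalues]   # negative estimates carry no variance
    total = sum(lam)
    running = 0.0
    for k, v in enumerate(lam, start=1):
        running += v
        if running / total >= threshold:
            return k
    return len(lam)
```

For instance, with eigenvalues 4.0, 1.0, 0.5, 0.1 the first component explains about 71% and the first two about 89%, so a threshold of 0.85 selects K = 2.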
Predicted Trajectory for X(t)
Suppose that the first $K$ eigenfunctions are used to predict the trajectories. Given $t \in \mathcal{T}$ and $z \in \mathcal{Z}$, the predicted trajectory of $X_i(t,z)$ based on the first $K$ eigenfunctions is
\[ \hat X_i^K(t,z) = \hat\mu_L(t,z) + \sum_{k=1}^{K} \hat A_{ik}(z)\,\hat\phi_k(t,z) \quad \text{(fFPCA)}, \]
\[ \hat X_i^K(t,z) = \hat\mu_L(t,z) + \sum_{k=1}^{K} \hat A_{ik}^*\,\hat\phi_k^*(t) \quad \text{(mFPCA)}. \]
Simulation Study
• Let the covariate $Z \sim U(0,1)$.
• $\mu(t,z) = t + z\sin(t) + (1-z)\cos(t)$.
• $\phi_1(t,z) = -\sqrt{2}\cos(\pi(t + z/2))$ and $\phi_2(t,z) = \sqrt{2}\sin(\pi(t + z/2))$.
• $\lambda_1(z) = z/9$, $\lambda_2(z) = z/36$, and $\lambda_k(z) = 0$ for $k \ge 3$.
• $A_{ik} \sim N(0, \lambda_k(z))$.
• Measurement errors $\sim N(0, 0.05^2)$.
The simulation consists of 100 runs, with 100 subjects in each run.
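One run of this design could be generated as in the sketch below. The slides do not state the sampling scheme for the observation times, so the choices here (2 to 5 observations per subject, times uniform on [0, 1]) are assumptions made only to produce sparse data; the mean, eigenfunctions, eigenvalues and noise level follow the list above.

```python
import math
import random

def mu(t, z):
    return t + z * math.sin(t) + (1.0 - z) * math.cos(t)

def phi1(t, z):
    return -math.sqrt(2.0) * math.cos(math.pi * (t + z / 2.0))

def phi2(t, z):
    return math.sqrt(2.0) * math.sin(math.pi * (t + z / 2.0))

def simulate_subject(rng, n_obs_choices=(2, 3, 4, 5), noise_sd=0.05):
    """One sparse, noisy trajectory; the time design is an assumption."""
    z = rng.random()                                   # Z ~ U(0, 1)
    a1 = rng.gauss(0.0, math.sqrt(z / 9.0))            # A_1 ~ N(0, z/9)
    a2 = rng.gauss(0.0, math.sqrt(z / 36.0))           # A_2 ~ N(0, z/36)
    n_obs = rng.choice(n_obs_choices)                  # assumed sparsity level
    times = sorted(rng.random() for _ in range(n_obs)) # assumed: uniform on [0, 1]
    y = [mu(t, z) + a1 * phi1(t, z) + a2 * phi2(t, z) + rng.gauss(0.0, noise_sd)
         for t in times]
    return z, times, y

rng = random.Random(2009)                              # arbitrary seed
sample = [simulate_subject(rng) for _ in range(100)]   # one run of 100 subjects
```

Repeating the last two lines with different seeds gives the 100 simulation runs.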
True and Estimated Mean Surface
Simulation:
Estimated Eigenfunctions (mFPCA)
Simulation:
Estimated Eigenfunctions (fFPCA)
Simulation Study
Covariate z                      0.1               0.3               0.5               0.7               0.9
ISE of $\hat\Gamma_L$            0.00015           0.00025           0.00071           0.0014            0.0030
ISE of $\hat\phi_1(t,z)$         0.0294            0.0076            0.0071            0.0074            0.0112
ISE of $\hat\phi_2(t,z)$         0.2720            0.0305            0.0242            0.0179            0.0300
$\hat\lambda_1(z)$               0.0047 (0.0073)   -0.0041 (0.0106)  -0.0113 (0.0181)  -0.0202 (0.0205)  -0.0242 (0.0333)
$\hat\lambda_2(z)$               0.0034 (0.0045)   0.0001 (0.0039)   0.0005 (0.0057)   -0.0002 (0.0077)  -0.0037 (0.0094)

Simulation results of fFPCA. The three rows corresponding to ISE are based on the average integrated squared errors of the 100 simulations, and the rows corresponding to $\hat\lambda_k$ give the biases and standard deviations (in brackets).
Simulation Study
Measure  Criterion   uFPCA    mFPCA    fFPCA
MISE     FVE         0.0325   0.0103   0.0085
MISE     AIC         0.0198   0.0063   0.0077
MISE     BIC         0.0197   0.0063   0.0077
MSFE     FVE         0.0067   0.0050   0.0022
MSFE     AIC         0.0065   0.0017   0.0015
MSFE     BIC         0.0065   0.0017   0.0015

where
\[ \mathrm{MISE} = \frac{1}{n}\sum_{i=1}^{n} \int_0^1 (X_i(t, z_i) - \hat X_i^K(t, z_i))^2\,dt, \qquad \mathrm{MSFE} = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{N_i}\sum_{j=1}^{N_i} (Y_{ij} - \hat Y_{ij})^2. \]
Conclusions
• Through simulations and data analysis, we have shown that current approaches to functional principal component analysis are no longer suitable for functional data when covariate information is available.
• Numerical evidence supports the simpler mean-adjusted approach, especially when the purpose is to predict the trajectories $Y(t)$.
• The catch is the high-dimensional smoothing involved with a vector $Z$. Some dimension reduction on $Z$ will be needed for practical implementation; this will be a future research project.
End of Single Covariate
* Multidimensional Covariates
Assume that $Z \in \mathbb{R}^p$ and that only the mean function depends on $Z$:
• $\mu(t, z) = \mu(t, \beta^T z)$ (single index), or
• $\mu(t, z) = \mu(t, \beta_1^T z, \beta_2^T z, \ldots, \beta_k^T z)$, $k < p$ (multiple indices).
Dimension Reduction Models
• There are many ways to estimate the indices for independent data, i.e. when there is no $t$:
  $Y = \mu(\beta_1^T z, \beta_2^T z, \ldots, \beta_k^T z) + \varepsilon$.
• A few have been extended to longitudinal or functional data, but none to the multi-index model
  $Y(t) = \mu(t, \beta_1^T z, \beta_2^T z, \ldots, \beta_k^T z) + \varepsilon(t)$.
• We extend the "MAVE" approach of Xia et al. (2002) to longitudinal data.
$\sqrt{n}$-convergence of $\hat\beta$
Theorem. Let $\hat\beta$ be the estimator of $\beta_0$ in the algorithm. Under some regularity conditions, we have
\[ \sqrt{n}\,(\hat\beta - \beta_0) \xrightarrow{D} N(0, \Sigma), \]
where
\[ \Sigma = [E(G(T,Z))]^{+}\, \Sigma^{*}\, [E(G(T,Z))]^{+}, \]
\[ G(t,z) = \left(\frac{d\mu(t, \beta_0^T z)}{d(\beta_0^T z)}\right)^{2} \left(z z^T - m(t,z)\, m(t,z)^T\right), \]
\[ G_0(t,z) = \left(\frac{d\mu(t, \beta_0^T z)}{d(\beta_0^T z)}\right) (z - m(t,z)), \]
\[ m(t,z) = E(Z \mid T = t,\ \beta_0^T Z = \beta_0^T z), \]
and $A^{+}$ is the Moore-Penrose inverse of the matrix $A$.
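$\sqrt{n}$-convergence of $\hat\beta$ (cont'd)
\[ \Sigma^{*} = \frac{EN - 1}{EN}\, E\big(\{G_0(T,Z)\,\varepsilon\}\{G_0(T,Z)\,\varepsilon\}^T\big) + \frac{1}{EN}\, E\big(\{G_0(T,Z)\,\varepsilon\}\{G_0(T,Z)\,\varepsilon\}^T\big) \]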
AIDS CD4: Estimated Mean
AIDS: Estimated Covariance + measurement error
End of Multidimensional Covariates
3. What’s Next After FPCA?
• FPCA can be the end product: to explore the covariate effects, to recover the trajectories of each subject, to explore the modes of variation, etc.
• FPCA can help to find a more parsimonious model.
AIDS CD4: Estimated Mean
AIDS CD4 Data
• This suggests the possibility of a more parsimonious model with multiplicative covariate effects:
\[ Y(t) = \mu(t)\, g(\beta^T z) + e(t). \]
• $\mu(t)$ could be parametric, e.g. a polynomial.
• Common marginal models for longitudinal data take the additive form and employ parametric models for both the mean and covariance functions.
  - Both parametric forms are difficult to detect for sparse and noisy longitudinal data.
AIDS CD4: Estimated Covariance
AIDS CD4: Estimated Eigenfunctions
MSE by selection criterion:
Criterion    MSE      K
FVE          0.1154   1
AIC (BIC)    0.0937   3
Adding Random Effects
• FPCA helps to identify the form of the random effects:
\[ Y(t) = \mu(t)\, g(\beta^T z) + \underbrace{a + bt}_{\text{random effects}} + e(t). \]
Semiparametric Product Model
• If we assume that the first eigenfunction is proportional to the population mean function $\mu(t, Z)$ and discard the remaining eigenfunctions, we arrive at the following multiplicative random effect model:
\[ Y(t) = \mu(t,z) + A\,\mu(t,z) + e(t) = b\,\mu(t,z) + e(t), \]
where $b$ is the random effect.
PET Data
[Figure: First eigenfunction. x-axis: time (0 to 90); y-axis range: -0.2 to 0.25.]
4. Dynamic Positron Emission Tomography
(PET) Time Course Data
Joint work with
Ci-Ren Jiang, UC Davis
&
John Aston
Academia Sinica & Univ. of Warwick
Dynamic PET Time-Course Data
 PET is a nuclear medicine imaging technique
which produces a three-dimensional image or
picture of functional processes in the body.
 A PET scan measures important body
functions, such as blood flow, oxygen use, and
sugar (glucose) metabolism, to help doctors
evaluate how well organs and tissues are
functioning.
Measured 11C-Diprenorphine Data
 The scan is an experiment on epilepsy. The chemical
compound Diprenorphine measures the concentration of
opioid (pain) receptors in the brain.
 The idea of the overall experiment was to see if there was
a difference in the concentration of receptors in
Epileptics against normal subjects.
 However, the changes that were hypothesized were very
small so it was important that the experiment could get
as accurate measurements as possible.
Measured 11C-Diprenorphine Data
• A dynamic scan from a measured 11C-diprenorphine study of a normal subject was analyzed.
 Four Dimensional Data: three spatial + one temporal
128 × 128 × 95 × 32
 Voxel sizes were
2.096 mm × 2.096 mm × 2.43 mm.
 Scans were rebinned into 32 time frames of increasing duration.
t = 27.5, 60, 70, 80, 100, 130, ..., 1075, 1195, 1315, ..., 4435, 5035
seconds
Example of Analysis for five voxels
Motivation
 Due to experimental constraints, the time course
measurements are often fairly noisy.
 The Spectral Analysis method (Cunningham and Jones,
1993) is well known to be sensitive to noise with the bias
being highly dependent on the level of noise present.
 By borrowing information across space through the use of
a non-parametric covariate adjustment, it is possible to
reconstruct the PET time course data and thus reduce
noise.
Motivation
 Many of the processes presented in the PET time course
data have chemical rates associated with them. These
rates are dependent on a large number of biological
factors, too numerous and complex to be exhaustively
represented or identified in the discretely and noisily
measured data.
 However, if an alternative viewpoint that the rates are
random variables is taken, then a small additive random
change in one rate will lead to a multiplicative change in
the time course.
Multiplicative Nonparametric Random
Effects Model
Since $\phi_1(t,z) \propto \mu(t,z)$, from the random curve viewpoint the concentration curve of voxel $i$ with covariate $z_i$ can be represented as
\[ X_i(t, z_i) = \mu(t, z_i) + \sum_{k=1}^{\infty} A_{ik}\,\phi_k(t, z_i) = B_i\,\mu(t, z_i) + \sum_{k=2}^{\infty} A_{ik}\,\phi_k(t, z_i), \]
where $B_i = 1 + \alpha A_{i1}$ and $\phi_1(t,z) = \alpha\,\mu(t,z)$.
Estimation Procedures
• Determine the in-brain voxels.
• Apply a 2D smoother to the set $\{(Y_{ij}, t_j, z_i) : i = 1, \ldots, n;\ 1 \le j \le p\}$ to estimate $\mu(t,z)$.
• Apply the least squares method to estimate $B_i$.
• Apply a 3D smoother to the set $\{(G_{ijk}, t_j, t_k, z_i) : i = 1, \ldots, n;\ 1 \le j \ne k \le p\}$ to estimate $\Gamma(t,s,z)$, where
\[ G_{ijk} = (Y_{ij} - \hat B_i\,\hat\mu(t_j, z_i))\,(Y_{ik} - \hat B_i\,\hat\mu(t_k, z_i)). \]
Estimation Procedures
• Estimate $\lambda_k(z)$ and $\phi_k(t,z)$, and apply FVE to choose the number of eigenfunctions.
• Apply the integration method to estimate the principal component scores $A_{ik}(z_i)$.
• Reconstruct the random curve for each voxel:
\[ \hat X_i^K(t) = \hat B_i\,\hat\mu(t, z_i) + \sum_{k=2}^{K} \hat A_{ik}(z_i)\,\hat\phi_k(t, z_i). \]
• Apply the parametric method to the reconstructed curves.
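The least squares step for $B_i$ and the residual products $G_{ijk}$ in the procedure above can be sketched as follows. The slides only say "least squares"; treating it as a no-intercept regression of $Y_{ij}$ on $\hat\mu(t_j, z_i)$ is an assumption, and the function names are hypothetical.

```python
def estimate_b(y, mu_hat):
    """No-intercept least-squares slope of Y_ij on mu_hat(t_j, z_i) (assumed form)."""
    num = sum(yj * mj for yj, mj in zip(y, mu_hat))
    den = sum(mj * mj for mj in mu_hat)
    return num / den

def residual_products(y, mu_hat, times):
    """G_ijk = (Y_ij - B_i mu_j) * (Y_ik - B_i mu_k) for all j != k."""
    b = estimate_b(y, mu_hat)
    r = [yj - b * mj for yj, mj in zip(y, mu_hat)]
    return [(times[j], times[k], r[j] * r[k])
            for j in range(len(r)) for k in range(len(r)) if j != k]
```

The triples returned by `residual_products`, together with the voxel's covariate $z_i$, are what the 3D smoother would take as input.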
Variable Bandwidth b(t)
 A global bandwidth is appropriate along the
covariate coordinate, but not desirable in the
time coordinate.
 denser measurement schedule at the beginning
and sharp peak near the left boundary
 Smaller bandwidths are preferred near the
peak, while larger bandwidths are used near the
right boundary.
Choose Variable Bandwidth b(t)
 choose 13 time locations.
 [t-b(t), t+b(t)] includes at least four observations
 boundary correction to ensure a positive bandwidth
 To ensure a smooth outcome, fit a polynomial of order 4
to the pairs (tj, b(tj)).
 The resulting b(t) is further multiplied by a constant α,
determined by a cross-validation step.
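The bandwidth construction can be sketched roughly as below. Several details are assumptions because the slides do not give them: the 13 locations are taken to be equally spaced, the boundary correction is replaced by a simple positive floor, and all function names are hypothetical. The constant `alpha` would be chosen by cross-validation in a separate step.

```python
import numpy as np

def min_bandwidth(t0, obs_times, n_min=4):
    """Smallest b such that [t0 - b, t0 + b] contains at least n_min observation times."""
    dist = np.sort(np.abs(np.asarray(obs_times) - t0))
    return dist[n_min - 1]

def variable_bandwidth(obs_times, n_locations=13, degree=4, alpha=1.0):
    """Return a smooth bandwidth function b(t) = alpha * fitted polynomial."""
    obs_times = np.asarray(obs_times, dtype=float)
    # Assumed: equally spaced locations over the observed time range.
    locs = np.linspace(obs_times.min(), obs_times.max(), n_locations)
    b = np.array([min_bandwidth(t, obs_times) for t in locs])
    coef = np.polyfit(locs, b, degree)     # smooth the (t_j, b(t_j)) pairs
    floor = b.min()                        # crude stand-in for the boundary correction
    return lambda t: alpha * np.maximum(np.polyval(coef, t), floor)
```

With a dense schedule early on and long frames later, the fitted b(t) comes out small near the start and larger toward the right boundary, which is the behaviour described above.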
Example of Analysis for five voxels
Spectral Analysis
(Cunningham and Jones, 1993)
Spectral Analysis does not assume a known compartmental structure, but rather performs model selection through a non-negativity constraint on the parameters. In particular, the concentration curve $X(t)$ is parameterized by
\[ X(t) = I(t) \otimes \sum_{j=1}^{K} \alpha_j \exp(-\beta_j t), \]
where $I(t)$ is a known input function, $\otimes$ denotes convolution, and $\alpha_j$ and $\beta_j$ are the non-negative parameters to be estimated.
The parameter of interest $V_T$ is the integral of the impulse response function:
\[ V_T = \int_0^{\infty} \sum_{j=1}^{K} \alpha_j \exp(-\beta_j t)\,dt = \sum_{j=1}^{K} \frac{\alpha_j}{\beta_j}. \]
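The closed form for $V_T$ follows from integrating each exponential term separately, and it can be checked numerically. The sketch below uses made-up parameter values purely for illustration and assumes every $\beta_j$ is strictly positive, since a zero rate would make the integral diverge.

```python
import math

def impulse_response(t, alphas, betas):
    """Sum of decaying exponentials, sum_j alpha_j * exp(-beta_j * t)."""
    return sum(a * math.exp(-b * t) for a, b in zip(alphas, betas))

def vt_closed_form(alphas, betas):
    """V_T = sum_j alpha_j / beta_j (requires beta_j > 0)."""
    return sum(a / b for a, b in zip(alphas, betas))

def vt_numeric(alphas, betas, upper=200.0, n=100000):
    """Trapezoidal approximation of the integral over [0, upper]."""
    h = upper / n
    inner = sum(impulse_response(i * h, alphas, betas) for i in range(1, n))
    ends = impulse_response(0.0, alphas, betas) + impulse_response(upper, alphas, betas)
    return h * (inner + 0.5 * ends)

alphas, betas = [0.5, 0.2], [0.1, 1.0]   # illustrative values only
```

For these illustrative values the closed form gives 0.5/0.1 + 0.2/1.0 = 5.2, and the trapezoidal approximation agrees with it closely.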
Conclusions
 This is consistent with the knowledge that
Spectral Analysis has high positive bias at voxel
noise levels of around 5%. By reconstructing the
data through fFPCA, the noise level is reduced,
and thus the level of bias is also reduced.
(Overall mean squared residuals reduce 71.82%)
 The covariate adjusted FPCA can be applied in
practice to measured PET data using spatially
pooled information.
Thank You
The End
Sedona, Arizona, 2006 IMS WNAR
Covariate adjusted FPCA: Longitudinal Data
 We propose two ways:
fFPCA &
mFPCA.
 Both consist of two parts:
a systematic part for the mean function and
a stochastic part for the covariance function.
 Difference - handling of the covariance structure