Appendix
(A) The EM Algorithm for the Mixture Model
Below we give the details of the statistical model and the EM steps for parameter estimation. (For simplicity of indexing in the mathematical expressions, we assume the existence of $y_{i,0}$ with value 0.)
1) Statistical Model
1. Let $Y_i' = (Y_{i,1}, Y_{i,2}, \ldots, Y_{i,T})$ be a sequence of observed symptom measurements in subject $i$, and let $x_i' = (x_{i,1}, x_{i,2}, \ldots, x_{i,M})$ denote the genotypes or other covariates (including the constant 1) for subject $i$.
2. Assume that the drug effect may occur at time $Z_i = k$ or $k+1$ ($< T$) for subject $i$, which is unknown. We assume $Z_i$ follows a Bernoulli distribution with $P(Z_i = k) = 1 - p_i$. We link the parameter $p_i$ with any genotype or covariate by $\operatorname{logit}(p_i) = x_i'\tau$, where $\tau$ is the coefficient vector.
3. An AR(1) model is used to model the symptoms during the latency period, up to and including the change point $Z_i$, when measurements are assumed to be stationary: $(Y_{i,1}, Y_{i,2}, \ldots, Y_{i,Z_i})$ follows the AR(1) process $Y_{i,t} = \phi_i Y_{i,t-1} + a_{i,t}$ ($1 \le t \le Z_i$) with parameter $\phi_i$, where $a_{i,t}$ is a white-noise process with variance $\sigma^2$. We link the parameter $\phi_i$ with any genotype or covariate by $\phi_i = x_i'\beta$, where $\beta$ is the coefficient vector. (Here, the mean of the AR(1) process is assumed to be zero.)
4. A linear trend model is used to model the increase of symptom measurements occurring after the change point $Z_i$: $(Y_{i,Z_i}, Y_{i,Z_i+1}, \ldots, Y_{i,T})$ follows the linear trend model $Y_{i,t} = Y_{i,t-1} + \delta_i + \varepsilon_{i,t}$ ($Z_i < t \le T$) with parameter $\delta_i$, where $\varepsilon_{i,t} \sim \text{i.i.d. } N(0, \sigma^2)$ is the error term with variance $\sigma^2$. We link the parameter $\delta_i$ with any genotype or covariate by $\delta_i = x_i'\gamma$, where $\gamma$ is the coefficient vector.
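For concreteness, the data-generating process above can be sketched in a few lines of code (a minimal simulation sketch; the covariate design and the values of $k$, $T$, and the parameters are illustrative assumptions of ours, not values from the paper):

```python
import numpy as np

def simulate_data(N=500, T=8, k=4, beta=(0.4, 0.1), gamma=(0.4, -0.06),
                  tau=(-3.08, 2.23), sigma=0.2, seed=0):
    """Simulate the two-regime change-point model: AR(1) with phi_i = x_i'beta
    up to Z_i, then a linear trend with slope delta_i = x_i'gamma;
    Z_i in {k, k+1} with logit P(Z_i = k+1) = x_i'tau."""
    rng = np.random.default_rng(seed)
    beta, gamma, tau = map(np.asarray, (beta, gamma, tau))
    snp = rng.integers(0, 3, size=N)           # SNP coded 0/1/2
    X = np.column_stack([np.ones(N), snp])     # covariates, including constant 1
    p = 1.0 / (1.0 + np.exp(-X @ tau))         # p_i = P(Z_i = k+1)
    Z = np.where(rng.random(N) < p, k + 1, k)  # latent onset time
    phi, delta = X @ beta, X @ gamma
    Y = np.zeros((N, T + 1))                   # Y[:, 0] holds y_{i,0} = 0
    for t in range(1, T + 1):
        mean = np.where(t <= Z, phi * Y[:, t - 1], Y[:, t - 1] + delta)
        Y[:, t] = mean + rng.normal(0.0, sigma, size=N)
    return Y, X, Z
```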
2) The Likelihood
When $t \le Z_i$: $Y_{i,t} = \phi_i Y_{i,t-1} + a_{i,t}$, so $Y_{i,t} \mid Y_{i,t-1} = y_{i,t-1} \sim N(\phi_i y_{i,t-1}, \sigma^2) = N(y_{i,t-1}x_i'\beta, \sigma^2)$;
when $t > Z_i$: $Y_{i,t} = Y_{i,t-1} + \delta_i + \varepsilon_{i,t}$, so $Y_{i,t} \mid Y_{i,t-1} = y_{i,t-1} \sim N(y_{i,t-1} + \delta_i, \sigma^2) = N(y_{i,t-1} + x_i'\gamma, \sigma^2)$.
The data likelihood of sequence $i$ is
$$
\begin{aligned}
P(Y_i = (y_{i,1}, y_{i,2}, \ldots, y_{i,T}) \mid \theta)
&= P(Y_i = (y_{i,1}, \ldots, y_{i,T}), Z_i = k \mid \theta) + P(Y_i = (y_{i,1}, \ldots, y_{i,T}), Z_i = k+1 \mid \theta) \\
&= (1 - p_i)\,P(Y_i \mid Z_i = k, \theta) + p_i\,P(Y_i \mid Z_i = k+1, \theta) \\
&= (1 - p_i)\Big(\tfrac{1}{\sqrt{2\pi}\,\sigma}\Big)^{T}\exp\Big\{-\tfrac{1}{2\sigma^2}\Big(\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big)\Big\} \\
&\quad + p_i\Big(\tfrac{1}{\sqrt{2\pi}\,\sigma}\Big)^{T}\exp\Big\{-\tfrac{1}{2\sigma^2}\Big(\sum_{t=1:k+1}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big)\Big\} \\
&= \Big(\tfrac{1}{\sqrt{2\pi}\,\sigma}\Big)^{T}\exp\Big\{-\tfrac{1}{2\sigma^2}\Big(\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big)\Big\} \\
&\qquad \times \Big[(1 - p_i)\exp\Big\{-\tfrac{1}{2\sigma^2}(y_{i,k+1} - y_{i,k} - x_i'\gamma)^2\Big\} + p_i\exp\Big\{-\tfrac{1}{2\sigma^2}(y_{i,k+1} - y_{i,k}x_i'\beta)^2\Big\}\Big] \\
&= \Big(\tfrac{1}{\sqrt{2\pi}\,\sigma}\Big)^{T}\exp\Big\{-\tfrac{1}{2\sigma^2}\Big(\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big)\Big\} \\
&\qquad \times \Big[\tfrac{1}{1 + e^{x_i'\tau}}\exp\Big\{-\tfrac{1}{2\sigma^2}(y_{i,k+1} - y_{i,k} - x_i'\gamma)^2\Big\} + \tfrac{1}{1 + e^{-x_i'\tau}}\exp\Big\{-\tfrac{1}{2\sigma^2}(y_{i,k+1} - y_{i,k}x_i'\beta)^2\Big\}\Big]
\end{aligned}
\tag{A.2.1}
$$
and thus the likelihood of the observed data over all $N$ subjects is
$$
\begin{aligned}
P(Y_1, \ldots, Y_N \mid \theta) &= \prod_i P(Y_i \mid \theta) \\
&= \Big(\tfrac{1}{\sqrt{2\pi}\,\sigma}\Big)^{NT}\exp\Big\{-\tfrac{1}{2\sigma^2}\Big(\sum_{i=1:N,\;t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{i=1:N,\;t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big)\Big\} \\
&\qquad \times \prod_{i=1:N}\Big[\tfrac{1}{1 + e^{x_i'\tau}}\exp\Big\{-\tfrac{1}{2\sigma^2}(y_{i,k+1} - y_{i,k} - x_i'\gamma)^2\Big\} + \tfrac{1}{1 + e^{-x_i'\tau}}\exp\Big\{-\tfrac{1}{2\sigma^2}(y_{i,k+1} - y_{i,k}x_i'\beta)^2\Big\}\Big]
\end{aligned}
$$
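A direct transcription of (A.2.1) in code, evaluated on the log scale for numerical stability, is given below (a sketch under the same array conventions as the simulation above; `np.logaddexp` combines the two mixture terms):

```python
import numpy as np

def loglik_subject(y, x, k, beta, gamma, tau, sig2):
    """Observed-data log likelihood of one sequence, equation (A.2.1).
    y has length T+1 with y[0] = y_{i,0} = 0; y[1:] are the T observations."""
    T = len(y) - 1
    ar = y[1:k + 1] - y[0:k] * (x @ beta)          # AR residuals, t = 1..k
    tr = y[k + 2:T + 1] - y[k + 1:T] - x @ gamma   # trend residuals, t = k+2..T
    common = -(T / 2) * np.log(2 * np.pi * sig2) \
             - (np.sum(ar**2) + np.sum(tr**2)) / (2 * sig2)
    eta = x @ tau                                  # logit of p_i = P(Z_i = k+1)
    # log[(1 - p_i) exp{...}] and log[p_i exp{...}] for the t = k+1 mixture term
    lt_k = -np.logaddexp(0.0, eta) - (y[k + 1] - y[k] - x @ gamma)**2 / (2 * sig2)
    lt_k1 = -np.logaddexp(0.0, -eta) - (y[k + 1] - y[k] * (x @ beta))**2 / (2 * sig2)
    return common + np.logaddexp(lt_k, lt_k1)
```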
It is hard to obtain the MLE of $\theta = (\beta, \gamma, \tau, \sigma^2)$ directly by maximizing the likelihood function above because of the missing data $Z = \{z_1, z_2, \ldots, z_N\}$. For this reason, we propose an EM-based iterative algorithm to find a local MLE of $\theta$.
3) EM Estimation Procedure
Assume at step $n$ that the estimated parameter value is $\theta^{(n)} = (\beta^{(n)}, \gamma^{(n)}, \tau^{(n)}, (\sigma^2)^{(n)})$; then at step $n+1$,
$$\theta^{(n+1)} = \arg\max_\theta\Big(E_{Z|Y,\theta^{(n)}}\ln P(Y, Z \mid \theta)\Big) = \arg\max_\theta\Big(\sum_i E_{Z_i|Y_i,\theta^{(n)}}\ln P(Y_i, Z_i \mid \theta)\Big)$$
Below, we give the details of the E-step and the M-step.

E-step
We denote $A_{i,k}^{(n)} = P(Z_i = k \mid Y_i, \theta^{(n)})$ and $A_{i,k+1}^{(n)} = P(Z_i = k+1 \mid Y_i, \theta^{(n)})$, calculated as
$$A_{i,k}^{(n)} = P(Z_i = k \mid Y_i, \theta^{(n)}) = \frac{(1 - p_i^{(n)})\,P(Y_i \mid Z_i = k, \theta^{(n)})}{(1 - p_i^{(n)})\,P(Y_i \mid Z_i = k, \theta^{(n)}) + p_i^{(n)}\,P(Y_i \mid Z_i = k+1, \theta^{(n)})}$$
$$A_{i,k+1}^{(n)} = P(Z_i = k+1 \mid Y_i, \theta^{(n)}) = \frac{p_i^{(n)}\,P(Y_i \mid Z_i = k+1, \theta^{(n)})}{(1 - p_i^{(n)})\,P(Y_i \mid Z_i = k, \theta^{(n)}) + p_i^{(n)}\,P(Y_i \mid Z_i = k+1, \theta^{(n)})}$$
where $p_i^{(n)} = \dfrac{1}{1 + e^{-x_i'\tau^{(n)}}}$ and
$$P(Y_i \mid Z_i = k, \theta^{(n)}) = \Big(\tfrac{1}{\sqrt{2\pi}\,\sigma^{(n)}}\Big)^{T}\exp\Big\{-\tfrac{1}{2(\sigma^2)^{(n)}}\Big(\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta^{(n)})^2 + \sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma^{(n)})^2\Big)\Big\}$$
$$P(Y_i \mid Z_i = k+1, \theta^{(n)}) = \Big(\tfrac{1}{\sqrt{2\pi}\,\sigma^{(n)}}\Big)^{T}\exp\Big\{-\tfrac{1}{2(\sigma^2)^{(n)}}\Big(\sum_{t=1:k+1}(y_{i,t} - y_{i,t-1}x_i'\beta^{(n)})^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma^{(n)})^2\Big)\Big\}$$
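In code, these responsibilities can be computed stably on the log scale, where the common factor $(1/(\sqrt{2\pi}\sigma))^T$ cancels (a sketch under our naming conventions):

```python
import numpy as np

def e_step(Y, X, k, beta, gamma, tau, sig2):
    """E-step: posterior probabilities A_{i,k+1} = P(Z_i = k+1 | Y_i, theta),
    computed on the log scale; A_{i,k} = 1 - A_{i,k+1}."""
    phi, delta, eta = X @ beta, X @ gamma, X @ tau
    ar = (Y[:, 1:] - Y[:, :-1] * phi[:, None])**2    # AR residual^2; column t-1 <-> time t
    tr = (Y[:, 1:] - Y[:, :-1] - delta[:, None])**2  # trend residual^2
    ss_k = ar[:, :k].sum(axis=1) + tr[:, k:].sum(axis=1)           # Z_i = k
    ss_k1 = ar[:, :k + 1].sum(axis=1) + tr[:, k + 1:].sum(axis=1)  # Z_i = k+1
    log_k = -np.logaddexp(0.0, eta) - ss_k / (2 * sig2)     # (1-p_i) P(Y_i|Z_i=k)
    log_k1 = -np.logaddexp(0.0, -eta) - ss_k1 / (2 * sig2)  # p_i P(Y_i|Z_i=k+1)
    return np.exp(log_k1 - np.logaddexp(log_k, log_k1))
```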
Then
$$
\begin{aligned}
E_{Z_i|Y_i,\theta^{(n)}}&\ln P(Y_i, Z_i \mid \theta) \\
&= A_{i,k}^{(n)}\ln P(Y_i, Z_i = k \mid \theta) + A_{i,k+1}^{(n)}\ln P(Y_i, Z_i = k+1 \mid \theta) \\
&= A_{i,k}^{(n)}\Big[\ln(1 - p_i) + T\ln\tfrac{1}{\sqrt{2\pi}\,\sigma} - \tfrac{1}{2\sigma^2}\Big(\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big)\Big] \\
&\quad + A_{i,k+1}^{(n)}\Big[\ln p_i + T\ln\tfrac{1}{\sqrt{2\pi}\,\sigma} - \tfrac{1}{2\sigma^2}\Big(\sum_{t=1:k+1}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big)\Big] \\
&= \Big[A_{i,k}^{(n)}\ln(1 - p_i) + A_{i,k+1}^{(n)}\ln p_i\Big] + T\ln\tfrac{1}{\sqrt{2\pi}\,\sigma} \\
&\quad - \tfrac{1}{2\sigma^2}\Big[\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + A_{i,k+1}^{(n)}(y_{i,k+1} - y_{i,k}x_i'\beta)^2\Big] \\
&\quad - \tfrac{1}{2\sigma^2}\Big[A_{i,k}^{(n)}(y_{i,k+1} - y_{i,k} - x_i'\gamma)^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big] \\
&= \Big[A_{i,k}^{(n)}\ln\tfrac{1}{1 + e^{x_i'\tau}} + A_{i,k+1}^{(n)}\ln\tfrac{1}{1 + e^{-x_i'\tau}}\Big] + T\ln\tfrac{1}{\sqrt{2\pi}\,\sigma} \\
&\quad - \tfrac{1}{2\sigma^2}\Big[\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + A_{i,k+1}^{(n)}(y_{i,k+1} - y_{i,k}x_i'\beta)^2\Big] \\
&\quad - \tfrac{1}{2\sigma^2}\Big[A_{i,k}^{(n)}(y_{i,k+1} - y_{i,k} - x_i'\gamma)^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big]
\end{aligned}
$$
M-step
To find $\theta^{(n+1)} = \arg\max_\theta\big(E_{Z|Y,\theta^{(n)}}\ln P(Y, Z \mid \theta)\big) = \arg\max_\theta\big(\sum_i E_{Z_i|Y_i,\theta^{(n)}}\ln P(Y_i, Z_i \mid \theta)\big)$, we need to maximize the above expected log likelihood summed over all the sequences. Below we give the details for estimating each individual parameter.
a) To find $\tau^{(n+1)}$, we maximize
$$\sum_{i=1:N}\Big[A_{i,k}^{(n)}\ln\tfrac{1}{1 + e^{x_i'\tau}} + A_{i,k+1}^{(n)}\ln\tfrac{1}{1 + e^{-x_i'\tau}}\Big]$$
This expression corresponds to a weighted logistic regression. We use the Newton-Raphson algorithm to calculate the $\tau^{(n+1)}$ that maximizes the above expression.
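A minimal Newton-Raphson sketch for this weighted logistic step follows (our own outline; the fractional responses $A_{i,k+1}^{(n)}$ enter as the vector `A_k1`):

```python
import numpy as np

def update_tau(X, A_k1, tau_init=None, n_iter=25, tol=1e-10):
    """Maximize sum_i [(1 - A_k1_i) ln(1 - p_i) + A_k1_i ln p_i] with
    p_i = 1/(1 + exp(-x_i'tau)). Gradient = X'(A_k1 - p);
    negative Hessian = X' diag(p(1-p)) X."""
    tau = np.zeros(X.shape[1]) if tau_init is None else np.array(tau_init, float)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ tau))
        step = np.linalg.solve((X * (p * (1 - p))[:, None]).T @ X, X.T @ (A_k1 - p))
        tau += step
        if np.max(np.abs(step)) < tol:
            break
    return tau
```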
b) To estimate $\beta^{(n+1)}$, maximize
$$-\sum_{i=1:N}\Big[\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + A_{i,k+1}^{(n)}(y_{i,k+1} - y_{i,k}x_i'\beta)^2\Big]$$
Let $Y^* = \begin{pmatrix} Y_1^* \\ \vdots \\ Y_N^* \end{pmatrix}$ and $X^* = \begin{pmatrix} X_1^* \\ \vdots \\ X_N^* \end{pmatrix}$, where
$$Y_i^* = \begin{pmatrix} y_{i,2} \\ \vdots \\ y_{i,k} \\ \sqrt{A_{i,k+1}^{(n)}}\;y_{i,k+1} \end{pmatrix}
\quad\text{and}\quad
X_i^* = \begin{pmatrix} y_{i,1}x_i' \\ \vdots \\ y_{i,k-1}x_i' \\ \sqrt{A_{i,k+1}^{(n)}}\;y_{i,k}x_i' \end{pmatrix}$$
This is a weighted linear regression and the solution is $\beta^{(n+1)} = (X^{*\prime}X^*)^{-1}X^{*\prime}Y^*$.
c) To estimate $\gamma^{(n+1)}$, maximize
$$-\sum_{i=1:N}\Big[A_{i,k}^{(n)}(y_{i,k+1} - y_{i,k} - x_i'\gamma)^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big]$$
Let $Y^* = \begin{pmatrix} Y_1^* \\ \vdots \\ Y_N^* \end{pmatrix}$ and $X^* = \begin{pmatrix} X_1^* \\ \vdots \\ X_N^* \end{pmatrix}$, where
$$Y_i^* = \begin{pmatrix} \sqrt{A_{i,k}^{(n)}}\,(y_{i,k+1} - y_{i,k}) \\ y_{i,k+2} - y_{i,k+1} \\ \vdots \\ y_{i,T} - y_{i,T-1} \end{pmatrix}
\quad\text{and}\quad
X_i^* = \begin{pmatrix} \sqrt{A_{i,k}^{(n)}}\;x_i' \\ x_i' \\ \vdots \\ x_i' \end{pmatrix}$$
This is also a weighted linear regression and the solution is $\gamma^{(n+1)} = (X^{*\prime}X^*)^{-1}X^{*\prime}Y^*$.
d) After estimating $\beta^{(n+1)}$ and $\gamma^{(n+1)}$, $(\sigma^2)^{(n+1)}$ can be calculated by
$$(\sigma^2)^{(n+1)} = \frac{1}{NT}\sum_{i=1:N}\Big[\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta^{(n+1)})^2 + A_{i,k+1}^{(n)}(y_{i,k+1} - y_{i,k}x_i'\beta^{(n+1)})^2 + A_{i,k}^{(n)}(y_{i,k+1} - y_{i,k} - x_i'\gamma^{(n+1)})^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma^{(n+1)})^2\Big]$$
The EM steps iterate until convergence, that is, until the difference between the estimated parameter values at steps $n$ and $n+1$ falls below a pre-specified threshold value.
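Putting the pieces together, a schematic driver loop might look as follows (assembled from the sketches above, reusing `e_step`, `update_tau`, and `update_beta`; the `update_gamma` and `update_sigma2` helpers implement steps c) and d) under the same conventions):

```python
import numpy as np

def update_gamma(Y, X, k, A_k):
    """Step c): sqrt(A_{i,k})-weighted row for t = k+1 plus plain trend rows."""
    T = Y.shape[1] - 1
    rows, resp = [], []
    for i in range(Y.shape[0]):
        w = np.sqrt(A_k[i])
        rows.append(w * X[i]); resp.append(w * (Y[i, k + 1] - Y[i, k]))
        for t in range(k + 2, T + 1):
            rows.append(X[i]); resp.append(Y[i, t] - Y[i, t - 1])
    return np.linalg.lstsq(np.array(rows), np.array(resp), rcond=None)[0]

def update_sigma2(Y, X, k, A_k1, beta, gamma):
    """Step d): closed-form variance update."""
    N, T = Y.shape[0], Y.shape[1] - 1
    ar = (Y[:, 1:] - Y[:, :-1] * (X @ beta)[:, None])**2
    tr = (Y[:, 1:] - Y[:, :-1] - (X @ gamma)[:, None])**2
    ss = (ar[:, :k].sum(axis=1) + A_k1 * ar[:, k]
          + (1 - A_k1) * tr[:, k] + tr[:, k + 1:].sum(axis=1))
    return ss.sum() / (N * T)

def em_fit(Y, X, k, theta0, max_iter=200, tol=1e-6):
    """Iterate E- and M-steps until the largest parameter change is below tol."""
    beta, gamma, tau, sig2 = theta0
    for _ in range(max_iter):
        A_k1 = e_step(Y, X, k, beta, gamma, tau, sig2)            # E-step
        beta_n = update_beta(Y, X, k, A_k1)                       # M-step b)
        gamma_n = update_gamma(Y, X, k, 1 - A_k1)                 # M-step c)
        tau_n = update_tau(X, A_k1, tau_init=tau)                 # M-step a)
        sig2_n = update_sigma2(Y, X, k, A_k1, beta_n, gamma_n)    # M-step d)
        change = max(np.max(np.abs(beta_n - beta)), np.max(np.abs(gamma_n - gamma)),
                     np.max(np.abs(tau_n - tau)), abs(sig2_n - sig2))
        beta, gamma, tau, sig2 = beta_n, gamma_n, tau_n, sig2_n
        if change < tol:
            break
    return beta, gamma, tau, sig2
```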
(B) Information Matrix Calculation
Below we give the details of the calculation of the information matrix using Louis's (1982) method.
1) First and Second Derivatives of the Log Likelihood of the Complete Data
Let $W_i = (Y_i, Z_i)$ denote the complete data, including the observed sequence $Y_i$ and the missing onset $Z_i$, and let $z_i = 1$ if $Z_i = k+1$ and $z_i = 0$ if $Z_i = k$. Then
$$
\begin{aligned}
P(W_i \mid \theta) &= P(Z_i \mid \theta)\,P(Y_i \mid Z_i, \theta) \\
&= \Big[\tfrac{1}{1 + e^{x_i'\tau}}\Big(\tfrac{1}{\sqrt{2\pi}\,\sigma}\Big)^{T}\exp\Big\{-\tfrac{1}{2\sigma^2}\Big(\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big)\Big\}\Big]^{1 - z_i} \\
&\qquad \times \Big[\tfrac{1}{1 + e^{-x_i'\tau}}\Big(\tfrac{1}{\sqrt{2\pi}\,\sigma}\Big)^{T}\exp\Big\{-\tfrac{1}{2\sigma^2}\Big(\sum_{t=1:k+1}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big)\Big\}\Big]^{z_i}
\end{aligned}
$$
and
$$
\begin{aligned}
L_i = \ln P(W_i \mid \theta) &= -\tfrac{T}{2}\ln(2\pi\sigma^2) \\
&\quad + (1 - z_i)\Big[-\ln(1 + e^{x_i'\tau}) - \tfrac{1}{2\sigma^2}\Big(\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big)\Big] \\
&\quad + z_i\Big[-\ln(1 + e^{-x_i'\tau}) - \tfrac{1}{2\sigma^2}\Big(\sum_{t=1:k+1}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big)\Big]
\end{aligned}
$$
The first and second derivatives of $L_i = \ln P(W_i \mid \theta)$ are given below.
$$
\begin{aligned}
\frac{\partial L_i}{\partial\beta} &= (1 - z_i)\,\tfrac{1}{\sigma^2}\sum_{t=1:k} x_iy_{i,t-1}(y_{i,t} - y_{i,t-1}x_i'\beta) + z_i\,\tfrac{1}{\sigma^2}\sum_{t=1:k+1} x_iy_{i,t-1}(y_{i,t} - y_{i,t-1}x_i'\beta) \\
&= \frac{x_i}{\sigma^2}\Big[\sum_{t=1:k} y_{i,t-1}(y_{i,t} - y_{i,t-1}x_i'\beta) + z_i\,y_{i,k}(y_{i,k+1} - y_{i,k}x_i'\beta)\Big]
\end{aligned}
$$
$$\frac{\partial^2 L_i}{\partial\beta^2} = -\frac{x_ix_i'}{\sigma^2}\Big[\sum_{t=2:k}(y_{i,t-1})^2 + z_i(y_{i,k})^2\Big] \quad\text{(the sum starts at } t = 2 \text{ because } y_{i,0} = 0\text{)}$$
$$\frac{\partial^2 L_i}{\partial\beta\,\partial\gamma} = 0, \qquad \frac{\partial^2 L_i}{\partial\beta\,\partial\tau} = 0$$
$$\frac{\partial^2 L_i}{\partial\beta\,\partial\sigma^2} = -\frac{x_i}{\sigma^4}\Big[\sum_{t=1:k} y_{i,t-1}(y_{i,t} - y_{i,t-1}x_i'\beta) + z_i\,y_{i,k}(y_{i,k+1} - y_{i,k}x_i'\beta)\Big]$$
$$
\begin{aligned}
\frac{\partial L_i}{\partial\gamma} &= \tfrac{1}{\sigma^2}\Big[(1 - z_i)\sum_{t=k+1:T} x_i(y_{i,t} - y_{i,t-1} - x_i'\gamma) + z_i\sum_{t=k+2:T} x_i(y_{i,t} - y_{i,t-1} - x_i'\gamma)\Big] \\
&= \frac{x_i}{\sigma^2}\Big[\sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma) - z_i(y_{i,k+1} - y_{i,k} - x_i'\gamma)\Big]
\end{aligned}
$$
$$\frac{\partial^2 L_i}{\partial\gamma^2} = -x_ix_i'\,\frac{(T - k) - z_i}{\sigma^2} = -x_ix_i'\,\frac{T - k - z_i}{\sigma^2}$$
$$\frac{\partial^2 L_i}{\partial\gamma\,\partial\tau} = 0$$
$$\frac{\partial^2 L_i}{\partial\gamma\,\partial\sigma^2} = -\frac{x_i}{\sigma^4}\Big[\sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma) - z_i(y_{i,k+1} - y_{i,k} - x_i'\gamma)\Big]$$

$$\frac{\partial L_i}{\partial\tau} = -(1 - z_i)\frac{x_ie^{x_i'\tau}}{1 + e^{x_i'\tau}} + z_i\frac{x_ie^{-x_i'\tau}}{1 + e^{-x_i'\tau}} = x_i\Big(z_i - \frac{e^{x_i'\tau}}{1 + e^{x_i'\tau}}\Big) = x_i\Big(z_i - \frac{1}{1 + e^{-x_i'\tau}}\Big)$$
$$\frac{\partial^2 L_i}{\partial\tau^2} = -x_ix_i'\,\frac{e^{x_i'\tau}}{(1 + e^{x_i'\tau})^2}$$
$$\frac{\partial^2 L_i}{\partial\tau\,\partial\sigma^2} = 0$$
$$
\begin{aligned}
\frac{\partial L_i}{\partial\sigma^2} &= -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4}\Big[(1 - z_i)\Big(\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big) \\
&\qquad\qquad\qquad\quad + z_i\Big(\sum_{t=1:k+1}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2\Big)\Big] \\
&= -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4}\Big[\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2 \\
&\qquad\qquad\qquad\quad + z_i\Big((y_{i,k+1} - y_{i,k}x_i'\beta)^2 - (y_{i,k+1} - y_{i,k} - x_i'\gamma)^2\Big)\Big]
\end{aligned}
$$
$$\frac{\partial^2 L_i}{\partial(\sigma^2)^2} = \frac{T}{2\sigma^4} - \frac{1}{\sigma^6}\Big[\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2 + z_i\Big((y_{i,k+1} - y_{i,k}x_i'\beta)^2 - (y_{i,k+1} - y_{i,k} - x_i'\gamma)^2\Big)\Big]$$
Let $W = (Y, Z)$ denote the complete data, including the observed sample $Y$ and the missing data $Z$; the gradient and the Hessian matrix of the complete-data log likelihood are then obtained by summing the above results over subjects.
2) The Gradient $S(W)$ and $E_q(S(W)S(W)' \mid Y)$
$$S(W \mid \theta) = \begin{pmatrix} \sum_i \frac{\partial L_i}{\partial\beta} \\[4pt] \sum_i \frac{\partial L_i}{\partial\gamma} \\[4pt] \sum_i \frac{\partial L_i}{\partial\tau} \\[4pt] \sum_i \frac{\partial L_i}{\partial\sigma^2} \end{pmatrix} = \begin{pmatrix} \frac{1}{\sigma^2}\sum_i x_i\Big[\sum_{t=1:k} y_{i,t-1}(y_{i,t} - y_{i,t-1}x_i'\beta) + z_i\,y_{i,k}(y_{i,k+1} - y_{i,k}x_i'\beta)\Big] \\[4pt] \frac{1}{\sigma^2}\sum_i x_i\Big[\sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma) - z_i(y_{i,k+1} - y_{i,k} - x_i'\gamma)\Big] \\[4pt] \sum_i x_i\Big(z_i - \frac{1}{1 + e^{-x_i'\tau}}\Big) \\[4pt] -\frac{NT}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_i\Big[\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2 + z_i\Big((y_{i,k+1} - y_{i,k}x_i'\beta)^2 - (y_{i,k+1} - y_{i,k} - x_i'\gamma)^2\Big)\Big] \end{pmatrix}$$
Now we calculate $E_q(S(W)S(W)' \mid Y)$ for each pair of estimators. The following formula facilitates the calculation:
$$
\begin{aligned}
E\Big[\Big(\sum_i x_i(A_i + z_iB_i)\Big)&\Big(\sum_j x_j'(C_j + z_jD_j)\Big)\Big] \\
&= E\Big[\sum_i x_ix_i'(A_i + z_iB_i)(C_i + z_iD_i)\Big] + E\Big[\sum_{i \ne j} x_ix_j'(A_i + z_iB_i)(C_j + z_jD_j)\Big] \\
&= \sum_i x_ix_i'\big(A_iC_i + p_iA_iD_i + p_iB_iC_i + p_iB_iD_i\big) + \sum_{i \ne j} x_ix_j'(A_i + p_iB_i)(C_j + p_jD_j) \\
&= \sum_i x_ix_i'(A_i + p_iB_i)(C_i + p_iD_i) + \sum_i x_ix_i'(p_i - p_i^2)B_iD_i + \sum_{i \ne j} x_ix_j'(A_i + p_iB_i)(C_j + p_jD_j) \\
&= \Big(\sum_i x_i(A_i + p_iB_i)\Big)\Big(\sum_j x_j'(C_j + p_jD_j)\Big) + \sum_i x_ix_i'(p_i - p_i^2)B_iD_i
\end{aligned}
\tag{B.b.1}
$$
where $p_i = P(z_i = 1)$, using $z_i^2 = z_i$ and the conditional independence of the $z_i$ given $Y$. In our method this is the conditional probability $A_{i,k+1} = P(Z_i = k+1 \mid Y_i)$.
For example, to calculate $E\Big(\frac{\partial L}{\partial\beta}\Big(\frac{\partial L}{\partial\beta}\Big)' \,\Big|\, Y\Big)$, take $A_i = C_i = \sum_{t=1:k} y_{i,t-1}(y_{i,t} - y_{i,t-1}x_i'\beta)$ and $B_i = D_i = y_{i,k}(y_{i,k+1} - y_{i,k}x_i'\beta)$. To calculate $E\Big(\frac{\partial L}{\partial\beta}\Big(\frac{\partial L}{\partial\gamma}\Big)' \,\Big|\, Y\Big)$, take $A_i = \sum_{t=1:k} y_{i,t-1}(y_{i,t} - y_{i,t-1}x_i'\beta)$, $B_i = y_{i,k}(y_{i,k+1} - y_{i,k}x_i'\beta)$, $C_i = \sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)$, and $D_i = -(y_{i,k+1} - y_{i,k} - x_i'\gamma)$. Then, applying formula (B.b.1) with $p_i = A_{i,k+1} = P(Z_i = k+1 \mid Y_i)$ and the MLE $\hat\theta = (\hat\beta, \hat\gamma, \hat\tau, \hat\sigma^2)$ from the EM algorithm, we can get the estimated value of each element of $E_q(S(W)S(W)' \mid Y)$.
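As a sanity check, identity (B.b.1) can be verified numerically by Monte Carlo over independent Bernoulli $z_i$ (a standalone sketch with arbitrary made-up inputs):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 6, 2
X = rng.normal(size=(N, M))
A, B, C, D = (rng.normal(size=N) for _ in range(4))
p = rng.uniform(0.1, 0.9, size=N)

# Monte Carlo estimate of E[(sum_i x_i(A_i + z_i B_i))(sum_j x_j'(C_j + z_j D_j))]
R, mc = 100_000, np.zeros((M, M))
for _ in range(R):
    z = (rng.random(N) < p).astype(float)
    mc += np.outer(X.T @ (A + z * B), X.T @ (C + z * D)) / R

# Closed form (B.b.1)
closed = (np.outer(X.T @ (A + p * B), X.T @ (C + p * D))
          + (X * ((p - p**2) * B * D)[:, None]).T @ X)

print(np.max(np.abs(mc - closed)))   # small, up to Monte Carlo error
```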
3) The Hessian Matrix $B(W)$ and $E_q(B(W) \mid Y)$
The Hessian matrix $B(W)$ contains all the second derivatives of the complete-data log likelihood and is a symmetric matrix. Below we give its elements on and above the diagonal:
$$B(W) = \begin{pmatrix} \sum_i\frac{\partial^2 L_i}{\partial\beta^2} & \sum_i\frac{\partial^2 L_i}{\partial\beta\,\partial\gamma} & \sum_i\frac{\partial^2 L_i}{\partial\beta\,\partial\tau} & \sum_i\frac{\partial^2 L_i}{\partial\beta\,\partial\sigma^2} \\ & \sum_i\frac{\partial^2 L_i}{\partial\gamma^2} & \sum_i\frac{\partial^2 L_i}{\partial\gamma\,\partial\tau} & \sum_i\frac{\partial^2 L_i}{\partial\gamma\,\partial\sigma^2} \\ & & \sum_i\frac{\partial^2 L_i}{\partial\tau^2} & \sum_i\frac{\partial^2 L_i}{\partial\tau\,\partial\sigma^2} \\ & & & \sum_i\frac{\partial^2 L_i}{\partial(\sigma^2)^2} \end{pmatrix}$$
with entries
$$
\begin{aligned}
\sum_i\frac{\partial^2 L_i}{\partial\beta^2} &= -\frac{1}{\sigma^2}\sum_i x_ix_i'\Big[\sum_{t=1:k}(y_{i,t-1})^2 + z_i(y_{i,k})^2\Big], \qquad \sum_i\frac{\partial^2 L_i}{\partial\beta\,\partial\gamma} = \sum_i\frac{\partial^2 L_i}{\partial\beta\,\partial\tau} = 0, \\
\sum_i\frac{\partial^2 L_i}{\partial\beta\,\partial\sigma^2} &= -\frac{1}{\sigma^4}\sum_i x_i\Big[\sum_{t=1:k} y_{i,t-1}(y_{i,t} - y_{i,t-1}x_i'\beta) + z_i\,y_{i,k}(y_{i,k+1} - y_{i,k}x_i'\beta)\Big], \\
\sum_i\frac{\partial^2 L_i}{\partial\gamma^2} &= -\frac{1}{\sigma^2}\sum_i x_ix_i'(T - k - z_i), \qquad \sum_i\frac{\partial^2 L_i}{\partial\gamma\,\partial\tau} = 0, \\
\sum_i\frac{\partial^2 L_i}{\partial\gamma\,\partial\sigma^2} &= -\frac{1}{\sigma^4}\sum_i x_i\Big[\sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma) - z_i(y_{i,k+1} - y_{i,k} - x_i'\gamma)\Big], \\
\sum_i\frac{\partial^2 L_i}{\partial\tau^2} &= -\sum_i x_ix_i'\,\frac{e^{x_i'\tau}}{(1 + e^{x_i'\tau})^2}, \qquad \sum_i\frac{\partial^2 L_i}{\partial\tau\,\partial\sigma^2} = 0, \\
\sum_i\frac{\partial^2 L_i}{\partial(\sigma^2)^2} &= \frac{NT}{2\sigma^4} - \frac{1}{\sigma^6}\sum_i\Big[\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2 \\
&\qquad\qquad\qquad\qquad\quad + z_i\Big((y_{i,k+1} - y_{i,k}x_i'\beta)^2 - (y_{i,k+1} - y_{i,k} - x_i'\gamma)^2\Big)\Big].
\end{aligned}
$$
Let $A_{i,k+1} = P(Z_i = k+1 \mid Y_i)$; then $E_q(B(W) \mid Y)$ is obtained from $B(W)$ by replacing each $z_i$ with $A_{i,k+1}$. Its nonzero upper-triangular entries are
$$
\begin{aligned}
E_q\Big(\sum_i\tfrac{\partial^2 L_i}{\partial\beta^2}\,\Big|\,Y\Big) &= -\frac{1}{\sigma^2}\sum_i x_ix_i'\Big[\sum_{t=1:k}(y_{i,t-1})^2 + A_{i,k+1}(y_{i,k})^2\Big], \\
E_q\Big(\sum_i\tfrac{\partial^2 L_i}{\partial\beta\,\partial\sigma^2}\,\Big|\,Y\Big) &= -\frac{1}{\sigma^4}\sum_i x_i\Big[\sum_{t=1:k} y_{i,t-1}(y_{i,t} - y_{i,t-1}x_i'\beta) + A_{i,k+1}\,y_{i,k}(y_{i,k+1} - y_{i,k}x_i'\beta)\Big], \\
E_q\Big(\sum_i\tfrac{\partial^2 L_i}{\partial\gamma^2}\,\Big|\,Y\Big) &= -\frac{1}{\sigma^2}\sum_i x_ix_i'(T - k - A_{i,k+1}), \\
E_q\Big(\sum_i\tfrac{\partial^2 L_i}{\partial\gamma\,\partial\sigma^2}\,\Big|\,Y\Big) &= -\frac{1}{\sigma^4}\sum_i x_i\Big[\sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma) - A_{i,k+1}(y_{i,k+1} - y_{i,k} - x_i'\gamma)\Big], \\
E_q\Big(\sum_i\tfrac{\partial^2 L_i}{\partial\tau^2}\,\Big|\,Y\Big) &= -\sum_i x_ix_i'\,\frac{e^{x_i'\tau}}{(1 + e^{x_i'\tau})^2}, \\
E_q\Big(\sum_i\tfrac{\partial^2 L_i}{\partial(\sigma^2)^2}\,\Big|\,Y\Big) &= \frac{NT}{2\sigma^4} - \frac{1}{\sigma^6}\sum_i\Big[\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\beta)^2 + \sum_{t=k+1:T}(y_{i,t} - y_{i,t-1} - x_i'\gamma)^2 \\
&\qquad\qquad\qquad\qquad\quad + A_{i,k+1}\Big((y_{i,k+1} - y_{i,k}x_i'\beta)^2 - (y_{i,k+1} - y_{i,k} - x_i'\gamma)^2\Big)\Big].
\end{aligned}
$$
Following Louis (1982), combining the two quantities as $I(\hat\theta \mid Y) = -E_q(B(W) \mid Y) - E_q(S(W)S(W)' \mid Y)$ (the term $E_q(S(W) \mid Y)$ vanishes at the MLE) gives the asymptotic estimate of the information matrix, after substituting each parameter value by the estimated parameter value from the EM algorithm.
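Given numerical arrays for the two expectations evaluated at the EM estimates, assembling standard errors is then a few lines (a sketch; `EB` and `ESS` stand for $E_q(B(W) \mid Y)$ and $E_q(S(W)S(W)' \mid Y)$):

```python
import numpy as np

def louis_standard_errors(EB, ESS):
    """Observed information at the MLE by Louis's method:
    I = -E[B(W)|Y] - E[S(W)S(W)'|Y]; standard errors are the
    square roots of the diagonal of its inverse."""
    info = -EB - ESS
    return np.sqrt(np.diag(np.linalg.inv(info)))
```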
(C) Log Likelihood Ratio Test
To evaluate whether a genetic variant or environmental covariate significantly affects the distribution of the onset of drug effect, we apply the classic log likelihood ratio test. Let $\hat\theta_0 = (\hat\beta_0, \hat\gamma_0, \hat\tau_0, \hat\sigma_0^2)$ and $\hat\theta_1 = (\hat\beta_1, \hat\gamma_1, \hat\tau_1, \hat\sigma_1^2)$ be the estimated parameter values from the EM algorithm under the reduced model $H_0$ and the full model $H_1$, respectively. The likelihood ratio test statistic is then
$$2\ln\frac{P(Y_1, \ldots, Y_N \mid \hat\theta_1)}{P(Y_1, \ldots, Y_N \mid \hat\theta_0)}$$
where $P(Y_1, \ldots, Y_N \mid \theta) = \prod_i P(Y_i \mid \theta)$ and $P(Y_i \mid \theta)$ is given in equation (A.2.1),
and
$$
\begin{aligned}
\ln\frac{P(Y_1, \ldots, Y_N \mid \hat\theta_1)}{P(Y_1, \ldots, Y_N \mid \hat\theta_0)}
&= \sum_{i=1:N}\frac{1}{2\hat\sigma_0^2}\Big[\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\hat\beta_0)^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\hat\gamma_0)^2\Big] \\
&\quad - \sum_{i=1:N}\frac{1}{2\hat\sigma_1^2}\Big[\sum_{t=1:k}(y_{i,t} - y_{i,t-1}x_i'\hat\beta_1)^2 + \sum_{t=k+2:T}(y_{i,t} - y_{i,t-1} - x_i'\hat\gamma_1)^2\Big] \\
&\quad - \sum_{i=1:N}\ln\Big[\frac{1}{1 + e^{x_i'\hat\tau_0}}\exp\Big\{-\frac{1}{2\hat\sigma_0^2}(y_{i,k+1} - y_{i,k} - x_i'\hat\gamma_0)^2\Big\} + \frac{1}{1 + e^{-x_i'\hat\tau_0}}\exp\Big\{-\frac{1}{2\hat\sigma_0^2}(y_{i,k+1} - y_{i,k}x_i'\hat\beta_0)^2\Big\}\Big] \\
&\quad + \sum_{i=1:N}\ln\Big[\frac{1}{1 + e^{x_i'\hat\tau_1}}\exp\Big\{-\frac{1}{2\hat\sigma_1^2}(y_{i,k+1} - y_{i,k} - x_i'\hat\gamma_1)^2\Big\} + \frac{1}{1 + e^{-x_i'\hat\tau_1}}\exp\Big\{-\frac{1}{2\hat\sigma_1^2}(y_{i,k+1} - y_{i,k}x_i'\hat\beta_1)^2\Big\}\Big] \\
&\quad + \frac{NT}{2}\big(\ln\hat\sigma_0^2 - \ln\hat\sigma_1^2\big)
\end{aligned}
$$
(D) Tables of Some Simulation Results
The tables below contain results from the simulation study. For each simulation scenario, we used 500 subjects per run (unless specified otherwise) and conducted 1000 independent runs. The mean, SD, and prediction rate (the proportion of onset times that were correctly predicted) were calculated from the results of the 1000 runs.
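The prediction rate is computed by classifying each subject's onset to the more probable of the two candidate time points (a sketch under our naming; thresholding the posterior at 0.5 is the Bayes classifier for two candidates):

```python
import numpy as np

def prediction_rate(A_k1, Z_true, k):
    """Fraction of subjects whose onset Z_i is correctly predicted by
    thresholding A_{i,k+1} = P(Z_i = k+1 | Y_i) at 0.5."""
    Z_pred = np.where(A_k1 > 0.5, k + 1, k)
    return np.mean(Z_pred == Z_true)
```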
1) Effects of different $\tau$ values (with corresponding odds ratios when the SNP value is 2) on estimation accuracy
Table A1. Estimation results for different $\tau$ values in Scenario 1

Odds ratio   True Value of τ    Mean of Estimated τ   SD of Estimated τ
1.1          (-3.080, 1.588)    (-3.091, 1.59)        (0.301, 0.2028)
1.2          (-3.080, 1.631)    (-3.103, 1.641)       (0.3147, 0.2086)
1.3          (-3.080, 1.671)    (-3.125, 1.693)       (0.304, 0.2102)
1.4          (-3.080, 1.708)    (-3.104, 1.713)       (0.3197, 0.2215)
1.5          (-3.080, 1.743)    (-3.093, 1.751)       (0.3087, 0.212)
2            (-3.080, 1.887)    (-3.115, 1.905)       (0.2895, 0.2139)
4            (-3.080, 2.233)    (-3.116, 2.255)       (0.2978, 0.2169)
Table A2. Estimation results for different $\tau$ values in Scenario 2

Odds ratio   True Value of τ           Mean of Estimated τ       SD of Estimated τ
1.1          (-3.080, 2.088, -0.200)   (-3.123, 2.111, -0.202)   (0.4326, 0.2207, 0.0598)
1.2          (-3.080, 2.131, -0.200)   (-3.114, 2.152, -0.201)   (0.4269, 0.2142, 0.0608)
1.3          (-3.080, 2.171, -0.200)   (-3.117, 2.187, -0.199)   (0.4174, 0.2175, 0.0608)
1.4          (-3.080, 2.208, -0.200)   (-3.145, 2.237, -0.199)   (0.4169, 0.2174, 0.0594)
1.5          (-3.080, 2.243, -0.200)   (-3.112, 2.272, -0.204)   (0.4281, 0.2125, 0.0615)
2            (-3.080, 2.387, -0.200)   (-3.136, 2.426, -0.201)   (0.4058, 0.2135, 0.0583)
4            (-3.080, 2.733, -0.200)   (-3.105, 2.766, -0.204)   (0.4584, 0.2387, 0.0711)
Table A3. More estimation results for different $\tau$ values in Scenario 2

Odds ratio   True Value of τ           Mean of Estimated τ       SD of Estimated τ
1.1          (-3.080, 4.230, -1.057)   (-3.201, 4.346, -1.077)   (0.615, 0.4603, 0.1261)
1.2          (-3.080, 4.230, -1.040)   (-3.16, 4.317, -1.061)    (0.5996, 0.4676, 0.1275)
1.3          (-3.080, 4.230, -1.024)   (-3.14, 4.284, -1.036)    (0.595, 0.436, 0.1178)
1.4          (-3.080, 4.230, -1.009)   (-3.152, 4.322, -1.03)    (0.582, 0.424, 0.1155)
1.5          (-3.080, 4.230, -0.995)   (-3.141, 4.299, -1.011)   (0.5751, 0.4193, 0.1149)
2            (-3.080, 4.230, -0.937)   (-3.161, 4.309, -0.951)   (0.5529, 0.421, 0.1117)
4            (-3.080, 4.230, -0.799)   (-3.089, 4.287, -0.816)   (0.4718, 0.3998, 0.1102)
2) Effects of sample size on estimation accuracy
Table A4. Estimation results for different sample sizes in Scenario 1

Sample Size   True Value of τ   Mean of Estimated τ   SD of Estimated τ
50            (-3.08, 2.23)     (-3.254, 2.363)       (0.9515, 0.7093)
60            (-3.08, 2.23)     (-3.333, 2.421)       (0.8971, 0.6457)
70            (-3.08, 2.23)     (-3.214, 2.326)       (0.857, 0.6034)
80            (-3.08, 2.23)     (-3.239, 2.354)       (0.7859, 0.5473)
100           (-3.08, 2.23)     (-3.191, 2.303)       (0.7000, 0.5228)
200           (-3.08, 2.23)     (-3.125, 2.260)       (0.4887, 0.3458)
300           (-3.08, 2.23)     (-3.125, 2.268)       (0.3986, 0.3115)
500           (-3.08, 2.23)     (-3.106, 2.249)       (0.2988, 0.2415)
800           (-3.08, 2.23)     (-3.099, 2.246)       (0.2293, 0.1823)
1000          (-3.08, 2.23)     (-3.089, 2.234)       (0.2138, 0.1976)
Table A5. Estimation results for different sample sizes in Scenario 2

Sample Size   True Value of τ        Mean of Estimated τ       SD of Estimated τ
50            (-3.08, 4.23, -0.20)   (-1.935, 3.254, -0.236)   (1.5781, 1.446, 0.2283)
60            (-3.08, 4.23, -0.20)   (-2.221, 3.505, -0.223)   (1.1664, 0.4207, 0.2267)
70            (-3.08, 4.23, -0.20)   (-2.194, 3.352, -0.199)   (0.9078, 0.4003, 0.1729)
80            (-3.08, 4.23, -0.20)   (-2.411, 3.598, -0.208)   (0.8976, 0.4003, 0.1817)
100           (-3.08, 4.23, -0.20)   (-2.645, 3.842, -0.209)   (0.7825, 0.427, 0.1494)
200           (-3.08, 4.23, -0.20)   (-2.911, 4.096, -0.209)   (0.6619, 0.497, 0.0966)
300           (-3.08, 4.23, -0.20)   (-3.106, 4.287, -0.206)   (0.6524, 0.5505, 0.0741)
400           (-3.08, 4.23, -0.20)   (-3.22, 4.38, -0.202)     (0.6442, 0.5469, 0.072)
500           (-3.08, 4.23, -0.20)   (-3.187, 4.344, -0.201)   (0.5779, 0.5093, 0.0695)
800           (-3.08, 4.23, -0.20)   (-3.21, 4.35, -0.199)     (0.5359, 0.456, 0.0501)
1000          (-3.08, 4.23, -0.20)   (-3.183, 4.322, -0.198)   (0.4651, 0.3899, 0.045)
3) Joint effects of sequence length and sample size on estimation accuracy
Table A6. Estimation results for different sample sizes in Scenario 1 when sequence length T=12, early onset k=6

Sample Size   True Value of τ   Mean of Estimated τ   SD of Estimated τ
50            (-3.08, 2.23)     (-3.254, 2.363)       (0.9515, 0.7093)
60            (-3.08, 2.23)     (-3.333, 2.421)       (0.8971, 0.6457)
70            (-3.08, 2.23)     (-3.214, 2.326)       (0.857, 0.6034)
80            (-3.08, 2.23)     (-3.239, 2.354)       (0.7859, 0.5473)
100           (-3.08, 2.23)     (-3.191, 2.303)       (0.7000, 0.5228)
200           (-3.08, 2.23)     (-3.125, 2.260)       (0.4887, 0.3458)
300           (-3.08, 2.23)     (-3.125, 2.268)       (0.3986, 0.3115)
500           (-3.08, 2.23)     (-3.106, 2.249)       (0.2988, 0.2415)
800           (-3.08, 2.23)     (-3.099, 2.246)       (0.2293, 0.1823)
1000          (-3.08, 2.23)     (-3.089, 2.234)       (0.2138, 0.1976)
Table A7. Estimation results for different sample sizes in Scenario 1 when sequence length T=10, early onset k=5

Sample Size   True Value of τ   Mean of Estimated τ   SD of Estimated τ
50            (-3.08, 2.23)     (-3.32, 2.412)        (0.9531, 0.7238)
60            (-3.08, 2.23)     (-3.294, 2.381)       (0.9264, 0.6575)
70            (-3.08, 2.23)     (-3.212, 2.328)       (0.837, 0.5915)
80            (-3.08, 2.23)     (-3.229, 2.356)       (0.8084, 0.5966)
100           (-3.08, 2.23)     (-3.215, 2.316)       (0.7548, 0.4889)
200           (-3.08, 2.23)     (-3.113, 2.258)       (0.4816, 0.3636)
300           (-3.08, 2.23)     (-3.148, 2.285)       (0.377, 0.286)
500           (-3.08, 2.23)     (-3.088, 2.23)        (0.2925, 0.2475)
800           (-3.08, 2.23)     (-3.096, 2.24)        (0.2296, 0.1939)
1000          (-3.08, 2.23)     (-3.094, 2.242)       (0.2076, 0.1743)
Table A8. Estimation results for different sample sizes in Scenario 1 when sequence length T=8, early onset k=4

Sample Size   True Value of τ   Mean of Estimated τ   SD of Estimated τ
50            (-3.08, 2.23)     (-3.29, 2.392)        (0.9837, 0.786)
60            (-3.08, 2.23)     (-3.238, 2.341)       (0.8542, 0.6605)
70            (-3.08, 2.23)     (-3.248, 2.361)       (0.8445, 0.6112)
80            (-3.08, 2.23)     (-3.279, 2.375)       (0.8163, 0.5938)
100           (-3.08, 2.23)     (-3.232, 2.342)       (0.6765, 0.5056)
200           (-3.08, 2.23)     (-3.17, 2.304)        (0.4596, 0.3446)
300           (-3.08, 2.23)     (-3.145, 2.279)       (0.401, 0.3014)
500           (-3.08, 2.23)     (-3.106, 2.253)       (0.3072, 0.2524)
800           (-3.08, 2.23)     (-3.087, 2.227)       (0.2323, 0.2084)
1000          (-3.08, 2.23)     (-3.105, 2.24)        (0.2195, 0.2218)
Table A9. Estimation results for different sample sizes in Scenario 1 when sequence length T=6, early onset k=3

Sample Size   True Value of τ   Mean of Estimated τ   SD of Estimated τ
50            (-3.08, 2.23)     (-3.37, 2.414)        (0.9925, 0.7267)
60            (-3.08, 2.23)     (-3.255, 2.356)       (0.9334, 0.7048)
70            (-3.08, 2.23)     (-3.241, 2.362)       (0.8833, 0.6443)
80            (-3.08, 2.23)     (-3.246, 2.354)       (0.7951, 0.5562)
100           (-3.08, 2.23)     (-3.233, 2.336)       (0.6580, 0.5104)
200           (-3.08, 2.23)     (-3.144, 2.283)       (0.4916, 0.3800)
300           (-3.08, 2.23)     (-3.135, 2.263)       (0.3886, 0.3031)
500           (-3.08, 2.23)     (-3.099, 2.249)       (0.3000, 0.2594)
800           (-3.08, 2.23)     (-3.095, 2.237)       (0.2359, 0.205)
1000          (-3.08, 2.23)     (-3.103, 2.239)       (0.2099, 0.2253)
4) Effects of post-onset slope on estimation accuracy
Table A10. Effects of post-onset slope ($\gamma$) on estimation and prediction accuracy

True Value of γ   True Value of τ   Mean of Estimated τ   SD of Estimated τ   Prediction Rate
(0.40, -0.06)     (-3.08, 2.23)     (-3.074, 2.192)       (0.302, 0.242)      0.776
(0.30, -0.045)    (-3.08, 2.23)     (-3.044, 2.136)       (0.308, 0.305)      0.772
(0.20, -0.03)     (-3.08, 2.23)     (-3.169, 2.129)       (0.791, 0.523)      0.751
(0.10, -0.015)    (-3.08, 2.23)     (-3.370, 2.930)       (1.110, 2.589)      0.698
5) Effects of data variance on estimation accuracy
Table A11. Effects of data variance ($\sigma^2$) on estimation and prediction accuracy

True Value of σ²   True Value of τ   Mean of Estimated τ   SD of Estimated τ   Prediction Rate
0.01               (-3.08, 2.23)     (-3.095, 2.245)       (0.294, 0.212)      0.7777
0.05               (-3.08, 2.23)     (-3.169, 2.129)       (0.791, 0.523)      0.751
0.10               (-3.08, 2.23)     (-3.541, 3.036)       (1.370, 2.651)      0.699
0.15               (-3.08, 2.23)     (-3.696, 3.988)       (2.380, 3.649)      0.678