Appendix

(A) The EM Algorithm for the Mixture Model

Below we give the details of the statistical model and the EM steps for parameter estimation. (For simplicity of indexing in the mathematical expressions, we assume the existence of $y_{i,0}$ with value 0.)

1) Statistical Model

1. Let $Y_i = (Y_{i,1}, Y_{i,2}, \ldots, Y_{i,T})$ be the sequence of observed symptom measurements for subject $i$, and let $x_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,M})'$ denote the genotypes or other covariates (including the constant 1) for subject $i$.

2. Assume that the drug effect may occur for subject $i$ at time $Z_i = k$ or $k+1$ ($< T$), which is unknown. We assume that $Z_i$ follows a Bernoulli distribution with $P(Z_i = k+1) = p_i$. We link the parameter $p_i$ to the genotypes or covariates by $\operatorname{logit}(p_i) = x_i'\alpha$, where $\alpha$ is the coefficient vector.

3. An AR(1) model is used for the symptoms during the latency period, up to and including the change point $Z_i$, when the measurements are assumed to be stationary: $(Y_{i,1}, Y_{i,2}, \ldots, Y_{i,Z_i})$ follows the AR(1) process
$$Y_{i,t} = \rho_i Y_{i,t-1} + a_{i,t}, \qquad 1 \le t \le Z_i,$$
where $a_{i,t}$ is a white-noise process with variance $\sigma^2$. We link the parameter $\rho_i$ to the genotypes or covariates by $\rho_i = x_i'\beta$, where $\beta$ is the coefficient vector. (Here the mean of the AR(1) process is assumed to be zero.)

4. A linear trend model is used for the increase of the symptom measurements after the change point $Z_i$: $(Y_{i,Z_i}, Y_{i,Z_i+1}, \ldots, Y_{i,T})$ follows the linear trend model
$$Y_{i,t} = Y_{i,t-1} + \mu_i + \varepsilon_{i,t}, \qquad Z_i < t \le T,$$
where $\varepsilon_{i,t} \stackrel{\text{i.i.d.}}{\sim} N(0, \sigma^2)$ is the error term. We link the parameter $\mu_i$ to the genotypes or covariates by $\mu_i = x_i'\gamma$, where $\gamma$ is the coefficient vector.

2) The Likelihood

When $t \le Z_i$, $Y_{i,t} = \rho_i Y_{i,t-1} + a_{i,t}$, i.e. $Y_{i,t} \mid Y_{i,t-1} = y_{i,t-1} \sim N(\rho_i y_{i,t-1}, \sigma^2) = N(y_{i,t-1}x_i'\beta, \sigma^2)$; when $t > Z_i$, $Y_{i,t} = Y_{i,t-1} + \mu_i + \varepsilon_{i,t}$, i.e. $Y_{i,t} \mid Y_{i,t-1} = y_{i,t-1} \sim N(y_{i,t-1} + \mu_i, \sigma^2) = N(y_{i,t-1} + x_i'\gamma, \sigma^2)$.

The likelihood of sequence $i$, with $\theta = (\alpha, \beta, \gamma, \sigma^2)$, is

$$
\begin{aligned}
P(Y_i = (y_{i,1},\ldots,y_{i,T}) \mid \theta)
&= P(Y_i, Z_i = k \mid \theta) + P(Y_i, Z_i = k+1 \mid \theta)\\
&= (1-p_i)\,P(Y_i \mid Z_i = k, \theta) + p_i\,P(Y_i \mid Z_i = k+1, \theta)\\
&= (1-p_i)\Big(\tfrac{1}{\sqrt{2\pi}\,\sigma}\Big)^{T}\exp\Big\{-\tfrac{1}{2\sigma^2}\Big(\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big)\Big\}\\
&\quad+p_i\Big(\tfrac{1}{\sqrt{2\pi}\,\sigma}\Big)^{T}\exp\Big\{-\tfrac{1}{2\sigma^2}\Big(\sum_{t=1}^{k+1}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+2}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big)\Big\}\\
&= \Big(\tfrac{1}{\sqrt{2\pi}\,\sigma}\Big)^{T}\exp\Big\{-\tfrac{1}{2\sigma^2}\Big(\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+2}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big)\Big\}\\
&\quad\times\Big[\tfrac{1}{1+e^{x_i'\alpha}}\exp\Big\{-\tfrac{1}{2\sigma^2}(y_{i,k+1}-y_{i,k}-x_i'\gamma)^2\Big\}+\tfrac{e^{x_i'\alpha}}{1+e^{x_i'\alpha}}\exp\Big\{-\tfrac{1}{2\sigma^2}(y_{i,k+1}-y_{i,k}x_i'\beta)^2\Big\}\Big]
\end{aligned}\tag{A.2.1}
$$

Thus the likelihood of all $N$ observed sequences is

$$
\begin{aligned}
P(Y_1,\ldots,Y_N \mid \theta) &= \prod_{i=1}^{N} P(Y_i \mid \theta)\\
&= \Big(\tfrac{1}{\sqrt{2\pi}\,\sigma}\Big)^{NT}\exp\Big\{-\tfrac{1}{2\sigma^2}\sum_{i=1}^{N}\Big(\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+2}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big)\Big\}\\
&\quad\times\prod_{i=1}^{N}\Big[\tfrac{1}{1+e^{x_i'\alpha}}\exp\Big\{-\tfrac{1}{2\sigma^2}(y_{i,k+1}-y_{i,k}-x_i'\gamma)^2\Big\}+\tfrac{e^{x_i'\alpha}}{1+e^{x_i'\alpha}}\exp\Big\{-\tfrac{1}{2\sigma^2}(y_{i,k+1}-y_{i,k}x_i'\beta)^2\Big\}\Big].
\end{aligned}
$$

It is hard to obtain the MLE of $(\alpha, \beta, \gamma, \sigma^2)$ directly by maximizing the likelihood above because the change points $Z = \{z_1, z_2, \ldots, z_N\}$ are missing data. For this reason, we propose an EM-based iterative algorithm to estimate a local MLE of $(\alpha, \beta, \gamma, \sigma^2)$.
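As a concrete illustration (not part of the original derivation), the following is a minimal NumPy sketch of the per-subject likelihood (A.2.1). The function name `sequence_likelihood` and the convention of prepending $y_{i,0} = 0$ to the data vector are our own assumptions.

```python
import numpy as np

def sequence_likelihood(y, x, alpha, beta, gamma, sigma2, k):
    """Observed-data likelihood (A.2.1) for one subject.

    y : (T+1,) array with y[0] = 0 prepended, so y[t] is y_{i,t}.
    x : (M,) covariate vector (first entry 1).
    """
    T = len(y) - 1
    p = 1.0 / (1.0 + np.exp(-x @ alpha))             # P(Z_i = k+1) = logit^{-1}(x'alpha)
    ar_resid    = y[1:k+1] - y[0:k] * (x @ beta)      # AR(1) residuals, t = 1..k
    trend_resid = y[k+2:T+1] - y[k+1:T] - x @ gamma   # trend residuals, t = k+2..T
    common = (2 * np.pi * sigma2) ** (-T / 2) * np.exp(
        -(np.sum(ar_resid**2) + np.sum(trend_resid**2)) / (2 * sigma2))
    # The two mixture components differ only in how y_{k+1} is modeled.
    trend_k1 = np.exp(-(y[k+1] - y[k] - x @ gamma) ** 2 / (2 * sigma2))   # Z_i = k
    ar_k1    = np.exp(-(y[k+1] - y[k] * (x @ beta)) ** 2 / (2 * sigma2))  # Z_i = k+1
    return common * ((1 - p) * trend_k1 + p * ar_k1)
```

Note that the common factor in front of the bracket is shared by both mixture components, so it cancels in the E-step posterior probabilities derived below.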
3) EM Estimation Procedure

Assume that at step $n$ the estimated parameter value is $\theta^{(n)} = (\alpha^{(n)}, \beta^{(n)}, \gamma^{(n)}, (\sigma^2)^{(n)})$. Then at step $n+1$,
$$\theta^{(n+1)} = \arg\max_{\theta}\, E_{Z|Y,\theta^{(n)}}\big[\ln P(Y, Z \mid \theta)\big] = \arg\max_{\theta} \sum_i E_{Z_i|Y_i,\theta^{(n)}}\big[\ln P(Y_i, Z_i \mid \theta)\big].$$
Below we give the details of the E-step and the M-step.

E-step. We denote $A_{i,k}^{(n)} = P(Z_i = k \mid Y_i, \theta^{(n)})$ and $A_{i,k+1}^{(n)} = P(Z_i = k+1 \mid Y_i, \theta^{(n)})$, calculated as

$$A_{i,k}^{(n)} = \frac{(1-p_i^{(n)})\,P(Y_i \mid Z_i = k, \theta^{(n)})}{(1-p_i^{(n)})\,P(Y_i \mid Z_i = k, \theta^{(n)}) + p_i^{(n)}\,P(Y_i \mid Z_i = k+1, \theta^{(n)})},$$
$$A_{i,k+1}^{(n)} = \frac{p_i^{(n)}\,P(Y_i \mid Z_i = k+1, \theta^{(n)})}{(1-p_i^{(n)})\,P(Y_i \mid Z_i = k, \theta^{(n)}) + p_i^{(n)}\,P(Y_i \mid Z_i = k+1, \theta^{(n)})},$$

where $p_i^{(n)} = \dfrac{e^{x_i'\alpha^{(n)}}}{1+e^{x_i'\alpha^{(n)}}}$ and

$$P(Y_i \mid Z_i = k, \theta^{(n)}) = \Big(\tfrac{1}{\sqrt{2\pi}\,\sigma^{(n)}}\Big)^{T}\exp\Big\{-\tfrac{1}{2(\sigma^2)^{(n)}}\Big(\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta^{(n)})^2+\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma^{(n)})^2\Big)\Big\},$$
$$P(Y_i \mid Z_i = k+1, \theta^{(n)}) = \Big(\tfrac{1}{\sqrt{2\pi}\,\sigma^{(n)}}\Big)^{T}\exp\Big\{-\tfrac{1}{2(\sigma^2)^{(n)}}\Big(\sum_{t=1}^{k+1}(y_{i,t}-y_{i,t-1}x_i'\beta^{(n)})^2+\sum_{t=k+2}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma^{(n)})^2\Big)\Big\}.$$

Then

$$
\begin{aligned}
E_{Z_i|Y_i,\theta^{(n)}}\big[\ln P(Y_i, Z_i \mid \theta)\big]
&= A_{i,k}^{(n)} \ln P(Y_i, Z_i = k \mid \theta) + A_{i,k+1}^{(n)} \ln P(Y_i, Z_i = k+1 \mid \theta)\\
&= A_{i,k}^{(n)}\Big[\ln(1-p_i) + T\ln\tfrac{1}{\sqrt{2\pi}\,\sigma} - \tfrac{1}{2\sigma^2}\Big(\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big)\Big]\\
&\quad+ A_{i,k+1}^{(n)}\Big[\ln p_i + T\ln\tfrac{1}{\sqrt{2\pi}\,\sigma} - \tfrac{1}{2\sigma^2}\Big(\sum_{t=1}^{k+1}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+2}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big)\Big]\\
&= A_{i,k}^{(n)}\ln\tfrac{1}{1+e^{x_i'\alpha}} + A_{i,k+1}^{(n)}\ln\tfrac{e^{x_i'\alpha}}{1+e^{x_i'\alpha}} + T\ln\tfrac{1}{\sqrt{2\pi}\,\sigma}\\
&\quad- \tfrac{1}{2\sigma^2}\Big[\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta)^2 + A_{i,k+1}^{(n)}(y_{i,k+1}-y_{i,k}x_i'\beta)^2\\
&\qquad\qquad+ A_{i,k}^{(n)}(y_{i,k+1}-y_{i,k}-x_i'\gamma)^2 + \sum_{t=k+2}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big].
\end{aligned}
$$

M-step. To find $\theta^{(n+1)} = \arg\max_\theta \sum_i E_{Z_i|Y_i,\theta^{(n)}}[\ln P(Y_i, Z_i \mid \theta)]$, we maximize the expected log likelihood above, summed over all sequences. Below we give the details for estimating each individual parameter.

a) To find $\alpha^{(n+1)}$, we maximize
$$\sum_{i=1}^{N}\Big[A_{i,k}^{(n)}\ln\tfrac{1}{1+e^{x_i'\alpha}} + A_{i,k+1}^{(n)}\ln\tfrac{e^{x_i'\alpha}}{1+e^{x_i'\alpha}}\Big].$$
This expression corresponds to a weighted logistic regression. We use the Newton-Raphson algorithm to calculate the $\alpha^{(n+1)}$ that maximizes the expression above.

b) To estimate $\beta^{(n+1)}$, maximize
$$-\sum_{i=1}^{N}\Big[\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta)^2 + A_{i,k+1}^{(n)}(y_{i,k+1}-y_{i,k}x_i'\beta)^2\Big].$$
Let $Y^* = \begin{pmatrix} Y_1^* \\ \vdots \\ Y_N^* \end{pmatrix}$ and $X^* = \begin{pmatrix} X_1^* \\ \vdots \\ X_N^* \end{pmatrix}$, where
$$Y_i^* = \begin{pmatrix} y_{i,1} \\ y_{i,2} \\ \vdots \\ y_{i,k} \\ \sqrt{A_{i,k+1}^{(n)}}\; y_{i,k+1} \end{pmatrix}, \qquad X_i^* = \begin{pmatrix} y_{i,0}\,x_i' \\ y_{i,1}\,x_i' \\ \vdots \\ y_{i,k-1}\,x_i' \\ \sqrt{A_{i,k+1}^{(n)}}\; y_{i,k}\,x_i' \end{pmatrix}.$$
This is a weighted linear regression, and the solution is $\beta^{(n+1)} = (X^{*\prime}X^*)^{-1}X^{*\prime}Y^*$.

c) To estimate $\gamma^{(n+1)}$, maximize
$$-\sum_{i=1}^{N}\Big[A_{i,k}^{(n)}(y_{i,k+1}-y_{i,k}-x_i'\gamma)^2 + \sum_{t=k+2}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big].$$
Let $Y^*$ and $X^*$ be stacked as above, where now
$$Y_i^* = \begin{pmatrix} \sqrt{A_{i,k}^{(n)}}\,(y_{i,k+1}-y_{i,k}) \\ y_{i,k+2}-y_{i,k+1} \\ \vdots \\ y_{i,T}-y_{i,T-1} \end{pmatrix}, \qquad X_i^* = \begin{pmatrix} \sqrt{A_{i,k}^{(n)}}\,x_i' \\ x_i' \\ \vdots \\ x_i' \end{pmatrix}.$$
This is also a weighted linear regression, and the solution is $\gamma^{(n+1)} = (X^{*\prime}X^*)^{-1}X^{*\prime}Y^*$.

d) After estimating $\beta^{(n+1)}$ and $\gamma^{(n+1)}$, $(\sigma^2)^{(n+1)}$ can be calculated by
$$
\begin{aligned}
(\sigma^2)^{(n+1)} = \frac{1}{NT}\sum_{i=1}^{N}\Big[&\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta^{(n+1)})^2 + A_{i,k+1}^{(n)}(y_{i,k+1}-y_{i,k}x_i'\beta^{(n+1)})^2\\
&+ A_{i,k}^{(n)}(y_{i,k+1}-y_{i,k}-x_i'\gamma^{(n+1)})^2 + \sum_{t=k+2}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma^{(n+1)})^2\Big].
\end{aligned}
$$

The EM steps iterate until convergence, i.e., until the differences between the estimated parameter values at steps $n$ and $n+1$ fall below a pre-specified threshold.
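To make the E- and M-steps concrete, here is a minimal NumPy sketch of one full EM iteration under the model above. The function name `em_step`, the fixed cap of 25 Newton-Raphson iterations, and the use of `np.linalg.lstsq` for the weighted least-squares solutions $(X^{*\prime}X^*)^{-1}X^{*\prime}Y^*$ are illustrative choices, not the authors' implementation.

```python
import numpy as np

def em_step(Y, X, k, alpha, beta, gamma, sigma2):
    """One EM iteration. Y: (N, T+1) with Y[:, 0] = 0; X: (N, M)."""
    N, Tp1 = Y.shape
    T = Tp1 - 1
    # --- E-step: posterior weights A1 = A_{i,k+1} = P(Z_i = k+1 | Y_i, theta) ---
    p = 1.0 / (1.0 + np.exp(-X @ alpha))
    rho, mu = X @ beta, X @ gamma                     # per-subject AR and trend parameters
    ar_sq = np.sum((Y[:, 1:k+1] - Y[:, 0:k] * rho[:, None])**2, axis=1)
    tr_sq = np.sum((Y[:, k+2:] - Y[:, k+1:T] - mu[:, None])**2, axis=1)
    # Component likelihoods up to the shared normalizing constant, which cancels.
    f_k  = np.exp(-(ar_sq + (Y[:, k+1] - Y[:, k] - mu)**2 + tr_sq) / (2 * sigma2))
    f_k1 = np.exp(-(ar_sq + (Y[:, k+1] - Y[:, k] * rho)**2 + tr_sq) / (2 * sigma2))
    A1 = p * f_k1 / ((1 - p) * f_k + p * f_k1)        # A_{i,k} = 1 - A1
    # --- M-step (a): alpha via weighted logistic regression (Newton-Raphson) ---
    for _ in range(25):
        q = 1.0 / (1.0 + np.exp(-X @ alpha))
        grad = X.T @ (A1 - q)
        hess = -(X * (q * (1 - q))[:, None]).T @ X
        alpha = alpha - np.linalg.solve(hess, grad)
    # --- M-step (b): beta via weighted least squares on the stacked AR rows ---
    rows_y, rows_x = [], []
    for i in range(N):
        w = np.sqrt(A1[i])
        rows_y += list(Y[i, 1:k+1]) + [w * Y[i, k+1]]
        rows_x += [Y[i, t] * X[i] for t in range(k)] + [w * Y[i, k] * X[i]]
    beta = np.linalg.lstsq(np.array(rows_x), np.array(rows_y), rcond=None)[0]
    # --- M-step (c): gamma via weighted least squares on the stacked trend rows ---
    rows_y, rows_x = [], []
    for i in range(N):
        w = np.sqrt(1 - A1[i])
        rows_y += [w * (Y[i, k+1] - Y[i, k])] + list(Y[i, k+2:] - Y[i, k+1:T])
        rows_x += [w * X[i]] + [X[i]] * (T - k - 1)
    gamma = np.linalg.lstsq(np.array(rows_x), np.array(rows_y), rcond=None)[0]
    # --- M-step (d): sigma^2 as the weighted residual sum of squares over NT ---
    rho, mu = X @ beta, X @ gamma
    rss = (np.sum((Y[:, 1:k+1] - Y[:, 0:k] * rho[:, None])**2)
           + np.sum(A1 * (Y[:, k+1] - Y[:, k] * rho)**2)
           + np.sum((1 - A1) * (Y[:, k+1] - Y[:, k] - mu)**2)
           + np.sum((Y[:, k+2:] - Y[:, k+1:T] - mu[:, None])**2))
    sigma2 = rss / (N * T)
    return alpha, beta, gamma, sigma2, A1
```

Calling `em_step` repeatedly until the parameter changes fall below a threshold yields the local MLE described above.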
(B) Information Matrix Calculation

Below we give the details of the calculation of the information matrix using the method of Louis (1982).

1) First and Second Derivatives of the Log Likelihood of the Complete Data

Let $W_i = (Y_i, Z_i)$ denote the complete data for subject $i$, including the observed sequence $Y_i$ and the missing onset $Z_i$, and let $z_i$ indicate the event $Z_i = k+1$. Then

$$
\begin{aligned}
P(W_i \mid \theta) &= P(Z_i \mid \theta)\,P(Y_i \mid Z_i, \theta)\\
&= \Big[\tfrac{1}{1+e^{x_i'\alpha}}\Big(\tfrac{1}{\sqrt{2\pi}\,\sigma}\Big)^{T}\exp\Big\{-\tfrac{1}{2\sigma^2}\Big(\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big)\Big\}\Big]^{1-z_i}\\
&\quad\times\Big[\tfrac{e^{x_i'\alpha}}{1+e^{x_i'\alpha}}\Big(\tfrac{1}{\sqrt{2\pi}\,\sigma}\Big)^{T}\exp\Big\{-\tfrac{1}{2\sigma^2}\Big(\sum_{t=1}^{k+1}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+2}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big)\Big\}\Big]^{z_i},
\end{aligned}
$$

and

$$
\begin{aligned}
L_i &= \ln P(W_i \mid \theta)\\
&= -\tfrac{T}{2}\ln(2\pi\sigma^2) + (1-z_i)\Big[-\ln(1+e^{x_i'\alpha}) - \tfrac{1}{2\sigma^2}\Big(\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big)\Big]\\
&\quad+ z_i\Big[x_i'\alpha - \ln(1+e^{x_i'\alpha}) - \tfrac{1}{2\sigma^2}\Big(\sum_{t=1}^{k+1}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+2}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big)\Big].
\end{aligned}
$$

The first and second derivatives of $L_i = \ln P(W_i \mid \theta)$ are given below.

$$\frac{\partial L_i}{\partial \beta} = \frac{x_i}{\sigma^2}\Big[\sum_{t=1}^{k} y_{i,t-1}(y_{i,t}-y_{i,t-1}x_i'\beta) + z_i\,y_{i,k}(y_{i,k+1}-y_{i,k}x_i'\beta)\Big],$$
$$\frac{\partial^2 L_i}{\partial \beta\,\partial \beta'} = -\frac{x_i x_i'}{\sigma^2}\Big[\sum_{t=1}^{k} y_{i,t-1}^2 + z_i\,y_{i,k}^2\Big], \qquad \frac{\partial^2 L_i}{\partial \beta\,\partial \gamma'} = 0, \qquad \frac{\partial^2 L_i}{\partial \beta\,\partial \alpha'} = 0,$$
$$\frac{\partial^2 L_i}{\partial \beta\,\partial \sigma^2} = -\frac{x_i}{\sigma^4}\Big[\sum_{t=1}^{k} y_{i,t-1}(y_{i,t}-y_{i,t-1}x_i'\beta) + z_i\,y_{i,k}(y_{i,k+1}-y_{i,k}x_i'\beta)\Big];$$

$$\frac{\partial L_i}{\partial \gamma} = \frac{x_i}{\sigma^2}\Big[\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma) - z_i\,(y_{i,k+1}-y_{i,k}-x_i'\gamma)\Big],$$
$$\frac{\partial^2 L_i}{\partial \gamma\,\partial \gamma'} = -\frac{(T-k)\,x_i x_i'}{\sigma^2} + \frac{z_i\,x_i x_i'}{\sigma^2} = -\frac{(T-k-z_i)}{\sigma^2}\,x_i x_i', \qquad \frac{\partial^2 L_i}{\partial \gamma\,\partial \alpha'} = 0,$$
$$\frac{\partial^2 L_i}{\partial \gamma\,\partial \sigma^2} = -\frac{x_i}{\sigma^4}\Big[\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma) - z_i\,(y_{i,k+1}-y_{i,k}-x_i'\gamma)\Big];$$

$$\frac{\partial L_i}{\partial \alpha} = -(1-z_i)\,\frac{x_i e^{x_i'\alpha}}{1+e^{x_i'\alpha}} + z_i\Big(x_i - \frac{x_i e^{x_i'\alpha}}{1+e^{x_i'\alpha}}\Big) = x_i\Big(z_i - \frac{e^{x_i'\alpha}}{1+e^{x_i'\alpha}}\Big),$$
$$\frac{\partial^2 L_i}{\partial \alpha\,\partial \alpha'} = -x_i x_i'\,\frac{e^{x_i'\alpha}}{(1+e^{x_i'\alpha})^2}, \qquad \frac{\partial^2 L_i}{\partial \alpha\,\partial \sigma^2} = 0;$$

$$
\begin{aligned}
\frac{\partial L_i}{\partial \sigma^2} &= -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4}\Big[(1-z_i)\Big(\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big)\\
&\qquad\qquad\qquad\quad+ z_i\Big(\sum_{t=1}^{k+1}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+2}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\Big)\Big]\\
&= -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4}\Big[\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\\
&\qquad\qquad\qquad\quad+ z_i\Big((y_{i,k+1}-y_{i,k}x_i'\beta)^2-(y_{i,k+1}-y_{i,k}-x_i'\gamma)^2\Big)\Big],
\end{aligned}
$$
$$\frac{\partial^2 L_i}{\partial (\sigma^2)^2} = \frac{T}{2\sigma^4} - \frac{1}{\sigma^6}\Big[\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2 + z_i\Big((y_{i,k+1}-y_{i,k}x_i'\beta)^2-(y_{i,k+1}-y_{i,k}-x_i'\gamma)^2\Big)\Big].$$

Let $W = (Y, Z)$ denote the complete data, including the observed sample $Y$ and the missing data $Z$. The gradient and Hessian matrix of the complete-data log likelihood can then be assembled from the results above.

2) The Gradient $S(W)$ and $E_q(S(W)\,S(W)' \mid Y)$

$$
S(W \mid \theta) = \begin{pmatrix} \sum_i \partial L_i/\partial \beta \\ \sum_i \partial L_i/\partial \gamma \\ \sum_i \partial L_i/\partial \alpha \\ \sum_i \partial L_i/\partial \sigma^2 \end{pmatrix}
= \begin{pmatrix}
\frac{1}{\sigma^2}\sum_i x_i\big[\sum_{t=1}^{k} y_{i,t-1}(y_{i,t}-y_{i,t-1}x_i'\beta) + z_i\,y_{i,k}(y_{i,k+1}-y_{i,k}x_i'\beta)\big]\\[4pt]
\frac{1}{\sigma^2}\sum_i x_i\big[\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma) - z_i\,(y_{i,k+1}-y_{i,k}-x_i'\gamma)\big]\\[4pt]
\sum_i x_i\big(z_i - \frac{e^{x_i'\alpha}}{1+e^{x_i'\alpha}}\big)\\[4pt]
-\frac{NT}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_i\big[\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2 + z_i\big((y_{i,k+1}-y_{i,k}x_i'\beta)^2-(y_{i,k+1}-y_{i,k}-x_i'\gamma)^2\big)\big]
\end{pmatrix}.
$$

Now we calculate each element of $E_q(S(W)\,S(W)' \mid Y)$. The following formula facilitates the calculation. For blocks of the form $\sum_i x_i(A_i + z_i B_i)$ and $\sum_j x_j'(C_j + z_j D_j)$, where the $z_i$ are conditionally independent Bernoulli variables with $E[z_i] = E[z_i^2] = p_i$,

$$
\begin{aligned}
E\Big[\sum_i x_i(A_i + z_i B_i)\,\sum_j x_j'(C_j + z_j D_j)\Big]
&= \sum_i x_i x_i'\,E\big[(A_i + z_i B_i)(C_i + z_i D_i)\big] + \sum_{i \ne j} x_i x_j'\,E\big[A_i + z_i B_i\big]\,E\big[C_j + z_j D_j\big]\\
&= \sum_i x_i x_i'\big(A_i C_i + p_i A_i D_i + p_i B_i C_i + p_i B_i D_i\big) + \sum_{i \ne j} x_i x_j'(A_i + p_i B_i)(C_j + p_j D_j)\\
&= \sum_i x_i x_i'(A_i + p_i B_i)(C_i + p_i D_i) + \sum_i x_i x_i'(p_i - p_i^2)B_i D_i + \sum_{i \ne j} x_i x_j'(A_i + p_i B_i)(C_j + p_j D_j)\\
&= \sum_i x_i(A_i + p_i B_i)\,\sum_j x_j'(C_j + p_j D_j) + \sum_i x_i x_i'(p_i - p_i^2)B_i D_i,
\end{aligned}\tag{B.b.1}
$$

where $p_i = P(z_i = 1)$; in our method this is the conditional probability $A_{i,k+1} = P(Z_i = k+1 \mid Y_i)$.

For example, to calculate $E\big(\frac{\partial L}{\partial \beta}\frac{\partial L}{\partial \beta}' \mid Y\big)$, take $A_i = C_i = \sum_{t=1}^{k} y_{i,t-1}(y_{i,t}-y_{i,t-1}x_i'\beta)$ and $B_i = D_i = y_{i,k}(y_{i,k+1}-y_{i,k}x_i'\beta)$ (absorbing the factor $1/\sigma^2$). To calculate $E\big(\frac{\partial L}{\partial \beta}\frac{\partial L}{\partial \gamma}' \mid Y\big)$, take $A_i = \sum_{t=1}^{k} y_{i,t-1}(y_{i,t}-y_{i,t-1}x_i'\beta)$, $B_i = y_{i,k}(y_{i,k+1}-y_{i,k}x_i'\beta)$, $C_i = \sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)$, and $D_i = -(y_{i,k+1}-y_{i,k}-x_i'\gamma)$.
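Formula (B.b.1) reduces each block of $E_q(S(W)S(W)' \mid Y)$ to two weighted sums plus a rank-correction term. As a small sketch (the helper name `expected_cross_block` is hypothetical; `A`, `B`, `C`, `D` hold the per-subject quantities chosen as in the examples above):

```python
import numpy as np

def expected_cross_block(X, A, B, C, D, p):
    """E[(sum_i x_i (A_i + z_i B_i)) (sum_j x_j' (C_j + z_j D_j))] via (B.b.1).

    X : (N, M) covariate matrix; A, B, C, D, p : (N,) arrays,
    with p_i = A_{i,k+1} = P(Z_i = k+1 | Y_i).
    """
    u = X.T @ (A + p * B)                                    # sum_i x_i (A_i + p_i B_i)
    v = X.T @ (C + p * D)                                    # sum_j x_j (C_j + p_j D_j)
    correction = (X * ((p - p**2) * B * D)[:, None]).T @ X   # sum_i x_i x_i' (p_i - p_i^2) B_i D_i
    return np.outer(u, v) + correction
```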
Then, utilizing formula (B.b.1) with $p_i = A_{i,k+1} = P(Z_i = k+1 \mid Y_i)$ and $\hat\theta = (\hat\alpha, \hat\beta, \hat\gamma, \hat\sigma^2)$, the MLE from the EM algorithm, we obtain the estimated value of each element of $E_q(S(W)\,S(W)' \mid Y)$.

3) The Hessian Matrix $B(W)$ and $E_q(B(W) \mid Y)$

The Hessian matrix $B(W)$ contains all the second derivatives of the complete-data log likelihood and is symmetric; its blocks are the sums over $i$ of the second derivatives given in Section (B.1):

$$B(W) = \begin{pmatrix}
\sum_i \frac{\partial^2 L_i}{\partial \beta\,\partial \beta'} & \sum_i \frac{\partial^2 L_i}{\partial \beta\,\partial \gamma'} & \sum_i \frac{\partial^2 L_i}{\partial \beta\,\partial \alpha'} & \sum_i \frac{\partial^2 L_i}{\partial \beta\,\partial \sigma^2}\\
 & \sum_i \frac{\partial^2 L_i}{\partial \gamma\,\partial \gamma'} & \sum_i \frac{\partial^2 L_i}{\partial \gamma\,\partial \alpha'} & \sum_i \frac{\partial^2 L_i}{\partial \gamma\,\partial \sigma^2}\\
 & & \sum_i \frac{\partial^2 L_i}{\partial \alpha\,\partial \alpha'} & \sum_i \frac{\partial^2 L_i}{\partial \alpha\,\partial \sigma^2}\\
 & & & \sum_i \frac{\partial^2 L_i}{\partial (\sigma^2)^2}
\end{pmatrix}.$$

Let $A_{i,k+1} = P(Z_i = k+1 \mid Y_i)$. Since every block of $B(W)$ is linear in $z_i$, $E_q(B(W) \mid Y)$ is obtained by replacing $z_i$ with $A_{i,k+1}$ in each block:

$$
\begin{aligned}
E_q\Big(\tfrac{\partial^2 L}{\partial \beta\,\partial \beta'} \,\Big|\, Y\Big) &= -\tfrac{1}{\sigma^2}\sum_i x_i x_i'\Big(\sum_{t=1}^{k} y_{i,t-1}^2 + A_{i,k+1}\,y_{i,k}^2\Big),\\
E_q\Big(\tfrac{\partial^2 L}{\partial \gamma\,\partial \gamma'} \,\Big|\, Y\Big) &= -\tfrac{1}{\sigma^2}\sum_i x_i x_i'\,(T-k-A_{i,k+1}),\\
E_q\Big(\tfrac{\partial^2 L}{\partial \alpha\,\partial \alpha'} \,\Big|\, Y\Big) &= -\sum_i x_i x_i'\,\tfrac{e^{x_i'\alpha}}{(1+e^{x_i'\alpha})^2},\\
E_q\Big(\tfrac{\partial^2 L}{\partial \beta\,\partial \sigma^2} \,\Big|\, Y\Big) &= -\tfrac{1}{\sigma^4}\sum_i x_i\Big(\sum_{t=1}^{k} y_{i,t-1}(y_{i,t}-y_{i,t-1}x_i'\beta) + A_{i,k+1}\,y_{i,k}(y_{i,k+1}-y_{i,k}x_i'\beta)\Big),\\
E_q\Big(\tfrac{\partial^2 L}{\partial \gamma\,\partial \sigma^2} \,\Big|\, Y\Big) &= -\tfrac{1}{\sigma^4}\sum_i x_i\Big(\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma) - A_{i,k+1}\,(y_{i,k+1}-y_{i,k}-x_i'\gamma)\Big),\\
E_q\Big(\tfrac{\partial^2 L}{\partial (\sigma^2)^2} \,\Big|\, Y\Big) &= \tfrac{NT}{2\sigma^4} - \tfrac{1}{\sigma^6}\sum_i\Big[\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\beta)^2+\sum_{t=k+1}^{T}(y_{i,t}-y_{i,t-1}-x_i'\gamma)^2\\
&\qquad\qquad\qquad\qquad+ A_{i,k+1}\Big((y_{i,k+1}-y_{i,k}x_i'\beta)^2-(y_{i,k+1}-y_{i,k}-x_i'\gamma)^2\Big)\Big],
\end{aligned}
$$
while the remaining blocks, $E_q(\partial^2 L/\partial \beta\,\partial \gamma' \mid Y)$, $E_q(\partial^2 L/\partial \beta\,\partial \alpha' \mid Y)$, $E_q(\partial^2 L/\partial \gamma\,\partial \alpha' \mid Y)$, and $E_q(\partial^2 L/\partial \alpha\,\partial \sigma^2 \mid Y)$, are all zero.

Following Louis (1982), $-E_q(B(W) \mid Y) - E_q(S(W)\,S(W)' \mid Y)$ gives the asymptotic estimate of the information matrix $I(\theta \mid Y)$ after substituting each parameter by its estimated value from the EM algorithm (at the MLE, the conditional expectation of the score is zero).

(C) Log Likelihood Ratio Test

To evaluate whether a genetic variant or environmental covariate significantly affects the distribution of the onset of drug effect, we apply the classic log likelihood ratio test. Let $\hat\theta_0 = (\hat\alpha_0, \hat\beta_0, \hat\gamma_0, \hat\sigma_0^2)$ and $\hat\theta_1 = (\hat\alpha_1, \hat\beta_1, \hat\gamma_1, \hat\sigma_1^2)$ be the estimated parameter values from the EM algorithm under the reduced model $H_0$ and the full model $H_1$, respectively. The likelihood ratio test statistic is then
$$-2\ln\frac{P(Y_1,\ldots,Y_N \mid \hat\theta_0)}{P(Y_1,\ldots,Y_N \mid \hat\theta_1)},$$
where $P(Y_1,\ldots,Y_N \mid \hat\theta) = \prod_i P(Y_i \mid \hat\theta)$ and $P(Y_i \mid \hat\theta)$ is given in equation (A.2.1). And

$$
\begin{aligned}
\ln\frac{P(Y_1,\ldots,Y_N \mid \hat\theta_0)}{P(Y_1,\ldots,Y_N \mid \hat\theta_1)}
&= -\frac{1}{2\hat\sigma_0^2}\sum_{i=1}^{N}\Big[\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\hat\beta_0)^2+\sum_{t=k+2}^{T}(y_{i,t}-y_{i,t-1}-x_i'\hat\gamma_0)^2\Big]\\
&\quad+ \frac{1}{2\hat\sigma_1^2}\sum_{i=1}^{N}\Big[\sum_{t=1}^{k}(y_{i,t}-y_{i,t-1}x_i'\hat\beta_1)^2+\sum_{t=k+2}^{T}(y_{i,t}-y_{i,t-1}-x_i'\hat\gamma_1)^2\Big]\\
&\quad+ \sum_{i=1}^{N}\ln\Big[\tfrac{1}{1+e^{x_i'\hat\alpha_0}}\exp\Big\{-\tfrac{1}{2\hat\sigma_0^2}(y_{i,k+1}-y_{i,k}-x_i'\hat\gamma_0)^2\Big\} + \tfrac{e^{x_i'\hat\alpha_0}}{1+e^{x_i'\hat\alpha_0}}\exp\Big\{-\tfrac{1}{2\hat\sigma_0^2}(y_{i,k+1}-y_{i,k}x_i'\hat\beta_0)^2\Big\}\Big]\\
&\quad- \sum_{i=1}^{N}\ln\Big[\tfrac{1}{1+e^{x_i'\hat\alpha_1}}\exp\Big\{-\tfrac{1}{2\hat\sigma_1^2}(y_{i,k+1}-y_{i,k}-x_i'\hat\gamma_1)^2\Big\} + \tfrac{e^{x_i'\hat\alpha_1}}{1+e^{x_i'\hat\alpha_1}}\exp\Big\{-\tfrac{1}{2\hat\sigma_1^2}(y_{i,k+1}-y_{i,k}x_i'\hat\beta_1)^2\Big\}\Big]\\
&\quad- \frac{NT}{2}\big(\ln\hat\sigma_0^2 - \ln\hat\sigma_1^2\big).
\end{aligned}
$$
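As a sketch of the test, reusing `sequence_likelihood` from the earlier sketch and assuming the usual chi-square asymptotics for the statistic (with `df` equal to the number of parameters constrained under $H_0$); the name `lrt_pvalue` is hypothetical:

```python
import numpy as np
from scipy import stats

def lrt_pvalue(Y, X, k, theta0, theta1, df):
    """Log likelihood ratio test comparing reduced (H0) and full (H1) fits.

    theta0, theta1 : (alpha, beta, gamma, sigma2) tuples of EM estimates.
    Uses sequence_likelihood() from the earlier sketch for (A.2.1).
    """
    ll0 = sum(np.log(sequence_likelihood(y, x, *theta0, k)) for y, x in zip(Y, X))
    ll1 = sum(np.log(sequence_likelihood(y, x, *theta1, k)) for y, x in zip(Y, X))
    lr = -2.0 * (ll0 - ll1)                 # LRT statistic
    return lr, stats.chi2.sf(lr, df)        # asymptotic chi-square p-value
```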
(D) Tables of Some Simulation Results

The tables below contain results from the simulation study. For each simulation scenario, we used 500 subjects per run (unless specified otherwise) and conducted 1000 independent runs. The mean, SD, and prediction rate (the proportion of onset times that were correctly predicted) were calculated from the results of the 1000 runs.

1) Effects of different α values (with corresponding odds ratios when the SNP value is 2) on estimation accuracy

Table A1. Estimation results for different α values in Scenario 1

| Odds Ratio | True Value of α | Mean of Estimated α | SD of Estimated α |
|---|---|---|---|
| 1.1 | (-3.080, 1.588) | (-3.091, 1.59) | (0.301, 0.2028) |
| 1.2 | (-3.080, 1.631) | (-3.103, 1.641) | (0.3147, 0.2086) |
| 1.3 | (-3.080, 1.671) | (-3.125, 1.693) | (0.304, 0.2102) |
| 1.4 | (-3.080, 1.708) | (-3.104, 1.713) | (0.3197, 0.2215) |
| 1.5 | (-3.080, 1.743) | (-3.093, 1.751) | (0.3087, 0.212) |
| 2 | (-3.080, 1.887) | (-3.115, 1.905) | (0.2895, 0.2139) |
| 4 | (-3.080, 2.233) | (-3.116, 2.255) | (0.2978, 0.2169) |

Table A2. Estimation results for different α values in Scenario 2

| Odds Ratio | True Value of α | Mean of Estimated α | SD of Estimated α |
|---|---|---|---|
| 1.1 | (-3.080, 2.088, -0.200) | (-3.123, 2.111, -0.202) | (0.4326, 0.2207, 0.0598) |
| 1.2 | (-3.080, 2.131, -0.200) | (-3.114, 2.152, -0.201) | (0.4269, 0.2142, 0.0608) |
| 1.3 | (-3.080, 2.171, -0.200) | (-3.117, 2.187, -0.199) | (0.4174, 0.2175, 0.0608) |
| 1.4 | (-3.080, 2.208, -0.200) | (-3.145, 2.237, -0.199) | (0.4169, 0.2174, 0.0594) |
| 1.5 | (-3.080, 2.243, -0.200) | (-3.112, 2.272, -0.204) | (0.4281, 0.2125, 0.0615) |
| 2 | (-3.080, 2.387, -0.200) | (-3.136, 2.426, -0.201) | (0.4058, 0.2135, 0.0583) |
| 4 | (-3.080, 2.733, -0.200) | (-3.105, 2.766, -0.204) | (0.4584, 0.2387, 0.0711) |

Table A3. More estimation results for different α values in Scenario 2

| Odds Ratio | True Value of α | Mean of Estimated α | SD of Estimated α |
|---|---|---|---|
| 1.1 | (-3.080, 4.230, -1.057) | (-3.201, 4.346, -1.077) | (0.615, 0.4603, 0.1261) |
| 1.2 | (-3.080, 4.230, -1.040) | (-3.16, 4.317, -1.061) | (0.5996, 0.4676, 0.1275) |
| 1.3 | (-3.080, 4.230, -1.024) | (-3.14, 4.284, -1.036) | (0.595, 0.436, 0.1178) |
| 1.4 | (-3.080, 4.230, -1.009) | (-3.152, 4.322, -1.03) | (0.582, 0.424, 0.1155) |
| 1.5 | (-3.080, 4.230, -0.995) | (-3.141, 4.299, -1.011) | (0.5751, 0.4193, 0.1149) |
| 2 | (-3.080, 4.230, -0.937) | (-3.161, 4.309, -0.951) | (0.5529, 0.421, 0.1117) |
| 4 | (-3.080, 4.230, -0.799) | (-3.089, 4.287, -0.816) | (0.4718, 0.3998, 0.1102) |

2) Effects of sample size on estimation accuracy

Table A4. Estimation results for different sample sizes in Scenario 1

| Sample Size | True Value of α | Mean of Estimated α | SD of Estimated α |
|---|---|---|---|
| 50 | (-3.08, 2.23) | (-3.254, 2.363) | (0.9515, 0.7093) |
| 60 | (-3.08, 2.23) | (-3.333, 2.421) | (0.8971, 0.6457) |
| 70 | (-3.08, 2.23) | (-3.214, 2.326) | (0.857, 0.6034) |
| 80 | (-3.08, 2.23) | (-3.239, 2.354) | (0.7859, 0.5473) |
| 100 | (-3.08, 2.23) | (-3.191, 2.303) | (0.7000, 0.5228) |
| 200 | (-3.08, 2.23) | (-3.125, 2.260) | (0.4887, 0.3458) |
| 300 | (-3.08, 2.23) | (-3.125, 2.268) | (0.3986, 0.3115) |
| 500 | (-3.08, 2.23) | (-3.106, 2.249) | (0.2988, 0.2415) |
| 800 | (-3.08, 2.23) | (-3.099, 2.246) | (0.2293, 0.1823) |
| 1000 | (-3.08, 2.23) | (-3.089, 2.234) | (0.2138, 0.1976) |

Table A5. Estimation results for different sample sizes in Scenario 2

| Sample Size | True Value of α | Mean of Estimated α | SD of Estimated α |
|---|---|---|---|
| 50 | (-3.08, 4.23, -0.20) | (-1.935, 3.254, -0.236) | (1.5781, 1.446, 0.2283) |
| 60 | (-3.08, 4.23, -0.20) | (-2.221, 3.505, -0.223) | (1.1664, 0.4207, 0.2267) |
| 70 | (-3.08, 4.23, -0.20) | (-2.194, 3.352, -0.199) | (0.9078, 0.4003, 0.1729) |
| 80 | (-3.08, 4.23, -0.20) | (-2.411, 3.598, -0.208) | (0.8976, 0.4003, 0.1817) |
| 100 | (-3.08, 4.23, -0.20) | (-2.645, 3.842, -0.209) | (0.7825, 0.427, 0.1494) |
| 200 | (-3.08, 4.23, -0.20) | (-2.911, 4.096, -0.209) | (0.6619, 0.497, 0.0966) |
| 300 | (-3.08, 4.23, -0.20) | (-3.106, 4.287, -0.206) | (0.6524, 0.5505, 0.0741) |
| 400 | (-3.08, 4.23, -0.20) | (-3.22, 4.38, -0.202) | (0.6442, 0.5469, 0.072) |
| 500 | (-3.08, 4.23, -0.20) | (-3.187, 4.344, -0.201) | (0.5779, 0.5093, 0.0695) |
| 800 | (-3.08, 4.23, -0.20) | (-3.21, 4.35, -0.199) | (0.5359, 0.456, 0.0501) |
| 1000 | (-3.08, 4.23, -0.20) | (-3.183, 4.322, -0.198) | (0.4651, 0.3899, 0.045) |

3) Joint effects of sequence length and sample size on estimation accuracy
Table A6. Estimation results for different sample sizes in Scenario 1 when sequence length T = 12, early onset k = 6

| Sample Size | True Value of α | Mean of Estimated α | SD of Estimated α |
|---|---|---|---|
| 50 | (-3.08, 2.23) | (-3.254, 2.363) | (0.9515, 0.7093) |
| 60 | (-3.08, 2.23) | (-3.333, 2.421) | (0.8971, 0.6457) |
| 70 | (-3.08, 2.23) | (-3.214, 2.326) | (0.857, 0.6034) |
| 80 | (-3.08, 2.23) | (-3.239, 2.354) | (0.7859, 0.5473) |
| 100 | (-3.08, 2.23) | (-3.191, 2.303) | (0.7000, 0.5228) |
| 200 | (-3.08, 2.23) | (-3.125, 2.260) | (0.4887, 0.3458) |
| 300 | (-3.08, 2.23) | (-3.125, 2.268) | (0.3986, 0.3115) |
| 500 | (-3.08, 2.23) | (-3.106, 2.249) | (0.2988, 0.2415) |
| 800 | (-3.08, 2.23) | (-3.099, 2.246) | (0.2293, 0.1823) |
| 1000 | (-3.08, 2.23) | (-3.089, 2.234) | (0.2138, 0.1976) |

Table A7. Estimation results for different sample sizes in Scenario 1 when sequence length T = 10, early onset k = 5

| Sample Size | True Value of α | Mean of Estimated α | SD of Estimated α |
|---|---|---|---|
| 50 | (-3.08, 2.23) | (-3.32, 2.412) | (0.9531, 0.7238) |
| 60 | (-3.08, 2.23) | (-3.294, 2.381) | (0.9264, 0.6575) |
| 70 | (-3.08, 2.23) | (-3.212, 2.328) | (0.837, 0.5915) |
| 80 | (-3.08, 2.23) | (-3.229, 2.356) | (0.8084, 0.5966) |
| 100 | (-3.08, 2.23) | (-3.215, 2.316) | (0.7548, 0.4889) |
| 200 | (-3.08, 2.23) | (-3.113, 2.258) | (0.4816, 0.3636) |
| 300 | (-3.08, 2.23) | (-3.148, 2.285) | (0.377, 0.286) |
| 500 | (-3.08, 2.23) | (-3.088, 2.23) | (0.2925, 0.2475) |
| 800 | (-3.08, 2.23) | (-3.096, 2.24) | (0.2296, 0.1939) |
| 1000 | (-3.08, 2.23) | (-3.094, 2.242) | (0.2076, 0.1743) |

Table A8. Estimation results for different sample sizes in Scenario 1 when sequence length T = 8, early onset k = 4

| Sample Size | True Value of α | Mean of Estimated α | SD of Estimated α |
|---|---|---|---|
| 50 | (-3.08, 2.23) | (-3.29, 2.392) | (0.9837, 0.786) |
| 60 | (-3.08, 2.23) | (-3.238, 2.341) | (0.8542, 0.6605) |
| 70 | (-3.08, 2.23) | (-3.248, 2.361) | (0.8445, 0.6112) |
| 80 | (-3.08, 2.23) | (-3.279, 2.375) | (0.8163, 0.5938) |
| 100 | (-3.08, 2.23) | (-3.232, 2.342) | (0.6765, 0.5056) |
| 200 | (-3.08, 2.23) | (-3.17, 2.304) | (0.4596, 0.3446) |
| 300 | (-3.08, 2.23) | (-3.145, 2.279) | (0.401, 0.3014) |
| 500 | (-3.08, 2.23) | (-3.106, 2.253) | (0.3072, 0.2524) |
| 800 | (-3.08, 2.23) | (-3.087, 2.227) | (0.2323, 0.2084) |
| 1000 | (-3.08, 2.23) | (-3.105, 2.24) | (0.2195, 0.2218) |

Table A9. Estimation results for different sample sizes in Scenario 1 when sequence length T = 6, early onset k = 3

| Sample Size | True Value of α | Mean of Estimated α | SD of Estimated α |
|---|---|---|---|
| 50 | (-3.08, 2.23) | (-3.37, 2.414) | (0.9925, 0.7267) |
| 60 | (-3.08, 2.23) | (-3.255, 2.356) | (0.9334, 0.7048) |
| 70 | (-3.08, 2.23) | (-3.241, 2.362) | (0.8833, 0.6443) |
| 80 | (-3.08, 2.23) | (-3.246, 2.354) | (0.7951, 0.5562) |
| 100 | (-3.08, 2.23) | (-3.233, 2.336) | (0.6580, 0.5104) |
| 200 | (-3.08, 2.23) | (-3.144, 2.283) | (0.4916, 0.3800) |
| 300 | (-3.08, 2.23) | (-3.135, 2.263) | (0.3886, 0.3031) |
| 500 | (-3.08, 2.23) | (-3.099, 2.249) | (0.3000, 0.2594) |
| 800 | (-3.08, 2.23) | (-3.095, 2.237) | (0.2359, 0.205) |
| 1000 | (-3.08, 2.23) | (-3.103, 2.239) | (0.2099, 0.2253) |

4) Effects of post-onset slope on estimation accuracy

Table A10. Effects of post-onset slope (γ) on estimation and prediction accuracy

| True Value of γ | True Value of α | Mean of Estimated α | SD of Estimated α | Prediction Rate |
|---|---|---|---|---|
| (0.40, -0.06) | (-3.08, 2.23) | (-3.074, 2.192) | (0.302, 0.242) | 0.776 |
| (0.30, -0.045) | (-3.08, 2.23) | (-3.044, 2.136) | (0.308, 0.305) | 0.772 |
| (0.20, -0.03) | (-3.08, 2.23) | (-3.169, 2.129) | (0.791, 0.523) | 0.751 |
| (0.10, -0.015) | (-3.08, 2.23) | (-3.370, 2.930) | (1.110, 2.589) | 0.698 |

5) Effects of data variance on estimation accuracy
Table A11. Effects of data variance (σ²) on estimation and prediction accuracy

| True Value of σ² | True Value of α | Mean of Estimated α | SD of Estimated α | Prediction Rate |
|---|---|---|---|---|
| 0.01 | (-3.08, 2.23) | (-3.095, 2.245) | (0.294, 0.212) | 0.7777 |
| 0.05 | (-3.08, 2.23) | (-3.169, 2.129) | (0.791, 0.523) | 0.751 |
| 0.10 | (-3.08, 2.23) | (-3.541, 3.036) | (1.370, 2.651) | 0.699 |
| 0.15 | (-3.08, 2.23) | (-3.696, 3.988) | (2.380, 3.649) | 0.678 |