Statistical considerations
Alfredo García-Arieta, PhD
Training workshop: Training of BE assessors, Kiev, October 2009

Outline
Basic statistical concepts on equivalence
How to perform the statistical analysis of a 2x2 cross-over bioequivalence study
How to calculate the sample size of a 2x2 cross-over bioequivalence study
How to calculate the CV based on the 90% CI of a BE study

Basic statistical concepts

Type of studies
Superiority studies
– A is better than B (A = active and B = placebo or gold-standard)
– Conventional one-sided hypothesis test
Equivalence studies
– A is more or less like B (A = active and B = standard)
– Two-sided interval hypothesis
Non-inferiority studies
– A is not worse than B (A = active and B = standard with adverse effects)
– One-sided interval hypothesis

Hypothesis test
Conventional hypothesis test (θ: Test/Reference ratio)
– $H_0: \theta = 1$
– $H_1: \theta \neq 1$ (in this case it is two-sided)
If P < 0.05 we can conclude that a statistically significant difference exists
If P ≥ 0.05 we cannot conclude
– With the available power we cannot detect a difference
– But it does not mean that the difference does not exist
– And it does not mean that they are equivalent or equal
We only have certainty when we reject the null hypothesis
– In superiority trials: H1 is for existence of differences
This conventional test is inadequate to conclude about “equalities”
– In fact, it is impossible to conclude “equality”

Null vs. Alternative hypothesis
Fisher, R.A. The Design of Experiments, Oliver and Boyd, London, 1935
“The null hypothesis is never proved or established, but is possibly disproved in the course of experimentation.
Every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis”
Frequent mistake: the absence of statistical significance has been interpreted incorrectly as absence of clinically relevant differences.

Equivalence
We are interested in verifying (instead of rejecting) the null hypothesis of a conventional hypothesis test
We have to redefine the alternative hypothesis as a range of values with an equivalent effect
The differences within this range are considered clinically irrelevant
Problem: it is very difficult to define the maximum difference without clinical relevance for the Cmax and AUC of each drug
Solution: 20% based on a survey among physicians

Interval hypothesis or two one-sided tests
Redefine the null hypothesis: How?
Solution: It is like changing the null to the alternative hypothesis and vice versa.
Alternative hypothesis test: Schuirmann, 1981
– $H_{01}: \theta \le \theta_1$ vs. $H_{a1}: \theta_1 < \theta$
– $H_{02}: \theta \ge \theta_2$ vs. $H_{a2}: \theta < \theta_2$
This is equivalent to:
– $H_0: \theta \le \theta_1$ or $\theta \ge \theta_2$ vs. $H_a: \theta_1 < \theta < \theta_2$
It is called an interval hypothesis because the equivalence hypothesis is in the alternative hypothesis and it is expressed as an interval

Interval hypothesis or two one-sided tests
The new alternative hypothesis is decided with a statistic that follows a distribution that can be approximated by a t-distribution
To conclude bioequivalence a P value < 0.05 has to be obtained in both one-sided tests
The hypothesis tests do not give an idea of the magnitude of equivalence (P < 0.001 vs. 90% CI: 0.95 – 1.05).
That is why confidence intervals are preferred

Point estimate of the difference
If T=R, d=T-R=0
If T>R, d=T-R>0
If T<R, d=T-R<0
[Figure: axis of the difference d; d<0 negative effect, d=0 no difference, d>0 positive effect]

Estimation with confidence intervals in a superiority trial
It is not statistically significant! Because the CI includes the d=0 value
[Figure: confidence interval (90% - 95%) on the d axis, including d=0]

Estimation with confidence intervals in a superiority trial
It is statistically significant! Because the CI does not include the d=0 value
[Figure: confidence interval (90% - 95%) on the d axis, excluding d=0]

Estimation with confidence intervals in a superiority trial
It is statistically significant with P=0.05 because the boundary of the CI touches the d=0 value
[Figure: confidence interval (90% - 95%) on the d axis, with one boundary at d=0]

Equivalence study
[Figure: region of clinical equivalence between -d and +d on the d axis]

Equivalence vs. difference
[Figure: confidence intervals relative to the region of clinical equivalence (-d to +d), each labelled “Equivalent?” and “Different?” with Yes, No or ?]

Non-inferiority study
[Figure: confidence intervals relative to the inferiority limit (-d), each labelled “Inferior?” with Yes, No or ?]

Superiority study (?)
[Figure: confidence intervals relative to the superiority limit (+d), each labelled “Superior?”. Labels shown: No, not clinically and ? statistically; No, not clinically, but yes statistically; ?, but yes statistically; Yes, statistically & clinically; Yes, but only the point estimate]

How to perform the statistical analysis of a 2x2 cross-over bioequivalence study

Statistical Analysis of BE studies
Sponsors have to use validated software
– E.g. SAS, SPSS, WinNonlin, etc.
In the past, it was possible to find statistical analyses performed with incorrect software.
– Calculations based on arithmetic means, instead of Least Square Means, give biased results in unbalanced studies
  • Unbalance: different number of subjects in each sequence
– Calculations for replicate designs are more complex and prone to mistakes

The statistical analysis is not so complex
2x2 BE trial, N=12     Period 1   Period 2
Sequence 1 (BA)        Y11        Y12
Sequence 2 (AB)        Y21        Y22
BA is RT; AB is TR

We don’t need to calculate an ANOVA table
Sources of variation     d.f.   SS         MS       F       P
Inter-subject            23     16487.49   716.85   4.286
  Carry-over             1      276.00     276.00   0.375   0.5468
  Residual / subjects    22     16211.49   736.89   4.406   0.0005
Intra-subject                   3778.19
  Formulation            1      62.79      62.79    0.375   0.5463
  Period                 1      35.97      35.97    0.215   0.6474
  Residual               22     3679.43    167.25
Total                    47     20265.68

With complex formulae
(i = subject, j = period, k = sequence; bars denote means, dotted symbols without a bar denote totals over the dotted index)
$$SS_{Total} = \sum_{k=1}^{2}\sum_{j=1}^{2}\sum_{i=1}^{n_k}\left(Y_{ijk}-\bar{Y}_{\cdot\cdot\cdot}\right)^2$$
$$SS_{Within} = \sum_{k=1}^{2}\sum_{j=1}^{2}\sum_{i=1}^{n_k}\left(Y_{ijk}-\bar{Y}_{i\cdot k}\right)^2$$
$$SS_{Between} = 2\sum_{k=1}^{2}\sum_{i=1}^{n_k}\left(\bar{Y}_{i\cdot k}-\bar{Y}_{\cdot\cdot\cdot}\right)^2$$

More complex formulae
$$SS_{Between} = SS_{Carry} + SS_{Inter}$$
$$SS_{Carry} = \frac{2n_1n_2}{n_1+n_2}\left\{\frac{1}{2}\left[\left(\bar{Y}_{\cdot 12}+\bar{Y}_{\cdot 22}\right)-\left(\bar{Y}_{\cdot 11}+\bar{Y}_{\cdot 21}\right)\right]\right\}^2$$
$$SS_{Inter} = \sum_{k=1}^{2}\sum_{i=1}^{n_k}\frac{Y_{i\cdot k}^2}{2}-\sum_{k=1}^{2}\frac{Y_{\cdot\cdot k}^2}{2n_k}$$

And really complex formulae
$$SS_{Within} = SS_{Drug} + SS_{Period} + SS_{Intra}$$
$$SS_{Drug} = \frac{2n_1n_2}{n_1+n_2}\left\{\frac{1}{2}\left[\left(\bar{Y}_{\cdot 21}-\bar{Y}_{\cdot 11}\right)-\left(\bar{Y}_{\cdot 22}-\bar{Y}_{\cdot 12}\right)\right]\right\}^2$$
$$SS_{Period} = \frac{2n_1n_2}{n_1+n_2}\left\{\frac{1}{2}\left[\left(\bar{Y}_{\cdot 21}-\bar{Y}_{\cdot 11}\right)-\left(\bar{Y}_{\cdot 12}-\bar{Y}_{\cdot 22}\right)\right]\right\}^2$$
$$SS_{Intra} = \sum_{k=1}^{2}\sum_{j=1}^{2}\sum_{i=1}^{n_k}Y_{ijk}^2-\sum_{k=1}^{2}\sum_{i=1}^{n_k}\frac{Y_{i\cdot k}^2}{2}-\sum_{k=1}^{2}\sum_{j=1}^{2}\frac{Y_{\cdot jk}^2}{n_k}+\sum_{k=1}^{2}\frac{Y_{\cdot\cdot k}^2}{2n_k}$$

Given the following data, it is simple
2x2 BE trial, N=12     Period 1                       Period 2
Sequence 1 (BA)        Y11: 75, 95, 90, 80, 70, 85    Y12: 70, 90, 95, 70, 60, 70
Sequence 2 (AB)        Y21: 75, 85, 80, 90, 50, 65    Y22: 40, 50, 70, 80, 70, 95

First, log-transform the data
2x2 BE trial, N=12
Sequence 1 (BA), Period 1, Y11: 4.3175, 4.5539, 4.4998, 4.3820, 4.2485, 4.4427
Sequence 1 (BA), Period 2, Y12: 4.2485, 4.4998, 4.5539, 4.2485, 4.0943, 4.2485
Sequence 2 (AB), Period 1, Y21: 4.3175, 4.4427, 4.3820, 4.4998, 3.9120, 4.1744
Sequence 2 (AB), Period 2, Y22: 3.6889, 3.9120, 4.2485, 4.3820, 4.2485, 4.5539

Second, calculate the arithmetic mean of each period and sequence
2x2 BE trial, N=12     Period 1       Period 2
Sequence 1 (BA)        Y11 = 4.407    Y12 = 4.316
Sequence 2 (AB)        Y21 = 4.288    Y22 = 4.172

Note the difference between Arithmetic
Mean and Least Square Mean
The arithmetic mean (AM) of T (or R) is the mean of all observations with T (or R) irrespective of its group or sequence
– All observations have the same weight
The LSM of T (or R) is the mean of the two sequence-by-period means
– In case of balanced studies AM = LSM
– In case of unbalanced studies observations in sequences with fewer subjects have more weight
– In case of a large unbalance between sequences due to dropouts or withdrawals the bias of the AM is notable

Third, calculate the LSM of T and R
2x2 BE trial, N=12     Period 1       Period 2
Sequence 1 (BA)        Y11 = 4.407    Y12 = 4.316
Sequence 2 (AB)        Y21 = 4.288    Y22 = 4.172
LSM of B (Reference) = (Y11 + Y22) / 2 = 4.2898
LSM of A (Test) = (Y12 + Y21) / 2 = 4.3018

Fourth, calculate the point estimate
F = LSM Test (A) – LSM Reference (B)
F = 4.30183 – 4.28985 = 0.01198

Fifth step! Back-transform to the original scale
Point estimate = e^F = e^0.01198 = 1.01205
Five very simple steps to calculate the point estimate!!!

Now we need to calculate the variability!
Step 1: Calculate the difference between periods for each subject and divide it by 2: (P2-P1)/2
Step 2: Calculate the mean of these differences within each sequence to obtain 2 means: d1 and d2
Step 3: Calculate the difference between “the difference in each subject” and “its corresponding sequence mean”, and square it.
Step 4: Sum these squared differences
Step 5: Divide it by (n1+n2-2), where n1 and n2 are the numbers of subjects in each sequence. In this example 6+6-2 = 10
– This value multiplied by 2 is the MSE
– CV (%) = 100 x √(e^MSE - 1)

This can be done easily in a spreadsheet!
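The five variability steps, together with the point estimate from the preceding slides, can also be scripted rather than typed into a spreadsheet. The following is a minimal Python sketch, not a validated analysis tool: the helper and variable names (half_differences, squared_deviations, sigma2_d and so on) are illustrative assumptions that do not appear in the slides, and the data are the twelve subjects of the worked example.

```python
import math

# Raw data of the worked example (A = Test, B = Reference).
# Sequence 1 is BA (Reference first); sequence 2 is AB (Test first).
seq1 = {"p1": [75, 95, 90, 80, 70, 85], "p2": [70, 90, 95, 70, 60, 70]}
seq2 = {"p1": [75, 85, 80, 90, 50, 65], "p2": [40, 50, 70, 80, 70, 95]}


def mean(values):
    return sum(values) / len(values)


def logs(values):
    return [math.log(v) for v in values]


def half_differences(seq):
    """Step 1: (P2 - P1) / 2 on the log scale, one value per subject."""
    return [(b - a) / 2 for a, b in zip(logs(seq["p1"]), logs(seq["p2"]))]


def squared_deviations(values):
    """Steps 2 and 3: squared deviation of each value from its sequence mean."""
    m = mean(values)
    return [(v - m) ** 2 for v in values]


# Point estimate: each LSM is the mean of two sequence-by-period means.
lsm_test = (mean(logs(seq1["p2"])) + mean(logs(seq2["p1"]))) / 2
lsm_ref = (mean(logs(seq1["p1"])) + mean(logs(seq2["p2"]))) / 2
point_estimate = math.exp(lsm_test - lsm_ref)

# Variability: Steps 1 to 5.
d1, d2 = half_differences(seq1), half_differences(seq2)
ss = sum(squared_deviations(d1)) + sum(squared_deviations(d2))  # Step 4
sigma2_d = ss / (len(d1) + len(d2) - 2)                         # Step 5
mse = 2 * sigma2_d
cv_percent = 100 * math.sqrt(math.exp(mse) - 1)

print(f"point estimate = {point_estimate:.5f}")
print(f"Sigma2(d) = {sigma2_d:.8f}, MSE = {mse:.8f}, CV = {cv_percent:.2f}%")
```

Because each LSM is built from sequence-by-period means, the same script would still weight the two sequences equally if the sequences had different numbers of subjects.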
Sequence 1 (RT), n1 = 6
Period I (R)   Period II (T)   Step 1: P2-P1   Step 1: (P2-P1)/2   Step 3: d - mean d   Step 3: squared
4.31748811     4.24849524      -0.06899287     -0.03449644          0.01140614          0.0001301
4.55387689     4.49980967      -0.05406722     -0.02703361          0.01886897          0.00035604
4.49980967     4.55387689       0.05406722      0.02703361          0.07293619          0.00531969
4.38202663     4.24849524      -0.13353139     -0.0667657          -0.02086312          0.00043527
4.24849524     4.09434456      -0.15415068     -0.07707534         -0.03117276          0.00097174
4.44265126     4.24849524      -0.19415601     -0.09707801         -0.05117543          0.00261892
Step 2: Mean                   -0.09180516     -0.04590258 (= d1)

Sequence 2 (TR), n2 = 6
Period I (T)   Period II (R)   Step 1: P2-P1   Step 1: (P2-P1)/2   Step 3: d - mean d   Step 3: squared
4.3175         3.6889          -0.62860866     -0.31430433         -0.25642187          0.06575218
4.4427         3.9120          -0.53062825     -0.26531413         -0.20743167          0.0430279
4.3820         4.2485          -0.13353139     -0.0667657          -0.00888324          7.8912E-05
4.4998         4.3820          -0.11778304     -0.05889152         -0.00100906          1.0182E-06
3.9120         4.2485           0.33647224      0.16823612          0.22611858          0.05112961
4.1744         4.5539           0.37948962      0.18974481          0.24762727          0.06131926
Step 2: Mean                   -0.11576491     -0.05788246 (= d2)

Step 4: Sum = 0.23114064
Step 5: Sigma2(d) = 0.02311406; MSE = 0.04622813; CV = 21.7516218

Step 1: Calculate the difference between periods for each subject and divide it by 2: (P2-P1)/2
Step 2: Calculate the mean of these differences within each sequence to obtain 2 means: d1 & d2 (d1 = -0.04590258, d2 = -0.05788246)
Step 3: Squared differences
Step 4: Sum these squared differences (Sum = 0.23114064)
Step 5: Divide the sum by n1+n2-2 (Sigma2(d) = 0.02311406; MSE = 0.04622813; CV = 21.7516218 %)

Calculate the confidence interval with point estimate and variability
Step 11: In log-scale 90% CI: F ± t(0.1, n1+n2-2) · √(Sigma2(d) x (1/n1+1/n2))
F has been calculated before
The t value is obtained from Student's t tables with a two-sided alpha of 0.1 and n1+n2-2 degrees of freedom
– Or in MS Excel with the formula =DISTR.T.INV(0.1; n1+n2-2) (TINV in English-language Excel)
Sigma2(d) has been calculated before.

Final calculation: the 90% CI
Log-scale 90% CI: F ± t(0.1, n1+n2-2) · √(Sigma2(d) · (1/n1+1/n2))
F = 0.01198
t(0.1, n1+n2-2) = 1.8124611
Sigma2(d) = 0.02311406
90% CI: LL = -0.14711 to UL = 0.17107
Step 12: Back-transform the limits with e^LL and e^UL
e^LL = e^-0.14711 = 0.8632 and e^UL = e^0.17107 = 1.1866

How to calculate the sample size of a 2x2 cross-over bioequivalence study

Reasons for a correct calculation of the sample size
Too many subjects
– It is unethical to disturb more subjects than necessary
– Some subjects are put at risk unnecessarily
– It is an unnecessary waste of some resources ($)
Too few subjects
– A study unable to reach its objective is unethical
– All subjects are put at risk for nothing
– All resources ($) are wasted when the study is inconclusive

Frequent mistakes
To calculate the sample size required to detect a 20% difference assuming that treatments are e.g.
equal
– Pocock, Clinical Trials, 1983
To use calculations based on data without log-transformation
– Design and Analysis of Bioavailability and Bioequivalence Studies, Chow & Liu, 1992 (1st edition) and 2000 (2nd edition)
Too many extra subjects. Usually no need of more than 10%. Depends on tolerability
– 10% proposed by Patterson et al, Eur J Clin Pharmacol 57: 663-670 (2001)

Methods to calculate the sample size
Exact value has to be obtained with power curves
Approximate values are obtained based on formulae
– Best approximation: iterative process (t-test)
– Acceptable approximation: based on Normal distribution
Calculations are different when we assume products are really equal and when we assume products are slightly different
Any minor deviation is masked by the extra subjects included to compensate for drop-outs and withdrawals (10%)

Calculation assuming that treatments are equal
$$N = \frac{2\,s_w^2\left(Z_{1-\beta/2}+Z_{1-\alpha}\right)^2}{\left(\ln 1.25\right)^2}$$
$$s_w^2 = \ln\left(1+CV^2\right)$$
CV expressed as 0.3 for 30%
Z(1-(β/2)) = DISTR.NORM.ESTAND.INV(0.05) for 90% power (1-β)
Z(1-(β/2)) = DISTR.NORM.ESTAND.INV(0.1) for 80% power (1-β)
Z(1-α) = DISTR.NORM.ESTAND.INV(0.05) for 5% α
(DISTR.NORM.ESTAND.INV is NORMSINV in English-language Excel. Called this way it returns the negative lower-tail quantile; the sign does not matter because the sum of the two Z values is squared.)

Example of calculation assuming that treatments are equal
If we desire 80% power, Z(1-(β/2)) = -1.281551566
Consumer risk always 5%, Z(1-α) = -1.644853627
The equation becomes: N = 343.977655 x S2
Given a CV of 30%, S2 = 0.086177696
Then N = 29.64
We have to round up to the next even number: 30
Plus e.g.
4 extra subjects in case of drop-outs

Example of calculation assuming that treatments are equal
If we desire 90% power, Z(1-(β/2)) = -1.644853627
Consumer risk always 5%, Z(1-α) = -1.644853627
The equation becomes: N = 434.686167 x S2
Given a CV of 25%, S2 = 0.06062462
Then N = 26.35
We have to round up to the next even number: 28
Plus e.g. 4 extra subjects in case of drop-outs

Calculation assuming that treatments are not equal
$$N = \frac{2\,s_w^2\left(Z_{1-\beta}+Z_{1-\alpha}\right)^2}{\left(\ln(T/R)-\ln 1.25\right)^2} \quad \text{for } T/R > 1$$
Z(1-β) = DISTR.NORM.ESTAND.INV(0.1) for 90% power (1-β)
Z(1-β) = DISTR.NORM.ESTAND.INV(0.2) for 80% power (1-β)
Z(1-α) = DISTR.NORM.ESTAND.INV(0.05) for 5% α
(DISTR.NORM.ESTAND.INV is NORMSINV in English-language Excel.)

Example of calculation assuming that treatments are 5% different
If we desire 90% power, Z(1-β) = -1.28155157
Consumer risk always 5%, Z(1-α) = -1.644853627
If we assume that T/R = 1.05
The equation becomes: N = 563.427623 x S2
Given a CV of 40%, S2 = 0.14842001
Then N = 83.62
We have to round up to the next even number: 84
Plus e.g. 8 extra subjects in case of drop-outs

Example of calculation assuming that treatments are 5% different
If we desire 80% power, Z(1-β) = -0.84162123
Consumer risk always 5%, Z(1-α) = -1.644853627
If we assume that T/R = 1.05
The equation becomes: N = 406.75918 x S2
Given a CV of 20%, S2 = 0.03922071
Then N = 15.95
We have to round up to the next even number: 16
Plus e.g.
2 extra subjects in case of drop-outs

Example of calculation assuming that treatments are 10% different
If we desire 80% power, Z(1-β) = -0.84162123
Consumer risk always 5%, Z(1-α) = -1.644853627
If we assume that T/R = 1.11
The equation becomes: N = 876.366247 x S2
Given a CV of 20%, S2 = 0.03922071
Then N = 34.37
We have to round up to the next even number: 36
Plus e.g. 4 extra subjects in case of drop-outs

How to calculate the CV based on the 90% CI of a BE study

Example of calculation of the CV based on the 90% CI
Given a 90% CI: 82.46 to 111.99 in a BE study with N=24
Log-transform the 90% CI: 4.4123 to 4.7184
The mean of these extremes is the point estimate: 4.5654
Back-transform to the original scale: e^4.5654 = 96.10
The half-width in log-scale is 4.7184 – 4.5654 = 0.1530
With the sample size calculate the t-value. How?
– Based on Student's t tables or a computer (MS Excel)

Example of calculation of the CV based on the 90% CI
Given N = 24, the degrees of freedom are 22
t = DISTR.T.INV(0.1;n-2) = 1.7171 (TINV in English-language Excel)
Standard error of the difference (SE(d)) = Half-width / t-value = 0.1530 / 1.7171 = 0.0891
Square it: 0.0891^2 = 0.0079 and divide it by 2 = 0.0040
Multiply it by the sample size: 0.0040 x 24 = 0.0953 = MSE
CV (%) = 100 x √(e^MSE - 1) = 100 x √(e^0.0953 - 1) = 31.63 %
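
The back-calculation of the CV and the normal-approximation sample-size formulae from the earlier slides can be combined in a short script, for instance to size a new study from a reported 90% CI. The Python sketch below is illustrative only: cv_from_ci, sample_size and round_up_even are hypothetical helper names, a balanced design (equal numbers of subjects per sequence) is assumed, the t-value is passed in by hand as read from tables or a spreadsheet, and only 1 <= T/R < 1.25 is handled.

```python
import math
from statistics import NormalDist


def cv_from_ci(lower, upper, n_total, t_value):
    """Intra-subject CV (as a fraction) recovered from a 90% CI.

    lower, upper: CI limits on the original scale (e.g. 82.46 and 111.99).
    n_total: total number of subjects, assumed evenly split between sequences.
    t_value: two-sided 10% Student's t quantile for n_total - 2 d.f.
    """
    half_width = (math.log(upper) - math.log(lower)) / 2
    se = half_width / t_value          # standard error of the difference
    mse = se ** 2 * n_total / 2        # SE^2 = 2 * MSE / N in a balanced 2x2
    return math.sqrt(math.exp(mse) - 1)


def sample_size(cv, ratio=1.0, power=0.80, alpha=0.05):
    """Unrounded N from the normal approximation (upper limit 1.25)."""
    if not 1.0 <= ratio < 1.25:
        raise ValueError("this sketch only covers 1 <= T/R < 1.25")
    z = NormalDist().inv_cdf
    s2w = math.log(1 + cv ** 2)
    beta = 1 - power
    # Treatments assumed equal: beta is split over both limits (beta / 2).
    z_beta = z(1 - beta / 2) if ratio == 1.0 else z(1 - beta)
    z_alpha = z(1 - alpha)
    denominator = (math.log(ratio) - math.log(1.25)) ** 2
    return 2 * s2w * (z_beta + z_alpha) ** 2 / denominator


def round_up_even(n):
    """Round up to the next even number so both sequences have equal size."""
    return 2 * math.ceil(n / 2)


cv = cv_from_ci(82.46, 111.99, n_total=24, t_value=1.7171)
print(f"CV = {100 * cv:.2f}%")
print("N (CV 30%, T/R = 1, 80% power):", round_up_even(sample_size(0.30)))
print("N (CV 40%, T/R = 1.05, 90% power):",
      round_up_even(sample_size(0.40, ratio=1.05, power=0.90)))
```

The positive upper-tail quantiles are used here; they differ from the spreadsheet values only in sign, which disappears when the sum is squared. Extra subjects for drop-outs are left to the user, as in the slides.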