New Calibrated Bayesian Internal Goodness-of-Fit Methods: Sampled Posterior P-values as Simple and General P-values that Allow Double Use of the Data

Frédéric Gosselin
Cemagref, UR EFNO, F-45290 Nogent-sur-Vernisson, France
E-mail: frederic.gosselin@cemagref.fr

Results of Scenario 1

Text S1. Scenario 1 psp and pnsp results

In the following tables, we present the results for psp (sampled posterior p-values) and its normalized version, pnsp, in the case of Scenario 1:

Scenario 1. In our first scenario, the model that generated the data and the model used to fit the data were exactly the same, including the prior distribution. The common priors were:

– for the Poisson case: $\lambda \sim \mathrm{Gamma}(\alpha_0, \beta_0)$ with $\alpha_0/\beta_0 = \theta_0 = \exp(1)$ and $(\alpha_0/\beta_0^2)/(\alpha_0/\beta_0) = \rho_0 \sim \mathrm{Uniform}(0;2) + .05$;
– for the Normal case: $1/\sigma^2 \sim \mathrm{Gamma}(\alpha_0, \beta_0)$ and $\theta \sim \mathrm{Normal}(\theta_0, \sigma^2/\sigma_0^2)$ with $\alpha_0/\beta_0 = 1$, $\alpha_0/\beta_0^2 = \rho_0 \sim 10^{\mathrm{Uniform}(0;1)\ \text{[operator illegible]}\ 1}$, $\theta_0 = 0$ and $\sigma_0 = 1$;
– for the Bernoulli case: $\theta \sim \mathrm{Beta}(\alpha_0, \beta_0)$ with $\alpha_0 = 2\theta_0\rho_0$ and $\beta_0 = 2(1-\theta_0)\rho_0$, with $\rho_0 \sim 10^{\mathrm{Uniform}(0;1)}$ and $\theta_0 = .5$.

In the following tables, we display the Kolmogorov-Smirnov statistic of the comparison of the p-values with a uniform distribution (ks.D), the proportion of values in the 5% extreme positions on the interval [0;1] (p.05), and the same for 1% (p.01). A total of 10,000 data sets were sampled and analysed. n denotes the sample size and # the number of simulations corresponding to the sample size.

For proportions, we used tail-area tests to detect significant departures from the point null hypotheses and also studied whether the posterior density of the underlying proportion was negligibly or non-negligibly different from the target proportions. The notation for the significance of the tests is as follows: (*) means that the test is significant at a level between .05 and .1; * between .01 and .05; ** between .0001 and .01; *** less than .0001.
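The summary statistics named above (ks.D, p.05 and p.01) can be illustrated with a minimal Python sketch. It is not the code used for the tables. The function names are hypothetical, and the sketch assumes that "the 5% extreme positions" means the lowest 2.5% and the highest 2.5% of the interval [0;1] (and likewise 0.5% on each side for p.01), which the text does not state explicitly.

```python
def ks_uniform(pvalues):
    """One-sample Kolmogorov-Smirnov statistic D against Uniform(0, 1)."""
    xs = sorted(pvalues)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # Largest gap between the empirical CDF (just before and at x)
        # and the uniform CDF, which equals x on [0, 1].
        d = max(d, i / n - x, x - (i - 1) / n)
    return d


def extreme_proportion(pvalues, level):
    """Share of p-values in the `level` most extreme positions of [0, 1].

    Assumption: the extreme region is split equally between both ends.
    """
    half = level / 2.0
    hits = sum(1 for p in pvalues if p < half or p > 1.0 - half)
    return hits / len(pvalues)


if __name__ == "__main__":
    # Perfectly regular grid of p-values, used only as a sanity check.
    grid = [(i + 0.5) / 1000 for i in range(1000)]
    print(ks_uniform(grid), extreme_proportion(grid, 0.05), extreme_proportion(grid, 0.01))
```

For calibrated p-values, ks.D should be small and the two proportions should be close to .05 and .01 respectively.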
The notation for the study of negligible/non-negligible departures from the expected values was as follows:

– for the proportion of the p-values in the 5% extreme positions on [0;1] (p.05): 00 (respectively, 0) means that 95% of the estimated values of the underlying p-value are in the interval [.045;.055] (resp. [.04;.06]); ++ (respectively, +) means that 95% of the estimated values of the underlying p-value are in the interval ].06;1] (resp. ].055;1]); -- (respectively, -) means that 95% of the estimated values of the underlying p-value are in the interval [0;.04[ (resp. [0;.045[);
– for the proportion of the p-values in the 1% extreme positions on [0;1] (p.01): 00 (respectively, 0) means that 95% of the estimated values of the underlying p-value are in the interval [.009;.011] (resp. [.008;.012]); ++ (respectively, +) means that 95% of the estimated values of the underlying p-value are in the interval ].012;1] (resp. ].011;1]); -- (respectively, -) means that 95% of the estimated values of the underlying p-value are in the interval [0;.008[ (resp. [0;.009[).

In the following results, some departure from the uniform distribution may seem to appear for p.01 and large values of n, in the case of psp with t = mean or psp with d = meanc. We checked this specific situation with independent simulations on a larger sample size (20,000 data sets generated) and confirmed that there was no significant departure from the uniform distribution (results not shown).
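The labelling rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used for the tables. The function name `classify_departure` is hypothetical. The sketch assumes a uniform Beta(1, 1) prior on the underlying proportion, Monte Carlo draws from the resulting Beta posterior as the "estimated values", and "95%" read as "at least 95%". The intervals are written relative to the target (10% and 20% on either side), which reproduces the bounds given for both p.05 and p.01.

```python
import random


def classify_departure(hits, total, target, draws=20000, seed=0):
    """Return '00', '0', '++', '+', '--', '-' or '' for an observed proportion.

    Assumption: Beta(1, 1) prior, hence a Beta(1 + hits, 1 + total - hits)
    posterior for the underlying proportion.
    """
    rng = random.Random(seed)
    samples = [rng.betavariate(1 + hits, 1 + total - hits) for _ in range(draws)]

    def share(lower, upper):
        # Fraction of posterior draws falling between lower and upper.
        return sum(lower <= s <= upper for s in samples) / draws

    narrow = (0.9 * target, 1.1 * target)  # [.045;.055] when target = .05
    wide = (0.8 * target, 1.2 * target)    # [.04;.06] when target = .05
    if share(*narrow) >= 0.95:
        return "00"
    if share(*wide) >= 0.95:
        return "0"
    if share(wide[1], 1.0) >= 0.95:
        return "++"
    if share(narrow[1], 1.0) >= 0.95:
        return "+"
    if share(0.0, wide[0]) >= 0.95:
        return "--"
    if share(0.0, narrow[0]) >= 0.95:
        return "-"
    return ""


if __name__ == "__main__":
    print(classify_departure(500, 10000, 0.05))
    print(classify_departure(125, 2500, 0.05))
```

With roughly 10,000 simulations the narrow interval can be reached, whereas with a quarter of that number only the wider interval usually can, which is consistent with 00 appearing mostly in the ALL column of the tables.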
Case 1: Poisson

pnsp, t = mean
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
ks.D 0.011 0.010 0.021 0.017 0.007
p.05 0.050 0.051 0.048 0.050 0.050 0 0 0 0 00
p.01 0.010 0.012 0.011 0.007 0.010

ks.D 0.012 0.014 0.024 (*) 0.019 0.010
p.05 0.052 0 0.047 0 0.055 0.042 (*) 0.050 00
p.01 0.013 0.010 0.012 0.008 0.011

ks.D 0.016 0.015 0.013 0.019 0.010
p.05 0.046 0.050 0.053 0.053 0.050 00
p.01 0.009 0.011 0.010 0.009 0.010 0

ks.D 0.012 0.023 0.015 0.028 (*) 0.012
p.05 0.053 0.054 0.042 0.048 0.049 * 0 00
p.01 0.010 0.011 0.008 0.010 0.010 0

ks.D 0.020 0.014 0.018 0.009 0.005
p.05 0.050 0.055 0.044 0.046 0.049

pnsp, t = var
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL

pnsp, t = skew
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0

pnsp, t = kurt
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL

pnsp, t = Z a
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0 00
p.01 0.010 0.013 0.008 0.008 0.010

psp, t = mean
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
ks.D 0.013 0.016 0.018 0.013 0.005
p.05 0.048 0.054 0.051 0.056 0.052

ks.D 0.010 0.016 0.017 0.015 0.008
p.05 0.047 0 0.047 0 0.057 (*) 0.044 0.049 00
p.01 0.009 0.007 0.012 0.009 0.010

ks.D 0.024 (*) 0.008 0.017 0.024 0.006
p.05 0.052 0.052 0.052 0.054 0.053
p.01 0.010 0.011 0.009 0.011 0.010

ks.D 0.012 0.016 0.019 0.014 0.005
p.05 0.049 0.053 0.054 0.054 0.052 0 0 0
p.01 0.011 0.013 0.011 0.007 0.011

psp, t = var
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0

psp, t = p0
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0 0

psp, d = meanc
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0 0
p.01 0.010 0.013 0.012 0.005 *,0.010

psp, d = varc
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
ks.D 0.011 0.015 0.016 0.015 0.009
p.05 0.047 0 0.047 0 0.057 (*) 0.046 0.050 00
p.01 0.010 0.007 0.012 0.008 0.009

ks.D 0.012 0.021 0.011 0.022 0.009
p.05 0.050 0.048 0.053 0.049 0.050
p.01 0.012 0.008 0.009 0.007 (*) 0.009

psp, d = LL
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0 0 0 00

Case 2: Normal

pnsp, t = mean
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
ks.D 0.016 0.019 0.027 * 0.018 0.013 (*)
p.05 0.050 0.052 0.046 0.052 0.050

ks.D 0.015 0.010 0.021 0.026 0.011
p.05 0.049 0.050 0.048 0.047 0.049

ks.D 0.012 0.010 0.016 0.014 0.006
p.05 0.053 0.053 0.046 0.044 0.049

ks.D 0.011 0.020 0.022 0.025 0.009
p.05 0.048 0.051 0.045 0.049 0.048

ks.D 0.014 0.011 0.018 0.021 0.008
p.05 0.050 0.050 0.048 0.055 0.050 0 0 0 00
p.01 0.009 0.010 0.009 0.005 *,0.008

pnsp, t = var
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
00
p.01 0.011 0.008 0.011 0.008 0.009 00
p.01 0.010 0.011 0.010 0.009 0.010 0 0 0

pnsp, t = skew
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL

pnsp, t = kurt
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0 0 0 0
p.01 0.011 0.011 0.009 0.009 0.010

pnsp, t = Z a
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0 0 0 00
p.01 0.009 0.012 0.008 0.014 (*) 0.011

psp, t = mean
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
ks.D 0.016 0.019 0.027 * 0.017 0.013 (*)
p.05 0.050 0.052 0.046 0.051 0.050

ks.D 0.015 0.010 0.021 0.027 0.011
p.05 0.049 0.050 0.049 0.046 0.049 00
p.01 0.010 0.007 0.011 0.007 0.009

ks.D 0.011 0.019 0.024 (*) 0.022 0.006
p.05 0.046 0.048 0 0.042 (*) 0.057 0.048 0
p.01 0.009 0.011 0.009 0.009 0.009

ks.D 0.016 0.018 0.026 * 0.018 0.013 (*)
p.05 0.051 0.051 0.046 0.054 0.050
p.01 0.007 (*) 0.010 0.009 0.005 **,0.008 *

ks.D 0.014 0.010 0.022 0.026 0.011
p.05 0.049 0.050 0.050 0.047 0.049 0 0 00
p.01 0.008 0.010 0.009 0.005 *,0.008 (*)

psp, t = var
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0 0 0

psp, t = p0
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL

psp, d = meanc
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0 0 00

psp, d = varc
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0 0 0 00
p.01 0.012 0.008 0.011 0.008 0.010
0

psp, d = LL
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
ks.D 0.019 0.011 0.022 0.024 0.012
p.05 0.047 0.051 0.048 0.045 0.048 0 0 0 0
p.01 0.011 0.009 0.012 0.007 0.010
0

Case 3: Bernoulli

pnsp, t = mean
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
ks.D 0.015 0.010 0.009 0.018 0.005
p.05 0.049 0.056 0.047 0.049 0.050 0 0 00
p.01 0.009 0.013 0.011 0.012 0.011

ks.D 0.026 (*) 0.012 0.018 0.015 0.013 (*)
p.05 0.054 0.045 0.055 0.054 0.052 0
p.01 0.009 0.009 0.012 0.009 0.010

ks.D 0.027 0.016 0.013 0.021 0.008
p.05 0.057 (*) 0.047 0 0.046 0.056 0.051 0
p.01 0.013 0.013 0.011 0.007 0.011

ks.D 0.015 0.012 0.018 0.014 0.007
p.05 0.049 0.047 0.045 0.051 0.048
p.01 0.014 (*) 0.009 0.012 0.011 0.011

ks.D 0.021 0.009 0.011 0.022 0.010
p.05 0.051 0.049 0.045 0.055 0.050 0

pnsp, t = var
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0

pnsp, t = skew
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
*

pnsp, t = kurt
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0 0 0 0

pnsp, t = Z a
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0 0 00
p.01 0.012 0.009 0.009 0.011 0.010

psp, t = mean
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
ks.D 0.012 0.011 0.009 0.013 0.006
p.05 0.048 0.053 0.047 0.056 0.051 0 0 00
p.01 0.010 0.011 0.011 0.009 0.011

psp, t = var
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
ks.D 0.027 0.014 0.023 0.022 0.005 *
p.05 0.054 0.049 0.047 0.052 0.050 0 0 00
p.01 0.011 0.009 0.011 0.010 0.010

psp, t = p0
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
ks.D 0.009 0.015 0.012 0.013 0.006
p.05 0.046 0.049 0.048 0.056 0.049

ks.D 0.012 0.011 0.009 0.013 0.006
p.05 0.048 0.053 0.047 0.056 0.051

ks.D 0.028 * 0.013 0.024 (*) 0.022 0.007
p.05 0.053 0.051 0.047 0.052 0.051 0 0 00
p.01 0.009 0.012 0.011 0.011 0.011

psp, d = meanc
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0 0 00
p.01 0.010 0.011 0.011 0.009 0.011

psp, d = varc
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
0 0 00
p.01 0.010 0.008 0.011 0.009 0.010
0

psp, d = LL
1 2 3 4 5
n [20, 80) [80, 300) [300, 600) [600, 1000] ALL
ks.D 0.018 0.016 0.025 (*) 0.022 0.007
p.05 0.047 0.053 0.049 0.055 0.051 0 0 00
p.01 0.009 0.011 0.011 0.010 0.010
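To make the Poisson case of Scenario 1 concrete, the following Python sketch simulates data from the Gamma prior given in the scenario description, draws one value of λ from the conjugate posterior, and computes a p-value for the statistic t = mean conditional on that draw. It is a hedged illustration, not the code behind the tables. All function names are hypothetical. The sketch assumes a rate parameterization of the Gamma prior (mean α0/β0), sample sizes drawn uniformly from the first class [20, 80), a one-sided upper-tail p-value, and a uniform randomization of the tie probability to handle the discreteness of the Poisson sum. None of these choices is stated in this text. Only the data total is simulated, since the sum of n independent Poisson(λ) counts is Poisson(nλ).

```python
import math
import random


def sample_poisson(rng, mu):
    """Draw a Poisson(mu) count by counting unit-rate arrivals in [0, mu]."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed <= mu:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def poisson_upper_tail(s, mu):
    """Return (P(S > s), P(S = s)) for S ~ Poisson(mu)."""
    pmf = [math.exp(k * math.log(mu) - mu - math.lgamma(k + 1)) for k in range(s + 1)]
    cdf = min(1.0, math.fsum(pmf))
    return 1.0 - cdf, pmf[s]


def sampled_posterior_pvalue(rng, total, n, alpha0, beta0):
    """p-value of the data total, conditional on ONE posterior draw of lambda."""
    # random.gammavariate takes (shape, scale); the posterior rate is beta0 + n.
    lam = rng.gammavariate(alpha0 + total, 1.0 / (beta0 + n))
    above, equal = poisson_upper_tail(total, n * lam)
    return above + rng.random() * equal  # randomized tie-breaking (assumption)


def simulate_scenario1(n_datasets=2000, seed=1):
    rng = random.Random(seed)
    pvalues = []
    for _ in range(n_datasets):
        rho0 = rng.uniform(0.0, 2.0) + 0.05   # variance-to-mean ratio of the prior
        beta0 = 1.0 / rho0
        alpha0 = math.e * beta0               # prior mean alpha0 / beta0 = exp(1)
        n = rng.randint(20, 79)               # assumed sample-size distribution
        lam = rng.gammavariate(alpha0, 1.0 / beta0)
        total = sample_poisson(rng, n * lam)
        pvalues.append(sampled_posterior_pvalue(rng, total, n, alpha0, beta0))
    return pvalues


def ks_uniform(pvalues):
    """One-sample Kolmogorov-Smirnov statistic D against Uniform(0, 1)."""
    xs = sorted(pvalues)
    n = len(xs)
    return max(max(i / n - x, x - (i - 1) / n) for i, x in enumerate(xs, start=1))


if __name__ == "__main__":
    pvals = simulate_scenario1()
    print(round(ks_uniform(pvals), 3))
```

Because the fitted model and prior coincide with the generating ones, the pair formed by the posterior draw and the data has the same joint distribution as the pair formed by the true λ and the data, so the p-values produced this way should be close to uniform and the ks.D value should be small.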