Text S1. Scenario 1 and results

New Calibrated Bayesian Internal Goodness-of-Fit Methods:
Sampled Posterior P-values as Simple and General P-values that
Allow Double Use of the Data
Frédéric Gosselin
Cemagref, UR EFNO, F-45290 Nogent-sur-Vernisson, France
E-mail: frederic.gosselin@cemagref.fr
Text S1. Scenario 1 psp and pnsp results
In the following tables, we present the results for psp (sampled posterior p-values) and its
normalized version, pnsp, in the case of Scenario 1:
Scenario 1. In our first scenario, the model that generated the data and the model
used to fit the data were exactly the same, including the prior distribution. The common
priors were as follows:
– Poisson case: $\lambda \sim \mathrm{Gamma}(\alpha_0, \beta_0)$ with $\alpha_0/\beta_0 = \theta_0 = \exp(1)$ and
$(\alpha_0/\beta_0^2)/(\alpha_0/\beta_0) = \rho_0 \sim \mathrm{Uniform}(0;2) + .05$;
– Normal case: $1/\sigma^2 \sim \mathrm{Gamma}(\alpha_0, \beta_0)$ and $\theta \sim \mathrm{Normal}(\theta_0, \sigma^2/\sigma_0^2)$
with $\alpha_0/\beta_0 = 1$, $\alpha_0/\beta_0^2 = \rho_0 \sim 10^{\mathrm{Uniform}(0;1) - 1}$, $\theta_0 = 0$ and $\sigma_0 = 1$;
– Bernoulli case: $\theta \sim \mathrm{Beta}(\alpha_0, \beta_0)$ with $\alpha_0 = 2\theta_0\rho_0$ and $\beta_0 = 2(1 - \theta_0)\rho_0$,
with $\rho_0 \sim 10^{\mathrm{Uniform}(0;1)}$ and $\theta_0 = .5$.
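As an illustration of the Poisson case of Scenario 1, the following sketch draws the prior hyperparameters, simulates one data set, and computes one sampled posterior p-value for t = mean. It is a minimal sketch under stated assumptions and not the author's code: it assumes that β0 is a rate parameter, that psp is obtained by drawing a single λ from the conjugate posterior and comparing the observed statistic with replicates generated under that λ, and that ties of the discrete statistic are split at random. All function names, the sample size of 50 and the number of replicates are hypothetical.

```python
import math
import random


def draw_poisson_hyperparameters(rng):
    """Draw (alpha0, beta0) for the Poisson case; beta0 is read as a rate (assumption)."""
    theta0 = math.exp(1)                 # prior mean, alpha0 / beta0
    rho0 = rng.uniform(0.0, 2.0) + 0.05  # prior variance-to-mean ratio, equal to 1 / beta0
    beta0 = 1.0 / rho0
    return theta0 * beta0, beta0


def poisson_draw(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sampled_posterior_p_value(y, alpha0, beta0, rng, n_rep=200):
    """Monte Carlo sampled posterior p-value for t = mean (Poisson-Gamma model)."""
    n, total_obs = len(y), sum(y)
    # Conjugate update: lambda | y ~ Gamma(shape = alpha0 + sum(y), rate = beta0 + n).
    # random.gammavariate takes a scale, hence the inverse of the rate.
    lam_tilde = rng.gammavariate(alpha0 + total_obs, 1.0 / (beta0 + n))
    below = ties = 0
    for _ in range(n_rep):
        # Comparing totals is equivalent to comparing means for a fixed n.
        total_rep = sum(poisson_draw(lam_tilde, rng) for _ in range(n))
        below += total_rep < total_obs
        ties += total_rep == total_obs
    # Randomized lower-tail p-value (assumption): ties are split uniformly.
    return (below + rng.random() * ties) / n_rep


rng = random.Random(1)
alpha0, beta0 = draw_poisson_hyperparameters(rng)
lam_true = rng.gammavariate(alpha0, 1.0 / beta0)   # parameter drawn from the prior
y = [poisson_draw(lam_true, rng) for _ in range(50)]
p_sp = sampled_posterior_p_value(y, alpha0, beta0, rng)
```

Repeating the last four lines many times gives a collection of p-values whose closeness to the uniform distribution can then be assessed.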
In the following tables, we display the Kolmogorov-Smirnov statistic for the comparison of
the p-values with a uniform distribution (ks.D), the proportion of values in the 5% extreme
positions on the interval [0;1] (p.05), and the same for 1% (p.01). We performed 10,000 data
set samplings and analyses. n denotes the sample size and # the number of simulations
corresponding to the sample size. For proportions, we used tail-area tests to detect significant
departures from the point null hypotheses and also studied whether the posterior density of the
underlying proportion was negligibly or non-negligibly different from the target proportions.
The notation for the significance of the tests is as follows: (*) means that the test is significant
at a level between .05 and .1; * between .01 and .05; ** between .0001 and .01; *** less than
.0001. The notation for the study of negligible/non-negligible departures from the expected
values is as follows:
– for the proportion of the p-values in the 5% extreme positions on [0;1] (p.05): 00
(respectively, 0) means 95% of the estimated values of the underlying p-value are in the
interval [.045;.055] (resp. [.04;.06]); ++ (respectively, +) means 95% of the estimated values
of the underlying p-value are in the interval ].06;1] (resp. ].055;1]); -- (respectively, -) means
95% of the estimated values of the underlying p-value are in the interval [0;.04[ (resp. [0;.045[);
– for the proportion of the p-values in the 1% extreme positions on [0;1] (p.01): 00
(respectively, 0) means 95% of the estimated values of the underlying p-value are in the
interval [.009;.011] (resp. [.008;.012]); ++ (respectively, +) means 95% of the estimated values
of the underlying p-value are in the interval ].012;1] (resp. ].011;1]); -- (respectively, -) means
95% of the estimated values of the underlying p-value are in the interval [0;.008[ (resp. [0;.009[).
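The summary statistics and the notation described above can be sketched as follows. This is only an illustrative sketch with hypothetical function names. It assumes that the "extreme positions" are the two tails of [0;1], each of width half the nominal level, that the estimated values of the underlying proportion are available as a list of posterior draws, and that the flags are tested in the order 00, 0, ++, +, --, - (the source does not state which flag wins when two definitions hold at once).

```python
def ks_distance_to_uniform(p_values):
    """Kolmogorov-Smirnov D between the empirical CDF and Uniform(0, 1)."""
    xs = sorted(p_values)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


def extreme_proportion(p_values, level):
    """Share of p-values in the two tails of total width `level` (assumed two-sided)."""
    half = level / 2.0
    return sum(p <= half or p >= 1.0 - half for p in p_values) / len(p_values)


def significance_mark(p):
    """Map the p-value of a test to the star notation of the tables."""
    if p < 0.0001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    if p < 0.1:
        return "(*)"
    return ""


def negligibility_flag(draws, target):
    """Flag based on where 95% of the draws of the underlying proportion fall.

    The margins are 10% and 20% of the target, which reproduces the intervals
    [.045;.055] and [.04;.06] for .05, and [.009;.011] and [.008;.012] for .01.
    """
    tight, wide = 0.1 * target, 0.2 * target
    checks = [
        ("00", lambda d: target - tight <= d <= target + tight),
        ("0", lambda d: target - wide <= d <= target + wide),
        ("++", lambda d: d > target + wide),
        ("+", lambda d: d > target + tight),
        ("--", lambda d: d < target - wide),
        ("-", lambda d: d < target - tight),
    ]
    for flag, inside in checks:
        if sum(inside(d) for d in draws) / len(draws) >= 0.95:
            return flag
    return ""
```

For example, `extreme_proportion(p_values, 0.05)` would give the p.05 entry for one row of a table, and `negligibility_flag(draws, 0.05)` the accompanying 00/0/+/- flag, where `draws` would come from whatever posterior is used for the proportion.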
In the following results, it may seem that some departure from the uniform distribution
appears for p.01 and large values of n, in the case of psp, t = mean or psp, d = meanc. We
checked this specific situation with independent simulations on a larger sample size (20,000
data sets generated) and confirmed that there was no significant departure from the uniform
distribution (results not shown).
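A tail-area test of an observed proportion against its target value, of the kind used for the p.05 and p.01 columns, can be sketched with an exact binomial computation. The sketch below is an assumption about the form of the test (twice the smaller binomial tail, capped at 1), not the author's implementation, and the counts in the usage line are hypothetical rather than taken from the simulations.

```python
import math


def binom_log_pmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k, for 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def two_sided_tail_area(k, n, p0):
    """Exact binomial tail-area p-value: twice the smaller tail, capped at 1."""
    pmf = [math.exp(binom_log_pmf(j, n, p0)) for j in range(n + 1)]
    lower = sum(pmf[: k + 1])   # P(X <= k)
    upper = sum(pmf[k:])        # P(X >= k)
    return min(1.0, 2.0 * min(lower, upper))


# Hypothetical counts: 215 of 20,000 p-values fall in the 1% extreme positions.
p_test = two_sided_tail_area(215, 20000, 0.01)
```

A value of `p_test` above .1 would correspond to an entry without any significance mark in the tables.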
Case 1: Poisson
pnsp, t = mean
   n            ks.D    p.05        p.01
1  [ 20, 80)    0.011   0.050 0     0.010
2  [ 80, 300)   0.010   0.051 0     0.012
3  [300, 600)   0.021   0.048 0     0.011
4  [600,1000]   0.017   0.050 0     0.007
5  ALL          0.007   0.050 00    0.010
ks.D
0.012
0.014
0.024 (*)
0.019
0.010
p.05
0.052 0
0.047 0
0.055
0.042 (*)
0.050 00
p.01
0.013
0.010
0.012
0.008
0.011
ks.D
0.016
0.015
0.013
0.019
0.010
p.05
0.046
0.050
0.053
0.053
0.050
00
p.01
0.009
0.011
0.010
0.009
0.010
0
ks.D
0.012
0.023
0.015
0.028 (*)
0.012
p.05
0.053
0.054
0.042
0.048
0.049
*
0
00
p.01
0.010
0.011
0.008
0.010
0.010
0
ks.D
0.020
0.014
0.018
0.009
0.005
p.05
0.050
0.055
0.044
0.046
0.049
pnsp , t = var
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
pnsp , t = skew
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
pnsp , t = kurt
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
pnsp , t = Z a
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
00
p.01
0.010
0.013
0.008
0.008
0.010
psp , t = mean
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
ks.D
0.013
0.016
0.018
0.013
0.005
p.05
0.048
0.054
0.051
0.056
0.052
ks.D
0.010
0.016
0.017
0.015
0.008
p.05
0.047 0
0.047 0
0.057 (*)
0.044
0.049 00
p.01
0.009
0.007
0.012
0.009
0.010
ks.D
0.024 (*)
0.008
0.017
0.024
0.006
p.05
0.052
0.052
0.052
0.054
0.053
p.01
0.010
0.011
0.009
0.011
0.010
ks.D
0.012
0.016
0.019
0.014
0.005
p.05
0.049
0.053
0.054
0.054
0.052
0
0
0
p.01
0.011
0.013
0.011
0.007
0.011
psp , t = var
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
psp , t = p0
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
0
psp , d = meanc
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
0
p.01
0.010
0.013
0.012
0.005 *,0.010
psp, d = varc
   n            ks.D    p.05         p.01
1  [ 20, 80)    0.011   0.047 0      0.010
2  [ 80, 300)   0.015   0.047 0      0.007
3  [300, 600)   0.016   0.057 (*)    0.012
4  [600,1000]   0.015   0.046        0.008
5  ALL          0.009   0.050 00     0.009
ks.D
0.012
0.021
0.011
0.022
0.009
p.05
0.050
0.048
0.053
0.049
0.050
p.01
0.012
0.008
0.009
0.007 (*)
0.009
psp , d = LL
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
0
0
00
Case 2: Normal
pnsp , t = mean
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
ks.D
0.016
0.019
0.027 *
0.018
0.013 (*)
p.05
0.050
0.052
0.046
0.052
0.050
ks.D
0.015
0.010
0.021
0.026
0.011
p.05
0.049
0.050
0.048
0.047
0.049
ks.D
0.012
0.010
0.016
0.014
0.006
p.05
0.053
0.053
0.046
0.044
0.049
ks.D
0.011
0.020
0.022
0.025
0.009
p.05
0.048
0.051
0.045
0.049
0.048
ks.D
0.014
0.011
0.018
0.021
0.008
p.05
0.050
0.050
0.048
0.055
0.050
0
0
0
00
p.01
0.009
0.010
0.009
0.005 *,0.008
pnsp , t = var
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
00
p.01
0.011
0.008
0.011
0.008
0.009
00
p.01
0.010
0.011
0.010
0.009
0.010
0
0
0
pnsp , t = skew
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
pnsp , t = kurt
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
0
0
0
p.01
0.011
0.011
0.009
0.009
0.010
pnsp , t = Z a
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
0
0
00
p.01
0.009
0.012
0.008
0.014 (*)
0.011
psp , t = mean
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
ks.D
0.016
0.019
0.027 *
0.017
0.013 (*)
p.05
0.050
0.052
0.046
0.051
0.050
ks.D
0.015
0.010
0.021
0.027
0.011
p.05
0.049
0.050
0.049
0.046
0.049
00
p.01
0.010
0.007
0.011
0.007
0.009
ks.D
0.011
0.019
0.024 (*)
0.022
0.006
p.05
0.046
0.048 0
0.042 (*)
0.057
0.048 0
p.01
0.009
0.011
0.009
0.009
0.009
ks.D
0.016
0.018
0.026 *
0.018
0.013 (*)
p.05
0.051
0.051
0.046
0.054
0.050
p.01
0.007 (*)
0.010
0.009
0.005 **,0.008 *
ks.D
0.014
0.010
0.022
0.026
0.011
p.05
0.049
0.050
0.050
0.047
0.049
0
0
00
p.01
0.008
0.010
0.009
0.005 *,0.008 (*)
psp , t = var
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
0
0
psp , t = p0
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
psp , d = meanc
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
0
00
psp , d = varc
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
0
0
00
p.01
0.012
0.008
0.011
0.008
0.010
0
psp , d = LL
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
ks.D
0.019
0.011
0.022
0.024
0.012
p.05
0.047
0.051
0.048
0.045
0.048
0
0
0
0
p.01
0.011
0.009
0.012
0.007
0.010
0
Case 3: Bernoulli
pnsp , t = mean
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
ks.D
0.015
0.010
0.009
0.018
0.005
p.05
0.049
0.056
0.047
0.049
0.050
0
0
00
p.01
0.009
0.013
0.011
0.012
0.011
ks.D
0.026 (*)
0.012
0.018
0.015
0.013 (*)
p.05
0.054
0.045
0.055
0.054
0.052
0
p.01
0.009
0.009
0.012
0.009
0.010
ks.D
0.027
0.016
0.013
0.021
0.008
p.05
0.057 (*)
0.047 0
0.046
0.056
0.051 0
p.01
0.013
0.013
0.011
0.007
0.011
ks.D
0.015
0.012
0.018
0.014
0.007
p.05
0.049
0.047
0.045
0.051
0.048
p.01
0.014 (*)
0.009
0.012
0.011
0.011
ks.D
0.021
0.009
0.011
0.022
0.010
p.05
0.051
0.049
0.045
0.055
0.050
0
pnsp , t = var
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
pnsp , t = skew
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
*
pnsp , t = kurt
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
0
0
0
pnsp , t = Z a
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
0
00
p.01
0.012
0.009
0.009
0.011
0.010
psp , t = mean
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
ks.D
0.012
0.011
0.009
0.013
0.006
p.05
0.048
0.053
0.047
0.056
0.051
0
0
00
p.01
0.010
0.011
0.011
0.009
0.011
psp , t = var
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
ks.D
0.027
0.014
0.023
0.022
0.005
*
p.05
0.054
0.049
0.047
0.052
0.050
0
0
00
p.01
0.011
0.009
0.011
0.010
0.010
psp , t = p0
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
ks.D
0.009
0.015
0.012
0.013
0.006
p.05
0.046
0.049
0.048
0.056
0.049
ks.D
0.012
0.011
0.009
0.013
0.006
p.05
0.048
0.053
0.047
0.056
0.051
ks.D
0.028 *
0.013
0.024 (*)
0.022
0.007
p.05
0.053
0.051
0.047
0.052
0.051
0
0
00
p.01
0.009
0.012
0.011
0.011
0.011
psp , d = meanc
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
0
00
p.01
0.010
0.011
0.011
0.009
0.011
psp , d = varc
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
0
0
00
p.01
0.010
0.008
0.011
0.009
0.010
0
psp , d = LL
1
2
3
4
5
n
[ 20, 80)
[ 80, 300)
[300, 600)
[600,1000]
ALL
ks.D
0.018
0.016
0.025 (*)
0.022
0.007
p.05
0.047
0.053
0.049
0.055
0.051
0
0
00
p.01
0.009
0.011
0.011
0.010
0.010