Task 4.2 Summary Report Sept 2015

Task 4.2: Empirical Studies Validation
Summary Report
October 2015
(Month 18 deliverable report)
Fenton, N., Neil, M., et al. (2015). “Hypothesis testing revisited: why the classical
approach will almost certainly give the ‘wrong’ results”. Working paper, Sept 2015.
In this paper we argue that the problems with classical statistical hypothesis testing are
even more fundamental than previously recognised. While much of what is presented in this
paper is known, we articulate precisely what hypothesis testing actually means and clarify
the relationship between the different types of hypothesis testing, using a Bayesian
network (BN) model that unifies the different approaches and assumptions.
Fenton, N. E. (2014). Assessing evidence and testing appropriate hypotheses. Science
& Justice, 54(6), 502-504. http://dx.doi.org/10.1016/j.scijus.2014.10.007
It is crucial to identify the most appropriate hypotheses if one is to apply probabilistic
reasoning to evaluate and properly understand the impact of evidence. Subtle
changes to the choice of a prosecution hypothesis can result in drastically different
posterior probabilities for the defence hypothesis given the same evidence. To illustrate
the problem we consider a real case in which probabilistic arguments assumed that
the prosecution hypothesis “both babies were murdered” was the appropriate
alternative to the defence hypothesis “both babies died of Sudden Infant Death
Syndrome (SIDS)”. Since it would have been sufficient for the prosecution to
establish just one murder, a more appropriate alternative hypothesis was “at least
one baby was murdered”. Based on the same assumptions used by one of the
probability experts who examined the case, the prior odds in favour of the defence
hypothesis over the double murder hypothesis are 30 to 1. However, the prior odds
in favour of the defence hypothesis over the alternative ‘at least one murder’
hypothesis are only 5 to 2. Assuming that the medical and other evidence has a
likelihood ratio of 5 in favour of the prosecution hypothesis results in very different
conclusions about the posterior probability of the defence hypothesis.
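The odds-form Bayes calculation implied by these figures can be sketched directly (a minimal illustration using only the numbers quoted above; when the likelihood ratio favours the prosecution hypothesis, the posterior odds on the defence hypothesis are the prior odds divided by that ratio):

```python
# Odds-form Bayes: posterior odds (defence) = prior odds (defence) / LR (prosecution).
def posterior_prob_defence(prior_odds_defence, lr_prosecution):
    """Posterior probability of the defence hypothesis after evidence
    with the given likelihood ratio in favour of the prosecution."""
    post_odds = prior_odds_defence / lr_prosecution
    return post_odds / (1 + post_odds)

# Defence vs 'both babies murdered': prior odds 30:1, evidence LR = 5.
p_double = posterior_prob_defence(30.0, 5.0)   # posterior odds 6:1 -> ~0.857

# Defence vs 'at least one baby murdered': prior odds 5:2, same LR = 5.
p_at_least_one = posterior_prob_defence(2.5, 5.0)  # posterior odds 1:2 -> ~0.333

print(f"{p_double:.3f} vs {p_at_least_one:.3f}")
```

Under the double-murder framing the defence hypothesis remains highly probable (about 0.86), whereas under the more appropriate ‘at least one murder’ framing it drops to about one third, illustrating the point made above.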
Fenton, N., Neil, M., & Constantinou, A. C. (2015a). Simpson's Paradox and the
implications for medical trials. Under review, 2015.
This paper describes Simpson’s paradox, and explains its serious implications for
randomised control trials. In particular, we show that for any number of variables we
can simulate the result of a controlled trial which uniformly points to one conclusion
(such as ‘drug is effective’) for every possible combination of the variable states, but
when a previously unobserved confounding variable is included, every possible
combination of the variable states points to the opposite conclusion (‘drug is not
effective’). In other words, no matter how many variables are considered, and no
matter how ‘conclusive’ the result, one cannot conclude the result is truly ‘valid’
since there is theoretically an unobserved confounding variable that could
completely reverse the result.
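The reversal can be reproduced with a toy dataset (the well-known kidney-stone treatment counts, used here purely for illustration; this is not the paper's own simulation):

```python
# Classic Simpson's paradox counts: (recovered, treated) per arm,
# stratified by the confounder (case severity).
strata = {
    "mild":   {"drug": (81, 87),   "control": (234, 270)},
    "severe": {"drug": (192, 263), "control": (55, 80)},
}

def rate(pair):
    recovered, treated = pair
    return recovered / treated

# Within every stratum the drug has the higher recovery rate ...
drug_better_per_stratum = {
    g: rate(arms["drug"]) > rate(arms["control"])
    for g, arms in strata.items()
}

# ... yet pooling over the confounder reverses the conclusion.
def totals(arm):
    recovered = sum(strata[g][arm][0] for g in strata)
    treated = sum(strata[g][arm][1] for g in strata)
    return (recovered, treated)

drug_better_overall = rate(totals("drug")) > rate(totals("control"))
print(drug_better_per_stratum, drug_better_overall)
```

The drug wins in both strata (93% vs 87% for mild cases, 73% vs 69% for severe ones), yet loses in the pooled data (78% vs 83%), because severity influences both the choice of treatment and the outcome.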
Yet, B., Constantinou, A. C., Fenton, N., & Neil, M. (2015b). Partial Expected Value of
Perfect Information of Continuous Variables using Dynamic Discretisation. Under
review, 2015.
In decision-theoretic models, partial Expected Value of Perfect Information (EVPI) is an
important analysis technique that is used to identify the value of acquiring
information on individual variables. Partial EVPI can be used to prioritise the parts of
a model that should be improved or identify the parts where acquiring additional data
or expert knowledge is most beneficial. Calculating partial EVPI of continuous
variables is challenging, and several sampling and approximation techniques have
been proposed. This paper proposes a novel approach for calculating partial EVPI in
hybrid Bayesian network (BN) models. The proposed approach uses dynamic
discretisation (DD) and the junction tree algorithm to calculate the partial EVPI. This
approach is an improvement on the previously proposed simulation-based partial
EVPI methods, since users do not need to determine the sample size and the DD
algorithm has been implemented in a BN software tool. We compare our approach
with the previously proposed techniques using two case studies. Our approach
accurately calculates the partial EVPI values, and, unlike the previous approaches, it
does not require the user to assess sampling convergence.
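The quantity being computed can be sketched with a generic Monte Carlo simulation on a toy two-decision model (assumed purely for illustration; the paper's contribution is precisely to replace such sampling, with its convergence concerns, by dynamic discretisation in a hybrid BN):

```python
import math
import random

random.seed(0)

# Toy decision model: acting (d=1) yields theta + psi, doing nothing (d=0)
# yields 0, with theta ~ N(0,1) and psi ~ N(0,1) independent.
N = 200_000
theta = [random.gauss(0, 1) for _ in range(N)]
psi = [random.gauss(0, 1) for _ in range(N)]

# Baseline: choose the best decision under full uncertainty.
ev_act = sum(t + p for t, p in zip(theta, psi)) / N
baseline = max(ev_act, 0.0)

# Partial EVPI for theta: learn theta before deciding, while psi remains
# uncertain (E[psi] = 0, so the conditional expected payoff of acting is theta).
with_info = sum(max(t, 0.0) for t in theta) / N

partial_evpi_theta = with_info - baseline
# Analytic value here is E[max(theta, 0)] = 1/sqrt(2*pi) ~= 0.399.
print(f"{partial_evpi_theta:.3f}")
```

This sampling approach needs a user-chosen sample size and a convergence check; the dynamic discretisation method described above removes both requirements.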
Dewitt, S. H., Hsu, A., Fenton, N. E., & Lagnado, D. (2015). “Nested Sets or Causal
Framing? Optimal Presentation Format for Bayesian Reasoning”. Submitted to Journal of
Experimental Psychology: General.
Previous work has found that nested sets and causal framings significantly help
people solve simple Bayesian problems. We compare the individual and combined
effects of these two approaches in a single experiment. We find a beneficial effect for
the nested sets, but no effect for causal framing. We also use qualitative methods to
capture participants’ thought processes during problem solving. Based on this data,
we propose a four-stage process model of how individuals successfully solve simple
Bayesian problems. This process is the modal response in all four experimental
conditions and is therefore proposed as a universal process that successful
individuals undertake on simple Bayesian problems regardless of framing. The model
is consistent with our current experimental results, occurring more frequently in the
nested sets condition than in the control, but no more frequently in the causal
condition. It also fits well with results reported in previous work on causal framing.
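A ‘simple Bayesian problem’ of the kind studied in this literature can be solved in both presentation formats (the numbers below are the standard textbook mammography figures, used for illustration; they are not the paper's own materials): the probability format applies Bayes' rule directly, while the nested-sets format re-expresses the same quantities as natural frequencies.

```python
# Textbook base-rate problem: 1% prevalence, 80% hit rate, 9.6% false-positive rate.
base_rate, hit_rate, false_alarm = 0.01, 0.80, 0.096

# Probability format: apply Bayes' rule directly.
p_positive = base_rate * hit_rate + (1 - base_rate) * false_alarm
p_disease_given_pos = base_rate * hit_rate / p_positive

# Nested-sets format: natural frequencies in a population of 10,000.
pop = 10_000
sick_and_pos = pop * base_rate * hit_rate              # 80 people
healthy_and_pos = pop * (1 - base_rate) * false_alarm  # ~950 people
freq_answer = sick_and_pos / (sick_and_pos + healthy_and_pos)

print(round(p_disease_given_pos, 3))  # both formats agree: ~0.078
```

The two computations are algebraically identical; the experimental question addressed above is which presentation makes the nested set structure (80 out of roughly 1,030 positives) easier for people to see.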