Task 3.2 Summary Report Sept 2015

Task 3.2: Algorithms and Prototype tools
Summary Report
October 2015
(Month 18 deliverable report)
Table Learning from Data and Knowledge
Much of the prototype tool development work has focused on the core BAYESKNOWLEDGE problem of
combining data and knowledge for Bayesian Network table (i.e. parameter) learning. We are
currently finalising an implementation (including a novel GUI) that extends the
Expectation-Maximisation algorithm to take both data and expert knowledge into account.
More specifically, the user will have the option to specify subjective probabilities for any
of the model variables, and to indicate his or her degree of confidence in those
probabilities, as well as relative to the data used for parameter learning. The outcome will
be a model learned from a dataset with missing data, in which the missing values are
approximated and the estimated model is weighted against expert beliefs.
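One simple way to realise this kind of weighting is to treat the expert's probabilities as pseudo-counts, scaled by a confidence value, and to add them to the expected counts in each M-step. The sketch below does this for a hypothetical two-node network A -> B in which A is sometimes unobserved. The function name, the `confidence` parameter (read as an equivalent sample size) and the network itself are illustrative assumptions, not the project's implementation.

```python
def em_with_expert_prior(data, expert_a, expert_b, confidence, n_iter=100):
    """EM for a binary network A -> B, blended with expert pseudo-counts.

    data:       list of (a, b) pairs; a is 0, 1 or None (missing), b is 0 or 1.
    expert_a:   expert estimate of P(A=1), assumed strictly between 0 and 1.
    expert_b:   expert estimates [P(B=1|A=0), P(B=1|A=1)], also in (0, 1).
    confidence: hypothetical equivalent sample size given to the expert.
                Assumes confidence > 0, or that both states of A are observed.
    """
    # Start from the expert's values.
    p_a = expert_a
    p_b = list(expert_b)
    for _ in range(n_iter):
        n_a = [0.0, 0.0]    # expected counts of A=0 and A=1
        n_ab1 = [0.0, 0.0]  # expected counts of (A=k, B=1)
        # E-step: fill in missing A with its posterior given B.
        for a, b in data:
            if a is None:
                like = [
                    (1 - p_a) * (p_b[0] if b else 1 - p_b[0]),
                    p_a * (p_b[1] if b else 1 - p_b[1]),
                ]
                total = like[0] + like[1]
                w = [like[0] / total, like[1] / total]
            else:
                w = [1.0 - a, float(a)]
            for k in (0, 1):
                n_a[k] += w[k]
                n_ab1[k] += w[k] * b
        # M-step: expected counts plus expert pseudo-counts.
        p_a = (n_a[1] + confidence * expert_a) / (n_a[0] + n_a[1] + confidence)
        p_b = [
            (n_ab1[k] + confidence * expert_b[k]) / (n_a[k] + confidence)
            for k in (0, 1)
        ]
    return p_a, p_b


# Example call: two records have A missing; the expert is given weight 2.
records = [(None, 1), (1, 1), (0, 0), (None, 0)]
estimate = em_with_expert_prior(records, 0.5, [0.2, 0.8], confidence=2.0)
```

With `confidence` set to 0 and fully observed data the estimates reduce to the observed frequencies, while a very large `confidence` pulls them towards the expert's values.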
Other algorithmic work undertaken:
Peng Lin, Martin Neil, Norman Fenton (2015). Region Based Approximation for High
Dimensional Discrete Bayesian Network Models, to be submitted to IEEE Transactions on
Pattern Analysis and Machine Intelligence.
This work is potentially ground-breaking. Performing efficient inference on Bayesian
Networks (BNs) with large numbers of densely connected variables is challenging.
With exact inference methods, such as the Junction Tree algorithm, clustering
complexity can grow exponentially with the number of nodes and so becomes
computationally intractable. This paper presents a general purpose approximate
inference algorithm called Triplet Region Construction (TRC), which reduces the
clustering complexity from worst case exponential to polynomial. We employ graph
factorization to reduce connection complexity and produce clusters of limited size.
TRC is guaranteed to converge and the experiments show that this new algorithm
achieves accurate results when compared with exact solutions. The TRC algorithm is
defined on BNs with discrete nodes, but can be used on hybrid BNs since any
continuous node can have an approximated discrete node probability table.
Zhou, Y., Fenton, N. E., & Neil, M. (2014). An Extended MPL-C Model for Bayesian
Network Parameter Learning with Exterior Constraints. In L. van der Gaag & A. J.
Feelders (Eds.), Probabilistic Graphical Models: 7th European Workshop. PGM 2014,
Utrecht, The Netherlands, September 17-19, 2014 (pp. 581–596). Springer Lecture
Notes in AI 8754.
Lack of relevant data is a major challenge for learning Bayesian networks (BNs) in
real-world applications. Knowledge engineering techniques attempt to address this
by incorporating domain knowledge from experts. The paper focuses on learning
node probability tables using both expert judgment and limited data. To reduce the
massive burden of eliciting individual probability table entries (parameters) it is often
easier to elicit constraints on the parameters from experts. Constraints can be interior
(between entries of the same probability table column) or exterior (between entries of
different columns). In this paper we introduce the first auxiliary BN method (called
MPL-EC) to tackle parameter learning with exterior constraints. The MPL-EC itself is
a BN, whose nodes encode the data observations, exterior constraints and
parameters in the original BN. Also, MPL-EC addresses (i) how to estimate target
parameters with both data and constraints, and (ii) how to fuse the weights from
different causal relationships in a robust way. Experimental results demonstrate the
superiority of MPL-EC at various sparsity levels compared to conventional parameter
learning algorithms and other state-of-the-art parameter learning algorithms with
constraints. Moreover, we demonstrate a successful application to learning a
real-world software defects BN from sparse data.
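The distinction between interior and exterior constraints can be made concrete with a small probability table. The sketch below is only an illustration of the two definitions, using a made-up binary child with one binary parent; the table values and the helper names `interior_holds` and `exterior_holds` are hypothetical and are not part of MPL-EC.

```python
# Node probability table for a binary child with one binary parent.
# Rows are child states, columns are parent configurations.
cpt = [
    [0.8, 0.3],  # P(child=0 | parent=0), P(child=0 | parent=1)
    [0.2, 0.7],  # P(child=1 | parent=0), P(child=1 | parent=1)
]


def interior_holds(table, col, state_i, state_j):
    """Interior constraint: compares two entries of the SAME column,
    here table[state_i][col] >= table[state_j][col]."""
    return table[state_i][col] >= table[state_j][col]


def exterior_holds(table, state, col_a, col_b):
    """Exterior constraint: compares the same child state across two
    DIFFERENT columns, here table[state][col_a] >= table[state][col_b]."""
    return table[state][col_a] >= table[state][col_b]


# Interior: given parent=0, child=0 is at least as likely as child=1.
interior_ok = interior_holds(cpt, col=0, state_i=0, state_j=1)

# Exterior: child=1 is at least as likely under parent=1 as under parent=0.
exterior_ok = exterior_holds(cpt, state=1, col_a=1, col_b=0)
```

An expert who cannot supply the four numbers in the table may still be able to state that both inequalities hold, which is the kind of judgement these constraints capture.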
Zhou, Y., Fenton, N. E., Hospedales, T., & Neil, M. (2015). "Probabilistic Graphical
Models Parameter Learning with Transferred Prior and Constraints", 31st Conference
on Uncertainty in Artificial Intelligence (UAI 2015), Amsterdam, 13-15 July 2015.
Learning accurate Bayesian networks (BNs) is a key challenge in real-world
applications, especially when training data are hard to acquire. Two approaches
have been used to address this challenge: 1) introducing expert judgements and 2)
transferring knowledge from related domains. This is the first paper to present a
generic framework that combines both approaches to improve BN parameter
learning. This framework is built upon an extended multinomial parameter learning
model that is itself an auxiliary BN. It serves to integrate both knowledge transfer
and expert constraints. Experimental results demonstrate improved accuracy of the
new method on a variety of benchmark BNs, showing its potential to benefit many
real-world problems.
Zhou, Y., Fenton, N. E. (2015), "An Empirical Study of Bayesian Network Parameter
Learning with Monotonic Causality Constraints", submitted to International Journal of
Approximate Reasoning.
Learning accurate Bayesian networks (BNs) is a key challenge in real-world
applications, especially when training data are hard to acquire. The conventional way
to mitigate this challenge in parameter learning is to introduce domain
knowledge/expert judgements. Recently, the idea of qualitative constraints has been
introduced to improve the BN parameter learning accuracy. In this approach, the
exterior parameter constraints (between CPT entries of different parent state
configurations) are encoded in the edges/structures of BNs with ordinary variables.
However, no previous work has investigated the extent to which such constraints
exist in the standard BN repository. This paper examines such constraints in each
edge of the BNs from the standard repository. Experimental results indicate such
constraints fully or partially exist in all these BNs, and our slightly improved
constrained optimization algorithm achieves strong parameter learning performance,
especially in large BNs. These results can guide when to employ
exterior constraints in parameter estimation. This has the potential to benefit many
real-world case studies in decision support and risk analysis.
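To illustrate how a single monotonic exterior constraint can change a parameter estimate, consider a binary child Y with a binary parent X and the constraint P(Y=1|X=1) >= P(Y=1|X=0). The sketch below computes the constrained maximum-likelihood estimate for this two-column case by pooling the columns when the unconstrained estimates violate the constraint. It is a textbook order-restricted estimate on made-up counts, not the constrained optimization algorithm used in the paper, and the function name is hypothetical.

```python
def monotone_estimate(k0, n0, k1, n1):
    """Estimate p0 = P(Y=1|X=0) and p1 = P(Y=1|X=1) subject to p1 >= p0.

    k0, k1: number of records with Y=1 when X=0 and X=1 respectively.
    n0, n1: total number of records with X=0 and X=1 (both assumed > 0).
    """
    p0, p1 = k0 / n0, k1 / n1
    if p1 >= p0:
        # The unconstrained estimates already satisfy the constraint.
        return p0, p1
    # Otherwise the constrained optimum lies on the boundary p0 == p1,
    # where the best common value is the pooled frequency.
    pooled = (k0 + k1) / (n0 + n1)
    return pooled, pooled


# Sparse data that contradicts the expected direction of the effect.
p0, p1 = monotone_estimate(k0=3, n0=5, k1=1, n1=5)
```

With only five records per column the raw frequencies (0.6 and 0.2) run against the constraint, so both columns fall back to the pooled value; with more data that agrees with the constraint, the estimates are left unchanged.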
Zhou, Y., Hospedales, T., Fenton, N. E. (2015), "When and where to transfer for Bayes
net parameter learning", submitted to Machine Learning Journal
Learning Bayesian networks from sparse data is a major challenge in real-world
applications, where data are hard to acquire. Transfer learning techniques attempt to
address this by leveraging data from different but related problems. For example, it
may be possible to exploit medical diagnosis data from a different country. A
challenge with this approach is heterogeneous relatedness to the target, both within
and across source networks. In this paper we introduce the first Bayesian network
parameter transfer learning (BNPTL) algorithm to reason about both network and
fragment relatedness. BNPTL addresses (i) how to find the most relevant source
network and network fragments to transfer, and (ii) how to fuse source and target
parameters in a robust way. In addition to improving target task performance, explicit
reasoning allows us to diagnose network and sub-graph relatedness across BNs,
even if latent variables are present, or if their state space is heterogeneous. This is
important in some applications where relatedness itself is an output of interest.
Experimental results demonstrate the superiority of BNPTL at various sparsity and
source relevance levels compared to single task learning and other state-of-the-art
parameter transfer methods. Moreover, we demonstrate successful application to
real-world medical case studies.
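The two questions BNPTL addresses, which source to transfer from and how to fuse source and target parameters, can be sketched in a much simplified form for a single probability table column. The example below picks the source column closest to a smoothed target estimate (using KL divergence) and then fuses it with the target counts as weighted pseudo-counts. The selection rule, the `strength` parameter and all names are assumptions made for illustration; BNPTL's own relatedness reasoning and fusion are not reproduced here.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for two discrete distributions, with light smoothing."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


def pick_and_fuse(target_counts, sources, strength=10.0):
    """Choose the most similar source column and fuse it with target counts.

    target_counts: observed counts per state in the (sparse) target data.
    sources:       dict mapping a source name to its probability column.
    strength:      hypothetical weight of the source, as a pseudo-count total.
    """
    n = sum(target_counts)
    states = len(target_counts)
    # Laplace-smoothed target estimate, used only to compare with sources.
    target_est = [(c + 1) / (n + states) for c in target_counts]
    best = min(sources, key=lambda name: kl_divergence(target_est, sources[name]))
    # Fuse: target counts plus the chosen source column as pseudo-counts.
    fused = [
        (c + strength * s) / (n + strength)
        for c, s in zip(target_counts, sources[best])
    ]
    return best, fused


# Four target records and two candidate sources of differing relevance.
best, fused = pick_and_fuse(
    [3, 1], {"near": [0.7, 0.3], "far": [0.1, 0.9]}, strength=10.0
)
```

Here the sparse target data are dominated by the selected source; a smaller `strength` would let the target counts carry more weight.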