Task 2.1 Modelling redundancy

Summary Report
October 2015 (update/revision of Month 12 deliverable report)

This task tackles the problem of forcing users to specify unnecessary probabilities when specifying dependencies between variables in Bayesian networks (BNs). This report enumerates the current progress with respect to the objectives specified for Task 2.1. Specifically:

Fenton, N., Neil, M., Lagnado, D., Marsh, W., Yet, B., & Constantinou, A. (2015b). Modelling mutual exclusive events in Bayesian networks. Under review, 2015. http://constantinou.info/downloads/papers/mutualExBN.pdf

This paper focuses on minimising modelling redundancies in BNs. More specifically, it describes a novel and simple solution to the problem whereby a set of mutually exclusive events needs to be modelled as separate nodes instead of states of a single node. This common scenario has never previously been adequately addressed. Our proposed method makes use of a special type of constraint and auxiliary node, together with formulas for assigning the necessary node probability table values. The solution enforces mutual exclusivity between events and preserves their prior probabilities.

Constantinou, A. C., Fenton, N., & Neil, M. (2015a). Integrating expert knowledge with data in causal probabilistic networks: preserving the data-driven expectations when the expert variables remain unobserved. Under review, 2015.

The paper (Constantinou et al., 2015a), which is currently under peer review in the International Journal of Approximate Reasoning, focuses on the problem whereby a variable in a BN is known from data, but where we wish to explicitly model the impact of some additional expert variable (for which there is expert judgment but no data).
Because the statistical outcomes are already influenced by the causes an expert might identify as variables missing from the dataset, the aim is to add the expert factor to the model in such a way that the distribution of the data variable is preserved when the expert factor remains unobserved. We provide a method for this purpose. We also describe how the method can be used when we want to learn the parameters of extremely rare or previously unobserved events. The method also helps to minimise modelling redundancy.

Zhou, Y., Fenton, N. E., & Neil, M. (2014). An Extended MPL-C Model for Bayesian Network Parameter Learning with Exterior Constraints. In L. van der Gaag & A. J. Feelders (Eds.), Probabilistic Graphical Models: 7th European Workshop, PGM 2014, Utrecht, The Netherlands, September 17-19, 2014 (pp. 581–596). Springer Lecture Notes in AI 8754.

Lack of relevant data is a major challenge for learning Bayesian networks (BNs) in real-world applications. Knowledge engineering techniques attempt to address this by incorporating domain knowledge from experts. The paper focuses on learning node probability tables using both expert judgment and limited data. To reduce the massive burden of eliciting individual probability table entries (parameters), it is often easier to elicit constraints on the parameters from experts. Constraints can be interior (between entries of the same probability table column) or exterior (between entries of different columns). In this paper we introduce the first auxiliary BN method (called MPL-EC) to tackle parameter learning with exterior constraints. MPL-EC is itself a BN, whose nodes encode the data observations, the exterior constraints and the parameters of the original BN. MPL-EC addresses (i) how to estimate target parameters with both data and constraints, and (ii) how to fuse the weights from different causal relationships in a robust way.
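To make the distinction between interior and exterior constraints concrete, the following is a minimal Python sketch. It is not the MPL-EC model. It assumes a binary child Y with a single binary parent X, hypothetical counts, and one exterior constraint, P(Y=1 | X=1) >= P(Y=1 | X=0), which relates entries in two different columns of the probability table. When the raw maximum-likelihood estimates violate the constraint, the sketch uses the standard order-restricted estimate for two proportions, which pools the two columns. The function name and the numbers are illustrative only.

```python
def constrained_estimates(k0, n0, k1, n1):
    """Estimate p0 = P(Y=1 | X=0) and p1 = P(Y=1 | X=1) subject to p1 >= p0.

    k0, n0: number of Y=1 cases and total cases observed with X=0
    k1, n1: number of Y=1 cases and total cases observed with X=1
    (hypothetical helper, not part of any published method)
    """
    if n0 <= 0 or n1 <= 0:
        raise ValueError("each parent state needs at least one observation")
    p0, p1 = k0 / n0, k1 / n1      # unconstrained maximum-likelihood estimates
    if p1 >= p0:
        return p0, p1              # exterior constraint already satisfied
    pooled = (k0 + k1) / (n0 + n1) # pool the two violating columns
    return pooled, pooled


# Hypothetical sparse data: the raw estimates (3/4 and 2/5) violate p1 >= p0.
raw = (3 / 4, 2 / 5)
fixed = constrained_estimates(3, 4, 2, 5)
print("raw estimates:        ", raw)
print("constrained estimates:", fixed)
```

An interior constraint would instead compare entries within one column, for example P(Y=1 | X=0) >= P(Y=0 | X=0). With only a handful of cases per parent state, an exterior constraint of this kind lets the two columns share information rather than being estimated in isolation.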
Experimental results demonstrate the superiority of MPL-EC at various sparsity levels compared to conventional parameter learning algorithms and other state-of-the-art parameter learning algorithms with constraints. Moreover, we demonstrate its successful application to learning a real-world software defects BN with sparse data.

Zhou, Y., Fenton, N. E., Hospedales, T., & Neil, M. (2015). "Probabilistic Graphical Models Parameter Learning with Transferred Prior and Constraints", 31st Conference on Uncertainty in Artificial Intelligence (UAI 2015), Amsterdam, 13-15 July 2015.

Learning accurate Bayesian networks (BNs) is a key challenge in real-world applications, especially when training data are hard to acquire. Two approaches have been used to address this challenge: 1) introducing expert judgements and 2) transferring knowledge from related domains. This is the first paper to present a generic framework that combines both approaches to improve BN parameter learning. The framework is built upon an extended multinomial parameter learning model, which is itself an auxiliary BN. It serves to integrate both knowledge transfer and expert constraints. Experimental results demonstrate improved accuracy of the new method on a variety of benchmark BNs, showing its potential to benefit many real-world problems.

Zhou, Y., & Fenton, N. E. (2015). "An Empirical Study of Bayesian Network Parameter Learning with Monotonic Causality Constraints". Submitted to the International Journal of Approximate Reasoning.

Learning accurate Bayesian networks (BNs) is a key challenge in real-world applications, especially when training data are hard to acquire. The conventional way to mitigate this challenge in parameter learning is to introduce domain knowledge/expert judgements. Recently, the idea of qualitative constraints has been introduced to improve BN parameter learning accuracy.
In this approach, the exterior parameter constraints (between CPT entries of different parent state configurations) are encoded in the edges/structures of BNs with ordinary variables. However, no previous work has investigated the extent to which such constraints exist in the standard BN repository. This paper examines such constraints in each edge of the BNs from the standard repository. Experimental results indicate that such constraints fully or partially exist in all these BNs, and that our slightly improved constrained optimization algorithm achieves strong parameter learning performance, especially in large BNs. These results can be used to guide when to employ exterior constraints in parameter estimation. This has the potential to benefit many real-world case studies in decision support and risk analysis.

Zhou, Y., Hospedales, T., & Fenton, N. E. (2015). "When and where to transfer for Bayes net parameter learning". Submitted to the Machine Learning journal.

Learning Bayesian networks from sparse data is a major challenge in real-world applications, where data are hard to acquire. Transfer learning techniques attempt to address this by leveraging data from different but related problems. For example, it may be possible to exploit medical diagnosis data from a different country. A challenge with this approach is heterogeneous relatedness to the target, both within and across source networks. In this paper we introduce the first Bayesian network parameter transfer learning (BNPTL) algorithm to reason about both network and fragment relatedness. BNPTL addresses (i) how to find the most relevant source network and network fragments to transfer, and (ii) how to fuse source and target parameters in a robust way. In addition to improving target task performance, explicit reasoning allows us to diagnose network and sub-graph relatedness across BNs, even if latent variables are present, or if their state space is heterogeneous.
This is important in some applications where relatedness itself is an output of interest. Experimental results demonstrate the superiority of BNPTL at various sparsity and source relevance levels compared to single task learning and other state-of-the-art parameter transfer methods. Moreover, we demonstrate successful application to real-world medical case studies.
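As a rough illustration of what fusing source and target parameters can mean, the sketch below blends the target counts for one probability table column with the matching column from a source network, treating the source column as Dirichlet pseudo-counts. This is a generic textbook-style construction and not the BNPTL algorithm. The relatedness weight, the equivalent sample size, the function name and all numbers are assumptions made for illustration.

```python
def fuse_column(target_counts, source_probs, relatedness, equivalent_sample_size=10.0):
    """Posterior-mean estimate of one CPT column (hypothetical helper).

    target_counts: observed counts per child state in the target domain
    source_probs:  the matching CPT column taken from a source network
    relatedness:   assumed weight in [0, 1]; 0 ignores the source entirely
    equivalent_sample_size: assumed strength of a fully related source
    """
    if len(target_counts) != len(source_probs):
        raise ValueError("target and source columns must have the same states")
    if not 0.0 <= relatedness <= 1.0:
        raise ValueError("relatedness must lie in [0, 1]")
    strength = relatedness * equivalent_sample_size
    pseudo_counts = [strength * p for p in source_probs]
    total = sum(target_counts) + sum(pseudo_counts)
    if total == 0:
        raise ValueError("no target data and no source contribution")
    return [(n + a) / total for n, a in zip(target_counts, pseudo_counts)]


# Hypothetical example: one target observation and a source column (0.3, 0.7).
for weight in (0.0, 0.5, 1.0):
    print(weight, fuse_column([1, 0], [0.3, 0.7], relatedness=weight))
```

With a weight of 0 the estimate relies on the sparse target data alone, and as the weight grows the estimate moves towards the source column. A method that reasons about relatedness explicitly would set such a weight from the data rather than fixing it by hand.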