SYMBOLIC BAYESIAN NETWORKS

Edwin DIDAY (1), Richard EMILION (2,*)

1. CEREMADE, University Paris-Dauphine
2. MAPMO, University of Orléans, France
* Contact author: richard.emilion@univ-orleans.fr

Keywords: Bayesian network, Conditional distribution, Dirichlet distribution, Independence test, Symbolic.

Bayesian networks, see e.g. Darwiche, A. (2009), are probabilistic directed acyclic graphs used to model system behavior through conditional distributions. They generally deal with categorical or real-valued random variables which are correlated. We consider the case of Bayesian networks which deal with probability-distribution-valued random variables.

1 Statistical setting

Let $X = (X_1, \dots, X_j, \dots, X_p)$ be a random vector, $p \geq 1$ being an integer and each $X_j$ taking values in the space of probability measures defined on a measurable space $(V_j, \mathcal{V}_j)$, $j = 1, \dots, p$. Let $(X_{k,1}, \dots, X_{k,j}, \dots, X_{k,p})$, $k = 1, \dots, K$, be a sample of size $K$ of $X$. Here $k$ can be regarded as a row index and $j$ as a column index.

2 Motivation

In practice, the sample $(X_{k,1}, \dots, X_{k,j}, \dots, X_{k,p})$, $k = 1, \dots, K$, is not observed but only estimated from observed data. In symbolic data analysis (SDA), each observed datum belongs to one of $K$ disjoint classes, say $c_1, \dots, c_K$. The data can be either vectors in $\prod_{j=1}^{p} V_j$ or elements of some $V_j$, as seen in the two examples below, which illustrate two different situations. The empirical distribution of the data in $V_j$ which belong to class $c_k$ is an estimate of the probability distribution $X_{k,j}$. This distribution is considered as the $j$-th descriptor of class $c_k$.

2.1 Paired Samples

In the well-known Fisher's iris data set, $K = 3$, $c_1$ = 'setosa', $c_2$ = 'versicolor', $c_3$ = 'virginica', and $p = 4$. The observations are 50 irises in each of these 3 classes. The observed samples are paired since each iris is described by a vector of 4 measurements. As an example, $X_{3,2}$ is the probability distribution of sepal width in the 'virginica' class.
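As a minimal sketch of how a descriptor $X_{k,j}$ might be estimated from class-labelled observations, the Python fragment below bins the values of one variable and returns the relative frequencies per class. The function names, the bin edges and the toy records are assumptions made for illustration; the values are not taken from the real iris data.

```python
def empirical_distribution(values, bin_edges):
    """Relative frequencies of `values` over the bins defined by `bin_edges`.

    Bins are half-open [e_i, e_{i+1}), except the last one, which is closed.
    Values outside [bin_edges[0], bin_edges[-1]] are ignored.
    """
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    for v in values:
        for i in range(n_bins):
            is_last = i == n_bins - 1
            if bin_edges[i] <= v < bin_edges[i + 1] or (is_last and v == bin_edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no value falls inside the bins")
    return [c / total for c in counts]


def class_descriptors(records, bin_edges):
    """Map each class label c_k to its estimated distribution for one descriptor."""
    by_class = {}
    for label, value in records:
        by_class.setdefault(label, []).append(value)
    return {label: empirical_distribution(vals, bin_edges)
            for label, vals in by_class.items()}


# Toy (class, sepal width) records: illustrative values, NOT the real iris data.
toy_records = [
    ("setosa", 3.5), ("setosa", 3.0), ("setosa", 3.4),
    ("virginica", 3.3), ("virginica", 2.7), ("virginica", 3.0),
]
toy_edges = [2.0, 2.5, 3.0, 3.5, 4.0]  # hypothetical bin edges
descriptors = class_descriptors(toy_records, toy_edges)
```

Each entry of `descriptors` is a probability vector over the chosen bins, so a class is summarised by a distribution rather than by a single number.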
2.2 Unpaired Samples

Let $c_1, \dots, c_K$ be $K$ students and consider $p$ professors who each grade several of the students' exams. Let $X_{k,j}$ be the distribution of the grades given to student $c_k$ by professor $j$. The samples are unpaired here since the exams and the number of exams can differ from one professor to another.

2.3 Dependencies

Clearly, in the case of paired samples, within each class, data of descriptor $j$ are correlated with data of descriptor $j'$, while this correlation is meaningless in the case of unpaired samples. However, considering the $K$ pairs of estimated distributions $(X_{k,j}, X_{k,j'})$, $k = 1, \dots, K$, $j, j' = 1, \dots, p$, $j \neq j'$, it is seen that the random distributions $X_j$ and $X_{j'}$ can be correlated. This motivates us to consider Bayesian networks dealing with probability distributions.

3 The case of finite sets

Assume that all the sets $V_j$ are finite. Then each estimated distribution $X_{k,j}$ is just a probability vector of frequencies, whose size can differ from one index $j$ to another. Building a Bayesian network relies on testing the independence between $X_j$ and $X_{j'}$. Here, this can be done by using any independence test between two random vectors. We have used the indep.etest() function implemented in the 'energy' package for R; see Székely, G.J. - Rizzo, M.L. (2013). Distributions and conditional distributions are estimated using kernels in the nonparametric case, while Dirichlet distributions are used in the parametric case.

4 The case of densities

Assume that each $V_j$ is a measurable subset of some $\mathbb{R}^{d_j}$ and that $X_{k,j}$ has a density $f_{k,j}$ w.r.t. the Lebesgue measure. Then independence tests can be performed and conditional distributions can be estimated using functional data analysis methods; see Ramsay, J.O. - Silverman, B.W. (2005). One can either use a finite number of coordinates on some basis, reducing the problem to the preceding finite sets case, or use kernel estimators w.r.t. a distance on the space of functions.

References

Darwiche, A. (2009). Modeling and Reasoning with Bayesian Networks. Cambridge University Press.
Ramsay, J.O. - Silverman, B.W. (2005). Functional Data Analysis. Springer.
Székely, G.J. - Rizzo, M.L. (2013). The distance correlation t-test of independence in high dimension. J. Multivariate Anal. 117, 193–213. http://dx.doi.org/10.1016/j.jmva.2013.02.012
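The independence-testing step of Section 3 can be illustrated by a short sketch. The authors report using indep.etest() from the R 'energy' package; the Python fragment below is not a port of that function. It implements a plain permutation test based on the sample distance covariance, a related statistic from the same family, applied to two samples of probability vectors. All function names and the toy data are assumptions made for illustration.

```python
import math
import random


def dist_matrix(rows):
    """Pairwise Euclidean distances between the vectors in `rows`."""
    return [[math.dist(a, b) for b in rows] for a in rows]


def double_center(d):
    """Subtract row and column means and add back the grand mean."""
    n = len(d)
    row_means = [sum(r) / n for r in d]  # equal to column means (d is symmetric)
    grand = sum(row_means) / n
    return [[d[i][j] - row_means[i] - row_means[j] + grand for j in range(n)]
            for i in range(n)]


def dcov2(a, b, perm=None):
    """Squared sample distance covariance; `perm` reorders the second sample."""
    n = len(a)
    idx = perm if perm is not None else list(range(n))
    return sum(a[i][j] * b[idx[i]][idx[j]]
               for i in range(n) for j in range(n)) / n ** 2


def dcov_perm_test(x, y, n_perm=199, seed=0):
    """Return (statistic, permutation p-value) for H0: x and y are independent."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of observations")
    rng = random.Random(seed)
    a = double_center(dist_matrix(x))
    b = double_center(dist_matrix(y))
    observed = dcov2(a, b)
    exceed = 0
    for _ in range(n_perm):
        perm = list(range(len(x)))
        rng.shuffle(perm)  # break any pairing between x and y
        if dcov2(a, b, perm) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)


# Toy probability vectors (hypothetical): toy_y is a deterministic function of toy_x.
toy_x = [[i / 13, 1 - i / 13] for i in range(1, 13)]
toy_y = [[(i / 13) ** 2, 1 - (i / 13) ** 2] for i in range(1, 13)]
statistic, p_value = dcov_perm_test(toy_x, toy_y)
```

Because the test only uses pairwise distances within each sample, the two probability vectors may have different lengths, which matches the remark that the size of $X_{k,j}$ can differ from one index $j$ to another.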
