General title: Joint and Marginal Distribution
Subtitle: Marginal Distribution

Introduction: In probability theory and statistics, the marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset. The term marginal variable refers to a variable in the subset being retained. The distribution of the marginal variables (the marginal distribution) is obtained by "marginalising" over the distribution of the variables being discarded. The context here is that the theoretical study being undertaken, or the data analysis being done, involves a wider set of random variables, but attention is being limited to a reduced number of those variables. In many applications an analysis may start with a given collection of random variables, then extend the set by defining new ones (such as the sum of the original random variables), and finally reduce the number by placing interest in the marginal distribution of a subset (such as the sum). Several different analyses may be done, each treating a different subset of variables as the marginal variables.

Definition of Terms
Marginal variable - a variable in the subset of variables being retained.
Marginal distribution - the probability distribution of the variables contained in a subset of a collection of random variables.

Example 1. Let X and Y be two random variables, and p(x, y) their joint probability distribution. By definition, the marginal distribution of X is just the distribution of X, with Y being ignored (with a similar definition for Y). The reason this concept was introduced is that it often happens that the joint probability distribution of the pair {X, Y} is known, while the individual distributions of X and Y are not.
But it is then possible to derive these individual distributions from the joint distribution, as we show now.

What is the marginal distribution of X? X = xi if and only if one of these mutually exclusive events occurs:
* {xi and y1}
* {xi and y2}
* {xi and y3}
* ...
The probability P{X = xi} is therefore the sum of the probabilities of these events, and we have:
P{X = xi} = Σj p(xi, yj)
If the p(xi, yj) = pij are organized as a rectangular table, P{X = xi} is the sum of all the elements in the ith row. It is often denoted pi. ; then pi. = P{X = xi} may be visualized as being written in the right margin of the table, hence the name "marginal" distribution. Similarly, P{Y = yj} = p.j is the sum of the probabilities in the jth column. This illustration assumes that X can take n values and Y can take m values, but the above result remains true even if X or Y or both can take an (enumerably) infinite number of values, as is the case, for example, for the Poisson or negative binomial distributions.

Example 2: Continuous case. Suppose that X and Y are continuous variables and that their joint distribution can be represented by their joint probability density f(x, y). An informal argument can be developed as for the discrete case. The probability for a realization of (X, Y) to be equal to (x, y) within dx and dy is f(x, y) dx dy. For a given value x, the probability for X to be equal to x within dx is the sum over y of these infinitesimal probabilities. Therefore, the marginal probability density fX(x) of X is given by:
fX(x) = ∫ f(x, y) dy
with a similar result for Y.
-----
It is common to say that the marginal distribution of one variable is obtained by "integrating the other variable out of the joint distribution". Another convenient way of calculating a marginal distribution is by calling on the properties of multivariate moment generating functions. Note that the converse operation is not possible in general: the joint distribution cannot be reconstructed from the marginal distributions alone.

Example 3: There is but one exception to the above remark.
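The row-and-column summation described above can be written in a few lines of code. The joint table below is an arbitrary illustrative example (not taken from the text); the marginals are obtained exactly as in the discrete case, by summing each row and each column.

```python
# Marginal distributions of a discrete joint probability table.
# Rows index the values of X, columns the values of Y; entry [i][j] is p(xi, yj).
# The numbers below are an assumed example, chosen only so that the table sums to 1.
joint = [
    [0.10, 0.05, 0.15],
    [0.20, 0.10, 0.10],
    [0.05, 0.15, 0.10],
]

# pi. = P{X = xi}: sum across the ith row (the "right margin" of the table).
p_x = [sum(row) for row in joint]

# p.j = P{Y = yj}: sum down the jth column (the "bottom margin").
p_y = [sum(row[j] for row in joint) for j in range(len(joint[0]))]

print([round(p, 2) for p in p_x])  # [0.3, 0.4, 0.3]
print([round(p, 2) for p in p_y])  # [0.35, 0.3, 0.35]
```

Both marginals sum to 1, as any probability distribution must.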
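For the continuous case, "integrating the other variable out" can be checked numerically. The density f(x, y) = x + y on the unit square is an assumed example (not from the text); its exact marginal is fX(x) = x + 1/2, which a simple midpoint Riemann sum over y recovers.

```python
# Numerically "integrating y out" of a joint density.
# Assumed example density: f(x, y) = x + y on the unit square [0, 1] x [0, 1],
# whose exact marginal is fX(x) = x + 1/2.
def f(x, y):
    return x + y

def marginal_x(x, n=10_000):
    # Midpoint Riemann sum over y in [0, 1]: fX(x) ~ sum of f(x, yk) * dy
    dy = 1.0 / n
    return sum(f(x, (k + 0.5) * dy) for k in range(n)) * dy

print(marginal_x(0.3))  # ~ 0.8, i.e. 0.3 + 1/2
```

In practice a quadrature routine would replace the hand-rolled sum, but the idea is the same: for each fixed x, sum the joint density over all values of y.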
It can be shown that:
* If the variables X and Y are independent, then their joint probability distribution is the product of the (marginal) distributions of these two variables.
* Conversely, if a joint probability distribution is equal to the product of its marginal distributions, then these marginal variables are independent.
f(x, y) = fX(x) fY(y) iff X and Y are independent
This result provides a very powerful method for proving the independence of two random variables. It generalizes to any number of variables.

Example 4: Let X and Y be two independent random variables, both uniformly distributed in [0, 1]. In this tutorial, we calculate the distributions:
* of the ratio U = X/Y,
* and of the product V = XY.
We do it by:
* first calculating the joint probability distribution of U and V,
* and then calculating the distributions of U and V as the two marginal distributions of this joint distribution.
-----
Although the results are of little practical use, they are beyond the reach of intuition and could hardly have been obtained by a more direct method. The method we describe is powerful and of general use, and this demonstration can be considered a template for calculating the probability distributions of random variables in many circumstances where direct methods fail.

Example 5: Calculating the probability distribution of a random variable A can often be most conveniently achieved:
* by first calculating the joint distribution of A and some other suitably chosen r.v. B,
* and then by considering the distribution of A as one of the marginals of this joint distribution.
We now illustrate this indirect, yet powerful method for calculating distributions.
-----
Let X and Y be two independent random variables, both following the uniform distribution in [0, 1]. One considers the seemingly unrelated and difficult-looking problems:
1) What is the distribution of the r.v. U = X/Y?
2) What is the distribution of the r.v. V = XY?
We show that the answers come from:
* first calculating the joint probability distribution of {U, V},
* and then calculating the distribution of U = X/Y as one of the two marginal distributions of this joint distribution, with a similar approach for V = XY.

Summary: Suppose a pedestrian crosses a street at a traffic light; let L be the colour of the light (red, yellow, or green) and let H record whether the pedestrian is hit by a car. In trying to calculate the marginal probability P(H=hit), what we are asking for is the probability that only one of these variables takes a particular value, irrespective of the value of the other. (Or, in general, if a situation is described by N variables, the probability that n of the variables take n particular values, where n < N.) In general you can be hit if the light is red OR if the light is yellow OR if the light is green, so the correct answer is found by summing the joint probability P(H=hit, L=l) over all possible values of L.

Say P(L=red) = 0.6, P(L=yellow) = 0.1, P(L=green) = 0.3, and the conditional distribution Pr(H=h | L=l) is:

              L=Red    L=Yellow    L=Green
H=Not hit     0.99     0.91        0.10
H=Hit         0.01     0.09        0.90
Total         1        1           1

Multiplying each column by the respective probability of that L-value yields the following joint distribution, with the marginal distribution of H in the right margin:

              L=Red    L=Yellow    L=Green    Marginal P(H=h)
H=Not hit     0.594    0.091       0.030      0.715
H=Hit         0.006    0.009       0.270      0.285
Total         0.600    0.100       0.300      1

So if you just cross the street without looking at the traffic light, your chance of being hit is:
P(H=hit) = 0.6*0.01 + 0.1*0.09 + 0.3*0.90 = 0.285
It is important to interpret these results correctly. Even though these figures are contrived, and the likelihood of being hit while crossing at a red light is probably a lot less than 1%, the chance of being hit by a car when you cross the road is obviously a lot less than 28.5%. What this figure is actually saying is that if you were to put on a blindfold, wear earplugs, and cross the road at some random time, you would have a 28.5% chance of being hit by a car, which seems more reasonable.

Trina Rose F. Mesa II-D1
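The street-crossing calculation in the summary is exactly the marginalisation step: form the joint probabilities from the conditional table and the light probabilities, then sum the "hit" row. A minimal sketch, using the numbers given in the text:

```python
# Marginal P(H = hit) from the conditional table and the light probabilities,
# reproducing the street-crossing numbers in the summary.
p_light = {"red": 0.6, "yellow": 0.1, "green": 0.3}
p_hit_given_light = {"red": 0.01, "yellow": 0.09, "green": 0.90}

# Joint probabilities P(H = hit, L = l) = P(H = hit | L = l) * P(L = l)
joint_hit = {l: p_hit_given_light[l] * p_light[l] for l in p_light}

# Marginalise over L: sum the joint row for H = hit.
p_hit = sum(joint_hit.values())
print({l: round(p, 3) for l in joint_hit for p in [joint_hit[l]]})
print(round(p_hit, 3))  # 0.285
```

The dictionary printed first is the H=hit row of the joint table (0.006, 0.009, 0.270); its sum is the marginal 0.285.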
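Returning to Examples 4 and 5, the ratio and product of two independent Uniform(0, 1) variables can be checked by simulation. The text does not state the resulting densities; the consequences asserted below (P(U <= 1) = 1/2 by the symmetry of X and Y, and E[V] = E[X]E[Y] = 1/4 by independence) are standard facts added here for the check.

```python
import random

# Monte Carlo check of Examples 4-5: X, Y independent Uniform(0, 1),
# U = X / Y and V = X * Y.
random.seed(0)
n = 200_000
u_samples, v_samples = [], []
for _ in range(n):
    x = random.random()
    y = 1.0 - random.random()  # in (0, 1], so the ratio never divides by zero
    u_samples.append(x / y)
    v_samples.append(x * y)

# P(U <= 1) = P(X <= Y) should be 1/2 by symmetry ...
p_u_le_1 = sum(u <= 1 for u in u_samples) / n
# ... and E[V] = E[X] * E[Y] = 1/4 by independence.
mean_v = sum(v_samples) / n
print(p_u_le_1, mean_v)
```

A histogram of the samples would likewise match the marginal densities obtained by the joint-then-marginal method the text describes.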