General title: Joint and Marginal Distribution
Subtitle: Marginal Distribution

Introduction:
In probability theory and statistics, the marginal distribution of a subset of a
collection of random variables is the probability distribution of the variables contained in
the subset. The term marginal variable is used to refer to those variables in the subset
of variables being retained. The distribution of the marginal variables (the marginal
distribution) is obtained by "marginalising" over the distribution of the variables being
discarded.
The context here is that the theoretical studies being undertaken, or the data analysis
being done, involves a wider set of random variables but that attention is being limited
to a reduced number of those variables. In many applications an analysis may start with
a given collection of random variables, then first extend the set by defining new ones
(such as the sum of the original random variables) and finally reduce the number by
placing interest in the marginal distribution of a subset (such as the sum). Several
different analyses may be done, each treating a different subset of variables as the
marginal variables.

Definition of Terms

Marginal variable - refers to those variables in the subset of variables being retained.

Marginal distribution - of a subset of a collection of random variables, the probability distribution of the variables contained in the subset.
Example 1.
Let X and Y be two random variables, and p(x, y) their joint probability distribution.
By definition, the marginal distribution of X is just the distribution of X, with Y being ignored (with a similar definition for Y).
The reason this concept was introduced is that it often happens that the joint probability distribution of the pair {X, Y} is known, while the individual distributions of X and Y are not. It is then possible to derive these individual distributions from the joint distribution, as we show now.
What is the marginal distribution of X?
X = xi if and only if one of these mutually exclusive events occurs:
* {X = xi and Y = y1}
* {X = xi and Y = y2}
* {X = xi and Y = y3}
* ...
The probability P{X = xi} is therefore the sum of the probabilities of these events, and we have:
P{X = xi} = Σj p(xi, yj)   (the sum running over all values yj of Y)
If the p(xi, yj) = pij are organized as a rectangular table, P{X = xi} is the sum of all the elements in the ith row. It is often denoted pi. (the dot standing for the index summed over).
Then pi. = P{X = xi} may be visualized as being written in the right margin of the table, hence the name "marginal" distribution.
Similarly, P{Y = yj } = p.j is the sum of the probabilities in the jth column.
This illustration assumes that X can take n values and Y can take m values, but the above result remains true even if X or Y (or both) can take a countably infinite number of values, as is the case, for example, for the Poisson or negative binomial distributions.
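The row-and-column-sum computation can be checked numerically. Below is a minimal Python sketch using a small made-up joint table (the numbers are purely illustrative and not taken from the text):

import numpy as np

# p[i, j] = P{X = xi and Y = yj}; rows index the values of X, columns those of Y.
p = np.array([
    [0.10, 0.20, 0.10],
    [0.05, 0.25, 0.30],
])

p_i = p.sum(axis=1)   # marginal of X: sum of each row    (the pi.)
p_j = p.sum(axis=0)   # marginal of Y: sum of each column (the p.j)

print("P{X = xi}:", p_i)    # ~[0.4  0.6]
print("P{Y = yj}:", p_j)    # ~[0.15 0.45 0.4]
print("total:", p.sum())    # ~1.0 - a valid joint table sums to 1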
Example 2:
Continuous case
Suppose that X and Y are continuous variables and that their joint distribution can be represented by their joint probability density f(x, y). An informal argument can be developed as in the discrete case.
The probability for a realization of (X, Y) to be equal to (x, y) within dx and dy is f(x, y) dx dy. For a given value x, the probability for X to be equal to x within dx is the sum over y of these infinitesimal probabilities. Therefore, the marginal probability density fX(x) of X is given by:
fX(x) = ∫ f(x, y) dy   (the integral running over all values of y)
with a similar result for Y.
-----
It is common to say that the marginal distribution of one variable is obtained by "integrating the other variable out of the joint distribution".
Another convenient way of calculating a marginal distribution is by calling on the properties of multivariate moment generating functions.
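For instance, for two variables whose joint moment generating function MX,Y(t, u) = E[exp(tX + uY)] exists, the marginal MGF of X is obtained by setting to zero the argument of the variable being discarded:
MX(t) = E[exp(tX)] = E[exp(tX + 0·Y)] = MX,Y(t, 0), and similarly MY(u) = MX,Y(0, u).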
Example 3:
An important special case is that of independent variables. It can be shown that:
* If the variables X and Y are independent, then their joint probability distribution is
the product of the (marginal) distributions of these two variables.
* Conversely, if a joint probability distribution is equal to the product of its marginal distributions, then the marginal variables are independent.
f(x, y) = fX(x) fY(y) if and only if X and Y are independent
This result provides a very powerful method for proving the independence of two random variables. It generalizes to any number of variables.
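A minimal numerical sketch of this criterion for a discrete joint table (again with made-up numbers): the first table below is built as the product of its marginals, so the factorization test succeeds; perturbing one cell makes it fail.

import numpy as np

p_x = np.array([0.4, 0.6])          # an illustrative marginal for X
p_y = np.array([0.15, 0.45, 0.40])  # an illustrative marginal for Y

# A joint table built as the product of the marginals, so X and Y are independent.
p = np.outer(p_x, p_y)

# Recompute the marginals from the joint table and test the factorization.
print(np.allclose(p, np.outer(p.sum(axis=1), p.sum(axis=0))))   # True

# Perturbing one cell (and renormalizing) destroys the factorization.
q = p.copy()
q[0, 0] += 0.05
q /= q.sum()
print(np.allclose(q, np.outer(q.sum(axis=1), q.sum(axis=0))))   # False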
Example 4:
Let X and Y be two independent random variables, both uniformly distributed in [0, 1].
In this tutorial, we calculate the distributions:
* Of the ratio U = X/Y,
* And of the product V = XY.
We do it by:
* First calculating the joint probability distribution of U and V.
* And then by calculating the distributions of U and V as the two marginal distributions
of this joint distribution.
-----
Although the results are of little practical use, they are beyond the reach of intuition and could hardly have been obtained by a more direct method.
The method we describe is powerful and of general use, and this demonstration can be
considered as a template for calculating the probability distributions of random variables
in many circumstances where direct methods fail.
Example 5:
Calculating the probability distribution of a random variable A can often be most conveniently achieved:
* By first calculating the joint distribution of A and some other suitably chosen r.v. B.
* And then by considering the distribution of A as one of the marginals of this joint
distribution.
We now illustrate this indirect yet powerful method for calculating distributions with the following example.
_______
Let X and Y be two independent random variables, both following the uniform
distribution in [0, 1].
One considers the seemingly unrelated and difficult-looking problems:
1) What is the distribution of the r.v. U = X/Y?
2) What is the distribution of the r.v. V = XY?
We show that the answers will come from:
* First calculating the joint probability distribution of {U, V},
* And then calculating the distribution of U = X/Y as one of the two marginal
distributions of this joint distribution, with a similar approach for V = XY.
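The excerpt does not reproduce the final densities, but the method can at least be sanity-checked by simulation. The sketch below draws a large sample of independent uniforms and compares a few empirical probabilities with the standard closed-form results for these two transformations (quoted in the comments for reference, not derived here):

import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.uniform(size=n)
y = rng.uniform(size=n)

u = x / y                 # the ratio   U = X/Y
v = x * y                 # the product V = XY

# Empirical checks against the standard closed-form answers
# (f_U(u) = 1/2 on (0, 1] and 1/(2u^2) for u > 1; f_V(v) = -ln v on (0, 1]):
print(np.mean(u <= 1.0))  # ~0.5    since P(U <= 1) = 1/2
print(np.mean(u <= 2.0))  # ~0.75   since P(U <= 2) = 1 - 1/(2*2)
print(np.mean(v <= 0.5))  # ~0.847  since P(V <= 0.5) = 0.5 - 0.5*ln(0.5)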

Summary:
As a concluding example, suppose a pedestrian crosses a road at a traffic light. Let L denote the colour of the light (red, yellow, or green) and H denote whether the pedestrian is hit by a car. In trying to calculate the marginal probability P(H=hit), what we are asking for is the probability that one of these variables takes a particular value, irrespective of the value of the other. (Or, in general, if a situation is described by N variables, the probability that n of them take n particular values, where n<N.) You can be hit whether the lights are red, yellow, or green, so in this case the correct answer is found by summing the joint probability P(H=hit, L=l) over all possible values l of L.
Say P(L=red) = 0.6, P(L=yellow) = 0.1, P(L=green) = 0.3, and suppose the conditional distribution of H given L is:

Conditional distribution: Pr(H=h | L=l)
            L=Red    L=Yellow    L=Green
H=Not hit   0.99     0.91        0.10
H=Hit       0.01     0.09        0.90
Total       1        1           1
Using the respective probability of each L-value yields the following joint distribution:

Joint distribution: Pr(H=h, L=l)
            L=Red    L=Yellow    L=Green    Marginal distribution
H=Not hit   0.594    0.091       0.030      0.715
H=Hit       0.006    0.009       0.270      0.285
Total       0.600    0.100       0.300      1
So if you just cross the street, without looking at the traffic light, your chance of being hit is:
P(H=hit) = 0.6*0.01 + 0.1*0.09 + 0.3*0.9 = 0.285
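The same arithmetic as a short Python sketch, using the numbers above:

import numpy as np

# Prior on the light colour L and conditional P(H=hit | L=l),
# in the order (red, yellow, green) - the numbers used above.
p_L = np.array([0.6, 0.1, 0.3])
p_hit_given_L = np.array([0.01, 0.09, 0.90])

joint_hit = p_L * p_hit_given_L   # joint probabilities P(H=hit, L=l)
print(joint_hit)                  # ~[0.006 0.009 0.27]
print(joint_hit.sum())            # ~0.285, the marginal P(H=hit)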
It is important to interpret these results correctly - even though these figures are contrived and
the likelihood of being hit while crossing at a red light is probably a lot less than 1%, the chance
of being hit by a car when you cross the road is obviously a lot less than 28.5%. However, what
this figure is actually saying is that if you were to put on a blindfold, wear earplugs, and cross the
road at some random time, you'd have a 28.5% chance of being hit by a car, which seems more
reasonable.
Trina Rose F. Mesa II-D1