
Chapter 5: Multivariate random variables

5.3 Introduction
So far, we have considered univariate situations: one random variable at a time. Now we will consider multivariate situations: two or more random variables at once, and together.
In particular, we consider two somewhat different types of multivariate situations:
1. Several different variables, such as the height and weight of a person.
2. Several observations of the same variable, considered together, such as the heights of all n people in a sample.
Suppose that X_1, X_2, ..., X_n are random variables. Then the vector X = (X_1, X_2, ..., X_n)' is a multivariate random variable (hence n-variate), also known as a random vector. Its possible values are the vectors x = (x_1, x_2, ..., x_n)', where each x_i is a possible value of the random variable X_i, for i = 1, 2, ..., n.
The joint probability distribution of a multivariate random variable X is defined by the possible values x and their probabilities.
For now, we consider just the simplest multivariate case: a bivariate random variable, where n = 2. This is sufficient for introducing most of the concepts of multivariate random variables.
For notational simplicity, we will use X and Y instead of X_1 and X_2. A bivariate random variable is then the pair (X, Y).
5.4 Joint probability functions
When the random variables in X = (X_1, X_2, ..., X_n) are either all discrete or all continuous, we also call the multivariate random variable discrete or continuous, respectively.
For a discrete multivariate random variable, the joint probability distribution is described by the joint probability function, defined as
p(x_1, x_2, ..., x_n) = P(X_1 = x_1, X_2 = x_2, ..., X_n = x_n)
for all vectors (x_1, x_2, ..., x_n) of n real numbers. The value p(x_1, x_2, ..., x_n) of the joint probability function is itself a single number, not a vector.
In the bivariate case, this is p(x, y) = P(X = x, Y = y), which we sometimes write as p_{X,Y}(x, y) to make the random variables clear.
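To make the definition concrete, here is a minimal Python sketch; the joint pf below is made up purely for illustration, stored as a table of (x, y) pairs:

```python
# A small bivariate joint pf, stored as a table (dictionary).
# The numbers are illustrative, not from the notes.
joint_pf = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}

# A valid joint pf is non-negative and sums to 1 over all (x, y) pairs.
assert all(p >= 0 for p in joint_pf.values())
assert abs(sum(joint_pf.values()) - 1.0) < 1e-12

# p(x, y) = P(X = x, Y = y); pairs not listed have probability 0.
print(joint_pf.get((1, 0), 0.0))  # 0.3
```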
5.5 Marginal distributions
Consider a multivariate discrete random variable X = (X_1, X_2, ..., X_n).
The marginal distribution of a subset of the variables in X is the (joint) distribution of this subset. The joint pf of these variables (the marginal pf) is obtained by summing the joint pf of X over the variables which are not included in the subset.
The simplest marginal distributions are those of individual variables in the multivariate random variable.
The marginal pf is then obtained by summing the joint pf over all the other variables; in the bivariate case, for example, p_X(x) = ∑_y p_{X,Y}(x, y).
The resulting marginal distribution is univariate, and its pf is a univariate pf.
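Continuing the illustrative joint pf from above, marginalization is just a sum over the other variable:

```python
from collections import defaultdict

joint_pf = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}

# Marginal pf of X: sum the joint pf over all values of Y.
marginal_x = defaultdict(float)
for (x, y), p in joint_pf.items():
    marginal_x[x] += p

print(dict(marginal_x))  # {0: 0.3, 1: 0.7} (up to rounding)
```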
Even for a multivariate random variable, expected values E(X_i), variances Var(X_i) and medians of individual variables are obtained from the univariate (marginal) distributions of the X_i.
5.6 Continuous multivariate distributions
If all the random variables in X = (X_1, X_2, ..., X_n) are continuous, the joint distribution of X is specified by its joint probability density function f(x_1, x_2, ..., x_n).
Marginal distributions are defined as in the discrete case, but with integration instead of summation: in the bivariate case, for example, f_X(x) = ∫ f_{X,Y}(x, y) dy.
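As an illustration (using the made-up joint pdf f(x, y) = x + y on the unit square, which is not from the notes), the marginal pdf can be obtained by symbolic integration:

```python
import sympy as sp

x, y = sp.symbols("x y", nonnegative=True)

# Illustrative joint pdf on [0, 1]^2: f(x, y) = x + y, which integrates to 1.
f_xy = x + y

# Marginal pdf of X: integrate the joint pdf over y.
f_x = sp.integrate(f_xy, (y, 0, 1))
print(f_x)  # x + 1/2

# Sanity check: the marginal pdf integrates to 1 over x in [0, 1].
print(sp.integrate(f_x, (x, 0, 1)))  # 1
```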
5.7 Conditional distributions
Consider discrete random variables X and Y, with joint pf p(x, y) = p_{X,Y}(x, y) and marginal pfs p_X(x) and p_Y(y), respectively.
The conditional distribution of Y given X = x is defined, for any x such that p_X(x) > 0, by the conditional pf
p_{Y|X}(y | x) = p_{X,Y}(x, y) / p_X(x).
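A small Python sketch of this definition, reusing the illustrative joint pf from the example in Section 5.4:

```python
joint_pf = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}

def conditional_pf_y_given_x(x):
    """p_{Y|X}(y | x) = p_{X,Y}(x, y) / p_X(x), defined when p_X(x) > 0."""
    p_x = sum(p for (xi, _), p in joint_pf.items() if xi == x)
    if p_x == 0:
        raise ValueError("conditioning on an x with p_X(x) = 0")
    return {yi: p / p_x for (xi, yi), p in joint_pf.items() if xi == x}

print(conditional_pf_y_given_x(1))  # {0: 0.428..., 1: 0.571...}
```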
5.7.1 Properties of conditional distributions
Each different value of x defines a different conditional distribution and conditional pf p_{Y|X}(y | x). Each value of p_{Y|X}(y | x) is a conditional probability of the kind previously defined.
A conditional distribution is itself a probability distribution, and a conditional pf is a pf. Clearly, p_{Y|X}(y | x) ≥ 0 for all y, and ∑_y p_{Y|X}(y | x) = 1.
The conditional distribution and pf of X given Y = y (for any y such that p_Y(y) > 0) are defined similarly, with the roles of X and Y reversed: for any value x,
p_{X|Y}(x | y) = p_{X,Y}(x, y) / p_Y(y).
Conditional distributions are general and are not limited to the bivariate case. If X and/or Y are vectors of random variables, the conditional pf of Y given X = x is
p_{Y|X}(y | x) = p_{X,Y}(x, y) / p_X(x),
where p_{X,Y}(x, y) is the joint pf of the random vector (X, Y), and p_X(x) is the marginal pf of the random vector X.
5.7.2 Conditional mean and variance
Since a conditional distribution is a probability distribution, it also has a mean (expected value) and variance (and median, etc.).
These are known as the conditional mean and conditional variance, and are denoted by E(Y | X = x) and Var(Y | X = x), respectively.
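For example, with the conditional pf p_{Y|X}(y | 1) = {0: 3/7, 1: 4/7} computed in the earlier sketch:

```python
cond_pf = {0: 3 / 7, 1: 4 / 7}  # p_{Y|X}(y | x = 1) from the earlier sketch

# Conditional mean: E(Y | X = x) = sum over y of y * p_{Y|X}(y | x).
cond_mean = sum(y * p for y, p in cond_pf.items())

# Conditional variance: E(Y^2 | X = x) - [E(Y | X = x)]^2.
cond_var = sum(y**2 * p for y, p in cond_pf.items()) - cond_mean**2

print(cond_mean, cond_var)  # 0.571... 0.244...
```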
5.7.3 Continuous conditional distributions
Suppose X and Y are continuous, with joint pdf f_{X,Y}(x, y) and marginal pdfs f_X(x) and f_Y(y).
The conditional distribution of Y given that X = x is a continuous probability distribution with the pdf
f_{Y|X}(y | x) = f_{X,Y}(x, y) / f_X(x).
This is defined whenever f_X(x) > 0.
For the conditional distribution of X given Y = y, f_{X|Y}(x | y) is defined similarly, with the roles of X and Y reversed.
Unlike in the discrete case, f_{Y|X}(y | x) is not itself a conditional probability. However, it is the pdf of a continuous random variable, so the conditional distribution is itself a continuous probability distribution.
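Continuing the illustrative joint pdf f(x, y) = x + y on the unit square from Section 5.6:

```python
import sympy as sp

x, y = sp.symbols("x y", nonnegative=True)

f_xy = x + y                         # illustrative joint pdf on [0, 1]^2
f_x = sp.integrate(f_xy, (y, 0, 1))  # marginal pdf: x + 1/2

# Conditional pdf of Y given X = x (defined where f_X(x) > 0),
# equivalent to (x + y) / (x + 1/2).
f_y_given_x = sp.simplify(f_xy / f_x)
print(f_y_given_x)

# It integrates to 1 over y for any fixed x, as a pdf must.
print(sp.simplify(sp.integrate(f_y_given_x, (y, 0, 1))))  # 1
```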
5.8 Covariance and correlation
Suppose that the conditional distributions p_{Y|X}(y | x) of a random variable Y given different values x of a random variable X are not all the same, i.e. the conditional distribution of Y 'depends on' the value of X. There is then said to be an association (or dependence) between X and Y.
If two random variables are associated (dependent), knowing the value of one will help to predict the likely value of the other.
5.8.1 Covariance
The covariance of X and Y is defined as
Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y).
Properties of covariance
Suppose X and Y are random variables, and a, b, c and d are constants.
The covariance of a random variable with itself is the variance of the random variable:
Cov(X, X) = Var(X).
The covariance of a random variable and a constant is 0:
Cov(X, a) = 0.
The covariance of linear transformations of random variables is:
Cov(aX + b, cY + d) = ac Cov(X, Y).
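A quick simulation check of the linear-transformation property (all numbers here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated illustrative data: Y is built from X plus independent noise.
x = rng.normal(size=100_000)
y = 2.0 * x + rng.normal(size=100_000)

a, b, c, d = 3.0, 1.0, -2.0, 5.0
lhs = np.cov(a * x + b, c * y + d)[0, 1]  # Cov(aX + b, cY + d)
rhs = a * c * np.cov(x, y)[0, 1]          # ac * Cov(X, Y)
print(lhs, rhs)  # identical up to floating-point rounding
```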
5.8.2 Correlation
The correlation of X and Y is defined as
Corr(X, Y) = Cov(X, Y) / √(Var(X) Var(Y)).
Correlation and covariance are measures of the strength of the linear association between X and Y.
The further the correlation is from 0, the stronger the linear association.
The most extreme possible values of correlation are −1 and +1, which are obtained when Y is an exact linear function of X:
Corr(X, Y) = +1 when Y = aX + b with a > 0
Corr(X, Y) = −1 when Y = aX + b with a < 0
If Corr(X, Y) > 0, X and Y are positively correlated.
If Corr(X, Y) < 0, X and Y are negatively correlated.
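The extreme values are easy to see numerically (an illustrative check, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000)

# Exact linear functions of X give correlation +1 or -1,
# up to floating-point rounding.
print(np.corrcoef(x, 3.0 * x + 2.0)[0, 1])   # ~ +1.0
print(np.corrcoef(x, -0.5 * x + 7.0)[0, 1])  # ~ -1.0
```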
5.8.3 Sample covariance and correlation
Let (X_1, Y_1), (X_2, Y_2), ..., (X_n, Y_n) be a sample of n pairs of observed values of two random variables X and Y.
We can use these observations to calculate sample versions of the covariance and correlation between X and Y. These are measures of association in the sample. They are also estimates of the corresponding population quantities Cov(X, Y) and Corr(X, Y).
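A minimal sketch (the paired observations are made up; note that NumPy's default divisor for the sample covariance is n − 1):

```python
import numpy as np

# Illustrative paired observations of X and Y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

sample_cov = np.cov(x, y)[0, 1]        # divisor n - 1 by default
sample_corr = np.corrcoef(x, y)[0, 1]  # scale-free, always in [-1, 1]
print(sample_cov, sample_corr)
```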
5.9 Independent random variables
Two discrete random variables X and Y are associated if p_{Y|X}(y | x) depends on x.
What if it does not? If p_{Y|X}(y | x) = p_Y(y) for all x and y, then X and Y are independent. Equivalently, X and Y are independent if and only if p_{X,Y}(x, y) = p_X(x) p_Y(y) for all x and y.
If two random variables are independent, they are also uncorrelated: Cov(X, Y) = 0 and Corr(X, Y) = 0.
The reverse is not true: two random variables can be dependent even when their correlation is 0. This can happen when the dependence is non-linear, as the sketch below illustrates.
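A classic illustration (a made-up simulation, not from the notes): take X symmetric about 0 and Y = X². Then Y is completely determined by X, yet their correlation is 0.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=1_000_000)  # symmetric about 0
y = x**2                        # fully determined by X, but non-linearly

print(np.corrcoef(x, y)[0, 1])  # close to 0: uncorrelated yet dependent
```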
5.9.1 Joint distribution of independent random variables
When random variables are independent, we can easily derive their joint pf or pdf as the product of their univariate marginal distributions:
p(x_1, x_2, ..., x_n) = p_{X_1}(x_1) p_{X_2}(x_2) ··· p_{X_n}(x_n)
(and similarly with pdfs f in the continuous case). This is particularly simple if all the marginal distributions are the same.
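For instance (an illustrative sketch with a made-up marginal pf):

```python
from itertools import product

# Marginal pf of one discrete variable.
p = {0: 0.3, 1: 0.7}

# Joint pf of three independent copies: the product of the marginals.
joint = {(a, b, c): p[a] * p[b] * p[c] for a, b, c in product(p, repeat=3)}

print(joint[(1, 1, 0)])                      # 0.7 * 0.7 * 0.3 = 0.147
print(abs(sum(joint.values()) - 1) < 1e-12)  # True: a valid joint pf
```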
5.10 Sums and products of random variables
Suppose X_1, X_2, ..., X_n are random variables. We now go from the multivariate setting back to the univariate setting, by considering univariate functions of X_1, X_2, ..., X_n: sums of the form
a_1 X_1 + a_2 X_2 + ... + a_n X_n + b,   (5.2)
where a_1, a_2, ..., a_n and b are constants, and products such as X_1 X_2 ··· X_n.
Each such sum or product is itself a univariate random variable.
The probability distribution of such a function depends on the joint distribution of X_1, X_2, ..., X_n.
5.10.1 Distributions of sums and products
Do simple general results exist for sums and products of random variables?

                       Sums                             Products
Mean                   Yes                              Only for independent variables
Variance               Yes                              No
Distributional form    Normal: yes; some other          No
                       distributions: only for
                       independent variables
5.10.2 Expected values and variances of sums of random variables
If X_1, X_2, ..., X_n are random variables with means E(X_1), E(X_2), ..., E(X_n), respectively, and a_1, a_2, ..., a_n and b are constants, then:
E(a_1 X_1 + a_2 X_2 + ... + a_n X_n + b) = a_1 E(X_1) + a_2 E(X_2) + ... + a_n E(X_n) + b
Two simple special cases of this, when n = 2, are:
E(X + Y) = E(X) + E(Y), obtained by choosing X_1 = X, X_2 = Y, a_1 = a_2 = 1 and b = 0
E(X − Y) = E(X) − E(Y), obtained by choosing X_1 = X, X_2 = Y, a_1 = 1, a_2 = −1 and b = 0
If X_1, X_2, ..., X_n are random variables with variances Var(X_1), Var(X_2), ..., Var(X_n) and covariances Cov(X_i, X_j) for i ≠ j, and a_1, a_2, ..., a_n and b are constants, then:
Var(a_1 X_1 + a_2 X_2 + ... + a_n X_n + b) = a_1² Var(X_1) + ... + a_n² Var(X_n) + 2 ∑_{i<j} a_i a_j Cov(X_i, X_j)
In particular, for n = 2:
Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
Var(X − Y) = Var(X) + Var(Y) − 2 Cov(X, Y)
If X_1, X_2, ..., X_n are independent random variables, then Cov(X_i, X_j) = 0 for all i ≠ j, so:
Var(a_1 X_1 + a_2 X_2 + ... + a_n X_n + b) = a_1² Var(X_1) + ... + a_n² Var(X_n)
In particular, for n = 2, when X and Y are independent:
Var(X + Y) = Var(X) + Var(Y)
Var(X − Y) = Var(X) + Var(Y)
These results also hold whenever Cov(X_i, X_j) = 0 for all i ≠ j, even if the random variables are not independent.
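A simulation check (illustrative data; the identities hold exactly for the sample versions too when the same divisor is used throughout):

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated illustrative data.
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)

cov_xy = np.cov(x, y)[0, 1]
var_x = np.var(x, ddof=1)
var_y = np.var(y, ddof=1)

print(np.var(x + y, ddof=1), var_x + var_y + 2 * cov_xy)  # equal
print(np.var(x - y, ddof=1), var_x + var_y - 2 * cov_xy)  # equal
```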
5.10.3 Expected values of products of independent random variables
If X_1, X_2, ..., X_n are independent random variables and a_1, a_2, ..., a_n are constants, then:
E(a_1 X_1 × a_2 X_2 × ... × a_n X_n) = (a_1 a_2 ··· a_n) E(X_1) E(X_2) ··· E(X_n)
In particular, when X and Y are independent:
E(XY) = E(X)E(Y)
There is no corresponding simple result for the means of products of dependent random variables. There is also no simple result for the variances of products of random variables, even when they are independent.
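A quick illustrative check of E(XY) = E(X)E(Y) for independent variables:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Independent X and Y; the distributions are arbitrary illustrations.
x = rng.exponential(2.0, size=n)  # E(X) = 2
y = rng.normal(3.0, 1.0, size=n)  # E(Y) = 3

print((x * y).mean())       # ~ 6
print(x.mean() * y.mean())  # ~ 6
```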
5.10.4 Some proofs of previous results
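As one representative example of these proofs (a standard argument, reconstructed rather than quoted from the original), the result Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) follows directly from the definitions of variance and covariance:

Var(X + Y) = E[((X + Y) − E(X + Y))²]
           = E[((X − E(X)) + (Y − E(Y)))²]
           = E[(X − E(X))²] + E[(Y − E(Y))²] + 2 E[(X − E(X))(Y − E(Y))]
           = Var(X) + Var(Y) + 2 Cov(X, Y).

The other results of Section 5.10.2 follow by similar expansions of the defining expectations.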
5.10.5 Distributions of sums of random variables
We know the expected value and variance of the sum a_1 X_1 + a_2 X_2 + ... + a_n X_n + b, whatever the joint distribution of X_1, X_2, ..., X_n.
This is usually all we can say about the distribution of this sum.
In particular, the form of the distribution of the sum (its pdf/pf) depends on the joint distribution of X_1, X_2, ..., X_n, and there are no simple general results about that.
For example, even if X and Y have distributions from the same family, the distribution of X + Y is often not from that same family. However, such results are available for a few special cases.
Sums of independent binomial and Poisson random variables
Suppose X_1, X_2, ..., X_n are random variables, and we consider the unweighted sum X_1 + X_2 + ... + X_n, that is, the general sum given by (5.2) with a_1 = a_2 = ... = a_n = 1 and b = 0.
The following results hold when the random variables X_1, X_2, ..., X_n are independent, but not otherwise:
If X_i ~ Bin(m_i, π) for i = 1, 2, ..., n, then X_1 + X_2 + ... + X_n ~ Bin(m_1 + m_2 + ... + m_n, π).
If X_i ~ Poisson(λ_i) for i = 1, 2, ..., n, then X_1 + X_2 + ... + X_n ~ Poisson(λ_1 + λ_2 + ... + λ_n).
Application to the binomial distribution
The Bernoulli(π) distribution is the Bin(1, π) distribution, so the sum of n independent Bernoulli(π) random variables follows the Bin(n, π) distribution; this is exactly how the binomial distribution arises as the number of successes in n independent trials.
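A simulation sketch of the Poisson result (the rates are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
lams = [1.0, 2.5, 0.5]  # rates of three independent Poisson variables
n = 1_000_000

# The sum of independent Poisson variables...
total = sum(rng.poisson(lam, size=n) for lam in lams)

# ...matches a single Poisson with the summed rate, lambda = 4.0.
direct = rng.poisson(sum(lams), size=n)

print(total.mean(), direct.mean())  # both ~ 4.0
print(total.var(), direct.var())    # both ~ 4.0
```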
Sums of normally distributed random variables
All sums (linear combinations) of normally distributed random variables are also normally distributed. In particular, if X_1, X_2, ..., X_n are independent and X_i ~ N(μ_i, σ_i²), then
a_1 X_1 + ... + a_n X_n + b ~ N(a_1 μ_1 + ... + a_n μ_n + b, a_1² σ_1² + ... + a_n² σ_n²).
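A simulation check of this result (the particular normal distributions are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# X ~ N(1, 2^2) and Y ~ N(-3, 1^2), independent; consider S = 2X - Y + 5.
x = rng.normal(1.0, 2.0, size=n)
y = rng.normal(-3.0, 1.0, size=n)
s = 2 * x - y + 5

# Theory: S ~ N(2(1) - (-3) + 5, 2^2 * 2^2 + (-1)^2 * 1^2) = N(10, 17).
print(s.mean(), s.var())  # ~ 10, ~ 17
```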