Lecture 2

Module II
Random Variables
As pointed out earlier, variability is omnipresent in the business world. To
model this variability statistically, we introduce the concept of a random variable.
A random variable is a numerically valued variable which takes on different
values with fixed probabilities.
Examples:
The return on an investment over a one-year period
The price of an equity
The number of customers entering a store
The sales volume of a store on a particular day
The turnover rate at your organization next year
In order to illustrate a random variable, consider the simple experiment of
flipping a fair coin 4 times in a row. The table below shows the 16 possible
outcomes, each with probability 1/16 = .0625.
HHHH   TTTT
HHHT   TTTH
HHTH   TTHT
HTHH   THTT
THHH   HTTT
HHTT   TTHH
HTHT   THTH
HTTH   THHT
Since this is a list of all possible outcomes of the "experiment", it is called
the "Sample Space" of the experiment.
Notice that so far there is no random variable; the outcomes are sequences of
heads and tails, not numbers.
Let x = the number of heads in the four flips. By simple counting we obtain
the table below:
Outcome   x        Outcome   x
HHHH      4        TTTT      0
HHHT      3        TTTH      1
HHTH      3        TTHT      1
HTHH      3        THTT      1
THHH      3        HTTT      1
HHTT      2        TTHH      2
HTHT      2        THTH      2
HTTH      2        THHT      2
By counting the number of times each value of x occurs, we obtain the
following table:
x    P(x)
0    1/16 = .0625
1    4/16 = .2500
2    6/16 = .3750
3    4/16 = .2500
4    1/16 = .0625
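This tally can be reproduced by brute force. Below is a minimal Python sketch
(not part of the original lecture; the names sample_space and dist_x are
illustrative only) that enumerates the 16 outcomes and accumulates the
probability of each value of x:

    from itertools import product
    from fractions import Fraction

    # All 16 equally likely sequences of four flips.
    sample_space = list(product("HT", repeat=4))

    # x = number of heads; accumulate P(x) in steps of 1/16.
    dist_x = {}
    for outcome in sample_space:
        x = outcome.count("H")
        dist_x[x] = dist_x.get(x, 0) + Fraction(1, 16)

    for x in sorted(dist_x):
        print(x, dist_x[x], float(dist_x[x]))
    # 0 1/16 0.0625
    # 1 1/4  0.25
    # 2 3/8  0.375
    # 3 1/4  0.25
    # 4 1/16 0.0625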
We can portray this same data graphically as shown below:
[Figure: "Probability Distribution of x", a bar chart of Probability against
x = number of heads, for x = 0 through 4.]
This graph is a representation of the probability distribution of the random
variable x.
It looks very much like a histogram, but it is very different since no sample
has been taken. However, we can generalize the ideas we explored for sample data
to random variables.
Consider the formula for finding the sample mean from grouped data,
specifically:

x̄ = Σ_i f_i m_i / n = Σ_i (f_i/n) m_i = Σ_i p_i m_i
By analogy, one then has:

μ = E(x) = Σ_x x P(x)
In our case we get:
E(x) = (0 × .0625) + (1 × .25) + (2 × .375) + (3 × .25) + (4 × .0625) = 2.
By a similar argument, one can show that the standard deviation of a
random variable can be computed using the formula:

σ = SD(x) = √( Σ_x (x − μ)² P(x) )
By formal mathematical manipulation, the formula can be simplified to:

SD(x) = √( Σ_x x² P(x) − [E(x)]² )
EXCEL does not automatically compute the expected value and standard
deviation of a random variable, but they are extremely easy to compute, as
illustrated below:
x     P(x)     xP(x)   x²P(x)
0     0.0625   0.00    0.00
1     0.2500   0.25    0.25
2     0.3750   0.75    1.50
3     0.2500   0.75    2.25
4     0.0625   0.25    1.00
Sum   1.0000   2.00    5.00
We see by adding up the entries in the third column that

E(x) = 2.

Further, by using the sum of the entries in the fourth column, we have that

SD(x) = √(5 − 2²) = √(5 − 4) = 1.
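The same spreadsheet arithmetic is easy to mirror in code. Here is a short
Python sketch (an illustration, not part of the original lecture) that builds
the xP(x) and x²P(x) columns and applies the shortcut formula:

    # P(x) for x = number of heads in four flips of a fair coin.
    P = {0: .0625, 1: .25, 2: .375, 3: .25, 4: .0625}

    ex  = sum(x * p for x, p in P.items())        # sum of the xP(x) column, E(x)
    ex2 = sum(x * x * p for x, p in P.items())    # sum of the x²P(x) column
    sd  = (ex2 - ex ** 2) ** 0.5                  # shortcut formula for SD(x)

    print(ex, ex2, sd)   # 2.0 5.0 1.0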
All of the concepts we introduced for samples also apply to random
variables. For example, Chebyshev's inequality continues to hold for random
variables just as it did for sample data.
The mound rule is also applicable.
Finally, the concept of "t" scores also applies, except that since we are now
using theoretical rather than sample values, we shall use "z" scores, where z is
defined as:

z = (x − μ)/σ.
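For example, using the values just computed for x (μ = 2 and σ = 1), the
outcome x = 4 has z = (4 − 2)/1 = 2; it lies two standard deviations above the
mean.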
It is possible to define another random variable on the same sample space.
Let y equal the number of heads which occur before the first tail when you
flip a fair coin four times. The result is:
Outcome   x   y        Outcome   x   y
HHHH      4   4        TTTT      0   0
HHHT      3   3        TTTH      1   0
HHTH      3   2        TTHT      1   0
HTHH      3   1        THTT      1   0
THHH      3   0        HTTT      1   1
HHTT      2   2        TTHH      2   0
HTHT      2   1        THTH      2   0
HTTH      2   1        THHT      2   0
Again, since we are assuming a fair coin, each sequence of four flips has
probability 1/16 = .0625. By counting the points for each distinct value of y, we can
construct the following table, followed by a pictorial representation of the
probability distribution of the random variable y:
y     P(y)
0     0.5000
1     0.2500
2     0.1250
3     0.0625
4     0.0625
Sum   1.0000
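As with x, this distribution can be verified by enumeration. A minimal
Python sketch (illustrative only; heads_before_first_tail is a hypothetical
helper name) computes y for each outcome, tallies P(y), and accumulates the
sums used a little further below:

    from itertools import product
    from fractions import Fraction

    def heads_before_first_tail(outcome):
        # y = number of leading H's (all four if no tail ever occurs).
        count = 0
        for flip in outcome:
            if flip == "T":
                break
            count += 1
        return count

    dist_y = {}
    for outcome in product("HT", repeat=4):
        y = heads_before_first_tail(outcome)
        dist_y[y] = dist_y.get(y, 0) + Fraction(1, 16)

    ey  = sum(y * p for y, p in dist_y.items())       # 0.9375
    ey2 = sum(y * y * p for y, p in dist_y.items())   # 2.3125
    print(float(ey), float(ey2 - ey ** 2) ** 0.5)     # 0.9375 1.1973...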
[Figure: "Probability Distribution of y", a bar chart of Probability against
y = heads before first tail, for y = 0 through 4.]
As before we can compute the expected value and standard deviation of y
using EXCEL by cumulating the necessary sums as shown below:
y     P(y)     yP(y)    y²P(y)
0     0.5000   0.0000   0.0000
1     0.2500   0.2500   0.2500
2     0.1250   0.2500   0.5000
3     0.0625   0.1875   0.5625
4     0.0625   0.2500   1.0000
Sum   1.0000   0.9375   2.3125
We then have:

E(y) = .9375

SD(y) = √(2.3125 − (.9375)²) = √1.4335938 = 1.1973278
Notice that in this case the expected value is not one of the actual values of y
that can occur.
Since for each of the outcomes in our sample space we have both an x and y
value, it is possible to array the data in two dimensions simultaneously to obtain
what is called the joint probability distribution of x and y. The table below gives the
numerical values of P(x,y) for the possible x and y values in our experiment:
x\y          0        1        2        3        4        Row Sum
0            0.0625   0        0        0        0        0.0625
1            0.1875   0.0625   0        0        0        0.2500
2            0.1875   0.1250   0.0625   0        0        0.3750
3            0.0625   0.0625   0.0625   0.0625   0        0.2500
4            0        0        0        0        0.0625   0.0625
Column Sum   0.5000   0.2500   0.1250   0.0625   0.0625   1.0000
The totals to the right and below the columns of the table are called the
marginal probabilities of x and y respectively, and they agree with the probability
distributions computed for x and y earlier.
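The same bookkeeping can be sketched in Python (again illustrative; joint,
marg_x, and marg_y are assumed names): build P(x, y) over the 16 outcomes, then
sum out the other variable to recover each marginal:

    from itertools import product
    from fractions import Fraction

    # Joint distribution P(x, y) over the 16 equally likely outcomes.
    joint = {}
    for o in product("HT", repeat=4):
        x = o.count("H")                          # number of heads
        y = o.index("T") if "T" in o else 4       # heads before first tail
        joint[(x, y)] = joint.get((x, y), 0) + Fraction(1, 16)

    # Marginals: sum the joint probabilities over the other variable.
    marg_x, marg_y = {}, {}
    for (x, y), p in joint.items():
        marg_x[x] = marg_x.get(x, 0) + p          # row sums
        marg_y[y] = marg_y.get(y, 0) + p          # column sums

    print(sorted(marg_x.items()))   # matches the row sums of the table
    print(sorted(marg_y.items()))   # matches the column sums of the table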
Graphically, the joint probability distribution can be illustrated
three-dimensionally, as in the figure below.
[Figure: "Joint Distribution of x and y", a three-dimensional bar chart of
Prob against x and y, each running from 0 through 4.]
It is possible to construct the conditional distribution of y given specific
values of x. For example, suppose that x = 3 (which has probability .25 from the
table). By the basic rules of probability:

P(y | x = 3) = P(y and x = 3)/P(x = 3).

Therefore,

P(y = 0 | x = 3) = .0625 / .25 = .25
P(y = 1 | x = 3) = .0625 / .25 = .25
P(y = 2 | x = 3) = .0625 / .25 = .25
P(y = 3 | x = 3) = .0625 / .25 = .25
P(y = 4 | x = 3) = .0000 / .25 = .00.
The conditional distribution of y given that x = 2 would be:

P(y = 0 | x = 2) = .1875 / .375 = .5000
P(y = 1 | x = 2) = .1250 / .375 = .3333
P(y = 2 | x = 2) = .0625 / .375 = .1667
P(y = 3 | x = 2) = .0000 / .375 = .0000
P(y = 4 | x = 2) = .0000 / .375 = .0000.
The table below shows the conditional distributions for all cases as well as the
conditional expected value for each value of x.
Conditional Distributions of y given x
           Value of y
Given      0      1          2          3      4      Row Sum   E(y | x)
x = 0      1      0          0          0      0      1         0
x = 1      0.75   0.25       0          0      0      1         0.25
x = 2      0.5    0.333333   0.166667   0      0      1         0.666667
x = 3      0.25   0.25       0.25       0.25   0      1         1.5
x = 4      0      0          0          0      1      1         4
The expected value of y given the value of x is computed from the
conditional distribution by the formula:

E(y | x) = Σ_y y P(y | x)
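This formula is easy to check numerically. A short Python sketch
(illustrative only; cond_dist_y_given is a hypothetical helper) reproduces the
conditional distributions and conditional means in the table above:

    from itertools import product

    # Rebuild the joint distribution, then condition on x.
    joint = {}
    for o in product("HT", repeat=4):
        x = o.count("H")
        y = o.index("T") if "T" in o else 4
        joint[(x, y)] = joint.get((x, y), 0) + 1 / 16

    def cond_dist_y_given(x):
        # P(y | x) = P(x and y) / P(x), with P(x) recovered as a row sum.
        px = sum(p for (xi, _), p in joint.items() if xi == x)
        return {y: joint.get((x, y), 0) / px for y in range(5)}

    for x in range(5):
        d = cond_dist_y_given(x)
        e = sum(y * p for y, p in d.items())      # E(y | x)
        print(x, round(e, 6))
    # E(y|x) for x = 0..4: 0.0, 0.25, 0.666667, 1.5, 4.0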
As can be seen in the table above, the expected value of y given x changes as
the value of x changes. This implies that there is some sort of relationship between y
and x.
The relationship between x and the expected value of y given x is shown below.
[Figure: "Expected Relationship of y given x", a plot of E(y|x) against x, for
x = 0 through 4.]
Since the expected value of y changes with the value of x, y and x are
related; that is, they are not independent. When we studied sample data, we
measured the degree of dependency between y and x using the correlation
coefficient r. This concept can be generalized to measure the dependency
between two random variables.
The correlation, ρ, between two random variables x and y is defined as:

ρ = Covariance(x, y) / ( SD(x) SD(y) ).
We already know the formulas for SD(x) and SD(y); the Covariance(x, y) is
defined as:

Covariance(x, y) = E(xy) − E(x) E(y),

where:

E(xy) = Σ_x Σ_y x y P(x, y)
We can compute E(xy) starting from our original joint probability
distribution, repeated below:
x\y          0        1        2        3        4        Row Sum
0            0.0625   0        0        0        0        0.0625
1            0.1875   0.0625   0        0        0        0.2500
2            0.1875   0.1250   0.0625   0        0        0.3750
3            0.0625   0.0625   0.0625   0.0625   0        0.2500
4            0        0        0        0        0.0625   0.0625
Column Sum   0.5000   0.2500   0.1250   0.0625   0.0625   1.0000
Then, for each cell, multiply the probability in the cell by the row value of x
and then by the column value of y. For example, the value of xyP(x,y) when
x = 2 and y = 2 is

2 × 2 × .0625 = .25.
The table below performs the above computation for each cell of the joint
probability distribution:
x\y   0   1        2       3        4
0     0   0        0       0        0
1     0   0.0625   0       0        0
2     0   0.25     0.25    0        0
3     0   0.1875   0.375   0.5625   0
4     0   0        0       0        1
The sum of all the values is

E(xy) = 2.6875.

Therefore we can compute ρ as:

ρ = [E(xy) − E(x) E(y)] / [SD(x) SD(y)]
  = [2.6875 − (2)(.9375)] / [(1)(1.1973278)]
  = .8125 / 1.1973278 = .6785945.
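The whole chain, E(xy), the covariance, and ρ, can also be verified by
enumeration. A minimal Python sketch under the same setup as the earlier ones
(moments is a hypothetical helper):

    from itertools import product

    # Joint distribution of (x, y), as before.
    joint = {}
    for o in product("HT", repeat=4):
        x = o.count("H")
        y = o.index("T") if "T" in o else 4
        joint[(x, y)] = joint.get((x, y), 0) + 1 / 16

    def moments(i):
        # Mean and SD of coordinate i (0 picks x, 1 picks y).
        m1 = sum(k[i] * p for k, p in joint.items())
        m2 = sum(k[i] ** 2 * p for k, p in joint.items())
        return m1, (m2 - m1 ** 2) ** 0.5

    exy = sum(x * y * p for (x, y), p in joint.items())   # 2.6875
    (ex, sdx), (ey, sdy) = moments(0), moments(1)
    rho = (exy - ex * ey) / (sdx * sdy)
    print(exy, rho, rho ** 2)   # 2.6875 0.6785... 0.4605...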
One interprets ρ in exactly the same way as we interpret the sample
correlation coefficient r, that is, we square it:

ρ² = (.6785945)² = .4605.

This implies that if we use x as a predictor of y, we can eliminate
approximately 46.05% of the variability in y.
Notice that this example makes it clear that no "causality" is implied by
the fact that the theoretical correlation coefficient is relatively large. Both
x and y are "caused" by the underlying experiment of flipping a fair coin four
times.