Random Variables

http://dept.econ.yorku.ca/~jbsmith/ec2500_1998/lecture14/lecture15.html
The sample spaces of random phenomena need not consist solely of numbers. For
example, they could be H, T. Often, it is convenient to associate a number with the
outcomes in a sample space.
e.g. flipping a coin H = 1, T = 0
e.g. flipping a coin 4 times: For each outcome, count the number of H (=0, 1, 2, 3, 4)
Random Variable:
Def#1: A random variable is a variable whose value is the numerical
outcome of a random phenomenon
Def#2: A random variable is a numerical function defined on the outcome
of a random phenomenon: that is, for each outcome there is assigned a
unique numerical value.
e.g. # of heads in 4 coin tosses.
Discrete Random Variables
A discrete random variable X takes a finite number of values, call them x1, x2, …, xk. The
probability model for X is given by assigning probabilities Pi to these outcomes:
P(X = xi) = Pi
The probabilities must satisfy:
1. 0 ≤ Pi ≤ 1 for each i
2. P1 + P2 + … + Pk = 1
The probability P(A) = P (X takes values in A) is found by summing the Pi for the
outcomes xi making up A.
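The two conditions and the rule for P(A) can be checked mechanically. The following is a minimal Python sketch, not part of the original notes; the function names `is_valid_model` and `prob_of_event` are made up for illustration, and the example model (number of heads in 2 tosses of a fair coin) is just one convenient choice.

```python
from fractions import Fraction


def is_valid_model(probs):
    """Check conditions 1 and 2: every Pi lies in [0, 1] and the Pi sum to 1."""
    values = list(probs.values())
    return all(0 <= p <= 1 for p in values) and sum(values) == 1


def prob_of_event(probs, event):
    """P(A): sum the Pi over the outcomes xi that make up the event A."""
    return sum(p for x, p in probs.items() if x in event)


# Illustrative model: number of heads in 2 tosses of a fair coin.
# Exact fractions are used so that the "sum to 1" check is not upset by rounding.
model = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

print(is_valid_model(model))          # True
print(prob_of_event(model, {1, 2}))   # 3/4
```

Here the event A = {1, 2} is "at least one head", and its probability is simply P(1) + P(2).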
Note 1: random variables are denoted by capital letters such as X, Y, Z; outcomes by
'small' letters x, y, z.
Note 2: The assignment of probabilities to the value of a random variable X is called the
probability distribution of X
This can sometimes be expressed as a table:
Outcome x1 x2 x3
Probability P1 P2 P3
Graphical Display of Probability Distribution
Deriving the probability distribution of some r.v.'s
X rv#1 … # of heads in 1 toss of a coin
Y rv#2 … # of heads in 2 tosses of a coin
Z rv#3 … # of heads in 3 tosses of a coin
X: outcome 0 1
Probability .5 .5
Y: outcome 0 1 2
Probability .25 .5 .25
Z: outcome 0 1 2 3
Probability .125 .375 .375 .125
XX (# of heads in 4 tosses of a coin): outcome 0 1 2 3 4
Probability .0625 .25 .375 .25 .0625
Event A = more than 1 head in 4 tosses
P(A) = P(2) + P(3) + P(4) = .375 + .25 + .0625 = .6875
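One way to derive these tables is to list every equally likely sequence of H and T and count the heads in each. The Python sketch below does this by brute force; the function name `heads_distribution` is an illustrative choice, and a fair coin is assumed throughout.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses):
    """Distribution of the number of heads in n_tosses tosses of a fair coin."""
    # All 2**n_tosses sequences such as ('H', 'T', 'H'); each is equally likely.
    outcomes = list(product("HT", repeat=n_tosses))
    counts = Counter(seq.count("H") for seq in outcomes)
    return {h: Fraction(c, len(outcomes)) for h, c in sorted(counts.items())}


for n in (1, 2, 3, 4):
    dist = heads_distribution(n)
    print(n, {h: float(p) for h, p in dist.items()})

# Event A: more than 1 head in 4 tosses.
four_tosses = heads_distribution(4)
p_a = sum(p for h, p in four_tosses.items() if h > 1)
print(float(p_a))   # 0.6875
```

The printed dictionaries should match the rows of the tables, and the last line matches P(A) = .6875.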
Continuous Random Variables
Think of the problem of choosing a real number at random in the interval [0, 1].
What does this mean?
Think of the interval as a string and make it into a circle.
Now, put this on a piece of cardboard and make a "spinner"
like a compass.
Spin the spinner & where it ends up is your "draw" from [0, 1].
What does it mean to draw 'at random' from [0, 1]? Presumably, each interval of equal
length in [0, 1] has the same probability of having the spinner stop in it.
Associating Area with Probability
Area = .25 each
This is a special case where every equal-length interval has the same probability. The
area over the entire interval = 1.
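The spinner can be imitated on a computer. The sketch below is an illustration rather than part of the original notes: it assumes Python's `random.Random.random()` as a stand-in for the spinner (it returns a number in [0, 1), so the endpoint 1 itself never comes up), and the function name `interval_frequencies` and the fixed seed are arbitrary choices. It splits the interval into 4 equal pieces and records the fraction of draws landing in each.

```python
import random


def interval_frequencies(n_draws, n_intervals, seed=0):
    """Relative frequency of draws landing in each of n_intervals equal pieces."""
    rng = random.Random(seed)             # fixed seed so the run is repeatable
    counts = [0] * n_intervals
    for _ in range(n_draws):
        u = rng.random()                  # one "spin": a number in [0, 1)
        counts[int(u * n_intervals)] += 1 # index of the piece containing u
    return [c / n_draws for c in counts]


freqs = interval_frequencies(100_000, 4)
print(freqs)   # each entry should be close to .25
```

With many draws, each of the four relative frequencies should settle near .25, the area above each piece.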
Tough Question: What is the probability of a single point? Even though we observe points,
it has to be 0: a point is an interval of length 0, so the area above it is 0.
Continuous Random Variables
A continuous random variable X takes all values in an interval of real numbers. A
probability model for X is given by assigning to a set of outcomes, A, the probability
P(A) equal to the area above A and under a curve. The curve is the graph of a function
p(x) that satisfies:
1. p(x) ≥ 0 for all outcomes x
2. the total area under the graph of p(x) =1.
Note: Calculating area is often hard (it requires integration), so we will often
make use of tables that give us areas under curves.
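When no table is at hand, an area under a curve can also be approximated numerically. The sketch below uses the midpoint rule, a standard approximation that is not the method of these notes. The function `area_under` is an illustrative name, and the density p(x) = 2x on [0, 1] is a hypothetical example chosen only because its areas are easy to check by hand.

```python
def area_under(p, a, b, n=10_000):
    """Approximate the area under p between a and b with the midpoint rule."""
    width = (b - a) / n
    return sum(p(a + (i + 0.5) * width) for i in range(n)) * width


def uniform_density(x):
    """Density for the spinner: height 1 on [0, 1], 0 elsewhere."""
    return 1.0 if 0 <= x <= 1 else 0.0


def triangle_density(x):
    """Hypothetical density p(x) = 2x on [0, 1], used only as an illustration."""
    return 2 * x if 0 <= x <= 1 else 0.0


# Condition 2: the total area under each curve should be (close to) 1.
print(area_under(uniform_density, 0, 1))
print(area_under(triangle_density, 0, 1))

# P(A) for A = [.25, .5] under the uniform density: close to .25.
print(area_under(uniform_density, 0.25, 0.5))

# P(A) for A = [.5, 1] under the hypothetical density: close to .75.
print(area_under(triangle_density, 0.5, 1))
```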
A link with the past.
In Chapter 1 we considered summarizing (approximately) relative frequency histograms
with density curves. We talked about the area under density curves as corresponding to
the relative frequency of an event. This is now reappearing, except we refer to relative
frequency as probability and the density curve becomes the probability density curve
(for continuous random variables).
We can think of a normal random variable as one with mean μ, variance σ² and
probability density given by the N(μ, σ) curve.
Before we go on, though, we have to clarify the concept of means and variance for
random variables.
Something to keep in mind
Remember that we started in Chapter 1 with the notion of a list of numbers. We then
wanted to summarize/describe the list of numbers so we introduced notions of
distributions, centre (mean, median) and spread (variance, standard deviation, IQR).
Essentially, random variables and lists are linked in the following way: suppose that some
lists of numbers represent the numerical outcomes of random phenomena/experiments.
Then the distribution of such a list should correspond to the distribution of possible
outcomes of the random variable.
Of course, not all lists of numbers correspond to outcomes of a random variable… but, in
this course, we are interested in the ones that do.
Mean of a Discrete Random Variable
Definition:
If X is a discrete random variable taking on a finite number of possible outcomes or
values x1, x2, …, xk with probabilities p1, p2, … pk, then the mean (sometimes called the
expected value) of X is found by multiplying each outcome by its probability and adding
over all of the possible outcomes:
Mean of X = μx = x1p1 + x2p2 + … + xkpk
= Σ xipi
Note: μx refers to the mean of the random variable X. μ is the Greek letter 'mu'.
Note: for a random variable we use μ instead of x̄
(… which was the average of a list).
BUT: where does this fancy formula come from and what does it mean?
Actually, we can make sense of the notion μx by just building on something we already
know how to do… that is, computing averages.
Recall, if we have a list of numbers {x1, x2, …, xn} we calculate the average by
the formula:
x̄ = (x1 + x2 + … + xn)/n
Now, suppose our list comes from recording the results of an experiment. Suppose, to
choose a special case, that the list comes from repeatedly calculating the number of heads
in 2 tosses of a fair coin. Thus, our list might look something like:
{0 , 0 , 2 , 1 , 1 , 1 , 0 , 2 , 2 , 1 , 2 , 1 , 1 ,….}
In a sense, our list is just a record of the outcome of a discrete random variable that can
take on one of 3 possible values 0, 1, 2.
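Let's get the average for the first 6 elements of the list:
x̄6 = (0 + 0 + 2 + 1 + 1 + 1)/6 = 5/6
Note - I put a subscript on x̄ to remind us we had used 6 outcomes.
Let's organize this calculation in just a little bit different way, but one which will help us a
lot.
x̄6 = 0(2/6) + 1(3/6) + 2(1/6) = 5/6
(two of the six outcomes are 0, three are 1 and one is 2)
But this will be true if we take 10 terms from the list or k terms:
x̄k = 0(# of 0's / k) + 1(# of 1's / k) + 2(# of 2's / k)
This was kind of sneaky, but true.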
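The regrouping can be verified on the list itself. In the Python sketch below, `results` holds only the 13 elements actually printed in the text (the rest of the list is elided there), and the function names `plain_average` and `grouped_average` are illustrative. The first function adds up the terms and divides by their number; the second multiplies each possible value by its relative frequency.

```python
from collections import Counter
from fractions import Fraction

# The 13 elements of the list that are shown in the text.
results = [0, 0, 2, 1, 1, 1, 0, 2, 2, 1, 2, 1, 1]


def plain_average(xs):
    """Usual average: add up the terms and divide by how many there are."""
    return Fraction(sum(xs), len(xs))


def grouped_average(xs):
    """Same average, organized as sum of (value x relative frequency of value)."""
    counts = Counter(xs)
    return sum(value * Fraction(count, len(xs)) for value, count in counts.items())


for k in (6, 10, 13):
    first_k = results[:k]
    print(k, plain_average(first_k), grouped_average(first_k))
```

For every k the two columns agree, which is all the "sneaky" step claims. For k = 6 both give 5/6.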
Now, let k get really big and remember our frequency theory of probability where we
agreed that we would define probability of an event as the relative frequency when there
are an infinite number of repetitions. So as k gets larger we write, for the average of
the first k terms:
For large k: x̄k ≈ 0 P(X = 0) + 1 P(X = 1) + 2 P(X = 2) = x1p1 + x2p2 + x3p3
This is just the formula we started from. 'Mean' is much like 'limiting case of averages.'
The story I have told you is an application of what is called the 'Law of Large Numbers'
in combination with our frequency theory of probability.
To Recap:
If a discrete random variable X has possible outcomes x1, x2, … xk (here we had k = 3, x1
= 0, x2 = 1 and x3 = 2) with associated probabilities P1, … Pk, then the mean of the
random variable is:
μx = Σ Pixi
The mean is the average calculated with probabilities, which we take to be "long run"
relative frequencies.
Example: Vermont Lottery
Choose a 3 digit number (there are 1000 possible values, WHY?). If your number
matches the one drawn at random, you get $500. Otherwise, you get 0. You pay $1 to
play the game.
What are expected winnings? x1 = 0, x2 = 500
P1 = .999, P2 = .001
Expected winnings = .999(0) + (.001)(500) = 0.50
On average you win 50¢, but you pay $1 to play, so your net winnings, on average, = -50¢.
The state takes home 50¢, on average, for every ticket purchased.
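The lottery arithmetic can be reproduced in a few lines of Python. This is only a sketch: the helper name `mean` and the variable names are illustrative, and exact fractions are used so that the result comes out as exactly 1/2 rather than a rounded decimal.

```python
from fractions import Fraction


def mean(dist):
    """Mean of a discrete random variable: sum of outcome x probability."""
    return sum(x * p for x, p in dist.items())


# Winnings in dollars: $0 with probability .999, $500 with probability .001.
winnings = {0: Fraction(999, 1000), 500: Fraction(1, 1000)}

expected_winnings = mean(winnings)
ticket_price = 1
net_winnings = expected_winnings - ticket_price

print(expected_winnings)   # 1/2, i.e. 50 cents
print(net_winnings)        # -1/2, i.e. -50 cents
```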
Example 2: X = # of heads in 4 tosses of a fair coin
Outcomes (x) 0 1 2 3 4
Probs. .0625 .25 .375 .25 .0625
μx = 0(.0625) + 1(.25) + 2(.375) + 3(.25) + 4(.0625)
= 2
How the law of large numbers applies. Think of a list of 1000 outcomes of tossing a coin
4 times and counting the number of heads. Calculate the sample average x̄k for k = 1, 2,
3, …, 1000. Now, graph the results and you will get something like:
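The original graph is not reproduced here, but the experiment is easy to simulate. The Python sketch below rests on a few assumptions: a pseudo-random generator stands in for the coin, the seed is arbitrary, and the function name `running_averages` is illustrative. It prints the running average at a few values of k instead of drawing the graph.

```python
import random


def running_averages(n_reps, seed=0):
    """Sample averages of '# of heads in 4 tosses' after 1, 2, ..., n_reps repetitions."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    total = 0
    averages = []
    for k in range(1, n_reps + 1):
        heads = sum(rng.random() < 0.5 for _ in range(4))  # 4 fair tosses
        total += heads
        averages.append(total / k)   # average of the first k outcomes
    return averages


avgs = running_averages(1000)
for k in (1, 10, 100, 1000):
    print(k, avgs[k - 1])
```

The early averages can wander quite a bit, but by k = 1000 the average should sit close to the mean of 2 computed in Example 2.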