Stat 330 (Spring 2015): Slide set 11
Last update: January 28, 2015

Compound p.m.f.

Motivation: Real problems very seldom concern a single random variable. As soon as more than one variable is involved, it is not sufficient to model them only individually - their joint behavior is important.

2 variables case: Let X, Y be two discrete random variables. The joint probability mass function is defined as

    pX,Y(x, y) := P(X = x ∩ Y = y)

Example: A box contains 5 unmarked PowerPC G4 processors of different speeds:

    no.   speed
    2     400 MHz
    1     450 MHz
    2     500 MHz

Select two processors out of the box (without replacement) and let

    X = speed of the first selected processor
    Y = speed of the second selected processor

Example (Cont'd)

Summary: For a sample space we can draw a table of all the possible combinations of processors. We distinguish between processors of the same speed by using the subscripts 1 and 2.

    Ω:                      2nd processor
    1st processor   400_1  400_2   450   500_1  500_2
    400_1             -      x      x      x      x
    400_2             x      -      x      x      x
    450               x      x      -      x      x
    500_1             x      x      x      -      x
    500_2             x      x      x      x      -

In total we have 5 · 4 = 20 possible combinations. Since we draw at random, we assume that each of the above combinations is equally likely. This yields the following probability mass function:

Example (Cont'd)

Probabilities:

                       2nd processor
    1st proc. (MHz)   400   450   500
    400               0.1   0.1   0.2
    450               0.1   0.0   0.1
    500               0.2   0.1   0.1

Question 1: What is the probability for X = Y? This might be important if we wanted to match the chips to assemble a dual processor machine.

Solution:

    P(X = Y) = pX,Y(400, 400) + pX,Y(450, 450) + pX,Y(500, 500)
             = 0.1 + 0 + 0.1 = 0.2.
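The enumeration of the sample space and the answer to Question 1 can be double-checked with a short Python sketch (not part of the original slides; the variable names and the tagged-chip representation are ours):

```python
from itertools import permutations
from collections import Counter
from fractions import Fraction

# The box: two 400 MHz, one 450 MHz, two 500 MHz chips,
# tagged so same-speed chips stay distinguishable (subscripts 1 and 2).
box = [("400_1", 400), ("400_2", 400), ("450", 450),
       ("500_1", 500), ("500_2", 500)]

# All ordered draws of two distinct chips (without replacement): 5 * 4 = 20.
draws = list(permutations(box, 2))
assert len(draws) == 20

# Each of the 20 combinations is equally likely, so the joint p.m.f.
# just counts combinations per (speed of first, speed of second) pair.
counts = Counter((first[1], second[1]) for first, second in draws)
pmf = {xy: Fraction(n, 20) for xy, n in counts.items()}

print(pmf[(400, 500)])                                 # 1/5, i.e. 0.2
print(sum(p for (x, y), p in pmf.items() if x == y))   # P(X = Y) = 1/5
```

Note that (450, 450) never appears among the 20 draws, which is exactly why pX,Y(450, 450) = 0 in the table above.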
Example (Cont'd) and Marginal p.m.f.

Question 2: What is the probability for X > Y? This is asking: What is the probability that the first processor has higher speed than the second?

Solution:

    P(X > Y) = pX,Y(450, 400) + pX,Y(500, 400) + pX,Y(500, 450)
             = 0.1 + 0.2 + 0.1 = 0.4.

Marginal p.m.f.: Go from the joint probability mass function to the individual pmfs:

    pX(x) = Σ_y pX,Y(x, y),     pY(y) = Σ_x pX,Y(x, y)

Example: For the previous example the marginal probability mass functions are

    x       400   450   500   (MHz)
    pX(x)   0.4   0.2   0.4

    y       400   450   500   (MHz)
    pY(y)   0.4   0.2   0.4

Statistics for multivariate case

Recall: Expected value of X is E[X] = Σ_x x · pX(x).

Expected value for a function of several variables:

    E[h(X, Y)] := Σ_{x,y} h(x, y) · pX,Y(x, y)

Continued Example: Let X, Y be as before.

Question 3: What is E[|X − Y|] (the average speed difference)?

Solution: Here, we have the situation E[|X − Y|] = E[h(X, Y)], with h(X, Y) = |X − Y|.

Example (Cont'd)

Previous example again...

    E[|X − Y|] = Σ_{x,y} |x − y| · pX,Y(x, y)
               = |400 − 400| · 0.1 + |400 − 450| · 0.1 + |400 − 500| · 0.2 +
                 |450 − 400| · 0.1 + |450 − 450| · 0.0 + |450 − 500| · 0.1 +
                 |500 − 400| · 0.2 + |500 − 450| · 0.1 + |500 − 500| · 0.1
               = 0 + 5 + 20 + 5 + 0 + 5 + 20 + 5 + 0 = 60.

Motivation: For two variables we can measure how "similar" their values are:

Covariance between X and Y:

    Cov(X, Y) = E[(X − E[X])(Y − E[Y])]

Remark: This definition looks very much like the definition for the variance of a single random variable. In fact, if we set Y := X in the above definition, then Cov(X, X) = Var(X).

Using the above definition of the expected value of h(X, Y) gives us:

    Cov(X, Y) = Σ_{x,y} (x − E[X])(y − E[Y]) · pX,Y(x, y)

Correlation: The correlation between two variables X and Y is

    ρ := Cov(X, Y) / √(Var(X) · Var(Y))
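The marginal pmfs, P(X > Y), and E[|X − Y|] above can all be reproduced from the joint p.m.f. table with a few sums (a sketch, not part of the slides; the dict-of-pairs representation is our own):

```python
# Joint p.m.f. from the slides, stored as {(x, y): probability}.
pmf = {
    (400, 400): 0.1, (400, 450): 0.1, (400, 500): 0.2,
    (450, 400): 0.1, (450, 450): 0.0, (450, 500): 0.1,
    (500, 400): 0.2, (500, 450): 0.1, (500, 500): 0.1,
}
speeds = (400, 450, 500)

# Marginals: p_X(x) = sum over y, p_Y(y) = sum over x.
p_x = {v: sum(p for (x, y), p in pmf.items() if x == v) for v in speeds}
p_y = {v: sum(p for (x, y), p in pmf.items() if y == v) for v in speeds}

# P(X > Y): first processor faster than the second.
p_gt = sum(p for (x, y), p in pmf.items() if x > y)

# E[h(X, Y)] with h(x, y) = |x - y|: the average speed difference.
e_abs_diff = sum(abs(x - y) * p for (x, y), p in pmf.items())

print({v: round(p, 1) for v, p in p_x.items()})  # {400: 0.4, 450: 0.2, 500: 0.4}
print(round(p_gt, 1))                            # 0.4
print(round(e_abs_diff, 1))                      # 60.0
```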
Example (Cont'd)

In our previous Example: What is ρ(X, Y) in our box with five chips?

Solution: Use the marginal pmfs to compute:
• E[X] = E[Y] = 450
• Var[X] = Var[Y] = 2000

The covariance between X and Y is:

    Cov(X, Y) = Σ_{x,y} (x − E[X])(y − E[Y]) · pX,Y(x, y)
              = (400 − 450)(400 − 450) · 0.1 + (450 − 450)(400 − 450) · 0.1 + · · · + (500 − 450)(500 − 450) · 0.1
              = 250 + 0 − 500 + 0 + 0 + 0 − 500 + 0 + 250 = −500.

Example (Cont'd)

ρ therefore is

    ρ = Cov(X, Y) / √(Var(X) · Var(Y)) = −500 / 2000 = −0.25,

ρ indicates a weak negative (linear) association.

Example (Cont'd)

Recall: What is independence of events?

Definition: Two random variables X and Y are independent, if their joint probability pX,Y is equal to the product of the marginal densities pX · pY.

Note: Random variables are independent, if all events of the form {X = x} and {Y = y} are independent. We need pX,Y(x, y) = pX(x) · pY(y) for all possible combinations of x and y.

Question: Are X and Y independent?

Check:

    pX,Y(450, 450) = 0 ≠ 0.2 · 0.2 = pX(450) · pY(450).

Answer: Therefore, X and Y are not independent!

To prove that two variables are independent we need to check pX,Y(x, y) = pX(x) · pY(y) for all possible combinations of x and y. That means we need to find only one contradiction to prove that two variables are not independent!
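The covariance, correlation, and independence check above can be verified numerically from the joint p.m.f. (a sketch with our own variable names, not part of the slides):

```python
# Joint p.m.f. of the two-processor example, as {(x, y): probability}.
pmf = {
    (400, 400): 0.1, (400, 450): 0.1, (400, 500): 0.2,
    (450, 400): 0.1, (450, 450): 0.0, (450, 500): 0.1,
    (500, 400): 0.2, (500, 450): 0.1, (500, 500): 0.1,
}
speeds = (400, 450, 500)
p_x = {v: sum(p for (x, y), p in pmf.items() if x == v) for v in speeds}
p_y = {v: sum(p for (x, y), p in pmf.items() if y == v) for v in speeds}

# Moments from the marginals: E[X] = 450, Var[X] = 2000 (same for Y).
ex = sum(v * p for v, p in p_x.items())
var_x = sum((v - ex) ** 2 * p for v, p in p_x.items())

# Covariance as the sum (x - E[X])(y - E[Y]) pX,Y(x, y) over all pairs.
cov = sum((x - ex) * (y - ex) * p for (x, y), p in pmf.items())

# Correlation rho = Cov / sqrt(Var(X) * Var(Y)).
rho = cov / (var_x * var_x) ** 0.5

# Independence fails at a single point: p(450, 450) = 0 != 0.2 * 0.2.
independent = all(abs(pmf[(x, y)] - p_x[x] * p_y[y]) < 1e-12
                  for (x, y) in pmf)

print(round(cov, 6), round(rho, 6), independent)   # -500.0 -0.25 False
```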
Statistics for multivariate (cont'd)

Facts and inference about correlation:
• ρ is between -1 and 1.
• ρ = 1 or -1, if Y is a linear function of X:
      ρ = 1  → Y = aX + b with a > 0,
      ρ = -1 → Y = aX + b with a < 0.
• ρ is a measure of linear association between X and Y: ρ near ±1 indicates a strong linear relationship, ρ near 0 indicates lack of linear association.
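The second fact can be illustrated numerically: for any exact linear relation Y = aX + b, the correlation comes out as exactly the sign of a. A small sketch (the function and marginal pmf below are ours, reusing the example's marginal of X):

```python
# Marginal pmf of X from the processor example.
p_x = {400: 0.4, 450: 0.2, 500: 0.4}

def corr_with_linear(a, b):
    """Correlation of X with Y = a*X + b, computed from the pmf of X."""
    ex = sum(v * p for v, p in p_x.items())
    var_x = sum((v - ex) ** 2 * p for v, p in p_x.items())
    ey = a * ex + b                      # linearity of expectation
    cov = sum((v - ex) * (a * v + b - ey) * p for v, p in p_x.items())
    var_y = a * a * var_x                # Var[aX + b] = a^2 Var[X]
    return cov / (var_x * var_y) ** 0.5

print(round(corr_with_linear(2, 10), 6))    # 1.0  (a > 0)
print(round(corr_with_linear(-0.5, 3), 6))  # -1.0 (a < 0)
```

Algebraically, Cov(X, aX + b) = a · Var(X), so ρ = a · Var(X) / (|a| · Var(X)) = sign(a), which is what the script confirms.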
More theorems on statistics

Theorem: If two random variables X and Y are independent, then

    E[X · Y] = E[X] · E[Y]
    Var[X + Y] = Var[X] + Var[Y]

Theorem: For two random variables X and Y and three real numbers a, b, c it holds:

    Var[aX + bY + c] = a^2 · Var[X] + b^2 · Var[Y] + 2ab · Cov(X, Y)
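Both theorems can be spot-checked numerically. Since X and Y in the processor example are not independent, the sketch below builds an artificially independent pair with the same marginals, p(x, y) = pX(x) · pY(y) (the construction and names are ours, not from the slides):

```python
# Artificially independent pair with the example's marginal 0.4 / 0.2 / 0.4.
p_x = {400: 0.4, 450: 0.2, 500: 0.4}
pmf = {(x, y): px * py for x, px in p_x.items() for y, py in p_x.items()}

def e(h):
    """E[h(X, Y)] over the joint pmf."""
    return sum(h(x, y) * p for (x, y), p in pmf.items())

ex = e(lambda x, y: x)
ey = e(lambda x, y: y)

# Theorem 1 (independence): E[X * Y] = E[X] * E[Y].
assert abs(e(lambda x, y: x * y) - ex * ey) < 1e-6

# Theorem 2: Var[aX + bY + c] = a^2 Var[X] + b^2 Var[Y] + 2ab Cov(X, Y).
a, b, c = 3.0, -2.0, 7.0
var = lambda h, mean: e(lambda x, y: (h(x, y) - mean) ** 2)
cov = e(lambda x, y: (x - ex) * (y - ey))   # ~0 here, by independence
lhs = var(lambda x, y: a * x + b * y + c, a * ex + b * ey + c)
rhs = (a * a * var(lambda x, y: x, ex)
       + b * b * var(lambda x, y: y, ey) + 2 * a * b * cov)
print(round(lhs, 3), round(rhs, 3))   # both sides equal (9 + 4) * 2000 = 26000
```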