MATH 2560 C F03
Elementary Statistics I
LECTURE 18: Means and Variances of
Random Variables.
1 Outline
⇒ the mean of a random variable;
⇒ law of large numbers;
⇒ rules for means;
⇒ the variance of a random variable;
⇒ rules for variance;
2 The Mean of a Random Variable
⇒ Probability is the mathematical language that describes the long-run
regular behavior of random phenomena. The probability distribution of a
random variable is an idealized relative frequency distribution.
⇒ The mean x̄ of a set of observations is their ordinary average.
⇒ The mean of a random variable X is also an average of the possible
values of X, but a weighted one: not all outcomes need be equally likely.
Example 4.19. Simple lottery wager: the state chooses a three-digit
winning number at random and pays you $500 if your number is chosen.
Let X be the amount your ticket pays you; then the probability
distribution is:
Payoff X:     $0      $500
Probability:  0.999   0.001
There are 1000 three-digit numbers, so you have probability 1/1000 of winning.
What is your average payoff from many tickets? The ordinary average of the
two possible payoffs is (500 + 0)/2 = $250, but that makes no sense as the
average payoff because $500 is much less likely than $0. The long-run average
payoff is 500 × (1/1000) + 0 × (999/1000) = $0.50. This is the mean of the
random variable X. (Tickets cost $1, so in the long run the state keeps half
the money you wager.)
⇒ The common symbol for the mean of a probability distribution
is µ, the Greek letter mu.
⇒ You will often find the mean of a random variable X called the expected value of X.
Below is the general definition for the mean of a discrete random variable.
Mean of a Discrete Random Variable
Suppose that X is a discrete random variable whose distribution is:
Value of X:   x1, x2, x3, ..., xk
Probability:  p1, p2, p3, ..., pk
To find the mean of X, multiply each possible value by its probability,
then add all the products:
\[ \mu_X = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k = \sum_{i=1}^{k} x_i p_i . \]
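The definition translates directly into a few lines of code. The sketch below is only an illustration: the function name `discrete_mean` is made up for this example, and the data are the lottery payoffs from Example 4.19.

```python
def discrete_mean(values, probs):
    """Mean of a discrete random variable: sum of value * probability."""
    # Guard against a table that is not a valid probability distribution.
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(values, probs))

# Lottery payoff from Example 4.19: $0 with prob. 0.999, $500 with prob. 0.001
lottery_mean = discrete_mean([0, 500], [0.999, 0.001])
print(round(lottery_mean, 2))  # 0.5
```

Note that the plain average of the two possible values, (0 + 500)/2, never enters the computation; each value counts only in proportion to its probability.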
Example 4.20. First nine digits "at random" and Benford’s Law.
First nine digits "at random".
First digit X:  1    2    3    4    ...  9
Probability:    1/9  1/9  1/9  1/9  ...  1/9
The mean of this distribution is:
µX = 1(1/9) + 2(1/9) + ... + 9(1/9) = 45(1/9) = 5.
Benford’s Law.
First digit V:  1      2      3      4      ...  9
Probability:    0.301  0.176  0.125  0.097  ...  0.046
The mean of V is:
µV = 1(0.301) + 2(0.176) + 3(0.125) + 4(0.097) + 5(0.079) + 6(0.067)
+7(0.058) + 8(0.051) + 9(0.046) = 3.441.
The means reflect the greater probability of smaller first digits under
Benford’s Law.
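As a quick check of the two means in Example 4.20, here is a minimal sketch. The Benford probabilities are the nine values that appear in the calculation of µV above; the variable names are arbitrary.

```python
from fractions import Fraction

digits = range(1, 10)

# Equally likely first digits: probability 1/9 each (exact arithmetic).
uniform = [Fraction(1, 9)] * 9
# Benford's Law probabilities, as used in the computation of the mean of V.
benford = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]

mean_uniform = sum(d * p for d, p in zip(digits, uniform))
mean_benford = sum(d * p for d, p in zip(digits, benford))

print(mean_uniform)            # 5
print(round(mean_benford, 3))  # 3.441
```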
2.1 Statistical Estimation and Law of Large Numbers
⇒ To estimate µ, we choose an SRS from the population and use the sample mean
x̄ to estimate the unknown population mean µ.
⇒ µ is a parameter and x̄ is a statistic.
⇒ If we keep on adding observations to our random sample, the statistic
x̄ is guaranteed to get as close as we wish to the parameter µ and then stay
that close.
⇒ This remarkable fact is called the law of large numbers.
Law of Large Numbers
Draw independent observations at random from any population with finite mean µ.
Decide how accurately you would like to estimate µ.
As the number of observations drawn increases, the mean x̄ of the
observed values eventually approaches the mean µ of the population
as closely as you specified and then stays that close.
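⇒ The behavior of x̄ parallels the idea of probability: in the long run, the
proportion of outcomes taking a value settles down to the probability of that
value, and the average outcome settles down to the mean µ.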
Figure 4.14 shows the behavior of the mean height x̄ of n women chosen at
random from a population whose heights follow the N (64.5, 2.5) distribution.
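The law of large numbers can be watched in a small simulation. The sketch below assumes that N(64.5, 2.5) denotes a normal distribution with mean 64.5 and standard deviation 2.5; the function name `running_means` and the seed are arbitrary choices for this illustration, and the exact numbers printed depend on the seed.

```python
import random

def running_means(n, mu=64.5, sigma=2.5, seed=0):
    """Return the sample means x-bar after 1, 2, ..., n simulated heights."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    means = []
    for i in range(1, n + 1):
        total += rng.gauss(mu, sigma)  # one more observation
        means.append(total / i)        # mean of the first i observations
    return means

means = running_means(10_000)
for n in (1, 10, 100, 1_000, 10_000):
    print(n, round(means[n - 1], 3))
```

Early values of x̄ can wander noticeably, while the later ones should sit close to 64.5.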
3 Rules for Means
Rule 1. If X is a random variable and a and b are fixed numbers, then
\[ \mu_{a+bX} = a + b\mu_X . \]
Rule 2. If X and Y are random variables, then
\[ \mu_{X+Y} = \mu_X + \mu_Y . \]
Example 4.23. The military and the civilian market. Let X and
Y be the number of military and number of civilian units sold, respectively.
Gain makes a profit of $2000 on each military unit sold and $3500 on each
civilian unit.
The military market.
Units sold:   1000  3000  5000  10,000
Probability:  0.1   0.3   0.4   0.2
\[ \mu_X = 1000(0.1) + 3000(0.3) + 5000(0.4) + 10{,}000(0.2) = 100 + 900 + 2000 + 2000 = 5000 \]
units.
Using Rule 1, the mean profit from military sales is:
\[ \mu_{2000X} = 2000\mu_X = 2000(5000) = \$10{,}000{,}000 . \]
The civilian market.
Units sold:   300  500  750
Probability:  0.4  0.5  0.1
\[ \mu_Y = 300(0.4) + 500(0.5) + 750(0.1) = 120 + 250 + 75 = 445 \]
units. Using Rule 1, the mean profit from civilian sales is:
\[ \mu_{3500Y} = 3500\mu_Y = 3500(445) = \$1{,}557{,}500 . \]
The total profit (military and civilian):
Z = 2000X + 3500Y.
Using Rule 2 we obtain:
\[ \mu_Z = \mu_{2000X} + \mu_{3500Y} = 10{,}000{,}000 + 1{,}557{,}500 = 11{,}557{,}500 \]
dollars.
Combining Rules 1 and 2 we can obtain the result more quickly:
\[ \mu_Z = \mu_{2000X+3500Y} = 2000\mu_X + 3500\mu_Y = 2000(5000) + 3500(445) = 11{,}557{,}500 \]
dollars.
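The result of Example 4.23 can be checked numerically. The sketch below computes µZ once with Rules 1 and 2 and once by summing over a joint distribution. The joint distribution is built by assuming that X and Y are independent; that is an assumption made only so there is something to enumerate, since Rule 2 itself holds without it.

```python
from itertools import product

# Distributions of units sold (Example 4.23), as {value: probability}.
military = {1000: 0.1, 3000: 0.3, 5000: 0.4, 10000: 0.2}
civilian = {300: 0.4, 500: 0.5, 750: 0.1}

def mean(dist):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

# Rules 1 and 2: mean of 2000X + 3500Y from the two separate means.
mu_z_rules = 2000 * mean(military) + 3500 * mean(civilian)

# Direct computation over all (x, y) pairs.
# ASSUMPTION: independence, so P(X = x, Y = y) = P(X = x) * P(Y = y).
mu_z_direct = sum(
    (2000 * x + 3500 * y) * px * py
    for (x, px), (y, py) in product(military.items(), civilian.items())
)

print(round(mu_z_rules), round(mu_z_direct))  # 11557500 11557500
```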
4 The Variance of a Random Variable
⇒ We write the variance of a random variable X as \(\sigma_X^2\).
⇒ The variance is an average of the squared deviation \((X - \mu_X)^2\) of the
variable X from its mean µX. This is similar to the definition of the sample
variance \(s^2\) given in Chapter 1.
Below is the definition of the variance of a discrete random variable.
Variance of a Discrete Random Variable
Suppose that X is a discrete random variable whose distribution is:
Value of X:   x1, x2, ..., xk
Probability:  p1, p2, ..., pk
and that µX is the mean of X.
The variance of X is:
\[ \sigma_X^2 = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + \cdots + (x_k - \mu_X)^2 p_k = \sum_{i=1}^{k} (x_i - \mu_X)^2 p_i . \]
The standard deviation σX of X is the square root of the variance:
\[ \sigma_X = \sqrt{\sigma_X^2} . \]
Example 4.24. The military market (see Example 4.23). Let us find
the mean and variance of X by arranging the calculation in the form of a
table. Both \(\mu_X\) and \(\sigma_X^2\) are sums of columns in this table.
The military market.
xi       pi    xi pi        (xi − µX)^2 pi
1,000    0.1   100          (1,000 − 5,000)^2 (0.1) = 1,600,000
3,000    0.3   900          (3,000 − 5,000)^2 (0.3) = 1,200,000
5,000    0.4   2,000        (5,000 − 5,000)^2 (0.4) = 0
10,000   0.2   2,000        (10,000 − 5,000)^2 (0.2) = 5,000,000
Total          µX = 5,000   σX^2 = 7,800,000
The standard deviation is:
\[ \sigma_X = \sqrt{7{,}800{,}000} = 2792.8 . \]
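The table calculation in Example 4.24 can be reproduced in code. This is a minimal sketch; the helper name `mean_and_variance` is invented for the illustration.

```python
import math

def mean_and_variance(values, probs):
    """Return (mean, variance) of a discrete random variable."""
    mu = sum(x * p for x, p in zip(values, probs))
    # Variance: probability-weighted average of squared deviations from mu.
    var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, var

# Military market: units sold and their probabilities.
mu_x, var_x = mean_and_variance([1000, 3000, 5000, 10000], [0.1, 0.3, 0.4, 0.2])
sd_x = math.sqrt(var_x)

print(round(mu_x), round(var_x), round(sd_x, 1))  # 5000 7800000 2792.8
```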
5 Rules for Variance
Rule 1. If X is a random variable and a and b are fixed numbers, then
\[ \sigma_{a+bX}^2 = b^2 \sigma_X^2 . \]
Rule 2. If X and Y are independent random variables, then
\[ \sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 , \]
\[ \sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2 . \]
This is the addition rule for variances of independent random variables.
Rule 3. If X and Y have correlation ρ, then
\[ \sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y , \]
\[ \sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2 - 2\rho\sigma_X\sigma_Y . \]
This is the general addition rule for variances of random variables.
⇒ When random variables are not independent, the variance of their sum
depends on the correlation between them as well as on their individual
variances.
⇒ The correlation between two independent random variables is zero.
⇒ Rule 2 for variance implies that standard deviations of independent
random variables do not add.
Example 4.25. Simple lottery wager (see Example 4.19).
xi    pi      xi pi       (xi − µX)^2 pi
0     0.999   0           (0 − 0.5)^2 (0.999) = 0.24975
500   0.001   0.5         (500 − 0.5)^2 (0.001) = 249.50025
Total         µX = 0.5    σX^2 = 249.75
The standard deviation is:
\[ \sigma_X = \sqrt{249.75} = 15.80 \]
dollars.
Let W = X − 1 be your net winnings after paying $1 for the ticket. By Rule 1
for means,
\[ \mu_W = \mu_X - 1 = 0.5 - 1 = -0.5 \]
dollars, so on average you lose 50 cents per ticket.
Let us buy a ticket on each of two different days: the payoffs X and Y are
independent. The total payoff X + Y has mean:
\[ \mu_{X+Y} = \mu_X + \mu_Y = 0.50 + 0.50 = 1.00 \]
dollars. The variance of X + Y is:
\[ \sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 = 249.75 + 249.75 = 499.5 . \]
The standard deviation is:
\[ \sigma_{X+Y} = \sqrt{499.5} = 22.35 \]
dollars.
This is not the same as the sum of the individual standard deviations:
15.80 + 15.80 = 31.60.
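One way to convince yourself that variances, not standard deviations, add for independent variables is to build the distribution of X + Y explicitly and compute its variance from the definition. The sketch below does that for two independent tickets; the variable names are arbitrary.

```python
from itertools import product
import math

# Payoff distribution of one ticket (Example 4.19), as {payoff: probability}.
ticket = {0: 0.999, 500: 0.001}

# Distribution of the total payoff X + Y for two independent tickets:
# multiply probabilities and accumulate by total payoff.
total = {}
for (x, px), (y, py) in product(ticket.items(), repeat=2):
    total[x + y] = total.get(x + y, 0.0) + px * py

mu = sum(t * p for t, p in total.items())
var = sum((t - mu) ** 2 * p for t, p in total.items())

print(round(mu, 2), round(var, 2), round(math.sqrt(var), 2))  # 1.0 499.5 22.35
```

The variance found directly from the distribution of X + Y matches 249.75 + 249.75 from Rule 2.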
Example 4.26. SAT scores.
SAT math score X:    µX = 625   σX = 90
SAT verbal score Y:  µY = 590   σY = 100
The mean overall SAT score is:
\[ \mu_{X+Y} = \mu_X + \mu_Y = 625 + 590 = 1215 . \]
The variance and standard deviation of the total cannot be computed
from the means and standard deviations alone: Rule 2 requires independence,
which we cannot assume for the two scores.
We need to know the correlation between X and Y to apply Rule 3. Let
ρ = 0.7. Then:
\[ \sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y = 90^2 + 100^2 + 2(0.7)(90)(100) = 30{,}700 . \]
The standard deviation of X + Y is:
\[ \sigma_{X+Y} = \sqrt{30{,}700} \approx 175 . \]
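Rule 3 is easy to wrap in a small helper. The sketch below is illustrative only: the function name `variance_of_sum` and its `sign` parameter are invented here, and the second call shows what Rule 2 would give if the two scores were independent (ρ = 0), which is not the situation in the example.

```python
import math

def variance_of_sum(sd_x, sd_y, rho, sign=1):
    """Variance of X + Y (sign=+1) or X - Y (sign=-1) by the general rule."""
    return sd_x ** 2 + sd_y ** 2 + sign * 2 * rho * sd_x * sd_y

# SAT example: sd of math = 90, sd of verbal = 100, correlation 0.7.
var_total = variance_of_sum(90, 100, 0.7)
print(round(var_total), round(math.sqrt(var_total)))  # 30700 175

# For comparison only: the value Rule 2 would give under independence.
var_indep = variance_of_sum(90, 100, 0.0)
print(round(var_indep))  # 18100
```

A positive correlation makes the total more variable than it would be under independence.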
Example 4.27. Investment portfolio and diversification. Someone
invested 20% in Treasury bills and 80% in an "index fund" that represents
all US common stocks. Let X and Y be the annual returns on T-bills and on
stocks. The portfolio rate of return is:
R = 0.2X + 0.8Y.
Based on historical data, we have:
X = annual return on T-bills:  µX = 5.2%,   σX = 2.9%;
Y = annual return on stocks:   µY = 13.3%,  σY = 17.0%;
Correlation between X and Y: ρ = −0.1.
The mean value of R is:
\[ \mu_R = 0.2\mu_X + 0.8\mu_Y = (0.2 \times 5.2) + (0.8 \times 13.3) = 11.68\% . \]
Applying Rules 1 and 3 we obtain the variance of the portfolio return:
\[ \sigma_R^2 = \sigma_{0.2X}^2 + \sigma_{0.8Y}^2 + 2\rho\,\sigma_{0.2X}\,\sigma_{0.8Y} \]
\[ = (0.2)^2\sigma_X^2 + (0.8)^2\sigma_Y^2 + 2\rho(0.2\sigma_X)(0.8\sigma_Y) \]
\[ = (0.04)(2.9)^2 + (0.64)(17.0)^2 + 2(-0.1)(0.2 \times 2.9)(0.8 \times 17.0) = 183.719 . \]
The standard deviation is:
\[ \sigma_R = \sqrt{183.719} = 13.55\% . \]
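The portfolio arithmetic can be reproduced in a few lines. This is a sketch using the figures quoted in Example 4.27, with returns kept in percent; the variable names are arbitrary.

```python
import math

# Portfolio weights and historical figures from Example 4.27 (in percent).
w_x, w_y = 0.2, 0.8
mu_x, sd_x = 5.2, 2.9      # T-bills
mu_y, sd_y = 13.3, 17.0    # stocks
rho = -0.1                 # correlation between X and Y

# Rules for means: mean of R = 0.2X + 0.8Y.
mu_r = w_x * mu_x + w_y * mu_y

# Rules 1 and 3 for variances: scale each sd by its weight, then combine.
sd_wx, sd_wy = w_x * sd_x, w_y * sd_y
var_r = sd_wx ** 2 + sd_wy ** 2 + 2 * rho * sd_wx * sd_wy
sd_r = math.sqrt(var_r)

print(round(mu_r, 2), round(var_r, 3), round(sd_r, 2))  # 11.68 183.719 13.55
```

The portfolio standard deviation, 13.55%, is below the 17.0% of stocks alone, which is the diversification effect the example is named for.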
6 Summary
1. The probability distribution of a random variable X, like a distribution of
data, has a mean µX and a standard deviation σX .
2. The law of large numbers says that the average of the values of X
observed in many trials must approach µ.
3. The mean µ is the balance point of the probability histogram or density
curve.
If X is discrete with possible values xi having probabilities pi , the mean
is the average of the values of X, each weighted by its probability:
µX = x1 p1 + x2 p2 + ... + xk pk .
4. The variance \(\sigma_X^2\) is the average squared deviation of the values of the
variable from their mean. For a discrete random variable,
\[ \sigma_X^2 = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + \cdots + (x_k - \mu_X)^2 p_k . \]
5. The standard deviation σX is the square root of the variance. The
standard deviation measures the variability of the distribution about the
mean. It is easiest to interpret for normal distributions.
6. The mean and variance of a continuous random variable can be computed
from the density curve, but to do so requires more advanced mathematics.
7. The means and variances of random variables obey the following rules.
If a and b are fixed numbers, then
\[ \mu_{a+bX} = a + b\mu_X , \]
and
\[ \sigma_{a+bX}^2 = b^2 \sigma_X^2 . \]
8. If X and Y are any two random variables, then
\[ \mu_{X+Y} = \mu_X + \mu_Y , \]
and if X and Y are independent, then
\[ \sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 , \]
and
\[ \sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2 . \]