Statistics 2014, Fall 2001

More Than One Random Variable – Independence
In most experiments, we have more than one measured variable. Hence we need to examine
probabilities associated with events that specify conditions on two or more random variables.
Defn: Let X and Y be two continuous r.v.'s. If for any open rectangle in $\mathbb{R}^2$ bounded by $a < x < b$ and $c < y < d$, we have
$$P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f(x, y)\,dy\,dx,$$
then the function $f(x, y)$ is called the joint probability density function (joint p.d.f.) for X and Y.
Defn: Let X and Y be two discrete r.v.'s. If for any pair of possible values x of X and y of Y, we have $f(x, y) = P(X = x, Y = y)$, then $f(x, y)$ is called the joint probability mass function (joint p.m.f.) for X and Y.
The above definitions may be immediately extended to any finite number of (continuous or discrete)
r.v.’s.
Defn: We say that the r.v.'s $X_1, X_2, \ldots, X_n$ are independent if for any events $E_1, E_2, \ldots, E_n \subseteq \mathbb{R}$, we have
$$P(X_1 \in E_1, X_2 \in E_2, \ldots, X_n \in E_n) = P(X_1 \in E_1)\,P(X_2 \in E_2)\cdots P(X_n \in E_n).$$
Example: p. 108, Exercise 3-130.
Reliability of Multi-Component Systems
As an example of the uses of sets of independent random variables, consider a system (mechanical or
electronic) consisting of several components which operate independently of each other. There are
two basic types of such systems, series and parallel.
Defn: A system consisting of several components is called a series system if all components must be
functioning for the system to function.
Defn: A system consisting of several components is called a parallel system if the system will
function so long as any single component is functioning.
Defn: The reliability of a system at a time t is the probability that the system will operate according to
specifications over the interval (0, t).
Note: For a series system, the system reliability is the product of the component reliabilities. For a
parallel system, the system reliability is 1 minus the product of the component failure probabilities.
Some systems may be hybrids of series and parallel systems – a system may, for example, be a series
system with components that are parallel systems of subcomponents.
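The note above translates directly into code. Here is a minimal Python sketch (the function names and the demo reliabilities are illustrative assumptions, not from the text):

```python
from math import prod

def series_reliability(rs):
    """Series system: all components must function, so reliabilities multiply."""
    return prod(rs)

def parallel_reliability(rs):
    """Parallel system: it fails only if every component fails."""
    return 1 - prod(1 - r for r in rs)

# Demo with three components of (assumed) reliabilities 0.9, 0.95, 0.99:
rs = [0.9, 0.95, 0.99]
print(series_reliability(rs))    # 0.9 * 0.95 * 0.99 = 0.84645
print(parallel_reliability(rs))  # 1 - 0.1 * 0.05 * 0.01 = 0.99995
```

A hybrid system is handled by composition, e.g. calling series_reliability on the outputs of parallel_reliability for each parallel bank of subcomponents.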
Example: Consider a series system of n independently operating, identical electronic components.
Assume that the lifetime of each component has an exponential distribution with a mean of 500 hours.
Let X1, X2, ..., Xn be the component lifetimes. For component i, the probability that the component
will still operate up to time x is
$$R_i = P(X_i > x) = \int_x^\infty \frac{1}{500}\exp\!\left(-\frac{u}{500}\right)du = \left[-\exp\!\left(-\frac{u}{500}\right)\right]_x^\infty = \exp\!\left(-\frac{x}{500}\right).$$
The system reliability will be
$$R = \prod_{i=1}^n R_i = \left[\exp\!\left(-\frac{x}{500}\right)\right]^n = \exp\!\left(-\frac{nx}{500}\right).$$
Example: Instead of the series system, assume that the same components are connected in a parallel
system. The system reliability will be
$$R = 1 - \prod_{i=1}^n (1 - R_i) = 1 - \left[1 - \exp\!\left(-\frac{x}{500}\right)\right]^n.$$
In general, a parallel system will have reliability greater than the reliability of any of its components,
while a series system will have reliability less than that of any of its components. In the example
above, assume that there are 5 components, and that we want to find the reliability at 300 hours. For
the series system, we have
$$R = \prod_{i=1}^5 R_i = \exp\!\left(-\frac{5 \cdot 300}{500}\right) = e^{-3} \approx 0.0498,$$
while for the parallel system, we have
$$R = 1 - \prod_{i=1}^5 (1 - R_i) = 1 - \left[1 - \exp\!\left(-\frac{300}{500}\right)\right]^5 = 1 - \left(1 - e^{-0.6}\right)^5 \approx 0.9813.$$
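Both numbers are easy to check numerically; here is a quick Python verification of the two formulas above (a sketch, using only the values given in the example):

```python
from math import exp

n, t, mean_life = 5, 300, 500
r = exp(-t / mean_life)  # reliability of one component at t = 300 hours

print(r ** n)            # series:   exp(-5*300/500) = e^(-3)  ~ 0.0498
print(1 - (1 - r) ** n)  # parallel: 1 - (1 - e^(-0.6))^5      ~ 0.9813
```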
Linear Functions of Random Variables
Let X be a r.v. with mean $\mu$ and standard deviation $\sigma$, and let c be a constant. Then we may define the r.v. Y by $Y = X + c$. The mean of the distribution of Y is $\mu + c$, and the standard deviation of Y is $\sigma$.
Let $Y = cX$. Then the mean of Y is $c\mu$ and the standard deviation of Y is $|c|\sigma$.
Let $X_1, X_2, \ldots, X_n$ be (continuous or discrete) r.v.'s with means $\mu_1, \mu_2, \ldots, \mu_n$, respectively, and variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2$, respectively. Let $c_1, c_2, \ldots, c_n$ be constants. We define a r.v. Y to be the linear combination
$$Y = c_1 X_1 + c_2 X_2 + \cdots + c_n X_n = \sum_{i=1}^n c_i X_i.$$
Then
$$\mu_Y = E(Y) = E\!\left(\sum_{i=1}^n c_i X_i\right) = \sum_{i=1}^n c_i E(X_i) = \sum_{i=1}^n c_i \mu_i.$$
This equation says that expectation is a linear operator.
Furthermore, if the r.v.'s are independent, then
$$\sigma_Y^2 = \sum_{i=1}^n c_i^2 \sigma_i^2.$$
In addition, if the X's have normal distributions, then Y also has a normal distribution. I.e., if $X_i \sim \text{Normal}(\mu_i, \sigma_i^2)$ for $i = 1, 2, \ldots, n$, and if the X's are independent, then
$$Y = \sum_{i=1}^n c_i X_i \sim \text{Normal}\!\left(\sum_{i=1}^n c_i \mu_i,\ \sum_{i=1}^n c_i^2 \sigma_i^2\right).$$
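A simulation makes this concrete. The sketch below (the coefficients and parameters are illustrative choices of mine, not from the text) draws a linear combination of two independent normals and compares the sample mean and variance with the formulas above.

```python
import random
import statistics

random.seed(2)

# Y = 2*X1 + 5*X2 with X1 ~ Normal(1, 2^2) and X2 ~ Normal(-3, 1^2) (assumed values).
c1, c2 = 2, 5
ys = [c1 * random.gauss(1, 2) + c2 * random.gauss(-3, 1) for _ in range(100_000)]

print(statistics.mean(ys))      # theory: 2*1 + 5*(-3)       = -13
print(statistics.variance(ys))  # theory: 2^2*2^2 + 5^2*1^2  = 41
```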
Example: p. 116, Exercise 3-145
What if the R.V.’s are not independent?
Let $X_1$ be a (continuous or discrete) r.v. with mean $\mu_1$ and variance $\sigma_1^2$. Let $X_2$ be a (continuous or discrete) r.v. with mean $\mu_2$ and variance $\sigma_2^2$. Let $Y = X_1 + X_2$. What are the mean and variance of the distribution of Y?
Since expectation is a linear operator, we have
Y  E  X1  X 2   E  X1   E  X 2   1  2 .
What about the variance?
Since expectation is a linear operator, we have
$$\begin{aligned}
\sigma_Y^2 &= E\!\left[(Y - \mu_Y)^2\right] = E(Y^2) - [E(Y)]^2 = E\!\left[(X_1 + X_2)^2\right] - (\mu_1 + \mu_2)^2 \\
&= E\!\left[X_1^2 + 2X_1 X_2 + X_2^2\right] - \left(\mu_1^2 + 2\mu_1\mu_2 + \mu_2^2\right) \\
&= \left[E(X_1^2) - \mu_1^2\right] + \left[E(X_2^2) - \mu_2^2\right] + 2\left[E(X_1 X_2) - \mu_1\mu_2\right] \\
&= \sigma_1^2 + \sigma_2^2 + 2\left[E(X_1 X_2) - \mu_1\mu_2\right].
\end{aligned}$$
Defn: The covariance of two r.v.'s $X_1$ and $X_2$, with means $\mu_1$ and $\mu_2$, respectively, is
$$\text{Cov}(X_1, X_2) = E\!\left[(X_1 - \mu_1)(X_2 - \mu_2)\right] = E(X_1 X_2) - \mu_1\mu_2.$$
With this notation, we have
$$\sigma_Y^2 = \sigma_1^2 + \sigma_2^2 + 2\,\text{Cov}(X_1, X_2).$$
Note that from the definition:
1) if larger values of X1 tend to be associated with larger values of X2, and smaller values of X1
tend to be associated with smaller values of X2, then the covariance is positive;
2) if larger values of X1 tend to be associated with smaller values of X2, and smaller values of X1
tend to be associated with larger values of X2, then the covariance is negative;
3) if the two r.v.'s are independent, then $\text{Cov}(X_1, X_2) = 0$.
Hence, the covariance describes how the two variables relate to each other. We want, however, a
standardized measure of relationship, which does not depend on the scale of measurement of either
variable.
Defn: The correlation of two r.v.'s $X_1$ and $X_2$, with means $\mu_1$ and $\mu_2$, respectively, and standard deviations $\sigma_1$ and $\sigma_2$, respectively, is
$$\rho_{12} = \frac{\text{Cov}(X_1, X_2)}{\sigma_1 \sigma_2} = \frac{E(X_1 X_2) - \mu_1\mu_2}{\sigma_1 \sigma_2}.$$
Properties of the correlation coefficient:
1) For any two r.v.'s $X_1$ and $X_2$, $-1 \le \rho_{12} \le 1$.
2) If there is no linear relationship between the variables, then $\rho_{12} = 0$.
3) If there is a perfect positive linear relationship between $X_1$ and $X_2$, then $\rho_{12} = 1$, and $X_2$ is an increasing linear function of $X_1$.
4) If there is a perfect negative linear relationship between $X_1$ and $X_2$, then $\rho_{12} = -1$, and $X_2$ is a decreasing linear function of $X_1$.
Given the above definitions, we may make the following statements about linear combinations of
r.v.’s:
Let $X_1, X_2, \ldots, X_n$ be (continuous or discrete) r.v.'s, with means $\mu_1, \mu_2, \ldots, \mu_n$, respectively, and variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2$, respectively. Let $c_1, c_2, \ldots, c_n$ be constants. We define a r.v. Y to be the linear combination
$$Y = c_1 X_1 + c_2 X_2 + \cdots + c_n X_n = \sum_{i=1}^n c_i X_i.$$
Then
1) $\mu_Y = E(Y) = E\!\left(\sum_{i=1}^n c_i X_i\right) = \sum_{i=1}^n c_i E(X_i) = \sum_{i=1}^n c_i \mu_i$, and
2) $\sigma_Y^2 = \sum_{i=1}^n c_i^2 \sigma_i^2 + 2\sum_{i<j} c_i c_j \text{Cov}(X_i, X_j) = \sum_{i=1}^n c_i^2 \sigma_i^2 + 2\sum_{i<j} c_i c_j \sigma_i \sigma_j \rho_{ij}$.
Note that if the r.v.'s are all independent, then the correlations are 0, and the two equations above reduce to the two equations given earlier for the independent case.
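Here is a quick check of equation 2) with dependent variables (a sketch; the coefficients and the dependence structure are illustrative assumptions of mine):

```python
import random
import statistics

random.seed(4)

# Y = 3*X1 - 2*X2, where X2 = X1 + noise, so Var(X1)=1, Var(X2)=2, Cov(X1,X2)=1.
c1, c2 = 3, -2
x1 = [random.gauss(0, 1) for _ in range(100_000)]
x2 = [a + random.gauss(0, 1) for a in x1]
y = [c1 * a + c2 * b for a, b in zip(x1, x2)]

# Equation 2): 3^2*1 + (-2)^2*2 + 2*3*(-2)*1 = 9 + 8 - 12 = 5.
print(statistics.variance(y))  # close to 5
```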
Random Samples, Statistics, and the Central Limit Theorem
Defn: A set of (continuous or discrete) random variables X1, X2, ..., Xn is called a random sample if
the r.v.’s have the same distribution and are independent. We say that X1, X2, ..., Xn are independent
and identically distributed (i.i.d.).
Defn: A statistic is a random variable which is a function of a random sample. The probability
distribution associated with a statistic is called its sampling distribution.
For example, let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution having mean $\mu$ and standard deviation $\sigma$. The statistic
$$\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$$
is called the sample mean. Since the X's are random variables, $\bar{X}$ is also a random variable, with a sampling distribution. From the equations for linear combinations given earlier, we have the following:
$$\mu_{\bar{X}} = E(\bar{X}) = E\!\left(\sum_{i=1}^n \frac{1}{n} X_i\right) = \sum_{i=1}^n \frac{1}{n}\mu = \mu.$$
Since the members of the sample are i.i.d., then
$$\sigma_{\bar{X}}^2 = \sum_{i=1}^n \frac{1}{n^2}\sigma^2 = \frac{\sigma^2}{n}.$$
If the random sample was selected from a normal distribution (we write $X_1, X_2, \ldots, X_n \sim \text{Normal}(\mu, \sigma^2)$), then we can also say that
$$\bar{X} \sim \text{Normal}\!\left(\mu, \frac{\sigma^2}{n}\right).$$
Example: p. 121, Example 3-161
Some other examples of statistics are:
1) The sample variance, $S^2 = \dfrac{1}{n-1}\displaystyle\sum_{i=1}^n \left(X_i - \bar{X}\right)^2$,
2) The sample median, $\tilde{X}$,
3) The kth order statistic, $X_{(k)}$.
Theorem: (Central Limit Theorem) If $X_1, X_2, \ldots, X_n$ are a random sample from any distribution with mean $\mu$ and standard deviation $\sigma < +\infty$, then the limiting distribution of
$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
as $n \to +\infty$ is standard normal.
Note: Nothing was said about the distribution from which the sample was selected except that it has
finite standard deviation. The sample could be selected from a normal distribution, or from an
exponential distribution, or from a Weibull distribution, or from a Bernoulli distribution, or from a
Poisson distribution, or from any other distribution with finite standard deviation. See, e.g., the
illustration on p. 120.
Note: For what n will the normal approximation be good? For most purposes, if $n \ge 30$, we will say that the approximation given by the Central Limit Theorem (CLT) works well.
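The claim is easy to probe by simulation. The sketch below (an illustration of mine; the exponential parent distribution and its mean are arbitrary choices) draws many samples of size n = 30, standardizes each sample mean as in the theorem, and checks that about 95% of the standardized values fall in (-1.96, 1.96), as the standard normal predicts.

```python
import random
from math import sqrt

random.seed(5)

mu = sigma = 500  # an exponential distribution has sigma equal to its mean
n = 30

def standardized_mean():
    xbar = sum(random.expovariate(1 / mu) for _ in range(n)) / n
    return (xbar - mu) / (sigma / sqrt(n))

zs = [standardized_mean() for _ in range(20_000)]
coverage = sum(1 for z in zs if -1.96 < z < 1.96) / len(zs)
print(f"P(-1.96 < Z < 1.96) ~ {coverage:.3f}")  # roughly 0.95
```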
Example: p. 122, Exercise 3-163.
Example: The fracture strength of tempered glass averages 14 (measured in thousands of p.s.i.) and
has a standard deviation of 2. What is the probability that the average fracture strength of 100
randomly selected pieces of tempered glass will exceed 14,500 p.s.i.?
Example: Shear strength measurements for spot welds have been found to have a standard deviation of
10 p.s.i. If 100 test welds are to be measured, what is the approximate probability that the sample
mean will be within 1 p.s.i. of the true population mean?
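Both examples reduce to a normal probability for $\bar{X}$ with standard deviation $\sigma/\sqrt{n}$. Here is a sketch of the arithmetic using the standard library's normal CDF:

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal

# Glass: mu = 14, sigma = 2 (thousands of p.s.i.), n = 100; want P(Xbar > 14.5).
z = (14.5 - 14) / (2 / sqrt(100))  # = 2.5
print(1 - Z.cdf(z))                # ~ 0.0062

# Welds: sigma = 10 p.s.i., n = 100; want P(|Xbar - mu| <= 1).
z = 1 / (10 / sqrt(100))           # = 1.0
print(Z.cdf(z) - Z.cdf(-z))        # ~ 0.6827
```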
Normal Approximation to Binomial Distribution
Assume that $Y \sim \text{Binomial}(n, p)$. We may also consider Y to be a sum of n i.i.d. Bernoulli(p) random variables, $X_1, X_2, \ldots, X_n$. If n is large ($n \ge 30$), then the CLT implies that
$$\frac{\bar{X} - p}{\sqrt{p(1-p)/n}} = \frac{Y - np}{\sqrt{np(1-p)}}$$
has an approximate standard normal distribution.
Example: Assume that the date is October 1, 2008. We want to predict the outcome of the Presidential
election. We select a simple random sample of 1068 voters from the population of all U.S. voters, and
ask each voter in the sample, “Do you intend to vote for Sen. Obama for President?” Let X = number
of voters in the sample who plan to vote for Sen. Obama. The actual level of support for the Senator in
the voting population was 0.53 (from the outcome of the election). What is the probability that a
majority of the voters in the sample will say that they are supporters of Sen. Obama? We will calculate
this probability exactly, using the binomial distribution, and then calculate an approximate probability,
using the normal approximation to the binomial distribution.
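Here is a sketch of both calculations in Python (my own implementation; it computes the binomial pmf through logarithms to avoid overflow at n = 1068). A majority of the 1068 sampled voters means $X \ge 535$.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

def binom_pmf(n, k, p):
    """Binomial pmf computed via logs, safe for large n."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)

n, p = 1068, 0.53
k = 535  # smallest majority of 1068

exact = sum(binom_pmf(n, j, p) for j in range(k, n + 1))  # exact binomial sum
z = (k - n * p) / sqrt(n * p * (1 - p))                   # CLT standardization
approx = 1 - NormalDist().cdf(z)                          # normal approximation

print(f"exact:  {exact:.4f}")   # ~ 0.973
print(f"approx: {approx:.4f}")  # ~ 0.971 (closer still with a continuity correction)
```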
Example: It is known that under normal operating conditions, a machine tool produces 1% defective
parts. We want to decide whether the rate of defects has increased. We select a simple random sample
of 36 parts produced by this machine, and let X = number of defective parts in the sample. Again, we
will calculate the exact probability, and the approximate probability.
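The text does not fix a decision rule here, so the sketch below assumes a hypothetical cutoff: flag the machine when $X \ge 2$ defectives appear. Note that $np = 36(0.01) = 0.36$, so even though $n \ge 30$, the normal approximation is poor in this setting; the exact binomial probability is the one to trust.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 36, 0.01
k = 2  # hypothetical cutoff, chosen only for illustration

# Exact: P(X >= 2) = 1 - P(X = 0) - P(X = 1).
exact = 1 - sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k))

# Normal approximation with mean np = 0.36 and sd sqrt(np(1-p)) ~ 0.60.
z = (k - n * p) / sqrt(n * p * (1 - p))
approx = 1 - NormalDist().cdf(z)

print(f"exact:  {exact:.4f}")   # ~ 0.0503
print(f"approx: {approx:.4f}")  # ~ 0.0030 -- far off, because np is tiny
```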