Proof of results given in Equations (7.1.11) and (7.1.12) of the text: To prove (7.1.11) and (7.1.12), consider the sample sum $T = X_1 + \cdots + X_n$ over all $\binom{N}{n}$ possible samples drawn without replacement from the finite population $X_1, \ldots, X_N$. We wish to find the mean of all these $T$'s, which is what is meant by $E(T)$. We have

$$E(T) = \frac{T_1 + \cdots + T_{\binom{N}{n}}}{\binom{N}{n}}, \quad (1)$$

where $T_1, \ldots, T_{\binom{N}{n}}$ are the possible sample sums arranged in some order. Now each of the $X_i$'s, that is, $X_1, \ldots, X_N$, occurs in $\binom{N-1}{n-1}$ of the $T$'s. Hence we may write (1) in terms of $X_1, \ldots, X_N$ as follows:

$$E(T) = \frac{\binom{N-1}{n-1}}{\binom{N}{n}}\,(X_1 + \cdots + X_N). \quad (2)$$

But $\binom{N-1}{n-1}\big/\binom{N}{n} = n/N$ and $X_1 + \cdots + X_N = N\mu$; hence

$$E(T) = \frac{n}{N}\,N\mu = n\mu, \quad (3)$$

thus verifying (7.1.11).

Now consider (7.1.12). We can write

$$\mathrm{Var}(T) = \frac{1}{\binom{N}{n}}\left[(T_1 - n\mu)^2 + \cdots + \bigl(T_{\binom{N}{n}} - n\mu\bigr)^2\right]. \quad (4)$$

Using (1) and (3), together with the identity $\mathrm{Var}(T) = E(T^2) - (E(T))^2$, we find that

$$\mathrm{Var}(T) = \frac{1}{\binom{N}{n}}\Bigl(T_1^2 + \cdots + T_{\binom{N}{n}}^2\Bigr) - n^2\mu^2. \quad (5)$$

Now for any sample $X_1, \ldots, X_n$, the $T$ for this sample is $(X_1 + \cdots + X_n)$, and $T^2$ is $(X_1^2 + \cdots + X_n^2) + 2(X_1X_2 + \cdots + X_{n-1}X_n)$. Thus, any $X_i^2$ occurs in $\binom{N-1}{n-1}$ of the $T^2$'s, and any product of two $X$'s, say $X_iX_j$, occurs in $\binom{N-2}{n-2}$ of the $T^2$'s. Hence, we can express $\mathrm{Var}(T)$ in terms of $X_1^2, \ldots, X_N^2, X_1X_2, \ldots, X_{N-1}X_N$ as follows:

$$\mathrm{Var}(T) = \frac{1}{\binom{N}{n}}\left[\binom{N-1}{n-1}(X_1^2 + \cdots + X_N^2) + 2\binom{N-2}{n-2}(X_1X_2 + \cdots + X_{N-1}X_N)\right] - n^2\mu^2, \quad (6)$$

which can be written as

$$\mathrm{Var}(T) = \frac{\binom{N-1}{n-1} - \binom{N-2}{n-2}}{\binom{N}{n}}\,(X_1^2 + \cdots + X_N^2) + \frac{\binom{N-2}{n-2}}{\binom{N}{n}}\,(X_1 + \cdots + X_N)^2 - n^2\mu^2. \quad (7)$$

This equation results by adding and subtracting $\binom{N-2}{n-2}(X_1^2 + \cdots + X_N^2)$ inside the bracket in (6) and noting that

$$2\binom{N-2}{n-2}(X_1X_2 + \cdots + X_{N-1}X_N) + \binom{N-2}{n-2}(X_1^2 + \cdots + X_N^2) = \binom{N-2}{n-2}(X_1 + \cdots + X_N)^2.$$

But since

$$\frac{\binom{N-1}{n-1}}{\binom{N}{n}} = \frac{n}{N}, \qquad \frac{\binom{N-2}{n-2}}{\binom{N}{n}} = \frac{n(n-1)}{N(N-1)},$$

and $X_1 + \cdots + X_N = N\mu$, we find that (7) reduces to

$$\mathrm{Var}(T) = \frac{n(N-n)}{N(N-1)}\left[(X_1^2 + \cdots + X_N^2) - N\mu^2\right].$$

Now using

$$\sigma^2 = \frac{1}{N}\,(X_1^2 + X_2^2 + \cdots + X_N^2 - N\mu^2),$$

we obtain

$$\mathrm{Var}(T) = n\sigma^2\,\frac{N-n}{N-1},$$

which proves (7.1.12). Note that if $N \to \infty$, then $\mathrm{Var}(T) = n\sigma^2$, and consequently $\mathrm{Var}(\bar X) = \sigma^2/n$.

7.2.1 The Central Limit Theorem

Theorem 7.2.1 (Central Limit Theorem) If $X_1, \ldots, X_n$ are independent and identically distributed with mean $\mu$ and variance $\sigma^2$ (both finite), then the limiting form of the distribution of

$$Z_n = \frac{\bar X - \mu}{\sigma/\sqrt{n}}$$

as $n \to \infty$ is that of the standard normal, that is, normal with mean 0 and variance 1.

Proof: We prove this theorem for the case when the moment generating function of the $X_i$'s exists. We write $Z_n$ as

$$Z_n = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} = \sum_{i=1}^{n} \frac{X_i - \mu}{\sigma\sqrt{n}} = \sum_{i=1}^{n} \frac{U_i}{\sqrt{n}},$$

where $U_i = (X_i - \mu)/\sigma$. Because the $X_i$'s are independent and identically distributed with mean $\mu$ and variance $\sigma^2$, the $U_i$, $i = 1, 2, \ldots, n$, are independent and identically distributed with mean 0 and variance 1. Furthermore, from Chapter 6, we have that the moment generating function of a sum of independent random variables, here the $U_i/\sqrt{n}$, is the product of their moment generating functions, so that the moment generating function of $Z_n$ is given by

$$M_{Z_n}(t) = \prod_{i=1}^{n} M_{U_i}\!\left(\frac{t}{\sqrt{n}}\right) = \left[M_{U_1}\!\left(\frac{t}{\sqrt{n}}\right)\right]^n. \quad (1)$$

Again, from (4.2.12), we know that for any random variable, say $X$,

$$M_X(t) = 1 + E(X)\,t + E(X^2)\,\frac{t^2}{2!} + E(X^3)\,\frac{t^3}{3!} + \cdots. \quad (2)$$

Thus, from (1) and (2), applied to the $U_i$ (which have $E(U_i) = 0$ and $E(U_i^2) = 1$), it follows that

$$M_{Z_n}(t) = \left[1 + \frac{t^2}{2!\,n} + \frac{\mu_3\,t^3}{3!\,\sigma^3 n^{3/2}} + \cdots\right]^n, \quad (3)$$

where $\lambda_r = E(U_i^r)$ and $\mu_r = E((X - \mu)^r)$, so that $\lambda_r = \mu_r/\sigma^r$. Now (3) may be rewritten in the form

$$M_{Z_n}(t) = \left[1 + \frac{t^2}{2n} + \frac{\xi_n}{n}\right]^n = \left[1 + \frac{t^2/2 + \xi_n}{n}\right]^n, \quad (4)$$

where

$$\xi_n = \xi_n(t) = \sum_{j=3}^{\infty} \frac{\mu_j\,t^j}{\sigma^j\,j!\,n^{j/2-1}},$$

so that $\xi_n$ approaches zero as $n \to \infty$. Thus, letting $n \to \infty$, we have

$$\lim_{n \to \infty} M_{Z_n}(t) = e^{t^2/2},$$

which is the m.g.f. of the standard normal distribution. Thus, by Theorem 6.3.1, the limiting distribution of $Z_n$ as $n \to \infty$ is the standard normal, which is the statement of Theorem 7.2.1. That is, for $n$ large enough, $M_{Z_n}(t)$ is approximately the m.g.f. of the standard normal, so that, approximately, the distribution of $Z_n$ for $n$ large enough is that of the standard normal.
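The limiting behavior just derived is easy to observe numerically. The following is a minimal Monte Carlo sketch (not part of the text's development), assuming NumPy is available; the exponential population with $\mu = \sigma = 1$, the sample size $n = 50$, and the replicate count are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary non-normal population: exponential with mu = sigma = 1.
mu, sigma = 1.0, 1.0
n, reps = 50, 100_000

# Draw 'reps' samples of size n and form Z_n = (X_bar - mu)/(sigma/sqrt(n)).
samples = rng.exponential(scale=1.0, size=(reps, n))
z = (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))

print("mean of Z_n:", z.mean())               # close to 0
print("variance of Z_n:", z.var())            # close to 1
print("P(Z_n <= 1.96):", np.mean(z <= 1.96))  # close to 0.975
```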
This in turn says that for large $n$, $\sum_{i=1}^{n} X_i$ is approximately normally distributed with mean $n\mu$ and variance $n\sigma^2$, and $\bar X = \sum_{i=1}^{n} X_i / n$ is approximately normal with mean $\mu$ and variance $\sigma^2/n$. In fact, algebraically,

$$Z_n = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} = \frac{\bar X - \mu}{\sigma/\sqrt{n}}.$$

Theorem 7.3.6 If $X$ and $Y$ are independent random variables having the normal distribution $N(0,1)$ and the chi-square distribution with $n$ degrees of freedom, respectively, then the random variable

$$T = \frac{X}{\sqrt{Y/n}} \quad (1)$$

has the following probability density function:

$$f(t) = \frac{\Gamma((n+1)/2)}{\sqrt{n\pi}\,\Gamma(n/2)}\left(1 + \frac{t^2}{n}\right)^{-(n+1)/2}, \quad -\infty < t < \infty. \quad (2)$$

Proof: To prove this we note that, since $X$ and $Y$ are independent, $(X, Y)$ has the joint probability density function

$$g(x, y) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} \cdot \frac{(y/2)^{n/2-1}}{2\,\Gamma(n/2)}\,e^{-y/2}. \quad (3)$$

Now consider the transformation

$$T = \frac{X}{\sqrt{Y/n}}, \qquad U = Y,$$

which has as its inverse transformation

$$X = T\sqrt{U/n}, \qquad Y = U.$$

We then find that the joint probability density function of $(T, U)$ is

$$f(t, u) = g(x, y)\,|J|, \quad (4)$$

where the absolute value of the Jacobian, $|J|$, is easily seen to be $\sqrt{u/n}$, so that

$$f(t, u) = \frac{1}{2\sqrt{\pi n}\,\Gamma(n/2)}\,(u/2)^{[(n+1)/2]-1}\exp\left[-\frac{u}{2}\left(1 + \frac{t^2}{n}\right)\right]. \quad (5)$$

Now integrating (5) with respect to $u$ for a fixed $t$, we obtain (2).

Theorem 7.3.10 The probability density function of the Snedecor F-distribution with $\nu_1$ and $\nu_2$ degrees of freedom is

$$h(f) = \frac{\Gamma[(\nu_1 + \nu_2)/2]}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)}\left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} f^{\nu_1/2 - 1}\left(1 + \frac{\nu_1}{\nu_2}\,f\right)^{-(\nu_1+\nu_2)/2}, \quad f > 0, \quad (1)$$
$$\phantom{h(f)} = 0, \quad \text{otherwise}.$$

Proof: To prove Theorem 7.3.10, we begin with the p.d.f. of $(X_1, X_2)$, where $X_1$ and $X_2$ are two independent random variables having chi-square distributions with $\nu_1$ and $\nu_2$ degrees of freedom, respectively. The joint p.d.f. is given by

$$g(x_1, x_2) = \frac{(x_1/2)^{(\nu_1/2)-1}\,(x_2/2)^{(\nu_2/2)-1}}{4\,\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)}\,e^{-(x_1+x_2)/2}, \quad x_1 > 0,\ x_2 > 0.$$

Let us use the transformation

$$F = \frac{X_1/\nu_1}{X_2/\nu_2}, \qquad U = X_2,$$

which has the inverse transformation given by

$$X_1 = \frac{\nu_1}{\nu_2}\,UF, \qquad X_2 = U.$$

Now the joint p.d.f. of $(F, U)$ is $h(f, u) = g(x_1, x_2)\,|J|$, where the Jacobian $|J|$ is easily seen to be equal to $(\nu_1/\nu_2)\,u$. Thus, the joint p.d.f. of $(F, U)$ is given by

$$h(f, u) = \frac{(\nu_1/\nu_2)^{\nu_1/2}\,(u/2)^{(\nu_1+\nu_2)/2 - 1}\,f^{(\nu_1/2)-1}}{2\,\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)}\exp\left[-\frac{u}{2}\left(1 + \frac{\nu_1}{\nu_2}\,f\right)\right], \quad f > 0,\ u > 0. \quad (2)$$

Integrating (2) with respect to $u$ from 0 to $\infty$ for a fixed $f$ gives (1). Also, at this point, we may remark that $t_\nu^2 = F_{1,\nu}$, that is, the square of a Student $t$ variable with $\nu$ degrees of freedom has the F-distribution with 1 and $\nu$ degrees of freedom, as the reader may easily verify.

7.4 Order Statistics (Optional)

In this section we shall consider probability distributions of statistics that are obtained when one orders the $n$ elements of a sample from least to greatest, and sampling is done on a continuous random variable $X$ whose p.d.f. is $f(x)$. Suppose we let $X_1, \ldots, X_n$ be a random sample from a population having continuous p.d.f. $f(x)$. We note that since $X$ is a continuous random variable, the probability of $X$ assuming a specific value is 0. In fact, by a straightforward conditional probability argument, we can show that for any two of $(X_1, \ldots, X_n)$, the probability of their having the same value is zero. Consider then the observations $(X_1, \ldots, X_n)$ from a population having p.d.f. $f(x)$. Let

$X_{(1)}$ = smallest of $(X_1, \ldots, X_n)$,
$X_{(2)}$ = second smallest of $(X_1, \ldots, X_n)$,
...
$X_{(k)}$ = $k$th smallest of $(X_1, \ldots, X_n)$,
...
$X_{(n)}$ = largest of $(X_1, \ldots, X_n)$.

Note that $X_{(1)} < X_{(2)} < \cdots < X_{(k)} < \cdots < X_{(n)}$. The quantities $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ are random variables and are called the order statistics of the sample.
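In computational practice the order statistics of a sample are obtained simply by sorting it. A brief sketch, assuming NumPy; the sample size and the uniform population are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(size=7)    # a sample X_1, ..., X_7

x_ord = np.sort(x)         # the order statistics X_(1) <= X_(2) <= ... <= X_(7)
k = 3
print("smallest, X_(1):    ", x_ord[0])
print("kth smallest, X_(3):", x_ord[k - 1])
print("largest, X_(n):     ", x_ord[-1])
```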
$X_{(1)}$ is called the smallest element in the sample and $X_{(n)}$ the largest; $X_{(k)}$ is the $k$th order statistic; $X_{(m+1)}$ is the sample median when the sample size is odd, so that $n = 2m+1$; and $R = X_{(n)} - X_{(1)}$ is called the sample range.

7.4.1 Distribution of the Largest Element in a Sample

As we have just stated, $X_{(n)}$ is the largest element in the sample $X_1, \ldots, X_n$. If the sample is drawn from a population having p.d.f. $f(x)$, let $F(x)$ be the c.d.f. of the population defined by

$$F(x) = \int_{-\infty}^{x} f(u)\,du = P(X \le x). \quad (7.4.1)$$

Then the c.d.f. of $X_{(n)}$ is given by

$$P(X_{(n)} \le x) = P(X_1, \ldots, X_n \text{ are all } \le x) = (F(x))^n, \quad (7.4.2)$$

because the $X_i$'s are independent and $P(X_i \le x) = F(x)$ for $i = 1, 2, \ldots, n$. If we denote the c.d.f. of the largest value by $G(x)$, we have

$$G(x) = (F(x))^n. \quad (7.4.3)$$

The above result says that if we take a random sample of $n$ elements from a population whose p.d.f. is $f(x)$ [or whose c.d.f. is $F(x)$], then the c.d.f. $G(x)$ of the largest element $X_{(n)}$ in the sample is given by (7.4.3). If we denote the p.d.f. of the largest element by $g_{X_{(n)}}(x)$, we have

$$g_{X_{(n)}}(x) = \frac{d}{dx}\,G(x) = n(F(x))^{n-1} f(x). \quad (7.4.4)$$

Example 7.4.1 (Distribution of Last Bulb to Fail) Suppose the mortality of a certain type of mass-produced light bulb is such that a bulb of this type, taken at random from production, burns out in time $T$. Further, suppose that $T$ is distributed as exponential with parameter $\lambda$, so that the p.d.f. of $T$ is given by

$$f(t) = \lambda e^{-\lambda t}, \quad t > 0, \quad (7.4.5)$$
$$\phantom{f(t)} = 0, \quad t \le 0,$$

where $\lambda$ is some positive constant. If $n$ bulbs of this type are taken at random, let their lives be $T_1, \ldots, T_n$. If the order statistics are $T_{(1)}, \ldots, T_{(n)}$, then $T_{(n)}$ is the life of the last bulb to burn out. We wish to determine the p.d.f. of $T_{(n)}$.

Solution: To solve this problem we may think of a population of bulbs whose length-of-life p.d.f. is given by (7.4.5). We first determine that the c.d.f. of $T$ is given by

$$F(t) = \int_{0}^{t} f(u)\,du = \int_{0}^{t} \lambda e^{-\lambda u}\,du = 1 - e^{-\lambda t}, \quad t > 0. \quad (7.4.6)$$

Applying (7.4.4), we therefore have as the p.d.f. of $T_{(n)}$

$$g_{T_{(n)}}(t) = n\lambda\,(1 - e^{-\lambda t})^{n-1}\,e^{-\lambda t}, \quad t > 0, \quad (7.4.7)$$
$$\phantom{g_{T_{(n)}}(t)} = 0, \quad t \le 0.$$

In other words, the probability that the last bulb to burn out expires during the time interval $(t, t + dt)$ is given by $g(t)\,dt$, where

$$g(t)\,dt = n\lambda\,(1 - e^{-\lambda t})^{n-1}\,e^{-\lambda t}\,dt. \quad (7.4.8)$$

7.4.2 Distribution of the Smallest Element in a Sample

We now wish to find the expression for the c.d.f. of the smallest element $X_{(1)}$ in the sample $X_1, \ldots, X_n$. That is, we want to determine $P(X_{(1)} \le x)$ as a function of $x$. Denoting this function by $G(x)$, we have

$$G(x) = P(X_{(1)} \le x) = 1 - P(X_{(1)} > x). \quad (7.4.9)$$

But

$$P(X_{(1)} > x) = P(X_1, \ldots, X_n \text{ are all } > x) = [1 - F(x)]^n, \quad (7.4.10)$$

because the $X_i$'s are independent and $P(X_i > x) = 1 - F(x)$, $i = 1, 2, \ldots, n$. Therefore, the c.d.f. $G(x)$ of the smallest element in the sample is given by

$$G(x) = 1 - [1 - F(x)]^n. \quad (7.4.11)$$

The p.d.f., say $g(x)$, of the smallest element in the sample is therefore obtained by taking the derivative of the right-hand side of (7.4.11) with respect to $x$. We thus find that the p.d.f. of $X_{(1)}$ is given by

$$g_{X_{(1)}}(x) = n[1 - F(x)]^{n-1} f(x). \quad (7.4.12)$$
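Formulas (7.4.3) and (7.4.11) are easy to verify by simulation. The sketch below (assuming NumPy; $\lambda = 0.5$, $n = 5$, and the evaluation points are illustrative choices) uses the exponential bulb lives of Example 7.4.1 and compares the empirical c.d.f.'s of $T_{(n)}$ and $T_{(1)}$ with $(F(t))^n$ and $1 - (1 - F(t))^n$:

```python
import numpy as np

rng = np.random.default_rng(3)
lam, n, reps = 0.5, 5, 200_000

t = rng.exponential(scale=1/lam, size=(reps, n))
t_last = t.max(axis=1)    # T_(n), life of the last bulb to fail
t_first = t.min(axis=1)   # T_(1), life of the first bulb to fail

F = lambda x: 1.0 - np.exp(-lam * x)   # population c.d.f., as in (7.4.6)
for x in (1.0, 2.0, 4.0):
    print(x,
          np.mean(t_last <= x),  F(x) ** n,             # check (7.4.3)
          np.mean(t_first <= x), 1 - (1 - F(x)) ** n)   # check (7.4.11)
```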
Example 7.4.2 (Probability Distribution of the Weakest Link of a Chain) Suppose links of a certain type used for making chains are such that the population of individual links has breaking strengths $X$ with p.d.f.

$$f(x) = \frac{(m+1)(m+2)}{c^{m+2}}\,x^m (c - x), \quad 0 < x < c, \quad (7.4.13)$$
$$\phantom{f(x)} = 0, \quad \text{otherwise},$$

where $c$ and $m$ are certain positive constants. If a chain is made up of $n$ links of this type taken at random from the population of links, what is the probability distribution of the breaking strength of the chain?

Since the breaking strength of a chain is equal to the breaking strength of its weakest link, the problem reduces to finding the p.d.f. of the smallest element $X_{(1)}$ in a sample of size $n$ from the p.d.f. $f(x)$ given in (7.4.13). First we find the c.d.f. $F(x)$ of breaking strengths of individual links by performing the following integration:

$$F(x) = \int_{0}^{x} f(u)\,du = \frac{(m+1)(m+2)}{c^{m+2}}\int_{0}^{x} u^m (c - u)\,du, \quad (7.4.14)$$

that is,

$$F(x) = (m+2)\left(\frac{x}{c}\right)^{m+1} - (m+1)\left(\frac{x}{c}\right)^{m+2}. \quad (7.4.15)$$

With the use of (7.4.12) and (7.4.13) we obtain the p.d.f. of the breaking strength $X$ of an $n$-link chain made from a random sample of $n$ of these links:

$$g(x) = n\left[1 - (m+2)\left(\frac{x}{c}\right)^{m+1} + (m+1)\left(\frac{x}{c}\right)^{m+2}\right]^{n-1}\frac{(m+1)(m+2)}{c^{m+2}}\,x^m (c - x), \quad (7.4.16)$$

for $0 < x < c$, and $g(x) = 0$ otherwise.

7.4.3 Distribution of the Median of a Sample and of the kth Order Statistic

Suppose we have a sample of $2m+1$ elements $X_1, \ldots, X_{2m+1}$ from a population having p.d.f. $f(x)$ [and c.d.f. $F(x)$]. If we form the order statistics $X_{(1)}, \ldots, X_{(2m+1)}$ of the sample, then $X_{(m+1)}$ is called the sample median. We want to determine the probability distribution function of the median. Let us divide the x-axis into the following three disjoint intervals:

$$I_1 = (-\infty, x], \qquad I_2 = (x, x+dx], \qquad I_3 = (x+dx, \infty). \quad (7.4.17)$$

Then the probabilities $p_1, p_2, p_3$ that an element $X$ drawn from the population with p.d.f. $f(x)$ will lie in the intervals $I_1, I_2, I_3$ are given, respectively, by

$$p_1 = F(x), \qquad p_2 = F(x+dx) - F(x), \qquad p_3 = 1 - F(x+dx). \quad (7.4.18)$$

If we take a sample of size $2m+1$ from the population with p.d.f. $f(x)$, the median of the sample will lie in $(x, x+dx)$ if, and only if, $m$ sample elements fall in $I_1 = (-\infty, x]$, one sample element falls in $I_2 = (x, x+dx]$, and $m$ sample elements fall in $I_3 = (x+dx, \infty)$. The probability that all of this occurs is obtained by applying the multinomial probability distribution discussed in Section 4.7. This gives

$$\frac{(2m+1)!}{(m!)^2}\,(p_1)^m (p_2)^1 (p_3)^m. \quad (7.4.19)$$

Substituting the values of $p_1, p_2, p_3$ from (7.4.18) into (7.4.19), we obtain

$$\frac{(2m+1)!}{(m!)^2}\,F^m(x)\,[F(x+dx) - F(x)]\,[1 - F(x+dx)]^m. \quad (7.4.20)$$

Now we may write

$$F(x+dx) - F(x) = f(x)\,dx. \quad (7.4.21)$$

Substituting this expression into (7.4.20), we find that (ignoring terms of order $(dx)^2$ and higher)

$$P(x < X_{(m+1)} \le x+dx) = \frac{(2m+1)!}{(m!)^2}\,F^m(x)\,[1 - F(x)]^m f(x)\,dx. \quad (7.4.22)$$

The p.d.f. $g(x)$ of the median is the coefficient of $dx$ on the right-hand side of (7.4.22), and the probability that the sample median $X_{(m+1)}$ falls in the interval $(x, x+dx)$ is given by

$$g_{X_{(m+1)}}(x)\,dx = \frac{(2m+1)!}{(m!)^2}\,F^m(x)\,[1 - F(x)]^m f(x)\,dx. \quad (7.4.23)$$

We note that the sample space of the median $X_{(m+1)}$ is the same as the sample space of $X$, where $X$ has the (population) c.d.f. $F(x)$.

Example 7.4.3 (Probability Distribution of the Median) Suppose $2m+1$ points are taken "at random" on the interval $(0,1)$. What is the probability that the median of the $2m+1$ points falls in $(x, x+dx)$?

In this example the p.d.f. of a point $X$ taken at random on $(0,1)$ is defined as

$$f(x) = 1, \quad 0 < x < 1,$$
$$\phantom{f(x)} = 0, \quad \text{for all other values of } x.$$

Then

$$F(x) = 0, \quad x \le 0,$$
$$\phantom{F(x)} = x, \quad 0 < x < 1,$$
$$\phantom{F(x)} = 1, \quad x \ge 1.$$

Therefore, the p.d.f. $g_{X_{(m+1)}}(x)$ of the median of a sample of $2m+1$ points is given by

$$g_{X_{(m+1)}}(x) = \frac{(2m+1)!}{(m!)^2}\,x^m (1-x)^m, \quad 0 < x < 1,$$

and zero otherwise. Hence, the probability that the median of the $2m+1$ points falls in $(x, x+dx)$ is given by

$$g_{X_{(m+1)}}(x)\,dx = \frac{(2m+1)!}{(m!)^2}\,x^m (1-x)^m\,dx.$$
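Since $B(m+1, m+1) = (m!)^2/(2m+1)!$, the density in Example 7.4.3 is that of a Beta$(m+1, m+1)$ random variable, which makes a numerical check convenient. A sketch assuming NumPy and SciPy are available; $m = 3$ and the evaluation points are arbitrary:

```python
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(4)
m, reps = 3, 200_000
n = 2 * m + 1

# medians of many samples of 2m+1 uniform points on (0, 1)
medians = np.median(rng.uniform(size=(reps, n)), axis=1)

# empirical P(X_(m+1) <= x) versus the Beta(m+1, m+1) c.d.f.
for x in (0.3, 0.5, 0.7):
    print(x, np.mean(medians <= x), beta.cdf(x, m + 1, m + 1))
```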
More generally, if we have a sample of $n$ elements, say $X_1, \ldots, X_n$, from a population having p.d.f. $f(x)$, and if $X_{(k)}$ is the $k$th order statistic of the sample (the $k$th smallest of $X_1, \ldots, X_n$), then we can show, as in the case of the median, that

$$P(x < X_{(k)} \le x+dx) = \frac{n!}{(k-1)!\,(n-k)!}\,F^{k-1}(x)\,[1 - F(x)]^{n-k} f(x)\,dx. \quad (7.4.24)$$

Therefore, the p.d.f. of the $k$th order statistic of the sample is given by

$$g_{X_{(k)}}(x) = \frac{n!}{(k-1)!\,(n-k)!}\,F^{k-1}(x)\,[1 - F(x)]^{n-k} f(x). \quad (7.4.25)$$

Note that the functional form of the p.d.f. on the right of (7.4.25) reduces to that on the right of (7.4.12) if $k = 1$, and to that on the right of (7.4.4) if $k = n$, as one would expect, since in these two cases the $k$th order statistic $X_{(k)}$ becomes the smallest element $X_{(1)}$ and the largest element $X_{(n)}$, respectively.

Example 7.4.4 (Distribution of the kth Order Statistic) If $n$ points $X_1, \ldots, X_n$ are taken "at random" on the interval $(0, 1)$, what is the p.d.f. of the $k$th order statistic $X_{(k)}$?

Using (7.4.25), the p.d.f. of $X_{(k)}$ is given by

$$g_{X_{(k)}}(x) = \frac{n!}{(k-1)!\,(n-k)!}\,x^{k-1}(1-x)^{n-k}, \quad 0 < x < 1,$$

and zero otherwise, since

$$F(x) = \int_{0}^{x} 1\,du = x, \quad 0 < x < 1.$$

7.4.4 Other Uses of Order Statistics

The Range as an Estimate of $\sigma$ in Normal Samples

Suppose a random variable $X$ has the normal distribution with unknown standard deviation $\sigma$. If a sample of $n$ independent observations is taken on $X$, then $R = X_{(n)} - X_{(1)}$ may be used as an estimate of $\sigma$. This estimate is not good for large $n$, but for small $n$ ($n \le 10$) it is deemed adequate. The estimate $\hat\sigma$ is made using the formula

$$\hat\sigma = c(n)\,R, \quad (7.5.1)$$

where $c(n)$ is tabulated in Table 7.5.1.

TABLE 7.5.1

n    c(n)      n    c(n)
3    .591      7    .370
4    .486      8    .351
5    .430      9    .337
6    .395     10    .325
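A quick simulation indicates how (7.5.1) behaves in practice. The sketch below (assuming NumPy; $\sigma = 2$ and the replicate count are arbitrary) averages $\hat\sigma = c(n)R$ over many normal samples and should return values near $\sigma$:

```python
import numpy as np

rng = np.random.default_rng(5)
sigma, reps = 2.0, 100_000
c = {3: .591, 4: .486, 5: .430, 6: .395,
     7: .370, 8: .351, 9: .337, 10: .325}   # Table 7.5.1

for n in (3, 5, 10):
    x = rng.normal(loc=0.0, scale=sigma, size=(reps, n))
    r = x.max(axis=1) - x.min(axis=1)    # sample range R = X_(n) - X_(1)
    print(n, np.mean(c[n] * r))          # average sigma_hat, near sigma = 2
```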
Practice Problems for Section 7.4

1. A continuous random variable, say $X$, has the uniform distribution on $(0,1)$, so that the p.d.f. of $X$ is given by

$$f(x) = 0, \quad x \le 0,$$
$$\phantom{f(x)} = 1, \quad 0 < x < 1,$$
$$\phantom{f(x)} = 0, \quad x \ge 1.$$

If $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ are the order statistics of $n$ independent observations all having this distribution function, give the expression for the density $g(x)$ of:
(a) The largest of these $n$ observations.
(b) The smallest of these $n$ observations.
(c) The $r$th smallest of these $n$ observations.

2. If ten points are picked independently and at random on the interval $(0,1)$:
(a) What is the probability that the point nearest 1 (i.e., the largest of the ten numbers selected) will lie between .9 and 1.0?
(b) The probability is 1/2 that the point nearest 0 will exceed what number?

3. Assume that the cumulative distribution function of breaking strengths (in pounds) of links used in making a certain type of chain is given by

$$F(x) = 1 - e^{-\lambda x}, \quad x > 0,$$
$$\phantom{F(x)} = 0, \quad x \le 0,$$

where $\lambda$ is a positive constant. What is the probability that a 100-link chain made from these links would have a breaking strength exceeding $y$ pounds?

4. Suppose $F(x)$ is the fraction of objects in a very large lot having weights less than or equal to $x$ pounds. If ten objects are drawn at random from the lot:
(a) What is the probability that the heaviest of these ten objects will have a weight less than or equal to $u$ pounds?
(b) What is the probability that the lightest of the objects will have a weight less than or equal to $v$ pounds?

5. The time, in minutes, taken by a manager of a company to drive from one plant to another is uniformly distributed over the interval [15, 30]. Let $X_1, X_2, \ldots, X_n$ denote her driving times on $n$ randomly selected days, and let $X_{(n)} = \mathrm{Max}(X_1, X_2, \ldots, X_n)$. Determine
(a) The probability density function of $X_{(n)}$.
(b) The mean of $X_{(n)}$.

6. The lifetimes, in years, $X_1, X_2, \ldots, X_n$ of $n$ randomly selected power steering pumps manufactured by a subsidiary of a car company are exponentially distributed with mean $1/\lambda$. Find the probability density function of $X_{(1)} = \mathrm{Min}(X_1, X_2, \ldots, X_n)$ and find its mean and variance.

7. In Problem 5, assume that $n = 21$.
(a) Find the probability density function of the median time taken by the manager to drive from one plant to another.
(b) Find the expected value of $X_{(21)}$.

8. Consider a system of $n$ identical components operating independently. Suppose the lifetime of each component, in months, is exponentially distributed with mean $1/\lambda$. The components are installed in series, so that the system fails as soon as the first component fails. Find the probability density function of the life of the system and then find its mean and variance.
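Several of these problems (for instance, Problems 6 and 8) lend themselves to a numerical check of the answers one derives. A minimal sketch for the series system of Problem 8, assuming NumPy; $\lambda = 0.1$ and $n = 4$ are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(6)
lam, n, reps = 0.1, 4, 200_000

# lifetimes of the n components in each simulated system
lives = rng.exponential(scale=1/lam, size=(reps, n))
system_life = lives.min(axis=1)   # a series system fails at the first failure

# compare these with the mean and variance derived from the p.d.f. of X_(1)
print("empirical mean:    ", system_life.mean())
print("empirical variance:", system_life.var())
```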