Probability and Statistics
Lecture 11
Dr.-Ing. Erwin Sitompul
President University
http://zitompul.wordpress.com
2013

President University Erwin Sitompul PBST 11/1

Chapter 8.6 Sampling Distribution of $S^2$

If $S^2$ is the variance of a random sample of size $n$ taken from a normal population having the variance $\sigma^2$, then the statistic

$$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2}$$

has a chi-squared distribution with $v = n - 1$ degrees of freedom.

Table A.5 gives values of $\chi^2_\alpha$ for various values of $\alpha$ and $v$.
The column headings are the areas $\alpha$ to the right of $\chi^2_\alpha$.
The left column shows the degrees of freedom.
The table entries are the $\chi^2$ values.

Table A.5 Chi-Squared Distribution

A manufacturer of car batteries guarantees that his batteries will last, on the average, 3 years with a standard deviation of 1 year. If five of these batteries have lifetimes of 1.9, 2.4, 3.0, 3.5, and 4.2 years, should the manufacturer still be convinced that his batteries have a standard deviation of 1 year? Assume that the battery lifetime follows a normal distribution.

$$s^2 = \frac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n-1)} = \frac{(5)(48.26) - (15)^2}{(5)(4)} = 0.815$$

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(4)(0.815)}{(1)} = 3.26$$

From the table, 95% of the $\chi^2$ values with 4 degrees of freedom fall between 0.484 and 11.143. The computed value of 3.26, obtained with $\sigma^2 = 1$, lies inside this interval and is therefore reasonable. The manufacturer has no reason to doubt the current standard deviation.
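The arithmetic of the battery example can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library: the variable names are illustrative, and the bounds 0.484 and 11.143 are simply the table values quoted in the example, not computed here.

```python
import statistics

# Battery lifetimes in years, as given in the example.
lifetimes = [1.9, 2.4, 3.0, 3.5, 4.2]
sigma_squared = 1.0  # claimed population variance (sigma = 1 year)

n = len(lifetimes)
s_squared = statistics.variance(lifetimes)  # sample variance, divisor n - 1
chi_squared = (n - 1) * s_squared / sigma_squared

# Table values quoted in the example for v = 4 degrees of freedom:
# 95% of the chi-squared values fall between these two bounds.
lower, upper = 0.484, 11.143
is_reasonable = lower < chi_squared < upper

print(f"s^2 = {s_squared:.3f}, chi^2 = {chi_squared:.2f}, reasonable: {is_reasonable}")
```

For other degrees of freedom, the two bounds would have to be replaced by the corresponding entries of Table A.5.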

Chapter 8.7 t-Distribution

In the previous section we discussed the utility of the Central Limit Theorem to infer a population mean or the difference between two population means. This utility is based on the assumption that the population standard deviation is known.

However, in many experimental scenarios, knowledge of $\sigma$ is no more reasonable than knowledge of the population mean $\mu$. Often, an estimate of $\sigma$ must be supplied by the same sample information that produced the sample average $\bar{x}$.

As a result, a natural statistic to consider to deal with inferences on $\mu$ is

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

If the sample size is large enough, say $n \ge 30$, the distribution of $T$ does not differ considerably from the standard normal. However, for $n < 30$, the values of $S^2$ fluctuate considerably from sample to sample and the distribution of $T$ deviates appreciably from that of a standard normal distribution.

In the case that the sample size is small, it is useful to deal with the exact distribution of $T$. In developing the sampling distribution of $T$, we shall assume that the random sample was selected from a normal population,

$$T = \frac{(\bar{X} - \mu)/(\sigma/\sqrt{n})}{\sqrt{S^2/\sigma^2}} = \frac{Z}{\sqrt{V/(n-1)}}$$

where

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}, \qquad V = \frac{(n-1)S^2}{\sigma^2}$$

Let $Z$ be a standard normal random variable and $V$ a chi-squared random variable with $v$ degrees of freedom. If $Z$ and $V$ are independent, then the distribution of the random variable $T$, where

$$T = \frac{Z}{\sqrt{V/v}}$$

is given by the density function

$$h(t) = \frac{\Gamma[(v+1)/2]}{\Gamma(v/2)\sqrt{\pi v}} \left(1 + \frac{t^2}{v}\right)^{-(v+1)/2}, \qquad -\infty < t < \infty$$

This is known as the t-distribution with $v$ degrees of freedom.

Let $X_1, X_2, \ldots, X_n$ be independent random variables that are all normal with mean $\mu$ and standard deviation $\sigma$.
Let

$$\bar{X} = \sum_{i=1}^{n} \frac{X_i}{n} \qquad \text{and} \qquad S^2 = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{n-1}$$

Then the random variable $T = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$ has a t-distribution with $v = n - 1$ degrees of freedom.

Figure: The shape of t-distribution curves for $v = 2$, $5$, and $\infty$

Table A.4 t-Distribution

It is customary to let $t_\alpha$ represent the t-value above which we find an area equal to $\alpha$.
The t-distribution is symmetric about a mean of zero, that is, $t_{1-\alpha} = -t_\alpha$.

A t-value that falls below $-t_{0.025}$ or above $t_{0.025}$ would tend to make us believe that either a very rare event has taken place or perhaps our assumption about $\mu$ is in error. Should this happen, we shall make the latter decision and claim that our assumed value of $\mu$ is in error.

The t-value with $v = 14$ degrees of freedom that leaves an area of 0.025 to the left, and therefore an area of 0.975 to the right, is

$$t_{0.975} = -t_{0.025} = -2.145$$

Find $P(-t_{0.025} < T < t_{0.05})$.

Since $t_{0.05}$ leaves an area of 0.05 to the right and $-t_{0.025}$ leaves an area of 0.025 to the left,

$$\text{Area} = 1 - 0.05 - 0.025 = 0.925$$

$$P(-t_{0.025} < T < t_{0.05}) = 0.925$$

Find $k$ such that $P(k < T < -1.761) = 0.045$, for a random sample of size 15 selected from a normal distribution and $T = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$.

From the t-distribution table, the value 1.761 corresponds to $t_{0.05}$ for $v = 14$. So, $-t_{0.05} = -1.761$. Writing $k = -t_\alpha$,

$$P(-t_\alpha < T < -t_{0.05}) = 0.045 = 0.05 - \alpha \quad \Rightarrow \quad \alpha = 0.005$$

$$k = -t_{0.005} = -2.977$$

$$P(-2.977 < T < -1.761) = 0.045$$

A chemical engineer claims that the population mean yield of a certain batch process is 500 grams per milliliter of raw material. To check this claim he samples 25 batches each month.
If the computed t-value falls between $-t_{0.05}$ and $t_{0.05}$, he is satisfied with his claim. What conclusion should he draw from a sample that has a mean $\bar{x} = 518$ g/mL and a sample standard deviation $s = 40$ g? Assume the distribution of yields to be approximately normal.

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{518 - 500}{40/\sqrt{25}} = 2.25$$

With $v = 24$ degrees of freedom, the acceptance interval is

$$P(-t_{0.05} < T < t_{0.05}) = P(-1.711 < T < 1.711)$$

The computed value 2.25 lies above 1.711, so it falls outside this interval. From the t-distribution table, the value 2.25 corresponds to $\alpha$ between 0.02 and 0.015. This means that, if the population mean is 500 g/mL, the probability of obtaining a t-value of 2.25 or larger (that is, a sample mean of at least 518 g/mL) is only approximately 2%. It is more reasonable to assume that $\mu > 500$. Hence, the engineer is likely to conclude that the process produces a better product than he thought.

Chapter 8.8 F-Distribution

While the t-distribution is motivated by the comparison between two sample means, the F-distribution finds enormous application in comparing sample variances.

The statistic $F$ is defined to be the ratio of two independent chi-squared random variables, each divided by its number of degrees of freedom. Hence, we can write

$$F = \frac{U/v_1}{V/v_2}$$

where $U$ and $V$ are independent random variables having chi-squared distributions with $v_1$ and $v_2$ degrees of freedom, respectively.

Let $U$ and $V$ be two independent random variables having chi-squared distributions with $v_1$ and $v_2$ degrees of freedom, respectively. Then the distribution of the random variable $F = \dfrac{U/v_1}{V/v_2}$ is given by the density

$$h(f) = \begin{cases} \dfrac{\Gamma[(v_1+v_2)/2]\,(v_1/v_2)^{v_1/2}}{\Gamma(v_1/2)\,\Gamma(v_2/2)} \cdot \dfrac{f^{\,v_1/2-1}}{(1 + v_1 f/v_2)^{(v_1+v_2)/2}}, & 0 < f < \infty \\ 0, & \text{elsewhere} \end{cases}$$

This is known as the F-distribution with $v_1$ and $v_2$ degrees of freedom.

Table A.6 in the reference gives values of $f_\alpha$ for $\alpha = 0.05$ and $\alpha = 0.01$ for various combinations of the degrees of freedom $v_1$ and $v_2$.
Table A.6 can also be used to find values of $f_{0.95}$ and $f_{0.99}$. The theorem will be presented later.
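The F density above can be sanity-checked numerically. The sketch below, using only the Python standard library, codes $h(f)$ as written and integrates it with a simple midpoint rule. The function names are illustrative. The check value 3.33 is an assumption: it is the entry usually tabulated for $f_{0.05}(5, 10)$ and is not quoted in these slides.

```python
import math

def f_density(f, v1, v2):
    """Density h(f) of the F-distribution with v1 and v2 degrees of freedom."""
    if f <= 0:
        return 0.0  # the density is zero outside 0 < f < infinity
    const = (math.gamma((v1 + v2) / 2) * (v1 / v2) ** (v1 / 2)
             / (math.gamma(v1 / 2) * math.gamma(v2 / 2)))
    return const * f ** (v1 / 2 - 1) / (1 + v1 * f / v2) ** ((v1 + v2) / 2)

def area_below(x, v1, v2, steps=20000):
    """Approximate P(F < x) with the midpoint rule on the interval (0, x)."""
    width = x / steps
    return width * sum(f_density((i + 0.5) * width, v1, v2) for i in range(steps))

# Assumed check value (not taken from the slides): 3.33 is the entry
# usually tabulated for f_0.05 with v1 = 5 and v2 = 10.
upper_tail = 1 - area_below(3.33, 5, 10)
print(f"P(F > 3.33) for v1 = 5, v2 = 10: {upper_tail:.4f}")
```

If the density is coded correctly, the printed upper-tail area should be close to 0.05, which is what an $\alpha = 0.05$ table entry means.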

Table A.6 F-Distribution, $\alpha = 0.05$

Table A.6 F-Distribution, $\alpha = 0.01$

Figures: typical F-distributions; tabulated values of the F-distribution

Writing $f_\alpha(v_1, v_2)$ for $f_\alpha$ with $v_1$ and $v_2$ degrees of freedom, we obtain

$$f_{1-\alpha}(v_1, v_2) = \frac{1}{f_\alpha(v_2, v_1)}$$

F-Distribution with Two Sample Variances

If $S_1^2$ and $S_2^2$ are the variances of independent random samples of size $n_1$ and $n_2$ taken from normal populations with variances $\sigma_1^2$ and $\sigma_2^2$, respectively, then

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{\sigma_2^2 S_1^2}{\sigma_1^2 S_2^2}$$

has an F-distribution with $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$ degrees of freedom.

Probability and Statistics
Homework 10A

1. A maker of famous chocolate candies claims that their average calorie content is 5 cal/g with a standard deviation of 1.2 cal/g. In a random sample of 8 candies of this brand, the calorie content was found to be 6, 7, 7, 3, 4, 5, 4, and 2 cal/g. Would you agree with the claim? Use the chi-squared distribution and assume a normal distribution. (Wal8.852 ep.283)

2. A small cleaning service company obtains a contract proposal from a customer owning an office tower with 100 rooms. The company has only 5 workers. It needs to determine its profit margin by first finding out the time required by the workers to finish cleaning 100 rooms. The first estimation is that the workers would need 5.5 hours to clean the 100 rooms.
The company starts a probation period of two weeks, collecting data so that it can later charge the customer correctly. The data collected by the company are shown in the table below. After collecting these data, the company wants to determine whether the first estimation of 5.5 hours to finish cleaning 100 rooms was reasonable. If the computed t-value falls between $-t_{0.025}$ and $t_{0.025}$, the company will be satisfied and will stay with its first estimation. What is your opinion? (Int.Rndvz ep.283)

Time to clean 100 rooms (hours):
5.5, 7, 6.4, 4.5, 3.9, 7.1, 5.6, 5.8, 7.8, 4.6, 4.5, 5.5