Chapter 3 ESTIMATION: SINGLE POPULATION

Two Methods of Inference About a Parameter (page 437)

Estimation – inference about a parameter is made by finding a single value or a range of values, computed from the sample data, that may be used to make a statement about the unknown value of the parameter.

Hypothesis testing – inference about a parameter is made by assessing whether or not the sample data support an assertion made about the true value of the parameter.

Two Major Types of Estimates

Definition 13.1 (page 437) A point estimator is a single statistic whose realized value is used to estimate the true but unknown value of the population parameter. The realized value of an estimator is called the point estimate.

Definition 13.2 (page 439) An interval estimator of the unknown value of the population parameter is a rule that tells us how to calculate two numbers based on sample data that will form an interval within which we expect the population parameter to lie with a specified degree of confidence. The realized pair of numbers computed from this estimator is called the interval estimate or confidence interval estimate.

Example 13.1 (page 438)

Suppose we have a population with mean $\mu$. A random sample of n observations, $(X_1, X_2, \ldots, X_n)$, was taken from this population. Then the sample mean,

\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i , \]

is a point estimator of the population mean, $\mu$. The particular numerical value computed from a given sample, $\bar{x}$, is a point estimate.

The point estimator is a random variable while the point estimate is its realized value. The estimate can only be computed once the sample data have been collected.

For example, if $(X_1, X_2, X_3)$ is a random sample and the sample data are $X_1 = 2$, $X_2 = 3$, $X_3 = 4$, then a point estimate for $\mu$ is

\[ \bar{x} = \frac{2 + 3 + 4}{3} = 3. \]

Example: Confidence interval estimator for $\mu$ (page 456)

Suppose a random sample of size n is taken from a normal distribution with unknown mean, $\mu$, but known variance, $\sigma^2$. In this case, a $(1-\alpha)100\%$ confidence interval estimator for the population mean, $\mu$, is given by:

\[ \left( \bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right) \]

Note: This confidence interval estimator has a lower confidence limit and an upper confidence limit. Both are random variables. They will only have realized values once the sample data have been obtained.

For example, if $\alpha = 0.1$ and the lower and upper confidence limits computed from the collected data are 10 and 20, respectively, then a 90% confidence interval estimate for $\mu$ is (10, 20). We then say that we are 90% confident that the value of $\mu$ is in the interval (10, 20).

Common Point Estimators for Parameters under Random Sampling from an Infinite Population (Table 13.3, page 444)

Parameter | Estimator
Mean: $\mu$ | Sample mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
Variance: $\sigma^2$ | Sample variance: $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$
Standard deviation: $\sigma = \sqrt{\sigma^2}$ | Sample standard deviation: $S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2}$
Standard error of the sample mean: $SE(\bar{X}) = \frac{\sigma}{\sqrt{n}}$ | Estimator of the standard error of the sample mean: $se(\bar{X}) = \frac{S}{\sqrt{n}}$
Proportion: $p$ | Sample proportion: $\hat{P}$
Standard error of the sample proportion: $SE(\hat{P}) = \sqrt{\frac{p(1-p)}{n}}$ | Estimator of the standard error of the sample proportion: $se(\hat{P}) = \sqrt{\frac{\hat{P}(1-\hat{P})}{n}}$

Note (page 443)

A random sample of size n from an infinite population includes samples selected using simple random sampling with replacement (SRSWR) because, as explained in Chapter 11, SRSWR satisfies the definition of a random sample from an infinite population as stated in Definition 11.2.
We can also include samples selected using simple random sampling without replacement (SRSWOR) so long as the sample size n is small relative to the population size N, so that the finite population correction $\frac{N-n}{N-1}$ is approximately equal to 1 or, equivalently, the sampling fraction n/N is very close to 0.

Examples

Example 13.5 (page 445), Examples 13.6-13.8 (pages 447-449)

Suppose a random sample of men was selected and the data on their waistlines are as presented in Exercise 1 (page 235):

30 35 40 28 34 33 31 32 29 37
32 34 34 35 36 30 31 32 32 31

a) Estimate $\mu$, the mean waistline.

\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{30 + 35 + 40 + \cdots + 32 + 31}{20} = 32.8 \]

b) Estimate p = proportion of men with waistline greater than 32.

\[ \hat{P} = \frac{9}{20} = 0.45 \]

c) Estimate how varied the values of $\bar{X}$ are from one sample to another.

\[ se(\bar{X}) = \frac{S}{\sqrt{n}} = \frac{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 / (n-1)}}{\sqrt{n}} = \frac{2.894641}{\sqrt{20}} = 0.647 \]

d) Estimate how varied the values of $\hat{P}$ are from one sample to another.

\[ se(\hat{P}) = \sqrt{\frac{\hat{P}(1-\hat{P})}{n}} = \sqrt{\frac{(0.45)(0.55)}{20}} = 0.1112 \]

Assignment 7

Assume that the dataset presented in Table 13.4 (page 449) on yield of mango is our population data. Define $\mu$ = population mean yield of mango and p = population proportion of countries with yield below 100,000. Note that $\mu$ = 89,627.63 and p = 14/19.

1. Select a sample of size n=10 using SRSWR and the seed number assigned to your group. Identify the 10 elements in your sample. (Note: Since this is SRSWR, the 10 countries in your sample need not be distinct.)
2. Using the sample data, estimate the following: (Show your solution.)
   a. $\mu$
   b. how varied the values of the sample mean are from one sample to another
   c. p
   d. how varied the values of the sample proportion are from one sample to another

Some Desirable Properties of a Point Estimator

Unbiasedness (page 449)
Reliability (page 452)
Efficiency (page 453)

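The four estimates in the waistline example (Examples 13.5 to 13.8) can be checked with a few lines of code. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and are not taken from the textbook.

```python
import math
import statistics

# Waistline data from the example (n = 20).
waist = [30, 35, 40, 28, 34, 33, 31, 32, 29, 37,
         32, 34, 34, 35, 36, 30, 31, 32, 32, 31]
n = len(waist)

# a) Point estimate of the mean.
x_bar = statistics.mean(waist)

# b) Point estimate of the proportion with waistline greater than 32.
p_hat = sum(1 for x in waist if x > 32) / n

# c) Estimated standard error of the sample mean, S / sqrt(n).
#    statistics.stdev uses the n - 1 divisor, matching the definition of S.
s = statistics.stdev(waist)
se_mean = s / math.sqrt(n)

# d) Estimated standard error of the sample proportion.
se_prop = math.sqrt(p_hat * (1 - p_hat) / n)

print(f"mean = {x_bar:.1f}, p_hat = {p_hat:.2f}")
print(f"S = {s:.6f}, se(mean) = {se_mean:.3f}, se(p_hat) = {se_prop:.4f}")
```

The same pattern applies to Assignment 7 once the 10 sampled values are known: only the data list and the cut-off used for the proportion change.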
Definition of Unbiasedness (page 449)

Definition 13.3 An estimator is said to be unbiased for the parameter being estimated if the average of the estimates it produces under repeated sampling from the same population is equal to the true value of the parameter being estimated. In other words, the expected value of the estimator is equal to the parameter it is estimating.

Examples

Example 13.9 (pages 449-450)

Example 13.10 (pages 451-452). Population data = {21, 23, 30, 34}. Select a sample of size n=2 from this population using SRSWR. Show that $S^2$ is unbiased for $\sigma^2$ but S is biased for $\sigma$.

TABLE 13.6a. List of All Possible Samples of Size 2 and Computed Values of $S^2$ and S

Sample | Sample data $(x_1, x_2)$ | $S^2$ | $S$
1 | (21, 21) | 0 | 0
2 | (23, 23) | 0 | 0
3 | (30, 30) | 0 | 0
4 | (34, 34) | 0 | 0
5 | (21, 23) | 2 | $\sqrt{2}$
6 | (23, 21) | 2 | $\sqrt{2}$
7 | (30, 34) | 8 | $\sqrt{8}$
8 | (34, 30) | 8 | $\sqrt{8}$
9 | (23, 30) | 24.5 | $\sqrt{24.5}$
10 | (30, 23) | 24.5 | $\sqrt{24.5}$
11 | (21, 30) | 40.5 | $\sqrt{40.5}$
12 | (30, 21) | 40.5 | $\sqrt{40.5}$
13 | (23, 34) | 60.5 | $\sqrt{60.5}$
14 | (34, 23) | 60.5 | $\sqrt{60.5}$
15 | (21, 34) | 84.5 | $\sqrt{84.5}$
16 | (34, 21) | 84.5 | $\sqrt{84.5}$

\[ \text{mean of } S^2 = \frac{(0)(4) + (2)(2) + (8)(2) + (24.5)(2) + (40.5)(2) + (60.5)(2) + (84.5)(2)}{16} = 27.5 \]

\[ \text{mean of } S = \frac{(\sqrt{0})(4) + (\sqrt{2})(2) + (\sqrt{8})(2) + (\sqrt{24.5})(2) + (\sqrt{40.5})(2) + (\sqrt{60.5})(2) + (\sqrt{84.5})(2)}{16} = 4.066 \]

From the population data, we have

\[ \sigma^2 = \frac{\sum_{i=1}^{4} (X_i - 27)^2}{4} = 27.5 \quad \text{and} \quad \sigma = \sqrt{27.5} = 5.244. \]

The mean of $S^2$ equals $\sigma^2 = 27.5$, so $S^2$ is unbiased for $\sigma^2$. The mean of S is 4.066, which is not equal to $\sigma = 5.244$, so S is biased for $\sigma$.

Remarks on Unbiasedness (pages 451-453)

In general, the sample mean, $\bar{X}$, is an unbiased estimator of $\mu$ under random sampling from an infinite population, which includes simple random sampling with replacement. It is also unbiased under simple random sampling without replacement.

Likewise, the sample proportion is an unbiased estimator of the population proportion under random sampling from an infinite population, which includes simple random sampling with replacement. It is also unbiased under simple random sampling without replacement.

Under simple random sampling with replacement, the sample variance, $S^2$, is an unbiased estimator for the population variance, $\sigma^2$.
On the other hand, the sample standard deviation, S, is a biased estimator of the population standard deviation, $\sigma$, with the bias becoming smaller or negligible for large sample sizes.

An unbiased estimator is not necessarily the “best” estimator of a parameter. A parameter may have more than one unbiased estimator.

Measure of Reliability

In statistical parlance, a statistic whose value does not vary much from one sample to another is a reliable estimator. (page 72)

Our measure of reliability is the standard error of the statistic. (page 452)

Example: Three Different Estimators that Behave Differently in Terms of Bias and Reliability (Figure 13.1, page 452)

[Figure 13.1: three panels on a horizontal axis from 0 to 1000, with the true value of the parameter at 500. Estimator A: unbiased and reliable. Estimator B: unbiased but not reliable. Estimator C: reliable but biased.]

Most Efficient Estimator (page 453)

Definition 13.4 An unbiased estimator of a parameter with the smallest variance among all the other unbiased estimators is called the most efficient estimator.

When sampling from the normal distribution, both the sample mean and the sample median are unbiased estimators of $\mu$. However, the sample mean has a smaller standard error than the sample median. In fact, the sample mean is the most efficient estimator when sampling from the normal distribution.

Confidence Interval Estimation (page 455)

Recall: An interval estimator of the unknown value of the population parameter is a rule that tells us how to calculate two numbers based on sample data that will form an interval within which we expect the population parameter to lie with a specified degree of confidence.

The $(1-\alpha)100\%$ Confidence Interval Estimate

Definition 13.5.
The fraction $(1-\alpha)$ in a $(1-\alpha)100\%$ confidence interval estimate is called the confidence coefficient, and the endpoints are called the lower and upper confidence limits. The length of the interval is defined as the difference between the upper and lower confidence limits.

Note: The researcher chooses the value of $\alpha$. Naturally, we choose a value for $\alpha$ that is close to 0 so that we can be more confident about our inference. Common choices for $\alpha$ are 0.10, 0.05, and 0.01. A smaller $\alpha$ means a higher confidence coefficient.

Example: $\alpha = 0.1$, so that the confidence coefficient is $(1-\alpha) = 1 - 0.1 = 0.9$.

If the 90% confidence interval estimate for $\theta$ is (2.2, 5.5), then the lower confidence limit is 2.2, the upper confidence limit is 5.5, and the length of the interval is 5.5 – 2.2 = 3.3.

We say that we are 90% confident that $\theta$ lies between 2.2 and 5.5. This statement does not mean $P(2.2 < \theta < 5.5) = 0.9$. The probability that $\theta$ is between 2.2 and 5.5 is either 0 or 1, depending on the true value of $\theta$. We cannot determine the probability of this event unless we know the value of $\theta$.

Interpretation of $(1-\alpha)100\%$ Confidence Interval for $\theta$ (page 455)

Let $\theta$ be the parameter of interest and let $(T_1, T_2)$ be a $(1-\alpha)100\%$ confidence interval estimator for $\theta$. The confidence coefficient $(1-\alpha)$ satisfies the condition that $P(T_1 < \theta < T_2) = 1-\alpha$.
This probability statement can be interpreted as follows:

“$(1-\alpha)$ is the probability of selecting a random sample whose computed interval estimate using the estimator $(T_1, T_2)$ contains the value of the population parameter, $\theta$.”

If the sampling scheme used assigns the same chance of selection to all possible samples of size n, then this probability statement can also be interpreted using the classical or a priori definition as follows:

“If we consider all possible samples of the same size and a $(1-\alpha)100\%$ confidence interval estimate is computed from each sample using the interval estimator $(T_1, T_2)$, then $(1-\alpha)100\%$ of these intervals would include the true population parameter, $\theta$, somewhere within their interval, while $100\alpha\%$ of them would not.”

This probability statement can also be interpreted using the relative frequency or a posteriori definition of probability as follows:

“If we repeatedly take samples of the same size from the same population under the same conditions, and a $(1-\alpha)100\%$ confidence interval estimate is computed using the estimator $(T_1, T_2)$ each time we take a sample, then $(1-\alpha)100\%$ of all the generated intervals would include the true population parameter, $\theta$, somewhere within their interval, while the remaining $100\alpha\%$ of them would not.”

Example: Repeated Sampling Using Excel

Generate 500 samples of size n=15 from a normal distribution with mean $\mu = 20$ and standard deviation $\sigma = 1$ using the Random Number Generation tool of the Data Analysis ToolPak of Excel:

Number of Variables = 15
Number of Random Numbers = 500
Distribution: Normal
Mean = 20, Standard deviation = 1

Compute the 95% confidence interval estimate for $\mu$ for each sample using the formula

\[ \left( \bar{X} - 1.96\frac{1}{\sqrt{15}},\ \ \bar{X} + 1.96\frac{1}{\sqrt{15}} \right) \]

Determine if each interval estimate contains the true value $\mu = 20$ using the IF function, =IF(OR(lower limit>20, upper limit<20), 0, 1).

Count how many among the 500 intervals contain the true value $\mu = 20$ using the SUM function.

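The repeated-sampling exercise above can also be run outside Excel. The sketch below is one possible way to do it with the Python standard library; the function name `coverage` and the seed value are illustrative assumptions, and the random numbers will not match those produced by the Excel ToolPak.

```python
import math
import random
import statistics

def coverage(mu, sigma, n, z, n_samples, seed):
    """Fraction of known-sigma z-intervals that contain the true mean mu."""
    rng = random.Random(seed)               # seeded so the run is repeatable
    half_width = z * sigma / math.sqrt(n)   # same for every sample (sigma known)
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        # Same check as the Excel IF/OR formula: 1 if the interval contains mu.
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / n_samples

rate = coverage(mu=20, sigma=1, n=15, z=1.96, n_samples=500, seed=1)
print(f"{rate:.1%} of the 500 intervals contain the true mean")
```

Under the relative frequency interpretation, the printed percentage should be close to 95% but will rarely be exactly 95%, since only 500 of the possible samples are drawn.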
Assignment 8

1. Generate 500 samples of size n=15 from a normal distribution with mean $\mu = 10$ and standard deviation $\sigma = 2$ using the seed number assigned to your group.
2. Compute the 95% confidence interval estimate for $\mu$ for each sample using the formula:

\[ \left( \bar{X} - 1.96\frac{2}{\sqrt{15}},\ \ \bar{X} + 1.96\frac{2}{\sqrt{15}} \right) \]

   For your assignment, present only the first 5 confidence interval estimates.
3. Among the 500 computed interval estimates, what percentage contain the true value $\mu = 10$?

$(1-\alpha)100\%$ Confidence Interval Estimators for $\mu$ of a Normal Distribution (Table 13.7, page 458)

Assume that we have a random sample $(X_1, X_2, \ldots, X_n)$ of size n taken from a normal population with mean, $\mu$, and variance, $\sigma^2$. This means that all the $X_i$s are independent random variables and are all normally distributed with the same mean and variance.

Cases | Confidence Interval Estimators
Case 1: $\sigma^2$ is known | $\left( \bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right)$
Case 2: $\sigma^2$ is unknown (even if the sample size n ≤ 30) | $\left( \bar{X} - t_{\alpha/2}(v=n-1)\frac{S}{\sqrt{n}},\ \bar{X} + t_{\alpha/2}(v=n-1)\frac{S}{\sqrt{n}} \right)$
Case 3: $\sigma^2$ is unknown (and the sample size n > 30) | $\left( \bar{X} - z_{\alpha/2}\frac{S}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{S}{\sqrt{n}} \right)$

The first two formulas were derived using the sampling distributions discussed in Chapter 11. (See the derivation of the first formula on pages 479-480.)

Formula 3 is based on the fact that as the degrees of freedom approach infinity, the t-distribution approaches the standard normal distribution. This formula is useful only when the values of t are not available for large n. In fact, most statistical software will use formula 2 whenever $\sigma^2$ is unknown, since such software can determine the value of t for any degrees of freedom.

Remarks on the Assumption of Normality (page 483)

We actually do not require that the $X_i$s in the random sample come from an exactly normal distribution. These formulas will still provide good approximate $(1-\alpha)100\%$ confidence intervals for $\mu$ even if there are slight deviations from normality.
Studies show that light-tailedness or heavy-tailedness of the parent population will have little effect on the sampling distribution of the Z and T statistics. We only have to be careful in using these formulas when we suspect that the parent population is badly skewed and the sample size is small. In this case, the actual confidence coefficient may be lower than what we have set and, as a result, we have a false sense of confidence in our inference.

All 3 formulas will still provide us with good approximate $(1-\alpha)100\%$ confidence intervals for $\mu$ even when the parent population is not normal (including those that are badly skewed and those that are discrete), provided that the sample size is large, that is, n > 30. This result is attributed to the Central Limit Theorem.

Examples

$\alpha$ | 0.10 | 0.05 | 0.025 | 0.01 | 0.005 | 0.001 | 0.0005 | 0.00005
$z_\alpha$ | 1.282 | 1.645 | 1.960 | 2.326 | 2.576 | 3.090 | 3.291 | 3.891

Example 13.11 to 13.15 (pages 459 – 462)

Exercise 2a (page 464). Laboratory tests of bacterial counts are often used for declaring a water source “polluted”. Suppose that the distribution of bacterial counts in a sample taken from a certain lake is normally distributed with a variance of 9,000,000. Suppose 25 water samples were taken over the course of July 2004 and yielded a mean count of 12,000. Construct an 80% confidence interval estimate of the unknown mean bacterial count in this lake at this time.

Parameter of interest: $\mu$ = mean bacterial count

Problem: Find an 80% confidence interval estimate for $\mu$.

Given: $(X_1, X_2, \ldots, X_{25})$ is a random sample from a normal distribution with known $\sigma^2 = 9{,}000{,}000$. Its sample mean, $\bar{X}$, is equal to 12,000.

We’ll use the first formula,

\[ \left( \bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right) \]

because $\sigma^2$ is known. Since the desired confidence coefficient is 80%, $1-\alpha = 0.8$ so that $\alpha = 0.2$ and $\alpha/2 = 0.1$. Using Table B.1 (page 604), $z_{0.1} = 1.282$.

Lower confidence limit: $\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
Upper confidence limit: $\bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
Substituting the given values:

\[ \text{Lower confidence limit} = 12{,}000 - 1.282\,\frac{\sqrt{9{,}000{,}000}}{\sqrt{25}} = 11{,}230.8 \]

\[ \text{Upper confidence limit} = 12{,}000 + 1.282\,\frac{\sqrt{9{,}000{,}000}}{\sqrt{25}} = 12{,}769.2 \]

Example (cont’d)

Our 80% confidence interval estimate is (11230.8, 12769.2). The length of this interval is 12769.2 - 11230.8 = 1538.4.

Using the same data, let us compute a 90% confidence interval estimate for $\mu$. This time $1-\alpha = 0.9$ so that $\alpha = 0.1$ and $\alpha/2 = 0.05$. Using Table B.1 (page 604), $z_{0.05} = 1.645$.

\[ \text{Lower confidence limit} = \bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 12{,}000 - 1.645\,\frac{\sqrt{9{,}000{,}000}}{\sqrt{25}} = 11{,}013 \]

\[ \text{Upper confidence limit} = \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 12{,}000 + 1.645\,\frac{\sqrt{9{,}000{,}000}}{\sqrt{25}} = 12{,}987 \]

Our 90% confidence interval estimate is (11013, 12987). The length of this interval is 12987 - 11013 = 1974. We are more confident of our inference, but at the expense of a longer interval.

General Remarks on the Length of the Interval Estimate (page 463)

We assess the “goodness” of our interval estimate by checking the confidence coefficient together with the length of the interval.

A good confidence interval estimate is one that is as narrow as possible and has a large confidence coefficient. The narrower the interval we have created, the more exactly we have located the parameter. The larger the confidence coefficient is, possibly near 1, the more confidence we have that a particular interval encloses the true value of the parameter.

For a fixed sample size n, as the confidence coefficient increases, the length of the interval also increases. Thus, for a fixed sample size, the trade-off of having high confidence in our interval estimate is a longer or wider interval. Similarly, for a fixed sample size, the trade-off of having a narrow interval is having lower confidence in our interval estimate.

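The 80% and 90% intervals for the mean bacterial count can be reproduced as follows. This is a minimal sketch that assumes Python's `statistics.NormalDist` is an acceptable substitute for Table B.1; `z_interval` is an illustrative helper name. Because the code uses unrounded values of $z_{\alpha/2}$, its limits differ slightly from the table-based answers above.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, conf):
    """Confidence interval for the mean when sigma is known (Case 1)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 point of N(0, 1)
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

sigma = math.sqrt(9_000_000)   # known population standard deviation
lo80, hi80 = z_interval(12_000, sigma, 25, 0.80)
lo90, hi90 = z_interval(12_000, sigma, 25, 0.90)

print(f"80% CI: ({lo80:.1f}, {hi80:.1f}), length {hi80 - lo80:.1f}")
print(f"90% CI: ({lo90:.1f}, {hi90:.1f}), length {hi90 - lo90:.1f}")
```

The second interval is longer than the first, which is the trade-off between confidence and interval length described in the general remarks.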
Length of the Interval Estimate for $\mu$ (page 463)

\[ \text{Length of interval estimate} = \begin{cases} 2 z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}, & \text{when } \sigma \text{ is known} \\[2ex] 2 t_{\alpha/2}(v=n-1) \dfrac{s}{\sqrt{n}}, & \text{when } \sigma \text{ is unknown} \end{cases} \]

For a fixed n, as the confidence coefficient $(1-\alpha)$ increases, the length of the interval increases. The reason for the increase in the length of the interval estimate for $\mu$ when the confidence coefficient, $1-\alpha$, increases is the corresponding increase in the values of $z_{\alpha/2}$ and $t_{\alpha/2}(v=n-1)$ as $\alpha$ decreases.

For a fixed confidence coefficient $(1-\alpha)$, as n increases, the length of the interval decreases. In fact, as n goes to infinity, the length of the interval goes to 0. This is because the values of both the standard error and its estimator, $\sigma/\sqrt{n}$ and $S/\sqrt{n}$, respectively, approach 0 as n goes to infinity.

If we are not satisfied with the resultant length of the interval at the desired confidence coefficient, then we can improve on our estimates in the future by using a larger sample size. Increasing the sample size will reduce the standard error and, consequently, reduce the length of the interval.

When the elements are homogeneous with respect to the characteristic under study, that is, $\sigma$ is small, then we do not need a very large sample size in order to attain a short interval.

Exercise 1 (page 463)

TRUE or FALSE.

a.) For a given sample variance and sample mean, a 90% confidence interval for an unknown mean is narrower than a 99% confidence interval.

b.) Consider the construction of a 95% confidence interval. Suppose one repeats the same sampling process indefinitely. Suppose further that, for each sample drawn, a new 95% confidence interval calculation is performed. If for each sample, the investigator claims that the parameter is contained in the interval, about 95% of his statements will be correct.

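The remarks on interval length for the known-$\sigma$ case can be illustrated numerically. The sketch below tabulates $2 z_{\alpha/2}\sigma/\sqrt{n}$ for a few sample sizes and confidence coefficients; the value $\sigma = 3000$ is borrowed from the bacterial count exercise, and the chosen sample sizes are arbitrary illustrations.

```python
import math
from statistics import NormalDist

def interval_length(sigma, n, conf):
    """Length 2 * z_{alpha/2} * sigma / sqrt(n) of the known-sigma interval."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return 2 * z * sigma / math.sqrt(n)

confs = (0.80, 0.90, 0.99)
print("n     " + "".join(f"{c:>10.0%}" for c in confs))
for n in (25, 100, 400):
    row = "".join(f"{interval_length(3000, n, c):>10.1f}" for c in confs)
    print(f"{n:<6}{row}")
```

Reading across a row shows the length growing with the confidence coefficient; reading down a column shows it shrinking as n grows, with each fourfold increase in n halving the length.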
More examples

Let us use the following data in Exercise 2 (page 204) on the price of brown sugar per kilo, based on a sample of 8 grocery stores:

20.50 21.25 19.95 22.50 20.00 22.75 23.50 21.75

Assuming that the price of brown sugar is normally distributed and that we have a random sample, compute a 95% confidence interval estimate for $\mu$ = mean price of brown sugar.

Because $\sigma^2$ is unknown, we will use formula 2,

\[ \left( \bar{X} - t_{\alpha/2}(v=n-1)\frac{S}{\sqrt{n}},\ \ \bar{X} + t_{\alpha/2}(v=n-1)\frac{S}{\sqrt{n}} \right) \]

Using the sample data, we get $\bar{X} = 21.525$ and S = 1.32745729.

Since the confidence coefficient is 95%, $1-\alpha = 0.95$ so that $\alpha = 0.05$ and $\alpha/2 = 0.025$. The degrees of freedom is v = n-1 = 7. Referring to Table B.2 (page 605), $t_{0.025}(v=7) = 2.365$.

\[ \text{Lower confidence limit} = \bar{X} - t_{\alpha/2}(v=n-1)\frac{S}{\sqrt{n}} = 21.525 - 2.365\,\frac{1.32745729}{\sqrt{8}} = 20.415 \]

\[ \text{Upper confidence limit} = \bar{X} + t_{\alpha/2}(v=n-1)\frac{S}{\sqrt{n}} = 21.525 + 2.365\,\frac{1.32745729}{\sqrt{8}} = 22.635 \]

We are 95% confident that $\mu$ lies in the interval (20.415, 22.635).

Assignment 9

Compute the requested confidence interval estimates. Always present the formula used to compute the interval estimates with the appropriate values plugged in. Do not round intermediate results. Round off the final answer only, to 2 decimal places.

1. A coin-operated soda machine was designed to discharge, on the average, 7 ounces of beverage per cup. To test the machine, a random sample of 15 cupfuls of soda was drawn from the machine and measured. The results were as follows:

6.95 7.00 6.99 6.92 6.88 6.98 7.02 7.00 6.99 7.10 7.01 6.96 7.04 7.00 7.05

Assuming that the amount of beverage dispensed is normally distributed, compute a 99% confidence interval estimate for $\mu$ = mean amount of soda dispensed (in ounces).

2. A random sample of 500 elementary school children was selected. Each student in the sample was given a reading comprehension test. The sample mean and sample variance were computed to be 64 and 52, respectively.
Compute a 90% confidence interval estimate for $\mu$ = mean score in the reading comprehension test of all elementary school children in the population.

Confidence Interval for the Proportion (page 465)

If the population proportion is not expected to be too close to 0 or 1 and the sample size n is large, then an approximate $(1-\alpha)100\%$ confidence interval estimator for the population proportion, p, is given by:

\[ \left( \hat{P} - z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}},\ \ \hat{P} + z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}} \right) \]

where $z_{\alpha/2}$ is the $100\left(1-\frac{\alpha}{2}\right)$th percentile of the standard normal distribution.

Notes:

(i) The sample size n must be large because this formula is based on the Central Limit Theorem, where the population proportion is viewed as the mean of the $X_i$s, where

\[ X_i = \begin{cases} 1 & \text{if the } i\text{th element possesses the characteristic of interest} \\ 0 & \text{if the } i\text{th element does not possess the characteristic of interest} \end{cases} \]

Clearly, $X_i$ is discrete and cannot come from a normal distribution.

(ii) As the sample size n goes to infinity, the length of the interval goes to 0.

Examples

Example 13.18 (page 468)

Exercise 1 (page 468) According to a 1984 American study, about one in three individuals feels shopping is an unpleasant experience (Journal of Marketing Research, February/March 1984). Suppose we take a national sample of 4,100 Filipino male and female adults, and we determine each respondent’s opinion on the pleasantness of shopping. The survey produced the following results:

 | Sample Size | Number who think shopping is an unpleasant experience
Males | 2,015 | 850
Females | 2,085 | 570

a) Compute a 95% confidence interval for the proportion of males in the sample who think shopping is an unpleasant experience.

\[ \left( \hat{P} - z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}},\ \ \hat{P} + z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}} \right) \]

\[ \sqrt{\frac{\hat{P}(1-\hat{P})}{n}} = \sqrt{\frac{\left(\frac{850}{2015}\right)\left(\frac{1165}{2015}\right)}{2015}} = .011 = \text{estimate for the standard error} \]

A confidence coefficient of 95% means that $(1-\alpha) = 0.95$ so that $\alpha = 0.05$, $\alpha/2 = 0.025$, and $z_{.025} = 1.96$.
\[ \hat{P} = \frac{850}{2015} = \text{point estimate for } p \]

\[ \text{Lower confidence limit} = \hat{P} - z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}} = \frac{850}{2015} - (1.96)(.011) = .400 \]

\[ \text{Upper confidence limit} = \hat{P} + z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}} = \frac{850}{2015} + (1.96)(.011) = .443 \]

Margin of Error (page 458)

Definition 13.6 The margin of error, denoted by e, is the upper bound on the absolute difference between the estimator and the parameter, called the error of estimation, though there is an associated risk $\alpha$ of selecting a sample that yields an estimate whose error of estimation is greater than this upper bound, e.

By Definition 13.6, when we use $\bar{X}$ to estimate $\mu$, the margin of error, e, satisfies the condition:

\[ P(|\bar{X} - \mu| > e) = \alpha \quad \text{or, equivalently,} \quad P(|\bar{X} - \mu| \le e) = 1 - \alpha . \]

Likewise, when we use $\hat{P}$ to estimate p, the margin of error, e, satisfies the condition:

\[ P(|\hat{P} - p| \le e) = 1 - \alpha . \]

\[ P\left( \bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right) = 1 - \alpha \quad \text{means that} \quad e = z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]

\[ P\left( \hat{P} - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \le p \le \hat{P} + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \right) = 1 - \alpha \quad \text{means that} \quad e = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \]

Remarks About the Margin of Error

We can use the margin of error to describe the “goodness” of the point estimate. The smaller the margin of error, the better.

The margin of error usually involves unknown parameters. In this case, we use sample data to estimate these unknown parameters.

In estimating the mean or proportion, the margin of error is the term that we add to and subtract from the point estimate to compute the $(1-\alpha)100\%$ confidence interval estimate. It is a function of 2 factors: a) the standard error and b) the risk $\alpha$.

Example: The researchers’ estimate for the percentage of voters who will elect Person A is 60% with a margin of error of 3 percentage points. (In most studies, when the risk $\alpha$ is not reported, it is understood that $\alpha = 0.05$.) This means that the chance is as small as 0.05 that they have selected a sample where the estimated percentage is more than 3 percentage points away from the true value.
That is, $P(|\hat{P} - p| > 0.03) = 0.05$ or $P(|\hat{P} - p| \le 0.03) = P(\hat{P} - 0.03 \le p \le \hat{P} + 0.03) = 0.95$.

Again, this is NOT the same as saying P(0.57 ≤ p ≤ 0.63) = 0.95.

Confidence Interval Estimate for $\sigma^2$ (pages 470 and 472)

A $(1-\alpha)100\%$ confidence interval estimator for the population variance $\sigma^2$ is given by:

\[ \left( \frac{(n-1)S^2}{\chi^2_{\alpha/2}(v=n-1)},\ \ \frac{(n-1)S^2}{\chi^2_{1-\alpha/2}(v=n-1)} \right) \]

where $\chi^2_{\alpha/2}(v=n-1)$ and $\chi^2_{1-\alpha/2}(v=n-1)$ are the $100\left(1-\frac{\alpha}{2}\right)$th and $100\left(\frac{\alpha}{2}\right)$th percentiles of the chi-square distribution with v = n-1 degrees of freedom, respectively.

An interval estimator for the population standard deviation $\sigma$ is obtained by taking the square roots of these two limits.

Notes:

(i) The chi-square based inference about $\sigma^2$ depends on the assumption that the random sample is selected from a population that has a normal distribution. If the distribution of the population is distinctly nonnormal, then the formula for the confidence interval estimate of $\sigma^2$ (and of $\sigma$) is not appropriate even if the sample size is large. Nonnormality, in the form of skewness or heavy tails, can have serious effects on the level of confidence in estimating $\sigma^2$.

(ii) As the sample size n approaches infinity, the length of the interval goes to 0.

Examples

Examples 13.19 to 13.21 (pages 470-472)

Exercise 2 (page 472). A mortgage is a type of loan that is secured by a designated piece of property. If the borrower defaults on the loan, the lender can sell the property to recover the outstanding debt. A federal bank examiner is interested in estimating the mean and standard deviation of the outstanding principal balance of all home mortgages foreclosed by the bank due to default by the borrower during the last 3 years.
A random sample of 12 foreclosed mortgages yielded the following data (in dollars):

95,982 59,200 81,422 62,331 39,888 105,812
46,836 55,545 66,899 56,635 69,110 72,123

Find a 90% confidence interval for the standard deviation of interest. Are there any distributional assumptions that we have to make to compute the confidence interval estimate for the standard deviation?

$\chi^2_{.05}(v=11) = 19.675$, $\chi^2_{0.95}(v=11) = 4.575$

\[ \left( \sqrt{\frac{(n-1)S^2}{\chi^2_{\alpha/2}(v=n-1)}},\ \ \sqrt{\frac{(n-1)S^2}{\chi^2_{1-\alpha/2}(v=n-1)}} \right) = (14{,}334.75,\ 29{,}727.06) \]

Note: Computing simultaneous confidence regions for $\mu$ and $\sigma^2$ using the same sample data is not as simple as computing the separate confidence interval estimates using the formulas we have presented, because the resulting confidence coefficient is neither $(1-\alpha)$ nor $(1-\alpha)^2$, since the two events are not independent.

Assignment 10

1. (MGB, page 400) Suppose that 200 heads and 300 tails resulted from 500 tosses of a coin. Compute an approximate 99% confidence interval estimate for the probability of a head. Based on your computed estimate, does it appear that the coin is not fair? (Notes: (i) the probability of observing a head is the same as the true proportion of tosses where a head comes up; and (ii) for a fair coin, the probability of observing a head is 1/2.)

2. Carlton Sign Company wanted to know the variance of the life of the light fixtures it uses in its signs. It selected a random sample of 25 signs and learned that the fixtures in the sample lasted an average of 9,500 hours with a standard deviation of 81 hours. Assuming that the life of a light fixture is normally distributed, compute a 90% confidence interval estimate for the population variance.

Using PhStat

Choose Add-in then select PhStat.

For point and interval estimation: Choose Confidence Intervals then click the appropriate action.
To estimate a proportion:

Encode the data as follows:
1 – element possesses the characteristic of interest
0 – element does not possess the characteristic of interest

Compute the Number of successes using =SUM(cells containing dataset)

Examples:
1. Point estimate for mean and proportion
2. CI estimate for the mean with known variance
3. CI estimate for the mean with unknown variance
4. CI estimate for proportion
5. CI estimate for the variance
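For readers without PhStat, the proportion and standard deviation intervals worked out in the earlier examples can be reproduced with a short script. This is only a sketch under stated assumptions: it uses the Python standard library, the helper names `proportion_interval` and `sd_interval` are illustrative, and because the standard library has no chi-square quantile function, the tabled values 19.675 and 4.575 quoted in the mortgage exercise are passed in directly.

```python
import math
from statistics import NormalDist, variance

def proportion_interval(successes, n, conf):
    """Approximate large-sample confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def sd_interval(data, chi2_upper, chi2_lower):
    """Interval for sigma from tabled chi-square points with v = n - 1.

    chi2_upper is the chi-square value with area alpha/2 to its right,
    chi2_lower the value with area 1 - alpha/2 to its right.
    """
    n = len(data)
    sum_sq = (n - 1) * variance(data)   # (n - 1) * S^2
    return math.sqrt(sum_sq / chi2_upper), math.sqrt(sum_sq / chi2_lower)

# Shopping survey: 850 of 2,015 males, 95% confidence.
p_lo, p_hi = proportion_interval(850, 2015, 0.95)

# Foreclosed mortgages: 90% confidence, table values for v = 11.
balances = [95982, 59200, 81422, 62331, 39888, 105812,
            46836, 55545, 66899, 56635, 69110, 72123]
s_lo, s_hi = sd_interval(balances, chi2_upper=19.675, chi2_lower=4.575)

print(f"95% CI for p:     ({p_lo:.3f}, {p_hi:.3f})")
print(f"90% CI for sigma: ({s_lo:.2f}, {s_hi:.2f})")
```

Passing the chi-square points as arguments keeps the sketch dependency-free; a statistics package with a chi-square quantile function could supply them instead.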