CONCEPT OF A STATISTIC

Any measurable characteristic of a population is called a population parameter. (Greek letters are used for population parameters.) A statistic is a numerical property of a sample. Usually the population is too large for its parameters to be calculated directly. To estimate a population parameter, we take a random sample from the population and use the observations in it to estimate the required parameter.

EXAMPLE

A manufacturer makes three sizes of toaster. 40% of the toasters sell for K16, 50% sell for K20 and 10% sell for K30.
a. Find the mean and variance of the value of the toasters.
A sample of 2 toasters is sent to a shop.
b. List all the possible prices of the samples that could be sent.
c. Find the sampling distribution for the mean price $\bar{X}$ of these samples.

EXAMPLE

A supermarket sells a large number of 3-litre and 2-litre cartons of milk. They are sold in the ratio 3:2.
a. Find the mean and variance of the milk content in this population of cartons.
A random sample of 3 cartons is taken from the shelves ($X_1$, $X_2$ and $X_3$).
b. List all the possible samples.
c. Find the sampling distribution of the mean $\bar{X}$.
d. Find the sampling distribution of the mode $M$.
e. Find the sampling distribution of the median $N$ of these samples.

EXAMPLE

A large bag contains pawns. Sixty per cent of the pawns have the number 0 on them and forty per cent have the number 1.
a. Find the mean and variance for this population of pawns.
A simple random sample of size 3 is taken from this population.
b. List all possible samples.
c. Find the sampling distribution for the mean
$$\bar{X} = \frac{X_1 + X_2 + X_3}{3}$$
where $X_1$, $X_2$ and $X_3$ are the random variables representing the first, second and third pawns in the sample.
d. Hence find $E(\bar{X})$ and $\operatorname{Var}(\bar{X})$.
e. Find the sampling distribution for the mode $M$.
f. Hence find $E(M)$ and $\operatorname{Var}(M)$.

CENTRAL LIMIT THEOREM

The Central Limit Theorem says that if $X_1, X_2, X_3, \ldots, X_n$ is a random sample of size $n$ from a population with mean $\mu$ and variance $\sigma^2$, then for large $n$ the sample mean $\bar{X}$ is approximately normally distributed:
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

CONFIDENCE INTERVAL (C.I.)

The value of $\hat{\theta}$, which is an estimator of $\theta$, is found from a sample. It is used as an unbiased estimate for the population parameter $\theta$ and is very unlikely to be exactly equal to $\theta$. There is no way of establishing, from the sample data only, how close the estimate is. Instead, you can form a confidence interval for $\theta$.

A confidence interval (C.I.) for a population parameter $\theta$ is a range of values defined so that there is a specific probability that the true value of the parameter lies within that range. You could establish a 90% confidence interval, or a 95% confidence interval. A 95% confidence interval is an interval such that there is a 0.95 probability that the interval contains $\theta$. Different samples will generate different confidence intervals, since the estimate of the parameter changes with the data in the sample and with the sample size.

Since
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
if you know the population standard deviation $\sigma$, you can establish a confidence interval for the population mean $\mu$ using the standardized normal distribution.

EXAMPLE

Show that a 95% confidence interval for $\mu$, based on a sample of size $n$, is given by
$$\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$

We are sometimes interested in the width of a confidence interval. The width of a confidence interval is the difference between the upper confidence limit and the lower confidence limit. This is
$$2 \times z \times \frac{\sigma}{\sqrt{n}}$$
where $z$ is the value from the standard normal tables for the chosen confidence level (for example, $z = 1.96$ for 95%).

EXAMPLE

QUESTION ONE
The breaking strains of string produced at a certain factory are normally distributed with standard deviation 1.5 kg. A sample of 100 lengths of string from a certain batch was tested and the mean breaking strain was 5.30 kg.
a. Find a 95% confidence interval for the mean breaking strain of string in this batch.
The manufacturer becomes concerned if the lower 95% confidence limit falls below 5 kg.
A sample of 80 lengths of string from another batch gave a mean breaking strain of 5.31 kg.
b. Will the manufacturer be concerned?

QUESTION TWO
A random sample of size 25 is taken from a normal population with standard deviation 2.5. The mean of the sample was 17.8.
a. Find a 99% C.I. for the population mean $\mu$.
b. What size sample is required to obtain a 99% C.I. of width at most 1.5?
c. What confidence level would be associated with the interval based on the above sample of 25 but of width 1.5, i.e. (17.05, 18.55)?

QUESTION TWO
A random sample of size 9 is taken from a normal distribution with variance 36. The sample mean is 128.
a. Find a 95% confidence interval for the mean of the distribution.
b. Find a 99% confidence interval for the mean of the distribution.

QUESTION THREE
A random sample of size 25 is taken from a normal distribution with standard deviation 4. The sample mean is 85.
a. Find a 90% confidence interval for the mean of the distribution.
b. Find a 95% confidence interval for the mean of the distribution.

QUESTION FOUR
A normal distribution has standard deviation 15. Estimate the sample size required if the following confidence intervals for the mean should have a width of less than 2.
a. 90%
b. 95%
c. 99%

QUESTION FIVE
An experienced poultry farmer knows that the mean weight $\mu$ kg for a large population of chickens will vary from season to season but the standard deviation of the weights should remain at 0.70 kg. A random sample of 100 chickens is taken from the population and the weight $x$ kg of each chicken in the sample is recorded, giving $\sum x = 190.2$. Find a 95% confidence interval for $\mu$.

QUESTION SIX
It is known that each year the standard deviation of the marks in a certain examination is 13.5 but the mean mark $\mu$ will fluctuate. An examiner wishes to estimate the mean mark of all the candidates on the examination but he only has the marks of a sample of 250 candidates, which give a sample mean of 68.4.
a. What assumption about these candidates must the examiner make in order to use this sample mean to calculate a confidence interval for $\mu$?
b. Assuming that the above assumption is justified, calculate a 95% confidence interval for $\mu$.
Later the examiner discovers that the actual value of $\mu$ was 65.3.
c. What conclusions might the examiner draw about his sample?

QUESTION SEVEN
The managing director of a certain firm has commissioned a survey to estimate the mean expenditure of customers on electrical appliances. A random sample of 100 people was questioned and the research team presented the managing director with a 95% confidence interval of (K128.14, K141.86). The director says that this interval is too wide and wants a confidence interval of total width K10.
a. Using the same value of $\bar{x}$, find the confidence limits in this case.
b. Find the level of confidence for the interval in part a.
The managing director is still not happy and now wishes to know how large a sample would be required to obtain a 95% confidence interval of total width no more than K10.
c. Find the smallest size of sample that will satisfy this request.

QUESTION EIGHT
a. The error made when a certain instrument is used to measure the body length of a butterfly of a particular species is known to be normally distributed with mean 0 and standard deviation 1 mm. Calculate, to 3 decimal places, the probability that the error made when the instrument is used once is numerically less than 0.4 mm.
b. Given that the body length of a butterfly is measured 9 times with the instrument, calculate, to 3 decimal places, the probability that the mean of the 9 readings will be within 0.5 mm of the true length.
c. Given that the mean of the 9 readings was 22.53 mm, determine a 98% confidence interval for the true body length of the butterfly.
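
The confidence interval calculations in the questions above all follow one pattern, which can be sketched in Python as a check on hand working. This is only a minimal sketch, assuming the population standard deviation is known and the sample mean is normally distributed. The function names `z_value`, `z_interval` and `min_sample_size` are illustrative and do not come from the text, and the standard library's `statistics.NormalDist` stands in for the printed tables.

```python
import math
from statistics import NormalDist


def z_value(level):
    """Two-sided critical value z for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + level / 2)


def z_interval(mean, sigma, n, level=0.95):
    """Confidence interval for the population mean when sigma is known."""
    half_width = z_value(level) * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width


def min_sample_size(sigma, max_width, level=0.95):
    """Smallest n for which the width 2*z*sigma/sqrt(n) is at most max_width."""
    return math.ceil((2 * z_value(level) * sigma / max_width) ** 2)


# Data from Question One part a: sigma = 1.5 kg, n = 100, sample mean 5.30 kg.
lower, upper = z_interval(5.30, 1.5, 100, level=0.95)
print(f"95% C.I.: ({lower:.3f}, {upper:.3f})")
```

Because `z_value` uses more decimal places than the tables, results may differ from table-based answers in the last digit.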

CONFIDENCE INTERVAL FOR A LARGE SAMPLE

In the examples considered so far it has been assumed that the samples are drawn from a normal distribution and that the variance of this distribution is known. In practice you will not always be sure that the population is normal and you may or may not have accurate information about its variance. However, the method of calculating a confidence interval which has been described in this chapter can still be applied provided that the sample is large. For a large sample the Central Limit Theorem makes $\bar{X}$ approximately normal, and the sample standard deviation $s$ can be used in place of $\sigma$.

ESTIMATING POPULATION PARAMETERS

• A statistic that is used to estimate a population parameter is called an estimator.
• The particular value of the estimator generated from the sample taken is called an estimate.

We need to determine how reliable these sample statistics are as estimators for the corresponding population parameters. Since all the $X_i$ are random variables having the same mean and variance as the population, you can sometimes find the expected value of a statistic $T$, $E(T)$, which tells you what the 'average' value of the statistic should be.

The bias is simply the expected value of the estimator minus the parameter of the population it is estimating.

• If a statistic $T$ is used as an estimator for a population parameter $\theta$ then bias $= E(T) - \theta$.
• If a statistic $T$ is used as an estimator for a population parameter $\theta$ and $E(T) = \theta$, so that $E(T) - \theta = 0$, then $T$ is an unbiased estimator for $\theta$.
• An unbiased estimator for $\sigma^2$ is given by the sample variance $S^2$:
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$$

One reason that the sample mean is used as an estimator for $\mu$ is that:

• The variance of the estimator, $\operatorname{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$, decreases as $n$ increases.
• For larger values of $n$, the value of an estimate is more likely to be close to the population mean.
• So, a larger value of $n$ will result in a better estimator.
• The standard deviation of an estimator is called the standard error of the estimator.
$$\text{Standard error of } \bar{X} = \frac{\sigma}{\sqrt{n}}, \quad \text{estimated by } \frac{s}{\sqrt{n}} \text{ when } \sigma \text{ is unknown}$$

EXAMPLE

QUESTION ONE
A random sample $X_1, X_2, X_3, \ldots, X_n$ is taken from a population with distribution $N(\mu, \sigma^2)$. Show that:
I. $E(\bar{X}) = \mu$
II. $\operatorname{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$

QUESTION TWO
Show that
$$S^2 = \frac{1}{n-1}\sum (X_i - \bar{X})^2 = \frac{1}{n-1}\left(\sum X_i^2 - n\bar{X}^2\right)$$
is an unbiased estimator for $\sigma^2$.

QUESTION THREE
The table below summarizes the number of breakdowns $X$ on a busy road on 30 randomly chosen days.

Number of breakdowns | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Number of days       | 3 | 5 | 4 | 3 | 5 | 4 | 4 | 2

a. Calculate unbiased estimates of the mean and variance of the number of breakdowns.
Twenty more days were randomly sampled, and this sample had $\bar{x} = 6.0$ and $s^2 = 5.0$.
b. Treating the 50 results as a single sample, obtain further unbiased estimates of the population mean and variance.
c. Find the standard error of this new estimate of the mean.
d. Estimate the size of the sample required to achieve a standard error of less than 0.25.

QUESTION FOUR
The lengths of metal bars produced by a certain machine are normally distributed with mean $\mu$ and standard deviation $\sigma$. A random sample of 10 metal bars is taken, and their lengths $X_1, X_2, X_3, \ldots, X_{10}$ are measured. Write down the distributions of the following.
a. $\sum_{i=1}^{10} X_i$
b. $\dfrac{2X_1 - 3X_{10}}{5}$
c. $\sum_{i=1}^{10} (X_i - \mu)$
d. $\bar{X}$
e. $\sum_{i=1}^{5} X_i - \sum_{i=6}^{10} X_i$
f. $\sum_{i=1}^{10} \left(\dfrac{X_i - \mu}{\sigma}\right)$

QUESTION FIVE
A large bag of coins contains 1 cent, 5 cent and 10 cent coins in the ratio 2 : 2 : 1.
a. Find the mean and the variance for the value of coins in this population.
A random sample of two coins is taken and their values $X_1$ and $X_2$ are recorded.
b. List all the possible observations from this sample.
c. Find the sampling distribution for the mean $\bar{X} = \dfrac{X_1 + X_2}{2}$.
d. Hence show that $E(\bar{X}) = \mu$ and $\operatorname{Var}(\bar{X}) = \dfrac{\sigma^2}{2}$.

QUESTION SIX
Find unbiased estimates of the mean and variance of the populations from which the following random samples have been taken.
a. 21.3, 19.6, 18.5, 22.3, 17.4, 16.3, 18.9, 17.6, 18.7, 16.5, 19.3, 21.8, 20.1, 22.0
b. 1, 2, 5, 1, 6, 4, 1, 3, 2, 8, 5, 6, 2, 4, 3, 1
c. 120.4, 230.6, 356.1, 129.8, 185.6, 147.6, 258.3, 329.7, 249.3

QUESTION SEVEN
Find unbiased estimates of the mean and variance of the populations for which random samples with the following summaries have been taken.
a. $n = 120$, $\sum x = 4368$, $\sum x^2 = 162\,466$
b. $n = 30$, $\sum x = 270$, $\sum x^2 = 2546$
c. $n = 1037$, $\sum x = 1140.7$, $\sum x^2 = 1278.08$
d. $n = 15$, $\sum x = 168$, $\sum x^2 = 162\,466$

QUESTION EIGHT
A sample of size 6 is taken from a population that is normally distributed with mean 10 and standard deviation 2.
a. Find the probability that the sample mean is greater than 12.
b. State, with a reason, whether your answer is an approximation.

QUESTION NINE
A machine fills cartons in such a way that the amount of drink in each carton is normally distributed with a mean of 40 cm³ and a standard deviation of 1.5 cm³.
A sample of four cartons is examined.
a. Find the probability that the mean amount of drink is more than 40.5 cm³.
A sample of 49 cartons is examined.
b. Find the probability that the mean amount of drink is more than 40.5 cm³ on this occasion.

QUESTION TEN
Cartons of orange juice are filled by a machine. A sample of 10 cartons selected at random from the production line contained the following quantities of orange juice (in ml).
201.2, 205.0, 209.1, 202.3, 204.6, 206.4, 210.1, 201.9, 203.7, 207.3
Calculate unbiased estimates of the mean and variance of the population from which this sample was taken.

QUESTION ELEVEN
A manufacturer of self-build furniture requires bolts of two lengths, 5 cm and 10 cm, in the ratio 2 : 1 respectively.
a. Find the mean $\mu$ and the variance $\sigma^2$ for the lengths of bolts in this population.
A random sample of three bolts is selected from a large box containing bolts in the required ratio.
b. List all the possible observations from this sample.
c. Find the sampling distribution for the mean $\bar{X}$.
d. Hence find $E(\bar{X})$ and $\operatorname{Var}(\bar{X})$.
e. Find the sampling distribution of the mode $M$.
f. Hence find $E(M)$ and $\operatorname{Var}(M)$.
g. Find the bias when $M$ is used as an estimator of the population mode.

QUESTION TWELVE
A machine operator checks a random sample of 20 bottles from a production line in order to estimate the mean volume of bottles ($\mu$ in cm³) from this production run. The 20 values can be summarized as $\sum x = 1300$ and $\sum x^2 = 84\,685$.
a. Use this sample to find unbiased estimates of $\mu$ and $\sigma^2$.
A supervisor knows from experience that the standard deviation of volumes on this process, $\sigma$, should be 3 cm³ and he wishes to have an estimate of $\mu$ that has a standard error of less than 0.5 cm³.
b. Recommend a sample size for the supervisor, showing working to support your recommendation.
c. Does your recommended sample size guarantee a standard error of less than 0.5 cm³? Give a reason for your answer.
The supervisor takes a further sample of size 16 and finds $\sum x = 1060$.
d. Combine the two samples to obtain a revised estimate of $\mu$.

QUESTION THIRTEEN
To work for a company, applicants need to complete a medical test. The probability of each applicant passing the test is $p$, independent of any other applicant. The medicals are carried out over two days: on the first day $n$ applicants are seen and on the next day $2n$ are seen. Let $X_1$ be the number of applicants who pass the test on the first day and let $X_2$ be the number of applicants who pass the test on the second day.
a. Write down $E(X_1)$, $E(X_2)$, $\operatorname{Var}(X_1)$ and $\operatorname{Var}(X_2)$.
b. Show that $\dfrac{X_1}{n}$ and $\dfrac{X_2}{2n}$ are both unbiased estimates of $p$ and state, giving a reason, which one you would prefer to use.
c. Show that $X = \dfrac{1}{2}\left(\dfrac{X_1}{n} + \dfrac{X_2}{2n}\right)$ is an unbiased estimator of $p$.
d. Show that $Y = \dfrac{X_1 + X_2}{3n}$ is an unbiased estimator of $p$.
e. Which of the statistics $\dfrac{X_1}{n}$, $\dfrac{X_2}{2n}$, $X$ or $Y$ is the best estimator of $p$?
The statistic $T = \dfrac{2X_1 + X_2}{3n}$ is proposed as an estimator of $p$.
f. Find the bias.
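
Questions such as Eleven and Fourteen ask for a sampling distribution found by listing every possible sample. The sketch below shows one way to check such a listing, assuming independent draws from a large population and using exact fractions. The helper names `sampling_distribution`, `expectation` and `variance` are illustrative and do not come from the text. The population used is the bag of pawns from the earlier example (0 with probability 0.6, 1 with probability 0.4).

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product
from statistics import mode


def sampling_distribution(population, size, statistic):
    """Exact distribution of a statistic over all ordered samples.

    population maps each value to its probability; draws are assumed
    independent, which is reasonable when the population is large.
    """
    dist = defaultdict(Fraction)
    for sample in product(population, repeat=size):
        prob = Fraction(1)
        for value in sample:
            prob *= population[value]
        dist[statistic(sample)] += prob
    return dict(dist)


def expectation(dist):
    """Expected value of a distribution given as {value: probability}."""
    return sum(value * prob for value, prob in dist.items())


def variance(dist):
    """Variance of a distribution given as {value: probability}."""
    mean = expectation(dist)
    return sum((value - mean) ** 2 * prob for value, prob in dist.items())


# Pawn example: 60% of pawns show 0 and 40% show 1; samples of size 3.
pawns = {0: Fraction(3, 5), 1: Fraction(2, 5)}
mean_dist = sampling_distribution(pawns, 3, lambda s: Fraction(sum(s), len(s)))
# With two values and an odd sample size the mode is always unique.
mode_dist = sampling_distribution(pawns, 3, mode)

for value, prob in sorted(mean_dist.items()):
    print(value, prob)
print("E =", expectation(mean_dist), "Var =", variance(mean_dist))
```

Passing a different dictionary of values and probabilities reuses the same enumeration. For populations with three or more values a sample may have no unique mode, so the statistic would need a rule for that case.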

QUESTION FOURTEEN
In a bag that contains a large number of counters, the number 0 is written on 40% of the counters, the number 1 is written on 20% of the counters and the number 2 is written on the remaining 40% of the counters.
a. Find the mean $\mu$ and the variance $\sigma^2$ for this population of counters.
A random sample of size 3 is taken from the bag.
b. List all the possible observations from this sample.
c. Find the sampling distribution of the mean $\bar{X}$.
d. Find $E(\bar{X})$ and $\operatorname{Var}(\bar{X})$.
e. Find the sampling distribution for the median $N$.
f. Hence find $E(N)$ and $\operatorname{Var}(N)$.
g. Show that $N$ is an unbiased estimator of $\mu$.
h. Explain which estimator, $\bar{X}$ or $N$, you would choose as an estimator of $\mu$.

QUESTION FIFTEEN
A factory worker checks a random sample of 20 bottles from a production line in order to estimate the mean volume of bottles ($\mu$ in cm³) from this production run. The 20 values can be summarized as $\sum x = 1300$ and $\sum x^2 = 84\,685$.
a. Use this sample to find unbiased estimates of $\mu$ and $\sigma^2$.
A factory manager knows from experience that the standard deviation of volumes on this process, $\sigma$, should be 3 cm³ and he wishes to have an estimate of $\mu$ that has a standard error of less than 0.5 cm³.
b. Recommend a sample size for the manager, showing working to support your recommendation.
c. Does the recommended sample size guarantee a standard error of less than 0.5 cm³? Give a reason for your answer.
The manager takes a further sample of 16 and finds $\sum x = 1060$.
d. Combine the two samples to obtain a revised estimate of $\mu$.
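
The summary-statistic calculations used in questions like the one above can be sketched in Python as a way of checking hand working. This is a minimal sketch, not a prescribed method: the function names `unbiased_estimates`, `standard_error` and `smallest_n_for_standard_error` are illustrative, and the sample-size search assumes the stated population standard deviation is correct.

```python
import math


def unbiased_estimates(n, sum_x, sum_x2):
    """Return (x_bar, s2) using s2 = (sum_x2 - n * x_bar**2) / (n - 1)."""
    x_bar = sum_x / n
    s2 = (sum_x2 - n * x_bar ** 2) / (n - 1)
    return x_bar, s2


def standard_error(sigma, n):
    """Standard error of the sample mean, sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def smallest_n_for_standard_error(sigma, target):
    """Smallest n with sigma / sqrt(n) strictly less than target."""
    n = max(1, math.floor((sigma / target) ** 2))
    while standard_error(sigma, n) >= target:
        n += 1
    return n


# Summary data from Question Fifteen: n = 20, sum x = 1300, sum x^2 = 84685.
x_bar, s2 = unbiased_estimates(20, 1300, 84685)
# Assumed sigma = 3 and a target standard error below 0.5, as in part b.
n_needed = smallest_n_for_standard_error(3, 0.5)
# Pooling the two samples (sizes 20 and 16) gives a revised mean estimate.
combined_mean = (1300 + 1060) / (20 + 16)
print(x_bar, s2, n_needed, combined_mean)
```

The search loop is used instead of a bare rounding step so that the strict inequality ("less than") is respected when $(\sigma / \text{target})^2$ is a whole number.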