Statistics and modelling course

Statistics and Modelling Course 2011
Topic: Confidence Intervals
Achievement Standard 90642: Calculate Confidence Intervals for Population Parameters
3 Credits, Externally Assessed
NuLake Pages 63101

LESSON 1 – Sampling

Handout with gaps to fill in – goes with the following slides.

STARTER: Look at the following 2 examples of bad sampling technique & discuss what’s wrong in each case.
1. Discuss how you’d obtain a representative sample from our school roll.
2. Notes on sampling and inference.
3. Population and Samples ‘Policemen’ worksheet (from Achieving in Statistics). Complete for HW.

Sampling

Describe some faults with each of these sampling methods.

(a) A survey on magazine readership is conducted by phoning households between 1 and 4pm.
• People who aren’t at home during those times cannot be surveyed.
• Some people don’t have a phone.

(b) A talkback radio station asks listeners to phone in with a quick ‘yes’ or ‘no’ answer to the question “Should NZ have capital punishment?”
• Only people who are listening at the time can participate.
• Self-selected sample: only those with a strong opinion will ring in.

Sampling

You are asked the question: “How tall are St. Thomas students?”
• You only have time to measure the heights of 35 students.
Q1: How would you choose which 35 students to measure?
Q2: Once you’ve measured your 35 students’ heights, how would you use this data to answer the question: “How tall are St. Thomas students?”

Purpose of a Sample

[Diagram: POPULATION and SAMPLE. The sample is used to make an inference about the population.]

Sampling terminology

POPULATION:
Target Population: All items under investigation. We usually just call it the “Population”.

SAMPLES:
Sample: Subset selected to REPRESENT the population.
Sampling Frame: A list/database of items from which we select our sample. (Should include all items in the Target Population.)

For a sample to be Representative of a given population: The Sampling Frame must match the Target Population.
                      Sample statistic      Population parameter
Number of items       n (sample size)       N (population size)
Mean                  $\bar{X}$             $\mu$
Standard deviation    $s$                   $\sigma$
Proportion            $p$                   $\pi$

A representative sample should have…
• Sample size large enough to allow the results to be meaningful (rough guide: sample size of n > 30).
• No Bias – Sample selection is said to be “biased” if some items are more likely to be chosen than others. Every item in the target population should be equally likely to be chosen. Random selection ensures this.
• Minimal Non-response – difficult to control this.

Example: A home security firm is hoping to sell as many burglar alarms as possible to householders in a certain town. Usually each house only needs one burglar alarm. Before the firm orders the alarms from their supplier, they wish to have an indication of how many alarms they might sell.

1.) What is the target population?
A. all the people who live in the town.
B. the head of each household.
C. the houses in the town.
Answer: C. the houses in the town.

2.) What is the sampling frame?
A. the electoral roll for the town.
B. a list of all the people who live in the town.
C. a list of all the houses in the town.
Answer: C. a list of all the houses in the town.

3.) How would you select a representative sample of the houses in the town? (Discuss as a class.)

HW: Do Population and Samples ‘Policemen’ worksheet. Finish by Monday. Will mark as a class.

EXTRA ON SAMPLING TECHNIQUES IF TIME (schol students). Otherwise skip to Lesson 3: Distribution of Sample Means 1.

Extension Lesson: Other sampling techniques

Good sampling techniques:
1. Simple Random Sampling
2. Systematic Sampling
3. Stratified Sampling
4. Cluster Sampling

Bad sampling techniques (biased selection):
• Convenience sampling.
• Self-selected sampling.

Random selection

Q: What does the word “random” actually mean?
Q: How would you select a student at random from this school?

21.03 Simple random sampling.
Generate 20 different random numbers between 1 and 100. If a random number has already occurred, generate more as needed.
Calculator formula: 1 + 100×RAN#
42 67 2 12 77 49 60 20 45 15 64 7 8 21 15 64 58 14 29 68 26 90
(15 and 64 each occur twice, so 22 numbers were generated to get 20 different ones.)
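The calculator procedure above (keep generating 1 + 100×RAN#, drop the decimal part, discard repeats) can be sketched in Python. This is only an illustration and not part of the course materials: the function name `simple_random_sample` and the seed are assumptions, and `random.random()` stands in for the calculator's RAN# key.

```python
import random


def simple_random_sample(population_size, sample_size, rng):
    """Pick `sample_size` different item numbers between 1 and `population_size`.

    Mimics the calculator method: whole-number part of 1 + N * RAN#,
    discarding any number that has already been chosen.
    """
    if sample_size > population_size:
        raise ValueError("sample cannot be larger than the population")
    chosen = []
    while len(chosen) < sample_size:
        number = int(1 + population_size * rng.random())  # whole number 1..N
        if number not in chosen:  # discard any repeats
            chosen.append(number)
    return chosen


rng = random.Random(2011)  # arbitrary seed, only so the run is repeatable
sample = simple_random_sample(100, 20, rng)
print(sorted(sample))
```

Each run with a different seed gives a different set of 20 item numbers, which is the point of random selection.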
1. Simple Random Sampling

1. Obtain a list of all N items in the target population, numbering them 1 to N (e.g. the school roll: 1-600).
2. Decide how many you will select for your sample (n).
3. Use the random number generator on your calculator to select numbers at random between 1 and N. On calculator, type: 1 + Population size × RAN#
4. Keep pressing ‘equals’ until you have selected n different items. Discard any repeats.

Advantage of SR sampling:
• Ensures that every item in the population has an equal chance of being selected – so no chance of bias.

Disadvantage:
• Does not ensure that all subgroups of the population are represented in proportion (e.g. some racial, socioeconomic groups could be over/under-represented).

Select a sample of 35 students from the St. Thomas school roll.

HW: Old Sigma Pg. 130 – Ex. 9.1 (all), then Pg. 134 – Ex. 9.2 – just Q1.

3 other good sampling techniques

Systematic sampling
1. Obtain a list of all N items in the target population (numbered 1 to N).
2. Pick a random starting point (e.g. item number 7).
3. Sample every kth item after that, where k = N/n, until you have selected n items.

Stratified sampling
Use when the population consists of categories (strata), e.g. racial groups.
1. Divide sampling frame into the strata (categories).
2. Select a separate random sample from each stratum in proportion to the percentage of the population found in each. (Called Proportional Allocation.)

Cluster sampling
Use when the population is distributed into naturally-occurring groups or ‘clusters’ (e.g. towns and cities in NZ).
Stage 1: Select the clusters: select a representative sample of the clusters themselves.
Stage 2: Select a random sample of items within chosen clusters. Must be in proportion to the percentage of the population found in each. (Called Proportional Allocation.)

21.03 Comparison of samples.
[Figure panels: Simple random sampling, Stratified sampling, Systematic sampling, Cluster sampling.]

1. Select a sample of between 30 and 36 students from the school roll using each of these 3 methods.
2. Write down at least one advantage and at least one disadvantage/risk associated with each of these 3 techniques.

HW: Do Old Sigma (2nd edition) p137: Ex. 9.3.

21.03 Systematic sampling.
To obtain a systematic sample of size 20 from this data:
• Choose a starting point at random between 1 and 100, using the calculator: 1 + 100×RAN#. Suppose this gives 5.87352, i.e. 5. So start at item number 5.
• Then choose every kth item, where k = N/n = 100/20 = 5. So sample every 5th item.
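The systematic-sampling example above (N = 100, n = 20, start at item 5, then every 5th item) can be sketched in Python as follows. The function name `systematic_sample` is made up for this illustration. Wrapping round from item N back to item 1 is an assumption the slides do not state; it is added so that any starting point between 1 and N still gives n items.

```python
import random


def systematic_sample(population_size, sample_size, start):
    """Return `start`, then every k-th item number, where k = N // n.

    Assumption (not from the slides): counting wraps from item N back to
    item 1, so a late starting point still yields n different items.
    """
    step = population_size // sample_size  # k = N/n (assumes n divides N)
    return [(start - 1 + i * step) % population_size + 1
            for i in range(sample_size)]


# Worked example from the slide: the calculator gave 5.87352
start = int(5.87352)  # drop the decimal part, so start at item number 5
print(systematic_sample(100, 20, start))  # 5, 10, 15, ..., 100

# The same method with a freshly generated random starting point
rng = random.Random(7)  # arbitrary seed, only so the run is repeatable
random_start = int(1 + 100 * rng.random())
print(systematic_sample(100, 20, random_start))
```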
Systematic Sampling

1. Obtain a list of all N items in the target population.
2. Decide on your sample size, n.
3. Pick a random starting point (e.g. item number 7).
4. Sample every kth item after that, where k = N/n, until you have selected n items.

Advantages:
• Ensures that the sample is selected from throughout the breadth of the sampling frame.
• Convenient and fast – easier to collect info on items that are in a sequence (every 5th house) than from a random sample where they are scattered all over.

Disadvantage:
• Be careful that the list itself has no systematic pattern. If every 2nd house on a street were sampled, all would be on the same side of the street!

21.03 Stratified sampling.
Suppose the avocados are of 3 different varieties. The number in each stratum of the sample should be proportional to the number in each group in the population.

Variety    Item numbers    Share of population    Number in sample of 20
Hass       1–40            40%                    40% × 20 = 8
Fuerte     41–70           30%                    30% × 20 = 6
Hopkins    71–100          30%                    30% × 20 = 6

Thus generate random numbers as follows:
Hass: 1–40, 8 random nos.: 33 17 12 25 9 9 33 16 39 8 (the repeated 9 and 33 are discarded)
Fuerte: 41–70, 6 random nos.: 58 59 67 43 53 56
Hopkins: 71–100, 6 random nos.: 98 85 96 99 90 81

Stratified sampling

Use when the population consists of categories (strata), and you wish to represent each ‘stratum’ proportionally (e.g. racial groups, one-story and multi-story homes within a city).
1. Obtain a list of all N items in the target population.
2. Decide on your sample size, n.
3. Divide list into the strata (categories).
4. Select a separate random sample from each stratum in proportion to the percentage of the population found in each.
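The avocado example above can be sketched in Python to show how the sample size is shared out between the strata. This is a minimal illustration, not the course's method: the names `strata` and `stratified_sample` and the seed are assumptions, and `random.Random.sample` replaces the "generate and discard repeats" calculator routine.

```python
import random

# Strata from the avocado example: variety -> item numbers in the population
strata = {
    "Hass": range(1, 41),       # items 1-40   (40% of the population)
    "Fuerte": range(41, 71),    # items 41-70  (30%)
    "Hopkins": range(71, 101),  # items 71-100 (30%)
}


def stratified_sample(strata, sample_size, rng):
    """Draw a separate random sample from each stratum, sized in proportion."""
    population_size = sum(len(items) for items in strata.values())
    sample = {}
    for name, items in strata.items():
        # Proportional allocation. Rounding is harmless here because the
        # shares come out as whole numbers; in general the rounded counts
        # may not add up to exactly sample_size.
        count = round(sample_size * len(items) / population_size)
        sample[name] = sorted(rng.sample(list(items), count))
    return sample


rng = random.Random(42)  # arbitrary seed, only so the run is repeatable
result = stratified_sample(strata, 20, rng)
for name, items in result.items():
    print(name, len(items), items)
```

The counts printed per variety are 8, 6 and 6, matching the 40% / 30% / 30% split worked out by hand.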
Proportional Allocation: Selecting from each stratum in proportion to its percentage of the population.
E.g. If 12% of a city’s citizens are Pacific Islanders, then 12% of the sample size should be selected from among the Pacific Island citizens.

Advantage:
• Each stratum is guaranteed to be represented in proportion to its size.

Disadvantage:
• Time-consuming and expensive because you must collect information about the strata sizes in advance.

Cluster sampling

Use when the population is distributed into naturally-occurring groups or ‘clusters’ (e.g. towns and cities in a country).
1. Select a representative sample of the clusters themselves (there are usually too many clusters to sample from all of them).
2. Select a random sample of items from within each chosen cluster.
3. Again, use Proportional Allocation (like with stratified samples). Weight the number selected from each cluster according to the cluster size.

E.g. Selecting samples of New Zealanders by selecting a sample of towns/cities from throughout the country, then a proportional random sample from within each.

Advantage:
• Cheaper and faster when sampling from a geographically large area (data can be collected in groups within chosen clusters rather than being spread out).

Disadvantages:
• Items don’t have an equal chance of selection.
  – Small clusters are unlikely to be sampled from.
  – Items that are not in clusters are excluded altogether. E.g. farmers or people in small rural communities may have no chance of being selected.
• Requires prior knowledge of cluster sizes.

HW: Memorise the 4 types of sampling techniques and the advantages & disadvantages of each.

21.03 Cluster sampling.
Here is one way of obtaining a cluster sample of size 20. Choose four clusters, each of 5 avocados, by selecting four numbers at random from the data, and taking them as the middle item of a ‘cross’. If clusters overlap or run outside the boundaries, choose another.
Spreadsheet formula: 99×RAN# + 1 = 62 22 2 68 56
Note: Depending on how a cluster is defined, it can exclude some items or make other items more likely to be chosen than under other sampling methods.

LESSON 2 – Distribution of Sample Means

The point of today: Get confident at calculating probabilities involving the distribution of sample means.
– Mark HW: “Achieving in Statistics”: page 30.
– Handout to fill in (goes with following slides).
• Then do Achieving in Statistics: pages 31 & 32.
The Distribution of Sample Means

STARTER ACTIVITY: Each class member has 5 dice. Toss your 5 dice and record the number facing upward for each. Add up to get the total for your 5.

My total value from 5 tosses = _____
My mean score for each die roll = ________

Your group of 5 dice tosses represents a sample of size n=5.
Between us, as a class, we tossed 5 dice ________ times.
We got means of: ________________________________

This illustrates the fact that sample means vary.
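The dice activity above can also be simulated. The sketch below is an illustration only: the class size of 25 and the seed are made-up values, and each "student" is just five calls to `randint(1, 6)`.

```python
import random
from statistics import mean

rng = random.Random(5)  # arbitrary seed, only so the run is repeatable


def toss_five_dice(rng):
    """One student's turn: toss 5 dice and return the five faces."""
    return [rng.randint(1, 6) for _ in range(5)]


class_size = 25  # hypothetical number of students
sample_means = []
for _ in range(class_size):
    faces = toss_five_dice(rng)
    sample_means.append(mean(faces))  # mean score per die for this student

print("Sample means:", sample_means)
print("Smallest:", min(sample_means), "Largest:", max(sample_means))
```

Every student sampled from the same population (a fair die), yet the printed means differ from student to student.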
A random sample can be thought of as a collection of n items (n=5 dice-tosses in the experiment we did last time), each of which has a value that we measure (the number facing upward when a die lands in this case).

When you select items at random from any population, the value of each item, X, is a random variable (e.g. height, weight, volume of drink in soft drink bottles etc.).

Select a random sample of size n from any population:

The sample mean, $\bar{X} = \dfrac{X_1 + X_2 + X_3 + \dots + X_n}{n}$

Different samples will produce different mean values, just like we got different mean values from tossing our dice.

The sample mean:
• is a random variable itself because it varies at random from sample to sample.
• is approximately normally distributed about the population mean $\mu$, even if the population from which it is drawn is not normally distributed, provided the samples are large enough. Rule of thumb is n > 30.

In other words the sample means will ‘average out’ towards the population mean. This result is called the ‘Central Limit Theorem’.

Distribution of Sample Means

[Diagram: normal curve of $\bar{X}$, centred at $\mu_{\bar{X}}$.]

Mean of sample means: $\mu_{\bar{X}} = \mu$

Std. deviation of distribution of sample means (standard error): $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$

Since sample means are normally distributed about the population mean, we can use the properties of a normal distribution curve to predict the percentage of samples that will produce means within a particular distance from the population mean.
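As a rough check of the two formulas above, the sketch below draws many samples from a population that is clearly not normal (a single fair die) and compares the mean and spread of the sample means with $\mu$ and $\sigma/\sqrt{n}$. The sample size of 36, the 5000 repetitions and the seed are arbitrary choices for the illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

faces = [1, 2, 3, 4, 5, 6]  # population: the faces of one fair die
mu = mean(faces)            # population mean, 3.5
sigma = pstdev(faces)       # population standard deviation, about 1.71

n = 36                      # sample size (above the n > 30 rule of thumb)
rng = random.Random(2011)   # arbitrary seed, only so the run is repeatable

# Take 5000 random samples of size n and record each sample mean
sample_means = [
    mean([rng.choice(faces) for _ in range(n)])
    for _ in range(5000)
]

print("mean of sample means:", round(mean(sample_means), 3), "vs mu =", mu)
print("sd of sample means:  ", round(pstdev(sample_means), 3),
      "vs sigma/sqrt(n) =", round(sigma / sqrt(n), 3))
```

The simulated values should land close to, but not exactly on, the theoretical ones, since the simulation is itself a sample.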
deviation of distribution of sample means (standard error) sX = mX s n Since sample means are normally distributed about the population mean, we can use the properties of a normal distribution curve to predict the percentage of samples that will produce means within a particular distance from the population mean. Example: Since sample means are normally distributed about the population mean, we can use the properties of a normal distribution curve to predict the percentage of samples that will produce means within a particular distance from the population mean. Example: The results of a census of all 17 year-old males in NZ showed a mean height of m = 177cm, with s = 9cm. If a random sample of 36 seventeen year-old NZ males is taken, calculate: a) The expected value of the sample mean. E( X ) = m X And, by the Central Limit Theorem,  E ( X ) = ____ mX = m the population mean. Since sample means are normally distributed about the population mean, we can use the properties of a normal distribution curve to predict the percentage of samples that will produce means within a particular distance from the population mean. Example: The results of a census of all 17 year-old males in NZ showed a mean height of m = 177cm, with s = 9cm. If a random sample of 36 seventeen year-old NZ males is taken, calculate: a) The expected value of the sample mean. E( X ) = m X And, by the Central Limit Theorem,  E ( X ) = 177cm mX = m the population mean. Example: The results of a census of all 17 year-old males in NZ showed a mean height of m = 177cm, with s = 9cm. If a random sample of 36 seventeen year-old NZ males is taken, calculate: a) The expected value of the sample mean. E( X ) = m X And, by the Central Limit Theorem, mX = m the population mean.  E ( X ) = 177cm b) The standard deviation (standard error) of the sample mean. s sX = n 9 = 36 The results of a census of all 17 year-old males in NZ showed a mean height of m = 177cm, with s = 9cm. 
If a random sample of 36 seventeen year-old NZ males is taken, calculate:
a) The expected value of the sample mean.
\(E(\bar{X}) = \mu_{\bar{X}}\) and, by the Central Limit Theorem, \(\mu_{\bar{X}} = \mu\), the population mean.
\(\therefore E(\bar{X}) = 177\) cm
b) The standard deviation (standard error) of the sample mean.
\(\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{9}{\sqrt{36}} = 1.5\) cm
Continuing with a random sample of 36 seventeen year-old NZ males:
c) What percentage of such samples would have a mean that is:
(i) Within 3cm of the population mean of 177cm?
\(P(174 < \bar{X} < 180 \mid \mu = 177) = P\left(\dfrac{174 - \mu}{\sigma_{\bar{X}}} < z < \dfrac{180 - \mu}{\sigma_{\bar{X}}}\right)\)
\(= P\left(\dfrac{174 - 177}{9/\sqrt{36}} < z < \dfrac{180 - 177}{9/\sqrt{36}}\right)\)
\(= P\left(\dfrac{174 - 177}{1.5} < z < \dfrac{180 - 177}{1.5}\right)\)
\(= P(-2 < z < 2)\)
\(= 2 \times 0.47725\)
\(= 0.9545\) (4sf)
So 95.45% of samples would have a mean within 3cm of the population mean.
Again for a random sample of 36 seventeen year-old NZ males, c) what percentage of such samples would have a mean that is:
(ii) More than 5cm away from the population mean?
\(P(\bar{X} < 172 \text{ or } \bar{X} > 182) = 1 - P(172 < \bar{X} < 182)\)
\(= 1 - P\left(\dfrac{172 - 177}{\sigma_{\bar{X}}} < Z < \dfrac{182 - 177}{\sigma_{\bar{X}}}\right)\)
\(= 1 - P\left(\dfrac{172 - 177}{9/\sqrt{36}} < Z < \dfrac{182 - 177}{9/\sqrt{36}}\right)\)
\(= 1 - P\left(\dfrac{172 - 177}{1.5} < Z < \dfrac{182 - 177}{1.5}\right)\)
\(= 1 - P\left(-3\tfrac{1}{3} < Z < 3\tfrac{1}{3}\right)\)
\(= 1 - 0.99914\)
\(= 0.00086\)
So only about 0.09% of samples. Very rare.

Homework: Do Achieving in Statistics: pages 31 & 32.
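As a cross-check on parts (a) to (c) of the height example, the same calculation can be sketched in Python. This is only an illustrative sketch: the helper names (normal_cdf, standard_error, prob_mean_within) are made up for this example, and it assumes the sample mean is normally distributed with mean \(\mu\) and standard error \(\sigma/\sqrt{n}\), as stated in the notes.

```python
import math


def normal_cdf(z):
    """P(Z <= z) for a standard normal variable, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def prob_mean_within(distance, sigma, n):
    """P(sample mean lies within `distance` of the population mean)."""
    z = distance / standard_error(sigma, n)
    return normal_cdf(z) - normal_cdf(-z)


# Census values from the example: sigma = 9 cm, samples of size n = 36.
se = standard_error(9, 36)                    # part (b)
p_within_3 = prob_mean_within(3, 9, 36)       # part (c)(i)
p_beyond_5 = 1 - prob_mean_within(5, 9, 36)   # part (c)(ii)

print(f"standard error = {se:.1f} cm")
print(f"P(within 3 cm) = {p_within_3:.4f}")
print(f"P(more than 5 cm away) = {p_beyond_5:.5f}")
```

The printed probabilities should agree with the table-based answers of about 0.9545 and 0.00086; small differences in the last digit come from rounding in the tables.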
Extension
The point of today: Look at when we can draw conclusions about the population mean based on a sample mean.
STARTER: Look at applet that demonstrates the distribution of sample means: SIM - onlinestatbook.com/rvls.html.
• Work through the following examples as class (handout to fill in).
• Then do Sigma p184 – Ex. 11.5 (old version) or Sigma p66 – Ex. 3.05 (new version).

Example: The census of all NZ seventeen year-old males from yesterday’s example was actually conducted back in 1987. It had a mean of \(\mu = 177\) cm and \(\sigma\) of 9cm. A random sample of 36 seventeen year-old NZ males was selected just last year. This sample found a mean height of 180cm.
(a) What is the probability that a random sample of 36 students selected from a population with \(\mu = 177\) cm and \(\sigma = 9\) cm would give a mean height greater than 180cm?
\(P(\bar{X} > 180 \mid \mu = 177) = P\left(z > \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right)\)
\(= P\left(z > \dfrac{180 - 177}{9/\sqrt{36}}\right)\)
\(= P(z > 2)\)
\(= 0.02275\)
(b) Based on this answer, what percentage of samples would have means of 180cm or higher if the population mean was 177cm (like in 1987)?
Answer: Only 2.275%.
(c) Sketch a normal distribution curve for the distribution of sample means from a population with \(\mu = 177\) cm and standard deviation of \(\sigma = 9\) cm.
(d) So it is very unlikely that a randomly selected sample taken from a population with mean 177cm and standard deviation of 9cm would have a mean as high as this one.
Yet it did.
(e) What is the most likely explanation?

Do Sigma: in old (2nd) edition: Pg. 184 – Ex. 11.4, OR in NEW edition: Pg. 66 – Ex. 3.04.

LESSON 4 – C.I.s for Means 1
• Today’s theme: Solving problems involving Confidence Intervals for Means.
• Students do NuLake Ch 2.5 – Calculate Confidence Intervals of means.
http://www.youtube.com/watch?v=Ohz-PZqaMtk

Question: If the population mean height of 17 year-old NZ males is 177cm with \(\sigma\) of 9cm, within what interval would we expect the means of 95% of samples of size 36 to lie?
Notice that the middle 95% of the area under the normal curve means half on each side of the mean, i.e. 47.5% (or 0.475) on each side.
Looking up 0.475 on the tables gives z = 1.96.
[Diagram: normal curve with the middle 95% shaded, 47.5% on each side of the mean, between z = -1.96 and z = 1.96.]
So when we calculate the mean from a random sample we expect that, 95% of the time, it will be within ± 1.96 standard errors of the popn mean, \(\mu\).
Now, work out the lower and upper limits of the interval within which you’d expect 95% of sample means to lie if each sample has 36 people in it.
Limits: \(\mu \pm 1.96 \times \dfrac{\sigma}{\sqrt{n}} = 177 \pm 1.96 \times 1.5 = 177 \pm 2.94\) cm.
Conclusion: So 95% of samples of size 36 from this population will produce means between 174.06cm and 179.94cm.
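The 95% range for sample means asked for in the question above can be checked with a short Python sketch. The function name central_range is hypothetical, and the sketch assumes the sample means are normally distributed about \(\mu\) with standard error \(\sigma/\sqrt{n}\). It uses the standard library's NormalDist to recover the z value instead of reading it from tables.

```python
import math
from statistics import NormalDist


def central_range(mu, sigma, n, level=0.95):
    """Interval around mu expected to contain `level` of all sample means."""
    # z such that the middle `level` of the standard normal curve lies in (-z, z);
    # for level = 0.95 this is the table lookup of 0.475 on each side (about 1.96).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sigma / math.sqrt(n)
    return mu - margin, mu + margin


# Height example: mu = 177 cm, sigma = 9 cm, samples of size 36.
lower, upper = central_range(177, 9, 36)
print(f"{lower:.2f} cm to {upper:.2f} cm")
```

Because the z value is not rounded to 1.96 here, the limits may differ from a hand calculation in the third decimal place.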
Problem: In real life, we almost never know the population mean (or standard deviation). We only have enough resources to conduct ONE random sample and use it to estimate (infer) the population mean. How can our knowledge of the distribution of sample means help us here?
Answer: We construct an interval within which we think the population mean lies.
Estimate of \(\mu = \bar{X} \pm\) margin of error. This is known as a Confidence Interval.
A 95% Confidence Interval for the population mean is an interval that has a 95% probability of containing the population mean. In other words, if we took many random samples and built an interval from each in the same way, about 95% of those intervals would contain \(\mu\). (Diagram on board.)
A 95% confidence interval for \(\mu\) is \(\bar{X} \pm 1.96 \times \sigma_{\bar{X}} = \bar{X} \pm 1.96 \times \dfrac{\sigma}{\sqrt{n}}\)
A 99% confidence interval for \(\mu\) is \(\bar{X} \pm 2.576 \times \dfrac{\sigma}{\sqrt{n}}\)
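The two interval formulas above can be sketched as a small Python function. This is a minimal illustration, not part of the course materials: the names Z_VALUES and confidence_interval are hypothetical, the z multipliers are the ones quoted in the notes, and the numbers reuse the earlier height sample (mean 180cm from 36 males) together with the census value \(\sigma = 9\) cm as an assumed population standard deviation.

```python
import math

# z multipliers quoted in the notes for 95% and 99% confidence
Z_VALUES = {95: 1.96, 99: 2.576}


def confidence_interval(sample_mean, sigma, n, level=95):
    """Return (lower, upper) limits of a z-based interval for the population mean."""
    margin = Z_VALUES[level] * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Illustration only: sample of 36 with mean 180 cm, assuming sigma = 9 cm.
low95, high95 = confidence_interval(180, 9, 36, level=95)
low99, high99 = confidence_interval(180, 9, 36, level=99)
print(f"95% CI: {low95:.2f} cm < mu < {high95:.2f} cm")
print(f"99% CI: {low99:.2f} cm < mu < {high99:.2f} cm")
```

The 99% interval comes out wider than the 95% one, since a larger z multiplier is needed to be more confident of capturing \(\mu\).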
Example 1: A soft drink is sold in bottles. The amount of drink in each bottle is normally distributed with a standard deviation of 40mL. The mean volume of drink in a random sample of 100 such bottles is 300mL. Construct a 95% confidence interval for the true mean volume of drink per bottle.
Solution: There is a 95% probability that the interval 300mL ± 1.96 standard errors contains the true population mean.
A 95% C.I. for the population mean \(\mu\) is:
\(\bar{X} \pm\) Margin of Error
\(= \bar{X} \pm z \times\) Standard Error of the sample mean
\(= 300 \pm 1.96 \times \dfrac{40}{\sqrt{100}}\)
\(= 300\) mL \(\pm\) 7.84mL (Margin of Error, E)
ANSWER: The 95% CI for the population mean is: 292.2mL < \(\mu\) < 307.8mL

Now calculate the 99% C.I. Will it be wider or narrower?
A 99% C.I. is
\(\bar{X} \pm z_{0.99/2} \times \dfrac{\sigma}{\sqrt{n}}\)
\(= 300 \pm 2.576 \times \dfrac{40}{\sqrt{100}}\)
\(= 300\) mL \(\pm\) 10.304mL (Margin of Error, E)
ANSWER: The 99% CI for the population mean is: 289.7mL < \(\mu\) < 310.3mL

Copy examples, then do NuLake Ex. 2.5: p81–84.

When we aren’t told the population standard deviation \(\sigma\)
If we aren’t given the popn standard deviation \(\sigma\), then use the sample standard deviation s as an estimate. This is OK provided the sample size is large enough (n > 30).

LESSON 5 – C.I.s for Means 2
The purpose of today:
• Memorise definition of a confidence interval.
• Get confident at constructing confidence intervals for population means.
To do today:
1. Watch youtube clip: http://www.youtube.com/watch?v=Ohz-PZqaMtk
2. Interpret C.I. from yesterday’s e.g. in context.
3. Finish NuLake 2.5.
4. Do new Sigma p75 - Ex. 4.01: To end of Q14 compulsory. Q15–17 are extra for experts.
5. 2008 NCEA exam question.

LESSON 6 – SAMPLE SIZE (MEANS)
• Today’s theme: Calculate the required sample size to meet a set of specified conditions for a Confidence Interval for the population MEAN.
• Do Sigma (old): Ex. 14.2 – pg. 230. (New version: Ex. 4.02 – pg. 79)

Calculating the minimum sample size - means.
The confidence interval formula for estimating the population mean, m, is: s X  z n The margin of error, E, is the distance between the sample mean and the upper and lower limits of this interval. Margin of Error, E = z s n For example, a confidence interval of 18cm < m < 22cm, can also be expressed as 20cm + 2cm. The ___________ is __cm. Our estimate is “___________________”. Calculating the minimum sample size - means. The confidence interval formula for estimating the population mean, m, is: s X  z n The margin of error, E, is the distance between the sample mean and the upper and lower limits of this interval. Margin of Error, E = z s n For example, a confidence interval of 18cm < m < 22cm, can also be expressed as 20cm + 2cm. The margin of error is 2cm. Our estimate is “___________________”. Calculating the minimum sample size - means. The confidence interval formula for estimating the population mean, m, is: s X  z n The margin of error, E, is the distance between the sample mean and the upper and lower limits of this interval. Margin of Error, E = z s n For example, a confidence interval of 18cm < m < 22cm, can also be expressed as 20cm + 2cm. The margin of error is 2cm. Our estimate is “accurate to within 2cm”. Given a particular level of confidence a, Calculating the minimum sample size - means. The confidence interval formula for estimating the population mean, m, is: s X  z n The margin of error, E, is the distance between the sample mean and the upper and lower limits of this interval. Margin of Error, E = z s n For example, a confidence interval of 18cm < m < 22cm, can also be expressed as 20cm + 2cm. The margin of error is 2cm. Our estimate is “accurate to within 2cm”. Given a particular level of confidence a, we can calculate how big a sample is necessary to estimate m to give a required accuracy or margin of error, E. The margin of error, E, is the distance between the sample mean and the upper and lower limits of this interval. 
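The interval, its margin of error, and the sample size needed for a chosen margin of error can be checked with a short Python sketch. It assumes σ is known and takes z from the standard library's NormalDist instead of tables, so z is 1.95996... instead of the rounded 1.96. The function names and the 5mL target are illustrative assumptions; the bottle figures are those of Example 1.

```python
import math
from statistics import NormalDist


def z_value(confidence):
    """Two-sided critical value, e.g. 0.95 gives roughly 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def margin_of_error(sigma, n, confidence=0.95):
    """E = z * sigma / sqrt(n), with sigma the known population SD."""
    return z_value(confidence) * sigma / math.sqrt(n)


def mean_ci(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the population mean: x_bar - E to x_bar + E."""
    e = margin_of_error(sigma, n, confidence)
    return x_bar - e, x_bar + e


def min_sample_size(sigma, e, confidence=0.95):
    """Smallest whole n with margin of error at most e: n >= (z * sigma / e)^2."""
    return math.ceil((z_value(confidence) * sigma / e) ** 2)


# Example 1 figures: sample mean 300mL, sigma 40mL, n = 100 bottles.
for level in (0.95, 0.99):
    low, high = mean_ci(300, 40, 100, level)
    print(f"{level:.0%} CI: {low:.1f}mL < mu < {high:.1f}mL")

# Assumed target (not from the slides): estimate the mean to within 5mL.
print("Bottles needed for E <= 5mL at 95%:", min_sample_size(40, 5))
```

Raising the confidence level increases z and so widens the interval, while a larger n shrinks E in proportion to 1/√n.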
Margin of Error, E = z s n For example, a confidence interval of 18cm < m < 22cm, can also be expressed as 20cm + 2cm. The margin of error is 2cm. Our estimate is “accurate to within 2cm”. Given a particular level of confidence a, we can calculate how big a sample is necessary to estimate m to give a required accuracy or margin of error, E. E.g: A survey is to be conducted to determine the mean income of a group of workers. A pilot survey gives s  $100. The margin of error, E, is the distance between the sample mean and the upper and lower limits of this interval. Margin of Error, E = z s n For example, a confidence interval of 18cm < m < 22cm, can also be expressed as 20cm + 2cm. The margin of error is 2cm. Our estimate is “accurate to within 2cm”. Given a particular level of confidence a, we can calculate how big a sample is necessary to estimate m to give a required accuracy or margin of error, E. E.g: A survey is to be conducted to determine the mean income of a group of workers. A pilot survey gives s  $100. How large must the sample be if the mean income is to be estimated to within $20 using a 95% confidence interval? For example, a confidence interval of 18cm < m < 22cm, can also be expressed as 20cm + 2cm. The margin of error is 2cm. Our estimate is “accurate to within 2cm”. Given a particular level of confidence a, we can calculate how big a sample is necessary to estimate m to give a required accuracy or margin of error, E. E.g: A survey is to be conducted to determine the mean income of a group of workers. A pilot survey gives s  $100. How large must the sample be if the mean income is to be estimated to within $20 using a 95% confidence interval? s Solution: A confidence interval for the mean income m is: X  1·96  For the income to be found to within $20, we need: 1·96  100 n < 20 n E.g: A survey is to be conducted to determine the mean income of a group of workers. A pilot survey gives s  $100. 
How large must the sample be if the mean income is to be estimated When you’ve copied down this e.g: to within $20 using a 95% confidence interval? Do Sigma (new): Ex. 4.02 – pg. 79 s Solution: A confidence interval for the mean income m is: (or Old version: Ex. 14.2 – pg. 230). For the income to be found to within $20, we need: FINISH FOR H.W. 1·96  100 n < 20 196  20 n 1962  202 n 1962 n 2 20 n > 96.04 Squaring both sides Answer: A minimum sample size of 97 is needed. X  1·96  n Formula for calculating minimum sample size.  zs  n=   E  2 Where E = Margin of Error. i.e. half of C.I. width. Sample-size question from 2007 NCEA External Exam A random sample of size n is taken from a population having a known standard deviation σ. A 95% confidence interval for the population mean is calculated using the sample mean. A second random sample of size 2n is taken from the same population and a 95% confidence interval for the population mean is calculated using its sample mean. How many times greater is the width of the first confidence interval than the width of the second confidence interval? Formula for calculating minimum sample size.  zs  n=   E  2 Where E = Margin of Error. i.e. half of Confidence Interval width. LESSON 7 – Intro to Confidence Intervals for Proportions The points of today: • Introduction to Distribution of Sample PROPORTIONS. • Construct confidence intervals for Population Proportions.  Notes on distn. of sample proportions (handout).  Do handout on distribution of sample proportions (Achieving in Statistics page 33).  How to construct a C.I. for a proportion.  HW: NuLake Ex. 2.6. The Distribution of Sample Proportions E.g. Political Opinion Polls - National vs Labour. 2 possible outcomes where p is the proportion of successful outcomes in n trials. If a sequence of n independent trials results in x successes, then x has a _________ distribution. The Distribution of Sample Proportions E.g. 
Political Opinion Polls – National vs Labour. 2 possible outcomes, where p is the proportion of successful outcomes in n trials.
If a sequence of n independent trials results in x successes, then x has a Binomial distribution.
A point estimator of the popn proportion of successful trials, π, is the sample proportion \(p = \frac{x}{n}\).
With a sufficient sample size (rule of thumb n > 30), the distribution of sample proportions p is approximately normal (by the Central Limit Theorem) and
\[ E(p) = \mu_p = \pi, \qquad \sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} \]

1. Do handout on distribution of sample proportions. (Will do Q1 table on board as a class.)

The proofs of the formulae for mean and standard deviation of the distribution of sample proportions:
Proof (mean):
\[ E(p) = E\left(\frac{X}{n}\right) = \frac{1}{n} E(X) = \frac{1}{n} \cdot n\pi = \pi \]
since, for the Binomial Distribution, μ = nπ.
Proof (standard deviation):
\[ \mathrm{Var}(p) = \mathrm{Var}\left(\frac{X}{n}\right) = \left(\frac{1}{n}\right)^2 \mathrm{Var}(X) = \frac{1}{n^2} \cdot n\pi(1-\pi) = \frac{\pi(1-\pi)}{n} \]
since, for the Binomial Distribution, σ² = nπ(1 − π). So
\[ \sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} \]

Confidence Intervals for Proportions
Example: Political opinion polls. 500 New Zealanders aged 18 and over were selected at random for an opinion poll. They were asked to indicate whether Labour or National would be their preferred political party. 275 voted for National. Find a 95% confidence interval for the true proportion of all NZers who favour National.
Solution: Our point estimate for π is \(p = \frac{275}{500} = 0.55\).
We can be 95% confident that the interval 0.55 ± 1.96 standard errors contains the true population proportion who would prefer National.
A 95% C.I. is:
\[ p \pm z \cdot \sigma_p = p \pm z \sqrt{\frac{p(1-p)}{n}} = 0.55 \pm 1.96 \times \sqrt{\frac{0.55 \times 0.45}{500}} = 0.55 \pm 0.04361 \]
Here 0.04361 is the Margin of Error, E.
ANSWER: The 95% CI for the proportion in favour of National is 0.5064 < π < 0.5936
PONDER THIS: Based on this opinion poll, does National have a STATISTICALLY SIGNIFICANT majority?
HW: Do NuLake Ex. 2.6 – CIs for proportions.

LESSON 8 – Practice constructing C.I.s for Proportions
The point of today:
• Do lots of practice involving confidence intervals for Population Proportions.
Go over any homework questions – NuLake p87, 88: Ch 2.6 – C.I.s for proportions.
Then do Sigma pg. 232 – Ex. 14.3 (old version), or in new version: pg. 88 – Ex. 5.01. Finish for HW.

LESSON 9 – SAMPLE SIZE (PROPORTIONS)
• Today’s theme: Calculate the required sample size to meet a set of specified conditions for a Confidence Interval for the population PROPORTION.
• Key point – for minimum sample size, if not told π, assume π = 0.5 as this gives the greatest margin of error (prepared for the worst).
• Do Sigma: old edition – p235 – Ex. 14.4 or new edition – p91 – Ex. 5.02

Calculating the minimum sample size - proportions.
The confidence interval formula for estimating the population proportion, π, is:
\[ p \pm z \sqrt{\frac{p(1-p)}{n}} \]
The margin of error, E, is the distance between the sample proportion and the upper and lower limits of this interval.
\[ \text{Margin of Error, } E = z \sqrt{\frac{p(1-p)}{n}} \]
For example, a confidence interval of 0.37 < π < 0.43 can also be expressed as 0.4 ± 0.03. The margin of error is 0.03. Our estimate is “accurate to within 0.03”.
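The opinion-poll interval and its margin of error can be reproduced with a few lines of Python. This is only a sketch: the function name is an assumption, and z comes from the standard library's NormalDist, so the last decimal place can differ slightly from working with the rounded 1.96.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Return (p, E, (low, high)) for a large-sample CI for a proportion."""
    p = successes / n                               # sample proportion
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    e = z * math.sqrt(p * (1 - p) / n)              # margin of error
    return p, e, (p - e, p + e)


# Poll figures from the example: 275 of 500 preferred National.
p, e, (low, high) = proportion_ci(275, 500)
print(f"{p:.2f} +/- {e:.4f}, i.e. {low:.4f} < pi < {high:.4f}")
print("Whole interval above 0.5:", low > 0.5)
```

The final line relates to the PONDER THIS question: whether the whole interval sits above 0.5 is what decides if the lead counts as a majority at this confidence level.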
The sample size depends on three factors:
1. The level of confidence required, α.
2. The true value of π, which will often be unknown.
3. The accuracy required, i.e. the margin of error, E, we are willing to accept.

Example
An international airline is thinking of making smoking illegal on its aircraft. Before making the decision it wishes to estimate the proportion of smokers in the population of passengers on its planes by taking a random sample. How big a sample must it take to be 95% sure that the value so obtained does not differ from the true proportion by more than 0.05?

Solution: A 95% confidence interval for the proportion of smokers on all planes, π, is:
\[ p \pm 1.96 \times \sqrt{\frac{\pi(1-\pi)}{n}} \]
For the proportion to be found to within 0.05, we need the Margin of Error to be at most 0.05:
\[ 1.96 \times \sqrt{\frac{\pi(1-\pi)}{n}} \le 0.05 \]
PROBLEM! We don’t have a value for π. That’s the very thing we’re trying to estimate!!
To get around this problem we have 3 options:
1. Use a value of p that has held in the past (previous samples).
2. Take a small pilot survey, and use the sample proportion p from that as an estimate of π.
3. Use π = 0.5. This allows for the greatest possible error because the maximum possible value of π(1 − π) occurs when both π and (1 − π) are ½, i.e. when π(1 − π) = 0.5 × 0.5 = 0.25.

Back to this example: We’re given no information on the value of π, so let π = 0.5.
\[ 1.96 \times \sqrt{\frac{0.5(1-0.5)}{n}} \le 0.05 \]
\[ 1.96 \times \sqrt{\frac{0.25}{n}} \le 0.05 \]
\[ \frac{1.96^2 \times 0.25}{n} \le 0.05^2 \quad \text{(squaring both sides)} \]
\[ n \ge \frac{1.96^2 \times 0.25}{0.05^2} \]
\[ n \ge 384.16 \]
Answer: A sample size of 385 passengers is needed.

Do Sigma p235 – Ex. 14.4 (old version) or p91 – Ex. 5.02 (new version).
Homework: NuLake pg. 96: Q64–77.
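The same calculation can be sketched in Python. The function name and the pilot value of 0.2 are assumptions made for illustration; only the 0.05 margin and the 95% level come from the airline example.

```python
import math
from statistics import NormalDist


def min_sample_size_proportion(e, confidence=0.95, pi_guess=0.5):
    """Smallest whole n with z * sqrt(pi(1 - pi) / n) <= e.

    pi_guess = 0.5 is the worst case, because pi(1 - pi) peaks at 0.25 there.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * pi_guess * (1 - pi_guess) / e ** 2)


# Airline example: within 0.05 at 95% confidence, nothing known about pi.
worst_case = min_sample_size_proportion(0.05)

# Hypothetical pilot survey suggesting about 20% smokers (assumed value).
with_pilot = min_sample_size_proportion(0.05, pi_guess=0.2)

print("Worst case (pi = 0.5):", worst_case)
print("With pilot estimate 0.2:", with_pilot)

# pi(1 - pi) over a grid of values is largest at pi = 0.5.
grid = [k / 100 for k in range(1, 100)]
print("pi maximising pi(1 - pi):", max(grid, key=lambda x: x * (1 - x)))
```

Any prior estimate of π away from 0.5 lowers the required n, which is why options 1 and 2 can save sampling effort when such information exists.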
LESSON 10 – Differences between means 1
The point of today: Construct confidence intervals for the difference between 2 population means.
• Do NuLake 2.7: pg. 89–93.
• 2007 NCEA exam – C.I.s

Confidence Intervals for the Difference Between 2 Means
Involves comparison between the means of two populations (e.g. males & females).
We select a random sample from each group and calculate the 2 means, subtracting to get the difference. We then use this difference to estimate the difference between the means of the 2 populations from which the samples were drawn.
The expected difference between the 2 sample means is the true difference between the 2 population means (Central Limit Theorem):
\[ E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2 \]
i.e. Mean difference between sample means = diff. between popn means.

Sample Mean (point estimate) | Sample Size | Popn Mean | Variance of Sample Means
\(\bar{X}_1\) | \(n_1\) | \(\mu_1\) | \(\sigma_1^2 / n_1\)
\(\bar{X}_2\) | \(n_2\) | \(\mu_2\) | \(\sigma_2^2 / n_2\)
\(\bar{X}_1 - \bar{X}_2\) | – | \(\mu_1 - \mu_2\) | \(\sigma_1^2 / n_1 + \sigma_2^2 / n_2\)

So the Standard Error of the difference between 2 sample means is:
\[ \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
NOTE:
1. The 2 samples must be INDEPENDENT of one another.
2. When finding a confidence interval for the difference between 2 means, we use the popn parameters σ1 and σ2. If not told these, we can use the sample SDs s1 and s2, provided the sample sizes are large enough (n > 30).
3. A 95% Confidence Interval tells us that 95% of such intervals will CONTAIN the difference between the POPULATION MEANS.
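Note 3 can be illustrated by simulation. The sketch below repeatedly draws two independent samples, builds a 95% interval for the difference each time, and counts how often the interval contains the true difference. Every number in it (population means, standard deviations, sample sizes, number of trials, seed) is made up for the demonstration, and the populations are assumed to be normal with known standard deviations.

```python
import math
import random
from statistics import NormalDist, fmean


def diff_ci(x1, x2, sigma1, sigma2, confidence=0.95):
    """CI for mu1 - mu2 from two independent samples with known sigmas."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    se = math.sqrt(sigma1 ** 2 / len(x1) + sigma2 ** 2 / len(x2))
    d = fmean(x1) - fmean(x2)
    return d - z * se, d + z * se


def coverage(trials=2000, seed=1):
    """Fraction of simulated 95% intervals that contain the true difference."""
    rng = random.Random(seed)
    mu1, sigma1, n1 = 50.0, 6.0, 40   # made-up population 1
    mu2, sigma2, n2 = 47.0, 8.0, 60   # made-up population 2
    hits = 0
    for _ in range(trials):
        x1 = [rng.gauss(mu1, sigma1) for _ in range(n1)]
        x2 = [rng.gauss(mu2, sigma2) for _ in range(n2)]
        low, high = diff_ci(x1, x2, sigma1, sigma2)
        if low < mu1 - mu2 < high:
            hits += 1
    return hits / trials


print(f"Share of intervals containing mu1 - mu2: {coverage():.3f}")
```

The printed share should land close to 0.95, varying a little from run to run if the seed is changed.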
Confidence Intervals for Difference Between 2 Means
Example: A random sample of 49 women has a mean life of 76 years with a standard deviation of 8 years, and a random sample of 64 men has a mean life of 72 years with a standard deviation of 9 years.
(a) Find a 95% confidence interval for the difference between the mean lifetimes of all women and all men.
(b) What can we conclude about the mean lifespans of all men and all women on the basis of this confidence interval? Justify your answer.

Solution:
For the women: \(n_1 = 49\), \(\bar{X}_1 = 76\), \(s_1 = 8\)
For the men: \(n_2 = 64\), \(\bar{X}_2 = 72\), \(s_2 = 9\)
\(\bar{X}_1 - \bar{X}_2\) = 76 – 72 = 4 yrs
(a) A 95% Confidence Interval for μ1 − μ2, the difference between the population mean lifetimes of women and men, is:
\[ (\bar{X}_1 - \bar{X}_2) \pm z \times \text{Standard Error of } (\bar{X}_1 - \bar{X}_2) = 4 \pm 1.96 \times \sqrt{\frac{8^2}{49} + \frac{9^2}{64}} = 4 \pm 3.143 \]
(Use the sample standard deviations – OK if sample is large enough.) The Margin of Error, E, is 3.143 yrs.
ANSWER: The 95% CI for the difference between the population mean lifetimes of women and men is 0.857yrs < (μ1 − μ2) < 7.143yrs
(b) ANSWER: Since the interval does not contain a difference of ZERO, there are sufficient grounds to say that there is a difference between the mean lifespans of the populations of all men and all women.
TRY WITH A 99% C.I.

Difference between 2 means exercises
• Do NuLake Ch 2.7: pg. 89–93

LESSON 11 – Differences between means 2
The point of today: Construct confidence intervals for the difference between 2 population means.
• Do Sigma pg. 239 – Ex. 14.5 (old version), or pg. 83 – Ex. 4.03 (new version)
STARTER: GO THROUGH PROBLEM FROM HW AS A CLASS.
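For checking working on difference-between-means problems, here is a short Python sketch applied to the lifetimes example (women: n = 49, mean 76, SD 8; men: n = 64, mean 72, SD 9). The function name is an assumption, the sample SDs stand in for the population SDs as the notes allow for large samples, and z comes from NormalDist, not the rounded table value.

```python
import math
from statistics import NormalDist


def diff_of_means_ci(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Large-sample CI for mu1 - mu2 from summary statistics."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)   # standard error of the difference
    diff = mean1 - mean2
    return diff - z * se, diff + z * se


for level in (0.95, 0.99):
    low, high = diff_of_means_ci(76, 8, 49, 72, 9, 64, level)
    verdict = "contains" if low <= 0 <= high else "does not contain"
    print(f"{level:.0%} CI: {low:.3f} < mu1 - mu2 < {high:.3f} ({verdict} zero)")
```

Comparing the two printed lines shows how the conclusion in part (b) depends on the confidence level chosen.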
LESSON 12 – The distribution of the sample total
The point of today: Construct confidence intervals for the combined total of a sample of items.
• Example
• 2009 NCEA paper (AS90642): Q1b & c.
• Probabilities for sample totals: Ex. 3.03 (pg. 64) – complete for HW.

Confidence Intervals for the Sample Total, $T_n$

This is where you are asked to give a confidence interval for the combined total of a sample of n items (or of the entire population of N items). (E.g. total weight of a sample of eight Y13 males.)

You will be told the mean value per item and the standard deviation.

These problems come in 2 types, depending on whether you’re given the population mean, $\mu$, or the mean from a sample, $\bar{X}$.
Type 1: Based on $\mu$, a known population mean (look at today).
Type 2: Based on $\bar{X}$, the mean from a random sample (look at next lesson).

Type 1: Based on $\mu$, a known population mean.
This is where you are given $\mu$, the mean value per item in the population, and asked to construct a confidence interval for the total value of a sample of n items.

E.g. Seventeen-year-old NZ males have a known mean weight of 80 kg, with a standard deviation of 5 kg. Construct a 99% CI for the combined total weight of a random sample of 8 students.

Solution: The distribution of the total weight of 8 students is the sum of 8 identically distributed random variables. Here we know the population mean weight per seventeen-year-old male, $\mu$, and the standard deviation, $\sigma$. So we can simply add the means and add the variances.

Distribution of a Total of n independent items:
If $X_1, X_2, \dots, X_n$ are n independent sample values, then the sample total is
$T_n = X_1 + X_2 + \dots + X_n$

Expected value of total:
$E[T_n] = E[X_1 + X_2 + \dots + X_n] = E[X_1] + \dots + E[X_n] = n\mu$

Variance of estimates of the total:
$\mathrm{Var}[T_n] = \mathrm{Var}[X_1 + X_2 + \dots + X_n] = \mathrm{Var}[X_1] + \dots + \mathrm{Var}[X_n] = n\sigma^2$ (if all have equal SD)

So the std. deviation of estimates of the total is:
$\sigma_T = \sqrt{n} \times \sigma$

Back to the example: Total weight of sample of 8 males:
$E(T_8) = 8(80) = 640$ kg.
$\mathrm{Var}(T_8) = 8(5^2) = 200$.
So $\sigma_T = \sqrt{200} = 14.14213562$ kg

99% CI for $T_8$ is $E(T_8) \pm z \times \sigma_T = 640 \pm 2.576 \times 14.14\ldots = 640$ kg $\pm$ 36.43 kg (4sf)

ANSWER: The 99% CI for $T_8$, the total weight of the sample of 8 males, is:
603.6 kg < $T_8$ < 676.4 kg (all to 4sf)

1. Do 2009 NCEA AS90642 – Q1 (b) and (c)
2. Do Sigma:
- Old (2nd edition): pg. 183 – Ex. 11.3.
- or New: pg. 64 – Ex. 3.03.

LESSON 13 – Confidence Intervals for Population Totals
• STARTER: Revise the definition of a Confidence Interval.
• Notes on CI for population totals.
• Do NCEA AS90642 – 2009 paper: Q2c.
• Do NuLake p98-100 (mixed problems).
• Do NuLake practice assessment (p101).

Confidence Intervals for the Sample Total, $T_n$
These problems come in 2 types, depending on whether you’re given the population mean, $\mu$, or the mean from a sample, $\bar{X}$.
Type 1: Based on $\mu$, a known population mean (looked at last lesson).
Type 2: Based on $\bar{X}$, the mean from a random sample (look at today).
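As a check on the Type 1 worked example (eight males, known mean 80 kg and standard deviation 5 kg), the "add the means, add the variances" rule can be sketched in a few lines. This is an illustrative sketch only: the function name `total_interval_known_mu` is invented here, and it assumes the items are independent and that a normal model for the total is reasonable.

```python
from math import sqrt
from statistics import NormalDist


def total_interval_known_mu(mu, sigma, n, level):
    """Interval for the total of n independent items (hypothetical helper).

    Assumes the population mean mu and SD sigma per item are known.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 2.576 for 99%
    expected_total = n * mu                    # E[T_n] = n * mu
    sd_total = sqrt(n) * sigma                 # sigma_T = sqrt(n) * sigma
    margin = z * sd_total
    return expected_total - margin, expected_total + margin


# Eight seventeen-year-old males: mu = 80 kg, sigma = 5 kg, 99% level.
low, high = total_interval_known_mu(mu=80, sigma=5, n=8, level=0.99)
print(f"{low:.1f} kg < T8 < {high:.1f} kg")
```

Note that the standard deviation of the total grows with the square root of n, not with n itself, because it is the variances that add.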
Type 2: Based on $\bar{X}$, the mean from a random sample:
This is where you are asked to construct a confidence interval for the total value of N items but the population mean per item is unknown. Instead we are told $\bar{X}$, the mean from a sample.
Then an estimate of the total value of N items is: $N\bar{X}$

To construct a CI for a total based on a sample:
1. Construct a confidence interval for the population mean per item, $\mu$.
2. Multiply the lower and upper bounds of the interval by N, the number of items.

Example: 68 Year 13 male students are to be selected at random from throughout NZ to win a prize of an overseas holiday after NCEA exams. The organisers need to estimate the likely total weight of the students, due to weight restrictions on the aircraft. The mean and SD of the population of all Year 13 males are unknown, so they conduct a pilot study by selecting a random sample of 30. This sample has a mean weight of 76 kg with a standard deviation of 7 kg. Construct a 96% CI for the expected total weight of 68 randomly selected Year 13 students.

Solution:
1. Construct a 96% confidence interval for the population mean $\mu$:
Interval is given by:
$\bar{X} \pm z \frac{s}{\sqrt{n}} = 76 \pm 2.054 \times \frac{7}{\sqrt{30}} = 76 \pm 2.625$
So the 96% CI for the population mean weight, $\mu$, is: 73.375 kg < $\mu$ < 78.625 kg

2. Multiply the lower and upper bounds of the interval by N, the number of items.
96% CI for the expected total weight of the 68 Y13s is:
(N × lower limit for $\mu$) < $T_N$ < (N × upper limit for $\mu$)
= (68 × 73.375) < $T_{68}$ < (68 × 78.625)
= 4990 kg < $T_{68}$ < 5347 kg (answer)

1. C.I. for Population Totals: Do 2009 NCEA paper (AS90642): Q2c.
2. Preparation for test: Do NuLake p98-100 (Mixed problems). Do NuLake practice assessment (p101).

Sample-size question from 2007 NCEA External Exam
A random sample of size n is taken from a population having a known standard deviation σ. A 95% confidence interval for the population mean is calculated using the sample mean. A second random sample of size 2n is taken from the same population and a 95% confidence interval for the population mean is calculated using its sample mean. How many times greater is the width of the first confidence interval than the width of the second confidence interval?

LESSON 14 – ASSESSMENT
What to study:
• Do NuLake mixed problems (p98) – merit level qs.
• NuLake practice assessment (p101)
• More practice (Achieved & Merit): Do Sigma Confidence Intervals Review exercise:
- Old: p241 – Ex. 14.6
- New: p95 – Ex. 5.03
• CIs for totals (Excellence) – past papers: 2009 Q2c, 2008 Q6, 2006 Q7
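For revision, the two-step Type 2 method (CI for the mean, then multiply both bounds by N) and the 2007 sample-size question can both be explored with a short script. This is a sketch under stated assumptions, not a model answer: the helper names `mean_ci` and `total_ci_from_sample` are made up, the z method is assumed throughout as in the notes, and the values σ = 1 and n = 50 used for the sample-size comparison are arbitrary placeholders, since the question gives no numbers.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(sample_mean, sd, n, level):
    """z interval for a population mean (hypothetical helper)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 2.054 for 96%
    margin = z * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


def total_ci_from_sample(sample_mean, sd, n, n_items, level):
    """Step 1: CI for the mean.  Step 2: multiply both bounds by N."""
    lo, hi = mean_ci(sample_mean, sd, n, level)
    return n_items * lo, n_items * hi


# Pilot sample of 30: mean 76 kg, SD 7 kg; total wanted for N = 68 students.
total_lo, total_hi = total_ci_from_sample(76, 7, 30, 68, level=0.96)
print(f"{total_lo:.0f} kg < T68 < {total_hi:.0f} kg")

# Sample-size question: same sigma and level, sample sizes n and 2n.
sigma, n = 1.0, 50  # arbitrary illustrative values (assumption)
lo1, hi1 = mean_ci(0.0, sigma, n, 0.95)
lo2, hi2 = mean_ci(0.0, sigma, 2 * n, 0.95)
width_ratio = (hi1 - lo1) / (hi2 - lo2)
print(f"First interval is {width_ratio:.3f} times as wide as the second")
```

Changing `sigma` or `n` in the last part leaves the printed ratio unchanged, because the interval width is proportional to $1/\sqrt{n}$.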
