Bus 221 Excel Notes: Chapter 11

Sampling Distributions: Background Info

Consider the distribution of the GPA of current CWU students, and assume it is approximately normally distributed. That distribution has a mean GPA, which we call the population mean. We can take a sample of CWU students and calculate the mean GPA for the sample. It turns out that when you take a sample, the value of the mean for that sample is a random variable. The reason is that you do not know which individuals from the population will end up in your sample.

In this exercise you will take a sample and calculate the mean. However, you will not take a sample of students, but rather a sample of randomly generated numbers. You can think of this as drawing numbers from a hat. Rather than the numbers having an approximately normal distribution, like GPA above, the numbers will have a uniform distribution. That is, there is an equal chance of getting any number between 0 and 1.

In the first part of the exercise, the sample size will be 10; that is, each sample consists of 10 numbers. You will take 100 samples of size 10 and calculate the sample mean for each, which gives you 100 sample means. You will then generate a histogram using Excel to reveal the distribution of the 100 sample means. You will see that the shape of the distribution is approximately normal. This is what the Central Limit Theorem tells us to expect.

In the second part of the exercise, you start fresh and take brand new samples. This time the sample size will be 100. You will take 100 samples of size 100 and calculate the sample mean for each, which again gives you 100 sample means. You will then generate a histogram using Excel to reveal the distribution of the 100 sample means. You will see that the shape of the distribution is again approximately normal.
However, the distribution will be much narrower than in the first part of the exercise. This is also what the Central Limit Theorem tells us to expect.
o Why?

The purpose of this exercise is to exhibit the Central Limit Theorem.

Sampling Distributions: Excel Manipulations

Open up an Excel spreadsheet.

Note: How To Select Cells
o One cell: click on the cell.
o Multiple cells (any rectangular area of cells): click on the cell at the upper far left of the group you want to select, hold the click, drag the cursor over to the cell at the lower far right of the group, and release the click.
o A row of cells: click on the cell on the far left of the row, hold the click, drag the cursor over to the cell on the far right, and release the click.
o A column of cells: click on the cell at the top of the column, hold the click, drag the cursor down to the cell at the bottom of the column, and release the click.

Create 100 different random samples of observations on a random variable, where the random variable has a uniform distribution over the interval [0, 1], and the sample size is 10.

Create a sample with 10 observations, and calculate the mean.
o Type the following formula into cell A2 and press return: =RAND()
o This randomly generates an observation on a variable which has a uniform distribution over the interval [0, 1].
o Copy this formula into cells A3 through A11 by selecting cell A2 (just click on it), holding the cursor over the lower right corner of cell A2 until a “+” sign appears, then click, hold the click, and drag the cursor down to cell A11. You have now created a sample of 10 observations.
o Now type the following into cell A1: =AVERAGE(A2:A11)

Create 99 more samples with 10 observations, and calculate the mean for each sample:
o Select all of the cells in the range A1 through A11.
o Copy the formulas in cells A1 through A11 by holding the cursor over the lower right corner of cell A11 until a “+” sign appears, then click, hold the click, and drag the cursor straight to the right until you reach column CV. You will now have 100 samples of size 10, with the mean calculated in the first row of each sample.

Generate the histogram for the one hundred sample means you just calculated.
o First make the classes (ranges) for the histogram. Excel calls these bins.
   Just below the first column of data, in cell A14, type a 0.
   In the cell below that type a 0.05.
   Now select those two cells, and drag down as above until you see the number 1 (this will be in cell A34).
o Now install the Analysis ToolPak.
   Click on “File” at the top left of the page, then click on “Options,” then click on “Add-Ins,” then select “Excel Add-Ins” from the drop down menu next to “Manage,” then select “Go,” then put a check in the box next to “Analysis ToolPak,” and then select “OK.”
o Now generate a histogram.
   Click on “Data” at the top of the spreadsheet, then click on “Data Analysis,” then select “Histogram” and click “OK.”
   In the “Input Range” box type “a1:cv1”. This references all of the data for which the histogram will display the distribution. Excel refers to the data as an “array.”
   In the “Bin Range” box type “a14:a34”. This references the classes (ranges) that will be used in the histogram.
   Select the circle next to “Output Range” and type C14 into the box to the right. This tells Excel to place the histogram output starting at cell C14.
   Check the box next to “Chart Output.” This tells Excel to generate a histogram. Without this selection, Excel will only generate a table displaying the distribution of the variable.
   Click “OK.”
o Note the shape of the histogram. Are you surprised that it looks normal? This is what the Central Limit Theorem predicts.
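
For readers who want to check the first part outside of Excel, the steps above can be sketched in Python. This is only an illustration under stated assumptions, not part of the assignment: the function names sample_means and bin_counts and the seed 221 are made up for this sketch, random.random() stands in for =RAND(), and the binning rule (a value is counted in the first bin whose upper edge is greater than or equal to it) is an assumption about how the Histogram tool groups values.

```python
import random
import statistics
from bisect import bisect_left


def sample_means(num_samples, sample_size, rng):
    """Mean of each sample; plays the role of row 1 (A1:CV1) in the sheet."""
    return [
        statistics.fmean([rng.random() for _ in range(sample_size)])
        for _ in range(num_samples)
    ]


def bin_counts(values, edges):
    """Count values per bin, where each edge is the upper limit of its bin.

    Assumption: a value goes in the first bin whose edge is >= the value.
    The extra last slot catches anything above the final edge.
    """
    counts = [0] * (len(edges) + 1)
    for value in values:
        counts[bisect_left(edges, value)] += 1
    return counts


rng = random.Random(221)  # hypothetical seed, only so the run is repeatable
means = sample_means(100, 10, rng)  # 100 samples of size 10

# Bins 0, 0.05, 0.10, ..., 1.00 (21 values, like cells A14:A34)
edges = [round(0.05 * i, 2) for i in range(21)]
counts = bin_counts(means, edges)

# Text histogram: one star per sample mean that falls in the bin
for edge, count in zip(edges, counts):
    print(f"{edge:4.2f} {'*' * count}")
```

The stars should pile up around 0.5 and thin out on both sides, which is the same bell shape the Excel chart shows.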
SECOND PART OF THE EXERCISE

Create 100 different random samples of observations on a random variable, where the random variable has a uniform distribution over the interval [0, 1], and the sample size is 100.

Click on “Sheet2” at the bottom left of your spreadsheet. This will go to a second worksheet in your spreadsheet file.

Create a sample with 100 observations, and calculate the mean.
o Type the following formula into cell A2 and press return: =RAND()
o This randomly generates an observation on a variable which has a uniform distribution over the interval [0, 1].
o Copy this formula into cells A3 through A101 by selecting cell A2 (just click on it), holding the cursor over the lower right corner of cell A2 until a “+” sign appears, then click, hold the click, and drag the cursor down to cell A101. You have now created a sample of 100 observations.
o Now type the following into cell A1: =AVERAGE(A2:A101)

Create 99 more samples with 100 observations, and calculate the mean for each sample:
o Select all of the cells in the range A1 through A101.
o Copy the formulas in cells A1 through A101 by holding the cursor over the lower right corner of cell A101 until a “+” sign appears, then click, hold the click, and drag the cursor straight to the right until you reach column CV. You will now have 100 samples of size 100, with the mean calculated in the first row of each sample.

Generate the histogram for the one hundred sample means you just calculated.
o First make the classes (ranges) for the histogram. Excel calls these bins.
   Just below the first column of data, in cell A103, type a 0.
   In the cell below that type a 0.05.
   Now select those two cells, and drag down as above until you see the number 1 (this will be in cell A123).
o Now install the Analysis ToolPak (skip this step if it is already installed from the first part).
   Click on “File” at the top left of the page, then click on “Options,” then click on “Add-Ins,” then select “Excel Add-Ins” from the drop down menu next to “Manage,” then select “Go,” then put a check in the box next to “Analysis ToolPak,” and then select “OK.”
o Now generate a histogram.
   Click on “Data” at the top of the spreadsheet, then click on “Data Analysis,” then select “Histogram” and click “OK.”
   In the “Input Range” box type “a1:cv1”. This references all of the data for which the histogram will display the distribution. Excel refers to the data as an “array.”
   In the “Bin Range” box type “a103:a123”. This references the classes (ranges) that will be used in the histogram.
   Select the circle next to “Output Range” and type C103 into the box to the right. This tells Excel to place the histogram output starting at cell C103.
   Check the box next to “Chart Output.” This tells Excel to generate a histogram. Without this selection, Excel will only generate a table displaying the distribution of the variable.
   Click “OK.”
o Note the shape of the histogram. Compare it to the distribution from the first part. Note how much narrower this distribution is. This is also what the Central Limit Theorem predicts.
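
To put a number on "how narrow," the comparison between the two parts can be sketched in Python. This is a rough illustration, not part of the assignment: the names sample_means and results and the seed 221 are made up for this sketch. It relies on two standard facts: a uniform variable on [0, 1] has standard deviation sqrt(1/12), about 0.289, and the standard deviation of a sample mean is the population standard deviation divided by the square root of the sample size.

```python
import math
import random
import statistics


def sample_means(num_samples, sample_size, rng):
    """Mean of each of num_samples samples of uniform(0, 1) numbers."""
    return [
        statistics.fmean([rng.random() for _ in range(sample_size)])
        for _ in range(num_samples)
    ]


rng = random.Random(221)  # hypothetical seed, only so the run is repeatable

# Standard deviation of a single uniform(0, 1) observation
population_sd = math.sqrt(1 / 12)

results = {}
for n in (10, 100):  # the two sample sizes used in the exercise
    means = sample_means(100, n, rng)
    observed = statistics.stdev(means)      # spread of the 100 sample means
    theory = population_sd / math.sqrt(n)   # predicted spread: sd / sqrt(n)
    results[n] = (observed, theory)
    print(f"n={n:3d}  observed sd={observed:.4f}  sd/sqrt(n)={theory:.4f}")
```

The predicted spread is about 0.091 for samples of size 10 and about 0.029 for samples of size 100, so the second histogram should be roughly a third as wide as the first. The observed values will differ a little from the predictions because only 100 samples are drawn.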