Chapter 11 Excel Notes

Bus 221 Excel Notes: Chapter 11
Sampling Distributions – Background Info




Consider the distribution of the GPA of a current CWU student – assume it is approximately
normally distributed. That distribution has a mean GPA, which we call the population mean. We
can take a sample of CWU students and calculate the mean GPA for the sample. It turns out
that when you take a sample, the value of the mean for that sample is a random variable. The
reason is that you do not know which individuals from the population will end up in your
sample. In this exercise you will take a sample and calculate the mean. However, you will
not take a sample of students, but rather a sample of randomly generated numbers. You can
think of this as drawing numbers from a hat. Rather than the numbers having an
approximately normal distribution, like GPA above, the numbers will have a uniform
distribution. That is, there is an equal chance of getting any number between 0 and 1.
In this exercise, you will take a sample of numbers. The sample size will be 10 – that is, you
will take a sample of 10 numbers. You will then take another sample of 10 numbers. You
will actually take 100 samples of size 10. For each sample, you will then calculate the
sample mean. You will then have 100 sample means. You will then generate a histogram
using Excel to reveal the distribution of the 100 sample means. You will see that the shape of
the distribution is approximately normal. This is what the Central Limit Theorem tells us to
expect.
In the second part of the exercise, you start fresh, and take a brand new sample. This time
the sample size will be 100. You will then take another sample of 100 numbers. You will
actually take 100 samples of size 100. For each sample, you will then calculate the sample
mean. You will then have 100 sample means. You will then generate a histogram using
Excel to reveal the distribution of the 100 sample means. You will see that the shape of the
distribution is again approximately normal. However, the distribution will be much
narrower than in the prior exercise. This is also what the Central Limit Theorem tells us to
expect.
o Why?
The purpose of this exercise is to exhibit the Central Limit Theorem.
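One way to think about the "Why?" above is to work out how spread out the sample means should be. The sketch below is not part of the Excel exercise. It assumes two standard results: a uniform variable on [0, 1] has mean 1/2 and variance 1/12, and the mean of n independent draws has standard deviation equal to the population standard deviation divided by the square root of n. The names POP_SD and standard_error are made up for this illustration.

```python
import math

# Assumed standard facts for a uniform variable on [0, 1]:
# mean = 1/2 and variance = 1/12.
POP_MEAN = 0.5
POP_SD = math.sqrt(1 / 12)


def standard_error(n):
    """Standard deviation of the mean of n independent uniform draws."""
    return POP_SD / math.sqrt(n)


se_10 = standard_error(10)
se_100 = standard_error(100)

print(f"population sd        : {POP_SD:.4f}")
print(f"sd of means, n = 10  : {se_10:.4f}")
print(f"sd of means, n = 100 : {se_100:.4f}")
print(f"ratio                : {se_10 / se_100:.2f}")
```

Under these assumptions the sample means are centered at 0.5 in both parts, with a spread of roughly 0.091 for samples of size 10 and roughly 0.029 for samples of size 100. The ratio is the square root of 10, a little over 3.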
Sampling Distributions – Excel Manipulations




Open up an Excel spreadsheet
Note: How To Select Cells
o One cell: click on the cell
o Multiple cells (any rectangular area of cells): click on the cell at the upper far left of
the group you want to select, hold the click, drag the cursor over to the cell at the lower
far right of the group you want to select, and release the click.
o A row of cells: click on the cell on the far left of the row, hold the click, drag the cursor
over to the cell on the far right, and release the click.
o A column of cells: click on the cell on the top of the column, hold the click, drag the
cursor over to the cell on the bottom of the column, and release the click.
Create 100 different random samples of observations on a random variable, where the
random variable has a uniform distribution over the interval [0, 1], and the sample size is 10.
Create a sample with 10 observations, and calculate the mean.





o Type the following formula into cell A2 and press return: =RAND()
o This randomly generates an observation on a variable which has a uniform distribution
over the interval [0, 1].
o Copy this formula into cells A3 – A11 by selecting cell A2 (just click on it), holding the
cursor over the lower right corner of cell A2 until a “+” sign appears, then click, hold the
click, and drag the cursor down to cell A11. You have now created a sample of 10
observations.
o Now type the following into cell A1: =average(a2:a11)
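For readers who want to see the same two steps outside Excel, here is a minimal Python sketch. It assumes that random.random() is an acceptable stand-in for =RAND() and that statistics.mean plays the role of =average(a2:a11). The seed value is arbitrary and is only there so the sketch gives the same numbers each time it runs.

```python
import random
from statistics import mean

random.seed(221)  # arbitrary seed, used only to make the sketch repeatable

# Ten draws, playing the role of cells A2 through A11 (each holding =RAND()).
sample = [random.random() for _ in range(10)]

# Plays the role of cell A1: =average(a2:a11)
sample_mean = mean(sample)

print(sample)
print(sample_mean)
```

Unlike this seeded sketch, =RAND() draws new numbers every time the worksheet recalculates, so the values in your spreadsheet will keep changing as you work.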
Create 99 more samples with 10 observations, and calculate the mean for each sample:
o Select all of the cells in the range A1 – A11.
o Copy the formulas in cells A1 – A11 by holding the cursor over the lower right corner of
cell A11 until a “+” sign appears, then click, hold the click, and drag the cursor straight to
the right until you reach column CV. You will now have 100 samples of size 10,
with the mean calculated in the first row of each sample.
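The layout of the worksheet at this point (one sample per column, with the means in row 1) can be sketched in Python as follows. This is an illustration rather than part of the exercise. The helper column_number is a made-up name, and it is only there to check that column CV really is the 100th column.

```python
import random
from statistics import mean

random.seed(221)  # arbitrary seed, used only to make the sketch repeatable

N_SAMPLES = 100   # columns A through CV
SAMPLE_SIZE = 10  # rows 2 through 11


def column_number(letters):
    """Convert an Excel column label such as 'CV' to its 1-based position."""
    n = 0
    for ch in letters.upper():
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


# One inner list per worksheet column.
samples = [[random.random() for _ in range(SAMPLE_SIZE)]
           for _ in range(N_SAMPLES)]

# Plays the role of row 1 (cells A1 through CV1).
sample_means = [mean(column) for column in samples]

print(column_number("CV"))
print(len(sample_means), min(sample_means), max(sample_means))
```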
Generate the histogram for the one hundred sample means you just calculated.
o First make the classes (ranges) for the histogram. Excel calls these bins.
▪ Just below the first column of data, in cell A14, type a 0.
▪ In the cell below that type a 0.05.
▪ Now select the two above cells, and drag down as above until you see a number 1 (in cell A34).
o Now install the Analysis ToolPak.
▪ Click on “File” at the top left of the page, then click on “Options,” then click on
“Add-ins,” then select “Excel Add-ins” from the drop-down menu next to “Manage,”
then select “Go,” then put a check in the box next to “Analysis ToolPak,” and then
select “OK.”
o Now generate a histogram.
▪ Click on “Data” at the top of the spreadsheet, then click on “Data Analysis,” then
select “Histogram” and click “OK.”
▪ In the “Input Range” box type “a1:cv1”
▪ This references all of the data for which the histogram will display the
distribution. Excel refers to the data as an “array.”
▪ In the “Bin Range” box type “a14:a34”
▪ This references the classes (ranges) that will be used in the histogram.
▪ Select the circle next to “Output Range” and type C14 into the box to the right.
▪ This tells Excel to place the histogram output starting at cell C14.
▪ Check the box next to “Chart Output.”
▪ This tells Excel to generate a histogram chart. Without this selection, Excel will only
generate a table displaying the distribution of the variable.
▪ Click “OK.”
o Note the shape of the histogram. Are you surprised that it looks approximately normal?
This is what the Central Limit Theorem predicts.
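To see what the frequency table behind the histogram contains, the sketch below builds the 21 bin values (0, 0.05, ..., 1) and counts how many sample means fall in each class. It assumes the Histogram tool counts a value in a bin when the value is less than or equal to that bin number and greater than the previous one, with anything above the last bin going into a "More" row. The function name excel_style_counts is made up for this illustration, and the text bars are only a rough stand-in for the chart.

```python
import random
from bisect import bisect_left
from statistics import mean

random.seed(221)  # arbitrary seed, used only to make the sketch repeatable

# 100 sample means, each from a sample of 10 uniform draws.
sample_means = [mean([random.random() for _ in range(10)])
                for _ in range(100)]

# Bin values 0, 0.05, ..., 1 (21 values, one per cell of the bin range).
bins = [i / 20 for i in range(21)]


def excel_style_counts(values, bin_limits):
    """Count values per bin, treating each bin value as an inclusive upper limit.

    The extra last entry holds values above the final bin (the "More" row).
    """
    counts = [0] * (len(bin_limits) + 1)
    for v in values:
        # Index of the first bin value that is >= v.
        counts[bisect_left(bin_limits, v)] += 1
    return counts


counts = excel_style_counts(sample_means, bins)

for limit, count in zip(bins, counts):
    print(f"{limit:4.2f} | {'#' * count}")
print(f"More | {'#' * counts[-1]}")
```

The bars should pile up around 0.5 and thin out toward 0 and 1, which is the bell shape the exercise asks you to notice.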
SECOND PART OF THE EXERCISE
Create 100 different random samples of observations on a random variable, where the
random variable has a uniform distribution over the interval [0, 1], and the sample size is
100.
Click on “Sheet2” at the bottom left of your spreadsheet. This takes you to a second worksheet
in your spreadsheet file.



Create a sample with 100 observations, and calculate the mean.
o Type the following formula into cell A2 and press return: =RAND()
o This randomly generates an observation on a variable which has a uniform distribution
over the interval [0, 1].
o Copy this formula into cells A3 – A101 by selecting cell A2 (just click on it), holding the
cursor over the lower right corner of cell A2 until a “+” sign appears, then click, hold the
click, and drag the cursor down to cell A101. You have now created a sample of 100
observations.
o Now type the following into cell A1: =average(a2:a101)
Create 99 more samples with 100 observations, and calculate the mean for each sample:
o Select all of the cells in the range A1 – A101.
o Copy the formulas in cells A1 – A101 by holding the cursor over the lower right corner
of cell A101 until a “+” sign appears, then click, hold the click, and drag the cursor straight
to the right until you reach column CV. You will now have 100 samples of size
100, with the mean calculated in the first row of each sample.
Generate the histogram for the one hundred sample means you just calculated.
o First make the classes (ranges) for the histogram. Excel calls these bins.
▪ Just below the first column of data, in cell A103, type a 0.
▪ In the cell below that type a 0.05.
▪ Now select the two above cells, and drag down as above until you see a number 1 (in cell A123).
o Now install the Analysis ToolPak (skip this step if it is already installed).
▪ Click on “File” at the top left of the page, then click on “Options,” then click on
“Add-ins,” then select “Excel Add-ins” from the drop-down menu next to “Manage,”
then select “Go,” then put a check in the box next to “Analysis ToolPak,” and then
select “OK.”
o Now generate a histogram.
▪ Click on “Data” at the top of the spreadsheet, then click on “Data Analysis,” then
select “Histogram” and click “OK.”
▪ In the “Input Range” box type “a1:cv1”
▪ This references all of the data for which the histogram will display the
distribution. Excel refers to the data as an “array.”
▪ In the “Bin Range” box type “a103:a123”
▪ This references the classes (ranges) that will be used in the histogram.
▪ Select the circle next to “Output Range” and type C103 into the box to the right.
▪ This tells Excel to place the histogram output starting at cell C103.
▪ Check the box next to “Chart Output.”
▪ This tells Excel to generate a histogram chart. Without this selection, Excel will only
generate a table displaying the distribution of the variable.
▪ Click “OK.”
o Note the shape of the histogram. Compare it to your prior distribution. Note the
difference in how narrow the distribution is. This is also what the Central Limit Theorem
predicts.
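If you want to check the comparison numerically rather than by eye, the following Python sketch repeats both parts of the exercise and reports the spread of the 100 sample means in each. It is only an illustration. The function name simulate_sample_means and the seed are made up here, and your Excel numbers will differ because =RAND() produces different draws.

```python
import random
from statistics import mean, stdev

random.seed(221)  # arbitrary seed, used only to make the sketch repeatable


def simulate_sample_means(sample_size, n_samples=100):
    """Return the means of n_samples samples of uniform draws on [0, 1)."""
    return [mean([random.random() for _ in range(sample_size)])
            for _ in range(n_samples)]


means_10 = simulate_sample_means(10)    # first part: samples of size 10
means_100 = simulate_sample_means(100)  # second part: samples of size 100

for label, means in (("n = 10", means_10), ("n = 100", means_100)):
    print(f"{label:8s} mean of means = {mean(means):.3f}   "
          f"sd of means = {stdev(means):.3f}")
```

Both sets of means should be centered near 0.5, and the standard deviation for samples of size 100 should be noticeably smaller than for samples of size 10, which matches the narrower histogram.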