
JANUARY 30, 2008
MATH 311
LAB 3 – SAMPLING DISTRIBUTIONS
DUE: MONDAY, FEBRUARY 4TH AT 3:00 P.M.
Let’s imagine that you rolled a 34641-sided (trust me) die and let’s let X = the number of dots showing
on the top face. X is, of course, a random variable. We’re going to investigate the distribution of X in
this lab.
Before we begin to investigate this distribution by running a computer simulation, let’s try to guess what
you’ll see. Imagine that you rolled this die 10,000 times.
Question 1: In a perfect world without random error, what should the mean value of X (which we
denote by μ_X) be? Be careful – think about a 6-sided die; its mean is not 3, it’s 3.5.
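The 6-sided hint can be checked by brute force. Here is a minimal Python sketch (Python is not part of this lab, and the function name die_mean is made up for illustration) that averages the faces of a fair die directly:

```python
from fractions import Fraction

def die_mean(sides: int) -> Fraction:
    """Exact mean of a fair die with faces 1..sides, found by averaging the faces."""
    faces = range(1, sides + 1)
    return Fraction(sum(faces), sides)

print(die_mean(6))  # 7/2, i.e. 3.5
```

Try it with a few other small dice before you answer Question 1 for the big one.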
Okay, enough imagining, let’s make Minitab do some work: Generate at least 10,000 rows of data with
a lower endpoint of 1 and an upper endpoint of 34641 by following Calc>Random Data>Integer.
(Note: the number 10,000 is not important for the following. The bigger we can make it, however, the
clearer the pattern we’re looking for will be. So if your computer can handle 50,000 or 100,000 – go for
it!).
Question 2: Make a Histogram and compute the mean and standard deviation of the data. Include these
in your report. Are the data normally distributed? Test it.
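If you would like to double-check your Minitab output some other way, the single-roll simulation can be sketched in Python. This is only an illustration under assumptions: Python is not part of the lab, the seed 311 is arbitrary, and the names rolls, counts and BINS are made up. The text histogram is a rough stand-in for Minitab's graph, and the sketch does not run a normality test.

```python
import random
import statistics

SIDES = 34641   # number of faces, from the lab
ROLLS = 10_000  # number of simulated rolls

rng = random.Random(311)  # arbitrary fixed seed so the run is repeatable
rolls = [rng.randint(1, SIDES) for _ in range(ROLLS)]  # randint includes both endpoints

print("mean:", statistics.fmean(rolls))
print("st. dev.:", statistics.stdev(rolls))

# Crude text histogram: count the rolls that land in 10 equal-width bins.
BINS = 10
counts = [0] * BINS
for x in rolls:
    counts[(x - 1) * BINS // SIDES] += 1
for i, c in enumerate(counts):
    print(f"bin {i}: {'#' * (c // 50)}")
```

Your numbers will differ from Minitab's because the random draws differ, but the overall shape should look familiar.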
Okay, great. Now let’s imagine that instead of rolling the die just once you rolled it 4 times and let Y =
the mean of the four rolls. (Y is, essentially, the mean of a sample of size 4.)
Further, let’s imagine that you computed this value of Y 10,000 times (10,000 samples of size 4).
Question 3: What do you think the mean value of this new random variable Y is? Think about it. This
is what we sometimes call a “thought experiment.” First we make an educated guess at what the
solution should be and then we run a simulation to check our intuition.
Okay – so let’s make Minitab simulate this situation. Generate 10,000 rows and 4 columns (C1-C4) of
Integer data with a lower endpoint of 1 and an upper endpoint of 34641 as follows:
Select Calc>Random Data>Integer
In the Generate _______ rows of data box enter 10000
In the Store in Column(s) box, enter C1-C4
In the Minimum Value box enter 1
In the Maximum Value box enter 34641
Notice that each row represents one sample of 4 rolls. We can compute the mean of this sample of size 4
as follows:
Click in the grey area at the top of column C5 and title it “Mean: n = 4”
Select Calc>Row Statistics
Click Mean
In the Input variables: box enter C1-C4
In the Store result in: box enter C5
Select OK
The mean for each of the 10,000 rows should appear in C5.
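For comparison, the same rows-and-columns layout can be sketched in Python. Again, this is an assumption-laden illustration and not the lab's method: the names samples and row_means and the seed are made up, with each inner list playing the role of one worksheet row across C1-C4.

```python
import random
import statistics

SIDES, ROWS, N = 34641, 10_000, 4

rng = random.Random(311)  # arbitrary fixed seed so the run is repeatable
# Each inner list is one row (one sample of N rolls), like columns C1-C4.
samples = [[rng.randint(1, SIDES) for _ in range(N)] for _ in range(ROWS)]
# One mean per row, like the values Minitab stores in C5.
row_means = [statistics.fmean(row) for row in samples]

print(row_means[:5])  # the first five sample means
```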
Question 4: Make a Histogram and compute the mean and standard deviation of these data in C5 (the
values of Y). Include these in your report. Are the data normally distributed? (Test it.)
Alrighty, now imagine that you rolled the die 100 times (a sample of size n = 100) (whew!) and let Z =
the mean of these 100 rolls.
Suppose further that you then computed the value of Z 10,000 times (that’s right, 10,000 sets of 100
rolls – that’s a lot of rolls!). (Z is a sample mean and, of course, will vary from sample to sample.) Tired
yet?
Well, enough “imagining.” Have Minitab generate 10,000 rows and 100 columns (C1-C100) of Integer
data with a lower endpoint of 1 and an upper endpoint of 34641. Just as you did above, compute the
mean of each row and store it in C101 (you should un-title column C5 and title C101 as “Mean: n =
100”).
Question 5: Make a Histogram and compute the mean and standard deviation of these data in C101 (the
values of Z – remember Z? What was it?). Include these in your report. Are the data normally
distributed? (Test it.)
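Here is a hedged Python sketch of the n = 100 case, together with one informal check on shape: for normal data, roughly 68% of the values fall within one standard deviation of the mean. This is not a substitute for the normality test you run in Minitab. The helper names sample_means and share_within_one_sd are made up for illustration, and the sketch uses 2,000 rows rather than 10,000 just to keep the run short.

```python
import random
import statistics

def sample_means(n, rows, sides=34641, seed=311):
    """Return `rows` sample means, each the mean of n rolls of a fair die."""
    rng = random.Random(seed)  # arbitrary fixed seed so the run is repeatable
    return [statistics.fmean(rng.randint(1, sides) for _ in range(n))
            for _ in range(rows)]

def share_within_one_sd(data):
    """Fraction of the values lying within one standard deviation of the mean."""
    m = statistics.fmean(data)
    s = statistics.stdev(data)
    return sum(abs(x - m) <= s for x in data) / len(data)

z = sample_means(n=100, rows=2_000)
print("mean:", statistics.fmean(z))
print("st. dev.:", statistics.stdev(z))
print("share within 1 st. dev.:", share_within_one_sd(z))
```

Calling sample_means(n=1, rows=2_000) and comparing the two shares is a quick way to see how different the single-roll data look.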
Have you noticed any patterns that the means and standard deviations seem to be following for various
sample sizes? Look closely at the numbers and, perhaps, round to the “tens-place.” Still don’t see
anything? Try simulating 20² = 400 rolls. Then, maybe 33² = 1089 rolls (you might have to reduce the
number of rows used – it depends on how much memory the computer you’re using has). Ready to
generalize? Good, answer the last question:
Question 6: Lastly, let’s generalize (i.e. look at the pattern and come up with a formula): imagine that
you rolled the die n times and let X̄ = the mean of the n rolls.
Make a conjecture about the shape of the distribution of X̄, its mean (μ_X̄), and its standard deviation
(σ_X̄).
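To line the numbers up side by side before you write your conjecture, a short Python sketch can print the mean and standard deviation of the sample means for several sample sizes. As before, this is an illustration under assumptions (made-up helper name sample_means, arbitrary seed, only 1,000 rows per sample size to keep it quick). It prints the numbers only; spotting the formula is still up to you.

```python
import random
import statistics

def sample_means(n, rows, sides=34641, seed=311):
    """Return `rows` sample means, each the mean of n rolls of a fair die."""
    rng = random.Random(seed)  # arbitrary fixed seed so the run is repeatable
    return [statistics.fmean(rng.randint(1, sides) for _ in range(n))
            for _ in range(rows)]

ROWS = 1_000
summary = {}  # maps sample size n -> (mean of the sample means, their st. dev.)
for n in (1, 4, 100, 400):
    means = sample_means(n, ROWS)
    summary[n] = (statistics.fmean(means), statistics.stdev(means))
    print(f"n = {n:3d}   mean = {summary[n][0]:9.1f}   st. dev. = {summary[n][1]:8.1f}")
```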