Lab 8 Sampling Distributions

The Situation: Joe, Cally, and Talia are 3 clerks who work in the small claims office at the government building in a
large metropolitan city. Their supervisor periodically checks up on them to see how long it takes each of them to
process claims. In order to deal with the large volume of small claims that come through the office each day, each
clerk must take 6 minutes or less to process a claim, on the average. It would be too time consuming, if not
impossible, for the supervisor to monitor every transaction performed by each clerk. The supervisor needs to decide
on the best way to make a reliable estimate of the time it takes each clerk to process claims. In general, she has one
of two options: 1) the supervisor will randomly pick a small sample (somewhere between 2 and 4 claims) and
calculate the average time it takes a clerk to process claims, or 2) the supervisor will randomly pick a large sample
(somewhere between 16 and 25 claims) and calculate the average time it takes a clerk to process claims.
If a clerk takes longer than 6 minutes, on the average, to process small claims, the clerk may be moved to a less
demanding position, or possibly fired. The supervisor does not want to make an incorrect decision and dismiss a
clerk who is actually performing their job well. Your task is to find the method that will help the supervisor make
the best decision.
Overview: The supervisor isn’t able to observe the entire population for a particular clerk. She has to make her
decision based on a single sample of the worker’s performance. From this sample the supervisor can calculate
statistics like the sample mean and sample median. However, we know that if she took a different sample she could
get different values for the sample statistics. Thus, she shouldn’t expect the sample mean to be exactly equal to the
population mean. Let’s find out how well the sample mean estimates the population mean.
In this lab you will use a program developed by Robert delMas at the University of Minnesota to investigate
distributions of sample means. Once you understand how sample means behave, you can decide which monitoring
method is best for the employees.
Case 1: Population is Normal (Joe)
Let’s start with Joe. On average, Joe takes 5 minutes to process a claim,
and his processing time tends to have a Normal distribution. However, the
supervisor does not know this information, and must base all conclusions on
sample data. We want to know the chances of Joe getting unlucky and
showing his supervisor a sample mean of more than 6 minutes per claim,
even though he's a 5-minute claim-processor. To find this out, we need to
know the behavior of the sampling distribution of the sample means.
Program Instructions: Locate and double-click the Sampling SIM
program. You will see a screen similar to the figure on the right.
The program lets you create predefined population distributions by simply clicking on a button.
Hold the mouse button down on the
button, slide the mouse
down to highlight NORMAL, and let go. This creates a population with the
shape of a Normal distribution (see figure above). The characteristics of this
population distribution are displayed below the graph for the population, e.g.
population mean of μ = 5 and standard deviation σ = 1.805. This graph
represents the population distribution of Joe’s claim-processing. The mean
(blue μ) and median (red M) are represented by vertical lines.
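As a cross-check on what the simulation should show, the chance that Joe's sample mean exceeds 6 can be computed directly when the population is Normal, because the mean of n claims is then Normal with mean μ and standard deviation σ/sqrt(n). The sketch below uses Python's standard library rather than the Sampling SIM program. The function name prob_mean_above is an illustrative choice and is not part of the lab software.

```python
from math import sqrt
from statistics import NormalDist

MU = 5.0        # Joe's population mean in minutes, from the lab
SIGMA = 1.805   # Joe's population standard deviation, from the lab
CUTOFF = 6.0    # supervisor's limit on the average processing time


def prob_mean_above(cutoff, mu, sigma, n):
    """P(sample mean > cutoff) for samples of size n from a Normal population."""
    sampling_dist = NormalDist(mu, sigma / sqrt(n))
    return 1.0 - sampling_dist.cdf(cutoff)


for n in (2, 9, 16):
    p = prob_mean_above(CUTOFF, MU, SIGMA, n)
    print(f"n = {n:2d}: P(sample mean > 6) is about {p:.3f}")
```

The proportions you read off the red tab in the program should land near these values, but they will not match exactly because each run uses only 500 random samples.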
1. Let’s simulate Monitoring Method 1, where the supervisor picks two claims at random.
• Go to the Menu bar and select Windows -> Sampling Distribution.
• The Sample Size is set to 1. Change the Sample Size to 2.
• Click once on the New Series button so it reads Add More.
• Click once on the button labeled Draw Samples.
The program will draw one sample, calculate its sample mean, and then place a green square in the graph area
to represent the location of the sample mean. Click again (just once) on the Draw Samples button. A second
sample is drawn and its sample mean is plotted on the graph. Look at the box labeled Total Samples Drawn.
The number 2 is in the box, indicating that a total of 2 samples have been drawn. Click on the Draw Samples
button eight more times so that you have a total of 10 sample means plotted.
2. Change the value in the Number of Samples box from 1 to 490. Now click on Draw Samples and the program
will draw 490 more samples and plot the sample mean for each. Check that you have a total of 500 in the
Total Samples Drawn box. With a total of 500 samples, you get an even better
idea of how the sample mean varies from sample to sample.
4. Go to the Joe: Normal Distribution column of the worksheet and sketch a graph that matches the graph created by the
Sampling Distributions program. Be sure to label the x axis.
5. Just below where you placed the graph, you will see a place to record the Mean of x̄ and the sd of x̄.
• The MEAN of Sample Means box on the computer screen tells you the average of the sample means. This
measures the center of the distribution of sample means. Locate this value on the computer screen and write
it in for the Mean of x̄ just below where you placed the graph.
• The Standard Dev. of Sample Means box on the computer screen tells you the standard deviation of the
sample means. This measures the variability among the sample means. Locate this value on the computer
screen and write it in for the sd of x̄.
6. If the Central Limit Theorem (CLT) holds, we would expect the sd of x̄ to be close to σ/sqrt(n), where σ is the
original population standard deviation.
7. Move the red tab to find the actual proportion of x̄ values that fell above 6. On your worksheet, write the
proportion of times that Joe would have had a sample that his supervisor would consider poor.
STOP. Before you proceed, ask the instructor to check your work.
Repeat each step for sample sizes of 9 and 16.
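If you want to see the same steps outside the Sampling SIM program, the sketch below imitates them in Python: it draws 500 samples of each size from a Normal population with Joe's μ and σ, then reports the mean of the sample means, the sd of the sample means, the value σ/sqrt(n), and the proportion of sample means above 6. It is only an approximation of what the program does. The function name simulate_sample_means and the fixed seed are illustrative assumptions, and an idealized Normal model can occasionally produce a negative time.

```python
import random
from math import sqrt
from statistics import mean, stdev

MU, SIGMA = 5.0, 1.805   # Joe's population values from the lab
CUTOFF = 6.0             # supervisor's limit in minutes


def simulate_sample_means(n, num_samples=500, seed=0):
    """Draw num_samples samples of size n and return their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        mean(rng.gauss(MU, SIGMA) for _ in range(n))
        for _ in range(num_samples)
    ]


for n in (2, 9, 16):
    means = simulate_sample_means(n)
    prop_above = sum(m > CUTOFF for m in means) / len(means)
    print(
        f"n = {n:2d}: mean of sample means = {mean(means):.3f}, "
        f"sd of sample means = {stdev(means):.3f}, "
        f"sigma/sqrt(n) = {SIGMA / sqrt(n):.3f}, "
        f"proportion above 6 = {prop_above:.3f}"
    )
```

Changing the seed plays the role of clicking New Series: the numbers shift a little from run to run, but the sd of the sample means should stay close to σ/sqrt(n).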
Case 2: The Population is not Normal (Talia)
Now, let’s take a look at another employee’s claim-processing times. Talia’s claim-processing times follow a
left-skewed distribution with mean μ = 6.81 and standard deviation σ = 2.063. How will the means for samples taken
from her distribution behave? To find out, return to the Population window (Windows -> Population), click on
Normal, and then select "Skewed -". This will create Talia’s skewed left population distribution. This distribution is
also shown at the top of the second column of the worksheet.
Follow the previous steps to look at distributions of sample means for the same three sample sizes we used with Joe’s
population (n = 2, n = 9, and n = 16).
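To get a feel for how skewness carries over to sample means, here is a hedged sketch in Python. The exact shape of the program's "Skewed -" population is not given in this lab, so the code assumes a reflected gamma distribution, chosen only because it is skewed left and can be tuned to Talia's μ = 6.81 and σ = 2.063. The shape value 4.0 and the helper names are assumptions, and this stand-in population can occasionally produce a negative time.

```python
import random
from math import sqrt
from statistics import mean, pstdev

MU, SIGMA = 6.81, 2.063      # Talia's population values from the lab
SHAPE = 4.0                  # assumed gamma shape; NOT taken from the lab software
SCALE = SIGMA / sqrt(SHAPE)  # makes the population sd equal SIGMA
SHIFT = MU + SHAPE * SCALE   # reflecting the gamma points the long tail left


def draw_time(rng):
    """One draw from the assumed left-skewed population."""
    return SHIFT - rng.gammavariate(SHAPE, SCALE)


def skewness(values):
    """Moment-based skewness; negative values mean a longer left tail."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def sample_means(n, num_samples=500, seed=0):
    """Draw num_samples samples of size n and return their sample means."""
    rng = random.Random(seed)
    return [mean(draw_time(rng) for _ in range(n)) for _ in range(num_samples)]


for n in (2, 9, 16):
    means = sample_means(n)
    above = sum(m > 6 for m in means) / len(means)
    print(f"n = {n:2d}: mean = {mean(means):.3f}, sd = {pstdev(means):.3f}, "
          f"skewness = {skewness(means):.2f}, proportion above 6 = {above:.3f}")
```

Under these assumptions, compare the skewness of the sample means across the three sample sizes with the shapes you sketch on the worksheet.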
Case 3: Erratic Behavior (Cally)
Now we look at an employee whose claim-processing times are quite irregular. Cally is pretty erratic; the population
of her claim-processing times is presented at the top of the third column on the worksheet. Her distribution has a
population mean of μ = 5 with standard deviation σ = 3.410. What will distributions of sample means from Cally’s
population look like? To find out, return to the Population window (Windows -> Population). You will see four
buttons at the bottom of the Population Window. Each button has the outline of a distribution. Locate the last button
along the bottom with the blue outline. Click once on the button to create Cally’s population. Make sure you have
the correct population by checking that μ = 5 and σ = 3.410.
Follow the previous steps to look at distributions of sample means for the same three sample sizes we used with
Joe’s population (n = 2, n = 9, and n = 16).
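The sketch below prints a rough text histogram of 500 sample means for each sample size, as a stand-in for the graphs the program draws. Cally's actual population shape is only shown on the worksheet, so the code assumes a hypothetical two-cluster population (a group of fast claims and a group of slow claims) tuned to μ = 5 and σ = 3.410. The value HALF_GAP = 3.3 and all function names are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

MU, SIGMA = 5.0, 3.410   # Cally's population values from the lab
HALF_GAP = 3.3           # assumed distance of each cluster from MU (hypothetical)
WIDTH = sqrt(12 * (SIGMA**2 - HALF_GAP**2))  # cluster width that yields SIGMA


def draw_time(rng):
    """One draw from the assumed two-cluster (fast or slow) population."""
    center = MU + rng.choice((-HALF_GAP, HALF_GAP))
    return rng.uniform(center - WIDTH / 2, center + WIDTH / 2)


def sample_means(n, num_samples=500, seed=0):
    """Draw num_samples samples of size n and return their sample means."""
    rng = random.Random(seed)
    return [mean(draw_time(rng) for _ in range(n)) for _ in range(num_samples)]


def text_histogram(values, low=0.0, high=10.0, bins=10):
    """Return one line per bin, with one '#' per 10 values in the bin."""
    step = (high - low) / bins
    counts = [0] * bins
    for v in values:
        index = min(bins - 1, max(0, int((v - low) / step)))
        counts[index] += 1
    return [
        f"{low + i * step:4.1f}-{low + (i + 1) * step:4.1f} | {'#' * (c // 10)}"
        for i, c in enumerate(counts)
    ]


for n in (2, 9, 16):
    means = sample_means(n)
    print(f"n = {n}: sd of sample means = {pstdev(means):.3f}, "
          f"sigma/sqrt(n) = {SIGMA / sqrt(n):.3f}")
    print("\n".join(text_histogram(means)))
```

Compare how the bars are arranged at n = 2 with how they look at n = 16, and check whether that matches what you see in the program for Cally's population.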
Sampling Distributions Activity – Part 1
Answer each question once for each clerk: Joe (Normal Dist.), Talia (Negative Skew), and Cally (Irregular Shape).
For a sample size of n = 2 (the first graph), how does the SHAPE of the distribution
of 500 sample means compare to the shape of the population?
Choices: Very Different / Different / A Little Different / About the Same
For a sample size of n = 9 (the second graph), how does the SHAPE of the
distribution of 500 sample means compare to the shape of the population?
Choices: Very Different / Different / A Little Different / About the Same
For a sample size of n = 16 (the third graph), how does the SHAPE of the
distribution of 500 sample means compare to the shape of the population?
Choices: Very Different / Different / A Little Different / About the Same
For all sample sizes n = 2, 9, and 16, how does the Mean of x̄ compare to the
MEAN of the population (the value of μ from the Sampling Distribution
Worksheet)?
Choices: Much Lower / A Bit Lower / About the Same / A Bit Higher / Much Higher
For all sample sizes n = 2, 9, and 16, describe how the
STANDARD DEVIATION of the sample means (sd of x̄) compares to the
STANDARD DEVIATION of the population (the value of σ from the
Sampling Distribution Worksheet).
Which sample size produced a distribution of sample means with the LARGEST
variability (largest value for the sd of x̄)?
Choices: n = 2 / n = 9 / n = 16
Which sample size produced a distribution of sample means with the
SMALLEST variability (smallest value for the sd of x̄)?
Choices: n = 2 / n = 9 / n = 16
Look at the value for the sd of x̄ for all three sample sizes. Are any of the
values GREATER than the standard deviation for the population (larger than the
value of σ)?
Choices: YES / NO
Sampling Distributions Scrapbook
Score:
out of 3
Score:
out of 3
Score:
out of 3