Statistics 101 L – Laboratory 8

Drawing samples from populations requires some method of randomization. In the activities in
this week’s lab you will use different methods to randomly select a sample from a population. In
the process you will also learn something about the variation introduced by random sampling
(called sampling variation).
Activity 1: Your group will need the sheet with 100 rectangles on it for this activity. Do not
study the sheet of rectangles. The rectangles represent a generic population of values. The
areas of the rectangles could represent the number of hours per week a student spends on Stat 101
outside of class and lab. What we will investigate in this activity is what can be learned about the
area of the rectangles using different sampling methods.
a) Each individual in your lab group should look at the sheet of rectangles for a few seconds
and guess the average area of the rectangles on the entire sheet. The unit of measure is
the background rectangle. Thus rectangle 7 has area 4x3=12. Compare your guess with
those of the other members in your group. Discuss why you got different guesses.
Record the individual guesses on the group answer sheet.
b) Each individual should now select five rectangles that, in her/his judgment, are
representative of the population of rectangles on the entire sheet. Write down the area for
each of the five rectangles and compute the sample mean area. How does this compare
with your guess? How does it compare with the sample means of the judgment samples
for the others in your group? Again discuss why you got different means. Record the
judgment sample means on your group answer sheet.
c) We want to take a random sample from the 100 rectangles. We could cut out the
rectangles, laminate them, put them in a bag, mix thoroughly and draw from the bag. Do
you think this would give you a random sample? Explain briefly.
d) You have a 10-sided die that you can use to select a random sample. Devise a plan to
produce a random sample from the population of rectangles using the 10-sided die.
e) Using the sampling plan you developed in d), take a simple random sample of 5 rectangles.
Record the rectangle numbers and the corresponding areas on the group answer sheet.
What is the sample mean area for your sample? How does it compare with your guesses
and your judgment sample means?
f) If you were to repeat part e) would you come up with the same sample mean area?
Explain briefly.
g) Which do you think will be a better guess for the population mean area of the 100
rectangles, the mean of a random sample of 5 rectangles or the mean of a random sample
of 20 rectangles? Explain briefly.
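Parts d) through g) can also be explored by simulation. The sketch below shows one possible die-based plan, not the only valid one: roll the 10-sided die twice to form a two-digit number from 00 to 99, read 00 as rectangle 100, and re-roll any repeat. The areas in the code are hypothetical placeholders, since the real areas are on the rectangle sheet, and the function names are made up for illustration.

```python
import random
import statistics

def roll_d10(rng):
    """One roll of a 10-sided die with faces 0-9."""
    return rng.randrange(10)

def draw_rectangle_number(rng):
    """Two rolls form a two-digit number 00-99; 00 is read as rectangle 100."""
    number = 10 * roll_d10(rng) + roll_d10(rng)
    return 100 if number == 0 else number

def die_sample(n, rng):
    """Simple random sample of n distinct rectangle numbers; repeats are re-rolled."""
    chosen = []
    while len(chosen) < n:
        number = draw_rectangle_number(rng)
        if number not in chosen:
            chosen.append(number)
    return chosen

rng = random.Random(8)  # fixed seed so the run can be repeated

# HYPOTHETICAL areas for rectangles 1-100 (assumption: the real values
# must be read from the rectangle sheet).
areas = {number: rng.randint(1, 18) for number in range(1, 101)}

def sample_mean(n):
    """Mean area of one die-based random sample of size n."""
    return statistics.mean(areas[number] for number in die_sample(n, rng))

# Repeat the sampling many times to see how much the sample mean varies.
means_5 = [sample_mean(5) for _ in range(2000)]
means_20 = [sample_mean(20) for _ in range(2000)]

print("population mean area:", statistics.mean(areas.values()))
print("SD of sample means, n = 5: ", round(statistics.stdev(means_5), 2))
print("SD of sample means, n = 20:", round(statistics.stdev(means_20), 2))
```

Comparing the two standard deviations shows how sampling variation changes with sample size, which is the idea behind parts f) and g).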
Activity 2: The population that we will sample from in this activity is the 2002 U.S. Senate. There
are 100 members of this population. Some of the characteristics of the Senators that may be of
interest are:
name = Senator’s last name
gender = M for male, F for female
party = D for Democratic, R for Republican, I for Independent
state = two letter state abbreviation
years = years in Senate
One way to randomly sample items from a population is to put all of the items into a bag, mix
them up and draw from the bag, without replacement. We can’t throw all the Senators into a bag
and mix them up, so we will use an applet on the web that simulates random selection, drawing
without replacement, from the population of 100 Senators. Go to the course web site under
computing and select the Sampling Senators link.
a) Choose a random sample of 10 U.S. Senators. Record the Senator’s name, gender, party
and years in Senate on the group answer sheet. Also record the proportion of females, the
proportion of Republicans and the sample mean years in the Senate.
b) Obtain two more samples of 10 U.S. Senators. Again record the proportion of females,
proportion Republicans and sample mean years in the Senate.
c) Did the three samples produce the same summary values, i.e. proportions and means?
Why or why not?
d) In the 2002 Senate there were 87 males and 13 females; 50 Democrats, 49 Republicans
and one Independent; and the mean number of years in the Senate was 12.38. Which of your
samples comes closest to matching the true values in the population?
e) Because most of the time we take only one sample, what can be said about a single
sample’s ability to match the population?
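The behavior of repeated samples of 10 can be sketched in code as well. The example below is not the course applet. It is a minimal stand-in that uses only the population counts stated in part d). The lab text does not say which Senators are both female and Republican, so the sketch keeps gender and party in separate lists and samples each one on its own; the function name is hypothetical.

```python
import random

rng = random.Random(2002)  # fixed seed so the run can be repeated

# Population counts from part d). The pairing of gender with party is not
# given in the lab text, so each characteristic is its own list (assumption).
genders = ["F"] * 13 + ["M"] * 87
parties = ["D"] * 50 + ["R"] * 49 + ["I"] * 1

def sample_proportion(population, value, n, rng):
    """Proportion of `value` in a sample of size n drawn without replacement."""
    sample = rng.sample(population, n)
    return sample.count(value) / n

# Three samples of 10, as in parts a) and b).
for i in range(1, 4):
    p_female = sample_proportion(genders, "F", 10, rng)
    p_republican = sample_proportion(parties, "R", 10, rng)
    print(f"sample {i}: proportion female = {p_female:.1f}, "
          f"proportion Republican = {p_republican:.1f}")

print("population: proportion female = 0.13, proportion Republican = 0.49")
```

With samples of 10, each sample proportion can only be a multiple of 0.1, so no single sample can match the population values of 0.13 and 0.49 exactly.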
Activity 3: In this activity we will look at how you can use JMP to select a simple random
sample and also look at the idea of stratified sampling. The population we will look at
consists of 1742 students at ISU who have taken Stat 101 and filled out the survey at the
beginning of the semester they took Stat 101. The variable we will look at is the age of the
student. The data is saved in a JMP file called SexAge_Stat101.JMP.
a) Open up the JMP file. You can select a simple random sample from this file by going to
Tables – Subset. In the dialog box click on Random – sample size: and fill in 100 for the
size of the sample. Use Analyze – Distribution to answer the following questions. Turn
in the JMP output.
i. What proportion of your sample is Female? Male?
ii. What is the average age for your sample?
iii. What is the median age for your sample?
iv. Describe the distribution of age for your sample.
b) The population can be stratified by gender – male and female. This creates two
subpopulations (strata) of 950 females and 792 males. These two subpopulations (strata)
are saved in JMP files called Male_Age_Stat101.JMP and Female_Age_Stat101.JMP.
Using these subpopulations (strata) obtain a stratified random sample of 50 males and 50
females. Use Analyze – Distribution to answer the following questions. Turn in the JMP
output.
i. What are the average and median ages for your sample of males?
ii. What are the average and median ages for your sample of females?
iii. Combine your sample average age for males with your sample average age for
females to come up with an average age for students in the entire population.
Hint: You need to take into account that there are more females in the
population than males.
iv. Compare your answer in b. – iii. to that in a. – ii.
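The hint in b. – iii. amounts to a weighted average, with each stratum's sample mean weighted by the stratum's share of the population (950 females and 792 males out of 1742). A minimal sketch follows; the two sample means are hypothetical placeholders to be replaced with the values from your own JMP output, and the function name is made up for illustration.

```python
def stratified_mean(strata):
    """Weighted average of stratum means.

    `strata` is a list of (stratum population size, stratum sample mean) pairs.
    """
    total_size = sum(size for size, _ in strata)
    return sum(size * mean for size, mean in strata) / total_size

# Stratum sizes from the lab text.
N_FEMALE = 950
N_MALE = 792

# HYPOTHETICAL sample means (assumption): use your own JMP results instead.
mean_age_female = 19.4
mean_age_male = 20.1

combined = stratified_mean([(N_FEMALE, mean_age_female),
                            (N_MALE, mean_age_male)])
print("estimated mean age for the population:", round(combined, 2))
```

Because females make up the larger stratum, the combined estimate sits closer to the female sample mean than a plain average of the two means would.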
Stat 101 L: Laboratory 8 – Answer Sheet
Names: _________________________
_________________________
_________________________
_________________________
Activity 1:
Name                    Guessed Value           Judgment Mean
c) Do you think drawing laminated rectangles from a bag will give you a random sample?
Explain briefly.
d) Plan for using a 10-sided die to select a random sample.
e) Random sample
Rectangle #             Area
Sample mean area = _______________________________
f) Would another random sample have the same sample mean area? Explain briefly.
g) Which is better, random sample mean based on 5 or random sample mean based on 20?
Explain briefly.
Activity 2:
a) Sample of 10 U.S. Senators
Name                    Gender          Party           Years
Summaries:           ȳ = ________    p̂ R = ________    p̂ F = ________
b)
Sample 2 Summaries:  ȳ = ________    p̂ R = ________    p̂ F = ________
Sample 3 Summaries:  ȳ = ________    p̂ R = ________    p̂ F = ________
c) Same summary values for all samples?
d) Which sample comes closest to the population values?
e) What can be said about a single sample’s ability to match the population?
Activity 3:
a) Simple Random Sample
i. What proportion of your sample is Female? Male?
ii. What is the average age for your sample?
iii. What is the median age for your sample?
iv. Describe the distribution of age for your sample.
b) Stratified Random Sample
i. What are the average and median ages for your sample of males?
ii. What are the average and median ages for your sample of females?
iii. Combine your sample average age for males with your sample average age for
females to come up with an average age for students in the entire population.
Hint: You need to take into account that there are more females in the
population than males.
iv. Compare your answer in b. – iii. to that in a. – ii.