Sampling Design Homework

AP Statistics Review
1-3 Identify the population in the study, the sample, and the sampling method (include whether it was
random or not).
1. A sociologist wants to know the opinions of employed adult women about government funding for
day care. She obtains a list of the 520 members of a local business and professional women’s club
and mails a questionnaire to 100 of these women selected at random. Only 48 questionnaires are
returned.
Population: ALL employed adult women
Sample: The 48 members of a local business and professional women’s club who returned the questionnaire
Sampling method: A random selection of 100 names from a single club’s membership list (one cluster),
with only those who chose to respond ending up in the sample
Randomization: Voluntary response is not a random sampling method. It creates bias (only the
truly passionate will respond), and drawing from this one club creates undercoverage (employed
blue-collar workers are left out).
2. A business magazine mailed a questionnaire to the human resources directors of all of the Fortune
500 companies, and received responses from 23% of them. Those responding reported that they did
not find that such surveys intruded significantly on the workday.
Population: Human resources directors of Fortune 500 companies
Sample: The 23% of the HR directors who responded
Sampling method: Voluntary response to a mailed survey
Randomization: None. Voluntary response samples are extremely biased.
3. Researchers waited outside a bar they had randomly selected from a list of such establishments. They
stopped every 10th person who came out of the bar and asked whether he or she thought drinking
and driving was a serious problem.
Population: Difficult to determine, but presumably people who drink in bars
Sample: Every 10th person exiting the bar. (What is the age range: 21 and up? What do we know
about these bars? What city are they in: a college town, uptown Houston, Deep Elm in Dallas?)
Sampling method: Cluster sampling paired with systematic random sampling
Randomization: The sample in this study was selected randomly. However, there was serious
undercoverage because only people who go to bars were surveyed about whether drinking and
driving is a serious problem. A truly random sample would have drawn from all individuals aged
21 and up, or whatever the actual population of this study should be.
4. Describe how you would select a sample of 30 juniors from your school using the following methods:
a. SRS: Number every junior, then use a random number generator to select 30 from the entire junior class.
b. Stratified random sampling: There are a great number of possible scenarios here.
Here is one example: Divide the juniors into groups based on GPA (91-100), (81-90), (71-80),
(61-70). Then number the juniors within each group, and use a random number generator to
select the same number from each group.
You could instead break them into groups by extracurriculars, hair color, ethnicity, etc.
You would need to make sure that whatever groups you make have similar sizes. So for my
example above, I may have to rework the splits to ensure group size.
c. Convenience sampling
Stand in the hallway and pick the first 30 juniors who walk past you.
d. Voluntary response sampling
Make an announcement asking for 30 volunteers from the junior class
e. Systematic sampling
Take a printout of the entire junior class. Divide the total number of students by 30 to get the
interval k. Use a random number generator to pick a starting number from 1 to k, select that
student, and then keep adding k to reach each subsequent selection.
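The SRS, stratified, and systematic methods described in question 4 can be sketched in Python. This is only an illustration under stated assumptions: the roster of 300 juniors, the labels J001 to J300, the three equal-sized strata, and the function names are all hypothetical and do not come from the assignment.

```python
import random

def simple_random_sample(roster, n, rng):
    """SRS: every group of n students is equally likely to be chosen."""
    return rng.sample(roster, n)

def stratified_sample(strata, per_stratum, rng):
    """Draw an SRS of the same size from each stratum and combine them."""
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, per_stratum))
    return chosen

def systematic_sample(roster, n, rng):
    """Pick a random start within the first k names, then take every k-th name."""
    k = len(roster) // n        # sampling interval
    start = rng.randrange(k)    # 0-based position of the first pick
    return [roster[start + i * k] for i in range(n)]

# Hypothetical junior class of 300 students (an assumption for illustration).
juniors = [f"J{i:03d}" for i in range(1, 301)]
rng = random.Random(2024)       # fixed seed so the run is repeatable

srs = simple_random_sample(juniors, 30, rng)

# Hypothetical strata of equal size standing in for GPA bands.
strata = {"A": juniors[:100], "B": juniors[100:200], "C": juniors[200:]}
strat = stratified_sample(strata, 10, rng)

syst = systematic_sample(juniors, 30, rng)
```

Convenience and voluntary response sampling are left out on purpose: neither involves a chance mechanism, so there is nothing to randomize.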
5. Some medical researchers suspect that added calcium in the diet reduces blood pressure.
You have available 40 men with high blood pressure who are willing to serve as subjects.
You want to give 8 men the calcium additive and the rest a placebo (sugar pill). In an
experiment, you need to randomly allocate the subjects as specified above. Technically, this
is not a simple random sample, but the random allocation works the same.
a) The names of the subjects appear below. List the subjects to whom you will give the
drug. (Use the given random numbers.) Mark your selections clearly on the number line.
Start by numbering the subjects.
1 Alomar
2 Asihiro
3 Bennett
4 Bikalis
5 Carlson
6 Chen
7 Cranston
8 Curtis
9 Denman
10 Durr
11 Edwards
12 Farouk
13 Furgeson
14 George
15 Green
16 Guillen
17 Han
18 Howard
19 Hruska
20 Imrani
21 James
22 Kaplan
23 Krushchev
24 Lehman
25 Liang
26 Maldonado
27 Marsden
28 Moore
29 O’Brian
30 Ogle
31 Pittman
32 Rodriguez
33 Rosen
34 Solomon
35 Tompkins
36 Townsend
37 Troy
38 Underwood
39 Willis
40 Zhang
42672 67680 42376 95023 82744 03971 96560 55148 26461 88346 52430 60906 74216
b) As mentioned above, since this scenario was an experiment, we used random allocation
as opposed to a simple random sample (even though they work very similarly). Explain
why the process above cannot be called a simple random sample.
The 40 men are volunteers, not individuals sampled from a larger population, and all 40 take part
in the study. Chance only decides which 8 individuals will get the treatment; the remaining 32
will receive the placebo.
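The allocation in part (a) can be reproduced with a short Python sketch. It assumes the usual table-of-random-digits procedure: read the given digits two at a time, keep labels 01 to 40, and skip 00, 41 to 99, and repeats. The function name `pick_labels` is hypothetical.

```python
def pick_labels(digits, valid_labels, how_many, width=2):
    """Read a digit string in fixed-width chunks; keep valid, unused labels."""
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        if label in valid_labels and label not in chosen:
            chosen.append(label)
            if len(chosen) == how_many:
                break
    return chosen

names = [
    "Alomar", "Asihiro", "Bennett", "Bikalis", "Carlson", "Chen", "Cranston",
    "Curtis", "Denman", "Durr", "Edwards", "Farouk", "Furgeson", "George",
    "Green", "Guillen", "Han", "Howard", "Hruska", "Imrani", "James", "Kaplan",
    "Krushchev", "Lehman", "Liang", "Maldonado", "Marsden", "Moore", "O'Brian",
    "Ogle", "Pittman", "Rodriguez", "Rosen", "Solomon", "Tompkins", "Townsend",
    "Troy", "Underwood", "Willis", "Zhang",
]

# The random digits given in the problem, with the spaces removed.
table = ("42672 67680 42376 95023 82744 03971 96560 "
         "55148 26461 88346 52430 60906 74216")
digits = table.replace(" ", "")

calcium = pick_labels(digits, range(1, 41), 8)
calcium_names = [names[n - 1] for n in calcium]

print(calcium)        # [26, 37, 23, 40, 39, 5, 18, 6]
print(calcium_names)  # Maldonado, Troy, Krushchev, Zhang, Willis, Carlson, Howard, Chen
```

Under these assumptions the second 26 in the table is skipped as a repeat, and the remaining 32 men receive the placebo.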
Circle the best answer. Explain briefly why each answer was correct or incorrect.
6. Suppose that a number of crates of pencils is chosen at random from a boxcar of crates,
and then a number of boxes of pencils is chosen at random from each selected crate. Our
goal is to determine the number of defective pencils in a box. This is an example of…
A) simple random sampling
B) multi-stage sampling
C) stratified sampling
D) systematic sampling
E) none of the above
7. The following is a map of a census tract in a fictitious town. Census tracts are small,
homogeneous areas averaging 4000 in population. On the map, each block is marked with a Census
Bureau identification number. A random sample of blocks from a census tract is often the
next-to-last stage in a multistage sample. Explain how you will set up a simple random sample of 5
blocks from this census tract using their identification numbers. Be sure to mark clearly on the
number line and explain what happens with unused numbers.
[Map of census tract. Block identification numbers: 10-15, 21-26, 31-36, 40, 51-56, 61-66]
97971 48932 45792 63993 95635 28753 46069 84635 49345 183058 76213 82390 77412
Use these numbers to select the 5 blocks that you will use. The rest will be untested.
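One way to carry out the block selection in question 7 is sketched below in Python. It assumes the block IDs are exactly those recoverable from the map labels (10-15, 21-26, 31-36, 40, 51-56, 61-66) and that the digits are read two at a time, ignoring any pair that is not a block ID or that has already been chosen. The variable names are hypothetical.

```python
# Block IDs as read from the map labels (an assumption about the original figure).
block_ids = (set(range(10, 16)) | set(range(21, 27)) | set(range(31, 37))
             | {40} | set(range(51, 57)) | set(range(61, 67)))

table = ("97971 48932 45792 63993 95635 28753 46069 84635 "
         "49345 183058 76213 82390 77412")
digits = table.replace(" ", "")

selected = []
for i in range(0, len(digits) - 1, 2):
    pair = int(digits[i:i + 2])
    # Unused numbers (not a block ID, or a repeat) are simply skipped.
    if pair in block_ids and pair not in selected:
        selected.append(pair)
    if len(selected) == 5:
        break

print(selected)  # [14, 32, 26, 63, 52]
```

With this reading, the five blocks are found before the six-digit group in the table is reached, so that irregular group does not affect the result.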
8. A club has 30 student members and 10 faculty members. The club can send 4 students
and 2 faculty members to a convention. It decides to choose those who will go by
random selection. Use the numbers below to choose a stratified random sample of 4
students and 2 faculty members.
STUDENTS
1 Abel
2 Carson
3 Chen
4 David
5 Deming
6 Elashoff
7 Fisher
8 Ghosh
9 Griswold
10 Hein
11 Hernandez
12 Holland
13 Huber
14 Jimenez
15 Jones
16 Kim
17 Klotz
18 Liu
19 Miranda
20 Moskowitz
21 Neyman
22 O’Brian
23 Pearl
24 Potter
25 Reinmann
26 Santos
27 Shaw
28 Thompson
29 Utts
30 Varga
FACULTY
1 Andrews
2 Besicovitch
3 Fernandez
4 Gupta
5 Kim
6 Lightman
7 Moore
8 Phillips
9 West
10 Yang
487477659532588383928442280016378908099352010
719502249400369512698707373694977512388273613
178575235221392229304377610503582495764847051
You could start over at the beginning of the digits for the faculty or continue forward. Both are
random. I continued forward.
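The "continue forward" approach for question 8 can be sketched in Python. The assumptions here are mine, not the assignment's: the three rows of digits are treated as one continuous stream, students are labeled 01 to 30 and faculty 01 to 10, and digits are read two at a time. The helper names `two_digit_stream` and `draw` are hypothetical.

```python
def two_digit_stream(digits):
    """Yield successive two-digit numbers from a string of digits."""
    for i in range(0, len(digits) - 1, 2):
        yield int(digits[i:i + 2])

def draw(stream, valid_labels, how_many):
    """Take labels from the stream until how_many distinct valid ones are found."""
    chosen = []
    for label in stream:
        if label in valid_labels and label not in chosen:
            chosen.append(label)
            if len(chosen) == how_many:
                break
    return chosen

rows = [
    "487477659532588383928442280016378908099352010",
    "719502249400369512698707373694977512388273613",
    "178575235221392229304377610503582495764847051",
]
stream = two_digit_stream("".join(rows))

# Each stratum is sampled separately; the faculty draw resumes where the
# student draw stopped because both calls share the same generator.
students = draw(stream, range(1, 31), 4)
faculty = draw(stream, range(1, 11), 2)

print(students)  # [28, 16, 8, 9] -> Thompson, Kim, Ghosh, Griswold
print(faculty)   # [1, 7]         -> Andrews, Moore
```

Restarting at the beginning of the digits for the faculty, or reading single digits for the ten faculty members, would be equally valid and would give a different pair.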
In the following description of an experiment, there is information that you have not been given. Your
job is to identify the parts of the experiment, and justify your responses. Where you have not been given
enough information to answer the question, you can either 1.) make an assumption, which you justify,
or 2.) give two scenarios, and tell which one you think would be better. You may want to use the More
On Experimental Units handout to help you.
Bachman, Herzberg, and Rich conducted a factorial study of fluid flow through thin tubes. They
measured the time required for the liquid level in a fluid holding tank
to drop from 4 in. to 2 in. for two drain tube diameters and two fluid types. Two different technicians
did the measuring. Here is their data:
Technician  Diameter (in.)  Fluid            Time (sec)
1           .188            water            21.12, 21.11, 20.80
2           .188            water            21.82, 21.87, 21.78
1           .314            water            6.06, 6.04, 5.92
2           .314            water            6.09, 5.91, 6.01
1           .188            ethylene glycol  51.25, 46.03, 46.09
2           .188            ethylene glycol  45.61, 47.00, 50.71
1           .314            ethylene glycol  7.85, 7.91, 7.97
2           .314            ethylene glycol  7.73, 8.01, 8.32
1. What is the response variable?
Time (in seconds) required for the liquid level to drop from 4 in. to 2 in.
2. How many factors?
2 factors: I am assuming that the researchers are not interested in the effect of the variable
“technician” on the value of the response variable, and I am assuming that “technician” is a blocking
variable.
There are times, however, when we are interested in the effect of “person” on a response. For example,
suppose there was an experiment where one factor was “surgeon,” with levels Dr. Brown, Dr. Jones, Dr.
Smith, and another factor was “method of performing surgery”, with levels A, B, C. Suppose the
response variable is “two‐year survival rate.” Over the next year, each of the three heart surgeons in
General Hospital performs heart surgery using each of the three methods, and each patient is observed
for two years after surgery, whereupon his/her survival status is noted. Such an experiment is ethical if
we truly do not know which method is the best and which surgeon is the best. If you are a potential
patient, or if you are the person evaluating the performance of the surgeons, you are interested in the
effect of surgeon on survival rate. (Yes, there are lots of complications with this experiment that I am
not discussing.)
3. What are the factors (and their levels)?
Diameter in inches (.188, .314)
Fluid (water, ethylene glycol)
4. How many treatments?
4 treatments
5. What are the experimental units?
There is not enough information to answer this question.
Scenario 1 (Detroit): If the researchers are interested in generalizing to all tubes of this type with the
two diameters, and if they are interested in generalizing to any water or ethylene glycol specimens, then
they need to replicate the tubes and they need to use different water and different ethylene glycol in
each run of the experiment. In this case, the experimental units are specific tubes used in conjunction
with specific samples of liquid. This scenario is most likely to be the scenario of interest.
Scenario 2 (Detroit and NASCAR): If the researchers are interested in future drainings of water
and ethylene glycol through these specific tubes, but generalized over any quantity of water or ethylene
glycol (not the same water and glycol over and over), then they need to replicate the liquid samples, and
the experimental unit is liquid sample.
Scenario 3 (NASCAR only): If the researchers are interested in draining the same quantity of water
and the same quantity of ethylene glycol over and over using the same tubes—and no, it is not likely
that anyone would be interested in this—then they simply replicate the act of draining, and the
experimental unit is the act of draining.
6. Are there any controlled variables? Should there be?
“Person draining” should be either a controlled or a blocking variable. The text said “person measuring”
was spread over two people, but it didn’t say anything about “person draining.”
All conditions under which the draining takes place should be controlled—the starting condition of the
drains, the exact way the liquid is poured, the temperature, etc.
7. Are there any blocking variables? Should there be?
Technician is a blocking variable. Yes, technician should be a blocking variable (or a controlled variable).
See controlled variables.
8. How many replicates?
They used r=3 replicates per treatment.
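The factor structure in the answers above can be checked with a short Python sketch: crossing the two levels of each factor gives the 4 treatments, and each technician-by-treatment cell holds 3 timings. The dictionary layout and variable names are my own choices for illustration; the numbers are copied from the data table.

```python
from itertools import product
from statistics import mean

diameters = [0.188, 0.314]               # factor 1 levels (inches)
fluids = ["water", "ethylene glycol"]    # factor 2 levels

# Every combination of one level per factor is a treatment.
treatments = list(product(diameters, fluids))

# Times in seconds, keyed by (technician, diameter, fluid).
times = {
    (1, 0.188, "water"): [21.12, 21.11, 20.80],
    (2, 0.188, "water"): [21.82, 21.87, 21.78],
    (1, 0.314, "water"): [6.06, 6.04, 5.92],
    (2, 0.314, "water"): [6.09, 5.91, 6.01],
    (1, 0.188, "ethylene glycol"): [51.25, 46.03, 46.09],
    (2, 0.188, "ethylene glycol"): [45.61, 47.00, 50.71],
    (1, 0.314, "ethylene glycol"): [7.85, 7.91, 7.97],
    (2, 0.314, "ethylene glycol"): [7.73, 8.01, 8.32],
}

# Mean drain time for each treatment, pooling both technicians (blocks).
treatment_means = {
    (d, f): mean(t for tech in (1, 2) for t in times[(tech, d, f)])
    for d, f in treatments
}

print(len(treatments))  # 4
for treatment, avg in treatment_means.items():
    print(treatment, round(avg, 2))
```

Pooling over technicians here only summarizes the data. An analysis that treats technician as a block would compare treatments within each technician.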