Statistics 402C Final Exam Name: May 5, 2004

INSTRUCTIONS: Read the questions carefully and completely. Answer the questions and
show work in the space provided. This is the only work that I will look at. Partial credit
will not be given if work is not shown. Be sure to answer all questions within the context of
the problem. Refer to the computer printout and graphs provided when appropriate. Pace
yourself; do not spend too much time on any one problem. Point values for each problem
are given.
1. [40 pts] Traffic engineers are interested in the effect of erecting signs that say “Accident
Reduction Project Area” and metering the flow of vehicles onto freeway on-ramps
on the average traffic speed. Twenty similar freeway interchanges are chosen. Each
interchange has a traffic light at the on-ramp. The interchanges are spread widely
around a single large metropolitan area in Southern California. Ten of the interchanges
are chosen at random to get “Accident Reduction Project Area” signs; the other ten
get no signs. The traffic lights can be turned off (no minimum time between vehicles)
or set to require 3 or 6 seconds between entering vehicles. Average traffic speed during
“rush hour” will be measured at each interchange on three consecutive Tuesdays in
June. At each interchange, the three settings of traffic lights are assigned at random
to the three Tuesdays. Refer to the JMP output entitled Accident Reduction Project.
(a) [4] What are the treatments, experimental units and response?
(b) [6] Explain why this is a split-plot design. Be sure to mention what the whole-plot
factor and the sub-plot factor are.
(c) [6] Below are main effect plots of average speed. Describe the apparent effects of
each factor.
(d) [4] Below is an interaction plot. Is there an apparent interaction between the two
factors? Be sure to indicate what you see in the plot that supports your answer.
(e) [5] Does the average speed of traffic differ significantly for interchanges with signs
when compared to those without signs? Support your answer statistically.
(f) [5] Is there a significant interaction between timing and signs? Support your answer
statistically.
(g) [5] Is there a significant difference in average speed of traffic for the three traffic
light timings? Support your answer statistically.
(h) [5] Construct an adjLSD (use t associated with 98% confidence) for comparing
the three traffic light timings. Which, if any, traffic light timings are significantly
different?
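The two randomization steps described in the setup of Problem 1 can be sketched as follows. This is only an illustration under stated assumptions: the function name, the interchange numbering 1 to 20, the timing labels and the seed are hypothetical and do not come from the JMP output.

```python
import random

def randomize_traffic_study(n_interchanges=20, n_signed=10,
                            timings=("off", "3 s", "6 s"), seed=None):
    """Return a plan {interchange: {"sign": bool, "schedule": [...]}}.

    The schedule lists the traffic-light setting for Tuesday 1, 2 and 3.
    """
    rng = random.Random(seed)
    interchanges = list(range(1, n_interchanges + 1))

    # Step 1: choose at random which interchanges get the signs.
    signed = set(rng.sample(interchanges, n_signed))

    plan = {}
    for i in interchanges:
        # Step 2: within each interchange, put the three timings
        # in a random order across the three Tuesdays.
        order = list(timings)
        rng.shuffle(order)
        plan[i] = {"sign": i in signed, "schedule": order}
    return plan

plan = randomize_traffic_study(seed=402)  # seed chosen arbitrarily
for interchange, info in sorted(plan.items())[:3]:
    print(interchange, info)
```

Note that the first step is carried out once over all interchanges, while the second step is repeated separately inside every interchange.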
2. [40 pts] A study was done involving a large number of family doctors. Each doctor was
given a “fable” about a female patient less than 18 years old. After hearing the “fable”
the doctor was asked whether she/he would keep patient confidentiality and not inform
the patient’s parents. There were 16 different “fables” constructed by factorial crossing
of four factors. The percentage of doctors who would keep patient confidentiality and
not inform the patient’s parents for each treatment combination is recorded.
Factor                                      Low Level (−1)      High Level (+1)
M: Maturity of patient                      immature for age    mature for age
L: Length of time doctor has known family   less than 1 year    more than 5 years
A: Age of patient                           14 years            17 years
C: Complaint                                drug problem        venereal disease
(a) [4] What are the response, conditions (treatments) and units in this experiment?
(b) [8] Below is a plot of the factor level means. Describe the effect of each of the
four factors.
(c) [5] Below are the estimated full effects and a normal plot of those effects. Identify
and label on the normal plot the estimated full effects that appear to be significant.
Effect    Estimate      SS
M           0.1369    0.0749
L          −0.0181    0.0013
A           0.1549    0.0959
C           0.2364    0.2235
ML         −0.0269    0.0029
MA         −0.0264    0.0028
MC         −0.0214    0.0018
LA          0.0326    0.0043
LC          0.0351    0.0049
AC          0.0316    0.0040
MLA         0.0034    0.0000
MLC        −0.0056    0.0001
MAC        −0.0286    0.0033
LAC        −0.0221    0.0020
MLAC        0.0016    0.0000
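As a consistency check on the printed values, the sum of squares for each full effect in a two-level factorial with N runs equals N × (estimate)² / 4. The sketch below assumes N = 16, that is, one response per treatment combination. The variable and function names are illustrative only.

```python
# Effect estimates and sums of squares as printed for part (c).
N_RUNS = 16  # assumption: 2**4 combinations, one response each

ESTIMATES = {
    "M": 0.1369, "L": -0.0181, "A": 0.1549, "C": 0.2364,
    "ML": -0.0269, "MA": -0.0264, "MC": -0.0214,
    "LA": 0.0326, "LC": 0.0351, "AC": 0.0316,
    "MLA": 0.0034, "MLC": -0.0056, "MAC": -0.0286,
    "LAC": -0.0221, "MLAC": 0.0016,
}
PRINTED_SS = {
    "M": 0.0749, "L": 0.0013, "A": 0.0959, "C": 0.2235,
    "ML": 0.0029, "MA": 0.0028, "MC": 0.0018,
    "LA": 0.0043, "LC": 0.0049, "AC": 0.0040,
    "MLA": 0.0000, "MLC": 0.0001, "MAC": 0.0033,
    "LAC": 0.0020, "MLAC": 0.0000,
}

def effect_ss(estimate, n_runs=N_RUNS):
    """Sum of squares for one full effect in a two-level factorial."""
    return n_runs * estimate ** 2 / 4

for term, est in ESTIMATES.items():
    print(f"{term:5s} estimate={est:8.4f}  "
          f"computed SS={effect_ss(est):.4f}  printed SS={PRINTED_SS[term]:.4f}")
```

The computed and printed values should agree to within about 0.0001, the difference being rounding in the printed estimates.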
(d) [6] Use the 3- and 4-way interaction terms to compute a substitute estimate of
error variability. Be sure to indicate how many degrees of freedom are associated
with this estimate.
(e) [6] It appears that factor L: Length of time doctor has known family is not important,
either by itself or in combination with any other factor. Given that factor L is no different
from error, explain HOW you could compute an estimate of error variability.
The estimate must be different from the one in (d). How many degrees of freedom
are associated with this estimate? DO NOT COMPUTE THE ESTIMATE
OF ERROR VARIABILITY.
(f) [6] Below is the analysis of the data using just factors M, A and C. MSError = 0.00194
Source   df   Sum of Squares   F-Ratio   Prob > F
M         1       0.0749          38.6     0.0003
A         1       0.0959          49.4     0.0001
MA        1       0.0028           1.4     0.2656
C         1       0.2235         115.1    <0.0001
MC        1       0.0018           0.9     0.3604
AC        1       0.0040           2.1     0.2891
MCA       1       0.0033           1.7     0.2301
According to this analysis, what factor(s) and/or interaction(s) are statistically
significant? Be sure to support your answers by referring to the analysis.
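To see how the F-Ratio column in part (f) relates to the other columns, each ratio is the term's mean square (its sum of squares divided by its df) divided by MSError. The sketch below recomputes the ratios from the printed values. The names are illustrative, and small differences from the printed ratios are expected because the printed sums of squares are rounded.

```python
# Values as printed in the analysis for part (f).
MS_ERROR = 0.00194

SUM_OF_SQUARES = {
    "M": 0.0749, "A": 0.0959, "MA": 0.0028, "C": 0.2235,
    "MC": 0.0018, "AC": 0.0040, "MCA": 0.0033,
}
PRINTED_F = {
    "M": 38.6, "A": 49.4, "MA": 1.4, "C": 115.1,
    "MC": 0.9, "AC": 2.1, "MCA": 1.7,
}

def f_ratio(ss, df=1, ms_error=MS_ERROR):
    """F-ratio = (sum of squares / df) / mean square error."""
    return (ss / df) / ms_error

for term, ss in SUM_OF_SQUARES.items():
    print(f"{term:4s} computed F={f_ratio(ss):7.1f}  printed F={PRINTED_F[term]:7.1f}")
```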
(g) [5] Using only those terms that are statistically significant and the fact that the
overall mean response is 0.660, give the prediction equation. Use it to predict
the percentage of family doctors who would keep confidentiality for a 17 year old
female patient with a drug problem. This patient appears immature for her age
and the doctor has known the family for less than one year.
3. [24 pts] Name that design! For each of the following scenarios indicate what design
is used. Indicate the factors of interest, nuisance factors and provide an ANOVA table
listing all sources of variation and associated degrees of freedom.
(a) [8] A company that cuts and freezes french fries wants to know which machine
of the four they own produces the most waste when cutting the fries. The four
fry cutters, and their operators, are constantly in use. Different operators may
produce different amounts of waste. Each day a new load of potatoes is used and
there will be day-to-day variation in waste due to the size and shape of potatoes in
a day's load. Each operator will operate each machine once. Each operator will
operate a different machine each day.
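One way to produce a schedule that satisfies the constraints in scenario (a) is sketched below. It assumes four operators and four days, which the scenario implies but does not state outright. The function name and the numeric labels are hypothetical.

```python
import random

def make_schedule(n=4, seed=None):
    """Return schedule[day][operator] = machine number (1..n).

    Every operator runs every machine exactly once, and on any given
    day no two operators share a machine.
    """
    rng = random.Random(seed)
    machines = list(range(1, n + 1))
    rng.shuffle(machines)            # random relabelling of machines
    days = list(range(n))
    rng.shuffle(days)                # random order of rows (days)
    operators = list(range(n))
    rng.shuffle(operators)           # random order of columns (operators)
    # Start from a cyclic arrangement, then apply the three shuffles.
    return [[machines[(d + o) % n] for o in operators] for d in days]

schedule = make_schedule(seed=2004)  # seed chosen arbitrarily
for day, row in enumerate(schedule, start=1):
    print(f"Day {day}: " + "  ".join(f"Op{op}->M{m}" for op, m in enumerate(row, start=1)))
```

Shuffling a cyclic arrangement does not reach every possible schedule of this kind, so this is a convenient illustration rather than a fully general randomization.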
(b) [8] An ornithologist is interested in the time it takes red-shouldered hawks to
respond to calls of other birds that may invade their territory. The type of forest,
old growth or new growth, may have an effect. Also the type of intruding bird
may have an effect. There may also be an interaction between type of forest and
type of bird. Two forests, one old growth and one new growth, are used. In each
forest, ten nests are chosen at random from known nesting sites. At each nest,
two pre-recorded calls are played over a loudspeaker. One call is a red-shouldered
hawk call, the other is a great horned owl call. The calls are played several days
apart and the order is randomized for each nest. The response is the time until
the nesting hawks leave the nest to drive off the intruder.
(c) [8] A study is performed to investigate the effect of depth of planting (2 levels)
and date of planting (3 levels) on corn yields. The study is performed using 6
plots near Nevada, IA, 6 plots near Clear Lake, IA, 6 plots near New Ulm, MN
and 6 plots near Decorah, IA. The six combinations of depth and date will be
randomly assigned to plots at each location.
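The random assignment described in scenario (c) can be sketched as follows. The level labels ("depth 1", "date 1", and so on) and the plot numbering 1 to 6 are placeholders, since the scenario does not give the actual depths, dates or plot identifiers.

```python
import random
from itertools import product

LOCATIONS = ["Nevada, IA", "Clear Lake, IA", "New Ulm, MN", "Decorah, IA"]
DEPTHS = ["depth 1", "depth 2"]          # placeholder labels
DATES = ["date 1", "date 2", "date 3"]   # placeholder labels

def assign_plots(seed=None):
    """Return layout[location][plot number] = (depth, date)."""
    rng = random.Random(seed)
    combos = list(product(DEPTHS, DATES))  # the six depth-by-date combinations
    layout = {}
    for location in LOCATIONS:
        order = combos[:]
        rng.shuffle(order)  # a fresh, independent shuffle at each location
        layout[location] = dict(enumerate(order, start=1))
    return layout

layout = assign_plots(seed=6)  # seed chosen arbitrarily
for location, plots in layout.items():
    print(location, plots)
```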
4. [24 pts] For each of the following situations give the response, conditions of interest and
units. Explain what design you would use to accomplish the purpose stated.
(a) [8] The investigator wishes to see if smoking a marijuana cigarette changes people’s heart rate.
(b) [8] In order to track migration, butterflies will be marked and released. The
placement of the mark may affect how attractive the butterflies are to predators.
The investigator wishes to see if the placement of a mark on the wing of different
butterfly species affects the chances of successfully migrating. There are six locations on the wing and two species of butterfly.
(c) [8] Do students from different colleges score differently on multiple choice and
problem solving statistics tests? Students from Engineering, Business, Agriculture
and Liberal Arts and Sciences Colleges will participate. Each student will
take both a multiple choice test and a problem solving test over the same material
in an introductory statistics course. The order of the tests, multiple choice or
problem solving first, will be randomized for each student.
5. [7 pts] A friend comes to you for advice on an experiment she is going to conduct with
pigs. There are two types of pigs she can use. In addition, there is a second factor
involving the amount of antibiotic they will be given in their food (none, low level and
high level). She will factorially cross type of pig and amount of antibiotic. She does
not have to pay for the antibiotic as it is being provided to her by the manufacturer.
(a) [3] If she wants to be able to detect a two standard deviation difference in treatment means with alpha of 0.05 and beta of 0.10, how many pigs of each type does
she need?
(b) [4] When you tell her the number, she says that she can get that many of the
first type of pig but twice that many of the second type. How many pigs should
she use in her experiment? Explain briefly.
6. [15 pts] The three fundamental principles of a well-designed experiment are control
of outside variables, randomization and replication.
(a) [5] Explain why control of outside variables is important.
(b) [5] Explain why blocking is important and how it differs from control of outside
variables.
(c) [5] Explain why randomization is important and how it can help when there is
an outside variable that you did not control.
You may pick up your corrected final exam and a copy of the critique of your
project at my office starting Monday, May 10. If you cannot pick them up and
would like them sent to you, write the address where you would like these things
sent in the space below.