STAT 301 – BUSINESS STATISTICS
FALL SEMESTER 2010
“Knowledge Festival” #1 – Part 2 (Conceptual)
This “knowledge festival” is intended to assess your mastery of the conceptual aspects of this
course. This test is closed-book, closed-notes. No calculator is allowed (or needed).
There are 100 points worth of problems on this exam; relative problem weights are given in
brackets. Unless the problem specifically indicates otherwise, use the traditional confidence
level of 95% and the traditional significance level of α = 0.05.
This “knowledge festival” is given under provisions of the Honor System of Stetson University.
The word “pledged” written before your name at the top of this page is a symbol of your ongoing
commitment to the Honor System.
Enjoy!!
Question 0 [4 points]:
Spell your name (correctly and legibly) at the top of the page.
Question 1 [16 points; 2 each part]:
Indicate whether each of the following statements is TRUE or FALSE:
a) Income data are typically skewed left.
TRUE
FALSE
b) Employing the mode as a measure of “typicalness” in a data set is
generally most useful when the data set is small (just a few numbers).
TRUE
FALSE
c) All else being equal, as the sample size increases, the variance of the
sampling distribution for the sample mean increases.
TRUE
FALSE
d) If the data are skewed right, the mean will be greater than the median.
TRUE
FALSE
e) It is theoretically possible for the expected value to be negative.
TRUE
FALSE
f) All else being equal, a 90% confidence interval will be wider than a
95% confidence interval.
TRUE
FALSE
g) One advantage of a prospective over a retrospective observational study
is that the data tend to be cheaper.
TRUE
FALSE
h) In hypothesis testing, a one-tailed test is used if, before we collect the data,
we have reason to believe that any departure from the null hypothesis
will occur in one particular direction.
TRUE
FALSE
Question 2 [6 points]:
For a normal distribution, approximately __________ percent of the data lie within one
standard deviation of the mean, and approximately __________ percent of the data lie within
two standard deviations of the mean.
Question 3 [4 points]:
Clyde Arthur Fazenbaker is conducting a hypothesis test to determine whether a coin is
fair. He has computed a test statistic of 0. What does that imply?
_____ He got the same number of “heads” as “tails.”
_____ He made a computational mistake, since a test statistic cannot be 0.
_____ His null hypothesis is false.
_____ He made a Type I error.
Question 4 [6 points; 2 each part]:
The Literary Digest was a prominent American magazine of news and public affairs that
is most famous for conducting the most disastrous political poll in history. In 1936 their survey
predicted that Alf Landon would defeat Franklin Roosevelt in the presidential election. In
reality, Roosevelt won in a landslide. (The Literary Digest lost so much credibility as a result
that they went bankrupt soon afterwards.) After-the-fact analysis of the survey process revealed
several flaws. For each of them, indicate whether this is an example of sampling error or of
nonsampling error.
a) While the Literary Digest sent out over 10 million surveys (by mail), they received
only about 2.5 million back.
_____ sampling error
_____ nonsampling error
b) The survey was begun in September. Some of those surveyed changed their minds
between then and the November election.
_____ sampling error
_____ nonsampling error
c) The Literary Digest obtained its lists of people to survey primarily from telephone
books and motor vehicle registrations – at a time (mid-Depression) when
relatively few people owned these items.
_____ sampling error
_____ nonsampling error
Question 5 [8 points; 4 each part]:
Dr. Rasp continues to claim that students who get a good night’s sleep before a
“knowledge festival” tend to do better on the “festival.” He decides to conduct research to
investigate this claim.
a) State (in words) Dr. Rasp’s null and alternative hypotheses.
b) Dr. Rasp asks students to write down, on their “exam,” the number of hours of sleep
they got the previous night. He then analyzes the resulting data
(sleep and “knowledge festival” grades). Dr. Rasp has …
_____ … a controlled experiment
_____ … an observational study.
Question 6 [4 points]:
Dr. Rasp maintained, in class, that “numbers are meaningless …”. What did he mean by
this seemingly anti-statistical remark? Give an example to illustrate.
Question 7 [4 points]:
Berengaria Naverre is the manager of StatsWorld, a popular new theme park. She is
reading a report, prepared by the park’s accounting and marketing research staffs, which outlines
various strategies for increasing park income. Included in the report are results from a study on
how much customers spend on souvenirs while in the park. Berengaria reads about a “95%
confidence interval for mean spending per customer” as being “$20 ± $5.” Which of these is a
proper interpretation of this result? {PICK ONE}
_____ She’s 100% sure that 95% of the customers in the population spend between $15 and $25.
_____ She’s 95% sure that 100% of the customers in the population spend between $15 and $25.
_____ She’s 100% sure that 95% of the customers in the sample spend between $15 and $25.
_____ She’s 95% sure that 100% of the customers in the sample spend between $15 and $25.
_____ None of the above.
Question 8 [4 points]:
While reading the report (in the previous question), Berengaria wonders why she has only
a 95% confidence interval rather than 100% confidence interval. Explain to Berengaria why a
100% confidence interval is infeasible.
Question 9 [6 points]:
Alphonso Ferrabosco is conducting a hypothesis test, and computes a p-value of .42.
What conclusion should he draw?
_____ Reject the null hypothesis.
_____ Don’t reject the null hypothesis.
_____ Reject the alternative hypothesis.
_____ Don’t reject the alternative hypothesis.
Question 10 [6 points]:
Horatio Wajberlinski is testing
H0: beer does not cause cancer
HA: beer does cause cancer
He gets a “reject” result on his hypothesis test. What conclusion should he draw?
_____ There is enough evidence to believe that beer causes cancer.
_____ There is enough evidence to believe that beer does not cause cancer.
_____ There is not enough evidence to believe that beer causes cancer.
_____ There is not enough evidence to believe that beer does not cause cancer.
Question 11 [4 points]:
We know that we divide by n-1 rather than by n in computing a sample (rather than
population) variance or standard deviation. Why do we do so?
Question 12 [4 points]:
Gracetta Squornshellous and Murgatroyd Applegarth, for their STAT 301 project,
conduct a survey on whether Stetson should construct a parking garage on campus. During their
presentation they tell Dr. Rasp that to obtain data they went through the Commons and (in their
words) “just handed out surveys randomly.” Dr. Rasp points out that they really have a
convenience sample rather than a random sample. What would they have needed to do, in order
to obtain a random sample?
Question 13 [4 points]:
“Placebin” is a newly developed pharmaceutical that is absolutely, completely, 100%
ineffective at treating every disease, condition, or sickness in the universe. However, the
manufacturers of placebin do not know that it is completely worthless. Hence, they conduct 100
different controlled experiments, to see whether placebin is effective in treating one of 100
different diseases. Given that placebin actually has no effect on anything, on how many of these
100 experiments can placebin’s manufacturers expect (on average) to obtain a “reject the null
hypothesis” result?
_____ on about 0 of the 100 experiments
_____ on about 50 of the 100 experiments
_____ on about 100 of the 100 experiments
_____ on about 5 of the 100 experiments
_____ on about 95 of the 100 experiments
Question 14 [4 points]:
Balph Snerdwell, for his (decidedly weak) STAT 301 project, surveys three Stetson
students and asks them how much sleep they got last night. The data were 4, 6, and 8 hours.
What is the population (rather than sample) mean amount of sleep?
_____ (4 + 6 + 8)/3
_____ (4 + 6 + 8)/2
_____ we can’t tell from the information given
Question 15 [4 points]:
What does the term “statistically significant” mean?
Question 16 [4 points]:
Jubilation T. Cornpone is testing the null hypothesis that his “lucky” (Confederate) silver
dollar is a fair coin, versus the alternative that it is not a fair coin. What is a Type I error, in this
situation?
Question 17 [4 points]:
According to the Law of Large Numbers and the Central Limit Theorem, what two things
happen to the sampling distribution of the sample mean as the sample size is increased?
Question 18 [4 points]:
Before he came to Stetson, Dr. Rasp taught for five years at the University of Alabama.
One semester, he had 600 students in his Introduction to Business Statistics class. Let’s suppose
that each one of those 600 individuals obtained data from a random sample of ten different
Alabama students on the amount of money spent on textbooks that semester. Each one of Dr.
Rasp’s 600 students computes a confidence interval for the true (unknown) population mean.
Assume that each of the 600 students computes his/her interval correctly. On average, how
many of those intervals will contain the true, unknown population mean? Why?
SOLUTIONS
1a) False
b) False
c) False
d) True
e) True
f) False (a 90% interval is narrower than a 95% interval)
g) False
h) True
2) 68% (or two-thirds); 95%
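As a quick check of these two percentages, the probability that a normal variable falls within k standard deviations of its mean can be computed from the error function. The sketch below uses only Python's standard library; the helper name prob_within is illustrative and not part of the course materials.

```python
import math

def prob_within(k: float) -> float:
    """P(|Z| <= k) for a standard normal Z, computed via the error function."""
    return math.erf(k / math.sqrt(2))

within_one = prob_within(1)   # about 0.68
within_two = prob_within(2)   # about 0.95
print(f"within 1 SD: {within_one:.4f}")
print(f"within 2 SD: {within_two:.4f}")
```

The result does not depend on the particular mean or standard deviation, since any normal variable can be standardized first.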
3) He got the same number of “heads” as “tails.”
4a) nonsampling
b) nonsampling
c) nonsampling
5a) Null hypothesis: Sleep has no effect upon ‘festival’ grade.
Alternative hypothesis: Sleep does affect (or: improves) ‘festival’ grade.
b) observational study
6) Numbers are meaningless without a frame of reference. We need some sort of notion of
context, of what constitutes “big” or “small” in a particular situation. The primary example used
in class was 14.9 million forested acres in a state – is that a lot or a little?
7) None of the above.
8) A 100% confidence interval means either (1) you have data on the entire population, or (2) your
interval is “minus infinity to plus infinity”. The first isn’t very feasible; the second isn’t
informative.
9) Don’t reject the null hypothesis.
10) There is enough evidence to believe that beer causes cancer.
11) A sample of data will tend to underestimate the variability in the entire population. The
“n-1” is an adjustment for this – dividing by a smaller number makes the result larger.
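A small simulation can illustrate the adjustment. The sketch below assumes a normal population with variance 1 and samples of size 5 (both choices are arbitrary, made only for illustration), and compares the long-run average of the two versions of the variance formula.

```python
import random

def variance(data, divisor_offset):
    """Sum of squared deviations from the sample mean, divided by n - divisor_offset."""
    mean = sum(data) / len(data)
    squared_devs = sum((x - mean) ** 2 for x in data)
    return squared_devs / (len(data) - divisor_offset)

rng = random.Random(301)   # fixed seed so the run is repeatable
n, trials = 5, 20000       # small samples make the bias easy to see
total_div_n = 0.0
total_div_n_minus_1 = 0.0
for _ in range(trials):
    sample = [rng.gauss(0, 1) for _ in range(n)]   # true population variance is 1
    total_div_n += variance(sample, 0)
    total_div_n_minus_1 += variance(sample, 1)

avg_div_n = total_div_n / trials
avg_div_n_minus_1 = total_div_n_minus_1 / trials
print(f"average when dividing by n:     {avg_div_n:.3f}")
print(f"average when dividing by n - 1: {avg_div_n_minus_1:.3f}")
```

Under these assumptions, dividing by n should average close to (n-1)/n = 0.8 of the true variance, while dividing by n-1 should average close to the true value of 1.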
12) Selection by a chance mechanism.
13) on about 5 of the 100 experiments
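The figure of 5 is simply the significance level (0.05) times 100 experiments. The simulation below is one way to see this, under assumptions that are not in the problem itself: each experiment is modeled as a two-sided z-test on 30 observations from a normal population whose true effect is 0 and whose standard deviation is known to be 1.

```python
import math
import random

rng = random.Random(2010)   # fixed seed so the run is repeatable
Z_CRIT = 1.96               # two-sided critical value for a 0.05 significance level

def experiment_rejects(n=30):
    """Run one 'placebin' experiment in which the null hypothesis is true."""
    sample = [rng.gauss(0, 1) for _ in range(n)]   # true mean effect is 0
    z = (sum(sample) / n) / (1 / math.sqrt(n))     # z statistic with known sigma = 1
    return abs(z) > Z_CRIT

batches = 300   # repeat the whole 100-experiment program many times
counts = [sum(experiment_rejects() for _ in range(100)) for _ in range(batches)]
avg_rejections = sum(counts) / batches
print(f"average rejections per 100 experiments: {avg_rejections:.2f}")
```

Every one of these rejections is a Type I error, since the null hypothesis is true in every experiment by construction.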
14) we can’t tell from the information given
15) We can reject the null hypothesis: the observed result would be unlikely to arise by chance alone if the null hypothesis were true.
16) Rejecting the null hypothesis when it is actually true: saying that the coin is not fair, when really it is fair.
17) LLN: the variance of the sampling distribution decreases
CLT: the sampling distribution becomes approximately normal
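Both effects can be seen in a simulation. The sketch below assumes a right-skewed exponential population (an arbitrary choice for illustration) and compares the sampling distribution of the mean for samples of size 2 and size 50, using the variance as a measure of spread and the skewness as a rough measure of departure from a normal shape.

```python
import random

rng = random.Random(17)   # fixed seed so the run is repeatable

def sample_means(n, reps=4000):
    """Means of `reps` samples of size n from a right-skewed (exponential) population."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

def variance_and_skewness(values):
    """Variance and skewness of a list of values (skewness is 0 for a symmetric shape)."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    skew = sum((v - m) ** 3 for v in values) / len(values) / var ** 1.5
    return var, skew

var_small, skew_small = variance_and_skewness(sample_means(2))
var_large, skew_large = variance_and_skewness(sample_means(50))
print(f"n = 2:  variance {var_small:.3f}, skewness {skew_small:.2f}")
print(f"n = 50: variance {var_large:.3f}, skewness {skew_large:.2f}")
```

With the larger sample size, the variance of the sample means should be much smaller and the skewness much closer to 0.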
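18) 95% of 600, or 570. Each correctly computed 95% confidence interval has a 95% chance of containing the true population mean.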
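The scenario in Question 18 can be simulated directly. The sketch below makes assumptions that the problem does not state: textbook spending is modeled as normal with a hypothetical mean of $400 and standard deviation of $100, and each student builds a t-based interval from a sample of ten.

```python
import math
import random
import statistics

rng = random.Random(600)            # fixed seed so the run is repeatable
TRUE_MEAN, TRUE_SD = 400.0, 100.0   # hypothetical population values (assumptions)
N = 10                              # each student samples ten Alabama students
T_CRIT = 2.262                      # two-sided 95% t critical value, 9 degrees of freedom

def interval_contains_mean():
    """Build one student's 95% confidence interval and check whether it covers the truth."""
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    center = statistics.mean(sample)
    margin = T_CRIT * statistics.stdev(sample) / math.sqrt(N)
    return center - margin <= TRUE_MEAN <= center + margin

covered = sum(interval_contains_mean() for _ in range(600))
print(f"{covered} of 600 intervals contain the true mean")
```

Any single run will land near 570 rather than exactly on it; 570 is the long-run average.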