Assignment 3 for STAT 203 – Statistics for Social Scientists

Due Friday June 29, 2012 at 4:30pm in the drop box in or near the statistics workshop.
1) Let your null hypothesis be that there are no unicorns left in the world. To test your hypothesis, you
go to a nearby magical forest. You are testing this hypothesis to the default level of significance
0.05. (Hint: The p-value is the probability of seeing observations as far against your null
hypothesis or further, assuming that the null hypothesis is true) (6 points total)
A) If you see a unicorn in the forest, what is your p-value? (1 pt.)
B) Justify your answer in part A. (Hint: Use the definition, apply it to the subject at hand.) (2
pts.)
C) Assuming you see a unicorn, what would be your decision regarding the null hypothesis? (1
pt.)
D) If you didn’t see a unicorn in the forest, could you accept the null hypothesis? Why or why
not? (2 pts.)
2) In each situation, would a large significance level alpha (more than .05) or a small
alpha (.05 or less) be more appropriate? (5 pts, 1 each)
A) A trial for which the penalty is death (null hypothesis is innocence).
B) Determining if you should take extra vitamin C to ward off a cold this morning (null
hypothesis is you’re not in danger of getting a cold).
C) A patient might have a disease that would require cutting off a leg. (Null hypothesis is that
the leg doesn’t need to be cut off).
D) You have to decide whether to look before crossing the street. (Null hypothesis is that it’s
safe to cross without looking).
E) You have an exam in the morning and you’re deciding whether to set the alarm or not. (Null
hypothesis is that you’ll wake up in time without an alarm)
3) Consider the dataset Milklong.csv on the webpage. (17 points total)
A) At the α = .050 level, determine if the calcium level from the population of Good1 is not
equal to 20. (1 pt.)
B) At .05 significance, determine if the calcium level from Good2 is less than 20. (2 pts.)
C) Construct a side-by-side boxplot of Good1 and Low. (2 pts.)
D) Test if Good1 and Low have the same mean. (α = .05) Assume that the two sets of 25 milk
bottles come from 50 completely different cows. (2 pts.)
E) Test if Good1 and Low have the same mean. (α = .05) Assume that Good1 comes from 25
cows and that Low comes from the same 25 cows after a diet change. (The 1st bottle in each
set is from the 1st cow, the 2nd bottle in each set is from the 2nd cow and so on.) (2 pts.)
F) Test if the mean calcium level in Wild is 20, as opposed to not equal to 20. (α = .05) (1 pt.)
G) Construct the side-by-side boxplot of Good1, Good2, Good3, Good4, and Wild. (2 pts.)
H) Does the milk from the Wild sample appear to be the same as the milk from the Good
samples? If there are any differences, why did you get the result that you did from part F?
(3 pts.)
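As an illustration of the difference between the designs in parts D and E, here is a minimal Python sketch that computes the test statistic both ways. The numbers are hypothetical and are not taken from Milklong.csv; the function names are invented for this example, and the independent-samples version assumes equal variances (the pooled form). It returns only the t statistic and degrees of freedom, so the p-value still has to come from a t table or from your statistics software.

```python
import math
import statistics


def independent_t(a, b):
    """Pooled two-sample t statistic and df (assumes equal variances)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, na + nb - 2


def paired_t(a, b):
    """Paired t statistic and df, computed from within-pair differences."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Hypothetical calcium readings: six cows before and after a diet change.
before = [20.1, 22.3, 19.8, 24.5, 21.0, 23.2]
after = [19.6, 21.9, 19.1, 24.1, 20.4, 22.8]

t_ind, df_ind = independent_t(before, after)
t_pair, df_pair = paired_t(before, after)
print(f"independent: t = {t_ind:.2f}, df = {df_ind}")
print(f"paired:      t = {t_pair:.2f}, df = {df_pair}")
```

With these made-up numbers the cows differ a lot from one another but each cow changes by a similar small amount, so the paired statistic comes out much larger than the independent one even though the mean difference is identical.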
4) Build a 95% confidence interval of the mean for the following pairs. (Timesaving Hint: the
standard error is the same for everything in parts A, B, and C) (8 points total)
A) x̄ = 100, s = 20, n = 4 and x̄ = 100, σ = 20, n = 4 (2 pts.)
B) x̄ = 100, s = 50, n = 25 and x̄ = 100, σ = 50, n = 25 (2 pts.)
C) x̄ = 100, s = 100, n = 100 and x̄ = 100, σ = 100, n = 100 (2 pts.)
D) Without calculating, what would the pair of confidence intervals look like for n=5813 with the
same standard error as in parts A, B, and C? (2 pts.)
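The timesaving hint can be checked directly, and the mechanics of a confidence interval are easy to sketch for the case where the population standard deviation is treated as known. The Python below is a rough illustration only: the function names are invented for this example, and the interval is computed for hypothetical values (mean 50, standard deviation 12, n = 36) rather than for any pair in this question. When only the sample standard deviation s is available, the normal critical value should be replaced by a t critical value with n - 1 degrees of freedom, which the standard library does not provide.

```python
import math
from statistics import NormalDist


def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)


def z_interval(mean, sigma, n, confidence=0.95):
    """Confidence interval for a mean when sigma is treated as known."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * standard_error(sigma, n)
    return mean - margin, mean + margin


# Check the hint: the (sd, n) combinations in parts A, B, and C.
ses = [standard_error(sd, n) for sd, n in [(20, 4), (50, 25), (100, 100)]]
print(ses)

# Hypothetical example, not one of the assigned pairs.
low, high = z_interval(50, 12, 36)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```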
5) (3 points total)
A) If you did a one-sample t-test using a sample of size n=10, how many degrees of freedom df
would you have? (1 pt.)
B) If you did a two-sample t-test with samples of size n1=10 and n2 = 10 to determine if two
means were equal, how many degrees of freedom would you have? (1 pt.)
C) If you were testing to determine if three means were equal using three samples of size
n1=10, n2 = 10, and n3 = 10, how many degrees of freedom would you have? (1 pt.)
6) Consider two analyses from the same data set. (9 points total)
First, we consider the two groups to be 71 teenage boys and 71 teenage girls with heights measured at
the same age. We get the following result from a two-sample t-test (α = .05).
a) What type of t-test is this? (1 pt.)
b) What conclusion do you make from this test? (result just below this) (2 pts.)
Next, we consider the two groups as 71 sets of teenage sisters and brothers and do a different two-sample t-test (α = .05).
c) What type of t-test is this? (1 pt.)
d) What conclusion do you make from this test? (result below) (2 pts.)
e) Why are your conclusions different even though the data set is the same? (3 pts.)
7) Consider the dataset Dragons.sav , which has some biological information from 300 bearded
dragons that are from a local pet distributor. (6 points total)
A) Many creatures have a 1-1 sex ratio, meaning that .50 are males and .50 are females. Using
α = .01, determine if this is the case with the dragons from this distributor. (2 pts.)
B) What is the 95% confidence interval of the proportion of female bearded dragons? (1 pt.)
C) In the wild, .20 of bearded dragons are ‘fancy’ colours, which means anything but the
default green. However, fancy colours tend to sell for more money, so sometimes there is
intentional breeding of fancy dragons to increase the proportion of fancy dragons to higher
than .20. Is there evidence that these dragons are the result of intentional breeding at
α = .01? (2 pts.)
D) What is the 99% confidence interval of the proportion of fancy dragons? (Hint: To change the
confidence level from the default 95%, click the ‘Options’ button in the one-sample t-test
pop-up) (1 pt.)
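For anyone who wants to sanity-check software output by hand, the following Python sketch shows one common way to test a proportion and build an interval for it. It uses the normal approximation, which is a different technique from the one-sample t-test dialog mentioned in the hint, so the numbers will be close to but not identical with that output. The count of 165 females out of 300 is hypothetical and is not taken from Dragons.sav; the function names are invented for this example.

```python
import math
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Two-sided z test of H0: p = p0 using the normal approximation."""
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se0
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical count: 165 females among 300 dragons.
z, p_value = proportion_z_test(165, 300, 0.50)
low, high = proportion_interval(165, 300, confidence=0.95)
print(f"z = {z:.3f}, two-sided p = {p_value:.4f}")
print(f"95% CI: ({low:.4f}, {high:.4f})")
```

For a one-sided alternative such as "higher than .20", the p-value would instead be the upper-tail area, 1 - NormalDist().cdf(z), and passing confidence=0.99 gives a 99% interval.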
For interest:
Super Freakonomics, by Steven D. Levitt and Stephen J. Dubner. You have a heavy workload ahead, so
this isn’t for marks and won’t be on any test, but I highly recommend you read the chapter “Why Do
Most Drug Dealers Live With Their Moms” if you’re in the criminology track.
http://books.google.ca/books?id=66Dm4p1wxqUC