1) The expected value is from the regression output.
The intercept was 5.0811, the slope was 2.0923
So the expected value at x=6 will be 5.0811+2.0923*6 = 17.6349
2) DFRegression = G-1 = 3 -1 = 2
DFResidual = N-G = 47 -3 = 44
But you don’t have to calculate both: once you find one you can just subtract it from the total
degrees of freedom, since the two must add up to 46
SSRegression = SUM( n*(xbar – totalaverage)^2 )
= 17*(15.21-15.92)^2 + 22*(17.32-15.92)^2 + 8*(13.59-15.92)^2 = 95.12
SSResidual = SUM( (n-1)*s^2 ) = (17-1)*4.02^2 + (22-1)*4.20^2 + (8-1)*3.81^2 = 730.62
But you don’t have to calculate both. Once you do one of the SS, you can subtract it from the
total SS of 825.74
MSRegression = 95.12/2=47.56
MSResidual = 730.62/44 = 16.60
F = 47.56/16.60 = 2.864
ANOVA
            df    SS        MS       F       P-value
Groups       2    95.12     47.56    2.864   0.06771
Residual    44    730.62    16.60
Total       46    825.74
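The arithmetic behind this table can be checked with a short script. This is only a sketch: the function name and the dictionary layout are made up for illustration, and it recomputes the overall average from the group means instead of using the rounded 15.92, so the last digit can differ slightly.

```python
# (n, mean, standard deviation) for each group, copied from the solution to problem 2.
groups = [(17, 15.21, 4.02), (22, 17.32, 4.20), (8, 13.59, 3.81)]

def anova_from_summaries(groups):
    """Build the df, SS, MS and F entries of a one-way ANOVA table
    from per-group summary statistics (hypothetical helper)."""
    n_total = sum(n for n, _, _ in groups)
    # Overall average is the n-weighted average of the group means.
    grand_mean = sum(n * mean for n, mean, _ in groups) / n_total
    ss_groups = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_resid = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_groups = len(groups) - 1
    df_resid = n_total - len(groups)
    ms_groups = ss_groups / df_groups
    ms_resid = ss_resid / df_resid
    return {
        "df": (df_groups, df_resid, n_total - 1),
        "SS": (ss_groups, ss_resid, ss_groups + ss_resid),
        "MS": (ms_groups, ms_resid),
        "F": ms_groups / ms_resid,
    }

table = anova_from_summaries(groups)
for key, value in table.items():
    print(key, value)
```

The p-value still has to come from an F-table or software; the script stops at the F statistic.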
3) The decision that the two are related is the alternative hypothesis, so the Grinch rejected the
null hypothesis. The truth was that the two variables are independent. That means the Grinch
rejected the null when the null was true, so a Type I error was made.
But the Grinch made an error before that. The p-value was larger than .05, so he should have
failed to reject the null instead of rejecting it.
5) It will decrease the width of her confidence interval.
6) H0: p1 – p2 = 0
None of the confidence intervals contain 0 (note: You are not checking whether α is in the
interval)
Even at 98% we would reject, so the p-value is smaller than 0.02 (note – you do not divide by 2)
P-value < α therefore reject.
Conclude that the Republicans and Democrats do not spend the same amount of money on
campaigns.
7) Note: It doesn’t matter that one sample has 10 times as many as the other sample, and the
percentage of dogs who die without the cure is (1000-19)/1000 = 981/1000
n1π1=981, n1(1-π1)=19, n2π2=11, n2(1-π2)=89, so the assumptions for normality are met
(p1 - p2) ± Z * sqrt( p1*(1-p1)/n1 + p2*(1-p2)/n2 )
= (981/1000 - 11/100) ± 1.88 * sqrt( (981/1000)*(1 - 981/1000)/1000 + (11/100)*(1 - 11/100)/100 )
= (0.8116, 0.9304)
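As a check on the interval above, here is a minimal sketch of the same formula in Python. The function name is hypothetical, and the multiplier 1.88 is simply the Z value used in the solution.

```python
import math

def two_proportion_ci(x1, n1, x2, n2, z):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# 981 of 1000 in the first sample, 11 of 100 in the second, Z = 1.88.
low, high = two_proportion_ci(981, 1000, 11, 100, 1.88)
print(round(low, 4), round(high, 4))
```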
8) np=49 n(1-p)=576 so the assumptions for normality are met
1) H0: p≤ 0.10
2) HA: p > 0.10
3) α = 0.05
4) Z = (p-hat - π) / sqrt( π*(1-π)/n )
= (49/625 - 0.10) / sqrt( 0.10*(1 - 0.10)/625 ) = -1.8
If you look up 1.8 (using the positive z-table) you’ll get an area of 0.9641. By symmetry, the
area to the right of -1.8 is also 0.9641. If you don’t draw a picture you’ll never get it straight.
5) p-value: 0.9641
6) Fail to reject the null
7) There is not enough evidence to conclude that the percent of veterinarians who go
bankrupt is over 10%.
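The test statistic and the upper-tail p-value for this problem can be reproduced with the standard library. This is a sketch under the same setup as the solution (null value 0.10, 49 out of 625); the function name is an assumption for illustration.

```python
import math
from statistics import NormalDist

def one_proportion_z(x, n, p0):
    """Z statistic for a one-sample proportion test, using the null value in the standard error."""
    p_hat = x / n
    se = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se

z = one_proportion_z(49, 625, 0.10)
# HA is p > 0.10, so the p-value is the area to the RIGHT of z.
p_value = 1 - NormalDist().cdf(z)
print(round(z, 2), round(p_value, 4))
```

Because z is negative and the alternative points to the right, the p-value comes out well above 0.5, which is why the null is not rejected.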
11) SSE = sum[ (n-1)*s^2 ] = (4-1)*4.6^2 + (4-1)*5.2^2 + (6-1)*4.8^2 + (6-1)*5.8^2 = 428
An alternative way:
MSE = SSE/(n-g)
5.17^2 = SSE/(20 – 4) so SSE = 5.17^2 * (20-4) = 428
12)
H0: μ ≤ 10
HA: μ > 10 (the claim)
α = .05
The measurements are random and normally distributed
Xbar = (23 + 12 + 18)/3 = 17.67
s = sqrt( ((23-17.67)^2 + (12-17.67)^2 + (18-17.67)^2) / (3-1) ) = 5.5
T2 = (17.67 - 10) / (5.5/sqrt(3)) = 2.415
Looking up 2.415 on the T-table with 2 degrees of freedom,
.05 < p-value < .10
Thus we fail to reject the null
There is not enough evidence to conclude that Santa’s bag weighs more than 10 kilotons.
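The one-sample t calculation can be sketched as follows. The helper name is hypothetical, and 2.920 is the standard t-table cutoff for an upper-tail area of .05 with 2 degrees of freedom. Since the script does not round the mean and standard deviation along the way, its t value lands near 2.41 instead of exactly 2.415.

```python
import math
from statistics import mean, stdev

def one_sample_t(data, mu0):
    """Return (xbar, s, t, df) for a one-sample t-test against mu0."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation, divides by n - 1
    t = (xbar - mu0) / (s / math.sqrt(n))
    return xbar, s, t, n - 1

xbar, s, t, df = one_sample_t([23, 12, 18], 10)
T_CRIT_05_DF2 = 2.920  # from a t-table: upper-tail .05, 2 degrees of freedom
reject = t > T_CRIT_05_DF2
print(round(xbar, 2), round(s, 2), round(t, 3), df, reject)
```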
13)
Clark rejected the null hypothesis, and Truthman said that the null hypothesis was incorrect, so
Clark did not make a Type I or Type II error. The error that Clark made was to do a z-test when
n*p<10. Here n*p = 200*.02 = 4
14)
ANOVA
              df    SS     MS      F      P-value
Regression     1    23     23      3.03   0.09845
Residual      29    220    7.58
Total         30    243
15) The correct answers are:
______ Major and gender are related
______ The distribution of majors is not the same for males and females
16) The matched pairs test has a smaller standard deviation which means a more powerful test.
18)
So the residual plot looks like a dog, which is meant to be humorous, but believe it or not
residual plots like this really do happen - usually in models that have more than one x variable,
and it’s usually because multiple assumptions have been violated.
These assumptions should be investigated as possibly problematic:
- Independence: There could be clumping in the area of -3 which is why we see a “head”
- Constant Variance: The variance near 1 is small, but the variance near -1 is large, as well as the
variance near 3.5, which is why we see “legs”
- Linearity: Since there is a curved shape (flowing from the “head” down the “body”) the
assumption of linearity could be wrong
- Normality: There should be a lot of residuals near zero, with fewer residuals further away from
zero. Instead there are few near zero, many near -1, few near 1, few near -3.5 and many near 3. A
histogram of these residuals would not be bell shaped.
The one assumption that is not obviously suspect is that the residuals are centered at zero. Zero
looks like the balancing point for the data.
19)
              DF    SS        MS        F
Regression     1    6.40      6.40      4.3048
Error         79    117.45    1.4867
Total         80    123.85
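Tables like this one (and the one in problem 14) can be completed mechanically once one row and the total are known. The sketch below assumes the regression row and the total row are the given pieces, which this answer key does not actually state; the function name is also made up.

```python
def complete_regression_anova(df_reg, df_total, ss_reg, ss_total):
    """Fill in the error row, the mean squares and F by subtraction and division."""
    df_err = df_total - df_reg
    ss_err = ss_total - ss_reg
    ms_reg = ss_reg / df_reg
    ms_err = ss_err / df_err
    return {"df_err": df_err, "ss_err": ss_err,
            "ms_reg": ms_reg, "ms_err": ms_err, "F": ms_reg / ms_err}

table19 = complete_regression_anova(1, 80, 6.40, 123.85)  # problem 19
table14 = complete_regression_anova(1, 30, 23, 243)       # problem 14
print(table19)
print(table14)
```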
20)
Horse            Age   Price   Expected   Residual
Red Fury           2   $8000   7600         400
Light Knight       9   $4000   6200       -2200
Twigger           15   $2900   5000       -2100
Pastey Beauty     25   $1000   3000       -2000
The best value will have the largest negative residual.
Light Knight saves 2200, so it is the best value.
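The residual column is just price minus expected price, so the comparison is easy to script. A small sketch, using the prices and expected values from the table (the variable names are illustrative only):

```python
# (horse, actual price, expected price from the regression), copied from the table.
horses = [
    ("Red Fury", 8000, 7600),
    ("Light Knight", 4000, 6200),
    ("Twigger", 2900, 5000),
    ("Pastey Beauty", 1000, 3000),
]

# Residual = actual - expected; a negative residual means cheaper than predicted.
residuals = {name: price - expected for name, price, expected in horses}
best_value = min(residuals, key=residuals.get)
print(residuals)
print(best_value)
```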
There is an expected cell count which is less than 5 (the Mosquito – Cayenne value
of 4.4). Therefore we cannot do this test using the methods we learned in class, and
no further conclusion (or hypothesis test) can be made.
22)
23) We have the difference in two means, with variances unpooled.
The samples are greater than 30
t80=2.639
(15 - 12) ± 2.639 * sqrt( 3.5^2/81 + 1.2^2/81 )
= (1.915, 4.085)
24)
H0: µd=0
Ha: µd≠0
α=0.05
t34=(31-48)/(10.1/sqrt(35))=-9.96
p-value off the chart ≈ 0
Reject
Our data shows one of the airlines (PilotAirOr) is significantly higher than the other
25)
Ho: µ≥4.7
HA: µ<4.7
α=0.05
t31=(4.5-4.7)/(.5/sqrt(32))=-2.26
.01<p-value<.02
Reject
Our evidence shows the Dean’s claim that honor students graduate earlier than 4.7 years is
correct.
26) Not a large enough sample size
27)
t4=2.776
(107.4-88.4)±2.776*sqrt(17.4^2/7+14.4^2/5)=( -6.55, 44.55)
28)
First let’s find the cut-off that Bill will use.
His distribution for the average is normal centered at 50 with standard error of 21/sqrt(9).
With α=0.05 the cutoff will be at z=1.96
(You could argue that Bill wants a two tailed test, but I think if he’s betting with the other guy
that he would only care about the upper tail of his test. Not like the small addition to power from
the other tail will have much probability)
1.96=(xbar-50)/(21/sqrt(9))
Xbar = 63.72
So if Bill gets an average above 63.72 he will reject (because his p-value will be smaller than
0.05 at that point)
Will’s distribution for the average is normal with mean 70 and standard error of 21/sqrt(9)
We want the probability that the average is above 63.72 under Will’s distribution (that is the
chance Bill will reject, which is the power of the test in Will’s opinion).
Z=(63.72-70)/(21/sqrt(9)) = -0.9
Probability = 1- 0.1841 = 0.8159
While that’s enough to answer the question I hope you notice that the alpha is reasonably small,
the power is reasonably high, and the sample size is rather small (if you can get titanium at
WalMart) so everyone is likely to be happy with this study, and agree to the results of it.
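The two steps (find Bill's cutoff, then find the chance of exceeding it under Will's mean) can be sketched with the standard library. The function name is hypothetical, and z = 1.96 is used only because the solution uses it. Because the script does not round Z to -0.9, the power comes out near 0.815, a hair below the table-based 0.8159.

```python
import math
from statistics import NormalDist

def upper_tail_power(mu0, mu_alt, sigma, n, z_crit):
    """Cutoff for the sample mean under mu0, and the probability of exceeding it under mu_alt."""
    se = sigma / math.sqrt(n)
    cutoff = mu0 + z_crit * se
    power = 1 - NormalDist(mu_alt, se).cdf(cutoff)
    return cutoff, power

# Bill's mean 50, Will's mean 70, standard deviation 21, n = 9, z = 1.96 as in the solution.
cutoff, power = upper_tail_power(50, 70, 21, 9, 1.96)
print(round(cutoff, 2), round(power, 4))
```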
29)
H0: The distribution is what it should be
HA: The distribution is not what it should be (the machine is not working right)
Alpha:0.05
Watermelon   Green Apple   Lemon   Cherry   Peach
.56          2.22          .5      4.09     4.09
The Chi-squared with 4 degrees of freedom is 11.46
The p-value is between 0.02 and 0.025
Reject the null
The data shows the machine is not operating within the specified parameters ( the quality control
expert now has to stop the machine and see if he can fix it)
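A quick check of the test statistic, treating the five values listed by flavor as the per-flavor chi-squared contributions (an assumption, though their sum does match the reported 11.46). The observed and expected counts are not given here, so the script starts from the contributions. The cutoff 9.488 is the standard table value for .05 with 4 degrees of freedom.

```python
# Per-flavor contributions, (observed - expected)^2 / expected, as listed in the solution.
contributions = {
    "Watermelon": 0.56,
    "Green Apple": 2.22,
    "Lemon": 0.5,
    "Cherry": 4.09,
    "Peach": 4.09,
}

chi_sq = sum(contributions.values())
df = len(contributions) - 1
CHI_CRIT_05_DF4 = 9.488  # chi-squared table: upper-tail .05, 4 degrees of freedom
reject = chi_sq > CHI_CRIT_05_DF4
print(round(chi_sq, 2), df, reject)
```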