Midterm 2, Question 2, 3, 4. Odds. Odds Ratio.

Fractal by Emma Taylor.
Question 2: Test the null that the mean is 40 against the
two-tailed alternative.
x̄ = 46, s = 6, n = 9
First, identify: We only have one sample mean, so this is a one
sample t-test. We are given that it is a two-tailed test.
(HA, the alternative hypothesis is two-tailed)
Not part of the question, but it really helps.
a) Would you reject the null at α = 0.01?
x̄ = 46, s = 6, n = 9.
H0: μ = 40. HA: μ ≠ 40
This is a t-test, so look at the t formulas given.
The formula that needs r: NO, we don’t have r.
The formula that needs two sample means: NO, we only have one.
The one-sample formula: YES.
This is the one sample t-test.
We have the necessary data.
t = (x̄ − μ0) / (s / √n) = (46 − 40) / (6 / √9) = 6 / 2 = 3.
This sample mean is 3 standard errors above 40.
Also, df = 8 (n=9 minus 1 because we have one mean)
Looking at the t-table:
t* = 3.355 > t = 3. So we fail to reject the null.
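The calculation in part (a) can be sketched in Python. This is a minimal sketch using only the numbers on this slide; the function name one_sample_t is an illustrative choice, and the critical value 3.355 is copied from the t-table rather than computed.

```python
import math

def one_sample_t(mean, mu0, s, n):
    """Return the one-sample t statistic and its degrees of freedom."""
    standard_error = s / math.sqrt(n)  # estimated standard deviation of the sample mean
    return (mean - mu0) / standard_error, n - 1

t, df = one_sample_t(mean=46, mu0=40, s=6, n=9)

# Two-tailed critical value at alpha = 0.01 with df = 8, taken from the t-table.
t_star = 3.355
reject = abs(t) > t_star

print(t, df, reject)  # 3.0 8 False
```

Because |t| = 3 does not exceed t* = 3.355, the sketch reports that we fail to reject the null, matching the slide.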
b) Give a range for the p-value.
t* at 0.02 significance, this is the t value when p=0.02
t* at 0.01 significance, this is the t value when p=0.01
t* when p = 0.02: 2.896
t* when p = 0.01: 3.355
t = 3 is between 2.896 and 3.355,
so the p-value is between 0.01 and 0.02.
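The table lookup in part (b) can be written as a small helper. This is a sketch only: the dictionary holds just the two df = 8 table entries quoted above, and the name p_value_range is hypothetical.

```python
# Two-tailed critical values for df = 8, copied from the t-table rows quoted above.
T_TABLE_DF8 = {0.02: 2.896, 0.01: 3.355}

def p_value_range(t, table):
    """Bracket the two-tailed p-value between adjacent table entries."""
    lower, upper = 0.0, 1.0
    for p, t_star in table.items():
        if abs(t) >= t_star:
            upper = min(upper, p)  # t reaches this critical value, so p-value <= p
        else:
            lower = max(lower, p)  # t falls short of it, so p-value > p
    return lower, upper

print(p_value_range(3.0, T_TABLE_DF8))  # (0.01, 0.02)
```

With more rows of the table added to the dictionary, the same loop would give the tightest bracket the table allows.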
c) Would you reject the null if you knew σ = 6?
Key idea: What’s the difference between σ = 6 and s=6?
σ is the population standard deviation.
s is the sample standard deviation.
s is what we use in the place of σ when we don’t know σ.
We use the normal distribution (df = infinity) with σ, and the
t-distribution with s (df = 8).
The score calculation is the same:
z = 3.
But we compare to z* using infinite degrees of freedom.
z = 3, which is larger than z* = 2.576, so we reject the null.
Key point: It’s easier to find a significant difference when you
know the true standard deviation instead of estimating it.
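Part (c) can be checked with Python's standard library, which includes the normal distribution. This is a minimal sketch using the slide's numbers; the exact p-value it computes is an addition, not something the slides report.

```python
import math
from statistics import NormalDist

mean, mu0, sigma, n, alpha = 46, 40, 6, 9, 0.01

# Same score calculation as before, but sigma is known, so it is a z score.
z = (mean - mu0) / (sigma / math.sqrt(n))

standard_normal = NormalDist()  # mean 0, standard deviation 1
z_star = standard_normal.inv_cdf(1 - alpha / 2)      # two-tailed critical value
p_value = 2 * (1 - standard_normal.cdf(abs(z)))      # two-tailed p-value

reject = abs(z) > z_star
print(round(z_star, 3), reject)  # 2.576 True
```

The only change from the t-test is the reference distribution: the same score of 3 clears the normal cutoff of 2.576 but not the t cutoff of 3.355 at df = 8.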
See: Assn 3, Q4 (With all the confidence intervals)
Getting the ‘AHA’ lightning strike?
Just like Edison.
Question 3: Given a paired test of the difference between two
means, we find a mean difference of 5, and a sample standard
deviation of the difference of 20 from our 25 pairs.
First, identify: We already know it’s a paired sample t-test.
What else can we find?
Mean difference = D or x̄_diff = 5
Sample standard deviation of the difference = s = 20
Number of pairs = n = 25
a) Construct a 90% confidence interval of the difference.
Formula for confidence interval:
x̄_diff − t*·s/√n < μ < x̄_diff + t*·s/√n
μ is the true difference, which we don’t know but are hoping
to capture. (90% of the time, at least)
x̄_diff is the value that everything centers around.
We also need t*.
To find t*, consult the t-table. (2 sided, 10% significance).
Why? C.I.s are 2-tailed, and 90% inside means 10% outside.
t* = 1.711.
t* = 1.711, x̄_diff = 5, s = 20, n = 25.
Plug in the values to get your confidence interval.
5 ± 1.711 × 20/√25 = 5 ± 6.844, or (-1.844 to 11.844)
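The plug-in step for the interval can be sketched as follows. The function name paired_ci is illustrative, and t* = 1.711 is taken from the t-table (df = 24, 10% two-sided) rather than computed.

```python
import math

def paired_ci(mean_diff, s_diff, n_pairs, t_star):
    """Confidence interval for the mean of paired differences."""
    margin = t_star * s_diff / math.sqrt(n_pairs)  # t* times the standard error
    return mean_diff - margin, mean_diff + margin

low, high = paired_ci(mean_diff=5, s_diff=20, n_pairs=25, t_star=1.711)
print(round(low, 3), round(high, 3))  # -1.844 11.844
```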
b) Is there a significant difference between the groups at the
0.10 level? Use the confidence interval to justify your
decision.
To test the null hypothesis that there is no difference at .10,
we can use the confidence interval of the difference.
If the interval includes zero, then a difference of zero is
plausible and we fail to reject. Otherwise reject.
(-1.844 to 11.844) includes zero, so fail to reject the null.
There is no significant difference between the two groups.
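The decision rule in part (b) amounts to a single comparison. A minimal sketch, assuming the interval's confidence level matches the test (a 90% interval for a two-tailed test at 0.10); the function name is hypothetical.

```python
def reject_null_from_ci(low, high, null_value=0.0):
    """Reject the null only when the null value lies outside the interval."""
    return not (low <= null_value <= high)

# The 90% interval from part (a) contains zero, so we fail to reject.
print(reject_null_from_ci(-1.844, 11.844))  # False
```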
Do these problems seem less overwhelming now?
Question 4: Consider the intervention program of 1C. We’re
also interested in if the intervention group has higher grade
averages than the non-intervention group.
(Non-intervention is “control”, that’s why C was used for that group)
x̄I = 61, SI = 21, nI = 125
x̄C = 58, SC = 26, nC = 125
a) Describe the test we would perform.
One-tailed, independent samples test.
b) Justify your choice of paired or independent.
If it were paired….
- There would be a link between a student in the
intervention group and one in the non-intervention group.
- We would be comparing individuals instead of entire
groups.
- There would be only one standard deviation.
- Example: 125 students before and after.
- Example: 125 pairs of older and younger sibling.
There is no indication of such a pairing structure, so we use the
independent samples test.
c) From this SPSS output table what is your conclusion (in
both statistical and real life terms) at α = 0.10.
First, what are we looking for?
Sig. (2-tailed)
OR
t
With Sig. (2-tailed) we can compare the p-value to α directly.
With t we can do a t-test using the table.
Sig. (2-tailed) is the p-value for a 2-tailed test, but we have a
1-tailed test.
The observed difference is in the hypothesized direction (61 > 58),
so cut the p-value in half. p = 0.193 / 2 = 0.0965
p-value = 0.0965 < 0.10 = α. So reject the null.
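The conversion from the SPSS two-tailed p-value to a one-tailed one can be sketched as below. The function name is hypothetical, and the branch for a difference in the wrong direction (1 − p/2) is the standard rule added for completeness; it is not needed for this question.

```python
def one_tailed_p(two_tailed_p, observed_diff, hypothesized_sign=+1):
    """Convert a two-tailed p-value into a one-tailed p-value.

    Halve it when the observed difference points in the hypothesized
    direction; otherwise the one-tailed p-value is 1 - p/2.
    """
    if observed_diff * hypothesized_sign > 0:
        return two_tailed_p / 2
    return 1 - two_tailed_p / 2

alpha = 0.10
p = one_tailed_p(0.193, observed_diff=61 - 58)  # intervention minus control
print(p, p < alpha)  # 0.0965 True
```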
That’s one option. There’s one more.
t = 1.304, but df has been removed from the table.
Each of the samples is size n=125, so we know there are at
least 124 degrees of freedom.
From the t-table at 120 df, and 0.10 one-tailed significance:
t* = 1.289.
1.304 = t > t* = 1.289, so we
reject the null.
One more note:
If the confidence interval in the output were 80% instead of 95%, we
could also have used it and checked whether it included zero.
80% in the middle means 20% on the outside, so 10% on either
side.
d) How many degrees of freedom can you guarantee without
assuming pooled standard deviation?
Without using pooled standard deviation, the degrees of
freedom are the lower of
nI – 1 and nC – 1.
Since nI = 125 and nC = 125, this means we can guarantee
124 degrees of freedom.
e) How many degrees of freedom are there if you DO assume
pooled standard deviation?
When we pool the standard deviation, we pool the degrees
of freedom too. In that case, each group has
n -1 degrees of freedom,
where n is the size of that group.
(nI – 1) + (nC – 1) = 248
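The two degrees-of-freedom rules from parts (d) and (e) side by side, as a short sketch. The function names are illustrative.

```python
def conservative_df(n1, n2):
    """Guaranteed df without pooling: the smaller of the two group df."""
    return min(n1 - 1, n2 - 1)

def pooled_df(n1, n2):
    """df with a pooled standard deviation: the two group df added together."""
    return (n1 - 1) + (n2 - 1)

print(conservative_df(125, 125), pooled_df(125, 125))  # 124 248
```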
I hope you feel you can navigate this material better now.
Odds (Found near the end of Ch.11, P.403)
Odds are a lot like probability, but are calculated differently.
Probability of event =
Times event occurs / Times anything occurs
Example: The probability of rolling a “4” on a six-sided die is:
Pr( Rolling a 4) = One face / Six faces in total = 1/6.
Odds are calculated as
Odds of event =
Times event occurs / Times event DOESN’T occur
In odds, “ / ” should be read “to”.
In probability, “ / ” is read “in”.
Example: The ODDS of rolling a 4 on a six-sided die are:
Odds( Rolling a 4 ) = One face / Five faces = 1/5, 1 to 5.
Probability: 1/6, read “1 in 6”.
Odds: 1/5, read “1 to 5”.
The general formula for odds can be computed from the
probability. (P is the probability)
Odds = P / (1 − P)
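The relationship between probability and odds, Odds = P / (1 − P), can be sketched with exact fractions so the result reads directly as "1 to 5". The function names are illustrative, and the reverse conversion is an addition that follows from rearranging the same formula.

```python
from fractions import Fraction

def odds_from_probability(p):
    """Odds = P / (1 - P)."""
    p = Fraction(p)
    return p / (1 - p)

def probability_from_odds(odds):
    """Inverse of the formula above: P = Odds / (1 + Odds)."""
    odds = Fraction(odds)
    return odds / (1 + odds)

die_odds = odds_from_probability(Fraction(1, 6))
print(die_odds)                         # 1/5, read "1 to 5"
print(probability_from_odds(die_odds))  # 1/6, read "1 in 6"
```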
Sometimes odds of doing something are interpreted as:
We use odds when we’re interested in comparing how often
an event happens to its opposite, or its converse.
This week Dr. Julio Montaner, an HIV expert, called for
everyone who is sexually active in British Columbia to be
screened for HIV. (voluntarily, I assume).
B.C. aims to end HIV/AIDS with widespread testing
Aim is to have every person in B.C. who's ever been sexually active tested
CBC News
Posted: Jul 18, 2012 8:27 PM PT
Last Updated: Jul 19, 2012 7:05 AM PT
Assume the screening test is perfect.
If we test a random person, there is probability P = 1/1000
they have HIV. (Fairly accurate P for British Columbia)
What are the odds of a random person testing positive for
HIV?
Odds = .001 / .999
= 1 / 999
or 1 to 999.
Is the benefit of finding one case greater than the cost of
wasting the test on 999 people?
For interest: Sports gambling uses odds as payout structure,
because if the odds are fair, that is…
Odds = Pr( Right Bet) / Pr(Wrong Bet)
… then the gambler will neither win nor lose money on
average.
Example: The casino is giving 9/1 odds against the Red Sox
winning the World Series.
That means they think
Pr( Red Sox Win ) / Pr (Red Sox Lose) = 1 / 9 (or less)
So they’re paying $9 for every $1 bet on the Red Sox to win.
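The claim that fair odds leave the gambler breaking even on average can be checked with a short expected-value calculation. This is a sketch under simple assumptions: a $1 stake that is lost on a wrong bet, and a profit of $9 on a right one. The function name and the second win probability (1/20) are hypothetical.

```python
from fractions import Fraction

def expected_profit(p_win, payout_per_dollar, stake=1):
    """Average profit of a bet that pays `payout_per_dollar` per $1 staked
    when it wins and loses the stake when it does not."""
    p_win = Fraction(p_win)
    return p_win * payout_per_dollar * stake - (1 - p_win) * stake

# Odds of 1 to 9 correspond to a win probability of 1/10: the bet is fair.
print(expected_profit(Fraction(1, 10), 9))  # 0

# If the true win probability is lower (hypothetical 1/20), the gambler loses on average.
print(expected_profit(Fraction(1, 20), 9))  # -1/2
```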
Some people never want to hear the odds.
Next time: Odds ratio, Tests of contingency plots.