Tests of Significance

Hypothesis Tests
One Sample Means
Example 2: A government agency has received numerous
complaints that a particular restaurant has been selling
underweight hamburgers. The restaurant advertises that its
patties are “a quarter pound” (4 ounces).
How can I tell if they really are underweight?
A hypothesis test will help me decide!
Take a sample & find x̄.
But how do I know if this x̄ is one that I expect to happen
or is it one that is unlikely to happen?
What are hypothesis tests?
Calculations that tell us if a value
occurs by random chance or not – if
it is statistically significant
Is it . . .
– a random occurrence due to
variation?
– a biased occurrence due to some
other reason?
Nature of hypothesis tests
• First, begin by supposing the “effect” is NOT present
• Next, see if the data provide evidence against the
supposition
Example: murder trial
How does a murder trial work?
First – assume that the person is innocent
Then – must have sufficient evidence to prove guilty
Steps:
Notice the steps are the
same except we add
hypothesis statements –
which you will learn today
1) Assumptions
2) Hypothesis statements &
define parameters
3) Calculations
4) Conclusion, in context
Assumptions for z-test (t-test):
• Have an SRS of context
• Distribution is (approximately) normal
– Given
– Large sample size
– Graph data
• σ is known (unknown)
YES – These are the same assumptions as confidence
intervals!!
Example 1: Bottles of a popular cola
are supposed to contain 300 mL of
cola. There is some variation from
bottle to bottle. An inspector, who
suspects that the bottler is underfilling, measures the contents of six
randomly selected bottles. Are the
assumptions met?
299.4 297.7 298.9 300.2 297 301
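With only six bottles, neither “given” nor “large sample size” applies, so the normality assumption has to be checked by graphing the data. The sketch below computes the points of a normal probability plot for the cola data using only the Python standard library. The helper name normal_plot_points and the plotting positions (i - 0.5)/n are assumptions made for illustration; the slides do not specify either.

```python
from statistics import NormalDist, mean, stdev

# Contents (mL) of the six randomly selected bottles from Example 1.
cola = [299.4, 297.7, 298.9, 300.2, 297, 301]

def normal_plot_points(data):
    """Pair each sorted observation with an expected standard normal z-score.

    Uses plotting positions (i - 0.5) / n, which is an assumed convention.
    """
    n = len(data)
    z_scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z_scores, sorted(data)))

points = normal_plot_points(cola)
for z, x in points:
    print(f"z = {z:6.3f}   x = {x:6.1f}")

# Summary statistics that the test calculations would use later.
print(f"n = {len(cola)}, mean = {mean(cola):.2f}, s = {stdev(cola):.2f}")
```

If the plotted points fall roughly along a straight line, treating the distribution as approximately normal is reasonable.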
Writing Hypothesis statements:
• Null hypothesis – is the statement
being tested; this is a statement of
“no effect” or “no difference”
H0:
• Alternative hypothesis – is the
statement that we suspect is true
Ha:
The form:
Null hypothesis
H0: parameter = hypothesized value
Alternative hypothesis
Ha: parameter > hypothesized value
Ha: parameter < hypothesized value
Ha: parameter ≠ hypothesized value
Example 2: A government agency has
received numerous complaints that a
particular restaurant has been selling
underweight hamburgers. The
restaurant advertises that its patties
are “a quarter pound” (4 ounces).
State the hypotheses :
H0: μ = 4
Ha: μ < 4
Where μ is the true
mean weight of
hamburger patties
Example 3: A car dealer advertises
that its new subcompact models get
47 mpg. You suspect the mileage
might be overrated.
State the hypotheses :
H0: μ = 47
Ha: μ < 47
Where μ is the
true mean mpg
Example 4: Many older homes have electrical
systems that use fuses rather than circuit
breakers. A manufacturer of 40-A fuses
wants to make sure that the mean amperage at
which its fuses burn out is in fact 40. If the
mean amperage is lower than 40, customers
will complain because the fuses require
replacement too often. If the amperage is
higher than 40, the manufacturer might be
liable for damage to an electrical system due
to fuse malfunction. State the hypotheses :
H0: μ = 40
Ha: μ ≠ 40
Where μ is the true
mean amperage of
the fuses
Facts to remember about hypotheses:
• ALWAYS refer to populations
(parameters)
• The null hypothesis for the
“difference” between populations is
usually equal to zero
H0: μx-y= 0
P-values
• The probability that the test statistic would have a
value as extreme or more extreme than what is actually
observed, assuming the null hypothesis is true
Level of significance
• Is the amount of evidence
necessary before we begin to doubt
that the null hypothesis is true
• Is the probability that we will
reject the null hypothesis, assuming
that it is true
• Denoted by α
– Can be any value
– Usual values: 0.1, 0.05, 0.01
– Most common is 0.05
Statistically significant –
• The p-value is as small or smaller
than the level of significance (α)
• If p > α, “fail to reject” the null
hypothesis at the α level.
• If p ≤ α, “reject” the null
hypothesis at the α level.
Facts about p-values:
• ALWAYS make decision about the null
hypothesis!
• Large p-values show support for the
null hypothesis, but never that it is
true!
• Small p-values show support that the
null is not true.
• Double the p-value for two-tailed (≠)
tests
• Never accept the null hypothesis!
At an α level of .05, would you
reject or fail to reject H0 for
the given p-values?
a) .03 – Reject
b) .15 – Fail to reject
c) .45 – Fail to reject
d) .023 – Reject
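The decision rule can be written as a one-line comparison. This is a minimal sketch; the function name decide is hypothetical, and it rejects when p is as small as or smaller than α, as in the definition of statistically significant above.

```python
def decide(p_value, alpha=0.05):
    """Decision about H0: reject when p <= alpha, otherwise fail to reject."""
    return "Reject" if p_value <= alpha else "Fail to reject"

p_values = [0.03, 0.15, 0.45, 0.023]
decisions = [decide(p) for p in p_values]
for label, p, d in zip("abcd", p_values, decisions):
    print(f"{label}) p = {p}: {d}")
```

Note that the function never returns “accept”: the only two outcomes are rejecting or failing to reject the null hypothesis.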
Calculating p-values
• For z-test statistic –
– Use normalcdf(lb,rb,0,1)
– [using standard normal curve]
• For t-test statistic –
– Use tcdf(lb, rb, df)
Draw & shade a curve &
calculate the p-value:
1) right-tail test
t = 1.6; n = 20
2) left-tail test
z = -2.4; n = 15
3) two-tail test
t = 2.3; n = 25
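For readers without a calculator, the three tail areas can be approximated in Python. This is a sketch under stated assumptions: NormalDist from the standard library stands in for normalcdf, and the hypothetical helpers t_pdf and t_upper_tail stand in for tcdf by integrating the t density numerically with Simpson's rule, which is not how a calculator computes it. Degrees of freedom are taken as n - 1; the sample size is not used for the z statistic.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|] (steps is even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |t|
    return 0.5 - area if t >= 0 else 0.5 + area

def p_value(stat, tail, df=None):
    """p-value for a z statistic (df=None) or a t statistic with df given."""
    if df is None:
        upper = 1 - NormalDist().cdf(stat)  # plays the role of normalcdf(stat, 1E99, 0, 1)
    else:
        upper = t_upper_tail(stat, df)      # plays the role of tcdf(stat, 1E99, df)
    if tail == "right":
        return upper
    if tail == "left":
        return 1 - upper
    return 2 * min(upper, 1 - upper)        # two-tail: double the smaller tail

p1 = p_value(1.6, "right", df=20 - 1)   # 1) right-tail test, t = 1.6, n = 20
p2 = p_value(-2.4, "left")              # 2) left-tail test, z = -2.4
p3 = p_value(2.3, "two", df=25 - 1)     # 3) two-tail test, t = 2.3, n = 25
print(round(p1, 4), round(p2, 4), round(p3, 4))
```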
Writing Conclusions:
1) A statement of the decision
being made (reject or fail to
reject H0) & why (linkage)
AND
2) A statement of the results in
context. (state in terms of Ha)
“Since the p-value < (>) α,
I reject (fail to reject)
the H0. There is (is not)
sufficient evidence to
suggest that Ha.”
Be sure to write Ha in
context (words)!
Example 5: Drinking water is considered unsafe if the mean
concentration of lead is 15 ppb (parts per billion) or
greater. Suppose a community randomly selects 25 water
samples and computes a t-test statistic of 2.1. Assume that
lead concentrations are normally distributed. Write the
hypotheses, calculate the p-value & write the appropriate
conclusion for α = 0.05.
H0: μ = 15
Ha: μ > 15
Where μ is the true mean concentration of lead in drinking
water
t = 2.1
P-value = tcdf(2.1,10^99,24) = .0232
Since the p-value < α, I reject H0. There is sufficient
evidence to suggest that the mean concentration of lead in
drinking water is greater than 15 ppb.
Example 6: A certain type of frozen dinner states that the
dinner contains 240 calories. A random sample of 12 of
these frozen dinners was selected from production to see if
the caloric content was greater than stated on the box. The
t-test statistic was calculated to be 1.9. Assume calories
vary normally. Write the hypotheses, calculate the p-value
& write the appropriate conclusion for α = 0.05.
H0: μ = 240
Ha: μ > 240
Where μ is the true mean caloric content of the frozen
dinners
t = 1.9
P-value = tcdf(1.9,10^99,11) = .0420
Since the p-value < α, I reject H0. There is sufficient
evidence to suggest that the true mean caloric content of
these frozen dinners is greater than 240 calories.
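The two tcdf results in Examples 5 and 6 can be cross-checked without a calculator. The sketch below is an approximation under stated assumptions: t_pdf and t_upper_tail are hypothetical helpers that integrate the t density with Simpson's rule, standing in for tcdf(t, 10^99, df) with df = n - 1.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|] (steps is even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area

alpha = 0.05
# (t statistic, sample size) as stated in the two examples; both are right-tail tests.
examples = {"Example 5 (lead)": (2.1, 25), "Example 6 (frozen dinners)": (1.9, 12)}

p_values = {}
for name, (t_stat, n) in examples.items():
    p = t_upper_tail(t_stat, n - 1)
    p_values[name] = p
    decision = "reject H0" if p <= alpha else "fail to reject H0"
    print(f"{name}: df = {n - 1}, p-value = {p:.4f}, {decision}")
```

Both p-values should land close to the values on the slides (.0232 and .0420), which leads to rejecting H0 at α = 0.05 in each case.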
Formulas:
σ known:
test statistic = (statistic - parameter) / (standard deviation of statistic)
z = (x̄ - μ) / (σ / √n)
Formulas:
σ unknown:
test statistic = (statistic - parameter) / (standard error)
t = (x̄ - μ) / (s / √n)
Example 7: The Fritzi Cheese Company buys milk from
several suppliers as the essential raw material for its
cheese. Fritzi suspects that some producers are
adding water to their milk to increase their profits.
Excess water can be detected by determining the
freezing point of milk. The freezing temperature of
natural milk varies normally, with a mean of -0.545
degrees and a standard deviation of 0.008. Added
water raises the freezing temperature toward 0
degrees, the freezing point of water (in Celsius). The
laboratory manager measures the freezing
temperature of five randomly selected lots of milk
from one producer with a mean of -0.538 degrees. Is
there sufficient evidence to suggest that this
producer is adding water to his milk?
Assumptions: (SRS? Normal? How do you know? Do you know σ?)
• I have an SRS of milk from one producer
• The freezing temperature of milk is a normal distribution. (given)
• σ is known
Hypotheses: (What are your hypothesis statements? Is there a key word?)
H0: μ = -0.545
Ha: μ > -0.545
where μ is the true mean freezing temperature of milk
Calculations: (Plug values into formula. Use normalcdf to calculate p-value.)
z = (-0.538 - (-0.545)) / (0.008 / √5) = 1.9566
p-value = normalcdf(1.9566,1E99) = .0252
α = .05
Conclusion: (Compare your p-value to α & make decision. Write conclusion in
context in terms of Ha.)
Since p-value < α, I reject the null hypothesis.
There is sufficient evidence to suggest that the true mean freezing
temperature is greater than -0.545. This suggests that the
producer is adding water to the milk.
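The calculation step of Example 7 can be reproduced in a few lines. This is a minimal sketch using the numbers given in the example; NormalDist from the Python standard library plays the role of normalcdf, and the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

mu_0 = -0.545    # hypothesized mean freezing temperature (H0)
sigma = 0.008    # known population standard deviation
n = 5            # number of lots measured
x_bar = -0.538   # sample mean freezing temperature
alpha = 0.05

# z = (statistic - parameter) / (standard deviation of statistic)
z = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Right-tail area, because Ha: mu > -0.545
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.4f}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```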
Example 9: The Wall Street Journal
(January 27, 1994) reported that based
on sales in a chain of Midwestern grocery
stores, President’s Choice Chocolate Chip
Cookies were selling at a mean rate of
$1323 per week. Suppose a random sample
of 30 weeks in 1995 in the same stores
showed that the cookies were selling at
the average rate of $1208 with standard
deviation of $275. Does this indicate that
the sales of the cookies are different from
the earlier figure?
Assume:
•Have an SRS of weeks
•Distribution of sales is approximately normal due to large
sample size
• σ unknown (use t-distribution)
H0: μ = 1323
Ha: μ ≠ 1323
where μ is the true mean cookie sales per
week
t = (1208 - 1323) / (275 / √30) = -2.29
p-value = .0295
Since p-value < α of 0.05, I reject the null hypothesis. There
is sufficient evidence to suggest that the sales of cookies are
different from the earlier figure.
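The cookie-sales test statistic and its two-tailed p-value can be checked as follows. This is a sketch under stated assumptions: t_pdf and t_upper_tail are hypothetical helpers that approximate the calculator's tcdf by Simpson's rule, and the two-tailed p-value is taken as double the one-tail area.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|] (steps is even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area

mu_0, x_bar, s, n = 1323, 1208, 275, 30
alpha = 0.05

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))   # negative: sample mean is below 1323
p_value = 2 * t_upper_tail(abs(t_stat), n - 1) # two-tailed (Ha: mu != 1323)

print(f"t = {t_stat:.2f}, df = {n - 1}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```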
Example 9: President’s Choice Chocolate
Chip Cookies were selling at a mean rate of
$1323 per week. Suppose a random sample
of 30 weeks in 1995 in the same stores
showed that the cookies were selling at the
average rate of $1208 with standard deviation
of $275. Compute a 95% confidence interval
for the mean weekly sales rate.
CI = ($1105.30, $1310.69)
Based on this interval, is the mean weekly
sales rate statistically different from the
reported $1323?
What do you notice about the decision from the
confidence interval & the hypothesis test?
What decision would you make on Example 9 if α =
.01?
What confidence level would be correct to use?
Does that confidence interval provide the same
decision?
If Ha: μ < 1323, what decision would the hypothesis
test give at α = .05? α = .01?
Now, what confidence levels are appropriate for
this alternative hypothesis?
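One way to explore these questions is to compute the intervals directly. In the sketch below, a two-sided test at level α is paired with a confidence level of 1 - α. The helpers t_pdf, t_upper_tail, t_critical and mean_interval are hypothetical names; the critical value is found by bisection on a Simpson's-rule tail area, which is an assumed stand-in for a calculator's inverse-t function.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|] (steps is even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area

def t_critical(conf_level, df):
    """Find t* with P(-t* < T < t*) = conf_level by bisection."""
    target = (1 - conf_level) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > target:
            lo = mid   # tail area still too large, so t* lies further out
        else:
            hi = mid
    return (lo + hi) / 2

def mean_interval(x_bar, s, n, conf_level):
    """t confidence interval for a mean: x_bar +/- t* * s / sqrt(n)."""
    margin = t_critical(conf_level, n - 1) * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

ci95 = mean_interval(1208, 275, 30, 0.95)
ci99 = mean_interval(1208, 275, 30, 0.99)
for level, (low, high) in (("95%", ci95), ("99%", ci99)):
    inside = low <= 1323 <= high
    print(f"{level} CI = ({low:.2f}, {high:.2f}); contains 1323: {inside}")
```

An interval that excludes $1323 agrees with rejecting H0 in the two-sided test at the matching α; an interval that contains it agrees with failing to reject.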
Matched Pairs Test
A special type of t-inference
Matched Pairs – Two Forms
• Pair individuals by
certain characteristics
• Randomly select
treatment for individual
A
• Individual B is assigned
to other treatment
• Assignment of B is
dependent on
assignment of A
• Individual persons or
items receive both
treatments
• Order of treatments
are randomly
assigned or before &
after measurements
are taken
• The two measures are
dependent on the
individual
Is this an example of matched pairs?
1)A college wants to see if there’s a difference in
time it took last year’s class to find a job after
graduation and the time it took the class from five
years ago to find work after graduation.
Researchers take a random sample from both
classes and measure the number of days between
graduation and first day of employment
No, there is no pairing of individuals, you have
two independent samples
Is this an example of matched pairs?
2) In a taste test, a researcher asks people in a
random sample to taste a certain brand of spring
water and rate it. Another random sample of
people is asked to taste a different brand of water
and rate it. The researcher wants to compare these
samples
No, there is no pairing of individuals, you have
two independent samples – If you would have
the same people taste both brands in random
order, then it would be an example of matched
pairs.
Is this an example of matched pairs?
3) A pharmaceutical company wants to test its new
weight-loss drug. Before giving the drug to a
random sample, company researchers take a
weight measurement on each person. After a
month of using the drug, each person’s weight is
measured again.
Yes, you have two measurements that are
dependent on each individual.
A whale-watching company noticed that many customers wanted
to know whether it was better to book an excursion in the
morning or the afternoon. To test this question, the
company collected the following data on 15 randomly
selected days over the past month. (Note: days were not
consecutive.)

Day         1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Morning     8  9  7  9 10 13 10  8  2  5  7  7  6  8  7
Afternoon   8 10  9  8  9 11  8 10  4  7  8  9  6  6  9

You may subtract either way – just be careful when
writing Ha
Since you have two values for
each day, they are dependent on
the day – making this data
matched pairs
First, you must find the differences for each day.

Day          1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Morning      8  9  7  9 10 13 10  8  2  5  7  7  6  8  7
Afternoon    8 10  9  8  9 11  8 10  4  7  8  9  6  6  9
Differences  0 -1 -2  1  1  2  2 -2 -2 -2 -1 -2  0  2 -2

I subtracted: Morning – afternoon
You could subtract the other way!

Assumptions:
You need to state assumptions using the differences!
• Have an SRS of days for whale-watching
• σ unknown
• Since the normal probability plot is approximately linear, the
distribution of differences is approximately normal.
Notice the granularity in this plot; it still displays a nice
linear relationship!
Is there sufficient evidence that more whales are sighted in the
afternoon?
H0: μD = 0
Ha: μD < 0
Where μD is the true mean difference
in whale sightings from morning
minus afternoon
Notice we used μD for differences & it equals 0 since the null
should be that there is NO difference.
Be careful writing your Ha! Think about how you subtracted: M-A.
If afternoon is more, should the differences be + or -?
Don’t look at numbers!!!!
If you subtract afternoon – morning; then Ha: μD > 0
Finishing the hypothesis test:
t = (x̄ - μ) / (s / √n) = (-0.4 - 0) / (1.639 / √15) = -0.945
p = .1803
df = 14
α = .05
In your calculator, perform a t-test using the differences.
Notice that if you subtracted A-M, then your test statistic
t = +.945, but the p-value would be the same
Since p-value > α, I fail to reject H0. There is insufficient
evidence to suggest that more whales are sighted in the
afternoon than in the morning.
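The whole matched pairs test can be run from the raw counts. This is a sketch under stated assumptions: the differences are taken as morning minus afternoon, as on the slides, and the hypothetical helpers t_pdf and t_upper_tail approximate the calculator's t-test p-value with Simpson's rule.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|] (steps is even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area

# Whale sightings on the 15 randomly selected days.
morning   = [8, 9, 7, 9, 10, 13, 10, 8, 2, 5, 7, 7, 6, 8, 7]
afternoon = [8, 10, 9, 8, 9, 11, 8, 10, 4, 7, 8, 9, 6, 6, 9]

# One difference per day (morning - afternoon); the test is run on these.
diffs = [m - a for m, a in zip(morning, afternoon)]
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)

t_stat = (d_bar - 0) / (s_d / math.sqrt(n))
p_value = 1 - t_upper_tail(t_stat, n - 1)   # left tail, because Ha: mu_D < 0

alpha = 0.05
print(f"mean diff = {d_bar:.1f}, s = {s_d:.3f}, t = {t_stat:.3f}, p = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

Subtracting afternoon minus morning instead flips the sign of every difference and of t, and the test becomes right-tailed, so the p-value is unchanged.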