Part 4

STA 2023
E.Philias
Chapter 10: Hypothesis Testing (using a single sample)
Hypothesis Testing is the statistical procedure designed to test a claim about a population
parameter based on the statistical evidence contained in a representative sample of the
population.
Research hypothesis is a claim about a population parameter that can be tested using
sample data.
Statistical hypotheses are the null and alternative hypotheses.
Null hypothesis ($H_0$) is a claim (or statement) about a population parameter that is
assumed to be true until it is declared false.
It describes the opposite of the alternative hypothesis. The null hypothesis must contain
the equality sign.
Alternative hypothesis ($H_a$) describes the research hypothesis of the problem. It contains
a strict inequality (<, >, or ≠).
Test Statistic is a formula that summarizes the statistical evidence collected against the null
hypothesis (or in favor of the alternative/research hypothesis).
Rejection region is the set of values of the test statistic indicating convincing evidence
against $H_0$.
Four Outcomes from Hypothesis Testing
| Decision | $H_0$ is true | $H_0$ is false |
|---|---|---|
| Do not reject $H_0$ | Correct decision | Type II error |
| Reject $H_0$ | Type I error | Correct decision |
Type I error consists of rejecting the null hypothesis $H_0$ when $H_0$ is actually true.
Type II error consists of failing to reject $H_0$ when $H_0$ is actually false.
Alpha ($\alpha$) designates the probability of making a Type I error.
Beta ($\beta$) designates the probability of making a Type II error.
p-value is a probability that measures the strength of our case against $H_0$ (that is, in favor
of $H_a$). It is the probability, computed assuming $H_0$ is true, of obtaining a test statistic at
least as extreme as the one observed; it is also called the observed significance level.
Tails of a Test
A two-tailed test has rejection regions in both tails, a left-tailed test has the rejection region
in the left tail, and a right-tailed test has the rejection region in the right tail of the
distribution curve.
| | Two-Tailed Test | Left-Tailed Test | Right-Tailed Test |
|---|---|---|---|
| Sign in the null hypothesis $H_0$ | = | = or ≥ | = or ≤ |
| Sign in the alternative hypothesis $H_1$ | ≠ | < | > |
| Rejection region | In both tails | In the left tail | In the right tail |
Steps to perform a test of Hypothesis
A statistical test of hypothesis procedure has the following five steps.
1. State the null and alternative hypotheses.
2. Create the decision rule by determining the rejection and nonrejection regions.
3. Calculate the value of the test statistic.
4. Make a decision.
5. State your conclusion.
Testing Hypotheses Regarding the Population Mean (large sample)
Step 1: Determine the null and alternative hypotheses. The hypotheses can be structured
in one of three ways:
| Two-Tailed | Left-Tailed | Right-Tailed |
|---|---|---|
| $H_0: \mu = \mu_0$ | $H_0: \mu = \mu_0$ | $H_0: \mu = \mu_0$ |
| $H_1: \mu \neq \mu_0$ | $H_1: \mu < \mu_0$ | $H_1: \mu > \mu_0$ |
Note: $\mu_0$ is the assumed or status quo value of the population mean.
Step 2 : Create the Decision rule by determining the rejection region. The level of
significance is used to determine the critical value.
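Step 3: Calculate the value of the test statistic
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} \quad \text{if } \sigma \text{ is known}$$
$$z = \frac{\bar{x} - \mu}{s_{\bar{x}}} \quad \text{if } \sigma \text{ is not known}$$
where $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$ and $s_{\bar{x}} = \dfrac{s}{\sqrt{n}}$.
The value of z calculated for $\bar{x}$ using this formula is also called the observed value of z.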
Step 4: Decision: Compare the critical value with the test statistic:
| Two-Tailed | Left-Tailed | Right-Tailed |
|---|---|---|
| If $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$, reject the null hypothesis | If $z < -z_{\alpha}$, reject the null hypothesis | If $z > z_{\alpha}$, reject the null hypothesis |
Step 5: Conclusion
1) Reject $H_0$ in favor of $H_a$: The sample data provides sufficient evidence to support the
research hypothesis.
2) Fail to reject 𝐻0 : The sample data provides insufficient evidence to support the
research hypothesis.
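
The five steps above can be sketched in Python for the large-sample test of a mean. This is a minimal illustration, not part of the original notes: the function name `z_test_mean`, the `tail` labels, and the numbers in the example are all made up, and the critical value comes from the standard normal distribution in Python's `statistics` module.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(xbar, mu0, sd, n, alpha=0.05, tail="two"):
    """Large-sample z test for a mean using the rejection-region approach.

    sd is sigma if it is known, otherwise the sample standard deviation s.
    tail is "two", "left", or "right" (labels chosen for this sketch).
    """
    std_error = sd / sqrt(n)          # sigma_xbar (or s_xbar)
    z = (xbar - mu0) / std_error      # Step 3: observed value of z
    std_normal = NormalDist()         # standard normal, mean 0 and sd 1
    if tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)   # z_{alpha/2}
        reject = z < -crit or z > crit
    elif tail == "left":
        crit = std_normal.inv_cdf(1 - alpha)       # z_alpha
        reject = z < -crit
    elif tail == "right":
        crit = std_normal.inv_cdf(1 - alpha)       # z_alpha
        reject = z > crit
    else:
        raise ValueError("tail must be 'two', 'left', or 'right'")
    return z, crit, reject                         # Step 4: the decision


# Made-up data: n = 64, sample mean 52, s = 8; H0: mu = 50, H1: mu > 50.
z, crit, reject = z_test_mean(52, 50, 8, 64, alpha=0.05, tail="right")
print(f"z = {z:.2f}, critical value = {crit:.3f}, reject H0: {reject}")
```

With these illustrative numbers the observed z exceeds the critical value, so the conclusion in Step 5 would be to reject the null hypothesis.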
Rejection Regions for Common Values of $\alpha$
| Alternative Hypothesis | $\alpha = .10$ | $\alpha = .05$ | $\alpha = .01$ |
|---|---|---|---|
| Lower-Tailed | z < -1.28 | z < -1.645 | z < -2.33 |
| Upper-Tailed | z > 1.28 | z > 1.645 | z > 2.33 |
| Two-Tailed | z < -1.645 or z > 1.645 | z < -1.96 or z > 1.96 | z < -2.575 or z > 2.575 |
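
The critical values in this table can be checked against the standard normal distribution. The sketch below uses `NormalDist.inv_cdf` from Python's standard library; the helper name `critical_values` is hypothetical. Software returns more decimal places than the table, so small rounding differences (for example 2.326 versus 2.33) are expected.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution


def critical_values(alpha):
    """Return (z_alpha for a one-tailed test, z_{alpha/2} for a two-tailed test)."""
    one_tailed = std_normal.inv_cdf(1 - alpha)
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)
    return one_tailed, two_tailed


for alpha in (0.10, 0.05, 0.01):
    one, two = critical_values(alpha)
    print(f"alpha = {alpha:.2f}: one-tailed {one:.3f}, two-tailed {two:.3f}")
```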
Hypothesis Tests using the p-Value approach
The p-value is the smallest significance level at which the null hypothesis is rejected.
Step 1: Determine $H_0$ and $H_a$.
Step 2: Decide on a level of significance $\alpha$, depending on the seriousness of making a Type I
error.
Step 3: Compute the test statistic
$$z_0 = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$$
Step 4: Determine the p-value.
Two-tailed test: p-value = $2P(z > |z_0|)$
Right-tailed test: p-value = $P(z > z_0)$
Left-tailed test: p-value = $P(z < z_0)$
Reject $H_0$ if the p-value is less than or equal to $\alpha$; otherwise, fail to reject $H_0$.
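
As a rough illustration of Step 4, the three p-value formulas can be computed from the standard normal CDF. The function name `z_p_value` and the `tail` labels are assumptions made for this sketch; the two-tailed case uses the absolute value of the observed statistic so that it works for negative values as well.

```python
from statistics import NormalDist


def z_p_value(z0, tail="two"):
    """p-value for an observed z statistic z0 under the standard normal."""
    std_normal = NormalDist()
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z0)))   # 2 P(z > |z0|)
    if tail == "right":
        return 1 - std_normal.cdf(z0)              # P(z > z0)
    if tail == "left":
        return std_normal.cdf(z0)                  # P(z < z0)
    raise ValueError("tail must be 'two', 'left', or 'right'")


alpha = 0.05                       # chosen significance level
p = z_p_value(2.0, tail="right")   # made-up observed statistic
print(f"p-value = {p:.4f}, reject H0: {p <= alpha}")
```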
Tests Concerning Means (Small Samples)
Conditions under which the t distribution is used to make tests of hypothesis about $\mu$
The t distribution is used to conduct a test of hypothesis about $\mu$ if
1. The sample size is small (n < 30).
2. The population from which the sample is drawn is (approximately) normally
distributed.
3. The population standard deviation $\sigma$ is unknown.
Test Statistic
The value of the test statistic t for the sample mean $\bar{x}$ is computed as
$$t = \frac{\bar{x} - \mu}{s_{\bar{x}}} \quad \text{where } s_{\bar{x}} = \frac{s}{\sqrt{n}}$$
The value of t calculated for $\bar{x}$ by using the above formula is also called the observed value
of t.
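
A small sketch of the t statistic, using only the standard library. The data are invented, and `t_statistic` is a hypothetical helper. Because the standard library has no t distribution, the critical value is taken from a t table (2.306 for a two-tailed test at $\alpha = .05$ with n - 1 = 8 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """Return the observed t and its degrees of freedom (n - 1)."""
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)          # sample standard deviation (divides by n - 1)
    s_xbar = s / sqrt(n)       # estimated standard error of the mean
    return (xbar - mu0) / s_xbar, n - 1


# Made-up sample of n = 9 measurements; H0: mu = 10.0, Ha: mu != 10.0.
data = [9.8, 10.2, 10.4, 9.9, 10.5, 10.1, 10.3, 10.0, 10.6]
t, df = t_statistic(data, mu0=10.0)

t_crit = 2.306                 # t table value for alpha/2 = .025, df = 8
reject = abs(t) > t_crit
print(f"t = {t:.3f}, df = {df}, reject H0: {reject}")
```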
Hypothesis Test about a Population Proportion: Large Samples
The value of the test statistic z for the sample proportion, $\hat{p}$, is computed as
$$z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} \quad \text{where } \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$
The value of p used in this formula is the one used in the null hypothesis. The value of z
calculated for $\hat{p}$ is also called the observed value of z.
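
The proportion test can be sketched the same way. Here `one_prop_z` is a hypothetical helper and the counts are made up; the standard error is computed from the value of p stated in the null hypothesis, as the text describes.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z(x, n, p0):
    """Observed z for a one-sample proportion test of H0: p = p0."""
    p_hat = x / n                          # sample proportion
    sigma_p_hat = sqrt(p0 * (1 - p0) / n)  # standard error under H0
    return (p_hat - p0) / sigma_p_hat


# Made-up data: 60 successes in 100 trials; H0: p = 0.5, Ha: p != 0.5.
z = one_prop_z(x=60, n=100, p0=0.5)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed p-value
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```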
Technology Step by Step
Hypothesis Tests Regarding $\mu$ (Large sample)
Step 1: If necessary, enter raw data in L1.
Step 2: Press STAT, highlight TESTS, and select 1: Z-Test.
Step 3: If the data are raw, highlight DATA; make sure that List1 is set to L1 and Freq is set
to 1. If summary statistics are known, highlight STATS and enter the summary statistics.
Following $\sigma$, enter the population standard deviation. For the value of $\mu_0$, enter the value of
the mean stated in the null hypothesis.
Step 4: Select the direction of the alternative hypothesis.
Step 5: Highlight Calculate and press ENTER. The TI-83/84 Plus gives the P-value.
Hypothesis Tests Regarding $\mu$ (Small sample)
Step 1: If necessary, enter raw data in L1.
Step 2: Press STAT, highlight TESTS, and select 2: T-Test.
Step 3: If the data are raw, highlight DATA; make sure that List1 is set to L1 and Freq is set
to 1. If summary statistics are known, highlight STATS and enter the summary statistics
(the sample mean, the sample standard deviation Sx, and n); the T-Test does not ask for $\sigma$.
For the value of $\mu_0$, enter the value of the mean stated in the null hypothesis.
Step 4: Select the direction of the alternative hypothesis.
Step 5: Highlight Calculate and press ENTER. The TI-83/84 Plus gives the P-value.
Hypothesis Tests regarding a Population Proportion
Step 1: Press STAT, highlight TESTS, and select 5: 1-PropZTest.
Step 2: For the value of $p_0$, enter the "status quo" value of the population proportion.
Step 3: Enter the number of successes, x, and the sample size n.
Step 4: Select the direction of the alternative hypothesis.
Step 5: Highlight Calculate or Draw, and press ENTER. The TI-83/84 gives the P-value.
Comparing Two Population Means: Independent Sampling
To perform inference on the difference of two population means, we must first determine
whether the data come from independent or dependent samples.
A sampling method is independent when the individuals selected for one sample do not
dictate which individuals are to be in a second sample.
A sampling method is dependent when the individuals selected to be in one sample are
used to determine the individuals to be in the second sample.
DIFFERENCES BETWEEN MEANS (LARGE SAMPLE)
Large-Sample Confidence Interval for ($\mu_1 - \mu_2$)
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
where $\sigma_1^2$ and $\sigma_2^2$ are the variances of the two populations being sampled, and $n_1$ and $n_2$
are the respective sample sizes. We also refer to $\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$ as the standard error of the
statistic $(\bar{x}_1 - \bar{x}_2)$.
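
The interval formula can be evaluated directly. The following is a minimal sketch with invented summary statistics; `two_mean_z_interval` is a hypothetical name, and when the population variances are unknown the sample variances would be passed in their place (a large-sample approximation).

```python
from math import sqrt
from statistics import NormalDist


def two_mean_z_interval(xbar1, xbar2, var1, var2, n1, n2, confidence=0.95):
    """Large-sample confidence interval for mu1 - mu2 (independent samples)."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    std_error = sqrt(var1 / n1 + var2 / n2)        # standard error of xbar1 - xbar2
    diff = xbar1 - xbar2
    return diff - z_crit * std_error, diff + z_crit * std_error


# Made-up summary statistics for two independent samples of size 100.
low, high = two_mean_z_interval(75, 72, var1=64, var2=36, n1=100, n2=100)
print(f"95% CI for mu1 - mu2: ({low:.2f}, {high:.2f})")
```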
Large-Sample Test of Hypothesis for ($\mu_1 - \mu_2$): Independent Sampling
Step 1: Determine the null and alternative hypotheses. The hypotheses are structured in
one of three ways:
| Two-Tailed | Left-Tailed | Right-Tailed |
|---|---|---|
| $H_0: \mu_1 = \mu_2$ | $H_0: \mu_1 = \mu_2$ | $H_0: \mu_1 = \mu_2$ |
| $H_a: \mu_1 \neq \mu_2$ | $H_a: \mu_1 < \mu_2$ | $H_a: \mu_1 > \mu_2$ |
Step 2: Select a level of significance $\alpha$, depending on the seriousness of making a Type I
error.
Step 3: Compute the test statistic
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \approx \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, \quad \text{with } (\mu_1 - \mu_2) = 0$$
Step 4: Compare the critical value with the test statistic.
Step 5: Conclusion
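
One way to carry out Steps 3 and 4 in code is shown below, as a sketch under assumptions: the summary statistics are invented, `two_mean_z` is a hypothetical helper, and the sample variances stand in for the unknown population variances.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_z(xbar1, xbar2, var1, var2, n1, n2, hypothesized_diff=0.0):
    """Observed z for H0: mu1 - mu2 = hypothesized_diff (large samples)."""
    std_error = sqrt(var1 / n1 + var2 / n2)
    return ((xbar1 - xbar2) - hypothesized_diff) / std_error


alpha = 0.05
z = two_mean_z(75, 72, var1=64, var2=36, n1=100, n2=100)   # made-up data
z_crit = NormalDist().inv_cdf(1 - alpha / 2)               # two-tailed critical value
reject = z < -z_crit or z > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```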
Small-Sample Confidence Interval for ($\mu_1 - \mu_2$) (Independent Samples)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
and $t_{\alpha/2}$ is based on $(n_1 + n_2 - 2)$ degrees of freedom.
Here $s_p$ is the pooled sample standard deviation.
Small-Sample Test of Hypothesis for ($\mu_1 - \mu_2$) (Independent Samples)
Test statistic:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \quad \text{with } (\mu_1 - \mu_2) = 0$$
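
The pooled variance and the resulting t statistic can be sketched as follows. The helper names and summary statistics are made up for illustration, and the critical value 2.086 is the t table entry for $\alpha/2 = .025$ with $n_1 + n_2 - 2 = 20$ degrees of freedom.

```python
from math import sqrt


def pooled_variance(var1, var2, n1, n2):
    """Pooled sample variance s_p^2 from two sample variances."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)


def pooled_t(xbar1, xbar2, var1, var2, n1, n2):
    """Observed t for H0: mu1 - mu2 = 0, with its degrees of freedom."""
    sp2 = pooled_variance(var1, var2, n1, n2)
    std_error = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (xbar1 - xbar2) / std_error, n1 + n2 - 2


# Made-up summary statistics: s1^2 = 4 (n1 = 10), s2^2 = 9 (n2 = 12).
sp2 = pooled_variance(4, 9, 10, 12)
t, df = pooled_t(20.5, 18.0, 4, 9, 10, 12)

t_crit = 2.086                 # t table value for alpha/2 = .025, df = 20
print(f"sp^2 = {sp2:.2f}, t = {t:.3f}, df = {df}, reject H0: {abs(t) > t_crit}")
```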
Comparing Two Population Means: Dependent Samples
Differences Between Means (Paired Data)
Confidence Interval for Matched-Pairs Data (Dependent Samples)
A $(1 - \alpha) \cdot 100\%$ confidence interval for $\mu_d$ is given by
Large Sample
$$\bar{d} \pm z_{\alpha/2} \cdot \frac{\sigma_d}{\sqrt{n}} \approx \bar{d} \pm z_{\alpha/2} \cdot \frac{s_d}{\sqrt{n}}$$
Small Sample
$$\bar{d} \pm t_{\alpha/2} \cdot \frac{s_d}{\sqrt{n}}$$
The critical value $t_{\alpha/2}$ is determined using n - 1 degrees of freedom. The values of $\bar{d}$ and $s_d$ are
the mean and standard deviation of the differenced data.
Note: The interval is exact when the population is normally distributed and approximately
correct for nonnormal populations, provided that n is large.
Paired Difference Test of Hypothesis for $\mu_d = (\mu_1 - \mu_2) = 0$ (Dependent Samples)
Large Sample:
Test statistic:
$$z = \frac{\bar{d}}{\sigma_d / \sqrt{n}} \approx \frac{\bar{d}}{s_d / \sqrt{n}}$$
Small Sample:
Test statistic:
$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$
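
For paired data the test works on the differences only. The sketch below uses invented before/after scores and a hypothetical helper `paired_t`; the critical value 2.776 is the t table entry for $\alpha/2 = .025$ with n - 1 = 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Observed t for H0: mu_d = 0 from matched pairs, with df = n - 1."""
    diffs = [a - b for a, b in zip(first, second)]   # differenced data
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)                               # sample SD of the differences
    return d_bar / (s_d / sqrt(n)), n - 1


# Made-up matched pairs (same five subjects measured twice).
before = [85, 90, 78, 92, 88]
after = [80, 86, 75, 90, 85]
t, df = paired_t(before, after)

t_crit = 2.776                 # t table value for alpha/2 = .025, df = 4
print(f"t = {t:.3f}, df = {df}, reject H0: {abs(t) > t_crit}")
```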
Sampling Distribution of the Difference between Two Proportions (Large Sample)
Suppose a simple random sample of size $n_1$ is taken from a population where $x_1$ of the
individuals have a specified characteristic, and a simple random sample of size $n_2$ is
independently taken from a different population where $x_2$ of the individuals have a
specified characteristic. The sampling distribution of $\hat{p}_1 - \hat{p}_2$, where $\hat{p}_1 = \dfrac{x_1}{n_1}$ and $\hat{p}_2 = \dfrac{x_2}{n_2}$, is
approximately normal, with mean
$$\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$$
and standard deviation
$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
In practice $p_1$ and $p_2$ are unknown, so the standard deviation is estimated by substituting $\hat{p}_1$ and $\hat{p}_2$.
Large-Sample $(1 - \alpha) \cdot 100\%$ Confidence Interval for ($p_1 - p_2$)
$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$
Large-Sample Test of Hypothesis about ($p_1 - p_2$)
The z-statistic
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
is used to test the null hypothesis $H_0: (p_1 - p_2) = 0$ (or, equivalently, $H_0: p_1 = p_2$).
The best point estimate of the common value $p_1 = p_2 = p$ is called the pooled estimate of p, denoted by $\hat{p}$:
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$