Hypothesis Testing

A Talk about the Practicality of Hypothesis Tests before the Summary
of the Formulas/Tests
Hypothesis tests can seem boring if we treat them only as steps, formulas and
calculations without knowing their purpose and practical use. The following
examples show that hypothesis testing is in fact practical and interesting.
Study is more fun when we know why we are learning something.
Here is a detailed example:
Hypothesis Test for Linear Correlation (10-2)
A hypothesis test for linear correlation is used to determine whether there is a
linear correlation (relationship) between two variables. Here is an instance: if you
own a coffee store and you want to know whether there is a significant linear
relationship between temperature and the sale of coffee, i.e. whether coffee
sales tend to rise or fall as the temperature changes, you can find the answer
with a hypothesis test for linear correlation, as follows:
Procedure:
• Step 1. Collect data
Let x represent temperature and y represent the sale of coffee. Collect a
sample of paired (x, y) data, which has to be a random sample (for example:
the temperatures on 50 random days and the corresponding sales of coffee). Any
outliers must be removed if they are known to be errors.
• Step 2. Hypothesis Test
H0: There is no linear correlation between temperature and the sale of
coffee.
H1: There is a linear correlation between temperature and the sale of coffee.
Calculate the value of the test statistic, the correlation coefficient r, using the formula
$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$
where n is the number of pairs of data.
Find the critical value in Table A-6. If the absolute value of the computed
test statistic r is greater than the critical value, we reject H0 and
conclude that there is a linear correlation/relationship between
temperature and the sale of coffee. Otherwise, there is not sufficient
evidence to support the conclusion of a linear correlation.
For example, suppose the computed value is r = -0.617. With n = 50 and
α = 0.05, the critical value found in Table A-6 is 0.279. Since the absolute
value of the test statistic, 0.617, is greater than the critical value 0.279, we
reject H0 and conclude that there is a linear correlation/relationship
between temperature and the sale of coffee. Furthermore, the negative
sign of the test statistic (-0.617) tells us that the relationship between
temperature and the sale of coffee is negative, i.e. when the temperature
increases, the sale of coffee tends to decrease, and when the temperature goes
down, the sale of coffee tends to go up.
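The calculation and the decision rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `pearson_r` and `reject_h0` are hypothetical, the five data pairs are made up just to exercise the formula, and the critical value still has to be looked up in Table A-6 for the actual n and α.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient r from paired data, using the sums formula."""
    if len(xs) != len(ys):
        raise ValueError("x and y must contain the same number of values")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def reject_h0(r, critical_value):
    """Reject H0 (no linear correlation) when |r| exceeds the critical value."""
    return abs(r) > critical_value

# Made-up data for illustration only (temperature, cups sold); not from the text.
temps = [5, 10, 15, 20, 25]
sales = [90, 80, 70, 60, 50]
r_demo = pearson_r(temps, sales)

# The worked example from the text: r = -0.617, critical value 0.279
# (n = 50, alpha = 0.05, read from Table A-6).
decision = reject_h0(-0.617, 0.279)
print(r_demo, decision)
```

Taking the absolute value inside `reject_h0` is what makes the test two-sided: a strong negative r is treated the same way as a strong positive r.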
If you can think of questions of your own that could be answered by a hypothesis
test, studying Statistics becomes more fun, and you will understand
hypothesis tests better.
Examples or explanation of the types of claims we are able to test:
Chapter 8: Hypothesis Testing (one group of samples)
1. Testing a claim about a proportion:
More than 50% of workers get their jobs through networking.
2. Testing a claim about a population mean (with σ known):
The mean body temperature of the population in North Bay is less than
98.6℉.
3. Testing a claim about a mean (with σ not known but s known):
Chocolate M&Ms have a mean that is actually greater than 0.8535 g, so
consumers are being given more than the amount indicated on the label.
4. Testing a claim about a standard deviation or variance:
Cans of cola from the new machine have amounts with a standard
deviation that is less than 0.051 oz.
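As one illustration of these claims, the first one (a proportion greater than 50%) can be tested with the usual z statistic for a proportion, z = (p̂ − p0) / √(p0·q0/n). The sketch below assumes a hypothetical sample of 100 workers, 61 of whom got their jobs through networking. Those counts are invented for illustration, and the function names are not from the text. It relies on the normal approximation, which is normally used only when n·p0 and n·q0 are both at least 5.

```python
import math

def proportion_z(successes, n, p0):
    """z test statistic for a claim about a single population proportion."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def right_tail_p_value(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical sample: 61 of 100 workers found their jobs through networking.
# H0: p = 0.5, H1: p > 0.5 (a right-tailed test).
z = proportion_z(61, 100, 0.5)
p_value = right_tail_p_value(z)
print(round(z, 2), round(p_value, 4))
```

If the p-value is less than or equal to the chosen significance level α, H0 is rejected and the sample supports the claim. Otherwise there is not sufficient evidence to support it.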
Chapter 9: Inferences from Two Samples (two groups of samples)
1. Inferences about two proportions:
The success rate with surgery is better than the success rate with splinting.
2. Inferences about two means:
Students get better marks if they are allowed to have 10 minutes to write
about their nervousness and pressure before tests.
3. Inference about matched pairs:
There is a difference between the actual low temperatures and the low
temperatures that were forecast five days earlier.
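For matched pairs such as the temperature example, the test works on the differences d = actual − forecast. When testing H0: the mean difference is 0, the usual test statistic is t = d̄ / (s_d / √n) with n − 1 degrees of freedom. Here is a minimal sketch of that calculation. The temperatures are invented for illustration and `matched_pairs_t` is a hypothetical helper name.

```python
import math
import statistics

def matched_pairs_t(first, second):
    """t statistic and degrees of freedom for H0: mean of the differences = 0."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)
    return d_bar / (s_d / math.sqrt(n)), n - 1

# Invented low temperatures for five days (actual vs. forecast five days earlier).
actual = [12, 14, 9, 11, 15]
forecast = [11, 12, 6, 7, 10]
t_stat, df = matched_pairs_t(actual, forecast)
print(round(t_stat, 3), df)
```

The t statistic is then compared with the critical t value for n − 1 degrees of freedom, in the same way that r was compared with the critical value in the correlation example.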
Chapter 10: Correlation and Regression
1. Linear Correlation Coefficient
Is there a strong and positive correlation between marks and the time
spent on study?
2. Coefficient of Determination: r²
r² is the proportion of the variation in y that is explained by the linear
relationship between x and y.
3. Hypothesis Test for Correlation
An example is shown at the beginning.
4. Regression Equation
If a regression equation is known, we can use it to predict the value of y for
a given x. For example, suppose there is a very strong correlation between
temperature (x, in ℃) and the sale of beer (y, in bottles) for a beer store, and
the linear regression equation is ŷ = 20x + 15. If tomorrow's temperature is
30℃, then the owner can predict that 615 bottles of beer will be sold
tomorrow, because substituting x = 30 into the equation gives
ŷ = 20x + 15 = (20)(30) + 15 = 615.
Of course, the exact number of bottles of beer sold tomorrow might not
equal 615, since 615 is just a predicted number. However, when the
correlation is very strong, the predicted number should be reasonably close to
the actual number, and this will help the owner to plan for the business.
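Two of the items in this chapter reduce to one-line calculations, sketched below. The function names are hypothetical. The slope 20 and intercept 15 come from the beer example, and r = -0.617 is reused from the coffee example at the beginning purely to show the arithmetic of r².

```python
def predict_bottles(temperature_c, slope=20, intercept=15):
    """Predicted beer sales (bottles) from the regression equation y-hat = 20x + 15."""
    return slope * temperature_c + intercept

def coefficient_of_determination(r):
    """Proportion of the variation in y explained by the linear relationship."""
    return r ** 2

predicted = predict_bottles(30)                   # (20)(30) + 15
r_squared = coefficient_of_determination(-0.617)  # r from the coffee example
print(predicted, round(r_squared, 3))
```

An r² of about 0.38 would mean that roughly 38% of the variation in coffee sales is explained by the linear relationship with temperature, and the rest by other factors or random variation.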
Chapter 12: Analysis of Variance (three and more groups of samples)
1. One-Way ANOVA (the data are categorized into groups according to a
single factor or treatment)
Test whether three or more means are equal.
Example: In the United States, English, mathematics and science majors
have mean abstract reasoning scores that are not all the same.
2. Two-Way ANOVA (the data are partitioned into categories according to
two factors)
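The one-way ANOVA test statistic is F = (variance between the groups) / (variance within the groups). A large F suggests that the means are not all equal. The sketch below computes F and its two degrees of freedom for any number of groups. The three small groups of scores are invented for illustration and `one_way_anova_f` is a hypothetical helper name.

```python
def one_way_anova_f(groups):
    """F statistic with (numerator, denominator) degrees of freedom for one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the individual values around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k

# Invented scores for three groups of students (illustration only).
f_stat, df_num, df_den = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat, df_num, df_den)
```

The computed F is compared with the critical F value for the two degrees of freedom. If F is larger, we reject the null hypothesis that all the means are equal.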
Chapter 13: Nonparametric Statistics
1. Sign Test
The Sign test is a nonparametric test that uses plus and minus signs to
test different claims involving matched pairs of sample data, nominal data
and the median of a single population.
2. Wilcoxon Signed-Ranks Test (matched pairs)
The Wilcoxon Signed-Ranks test is a nonparametric test that uses ranks of
sample data consisting of matched pairs. It is used to test the null
hypothesis that the population of differences has a median of zero.
3. Wilcoxon Rank-Sum Test (two independent samples)
The Wilcoxon Rank-Sum test is a nonparametric test that uses ranks of
sample data from two independent populations. It is used to test the null
hypothesis that the two independent samples come from populations with
equal medians.
4. Kruskal-Wallis Test (three or more independent populations)
The Kruskal-Wallis test is a nonparametric test that uses ranks of sample
data from three or more independent populations. It is used to test the null
hypothesis that the independent samples come from populations with
equal medians.
5. Rank Correlation Test
The rank correlation test (or Spearman’s rank correlation test) is a
nonparametric test that uses ranks of sample data consisting of matched
pairs. It is used to test for an association between two variables, so the
null hypothesis is that there is no correlation between the two variables.
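Two of these tests are simple enough to sketch directly. The first function counts plus and minus signs and returns an exact two-sided binomial p-value. That is one common way to carry out the sign test and is used here instead of a critical-value table. The second computes the rank correlation coefficient with the shortcut formula r_s = 1 − 6Σd² / (n(n² − 1)), which is valid only when there are no ties within either variable. The function names and all data are assumptions made for illustration.

```python
import math

def sign_test_p_value(differences):
    """Two-sided exact binomial p-value for the sign test (zeros are discarded)."""
    plus = sum(1 for d in differences if d > 0)
    minus = sum(1 for d in differences if d < 0)
    n = plus + minus
    x = min(plus, minus)  # the less frequent sign
    # Under H0 each sign is equally likely, so the count follows Binomial(n, 0.5).
    tail = sum(math.comb(n, i) for i in range(x + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def spearman_rs(xs, ys):
    """Rank correlation coefficient via the shortcut formula (assumes no ties)."""
    def ranks(values):
        ordered = sorted(values)
        return [ordered.index(v) + 1 for v in values]
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Invented matched-pair differences: 8 positive, 2 negative, 1 zero.
p_sign = sign_test_p_value([1, 2, 3, 1, 2, 4, 1, 3, -1, -2, 0])

# Invented paired data with no ties.
rs = spearman_rs([10, 20, 30, 40, 50], [1, 3, 2, 5, 4])
print(p_sign, rs)
```

As with the linear correlation test, r_s is compared with a critical value for the given n and α, and the sign test's p-value is compared with α.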