Outline - Benedictine University

PART ONE -- Essentials -- Inference (Estimation and Hypothesis Testing) From Small Samples
Note: This part of the course deals with estimation and hypothesis testing, which were introduced in
Stats 1, Part 5. Review the outline for that section if necessary. The material in Stats 1, Part 5 is
to be considered part of Stats 2, Part 1.
Interval estimation and hypothesis testing
Two Types of Problems
Means--one-group; two-group
t-distribution
Symmetrical with center concentration, but not as concentrated as the normal distribution
Lower in the center and higher in the tails than the normal distribution
Degrees of freedom (df)--express the sample size
One-group problems, (n-1); two-group problems, (n1+n2-2) or [(n1-1) + (n2-1)]
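The degrees-of-freedom rules above can be sketched as two small helper functions. The function names are hypothetical and chosen only for illustration.

```python
def df_one_group(n: int) -> int:
    """Degrees of freedom for a one-group problem: n - 1."""
    return n - 1


def df_two_group(n1: int, n2: int) -> int:
    """Degrees of freedom for a two-group unpaired problem: (n1 - 1) + (n2 - 1)."""
    return (n1 - 1) + (n2 - 1)


print(df_one_group(12))     # one group of 12 items
print(df_two_group(10, 8))  # two groups of 10 and 8 items
```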
On the 4-Column formula sheet, columns 1 and 2 may be used, with the substitution of
"t" for "z". Logic is identical to chapters 6, 7 and 8 large-sample sections.
Unpaired design for two-group problems
Sample items for each group selected randomly
A difference between the means of groups might be due to experimental "treatment" or
might simply be due to the fact that the members of the two groups were different.
Treatment--intentional difference between groups being tested, e.g., in a
pharmaceutical test, drug group vs. non-drug group
Confounding variable--uncontrolled factor that might be causing an observed
difference between groups
Paired-difference design for two-group problems
Purpose--to eliminate "confounding" variables and isolate the variable of interest
Ideal--keep everything constant except the variable under investigation.
Same subjects are tested twice--before and after the experimental treatment.
Difference therefore cannot be due to the members of the groups being different.
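A minimal sketch of why pairing helps, using made-up before/after measurements on the same five subjects (the data are hypothetical). The spread between subjects is large, but it cancels when each subject is compared only with itself.

```python
import statistics

# Hypothetical before/after measurements on the same five subjects.
before = [200.0, 180.0, 220.0, 160.0, 240.0]
after = [195.0, 176.0, 214.0, 157.0, 233.0]

# Pairing: work with one difference per subject instead of two separate groups.
differences = [b - a for b, a in zip(before, after)]

sd_before = statistics.stdev(before)     # subject-to-subject spread
sd_diff = statistics.stdev(differences)  # spread left after pairing

print(f"spread between subjects (before): {sd_before:.2f}")
print(f"spread of within-subject differences: {sd_diff:.2f}")
```

The subject-to-subject variation, a possible confounding variable in an unpaired design, does not appear in the differences.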
Four assumptions
Samples
Random
Independent (in two-group unpaired experiments)
Populations
Normally distributed
Equal variances (in two-group unpaired experiments)
Moderate departures from the assumptions will not seriously affect validity.
A test with this characteristic is called "robust."
If the assumptions are seriously violated, two approaches may be taken
Increase sample to a "large" size (then, population assumptions need not be met).
Use nonparametric tests (which have no population assumptions).
Inferences regarding variances
One-group inference regarding the variance--uses "chi-square" (χ²) distribution
Estimation and hypothesis testing are possible regarding the variance of one group.
In hypothesis testing, the Ho is that σ² is equal to some specified value.
Two-group inference regarding the variances--uses "F" distribution.
Variances are compared by division (ratio), rather than by subtraction (difference)
Estimation and hypothesis testing are possible regarding the variances of two groups.
In hypothesis testing, the Ho is that σ₁² is equal to σ₂².
As a ratio, this would mean σ₁² / σ₂² = 1.
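As a rough illustration, the two test statistics can be computed as follows. The outline does not state the formulas, so this sketch assumes the standard textbook forms: chi-square = (n - 1)s² / σ² under Ho for one group, and F = s₁² / s₂² for two groups. The data and the hypothesized variance are hypothetical.

```python
import statistics

# Hypothetical samples and hypothesized variance, for illustration only.
sample_1 = [4.0, 6.0, 8.0, 10.0, 12.0]
sample_2 = [7.0, 8.0, 9.0, 8.0, 8.0]
sigma0_sq = 5.0  # value of the population variance stated in Ho

s1_sq = statistics.variance(sample_1)  # sample variance, divisor n - 1
s2_sq = statistics.variance(sample_2)

# One group: chi-square statistic with n - 1 degrees of freedom.
chi_square = (len(sample_1) - 1) * s1_sq / sigma0_sq

# Two groups: variances are compared by division, not subtraction.
f_ratio = s1_sq / s2_sq

print(f"chi-square = {chi_square:.2f}, F = {f_ratio:.2f}")
```

Each statistic would then be compared with a critical value from the chi-square or F table, which is not reproduced here.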
Terminology--explain each of the following:
inferential statistics, sample mean, population mean, estimator, estimate, unbiased estimator,
point estimate, interval estimate, confidence interval, degree of confidence, confidence level,
error factor, required sample size, upper confidence limit, lower confidence limit, hypothesis
test, null hypothesis, alternate hypothesis, type I error, α, type II error, β, calculated-t (test
statistic), critical region, table-t (critical value of t), rejection of the null hypothesis, non-rejection
of the null hypothesis, p-value, hypothesis-test conclusion, independent samples, standard
error of the difference, paired difference design, confounding variable, chi-square distribution
(purpose), F distribution (purpose)
Skills and Procedures
Given appropriate data, conduct estimation and hypothesis testing on the population mean of
one group, involving these steps:
• make a point estimate of a population mean
• compute the sampling standard deviation (standard error) of the sample means
• compute and interpret the error factor for the interval estimate for the 90%, 95% and 99%
  confidence levels, using the t distribution
• state the null and alternate hypotheses regarding the population mean
• determine the table-t (critical value of t) for alpha levels of 0.10, 0.05 and 0.01
• compute the calculated-t (test statistic)
• draw the appropriate hypothesis-test conclusion based on the given level of α, the table-t
  (critical value) and the calculated-t (test statistic)
• interpret the conclusion
• determine and interpret the p-value
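The one-group steps above can be sketched as follows. The sample and the hypothesized mean are hypothetical; the table-t values are the two-sided critical values for df = 8 copied from a standard t-table. The p-value step is left out because it needs the t distribution's cumulative probabilities, which the Python standard library does not provide.

```python
import math
import statistics

# Hypothetical sample of n = 9 items and hypothesized mean (Ho: mu = 15).
sample = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0, 26.0]
mu_h0 = 15.0

# Two-sided table-t values for df = n - 1 = 8, keyed by alpha.
TABLE_T_DF8 = {0.10: 1.860, 0.05: 2.306, 0.01: 3.355}

n = len(sample)
x_bar = statistics.mean(sample)                      # point estimate
std_error = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
calculated_t = (x_bar - mu_h0) / std_error           # test statistic

print(f"x-bar = {x_bar:.2f}, standard error = {std_error:.4f}, "
      f"calculated-t = {calculated_t:.4f}")

for alpha, table_t in TABLE_T_DF8.items():
    error_factor = table_t * std_error
    lower, upper = x_bar - error_factor, x_bar + error_factor
    decision = "rejected" if abs(calculated_t) > table_t else "not rejected"
    print(f"alpha = {alpha:.2f}: interval {lower:.2f} to {upper:.2f}, Ho {decision}")
```

The loop also shows the interval widening as the confidence level rises from 90% to 99%.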
Given appropriate data, conduct estimation and hypothesis testing on the population means of two
groups, involving these steps:
• make a point estimate of the difference between population means
• compute the sampling standard deviation (standard error) of the difference between sample
  means
• compute and interpret the error factor for the interval estimate for the 90%, 95% and 99%
  confidence levels
• state the null and alternate hypotheses regarding the difference between population means
• determine the table-t (critical value of t) for alpha levels of 0.10, 0.05 and 0.01
• compute the calculated-t (test statistic)
• draw the appropriate hypothesis-test conclusion based on the given level of α, the table-t
  and the calculated-t
• interpret the conclusion
• determine and interpret the p-value
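A minimal sketch of the two-group (unpaired) steps. It assumes the pooled-variance form of the standard error, which matches the equal-variances assumption and the (n1+n2-2) degrees of freedom stated earlier; the course formula sheet may write it differently. The data are hypothetical, and the table-t is the two-sided 0.05 value for df = 8 from a standard t-table.

```python
import math
import statistics

# Hypothetical independent samples from two groups.
group_1 = [10.0, 12.0, 14.0, 16.0, 18.0]
group_2 = [7.0, 9.0, 11.0, 13.0, 15.0]
TABLE_T = 2.306  # two-sided alpha = 0.05, df = 5 + 5 - 2 = 8

n1, n2 = len(group_1), len(group_2)
difference = statistics.mean(group_1) - statistics.mean(group_2)  # point estimate

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n1 - 1) * statistics.variance(group_1)
              + (n2 - 1) * statistics.variance(group_2)) / (n1 + n2 - 2)
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

calculated_t = difference / std_error  # Ho: the population means are equal
error_factor = TABLE_T * std_error
rejected = abs(calculated_t) > TABLE_T

print(f"difference = {difference:.2f}, standard error = {std_error:.2f}")
print(f"95% interval: {difference - error_factor:.2f} to {difference + error_factor:.2f}")
print(f"calculated-t = {calculated_t:.2f}, Ho {'rejected' if rejected else 'not rejected'}")
```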
Given appropriate data, conduct estimation and hypothesis testing on the population means of two
groups in a paired-difference design, involving these steps:
• make a point estimate of the difference between population means by computing the
  average of the differences
• compute the sampling standard deviation (standard error) of the difference between sample
  means
• compute and interpret the error factor for the interval estimate for the 90%, 95% and 99%
  confidence levels
• state the null and alternate hypotheses regarding the difference between population means
• determine the table-t (critical value of t) for alpha levels of 0.10, 0.05 and 0.01
• compute the calculated-t (test statistic)
• draw the appropriate hypothesis-test conclusion based on the given level of α, the table-t
  and the calculated-t
• interpret the conclusion
• determine and interpret the p-value
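The paired-difference steps reduce to a one-group problem on the differences, as in this sketch. The before/after data are hypothetical, and the table-t is the two-sided 0.05 value for df = 4 from a standard t-table.

```python
import math
import statistics

# Hypothetical before/after measurements on the same five subjects.
before = [200.0, 180.0, 220.0, 160.0, 240.0]
after = [195.0, 176.0, 214.0, 157.0, 233.0]
TABLE_T = 2.776  # two-sided alpha = 0.05, df = n - 1 = 4

differences = [b - a for b, a in zip(before, after)]
n = len(differences)

mean_diff = statistics.mean(differences)                  # point estimate
std_error = statistics.stdev(differences) / math.sqrt(n)  # standard error
calculated_t = mean_diff / std_error                      # Ho: mean difference = 0
error_factor = TABLE_T * std_error
rejected = abs(calculated_t) > TABLE_T

print(f"mean difference = {mean_diff:.2f}, standard error = {std_error:.4f}")
print(f"95% interval: {mean_diff - error_factor:.2f} to {mean_diff + error_factor:.2f}")
print(f"calculated-t = {calculated_t:.4f}, Ho {'rejected' if rejected else 'not rejected'}")
```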
Concepts- explain why a confidence interval becomes larger as the confidence level increases
 explain why a confidence interval becomes smaller as the sample size increases
 describe the nature of the trade-off between precision and cost
 identify the type of error that is made if the null hypothesis is "the defendant is innocent,"
and an innocent defendant is erroneously convicted
 identify the type of error that is made if the null hypothesis is "the defendant is innocent,"
and a guilty defendant is erroneously acquitted
 explain why a researcher seeking to reject a null hypothesis may tend to prefer a one-sided
alternative hypothesis
 explain how the paired-difference design eliminates a confounding variable
 explain what happens to the t distribution as the sample size becomes smaller
 explain what happens to the t distribution as the sample size becomes larger
 describe how the t distribution is similar to the normal distribution
 describe how the t distribution differs from the normal distribution
What to say and how to say it:
INSERT A NUMBER WHEREVER THERE ARE PARENTHESES ( ).
Column 1--mean, one group
Ho rejected:
The difference between the sample mean, (xbar), and the null hypothesis, (μHo), is
statistically significant at the (α) level. The population mean is probably not (μHo).
Ho not rejected:
The difference between the sample mean, (xbar), and the null hypothesis, (μHo), is
not statistically significant at the (α) level. The population mean could be (μHo).
Column 2--means, 2 groups (or paired-difference design)
Ho rejected:
The difference between the sample means, (xbar1-xbar2), is statistically significant
at the (α) level. The population means are probably not equal.
Ho not rejected:
The difference between the sample means, (xbar1-xbar2), is not statistically significant
at the (α) level. The population means could be equal.
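As an illustration of filling in the parentheses, the Column 1 wording can be generated by a small function. The function name and its arguments are hypothetical; the sentences are the ones given above.

```python
def one_group_conclusion(x_bar: float, mu_h0: float, alpha: float, rejected: bool) -> str:
    """Fill the Column 1 (mean, one group) template with numbers."""
    if rejected:
        return (
            f"The difference between the sample mean, {x_bar}, and the null "
            f"hypothesis, {mu_h0}, is statistically significant at the {alpha} "
            f"level. The population mean is probably not {mu_h0}."
        )
    return (
        f"The difference between the sample mean, {x_bar}, and the null "
        f"hypothesis, {mu_h0}, is not statistically significant at the {alpha} "
        f"level. The population mean could be {mu_h0}."
    )


print(one_group_conclusion(18.0, 15.0, 0.05, rejected=False))
```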
THE t-DISTRIBUTION
CENTRAL LIMIT THEOREM -- SAMPLING DISTRIBUTIONS OF:
MEANS
DIFFERENCES BETWEEN MEANS
PROPORTIONS
DIFFERENCES BETWEEN PROPORTIONS
ARE ESSENTIALLY NORMAL REGARDLESS OF THE SHAPE OF THE POPULATION DISTRIBUTION,
WHEN SAMPLE SIZES ARE LARGE (n ≥ 30).
WHEN SAMPLE SIZES ARE SMALL (n < 30), SAMPLING DISTRIBUTIONS ARE NO LONGER NORMAL.
THEY FOLLOW t-DISTRIBUTIONS:
SYMMETRICAL,
LOWER AND WIDER THAN THE NORMAL DISTRIBUTION,
LESS CONCENTRATED IN THE CENTER.
t-DISTRIBUTION SHAPE VARIES AS n CHANGES.
THE SMALLER THE n, THE LESS CONCENTRATED IN THE CENTER.
SAMPLE SIZE IS EXPRESSED BY DEGREES OF FREEDOM
df = (n - 1).
t-DISTRIBUTION TABLE
COLUMN HEADINGS -- ONE-SIDED AND TWO-SIDED TAIL AREAS:
BODY OF TABLE CONTAINS t-VALUES (ANALOGOUS TO z-VALUES)--THE NUMBER OF STANDARD
DEVIATIONS FROM THE MEAN
t-VALUES APPROACH z-VALUES AS n INCREASES.
BOTTOM ROW OF THE t-TABLE CONTAINS z-VALUES.
AS n DECREASES, t-VALUES INCREASE.
DUE TO THE LESSER DEGREE OF CENTER CONCENTRATION, AS THE SAMPLE SIZE DECREASES, ONE
MUST MOVE FARTHER FROM THE MEAN IN ORDER TO ENCLOSE A GIVEN PORTION OF THE
DISTRIBUTION.
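The pattern can be checked against a few rows of a standard t-table. The values below are the two-sided 0.05 column; the last entry is the bottom row of the table, which is the z-value. The dictionary name is hypothetical.

```python
# Two-sided alpha = 0.05 table-t values, keyed by degrees of freedom.
# float("inf") stands for the bottom row of the t-table (the z-value).
TABLE_T_05 = {
    1: 12.706,
    2: 4.303,
    5: 2.571,
    10: 2.228,
    20: 2.086,
    30: 2.042,
    float("inf"): 1.960,
}

for df, table_t in TABLE_T_05.items():
    print(f"df = {df}: table-t = {table_t:.3f}")

# Reading down the column, each t-value is smaller than the one above it.
values = list(TABLE_T_05.values())
is_decreasing = all(a > b for a, b in zip(values, values[1:]))
print("t-values shrink toward z as df grows:", is_decreasing)
```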