Lecture 7 Outline - Wharton Statistics Department

Lecture 7 Outline
• Levene’s test for equality of variances
(4.5.3)
• Interpretation of p-values (2.5.1)
• Robustness and resistance of t-tools (3.13.4)
Bumpus’ Data Revisited
• Bumpus concluded that sparrows were subjected to
stabilizing selection – birds that were markedly different
from the average were more likely to have died.
• Bumpus (1898): “The process of selective elimination is
most severe with extremely variable individuals, no matter
in what direction the variations may occur. It is quite as
dangerous to be conspicuously above a certain standard of
organic excellence as it is to be conspicuously below the
standard. It is the type that nature favors.”
• Bumpus’ hypothesis is that the variance of physical
characteristics in the survivor group should be smaller than
the variance in the perished group
Testing Equal Variances
• Two independent samples from populations with
variances $\sigma_1^2$ and $\sigma_2^2$
• H0: $\sigma_1^2 = \sigma_2^2$ vs. H1: $\sigma_1^2 \neq \sigma_2^2$
• Levene’s Test – Section 4.5.3
• In JMP, Fit Y by X, under red triangle next to
Oneway Analysis of humerus by group, click
Unequal Variances. Use Levene’s test.
• p-value = .4548, no evidence that variances are not
equal, thus no evidence for Bumpus’ hypothesis.
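As a rough illustration of what a Levene-type test computes, the sketch below uses the variant based on absolute deviations from each group's mean and returns only the test statistic. The function name and the data are hypothetical (they are not Bumpus' measurements), and JMP's implementation may differ in detail. A p-value would come from comparing the statistic to an F distribution, for example with `scipy.stats.levene(..., center='mean')`.

```python
from statistics import mean

def levene_statistic(*groups):
    """One-way ANOVA F-statistic computed on absolute deviations
    of each observation from its own group mean (hypothetical helper)."""
    # Step 1: replace each observation by its absolute deviation from the group mean
    z = []
    for g in groups:
        center = mean(g)
        z.append([abs(y - center) for y in g])

    # Step 2: ordinary one-way ANOVA F-statistic on the deviations
    k = len(z)
    n_total = sum(len(g) for g in z)
    grand = mean([v for g in z for v in g])
    group_means = [mean(g) for g in z]
    between = sum(len(g) * (m - grand) ** 2 for g, m in zip(z, group_means))
    within = sum((v - m) ** 2 for g, m in zip(z, group_means) for v in g)
    return (between / (k - 1)) / (within / (n_total - k))

# Made-up illustration data, not the sparrow measurements
survived = [1, 2, 3, 4, 5]
perished = [2, 4, 6, 8, 10]
print(levene_statistic(survived, perished))
```

A large value of the statistic suggests the groups differ in spread; with two groups it equals the square of a pooled two-sample t-statistic computed on the absolute deviations.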
t-tests for randomized experiments
• Section 2.4
• t-test (with its associated Student t distribution
under H0) has been developed in Ch. 2 for making
inferences to populations using the random
sampling probability model.
• In Ch. 1, we studied making causal inferences in
the additive treatment effect model using the
probability model of a randomized experiment.
Y Y
t

• The two-sample t-statistic SE(Y  Y ) is a
reasonable test statistic for testing H0: additive
treatment effect is  * , basically equivalent to Y1  Y2
1
2
1
2
t-test for randomized experiments cont.
• When the two group sizes are large ($n_1 > 30$, $n_2 > 30$), the t-test
provides an approximately correct p-value for a
randomized experiment, i.e., the distribution of
the t-statistic under the null hypothesis of an additive
treatment effect of 0 is approximately a t distribution with
$n_1 + n_2 - 2$ degrees of freedom.
• See Display 2.11
• Bottom line: t-test in JMP can be used to make
approximately correct inferences (p-values and CIs) for
randomized experiments but inferences should be phrased
in terms of additive treatment effects rather than difference
in population mean.
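To make concrete what the t-test is approximating, here is a minimal sketch of an exact randomization test for a null hypothesis of zero additive treatment effect. It uses the difference in group averages as the test statistic and enumerates every possible assignment, so it is only practical for small groups. The function name and the data are hypothetical.

```python
from itertools import combinations
from statistics import mean

def randomization_p_value(group1, group2):
    """Two-sided exact randomization p-value for H0: additive treatment
    effect is 0, using the difference in averages as the test statistic."""
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    observed = mean(group1) - mean(group2)

    extreme = 0
    total = 0
    # Under H0 the responses are fixed; only the group labels are random
    for idx in combinations(range(len(pooled)), n1):
        chosen = set(idx)
        g1 = [pooled[i] for i in chosen]
        g2 = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = mean(g1) - mean(g2)
        # Small tolerance guards against floating-point ties
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

# Made-up responses for two groups of three units each
print(randomization_p_value([1, 2, 3], [4, 5, 6]))  # 2 of 20 assignments are as extreme
```

With larger groups the number of assignments grows too quickly to enumerate, which is where the t-distribution approximation becomes convenient.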
Notes about tests, p-values
• Interpretation of p-value:
– Formally: the probability of random sampling (or
random assignment) leading to a test statistic at least as
extreme as the observed one if H0 is true.
– Informally, the degree of credibility in H0.
• Conclusions from p-values
– (a) Small p-values mean either (i) H0 is wrong or (ii)
we obtained an unusual sample
– (b) Large p-values mean either (i) H0 is correct or (ii)
the study isn’t large enough to conclude otherwise (i.e.,
the data are consistent with H0 being true but do not
prove it).
Conceptual Question 2.8
• Suppose the following statement is made in
a statistical summary: “A comparison of
breathing capacities in individuals in
households with low nitrogen dioxide levels
and individuals in households with high
nitrogen dioxide levels indicated that there
is no difference in the means (two-sided p-value = .24).” What is wrong with this
statement?
Interpretation of p-values
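• So which p-values are small and which are large?
• For reference: chance of
– 3 heads in 3 coin tosses is .125
– 4 heads in 4 coin tosses is .063
– 5 heads in 5 coin tosses is .031
– 6 heads in 6 coin tosses is .016
– 7 heads in 7 coin tosses is .008
– 8 heads in 8 coin tosses is .004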
• See Display 2.12 for a subjective guide.
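The coin-toss reference values are simply powers of one half. A short sketch, with a hypothetical helper name, that reproduces them as exact fractions and unrounded decimals:

```python
from fractions import Fraction

def chance_all_heads(k):
    """Probability that k tosses of a fair coin are all heads."""
    return Fraction(1, 2) ** k

for k in range(3, 9):
    p = chance_all_heads(k)
    # Shows the exact fraction next to its unrounded decimal value
    print(f"{k} heads in {k} tosses: {p} = {float(p)}")
```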
Closer Look at Assumptions
• Chapter 3
• t-test and CIs based on the assumptions that
– (i) the population distributions are normal
– (ii) the population distributions have same S.D.
– (iii) the sample observations are independent
• These ideal assumptions, particularly (i)
and (ii), are never met exactly.
Usefulness of t-tools
• The t-tests and CIs are still quite usable if
we
– understand their robustness and resistance
– consider transformations, e.g. log(Y)
– have a strategy for outliers
– are prepared to label inferences as
“approximate”
– additionally, we have alternative tools (Ch. 4)
Case study 3.1.2: Effect of Agent Orange
• Many Vietnam veterans are concerned that their health
may have been affected by exposure to Agent Orange, a
herbicide sprayed in South Vietnam between 1962 and
1970.
• Particularly worrisome component of Agent Orange is a
dioxin called TCDD which in high doses is known to be
associated with certain cancers.
• Nonrandom samples of 646 Vietnam vets and 97 non-Vietnam vets who
entered the Army between 1965 and 1971
and served only in the U.S. or Germany; dioxin levels of both
samples were measured in 1987.
• Question of interest: Are current (1987) dioxin levels
higher in population of Vietnam vets?
Robustness of two-sample t-tools
• A statistical procedure is robust to departures from
a particular assumption if it is valid even when the
assumption is not met exactly
• Valid means that the uncertainty measures – the
confidence levels and p-values – are nearly equal
to the stated rates, e.g., a procedure for obtaining a
95% confidence interval is valid if it is roughly
95% successful in capturing the parameter
• Statisticians know something about robustness
from advanced theory and computer simulation.
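The kind of computer simulation mentioned above can be sketched as follows. This is only an illustration under assumed settings: two skewed (exponential) populations with identical distributions, so the true difference in means is 0, and 10 observations per group. It counts how often the pooled-SD 95% interval captures 0. The names `N`, `T_CRIT_DF18`, and `simulated_coverage` are hypothetical, and the critical value is hard-coded for 18 degrees of freedom.

```python
import random
from math import sqrt
from statistics import mean, variance

N = 10                 # observations per group (assumed for this sketch)
T_CRIT_DF18 = 2.101    # 97.5th percentile of the t distribution with 2*N - 2 = 18 df

def simulated_coverage(reps=2000, seed=1):
    """Fraction of simulated 95% pooled t-intervals that contain the
    true difference in means (0) when both populations are exponential."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        y1 = [rng.expovariate(1.0) for _ in range(N)]
        y2 = [rng.expovariate(1.0) for _ in range(N)]
        # Pooled variance with equal group sizes
        sp2 = ((N - 1) * variance(y1) + (N - 1) * variance(y2)) / (2 * N - 2)
        se = sqrt(sp2 * (1 / N + 1 / N))
        diff = mean(y1) - mean(y2)
        # The interval diff +/- t*se contains 0 exactly when |diff| <= t*se
        if abs(diff) <= T_CRIT_DF18 * se:
            hits += 1
    return hits / reps

print(simulated_coverage())
```

If the procedure is robust to this kind of nonnormality, the printed fraction should be close to 0.95; changing the population shapes, spreads, or group sizes shows where validity starts to break down.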
How important is normality?
• If the sample sizes are large ($n_1 > 30$, $n_2 > 30$), the t-tests
will be valid no matter how nonnormal the
populations are.
• If the two populations have the same S.D. and
approximately the same shape and if $n_1 \approx n_2$,
validity of t-tools is affected moderately by long-tailedness
and very little by skewness.
• See Display 3.4
• See Chapter 3.2 for how t-tools are affected by
departures from normality and equal S.D. in other
situations.
Departures from Independence
• Independence: Knowledge of one observation can’t help to
predict another.
• Common violations of independence assumption:
– Cluster effects (Y’s from same cluster, e.g., litters, are
similar)
– Serial effects (Y’s close together in time or space are
similar)
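• Effect of lack of independence on validity of t-tools:
in general $\mathrm{Var}(\bar{Y}_2 - \bar{Y}_1) \neq \mathrm{Var}(\bar{Y}_1) + \mathrm{Var}(\bar{Y}_2)$,
so the usual standard error is wrong, the t-ratio no longer has a
t-distribution, and t-tools may give misleading results.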
• If cluster effects occur in pairs, use matched pairs t-test.
• If we suspect other types of non-independence, use Ch. 9-15 tools.
Recognizing Matched Pairs Studies
• Is there a natural relationship between the first
observation in group 1 and the first observation in
group 2 that makes it more appropriate to compare
them with each other than to compare the first
observation in group 1 with the second observation
in group 2?
• Before and after designs
• Example: A researcher for OSHA wants to see
whether cutbacks in enforcement of safety
regulations coincided with an increase in work-related
accidents. For 20 industrial plants, she has the
number of accidents in 1980 and 1995.
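For a before-and-after design like this one, the matched pairs t-test works on the within-pair differences. A minimal sketch, assuming made-up accident counts for three plants (not the 20 plants in the example) and a hypothetical helper name:

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """t-statistic and degrees of freedom for H0: mean within-pair
    difference (after - before) is 0."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    se = stdev(diffs) / sqrt(n)   # standard error of the average difference
    return mean(diffs) / se, n - 1

# Made-up counts: each position is the same plant in both years
accidents_1980 = [10, 12, 9]
accidents_1995 = [11, 14, 12]
t, df = paired_t_statistic(accidents_1980, accidents_1995)
print(t, df)
```

Pairing removes the plant-to-plant variation from the comparison, which the two-sample t-test would otherwise count as noise.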
Outliers and resistance
• Outliers are observations relatively far from their
estimated means.
• Outliers may arise if either
– (a) the population distribution is long-tailed, or
– (b) they don’t belong to the population of interest
(they come from a contaminating population)
• A statistical procedure is resistant if one or a few
outliers cannot have an undue influence on result.
Resistance
• Illustration for understanding resistance: the
sample mean is not resistant; the sample
median is.
– Sample: 9, 3, 5, 8, 100
– Mean with outlier: 25, without: 6.25
– Median with outlier: 8, without: 6.5
• t-tools are not resistant to outliers because
they are based on sample means.
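The illustration can be checked directly with Python's standard library, using the same five-number sample:

```python
from statistics import mean, median

sample = [9, 3, 5, 8, 100]
without_outlier = [y for y in sample if y != 100]

# The mean moves a lot when the outlier is dropped
print(mean(sample), mean(without_outlier))      # 25 and 6.25

# The median barely moves
print(median(sample), median(without_outlier))  # 8 and 6.5
```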
Practical two-sample strategy
• Think about independence – use tools from later in
course (or matched pairs) if there’s a potential
problem
• Use graphical displays to assess: normality,
spread, outliers
• If there are outliers, investigate them and see
whether they (i) change conclusions; (ii) warrant
removal. Follow the outlier examination strategy
in Display 3.6.
Excluding Observations from Analysis in JMP for Investigating Outliers
• Click on row you want to exclude.
• Click on rows menu and then click
exclude/unexclude. A red circle with a line
through it will appear next to the excluded
observation.
• Multiple observations can be excluded.
• To include an observation that was excluded back
into the analysis, click on excluded row, click on
rows menu and then click exclude/unexclude. The
red circle next to observation should disappear.
Conceptual Question #6
• (a) What course of action would you propose for
the statistical analysis if it was learned that
Vietnam veteran #646 (the largest observation in
Display 3.6) worked for several years, after
Vietnam, handling herbicides with dioxin?
• (b) What would you propose if this was learned
instead for Vietnam veteran #645 (second largest
observation)?