Chapter 2: Inference Using t-Distributions
2.1
2.1.1 Bumpus’s Data on Natural Selection
Hermon Bumpus (1898) looked at measurements on house sparrows brought to the Anatomical
Laboratory of Brown University after an uncommonly severe winter storm. Some had survived
the storm and others had not. He was curious whether the birds that had died in the storm
may have done so because they lacked physical characteristics enabling them to make it through
this weather-related “selective” event. Specifically, do humerus lengths tend to be different for
survivors than for those that perished?
• Can he actually answer the question of whether the birds died as a result of their humerus
lengths? Why or why not?
• What population of birds can the results be inferred to?
2.1.2 Anatomical Abnormalities Associated with Schizophrenia
Researchers investigated whether sizes of certain areas of the brain may be different in persons
with schizophrenia than in others. Suddath et al. (1990) controlled for genetic and socioeconomic
differences by examining 15 pairs of monozygotic twins, where one was schizophrenic and the
other was not. The twins were found through an intensive search throughout Canada and
the U.S. One specific question asked whether there is a difference in the volume of the left
hippocampus between the twins.
• If there is a difference, would the researchers be able to say that it is caused by the
schizophrenia? Why or why not?
• What population can the results be inferred to?
2.2 One-Sample t-Tools and the Paired t-Test
GOAL: Draw inference about the population mean from a random sample.
2.2.1 The Sampling Distribution of a Sample Average
• Imagine the sampling distribution:
1. Draw a random sample from a population
2. Calculate and record the sample average
3. Repeat for every one of the possible equally likely samples.
4. The collection of all the sample averages is called the sampling distribution of the
sample average
• Reality:
– We draw only one sample and must make inference from it.
– We are able to use the sampling distribution via statistical theory.
– Think about hypothetically repeating the study (sampling)! Study Display 2.3.
• Important facts about the sampling distribution of the average:
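1. Center of the sampling distribution: the mean of the sample averages equals the population mean, µ.
2. Standard deviation of the sampling distribution: σ/√n, where σ is the population standard deviation and n is the sample size.
3. The Central Limit Theorem (CLT): for large n, the sampling distribution of the average is approximately normal, even when the population distribution is not.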
Display 2.4 shows what happens with larger n.
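The thought experiment above can be approximated by simulation. What follows is a minimal sketch, not part of the original notes. It assumes a hypothetical right-skewed (exponential) population with mean 1 and standard deviation 1, and it draws 5000 random samples rather than every possible sample. The function name and all settings are illustrative choices. Under these assumptions the sample averages should center near µ = 1 with a spread near σ/√n.

```python
import math
import random
import statistics


def simulate_sample_averages(n, reps, rng):
    """Draw `reps` random samples of size `n` and return their averages.

    Assumption: the population is exponential with mean 1 and SD 1,
    chosen only because it is clearly non-normal (right-skewed).
    """
    averages = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        averages.append(statistics.fmean(sample))
    return averages


rng = random.Random(2)  # fixed seed so the run is reproducible
results = {}
for n in (5, 50):
    averages = simulate_sample_averages(n, reps=5000, rng=rng)
    center = statistics.fmean(averages)  # should be close to mu = 1
    spread = statistics.stdev(averages)  # should be close to sigma / sqrt(n)
    results[n] = (center, spread)
    print(f"n={n:2d}  center={center:.3f}  spread={spread:.3f}  "
          f"sigma/sqrt(n)={1 / math.sqrt(n):.3f}")
```

A histogram of the averages for n = 50 should look noticeably more symmetric than the population itself, which is the Central Limit Theorem at work.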
2.2.2 The Standard Error of an Average in Random Sampling:
The standard error of a statistic is an estimate of the standard deviation of its sampling
distribution.
It tells us about how close the sample statistic should be to the true parameter value. (Measure of
accuracy)
• The Degrees of Freedom (d.f.) relates to the amount of information used to estimate
variability.
– It is measured in units of “equivalent numbers of independent observations.”
– Every standard error has a d.f. associated with it.
• Calculating the Standard Error for a Sample Average: SE(Ȳ) = s/√n, with n − 1 d.f., where s is the sample standard deviation.
• Questions for thought:
1. What is the difference between the population standard deviation and the standard
deviation of the sampling distribution?
2. What can we say about the shape of the sampling distribution of the average as
compared to that of the population distribution?
3. What happens to the standard error of an average (and standard deviation of the
sampling distribution of the average) when sample size increases?
4. What is the difference between the standard deviation of the sampling distribution of
an average and the standard error of an average in random sampling?
• A 95% Confidence Interval for the Mean
– Helps get at the question: What are plausible values for µ, based on the data?
– The definition of a confidence interval stems from the sampling distribution! Once
again, think about repeating the study over and over again and calculating the mean
and a 95% confidence interval each time. 95% of the samples will result in confidence
intervals that cover the parameter (true value).
– We consider the parameter, µ, to be fixed and unknown.
– Suppose you make a 95% confidence interval for µ. What are two possibilities regarding µ?
1.
2.
– We are NOT saying that there is a 95% chance that µ is in the interval we made (the
interval either does or does not include the fixed value of µ).
– Generic 95% confidence interval:
ESTIMATE ± (A MULTIPLIER) × (STANDARD ERROR OF ESTIMATE)
– Example: Minimum in-sleeping-bag temperature was measured for each of n = 30
volunteers. The sample average was 81.2 degrees F and the sample standard deviation was
8.1 degrees F. Construct a 95% confidence interval for the true mean temperature.
∗ Write a summary sentence reporting the estimate and confidence interval for the
sleeping bag study:
∗ What is the appropriate scope of inference?
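The sleeping bag interval can be worked through numerically. This is a minimal sketch under stated assumptions: the multiplier 2.045 is the 97.5th percentile of the t-distribution on 29 d.f. as read from a standard t-table, and the helper name `confidence_interval` is illustrative. The second part checks the repeated-sampling interpretation by simulation, assuming a hypothetical normal population with mean 80 and standard deviation 8 (made-up values, not study data).

```python
import math
import random
import statistics

T_MULTIPLIER_29_DF = 2.045  # 97.5th percentile of t on 29 d.f. (from a t-table)


def confidence_interval(average, sd, n, multiplier):
    """Return (lower, upper, standard_error) for ESTIMATE +/- MULTIPLIER x SE."""
    standard_error = sd / math.sqrt(n)  # SE of an average: s / sqrt(n)
    half_width = multiplier * standard_error
    return average - half_width, average + half_width, standard_error


# Sleeping bag example: n = 30, sample average 81.2, sample SD 8.1
lower, upper, se = confidence_interval(81.2, 8.1, 30, T_MULTIPLIER_29_DF)
print(f"SE = {se:.2f} on {30 - 1} d.f.; 95% CI: ({lower:.1f}, {upper:.1f})")

# Coverage check: repeat the "study" many times on a hypothetical population
# and count how often the interval covers the (here known) true mean.
TRUE_MEAN, TRUE_SD, REPS = 80.0, 8.0, 2000  # assumed values for illustration
rng = random.Random(1)
covered = 0
for _ in range(REPS):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(30)]
    lo, hi, _se = confidence_interval(
        statistics.fmean(sample), statistics.stdev(sample), 30, T_MULTIPLIER_29_DF
    )
    covered += lo <= TRUE_MEAN <= hi
coverage = covered / REPS
print(f"Proportion of intervals covering the true mean: {coverage:.3f}")
```

The first line reports a standard error of about 1.48 and an interval of roughly 78.2 to 84.2. The simulated proportion should land near 0.95: any single interval either covers the true mean or it does not, and the 95% describes the procedure over many repetitions.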
2.2.3 The t-Ratio Based on a Sample Average:
• A t-ratio is a ratio of an estimate’s error to the anticipated size of its error.
• The Z-ratio and t-ratio are just statistics.
• Z-ratio: Use when the population standard deviation is known.
– If the sampling distribution of the estimate is normal, then the sampling
distribution of Z is standard normal (Z ∼ N (0, 1)).
• t-ratio: Use when the population standard deviation is NOT known.
– Degrees of freedom: n − 1, the d.f. associated with the standard error in the denominator.
– When do we know the distribution of the t-ratio?
∗ If Ȳ is the average of a random sample of size n from a normally distributed
population, then the sampling distribution of the t-ratio is Student’s
t-distribution on n − 1 d.f.
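Written out for a sample average, the two ratios take the following form. This is a sketch in standard notation that the notes do not spell out: Ȳ is the sample average, µ the population mean, σ the population standard deviation, s the sample standard deviation, and n the sample size.

```latex
\[
Z = \frac{\bar{Y} - \mu}{\sigma / \sqrt{n}},
\qquad
t = \frac{\bar{Y} - \mu}{\mathrm{SE}(\bar{Y})}
  = \frac{\bar{Y} - \mu}{s / \sqrt{n}}
\quad \text{with } n - 1 \text{ d.f.}
\]
```

The only difference is the denominator: replacing the known σ with its estimate s is what moves the reference distribution from the standard normal to Student’s t.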
2.2.4 Unraveling the t-Ratio:
• The t-distribution shows the distribution of possible values for the t-ratio. It is a sampling
distribution of the t-ratio test statistic.
• The Paired t-Test for No Difference:
– Schizophrenia study example:
– IDEA: We cannot prove that µ ≠ 0, but we can infer it when the observed data
would be highly unusual if the null hypothesis (µ = 0) were true.
∗ Use your sample as evidence regarding the value of µ (calculate the associated
t-ratio).
∗ How likely is it that we would observe such a t-ratio (test statistic) IF µ is really
0?
• A one-sample t-Test:
– We can test whether the mean of one population is equal to some particular value.
– Example: A sleeping bag manufacturer claims that their sleeping bags will keep a
person warm down to temperatures of −30 degrees F. Optimal in-bed under-blanket
temperature is approximately 86 degrees F. The manufacturer’s specific claim is that
the mean interior temperature of the sleeping bag will stay above 80 degrees F when
the exterior temperature is −30 degrees F.
– How does the one-sample t-test differ from the paired t-test?
– When should you use a paired t-test instead of a two-sample t-test? (This is very
important!)
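Both tests in this section reduce to the same calculation, which the sketch below makes explicit. It is a minimal illustration, not the analysis from the notes: the function names are illustrative, the paired volumes are made-up numbers (not the Suddath et al. data), and the one-sample part assumes that the summary statistics from the earlier sleeping bag sample (n = 30, average 81.2, SD 8.1) are the ones used to assess the 80 degrees F claim. Turning a t-ratio into a p-value still requires a t-table or statistical software.

```python
import math
import statistics


def t_ratio_from_summary(average, sd, n, null_mean):
    """Return (t_ratio, d.f.) where t = (average - null_mean) / (sd / sqrt(n))."""
    standard_error = sd / math.sqrt(n)
    return (average - null_mean) / standard_error, n - 1


def one_sample_t(values, null_mean):
    """One-sample t-ratio for the hypothesis that the population mean is null_mean."""
    return t_ratio_from_summary(
        statistics.fmean(values), statistics.stdev(values), len(values), null_mean
    )


def paired_t(first, second):
    """Paired t-ratio: a one-sample t-ratio on the within-pair differences."""
    if len(first) != len(second):
        raise ValueError("paired data need one value from each member of every pair")
    differences = [a - b for a, b in zip(first, second)]
    return one_sample_t(differences, null_mean=0.0)


# Hypothetical paired measurements (made-up numbers for illustration only)
unaffected = [1.9, 1.4, 1.6, 1.8, 2.0]
affected = [1.6, 1.3, 1.5, 1.9, 1.7]
t_paired, df_paired = paired_t(unaffected, affected)
print(f"paired: t = {t_paired:.2f} on {df_paired} d.f.")

# Sleeping bag claim, assuming the earlier sample summary applies
t_bag, df_bag = t_ratio_from_summary(81.2, 8.1, 30, null_mean=80.0)
print(f"one-sample: t = {t_bag:.2f} on {df_bag} d.f.")
```

Note that `paired_t` never looks at the two columns separately. It works only with the differences, so the d.f. come from the number of pairs, not the number of individual measurements.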