Chapter 2: Inference Using t-Distributions

2.1

2.1.1 Bumpus’s Data on Natural Selection

Hermon Bumpus (1898) looked at measurements on house sparrows brought to the Anatomical Laboratory of Brown University after an uncommonly severe winter storm. Some had survived the storm and others had not. He was curious whether the birds that died in the storm may have done so because they lacked physical characteristics enabling them to make it through this weather-related “selective” event. Specifically, do humerus lengths tend to be different for survivors than for those that perished?

• Can he actually answer the question of whether the birds died as a result of their humerus lengths? Why or why not?
• What population of birds can the results be inferred to?

2.1.2 Anatomical Abnormalities Associated with Schizophrenia

Researchers investigated whether the sizes of certain areas of the brain may be different in persons with schizophrenia than in others. Suddath et al. (1990) controlled for genetic and socioeconomic differences by examining 15 pairs of monozygotic twins, where one twin was schizophrenic and the other was not. The twins were found through an intensive search throughout Canada and the U.S. One specific question asked whether there is a difference in the volume of the left hippocampus between the twins.

• If there is a difference, would the researchers be able to say that it is caused by the schizophrenia? Why or why not?
• What population can the results be inferred to?

2.2 One-Sample t-Tools and the Paired t-Test

GOAL: Draw inferences about the population mean from a random sample.

2.2.1 The Sampling Distribution of a Sample Average

• Imagine the sampling distribution:
  1. Draw a random sample from a population.
  2. Calculate and record the sample average.
  3. Repeat for every one of the possible equally likely samples.
  4. The collection of all the sample averages is called the sampling distribution of the sample average.
• Reality:
  – We draw only one sample and must make inferences from it.
  – We are able to use the sampling distribution via statistical theory.
  – Think about hypothetically repeating the study (sampling)!
• Study Display 2.3.
• Important facts about the sampling distribution of the average:
  1. Center of the sampling distribution:
  2. Standard deviation of the sampling distribution:
  3. The Central Limit Theorem (CLT):
  Display 2.4 shows what happens with larger n.

2.2.2 The Standard Error of an Average in Random Sampling

The standard error of a statistic is an estimate of the standard deviation of its sampling distribution. It tells us how close the sample statistic should be to the true parameter (a measure of accuracy).

• The degrees of freedom (d.f.) relate to the amount of information used to estimate variability.
  – They are measured in units of “equivalent numbers of independent observations.”
  – Every standard error has a d.f. associated with it.
• Calculating the Standard Error for a Sample Average:
• Questions for thought:
  1. What is the difference between the population standard deviation and the standard deviation of the sampling distribution?
  2. What can we say about the shape of the sampling distribution of the average as compared to that of the population distribution?
  3. What happens to the standard error of an average (and the standard deviation of the sampling distribution of the average) when the sample size increases?
  4. What is the difference between the standard deviation of the sampling distribution of an average and the standard error of an average in random sampling?
• A 95% Confidence Interval for the Mean
  – Helps get at the question: What are plausible values for µ, based on the data?
  – The definition of a confidence interval stems from the sampling distribution!
    Once again, think about repeating the study over and over again and calculating the mean and a 95% confidence interval each time. 95% of the samples will result in confidence intervals that cover the parameter (true value).
  – We consider the parameter, µ, to be fixed and unknown.
  – Suppose you make a 95% confidence interval for µ. What are the two possibilities regarding µ?
    1.
    2.
  – We are NOT saying that there is a 95% chance that µ is in the interval we made (the interval either does or does not include the fixed value of µ).
  – Generic 95% confidence interval:
    ESTIMATE ± (A MULTIPLIER) × (STANDARD ERROR OF ESTIMATE)
  – Example: Minimum in-sleeping-bag temperature was measured for each of n = 30 volunteers. The sample average was 81.2 and the sample standard deviation was 8.1. Construct a 95% confidence interval for the true mean temperature.
    ∗ Write a summary sentence reporting the estimate and confidence interval for the sleeping bag study:
    ∗ What is the appropriate scope of inference?

2.2.3 The t-Ratio Based on a Sample Average

• A t-ratio is the ratio of an estimate’s error to the anticipated size of its error.
• The Z-ratio and t-ratio are just statistics.
• Z-ratio: Use when the population standard deviation is known.
  – If the sampling distribution of the estimate is normal, then the sampling distribution of Z is standard normal (Z ∼ N(0, 1)).
• t-ratio: Use when the population standard deviation is NOT known.
  – Degrees of freedom:
  – When do we know the distribution of the t-ratio?
    ∗ If Ȳ is the average of a random sample of size n from a normally distributed population, then the sampling distribution of the t-ratio is Student’s t-distribution on n − 1 d.f.

2.2.4 Unraveling the t-Ratio

• The t-distribution shows the distribution of possible values for the t-ratio. It is the sampling distribution of the t-ratio test statistic.
• The Paired t-Test for No Difference:
  – Schizophrenia study example:
  – IDEA: We cannot prove that µ ≠ 0, but we can infer it when the observed data would be highly unusual if the null hypothesis (µ = 0) were true.
    ∗ Use your sample as evidence regarding the value of µ (calculate the associated t-ratio).
    ∗ How likely is it that we would observe such a t-ratio (test statistic) IF µ is really 0?
• A One-Sample t-Test:
  – We can test whether the mean of one population is equal to some particular value.
  – Example: A sleeping bag manufacturer claims that their sleeping bags will keep a person warm down to temperatures of −30 degrees F. Optimal in-bed under-blanket temperature is approximately 86 degrees F. The manufacturer’s specific claim is that the mean interior temperature of the sleeping bag will stay above 80 degrees F when the exterior temperature is −30 degrees F.
  – How does the one-sample t-test differ from the paired t-test?
  – When should you use a paired t-test instead of a two-sample t-test? (This is very important!)
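
The one-sample t-tools above can be sketched numerically. The following is a minimal Python illustration, not part of the original notes, that applies the generic form ESTIMATE ± (MULTIPLIER) × (STANDARD ERROR) to the sleeping bag summary statistics from 2.2.2 (n = 30, average 81.2, standard deviation 8.1) and then forms a t-ratio. Two things are assumptions made only for illustration: that the same sample is used to examine the manufacturer's 80 degree claim, and that the alternative of interest is one-sided (mean above 80). The helper names (t_pdf, t_cdf, t_quantile) are hypothetical; they rely only on the standard library and approximate the t-distribution numerically, where a statistics package or a t-table would normally be used.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution on df degrees of freedom."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return const * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] and symmetry about 0."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area

def t_quantile(p, df):
    """Value q with P(T <= q) = p, found by bisection on t_cdf."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Summary statistics from the sleeping bag example in 2.2.2.
n, ybar, s = 30, 81.2, 8.1
df = n - 1                      # d.f. attached to the standard error
se = s / math.sqrt(n)           # standard error of the sample average

# 95% confidence interval: ESTIMATE +/- MULTIPLIER x SE.
multiplier = t_quantile(0.975, df)
lower = ybar - multiplier * se
upper = ybar + multiplier * se

# ASSUMPTION (illustration only): compare this sample with the claimed 80 degrees F,
# using a one-sided alternative that the mean is above 80.
mu0 = 80.0
t_ratio = (ybar - mu0) / se
p_one_sided = 1 - t_cdf(t_ratio, df)

print(f"SE = {se:.3f} on {df} d.f., multiplier = {multiplier:.3f}")
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
print(f"t-ratio = {t_ratio:.3f}, one-sided p-value = {p_one_sided:.3f}")
```

Under these assumptions the interval comes out to roughly 78.2 to 84.2. A paired t-test uses the same calculation, with the within-pair differences in place of the single-sample measurements and a hypothesized mean difference of 0.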