Agenda
Soc 5811 Lab #5
10.10.05

I. Welcome
1. Third problem set due tomorrow.
2. Review last week’s lab and answer any remaining questions about sampling distributions.
3. We will be using different datasets in upcoming weeks. Let me know if you have a dataset you would like us to use in lab.
4. Lab handouts, datasets, and other information can be found at: http://www.tc.umn.edu/~long0324/

II. Objectives
1. More confidence intervals.
2. Begin hypothesis testing.

III. Confidence Intervals (review)
1. A confidence interval is the “range of values around a point estimate that makes it possible to state the probability that an interval contains the population parameter between its lower and upper bounds” (Bohrnstedt & Knoke p.90). In short, a confidence interval is the range of values in which our population parameter is likely to fall. We calculate a confidence interval using our sample mean and our best estimate of the standard error.
2. When we have a large sample size, we can use a Z-table to determine the number of standard errors associated with the probability of our confidence interval. What is the general formula for confidence intervals with a large N?
3. When our N is small, we cannot assume that our sampling distribution is normal. Instead, we use the Student’s T-Distribution to determine the critical values necessary to create the confidence interval. What is the formula for calculating a confidence interval for a small sample?
4. Find the mean and standard deviation for the number of years of education (variable educ). Calculate the 95% confidence interval by finding the correct critical value in the Student’s T-Distribution table. Also, make a visual representation of the interval by drawing the band around the sample mean in which the population mean is likely to fall. Check your calculation with SPSS.
Draw confidence interval here:
5. We can also test to see if an estimate falls within a confidence interval using SPSS.
For example, we may have reason to believe that most people go to school for twelve years, or until they graduate from high school. SPSS then gives us the t-value for our test value as well as the test value’s deviation from the upper and lower limits of the confidence interval. We can then determine if our population parameter estimate is close enough to the sample mean. Recalculate the 95% confidence interval using a test value. How far does our estimate deviate from the upper and lower bounds of the confidence interval?
6. Calculate the 90% and 99% confidence intervals for education and overlay them on top of the interval you drew above. Do either of these capture our estimate of 12?

IV. Hypothesis Testing using the World Values Survey
A. Just like your ninth-grade science class, social statistics is built upon hypothesis testing (although admittedly more complicated). Here is a review of the hypothesis testing procedure:
1. State the research hypothesis, or the alternate hypothesis, which states what you think may be true. (H1)
2. State the null hypothesis, which is the hypothesis we are trying to reject to indirectly support what we think may be true. (H0)
3. Choose an alpha-level for either a one-tailed or two-tailed test.
4. Determine the critical value (z-value or t-value) that corresponds to the alpha-level.
5. Reject or fail to reject the null hypothesis by comparing our sample t-value or z-value with the critical value.
a. What types of errors can be made when we reject or fail to reject the null hypothesis?
B. Hypothesis Testing About Means- One-Sample T-tests
1. The simplest hypothesis tests are for the mean of a variable for a single sample. For example, last week we tested (although informally) whether the average number of hours worked per week in the United States was 40. We concluded that we could reject a null hypothesis that Americans worked forty hours per week because 40 fell outside of the 95% confidence interval around our sample mean.
2.
We set up a test about a mean around our null hypothesis. While we don’t know the actual population mean, we can estimate the width of the sampling distribution (the standard error). We can then determine how many standard errors our observed sample mean falls from our hypothetical population mean. If we know the sampling distribution is normal, we can convert that distance into the probability of observing a sample mean at least that far away if the hypothesized population mean is true. If our sample mean falls into our alpha-area, we can reject the null hypothesis because such a sample mean would be highly improbable if the null hypothesis were true.
a. We typically conduct two-tailed hypothesis tests, in which the alpha-level is divided into equal areas above and below the mean. In this case, a sample mean that falls either above or below the null hypothesis and into the alpha-area can be used to reject the null hypothesis. However, sometimes we are confident in the direction of a hypothesis. Instead of testing the null hypothesis A=B against the alternate hypothesis that A and B differ, we may state the alternate hypothesis as A>B or A<B. We can then place our entire alpha in one end of the sampling distribution, thus giving us more room to reject the null hypothesis.
2. Conduct a one-sample T-test for age (variable x003) with the World Values Survey subset. Test the somewhat meaningless hypothesis that the average age of respondents is 42. What is your null hypothesis? Alternate hypothesis? Is it a one-tailed or two-tailed test?
Draw the sampling distribution around the null hypothesis:
Where does the observed sample mean fall in relation to the null hypothesis? Can we safely reject the null hypothesis?
3. Suppose we have reason to believe that people in the United States, Germany, France, and Japan are relatively happy in life. Test the hypothesis that the average feeling of happiness is “quite happy” (variable a008; “quite happy”=2).
Draw the sampling distribution around the null hypothesis and place the observed mean:
What did you find?
Can we conduct a one-tailed hypothesis test in this case? Why or why not? How would the sampling distribution and alpha-area change if we conducted a one-tailed test?
4. Test the null hypothesis that people in the United States, France, Germany, and Japan are fairly liberal (variable e003). How would you set up the null and alternate hypotheses? Try using a one-tailed test with an alpha-level of .05.
Draw the sampling distribution around the null hypothesis and place the observed mean:
C. Hypothesis Testing for Difference in Means- Independent Samples T-tests (if time allows)
1. When we want to compare the means for two different groups, we conduct an independent samples t-test to see if the means for each group are the same or different. We use a corollary of the Central Limit Theorem to determine if we can confidently reject a null hypothesis that the means for two groups are the same. From this, we know that the difference in sample means also has a sampling distribution, which we can draw and mark with the alpha-areas.
a. What information do we need to know to conduct a T-test for difference in means? How do we estimate the standard error of the sampling distribution for the difference in means for large samples? For small samples?
2. Suppose someone claims that Europe is “aging” compared to the U.S., meaning the average age in European countries is higher than in the United States. Test the null hypothesis that the average age (variable x003) of German and American respondents is the same using an alpha-level of .05 (use variable s003 to determine groups; Germany=276, USA=840). What is your alternate hypothesis? Draw out the sampling distribution around the null hypothesis. What are your conclusions?

GENERAL SPSS INSTRUCTIONS
I. Calculating confidence intervals
1. Click on Analyze, Compare Means, One-Sample T-test.
2. Put variables for which you want to construct confidence intervals into the box.
3.
Click on Options and select the confidence level for your interval.
4. To see if a particular estimate falls within the confidence interval, type the estimate into the Test value box.
5. Confidence intervals can also be calculated in the Explore command under the Descriptive Statistics drop-down window.
II. One Sample T-Tests (hypothesis tests about a mean)
1. Click on Analyze, Compare Means, One-Sample T-test.
2. Put the variable of interest into the box.
3. Click on Options and choose the confidence interval percentage (for example, 95% for an alpha-level of .05).
4. Type the value of the null hypothesis into the Test value box.
5. Paste and Run.
III. Independent Samples T-Tests (hypothesis tests for differences in means)
1. Click on Analyze, Compare Means, Independent Samples T-Tests.
2. Put the variable of interest into the box.
3. Put the variable from which you are selecting two groups (in our case, variable s003) into the Grouping Variable box. Click on Define Groups and type in the ids for the two groups (ex. 250 for France, 840 for USA).
4. Click on Options and choose the confidence interval percentage (for example, 95% for an alpha-level of .05).
5. Paste and Run.
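
The three SPSS procedures above can also be checked outside SPSS. What follows is a minimal Python sketch, not part of the lab's SPSS workflow, that uses only the standard library. It computes a t-based confidence interval, a one-sample t-test, and an independent-samples t-test. Everything named here is an assumption made for illustration: the function names are hypothetical, the data values are made up and are not taken from the lab datasets, and the independent-samples version assumes equal variances in the two groups (the pooled estimate), which corresponds to only one of the two rows SPSS reports.

```python
import math
from statistics import mean, stdev, variance

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about zero."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_critical(conf, df):
    """Two-tailed critical value t* with P(-t* <= T <= t*) = conf (bisection)."""
    target = 1 - (1 - conf) / 2
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def confidence_interval(sample, conf=0.95):
    """Sample mean plus or minus t* times the estimated standard error."""
    n = len(sample)
    se = stdev(sample) / math.sqrt(n)
    margin = t_critical(conf, n - 1) * se
    return mean(sample) - margin, mean(sample) + margin

def one_sample_t(sample, test_value):
    """t-value and two-tailed p-value for H0: population mean = test_value."""
    n = len(sample)
    se = stdev(sample) / math.sqrt(n)
    t = (mean(sample) - test_value) / se
    p = max(0.0, 2 * (1 - t_cdf(abs(t), n - 1)))
    return t, p

def independent_t(group_a, group_b):
    """Pooled-variance t-test for H0: the two population means are equal.

    Assumption: equal variances in both groups.
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    t = (mean(group_a) - mean(group_b)) / se
    p = max(0.0, 2 * (1 - t_cdf(abs(t), df)))
    return t, p

# Made-up values for illustration only; not the lab data.
educ = [12, 16, 12, 14, 10, 12, 18, 13, 12, 16]
age_germany = [45, 52, 38, 61, 47, 55, 49, 58]
age_usa = [41, 36, 50, 44, 39, 47, 42, 35]

low, high = confidence_interval(educ, 0.95)
print(f"95% CI for educ: ({low:.2f}, {high:.2f})")

t_educ, p_educ = one_sample_t(educ, 12)
print(f"One-sample test against 12: t = {t_educ:.3f}, p = {p_educ:.3f}")

t_age, p_age = independent_t(age_germany, age_usa)
print(f"Germany vs. USA age: t = {t_age:.3f}, p = {p_age:.3f}")
```

The p-values returned are two-tailed. For a one-tailed test in the hypothesized direction, halve the p-value, as long as the sample difference points in that direction. If SciPy is installed, `scipy.stats.ttest_1samp` and `scipy.stats.ttest_ind` perform the same tests without the hand-written t-distribution code.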