5811 Lab 5

Agenda
Soc 5811 Lab #5
10.10.05
I. Welcome
1. Third problem set due tomorrow.
2. Review last week’s lab and answer any remaining questions about sampling
distributions.
3. We will be using different datasets in upcoming weeks. Let me know if you
have a dataset you would like us to use in lab.
4. Lab handouts, datasets, and other information can be found at:
http://www.tc.umn.edu/~long0324/
II. Objectives
1. More confidence intervals.
2. Begin hypothesis testing.
III. Confidence Intervals (review)
1. A confidence interval is the “range of values around a point estimate that
makes it possible to state the probability that an interval contains the population
parameter between its lower and upper bounds” (Bohrnstedt & Knoke p.90). In
short, a confidence interval is the range of values in which our population
parameter is likely to fall. We calculate a confidence interval using our sample
mean and our best estimate of the standard error.
2. When we have a large sample size, we can use a Z-table to determine the
number of standard errors associated with the probability of our confidence
interval. What is the general formula for confidence intervals with a large N?
3. When our N is small, we cannot assume that our sampling distribution is
normal. We can then use the Student’s T-Distribution to determine the critical
values necessary to create the confidence interval. What is the formula for
calculating a confidence interval for a small sample?
4. Find the mean and standard deviation for the number of years of education
(variable educ). Calculate the 95% confidence interval by finding the correct
critical value in the Student’s T-Distribution table. Also, make a visual
representation of the interval by drawing the band around the sample mean in
which the population mean is likely to fall. Check your calculation with SPSS.
Draw confidence interval here:
5. We can also test to see if an estimate falls within a confidence interval using
SPSS. For example, we may have reason to believe that most people go to school
for twelve years, or until they graduate from high school. If we enter 12 as the
test value, SPSS gives us the t-value for our test value as well as the test
value’s deviation from the upper and lower limits of the confidence interval. We
can then determine if our population parameter estimate is close enough to the
sample mean. Recalculate the 95% confidence interval using a test value. How far
does our estimate deviate from the upper and lower bounds of the confidence
interval?
6. Calculate the 90% and 99% confidence intervals for education and overlay
them on top of the interval you drew above. Does either of these capture our
estimate of 12?
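The calculations in this section can be sketched in code as a check on hand work. The following is a minimal Python sketch, not the SPSS procedure: the educ values are hypothetical (they are not from the lab dataset), the function names are our own, and the critical t-values are copied from a standard Student's t-table for df = 9.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, critical_value):
    """Return (lower, upper): sample mean -/+ critical value * standard error."""
    standard_error = stdev(sample) / sqrt(len(sample))  # estimated from the sample
    center = mean(sample)
    margin = critical_value * standard_error
    return center - margin, center + margin


def z_critical(confidence):
    """Two-tailed critical z-value for a large sample (about 1.96 for 95%)."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


# Hypothetical years of education for 10 respondents (NOT the lab dataset).
educ = [12, 16, 12, 14, 10, 12, 18, 13, 12, 16]

# Small N: two-tailed critical t-values for df = N - 1 = 9, read from a t-table.
T_TABLE_DF9 = {0.90: 1.833, 0.95: 2.262, 0.99: 3.250}

for level, t_crit in T_TABLE_DF9.items():
    low, high = confidence_interval(educ, t_crit)
    captures_12 = low <= 12 <= high
    print(f"{level:.0%} CI: ({low:.2f}, {high:.2f})  contains 12: {captures_12}")
```

With a large N, z_critical(0.95) can stand in for the table lookup. The interval built from a t-value is always somewhat wider than the one built from the matching z-value, which is the price of estimating the standard error from a small sample.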
IV. Hypothesis Testing using the World Values Survey
A. Just like your ninth-grade science class, social statistics is built upon
hypothesis testing (although admittedly more complicated). Here is a review of
the hypothesis testing procedure:
1. State the research hypothesis, or the alternate hypothesis, which states
what you think may be true. (H1)
2. State the null hypothesis, which is the hypothesis we are trying to reject
to indirectly support what we think may be true. (H0)
3. Choose an alpha-level for either a one-tailed or two-tailed test.
4. Determine the critical value (z-value or t-value) that corresponds to the
alpha-level.
5. Reject or fail to reject the null hypothesis by comparing our sample t-value or
z-value (the test statistic) with the critical value.
a. What types of errors can be made when we reject or fail to reject
the null hypothesis?
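The procedure above can be sketched as a short Python function. This is only an illustration under stated assumptions: it is a large-sample, two-tailed z-test, the function name z_test is our own, and the summary statistics in the example are hypothetical rather than taken from any lab dataset.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, null_mean, sample_sd, n, alpha=0.05):
    """Two-tailed large-sample test of H0: population mean == null_mean."""
    # Critical value that cuts off alpha/2 in each tail of the normal curve.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Test statistic: distance of the sample mean from the null value,
    # measured in standard errors.
    standard_error = sample_sd / sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    # Reject H0 only if the test statistic lands in the alpha-area.
    return z, critical, abs(z) > critical


# Hypothetical example. H1: mean hours worked != 40. H0: mean hours worked == 40.
z, critical, reject = z_test(sample_mean=42.0, null_mean=40.0, sample_sd=14.0, n=900)
print(f"z = {z:.2f}, critical value = +/-{critical:.2f}, reject H0: {reject}")
```

The function returns the test statistic and the critical value separately so that the comparison in the final step stays visible instead of being hidden inside a single yes/no answer.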
B. Hypothesis Testing About Means- One-Sample T-tests
1. The simplest hypothesis tests are for the mean of a variable for a single
sample. For example, last week we tested (although informally) whether
the average number of hours worked per week in the United States was 40.
We concluded that we could reject a null hypothesis that Americans
worked forty hours per week because the hypothesized value of 40 fell
outside of the 95% confidence interval around our sample mean.
2. We set up a test about a mean around our null hypothesis. While we
don’t know the actual population mean, we can estimate the width of the
sampling distribution (standard error). We can then determine how many
standard errors our observed sample mean falls from our hypothetical
population mean. If we know the sampling distribution is normal, we can
convert that distance into the probability of observing a sample mean at
least that far away if the hypothesized population mean is true. If our
sample mean falls into our alpha-area, we can reject the null hypothesis
because it is highly improbable to observe a sample mean that falls so far
from the population mean when the null hypothesis is true.
a. We typically conduct two-tailed hypothesis tests, in which the
alpha-level is divided into equal areas above and below the mean.
In this case, a sample mean that falls either above or below the null
hypothesis and into the alpha-area can be used to reject the null
hypothesis. However, sometimes we are confident in the direction
of a hypothesis. Instead of pairing the null hypothesis A=B with the
alternate A≠B, we may state a directional alternate hypothesis, A>B
or A<B. We can then place our entire alpha in one end of the
sampling distribution, thus giving us more room to reject the null
hypothesis in that direction.
3. Conduct a one-sample T-test for age (variable x003) with the World
Values Survey subset. Test the somewhat meaningless hypothesis that the
average age of respondents is 42. What is your null hypothesis? Alternate
hypothesis? Is it a one-tailed or two-tailed test? Draw the sampling
distribution around the null hypothesis:
Where does the observed sample mean fall in relation to the null
hypothesis? Can we safely reject the null hypothesis?
4. Suppose we have reason to believe that people in the United States,
Germany, France, and Japan are relatively happy in life. Test the
hypothesis that the average feeling of happiness is “quite happy”
(variable a008; “quite happy”=2). Draw the sampling distribution around
the null hypothesis and place the observed mean:
What did you find? Can we conduct a one-tailed hypothesis test in this case?
Why or why not? How would the sampling distribution and alpha-area
change if we conducted a one-tailed test?
5. Test the null hypothesis that people in the United States, France,
Germany, and Japan are fairly liberal (variable e003). How would you set
up the null and alternate hypotheses? Try using a one-tailed test with an
alpha-level of .05. Draw the sampling distribution around the null
hypothesis and place the observed mean:
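To see how the choice between a one-tailed and a two-tailed test moves the alpha-area, here is a minimal Python sketch. It is an illustration, not SPSS output: the scores are hypothetical (not World Values Survey data), the function name and the tail labels are our own, and the critical values come from the normal curve, which only approximates the t-distribution when N is large.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_test(sample, null_mean, alpha=0.05, tail="two"):
    """Large-sample test of a mean; tail is 'two', 'upper', or 'lower'."""
    standard_error = stdev(sample) / sqrt(len(sample))
    statistic = (mean(sample) - null_mean) / standard_error
    normal = NormalDist()
    if tail == "two":      # H1: population mean != null_mean, alpha split in half
        critical = normal.inv_cdf(1 - alpha / 2)
        reject = abs(statistic) > critical
    elif tail == "upper":  # H1: population mean > null_mean, all alpha on the right
        critical = normal.inv_cdf(1 - alpha)
        reject = statistic > critical
    elif tail == "lower":  # H1: population mean < null_mean, all alpha on the left
        critical = normal.inv_cdf(alpha)
        reject = statistic < critical
    else:
        raise ValueError("tail must be 'two', 'upper', or 'lower'")
    return statistic, critical, reject


# Hypothetical scores for 100 respondents on a 1-3 scale (NOT survey data).
scores = [1] * 18 + [2] * 52 + [3] * 30

for tail in ("two", "upper"):
    statistic, critical, reject = one_sample_test(scores, null_mean=2, tail=tail)
    print(f"{tail}-tailed: statistic = {statistic:.2f}, "
          f"critical = {critical:.2f}, reject H0: {reject}")
```

The same sample can lead to different decisions depending on where the alpha-area is placed, which is why the direction of the test has to be chosen before looking at the data.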
C. Hypothesis Testing for Difference in Means- Independent Samples T-tests (if
time allows)
1. When we want to compare the means for two different groups, we
conduct an independent samples t-test to see if the means for each group
are the same or different. We use a corollary of the Central Limit
Theorem to determine if we can confidently reject a null hypothesis that
the means for two groups are the same. From this, we know that the
difference in sample means also has a sampling distribution, which we
can draw out and mark with the alpha-areas.
a. What information do we need to know to conduct a T-test for
difference in means? How do we estimate the standard error of the
sampling distribution for the difference in means for large
samples? For small samples?
2. Suppose someone claims that Europe is “aging” compared to the U.S.,
meaning the average age in European countries is higher than in the United
States. Test the null hypothesis that the average age (variable x003) of
German and American respondents is the same using an alpha-level of .05
(use variable s003 to determine groups; Germany=276, USA=840). What
is your alternate hypothesis? Draw out the sampling distribution around
the null hypothesis. What are your conclusions?
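A test for a difference in means can be sketched the same way. The Python below is a minimal illustration under stated assumptions: it uses the large-sample standard error that adds the two groups' squared standard errors, it takes its critical value from the normal curve, the function name is our own, and both lists of ages are hypothetical rather than World Values Survey responses.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def difference_in_means_test(group_a, group_b, alpha=0.05):
    """Two-tailed large-sample test of H0: the two population means are equal."""
    mean_diff = mean(group_a) - mean(group_b)
    # Standard error of the difference: each group's variance divided by its N,
    # summed, then square-rooted.
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    statistic = mean_diff / standard_error
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return mean_diff, standard_error, statistic, abs(statistic) > critical


# Hypothetical ages for two groups of 120 respondents each (NOT survey data).
group_one_ages = [25, 38, 47, 52, 61, 70] * 20
group_two_ages = [22, 31, 40, 45, 53, 66] * 20

diff, se, statistic, reject = difference_in_means_test(group_one_ages, group_two_ages)
print(f"difference = {diff:.2f}, standard error = {se:.2f}, "
      f"statistic = {statistic:.2f}, reject H0: {reject}")
```

The sampling distribution here is centered on zero, the difference we would expect if the null hypothesis were true, so the statistic is simply the observed difference expressed in standard errors.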
GENERAL SPSS INSTRUCTIONS
I. Calculating confidence intervals
1. Click on Analyze, Compare Means, One-Sample T-test.
2. Put variables for which you want to construct confidence intervals into the box.
3. Click on Options and select the confidence level for your interval.
4. To see if a particular estimate falls within the confidence interval, type the
estimate into the Test value box.
5. Confidence intervals can also be calculated in the Explore command under the
Descriptive Statistics drop-down window.
II. One Sample T-Tests (hypothesis tests about a mean)
1. Click on Analyze, Compare Means, One-Sample T-test.
2. Put variable of interest into box.
3. Click on Options and choose the confidence level (for example, 95% for an
alpha-level of .05).
4. Type the value of the null hypothesis into the Test value box.
5. Paste and Run.
III. Independent Samples T-Tests (hypothesis tests for differences in means)
1. Click on Analyze, Compare Means, Independent Samples T-Tests.
2. Put variable of interest into box.
3. Put the variable from which you are selecting two groups (in our case, variable
s003) into the Grouping Variables box. Click on Define Groups and type in the
ids for the two groups (ex. 250 for France, 840 for USA).
4. Click on Options and choose the confidence level (for example, 95% for an
alpha-level of .05).
5. Paste and Run.