5811 Lab 4

Agenda
Soc 5811 Lab #4
10.03.05

I. Welcome
   1. Third problem set due Monday, October 11 at start of class
   2. Review second problem set
   3. Lab handouts, datasets, and other information can be found at: http://www.tc.umn.edu/~long0324/

II. Objectives
   1. Learn how to draw sub-samples from the GSS dataset.
   2. Begin inferential statistics and confidence intervals.

III. Sampling and Inferential Statistics (review from last week)
   1. Inferential statistics involves making generalizations about a population using information from a sample along with statistical laws. For example, information in the General Social Survey is used to make broader generalizations about the American population. We will draw random samples of the 2002 General Social Survey to get a “hands-on” feel for how samples and populations are related.
   2. First, a review on population and sampling distributions…
      a. What is the difference between a sample and a population? What is a random sample?
      b. What notation do we use for population parameters and sample statistics?
      c. Recall that the sampling distribution of the mean is made up of mean estimates from all possible samples of a fixed size. Because we rarely know the sampling distribution, we often think of it as a probability distribution. We can then use the probability distribution to judge how close our population estimate is likely to be. (There will be more on this next week.)
   3. For now, we will treat the GSS as our population. Calculate the mean and standard deviation for variable hrs1, the number of hours the respondent worked last week, for the entire GSS.
   4. Next, draw a random sub-sample of 1% of cases, and recalculate the mean and standard deviation for hours worked last week. Record the mean and standard deviation. We will need them later.
   5. Repeat this process nine times, selecting a new sub-sample each time. Then, open up a new SPSS data file and create two new variables, mean1 and stdev1. Enter the 10 values for each mean and standard deviation for each sub-sample. While this is not an actual sampling distribution because we have not taken all possible 10-case samples, it does give us a flavor for how a sampling distribution works.
   6. Calculate the mean and standard deviation of the sub-sample means and standard deviations. How close are they to the actual mean of the survey population?
   7. Repeat steps 4-6 with random sub-samples of 10% of cases. Enter the new variables, mean2 and stdev2, into your new dataset. Compare the means and standard deviations of the two sub-sample sizes. What did you find? Which set of mean estimates is more spread out? Which is likely to provide a better estimate of the population mean?

IV. Confidence Intervals
   1. A confidence interval is the “range of values around a point estimate that makes it possible to state the probability that an interval contains the population parameter between its lower and upper bounds” (Bohrnstedt & Knoke p.90). In short, a confidence interval is the range of values in which our population parameter is likely to fall. We calculate a confidence interval using our sample mean and our best estimate of the standard error.
      a. What is the formula for a 95% confidence interval?
   2. When we have a large sample size, we can use a Z-table to determine the number of standard errors associated with the probability of our confidence interval. What is the general formula for confidence intervals with a large N?
   3. When our N is small, we cannot assume that our sampling distribution is normal. We can then use the Student’s T-Distribution to determine the critical values necessary to create the confidence interval. What is the formula for calculating a confidence interval for a small sample?
   4. Find the mean and standard deviation for variable hrs1, the number of hours worked last week. Calculate the 95% confidence interval by finding the correct critical value in the Student’s T-Distribution table.
      Also, make a visual representation of the interval by drawing the band around the sample mean in which the population mean is likely to fall.
   5. Next, calculate the confidence interval using SPSS (see instructions). A quick preview of hypothesis testing: what if we hypothesized that the population mean for hours worked is 40 hours? Can we be confident in this estimate?

V. Problem Solving
   1. Suppose President Bush plans to provide tax breaks for people who make below the mean income in the United States (I know, I know…just run with it). His policy claims that the mean income in the United States is $18,000, but he needs you to test if this is a close estimate using data from the GSS. Calculate and draw a 95% confidence interval around the sample mean for rincom98. Can we be confident in Bush’s estimate?
   2. Suppose President Bush also supports prayer in schools, arguing that Americans don’t engage in enough religious activity (i.e., only 1 or 2 times a month). Calculate and draw a 95% confidence interval around the sample mean for variable relig80. Can we be confident in Bush’s assessment of religious activity in the United States?

VI. Confidence Intervals and Test Values (if time allows)
   1. Suppose we have reason to believe that the average number of hours per week worked in the United States is 40. We can use this value as a test value when we calculate the confidence interval (see instructions). SPSS then gives us the t-value for our test value as well as the test value’s deviation from the upper and lower limits of the confidence interval. We can then determine if our population parameter estimate is close enough to the sample mean. Calculate the 95% confidence interval for hrs1 with a test value of 40. What did you find?

GENERAL SPSS INSTRUCTIONS

I. Select cases
   1. Go to Data, choose Select cases.
   2. There are a few different options (select a random sample, create a filter variable, etc.). We will be selecting cases based on our desired sample size.
   3. Click on the Random sample of cases button.
   4. Select exactly 10 cases (or however many cases are going to be in your sample). Be sure to put the total number of cases in the second box.
   5. Click Continue, Paste, and then Run.
   6. To select all cases, go back to the Select cases window and click on All cases.

II. Calculating confidence intervals
   1. Click on Analyze, Compare Means, One-Sample T-test.
   2. Put the variables for which you want to construct confidence intervals into the box.
   3. Click on Options and select the confidence level for your interval.
   4. To see if a particular estimate falls within the confidence interval, type the estimate into the Test value box.
   5. Confidence intervals can also be calculated in the Explore command under the Descriptive Statistics drop-down window.
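
For anyone who wants to see the sub-sampling exercise from the Sampling and Inferential Statistics section outside of SPSS, here is a minimal Python sketch of the same idea. It is only an illustration under stated assumptions: the "population" is simulated weekly hours (not the real GSS hrs1 variable), and the helper name subsample_means, the seed, and the population size are all made up for this example.

```python
import random
import statistics


def subsample_means(population, fraction, n_draws, rng):
    """Draw n_draws simple random sub-samples and return the mean of each one."""
    size = max(2, round(len(population) * fraction))
    return [statistics.mean(rng.sample(population, size)) for _ in range(n_draws)]


rng = random.Random(5811)  # fixed seed so the run is repeatable

# Hypothetical stand-in for hrs1: 2,000 simulated weekly hours, NOT real GSS data.
population = [min(89.0, max(1.0, rng.gauss(42, 14))) for _ in range(2000)]

# Treat the full list as the population, as the lab does with the GSS.
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

# Ten sub-samples at each size, mirroring mean1 (1%) and mean2 (10%).
means_1pct = subsample_means(population, 0.01, 10, rng)
means_10pct = subsample_means(population, 0.10, 10, rng)

print(f"population mean = {pop_mean:.2f}, sd = {pop_sd:.2f}")
for label, means in (("1%", means_1pct), ("10%", means_10pct)):
    print(
        f"{label} sub-samples: mean of means = {statistics.mean(means):.2f}, "
        f"sd of means = {statistics.stdev(means):.2f}"
    )
```

With only ten draws per size the comparison is noisy, but the means from the 10% sub-samples will usually cluster more tightly around the population mean than the means from the 1% sub-samples.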

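
The confidence interval steps can also be checked by hand or in a few lines of code. The sketch below assumes the usual form of the interval, sample mean plus or minus a critical value times the standard error (standard deviation divided by the square root of N). The summary statistics are hypothetical numbers chosen for illustration, not GSS results, and the function names are made up for this example. The t critical value is typed in from a t-table, the same way the lab asks you to look it up.

```python
import math
from statistics import NormalDist


def confidence_interval(mean, sd, n, critical_value):
    """Return (lower, upper) for mean +/- critical_value * standard error."""
    standard_error = sd / math.sqrt(n)
    margin = critical_value * standard_error
    return mean - margin, mean + margin


def contains(interval, test_value):
    """True if test_value lies between the lower and upper bounds."""
    lower, upper = interval
    return lower <= test_value <= upper


# Large N: critical value from the standard normal (Z) distribution.
z_95 = NormalDist().inv_cdf(0.975)  # about 1.96 for a two-tailed 95% interval

# Small N: critical value looked up in a t-table.
# 2.262 is the two-tailed 95% table value for df = N - 1 = 9.
t_95_df9 = 2.262

# Hypothetical summary statistics, NOT results from the GSS.
sample_mean, sample_sd = 41.0, 12.0

large_ci = confidence_interval(sample_mean, sample_sd, 900, z_95)
small_ci = confidence_interval(sample_mean, sample_sd, 10, t_95_df9)

for label, ci in (("N = 900 (Z)", large_ci), ("N = 10 (t)", small_ci)):
    print(f"{label}: {ci[0]:.2f} to {ci[1]:.2f}, contains 40? {contains(ci, 40)}")
```

The same sample mean and standard deviation give a much wider interval when N is small, which is why a test value such as 40 can fall inside the small-sample interval but outside the large-sample one.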