Introductory Workshop

SPSS

CSU Bakersfield

December 9, 2005

Acknowledgements

• Kaye Bragg, Director, Faculty Teaching and Learning Center

• Peggy Leapley, Nursing

Facilitators

• Ed Nelson – CSU Fresno ednelson@csufresno.edu

• Jim Ross – CSU Bakersfield jross@csub.edu

• Campus representatives for the Social Science Research and Instructional Council (SSRIC)

Social Science Research and Instructional Council (SSRIC)

• Discipline council for the social sciences made up of representatives from each campus in the CSU. List of campus representatives can be found at http://www.ssric.org/reps

• Promotes use of data analysis in research and teaching

• Website is at http://www.ssric.org

Social Science Data Bases

• The SSRIC helps maintain and promote the use of the social science data bases in the CSU

• Data bases include:

– Inter-university Consortium for Political and Social Research (ICPSR)

– The Field Institute

– The Roper Center for Public Opinion Research

• We’ll explore these data bases and how to use them at the workshop tomorrow

Agenda for the Introductory SPSS Workshop

• Overview of SPSS

• A brief tour

• Transforming data

– Recode

– Compute

– Select If

• Univariate analysis

– Frequencies

– Descriptives

– Explore

• A look ahead at the intermediate workshop

Overview of SPSS

• SPSS is a statistical package for beginning, intermediate, and advanced data analysis

• Other statistical packages include SAS and Stata

• Online statistical packages that don’t require site licenses include SDA

Text – SPSS for Windows Version 13: A Basic Tutorial

• Authors: Linda Fiddler (Bakersfield), Laura Hecht (Bakersfield), Ed Nelson (Fresno), Elizabeth Nelson (Fresno), Jim Ross (Bakersfield)

• Available from McGraw-Hill Custom Publishing. Call 800-338-3987 to order. Request ISBN 0-07-353671-7

• Available on the web at http://www.csub.edu/~jross/projects/spss/ . The data set for this workshop can be downloaded at this site

Current Version of SPSS

• Current version is 14.0

• Text is for version 13.0

• Text is revised every other version

SPSS Files and Extensions

• Portable file -- .por

• Data file -- .sav

• Output file -- .spo

• Syntax file -- .sps

Opening SPSS

• Go to start and find SPSS for Windows

• Click on SPSS 13.0 for Windows to open

• You’ll need to update your SPSS license every year (or your school technician will do it for you)

Creating Your Own Data File

• We’re not going to go through how to create your own data file because it would take too long, but ch. 2 in the text has a thorough discussion. (Note: the slides for creating your own data file are “hidden” in this PowerPoint presentation.)

• It involves creating:

– Variable names

– Variable labels

– Value labels

– Missing values

Opening an Existing File

• Usually you will want to open a data set that you got from someplace else such as:

– ICPSR

– Field Institute

– Roper Center

• These files will usually be in the form of a:

– SPSS portable file

– SPSS data file

– Raw data file with an SPSS syntax file

– Raw data file without a syntax file

Opening a Portable file

• Click on the yellow open-folder icon to open an existing file

• Change file type to .por

• Browse to where the portable file you want to open is located and double click on that file

Opening a Data File

• Click on the yellow open-folder icon to open an existing file

• Change file type to .sav

• Browse to where the data file you want to open is located and double click on that file

• We’re going to use the data set that comes with the text, gss02a.sav. You can download it from the web site that has the text: http://www.csub.edu/~jross/projects/spss/

Opening a Raw Data File with an SPSS Syntax File

• Sometimes you will need to open a raw data file (ASCII or text) that comes with an accompanying SPSS syntax file

• You will need to modify the “File Handle” and “Save Outfile” commands so they point to the file locations on your computer

• See http://www.icpsr.umich.edu/help/newuser.html#05 for more information

• You may need help doing this. Feel free to contact your campus SSRIC representatives or the facilitators for this workshop

Opening a Raw Data File Without an SPSS Syntax File

• If you don’t have an SPSS syntax file, you will have to use the codebook that came with the data to create your own syntax file

• You may need help doing this. Feel free to contact your campus SSRIC representatives or the facilitators for this workshop

What’s Next?

• Now you know how to open an existing SPSS portable or data file

• Let’s do a quick overview of SPSS and then we’ll learn how to transform variables

A Brief Tour of SPSS

(see ch. 1 in text, pp. 5-10)

• Frequencies – Analyze/Descriptive Statistics/Frequencies

– Select ABANY, move it to the “Variable(s)” box on the right, and click on OK

• Crosstabs – Analyze/Descriptive Statistics/Crosstabs

– Move ABANY to the “Row” box

– Move SEX to the “Column” box

– Click on “Cells” and select “Column” percents

– Click on OK
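To show what “Column” percents mean in the Crosstabs step above, here is a minimal sketch in Python rather than SPSS. The six (ABANY, SEX) pairs are made-up illustration data, not values from gss02a.sav, and `column_percents` is a hypothetical helper name. Each cell count is divided by the total for its column, so every column sums to 100%.

```python
from collections import Counter

# Hypothetical (ABANY, SEX) pairs for illustration only; not real GSS data.
cases = [("yes", "male"), ("no", "male"), ("yes", "male"),
         ("yes", "female"), ("no", "female"), ("no", "female")]

def column_percents(pairs):
    """Return {(row, col): percent of that column's cases that fall in that row}."""
    cell_counts = Counter(pairs)
    col_totals = Counter(col for _, col in pairs)
    return {(row, col): 100.0 * n / col_totals[col]
            for (row, col), n in cell_counts.items()}

table = column_percents(cases)
for (row, col), pct in sorted(table.items()):
    print(f"ABANY={row:<3} SEX={col:<6} {pct:5.1f}%")
```

Percentaging down the columns lets you compare males and females directly even when the two groups differ in size.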

A Brief Tour Continued

• Comparing means – Analyze/Compare Means/Means

– Move AGEKDBRN and EDUC into the “Dependent List” box

– Move SEX to the “Independent List” box

– Click on OK
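The Means procedure above reports the mean of each dependent variable within each category of the independent variable. A rough sketch of that logic in Python (not SPSS syntax) follows. The rows are invented, `group_means` is a hypothetical helper, and `None` stands in for a missing value.

```python
from collections import defaultdict
from statistics import mean

# Invented rows for illustration; None marks a missing value.
rows = [{"sex": 1, "educ": 12}, {"sex": 1, "educ": 16},
        {"sex": 2, "educ": 14}, {"sex": 2, "educ": None},
        {"sex": 2, "educ": 10}]

def group_means(rows, dependent, independent):
    """Mean of `dependent` within each category of `independent`, skipping missing values."""
    groups = defaultdict(list)
    for row in rows:
        if row[dependent] is not None and row[independent] is not None:
            groups[row[independent]].append(row[dependent])
    return {category: mean(values) for category, values in groups.items()}

educ_by_sex = group_means(rows, "educ", "sex")
print(educ_by_sex)
```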

A Brief Tour Continued

• Correlations

– Analyze/Correlate/Bivariate

– Move EDUC, MAEDUC, and PAEDUC into the “Variables” box

– Click on OK
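The bivariate correlation requested above is the Pearson correlation coefficient. The following Python sketch (not SPSS syntax) shows the calculation for one pair of variables. The function name `pearson_r` is hypothetical, `None` stands in for a missing value, and dropping a case only when one of the two values is missing is an assumption about how missing data are handled.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists, skipping pairs with a missing value."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

# Invented years of schooling for respondents (educ) and their fathers (paeduc).
educ = [12, 16, 14, 10, None]
paeduc = [10, 16, 12, 8, 12]
print(round(pearson_r(educ, paeduc), 3))
```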

A Brief Tour Continued

• Scatterplots

– Graphs/Scatter/Dot

– Click on “Simple Scatter” and then on “Define”

– Move EDUC into the “Y axis” box

– Move PAEDUC into the “X Axis” box

– Click on OK

Transforming Data

(see ch. 3 in text)

• We can transform variables by recoding which means to combine categories on an existing variable into fewer categories

• We can transform variables by creating new variables out of existing variables

• We can select particular cases and analyze only these cases

• We can do other things like weighting cases that we’re not going to talk about in this workshop. (Note: the slides for weighting data are “hidden” in this PowerPoint presentation.)

Recoding Variables

• Recoding into different variables

• Recoding into the same variable

• We recommend recoding into different variables rather than into the same variable, because recoding into the same variable overwrites the original values

Recoding into Different Variables

• Click on “Transform” and then on “Recode” and then on “into different variables”

• Select the variable you want to recode

• Start by giving the new variable a new name and assigning a variable label to the new variable. Click on “Change”

Recoding AGE into AGE1

• Recode AGE into four categories and give it the name of AGE1

– Click on “Old and New Values”

• Use “Range” (fourth option down) to recode as follows. Remember to click on “Add” after entering each recode

– 18 to 29 = 1

– 30 to 49 = 2

– 50 to 69 = 3

– 70 to 89 = 4
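The four range recodes above amount to a simple lookup. This Python sketch (not SPSS syntax) shows the same mapping. The function name `recode_age` is hypothetical, and returning `None` for missing or out-of-range ages is an assumption meant to mirror a new variable that is left missing when no recode rule applies.

```python
def recode_age(age):
    """Map AGE in years to the four AGE1 categories; None when no range applies."""
    if age is None:
        return None
    if 18 <= age <= 29:
        return 1
    if 30 <= age <= 49:
        return 2
    if 50 <= age <= 69:
        return 3
    if 70 <= age <= 89:
        return 4
    return None  # outside every listed range, so the new variable stays missing

# Hypothetical ages for illustration.
ages = [18, 29, 30, 49, 50, 69, 70, 89, None]
print([recode_age(a) for a in ages])
```

Checking the boundary values (29 and 30, 49 and 50, 69 and 70) is the same idea as comparing the unrecoded and recoded frequency distributions.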

Recoding Options

• When you click on “Old and New Values” there will be seven options

• For most recoding you will only have to use two of these options

– The first option from the top allows you to recode a single value into a new value

– The fourth option from the top allows you to recode a range of values from X to Y into a new value

Assign Value Labels to the Four Categories of AGE1

• Go into “Variable View”

• Find the variable AGE1 (should be at the bottom of the list of variables)

• Click in the “Values” column and then click on the small gray box

• Enter the value labels

• Click on OK

Exercises for Recoding

• INCOME98 is total family income. Do a frequency distribution to see what it looks like before recoding

• Recode into 4 categories and call this new variable INCOME1. Use the following categories: under $20K, $20K to under $40K, $40K to under $60K, and $60K and over

• Add the value labels

• Run a frequency distribution for INCOME1 and check to make sure that you recoded it correctly by comparing the unrecoded and recoded frequency distributions

More Exercises for Recoding

• Now recode INCOME98 again and call the new variable INCOME2

• This time use 8 categories: under $10K, $10K to under $20K, $20K to under $30K, $30K to under $40K, $40K to under $50K, $50K to under $60K, $60K to under $75K, and $75K and over

• Add the value labels

• Run a frequency distribution for INCOME2 and check to make sure that you recoded it correctly by comparing the unrecoded and recoded frequency distributions

Creating a New Variable with Compute

• Let’s create a new variable called ABORTION, which is the sum of the seven abortion variables

• Click on “Transform” and then on “Compute”

• Enter the new variable name (ABORTION) into the target variable box

• Enter the formula for this new variable into the “Numeric Expression” box

• Click on OK

Dealing with Missing Data

• If there is missing data for any of these variables (ABANY to ABSINGLE), the new variable ABORTION will be assigned a system-missing value

• What do we do if we want to allow no more than two missing values?

• Let’s compute the mean instead, which divides the sum of the abortion values by the number of variables with valid information

• But let’s allow only two variables with missing values
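To make the missing-data problem concrete, here is a small Python sketch (not SPSS syntax) of a sum that becomes missing as soon as any one of its inputs is missing, which is the behavior described above for ABORTION. The function name `strict_sum` and the example answers are hypothetical, and `None` stands in for a missing value.

```python
def strict_sum(values):
    """Sum the values, but return None if any single value is missing."""
    if any(v is None for v in values):
        return None
    return sum(values)

# Hypothetical answers to the seven abortion items (1 = yes, 2 = no).
complete = [1, 1, 2, 2, 1, 1, 2]
one_missing = [1, 1, 2, 2, 1, 1, None]

print(strict_sum(complete))     # all seven items answered
print(strict_sum(one_missing))  # one unanswered item makes the sum missing
```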

Dealing with Missing Data Continued

• Click on “Reset” to erase what is currently in the “Compute Variable” box

• Click on “Statistical” in the “Function Group” box

• Then double click on “Mean” in the “Function and Special Variables” box

• In the “Target Variable” box, enter the name of the new variable. Let’s call it ABORMEAN

• In the “Numeric Expression” box, you should see “MEAN(?,?)”

Dealing with Missing Data Continued

• Replace the “?,?” with the variables you want to include so it reads “MEAN(abany,abdefect,abhlth,abnomore,abpoor,abrape,absingle)”

• Insert .5 following MEAN so it reads “MEAN.5(abany,abdefect,abhlth,abnomore,abpoor,abrape,absingle)”. This indicates that at least five of the seven variables must have valid information, which allows no more than two missing values

• Click on OK
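The “.5” rule above can be sketched in Python (not SPSS syntax) as a mean that is only computed when enough values are valid. The function name `mean_n` and the example answers are hypothetical, and `None` stands in for a missing value.

```python
def mean_n(values, min_valid):
    """Mean of the non-missing values, or None if fewer than min_valid are present."""
    valid = [v for v in values if v is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)

# Hypothetical answers to the seven abortion items (1 = yes, 2 = no).
two_missing = [1, 1, 2, 2, 1, None, None]       # five valid answers: mean is computed
three_missing = [1, 1, 2, 2, None, None, None]  # four valid answers: result is missing

print(mean_n(two_missing, 5))
print(mean_n(three_missing, 5))
```

Note that the divisor is the number of valid answers (five in the first case), not the full seven.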

Exercises for Compute

• There are five variables that measure tolerance for letting someone speak in your community who may have different views than your own: SPKATH, SPKCOM, SPKHOMO, SPKMIL, and SPKRAC

• For each of these variables, 1 means they would allow such a person to speak and 2 means they would not allow it

Exercises for Compute Continued

• Create a new variable (call it SPEAK) which is the sum of these five variables

• Run a frequency distribution for SPEAK

• What do the values in this new variable tell us?

More Exercises for Compute

• Now let’s create a variable called SPKMEAN which allows for one of the five variables (SPKATH to SPKRAC) to be missing

• What happens if there is more than one variable with a missing value?

• How does SPSS calculate the new variable if there is only one variable with a missing value?

Using Select Cases to Select Specific Cases for Analysis

• Let’s select only Protestants for further analysis

• Click on “Data” and then on “Select Cases”

• Click on “If condition is satisfied” and then on the “If” button below it

• Select the variable RELIG and move it into the box on the right

• In this box, enter the expression “relig = 1”

• Click on “Continue” and on OK

Using Select Cases Continued

• Now let’s select Protestants who are under 35 years old

• Enter the expression “relig = 1” as you did before.

• Use & for and. Enter “age < 35” so the expression reads “relig = 1 & age < 35”

• Click on OK
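The expression “relig = 1 & age < 35” keeps a case only when both conditions are true. A minimal Python sketch of that filter (not SPSS syntax) is below. The cases are invented, and treating a missing age (`None`) as failing the condition is an assumption.

```python
# Invented cases for illustration; relig = 1 is Protestant, None marks a missing age.
cases = [
    {"id": 1, "relig": 1, "age": 28},
    {"id": 2, "relig": 1, "age": 52},
    {"id": 3, "relig": 2, "age": 30},
    {"id": 4, "relig": 1, "age": None},
]

def is_selected(case):
    """True only when the case is Protestant AND has a valid age under 35."""
    return case["relig"] == 1 and case["age"] is not None and case["age"] < 35

selected = [case for case in cases if is_selected(case)]
print([case["id"] for case in selected])
```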

Exercises for Select If

• Select all males (1 on the variable SEX) and do a frequency distribution for the variable FEAR (afraid to walk alone at night in the neighborhood)

• Now select all females (2 on the variable SEX) and run a frequency distribution for FEAR

• Are males or females more fearful of walking alone at night?

More Exercises for Select If

• Now let’s select males under age 35 and run a frequency distribution for FEAR

• Do the same thing for females under 35

• Are males or females under 35 more fearful of walking alone at night?

Important Note on Using Select Cases

• When you are finished using “Select Cases” and want to revert to using all the cases, be sure to click on Data/Select Cases and select “All cases”. Then click on OK

• If you don’t do this, you will continue to use only those cases you last selected

Univariate Analysis

• Now that we know how to open existing files and transform variables, we’re ready to begin analyzing data

• Univariate analysis refers to analyzing variables one-at-a-time

Types of Univariate Analysis Procedures

(see ch. 4 in text)

• Frequencies

• Descriptives

• Explore

Frequencies

• Go to Analyze/Descriptive Statistics/Frequencies

• Select ABANY and AGE and click on OK
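A frequency distribution lists each value with its count and percentage. The Python sketch below (not SPSS syntax) shows the arithmetic, including the difference between a percent based on all cases and a “valid” percent based only on non-missing cases. The function name `frequency_table` and the answers are hypothetical, and `None` stands in for a missing value.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, count, percent of all cases, percent of valid cases)."""
    total = len(values)
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    return [(value, n, 100.0 * n / total, 100.0 * n / len(valid))
            for value, n in sorted(counts.items())]

# Hypothetical ABANY answers (1 = yes, 2 = no, None = missing).
abany = [1, 1, 2, None]
for value, n, pct, valid_pct in frequency_table(abany):
    print(f"{value}: n={n}  percent={pct:.1f}  valid percent={valid_pct:.1f}")
```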

Bar Charts

• Bar charts – click on Analyze/Descriptive Statistics/Frequencies

• Click on “Charts”

• Select “Bar Charts” and click on “Continue” and then on OK

• Do you think bar charts are appropriate for both ABANY and AGE?

Histograms

• Click on Analyze/Descriptive Statistics/Frequencies

• Click on “Charts”

• Select “Histograms” and click on “Continue” and then on OK

• Do you think histograms are appropriate for both ABANY and AGE?

• Which do you think is the most appropriate chart (bar chart or histogram) for ABANY and for AGE?

Statistics

• Click on Analyze/Descriptive Statistics/Frequencies

• Click on “Statistics”

• Select the statistics you want and click on

“Continue” and then on OK
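For reference, the most commonly requested statistics (mean, median, standard deviation, and quartiles) can be sketched with Python's standard `statistics` module (not SPSS syntax). The ages below are made up. `stdev` is the sample standard deviation (dividing by n - 1), and the percentile method Python uses by default is assumed here to be close to, but not necessarily identical to, the one SPSS applies.

```python
from statistics import mean, median, stdev, quantiles

# Made-up ages for illustration; not values from gss02a.sav.
ages = [20, 30, 40, 50, 60, 70, 80]

age_mean = mean(ages)
age_median = median(ages)
age_sd = stdev(ages)                  # sample standard deviation (n - 1 in the denominator)
age_quartiles = quantiles(ages, n=4)  # 25th, 50th, and 75th percentiles

print(age_mean, age_median, round(age_sd, 2), age_quartiles)
```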

Exercises for Frequencies

• There are seven variables dealing with abortion: ABANY, ABDEFECT, ABHLTH, ABNOMORE, ABPOOR, ABRAPE, and ABSINGLE

• Run a frequency distribution for each variable

• Get a bar chart for each variable

• Compare and contrast how people answered these seven questions

More Exercises for Frequencies

• Run the frequency distribution for AGE

• Get a histogram for AGE

• Compute the following statistics for AGE:

– Mean

– Median

– Standard deviation

– Percentiles – 25th, 50th, and 75th

Descriptives

• Click on Analyze/Descriptive Statistics/Descriptives

• Select AGE and EDUC

• Click on “Options” and select the statistics you want and then click on “Continue” and OK

Exercises for Descriptives

• Use Descriptives to compute the following statistics for AGE

– Mean

– Standard deviation

– Variance

– Skewness

– Kurtosis

More Exercises for Descriptives

• Use Descriptives to compute the mean for EDUC, MAEDUC, and PAEDUC

• Who has the most education – respondents or their parents?

• Who has the most education – mothers or fathers?

Explore

• Click on Analyze/Descriptive Statistics/Explore

• Select EDUC and put it in the “Dependent List”

• In the Display box on the lower left, click on “Both”

• Click on OK

Selecting Statistics for Explore

• Click on Analyze/Descriptive Statistics/Explore

• Click on “Statistics” and select the statistics you want

• Click on “Continue” and then OK

Selecting Plots for Explore

• Click on “Plots”

• Select the plots you want

• Click on “Continue” and then OK

Exercises for Explore

• Use Explore to get the following statistics and plots for the variables EDUC, PAEDUC, and MAEDUC

– Descriptives

– Outliers

– Stem-and-leaf plot

– Histogram

– Boxplot

• First select “Factor levels together” and run it

• Then select “Dependents together” and run it again

• What’s the difference?

Intermediate Workshop for SPSS

• In the next workshop we’ll look at different types of statistical analysis you can do in SPSS

– Cross tabulations (ch. 5)

– Comparing means (ch. 6)

– Correlation and regression (ch. 7)

– Multivariate analysis (ch. 8)

• Cross tabulations

• Multiple regression

– Presenting your data – charts and tables (ch. 9)
