SPSS Intro and Analysis
Hein Stigum
Presentation, data and programs at: http://folk.uio.no/heins/

Analysis with SPSS
• SPSS introduction
  – Files and menus
  – Syntax
• Analysis
  – Continuous data
    • Symmetrical
    • Skewed
  – Categorical data
29.05.2016 H.S.

Files
• Data: .sav, opened in the Data Editor
• Syntax: .sps, opened in the Syntax Editor
• Output: .spo, opened in the Viewer + Chart Editor
Menus, toolbars and the status bar vary with the file/editor.

Data Editor
• Variable view
  – Each variable: name, type, label, value labels
• Data view
  – Each case: values
• Save a master file, work on a workfile

Syntax Editor
• Syntax
  – Commands end with a "."
  – Comments start with "*"

Ways of working
• Use menus to run commands
• Use menus, paste commands, run
• Write commands, run
• Your main product: "The Syntax File"!

Viewer
• Contains all output
  – Show/hide or delete elements
  – Double-click to edit an element
  – Double-click on a chart to start the Chart Editor

Select and Filter
Do analysis on "old people":
• Method 1, select (permanently drops the other cases from the working file)
    Select if (age>50).
• Method 2, filter (temporary; Filter off brings all cases back)
    Compute ff=(age>50).
    Filter by ff.
    …
    Filter off.

Recode and label
• Cut age into 3 groups
    recode age (missing=sysmis) (lowest thru 29=1) (30 thru 39=2) (40 thru highest=3) into ageGr3.
• Add labels
    variable label ageGr3 'Age in 3 groups'.
    value label ageGr3 1 '29 years or younger' 2 '30-39 years' 3 '40 years or older'.
• Cut age into equal sized groups
    Rank age /ntiles(3) into ageGr3.
    Examine age by ageGr3 /plot=none.

Compute and If
    Compute ageSqr=age**2.

    * Two steps with If, or one step with a logical expression.
    If (age<=50) old=0.
    If (age>50) old=1.
    Compute old=(age>50).

    * The same for a combined condition.
    Comp oldMale=0.
    If (age>50 and sex=1) oldMale=1.
    Compute oldMale=(age>50 and sex=1).

    * Case number as id.
    Compute id=$casenum.

Missing
• System missing
  – Empty values are marked "." and called sysmis
• User missing
  – Set to missing: missing values age (999).
  – Set to value: missing values age ().
• Selection
  – Remove all missing: select if (not missing(age)).
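
Why declaring a user-missing code matters can be illustrated outside SPSS. The following is a minimal Python sketch, not part of the presentation: the ages are made up, 999 is assumed to be the code for "not answered", and the helper name valid_values is hypothetical. It only mimics the idea behind declaring 999 as missing for age.

```python
from statistics import mean

# Hypothetical ages; 999 is assumed to be the code for "not answered",
# and None plays the role of SPSS system missing (".").
ages = [23, 35, 999, 41, None, 29, 999, 52]
USER_MISSING = {999}


def valid_values(values, user_missing):
    """Keep only values that are neither system missing nor user missing."""
    return [v for v in values if v is not None and v not in user_missing]


# Without declaring 999 as missing, it is counted as a real age.
naive = [v for v in ages if v is not None]
# With 999 declared as user missing, it is left out of the statistics.
clean = valid_values(ages, USER_MISSING)

print(f"999 counted as an age: N={len(naive)}, mean={mean(naive):.1f}, max={max(naive)}")
print(f"999 declared missing:  N={len(clean)}, mean={mean(clean):.1f}, max={max(clean)}")
```

A maximum equal to the missing code in a descriptive table is a typical sign that the code has not been declared as missing.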
Options
• Show variable names
  – Edit, options, general, show names
• Show label values
  – Edit, options, output labels, Values and Labels

Analysis

Datatypes
• Categorical data
  – Nominal: married/single/divorced
  – Ordinal: small/medium/large
• Numerical data
  – Discrete: number of children
  – Continuous: weight

Data type dictates type of analysis
• Numerical data, normal: means, T-test, linear regression
• Numerical data, not normal: medians, non-parametric tests
• Categorical data: frequency table, crosstable, chi-square, logistic regression

Continuous symmetrical data

Check for normality
    graph /histogram(normal)=debut.
    pplot debut /type=Q-Q /dist=normal.
[Figure: histogram of "Age of 1. intercourse" with normal curve; Mean = 18.1, Std. Dev = 3.52, N = 406]
[Figure: Normal Q-Q Plot of "Age of 1. intercourse"; axes: Observed Value, Expected Normal Value; shows deviations from normal]

Describe continuous data
What is the distribution and the mean of weight?
• Distribution
    graph /histogram=weight.
• Describe
    descriptives weight.

Descriptive Statistics
                                                          N      Minimum   Maximum   Mean    Std. Deviation
What did you weigh the last time you weighed yourself?    6986   0         999       60.35   15.794
Valid N (listwise)                                        6986

Compare groups, equal variance?
• Equal
• Not equal
[Figure: two panels, "Equal" and "Not equal"]

Compare means
Do boys and girls have the same average weight?
• T-test
  – Analyze, Compare means, Independent-Samples T-test
Does weight vary with social group? (3 or more groups)
• Anova
  – Analyze, Compare means, One-Way ANOVA
  – Options, homogeneity of variance test

Test situations
• 1 sample test
  – Weight = 10
• 2 independent samples
  – Weight by sex
• K independent samples
  – Weight by age groups
• 2 dependent samples (paired)
  – Weight last year = Weight today

Continuous skewed data

Partners
Percentiles:
    25%             2 partners
    50% (median)    5 partners
    75%             10 partners
    90%             20 partners
[Figure: distribution of number of lifetime partners (axis 0 to 50), with the mean, the median and the 25th, 50th, 75th and 90th percentiles marked]

Describe skewed data
• Medians and percentiles
  – Analyze, Descriptive, Statistics=descriptives and percentiles, Plots=Box

Compare skewed distributions
Do boys and girls have the same height?
• 2 independent samples
  – Analyze, Compare means, Means, height by sex, Options=medians
  – Analyze, Non-parametric, 2 independent Samples, height by sex(1 2)
• K independent samples
  – Analyze, Non-parametric, K independent samples

Categorical data

Describe and compare categorical data
Do boys and girls have the same educational plans?
• Frequency tables
  – Analyze, Descriptives, Frequencies
• Crosstables
  – Analyze, Descriptives, Crosstabs, Row=plans, Column=sex, Stat=chi, Cells=column
Syntax:
    freq plans.
    cross plans by sex /cells=col /stat=chi.

Table of descriptives
                             Numerical data                            Proportions
                             Normal                 Skewed
Descriptives
  Center                     Mean                   Median             p
  Dispersion                 Standard deviation     Fractiles
Confidence intervals for center estimates
  Standard error             se(mean)                                  se(p)
  95% Confidence interval    mean ± 2*se(mean)                         p ± 2*se(p)

Table of tests
                             Numerical data                                            Proportions
                             Normal                       Skewed
1 sample                     One sample T-test            Wilcoxon signed rank test    Binomial
2 independent samples        Independent sample T-test    Mann-Whitney U               Chi-square
K independent samples        ANOVA                        Kruskal-Wallis               Chi-square
2 dependent samples          Paired sample T-test         Wilcoxon signed rank test    McNemar (2x2)
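
The 95% confidence interval rule from the table of descriptives, center ± 2*se, can be worked through on a small example. This is a minimal Python sketch under stated assumptions: the slides name se(mean) and se(p) but do not spell them out, so the usual textbook formulas are assumed here, se(mean) = sd/sqrt(n) and se(p) = sqrt(p*(1-p)/n). The data and the function names ci_mean and ci_proportion are hypothetical.

```python
import math
from statistics import mean, stdev


def ci_mean(values):
    """Return mean, se(mean) and the approximate 95% CI mean ± 2*se(mean)."""
    m = mean(values)
    # Assumed formula: sample standard deviation divided by sqrt(n).
    se = stdev(values) / math.sqrt(len(values))
    return m, se, (m - 2 * se, m + 2 * se)


def ci_proportion(successes, n):
    """Return p, se(p) and the approximate 95% CI p ± 2*se(p)."""
    p = successes / n
    # Assumed formula: sqrt(p*(1-p)/n).
    se = math.sqrt(p * (1 - p) / n)
    return p, se, (p - 2 * se, p + 2 * se)


# Hypothetical weights (kg) and a hypothetical 40 out of 100 "yes" answers.
weights = [58, 62, 71, 65, 54, 69, 60, 66]
m, se_m, (lo, hi) = ci_mean(weights)
p, se_p, (p_lo, p_hi) = ci_proportion(40, 100)

print(f"mean = {m:.2f}, se(mean) = {se_m:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
print(f"p = {p:.2f}, se(p) = {se_p:.3f}, 95% CI = ({p_lo:.3f}, {p_hi:.3f})")
```

The factor 2 is the rounded value of 1.96 used on the slide; for small samples a t-based multiplier would give a somewhat wider interval.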