~Drowning in Data~
SPSS Data Analysis
3/26/12
Sumiko Takayanagi, Ph.D.
Sr. Statistician, UCLA School of Nursing

Today’s Presentation
  SPSS Environment
  Review of SPSS Basics
  Inferential Statistics in SPSS
    Independent t-test
    Two-Way Analysis of Variance
    Multiple Regression
  Conclusion
  References

Features of SPSS
  Originally developed for people in the social sciences, so no heavy programming background is required
  Designed to be user friendly, with pull-down menus to execute statistical commands
  Supports data management and manipulation
  Can store programs and produce reports and graphs

SPSS Program Flow
  Data Steps: Outside Data Source / Raw Data -> SPSS Data File -> Data Modification/Transformation
  Analysis Steps: Data Analysis, run from the Pull-Down Menu OR the Syntax Menu

Data View Window: Data Entry Site (Columns=Variables, Rows=Cases)
  Screenshot callouts: Help Menu, Pull-down Menu bar, Tool bar, Information bar, Title bar, Variable Names, Data View window, Active cell, Action bar

Variable View Window: Data Definition Site
  Screenshot callouts:
  64 characters max, no spaces; begin with a letter, @, #, or $
  Numeric, String, & Others
  Length
  # of Decimals
  Variable Description
  Value Code Description
  Missing Value Description
  Click here to see this view

Before we see Examples… OK vs. Paste buttons
  1. OK: the results/action will be executed <Output File>

  1. Hit Paste to obtain the Syntax Window <Syntax File>
  2. Run the syntax to obtain the results in the Output Window

Example: School Data (Raw Data)
  Subject #   Gender       Program         Reading   Math
  1           Female (1)   Intensive (1)   90        67
  2           Female (1)   Moderate (2)    72        46
  3           Male (0)     Basic (3)       41        73

School Data: Variable View
  Variable View Activated

School Data: Completed Dataset, Data View

School Data: Completed Dataset, Variable View

Importing an Excel Data File to SPSS
  1. Open the SPSS Data file
  2. Go to the File menu
  3. Click “Read Text Data”
  4. Set Files of type to Excel & choose the Excel file
  5. Hit Open
  6.
     Check the worksheet number and the option to read variable names from the first row, then hit OK

School Data: Completed Dataset, Data View

Click to Obtain Data File Information

Variable Information

Value Code Information

Basic Statistical Methods
  Independent t-test
  Two-Way ANOVA
  Multiple Regression

Independent t-test: Is there a significant difference between 2 groups?
  Assumptions
  1. Normality
  2. Variance Equality
  3. Independence

  Variable      # of Variables   Characteristics         School Data (N=100)
  Dependent     1                Continuous              Math Score (range 0-100)
  Independent   1                Categorical, 2 levels   Gender

How to calculate the t-value?
  t-value = Mean Difference / Group Variability

t-test
  Figure panels: Medium Variability, High Variability, Low Variability

Independent t-test
  1. Go to Analyze.
  2. Choose Compare Means.
  3. Choose Independent Samples t Test.

t-test
  1. Choose Dependent & Independent Variables.

Descriptives & Analysis
  Output callouts: Independent Variable, Dependent Variable, Variance Equality Test, t-statistic
  t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{SD_1^2/N_1 + SD_2^2/N_2}} = \frac{63.20 - 54.10}{\sqrt{(13.914)^2/41 + (13.064)^2/59}}
  t = Mean Diff / Std. Error Diff = 9.093 / 2.760 = 3.295

Conclusion & Chart
  There is a significant difference in math ability between males and females.
  Males performed better than females.

Factorial ANOVA: Are there any main effects or interaction effects?
  Assumptions
  1. Normality
  2. Variance Equality
  3. Independence

  Variable      # of Variables   Characteristics                 School Data (N=100)
  Dependent     1                Continuous                      Math Score (0-100)
  Independent   >1               Categorical, 2 or more levels   Gender, Program Type

2 x 3 Factorial ANOVA Design Diagram (Math Test Scores)
  Program      Male                   Female
  Mild         56, 86, 70, 69, …..    55, 72, 67, 48, …..
  Moderate     86, 59, 67, 80, …..    63, 78, 55, 46, …..
  Intensive    89, 92, 86, 71, …..    72, 76, 54, 56, …..

2-Way Factorial ANOVA
  1. Go to General Linear Model & choose Univariate.
  2. Choose One Dependent & Two Independent Variables.

Factorial ANOVA (2x3) Descriptives
  1. Freq of IV and Raw Means
  2.
     Equality of Variance Test

Factorial ANOVA Main Analysis
  Main Effects & Interaction
  Results:
  Main effect: significant difference in gender and in program type.
  Interaction: significant interaction between gender and program type.

Factorial ANOVA Multiple Comparison
  Which levels are actually different?
  Scheffe & LSD Methods
  Output callout: significantly different levels

Factorial ANOVA Conclusion
  Significant Effects
  Males performed better than females.
  Students in the Intensive program performed better than those in the Mild program.
  Males in the Intensive program performed better than males in the other programs, but there was no performance difference across programs among females.

Multiple Regression: Which IVs can predict the DV, and how large are the effects of these variables on the DV?
  Assumptions
  1. Normality
  2. Variance Equality
  3. Independence
  4. Linear Relationship

  Variable      # of Variables   Characteristics                       Health Survey Data (N=100)
  Dependent     1                Continuous                            LDL Value (0-200)
  Independent   >1               Continuous or Dichotomous (0 or 1)    HT, WT, BMI, & Exercise

Multiple Regression Diagram
  IVs: HT, WT, BMI, Exercise -> DV: LDL
  All 4 IVs are predicting LDL

Health Survey Data of N=100

Multiple Regression
  1. Choose Regression
  2. Choose Linear Regression

  1. Choose DV, IV, & Method.
  2. Choose the Statistics you need.
  3. Choose Residual Plots.

Descriptives & Correlation Tables
  Descriptive Stats.
  Correlation Coefficients & corresponding p-values.

Main Analysis
  R2 = how much of the variability in the outcome is accounted for by the predictors (regression sum of squares / total sum of squares)
  Adj. R Sq = R2 adjusted for the # of parameters in the model
  R = r between the predicted and observed values of the DV
  Global test to see if any coefficient is different from 0
  B = regression coefficient
  t & Sig = IV predictability
  Partial/Part Correlations
  Tolerance & VIF
  Beta = standardized regression coefficient. Treat |Beta| > 1 as a warning sign that something is wrong (for example, strong multicollinearity)!
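
The Main Analysis quantities can be sketched in a few lines of Python. This is only an illustration of the standard textbook formulas, not SPSS's own code. The function names are made up for this sketch. N=100, the 4 IVs, and the R2 of about 0.40 come from the slides; the 0.5 used for tolerance and VIF is a hypothetical value.

```python
def r_squared(ss_regression, ss_total):
    """R2: share of outcome variability accounted for by the predictors."""
    return ss_regression / ss_total


def adjusted_r_squared(r2, n_cases, n_predictors):
    """Adj. R Sq: R2 penalized for the number of predictors in the model."""
    return 1 - (1 - r2) * (n_cases - 1) / (n_cases - n_predictors - 1)


def tolerance(r2_j):
    """Tolerance of predictor j, where r2_j is the R2 obtained by
    regressing predictor j on the other predictors."""
    return 1 - r2_j


def vif(r2_j):
    """Variance inflation factor: the reciprocal of tolerance."""
    return 1 / tolerance(r2_j)


# N=100 cases, 4 IVs, and R2 of about 0.40 are taken from the slides.
adj = adjusted_r_squared(0.40, n_cases=100, n_predictors=4)
print(f"Adj. R Sq = {adj:.3f}")  # 0.375

# Hypothetical value, NOT from the slides: half of one IV's variance
# is explained by the other IVs.
print(f"Tolerance = {tolerance(0.5):.2f}, VIF = {vif(0.5):.2f}")
```

With four predictors and 100 cases the adjustment is small, which is why R2 and Adj. R Sq usually sit close together in a model of this size.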
Residual Analysis
  Residual Normality
  Linearity, Equal Variance, & Residual Independence

Conclusion: Multiple Regression
  The IVs explain about 40% of the variability in LDL level.
  The significant predictors of LDL were BMI and hours of exercise.
  The collinearity statistics did not show exceptionally large multicollinearity among the predictors.
  The assumptions of residual normality and equal variance were met.

Key Concepts
  Statistical models depend on theory and data. Choose your model wisely so that it can answer your research questions.
  Check assumptions. Model conclusions may not be valid unless the assumptions are met. If they are not, apply appropriate corrections, transform the data, or use other statistical methods.

Conclusions
  Statistical judgments come into our daily lives.
  Statistics is more than mathematical calculation or scientific research; it is a way of logical thinking…
  Thank you
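
As a closing worked check, the t-statistic from the Descriptives & Analysis slide can be recomputed from the group means, standard deviations, and sample sizes printed there. The sketch below assumes the separate-variance form of the standard error, as the slide's formula appears to use. The function name is made up for this illustration. Because the inputs are rounded (the slide reports a mean difference of 9.093 rather than 9.10), the result agrees with the slide's 2.760 and 3.295 only approximately.

```python
import math


def separate_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, standard error) using each group's own variance."""
    std_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / std_error, std_error


# Group means, SDs, and Ns as printed on the Descriptives & Analysis slide.
t_value, std_error = separate_variance_t(63.20, 13.914, 41, 54.10, 13.064, 59)
print(f"Std. Error Diff = {std_error:.2f}, t = {t_value:.2f}")
```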