Getting Started with IBM SPSS Stats 20
Presented by: Rosey Zackula
Center for Biostatistical Collaboration
Office of Research
Tuesday, October 23, 2012

THE RESEARCH PROCESS
(1) Turn idea(s) into research question(s)
(2) Review the literature
(3) Design study and develop method(s)
(4) Write proposal
(5) Address funding issues
(6) Obtain approval
(7) Collect data
(8) Analyze and interpret data
(9) Evaluate implications
(10) Report and disseminate findings
Source: http://www.rdinfo.org.uk/flowchart/Flowchart.html

COURSE OBJECTIVES
I. Become familiar with the SPSS system
II. Establish methods for reporting results
III. Explore steps for analyzing a dataset
IV. Demonstrate the analysis process*
V. Hands-on exercise*
* All examples are from Windish and Diener-West, 2005

I. WHAT IS IBM SPSS STATS 20?
• Comprehensive system for analyzing all types of data
• Suite of software tools
• Primary purpose
  – Generate tabulated reports
  – Produce charts
  – Plot distributions and trends
  – Conduct descriptive statistics
  – Perform complex statistical analyses

IBM SPSS System
• 16+ Analysis Tools
  – Statistics Base
  – Advanced Statistics
  – Bootstrapping
  – Categories
  – Complex Samples
  – Conjoint
  – Custom Tables
  – Data Preparation
  – Decision Trees
  – Direct Marketing
  – Exact Tests
  – Forecasting
  – Missing Values
  – Neural Networks
  – Regression
  – Amos
  – and SamplePower

Manuals for SPSS

SPSS Help and Support
• Help features in SPSS
• SPSS Manuals
  – S:\SPSS Manuals\SPSS 20
• IBM Corp. Web site
  – http://www.ibm.com/support
• USENET discussion group
  – comp.soft-sys.stat.spss
• Center for Biostatistical Collaboration

II. ESTABLISH METHODS FOR REPORTING RESULTS
• What style is required?
  – Check the target journal for the manuscript
  – Example: JGIM
• Modify SPSS output
  – Edit > Options
    • Output labeling
    • Adjust fonts/formats
  – Tables and charts
    • Copy special formats (for copying into manuscript: Word, Excel, etc.)

III. STEPS TO ANALYZE DATA
Step 1: Set up
Step 2: Inspect
Step 3: Clean
Step 4: Describe
Step 5: Analyze

Step 1: Set up data
• Enter manually
• Import
  – Database
    • Excel, Access, REDCap, Text
  – Read text
• Copy and paste (not advised)

Data considerations
• SPSS Types
  – Numeric, comma, dot, scientific notation, date, dollar, custom currency, string, restricted numeric (non-negative integer with leading zeros, e.g., 000001)
• Values
  – Example: categorical variables
    • 0 = No; 1 = Yes
• Measures
  – Nominal, Ordinal, Scale (data that can be multiplied with meaningful results)

Step 2: Inspect data
• Role
  – Input, Target, Both, None, Partition, Split
• Defining variable properties
  – Data attributes
  – Custom tables
    • Multiple response: define variable sets (surveys)
• Utilities
  – Variable information
  – Data file comments
  – Define/use variable sets

Step 3: Clean data
• Data preparation
  – Validation
    • Identify missing information, unusual cases, etc.
  – Restructure data
• Transform (recoding)
  – Visual Binning
  – Create new variables
    • Compute, recode, replace missing values
  – Remember to define any new variable properties and attributes

Step 4: Describe data
Summarizing categorical data
  – Tables
    • Dichotomous: proportions
    • Nominal: relative frequencies (percent of total)
    • Ordered: median (interquartile range)
  – Graphs
    • Bar and pie charts
      – Percentages vs. counts

Step 4: Describe data (cont’d)
Summarizing continuous (scale) data
  – Tables
    • Parametric or normal distribution: mean (standard deviation)
      – Test of normality
        » Kolmogorov-Smirnov “vodka test”
          • Nonparametric test: 1-Sample K-S
        » K-S Lilliefors (by group)
          • Explore: Plots
    • Nonparametric: median (interquartile range)
  – Graphs
    • Histograms, line charts, scatter plots

Step 5: Analyze the data
• What is the study design?
• What is (are) the research question(s)?
  – Hypothesis to be tested
• What is the outcome?
  – What shape/type of distribution?
• Are any groups being compared?
• What are the variables of interest?
  – What shape/type of distribution?
  – Predictors or confounders?

IV. DEMONSTRATION
• Study design: randomized clinical trial to evaluate 1-month curriculum
• Compare performance of two unpaired groups (intervention and control)
• Outcome: composite number of maneuvers performed correctly
• Hypothesis 1
  – H0: Participants and controls do not differ in mean number of correct maneuvers

V. YOUR TURN
• Open SPSS
  – File > Open Database > New Query…
• Database Wizard: Excel Files > Next
• Browse…
  – S:\Everyone\Intro_IBM SPSS_Oct 23
  – Intro IBM SPSS_Test data_Windish article.xlsx
  – Retrieve all Fields > Next > Next
  – GENDER: Recode to Numeric
    » Minimize string widths > Finish

Your turn (cont’d)
• Establish output options per author instructions from target journal
  – Edit > Options
• Define Variables
  – Data > Define Variable Properties
  – Variable View
    • Type, Width, Decimals, Labels, Values, etc.
• Choose hypothesis to test
  – See article Windish and Diener-West, 2005
    Hypothesis 2, 3 or 4 (data are not the same; therefore, results may differ)

Statistical Results

THE GRANDMA CLAUSE
You do not really understand something unless you can explain it to your grandmother.
Albert Einstein

QUESTIONS???

Reference
• A Clinician-Educator’s Roadmap to Choosing and Interpreting Statistical Tests. Windish and Diener-West, 2005
  – http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1924630/
  – Important downloads from the article:
    • Table 1: Questions to Consider When Selecting the Appropriate Statistical Test
    • Appendix A: Diagrammatic Approach to Choosing Summary Measures, Statistical Tests and Methods
    • Appendix B: Glossary of terms
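
Worked sketch (outside SPSS): Steps 4 and 5
As a rough companion to Step 4 (describe) and the Hypothesis 1 demonstration (two unpaired groups, mean number of correct maneuvers), the Python sketch below shows the arithmetic behind the summaries and the test statistic. It is an illustration only, not the procedure used in the session: the scores are invented, the function names are hypothetical, and the choice of Welch's unequal-variance t statistic is an assumption of this sketch. The p-value is left to SPSS, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, median, quantiles, stdev, variance

# Invented example scores (NOT the Windish data): number of maneuvers
# performed correctly by each participant in two unpaired groups.
intervention = [8, 9, 7, 10, 9, 8]
control = [6, 7, 5, 8, 6, 7]


def describe_scale(values):
    """Step 4, scale data assumed roughly normal: mean and sample SD."""
    return mean(values), stdev(values)


def describe_ordinal(values):
    """Step 4, ordered or non-normal data: median and (Q1, Q3)."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    return median(values), (q1, q3)


def welch_t(a, b):
    """Step 5: Welch's t statistic and approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean, group a
    vb = variance(b) / len(b)  # squared standard error of mean, group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


for label, scores in (("intervention", intervention), ("control", control)):
    m, sd = describe_scale(scores)
    med, iqr = describe_ordinal(scores)
    print(f"{label}: mean (SD) = {m:.2f} ({sd:.2f}); median (IQR) = {med} {iqr}")

t_stat, dof = welch_t(intervention, control)
print(f"Welch t = {t_stat:.3f} on about {dof:.1f} degrees of freedom")
```

A large absolute t relative to its degrees of freedom is evidence against H0 (no difference in means). If the normality checks in Step 4 fail, the median (interquartile range) summary and a nonparametric test would be the more defensible choice.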