Forms to Spreadsheets
A-Team Spring Brown Bags
February 7, 2014

Jennifer Lowman
Coordinator, Student Persistence Research
University of Nevada, Reno

Outline
• Excel Basics
• Form Preparation
• Naming Variables
• Coding Data
• Entering Data
• Cleaning Data
• Analyzing Data

Excel Basics
• Workbooks & Worksheets
• 3 Parts to a Spreadsheet
  – Rows (numbered)
  – Columns (alphabetized)
  – Cells (combo, H7)

Excel Basics
• Rows: Cases
• Columns: Variables
• Cells: Data

Form Preparation
• Case IDs
• Variable Names
• Data Codes

Case IDs & Variable Names are Unique Identifiers
It is the combination of the two that makes your data meaningful.

Case IDs
• Best Practices for Case IDs
  – Unique, Meaningful, Confidential, & Stable
  – Put the Case ID on every form
• Major considerations
  – How many times are you collecting data from each person?
  – Do you need institutional data?
  – Do you need to protect confidentiality?
    • Employees v. Participants (mandatory v. voluntary)
  – Are any of the data sensitive?

Rules of Thumb
• Once, with no need for institutional data
  – Sequential numbers, with random start (randomize forms before numbering)
• More than once, no need for institutional data?
  – Use something meaningful to respondent
  – Sample size may challenge uniqueness
• Need institutional data?
  – Use something meaningful to you

Trade-Offs
• Meaningful for participant
  – Easy to remember (stable)
  – Might not be confidential
  – May or may not link to institutional data
• Meaningful to you
  – Not easy to remember (not stable)
  – Promotes confidentiality
  – May need a key, risks to confidentiality
  – Promotes link to institutional data

Put the Case ID on Everything
• Every Form, Every page
• Double Check
• Back Track
• Multiple Coders

Naming Variables
• Best Practices for Variable Names
  – Unique & Meaningful Abbreviations
  – Short: standard 8 characters
    • Excel can handle more, but your column size will increase
  – Start with letter, not #
• Mnemonic strategy vs. Question Number
  – workhrs vs. Question1 (q001)
  – Mnemonic: one-time projects, with one person handling data
  – Question Numbers: repeated or large projects, multiple people handling the data

I use both
• Meaningful Abbreviations
  – Less meaningful… question1 or q001
  – More meaningful… q1reshall
• Avoid generic, be specific
  – What can you expect to find for q1reshall?
    • Names of residence halls (Nye, Lincoln, White Pine…)
    • Codes for residence halls (1 = Nye, 2 = Lincoln…)
    • Lives in a residence hall (0 = no, 1 = yes)

Meaningfulness is tied to your coding!!
• 1’s and 0’s
    0 = no, does not have characteristic
    1 = yes, has the characteristic
• sex vs. female
  – what does a “0” mean?
  – what does a “1” mean?
• race vs. white?

Coding Data
• Categorical Data
  – Two Categories, use 0’s and 1’s, variable name should be your reference group
  – Three or more categories
    • Nominal - No meaningful numerical difference between categories
      “dummy code,” instead of one variable “race,” make five variables
        0 = not Asian, 1 = Asian
        0 = not Black, 1 = Black
        0 = not Hispanic, 1 = Hispanic
        0 = not Native American, 1 = Native American
        0 = not White, 1 = White

Coding Data (cont.)
• Categorical Data
  – Three or more categories with a meaningful, numerical difference between categories
  – Academic Level

    Level                         1–7   10–70   0–100   Years
    Freshman                      1     10      0       13
    Sophomore                     2     20      16      14
    Junior                        3     30      33      15
    Senior                        4     40      50      16.4
    Second Degree                 5     50      66      17
    Masters Student               6     60      83      18
    PhD or Professional Medical   7     70      100     22

Coding Data
• Many types of coding you do when you create your survey
  – How committed are you to Nevada?
    • 1 = not committed at all … 7 = Extremely committed
  – Even if it is not perfect, enter that information in your spreadsheet
    • Then RECODE it into a NEW VARIABLE
• Never throw information out
• Always have a system to check your codes
  – Enter “Race” Then Recode (Dummy Code)

Entering Data
• Enter it exactly
• Recode anything that can be “quantified” into new variables
• Missing Data
  – Leave it Blank
  – If you must, use an extreme number, something way out of range (-999)

Qualitative Data
• Content Analysis (Implicit Quantification)
• Identify themes, categories, patterns
• Start Broad
• Get multiple perspectives
• Narrow it down to a manageable number of themes
• Count

Enough Talk
Let’s Do It
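
For anyone who wants to try the "enter it exactly, then recode into a new variable" workflow outside of Excel, here is a minimal Python sketch. It is an illustration under stated assumptions, not the presenter's method: the sample rows, the Case IDs, and the variable names (asian, black, hispanic, natamer, white, commit, commit_r) are all hypothetical. It dummy codes "race" into five 0/1 variables, treats a blank cell or -999 as missing, and keeps the originally entered values untouched.

```python
import csv
import io

MISSING = -999  # the out-of-range missing-data code mentioned in the slides

# Hypothetical data as entered from the forms: one row per case, one column per variable.
RAW = """caseid,race,commit
1042,Asian,7
1043,White,-999
1044,,3
"""

# Hypothetical new variable names (8 characters or fewer, starting with a letter),
# each mapped to the category exactly as it was entered.
RACE_DUMMIES = {
    "asian": "Asian",
    "black": "Black",
    "hispanic": "Hispanic",
    "natamer": "Native American",
    "white": "White",
}


def is_missing(value):
    """Treat a blank cell or the -999 code as missing."""
    text = value.strip()
    return text == "" or text == str(MISSING)


def recode(row):
    """Return a copy of the row with recoded variables added; entered values are kept."""
    out = dict(row)
    for name, category in RACE_DUMMIES.items():
        if is_missing(row["race"]):
            out[name] = ""  # stay blank rather than guess
        else:
            out[name] = 1 if row["race"] == category else 0
    # Recode the 1-7 commitment item into a new numeric variable, blank if missing.
    out["commit_r"] = "" if is_missing(row["commit"]) else int(row["commit"])
    return out


rows = [recode(r) for r in csv.DictReader(io.StringIO(RAW))]
for r in rows:
    print(r)
```

Because the recoded values go into new columns, the entered "race" and "commit" values remain available for checking the codes later.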