Forms to Spreadsheets - University of Nevada, Reno

Forms to Spreadsheets
A-Team Spring Brown Bags
February 7, 2014
Jennifer Lowman
Coordinator, Student Persistence Research
University of Nevada, Reno
Outline
• Excel Basics
• Form Preparation
• Naming Variables
• Coding Data
• Entering Data
• Cleaning Data
• Analyzing Data
Excel Basics
• Workbooks & Worksheets
• 3 Parts to a Spreadsheet
– Rows (numbered)
– Columns (lettered)
– Cells (column letter plus row number, e.g., H7)
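A cell address is simply a column letter joined to a row number. As a minimal sketch of that idea (the function name `parse_cell` is hypothetical and not part of Excel), the address can be split back into its two parts:

```python
import re

def parse_cell(address):
    """Split a cell address such as 'H7' into (column_number, row_number)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", address)
    if match is None:
        raise ValueError(f"not a cell address: {address!r}")
    letters, digits = match.groups()
    column = 0
    # Column letters work like a base-26 number: A=1 ... Z=26, AA=27, ...
    for letter in letters.upper():
        column = column * 26 + (ord(letter) - ord("A") + 1)
    return column, int(digits)

print(parse_cell("H7"))  # column H is the 8th column, row 7
```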
Excel Basics
• Rows → Cases
• Columns → Variables
• Cells → Data
Form Preparation
• Case IDs
• Variable Names
• Data Codes
Case IDs & Variable Names are Unique Identifiers
It is the combination of the two that makes your
data meaningful.
Case IDs
• Best Practices for Case IDs
– Unique, Meaningful, Confidential, & Stable
– Put the Case ID on every form
• Major considerations
– How many times are you collecting data from each
person?
– Do you need institutional data?
– Do you need to protect confidentiality?
• Employees v. Participants (mandatory v. voluntary)
– Are any of the data sensitive?
Rules of Thumb
• Once, with no need for institutional data
– Sequential numbers, with random start
(randomize forms before numbering)
• More than once, with no need for institutional
data
– Use something meaningful to respondent
– Sample size may challenge uniqueness
• Need institutional data?
– Use something meaningful to you
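The first rule of thumb (shuffle the forms, then number them sequentially from a random start) can be sketched as below. This is an illustration only: the function name `assign_case_ids` and the start range of 1000 to 9000 are assumptions, not part of the presentation.

```python
import random

def assign_case_ids(forms, rng=None, low=1000, high=9000):
    """Shuffle the forms, then number them sequentially from a random start."""
    if rng is None:
        rng = random.Random()
    shuffled = list(forms)      # copy, so the caller's list is left alone
    rng.shuffle(shuffled)       # randomize the forms before numbering
    start = rng.randint(low, high)  # random starting number (assumed range)
    return {start + offset: form for offset, form in enumerate(shuffled)}

case_ids = assign_case_ids(["form A", "form B", "form C"])
for case_id, form in sorted(case_ids.items()):
    print(case_id, form)
```

Because the order is shuffled first, a case ID says nothing about when or in what order a form was collected.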
Trade-Offs
• Meaningful for participant
– Easy to remember (stable)
– Might not be confidential
– May or may not link to institutional data
• Meaningful to you
– Not easy to remember (not stable)
– Promotes confidentiality
– May need a key, which is itself a risk to confidentiality
– Promotes link to institutional data
Put the Case ID on Everything
• Every Form, Every page
• Double Check
• Back Track
• Multiple Coders
Naming Variables
• Best Practices for Variable Names
– Unique & Meaningful Abbreviations
– Short: the standard is 8 characters
• Excel can handle more, but your column width will increase
– Start with letter, not #
• Mnemonic strategy vs. Question Number
– workhrs vs. Question1 (q001)
– Mnemonic: one-time projects with one person handling
the data
– Question Numbers: repeated or large projects with multiple
people handling the data
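The naming rules above (unique, short, starting with a letter) are easy to check mechanically. The following is a sketch under assumptions: the function name `check_variable_names` is hypothetical, and the 8-character limit is treated as an adjustable default since Excel itself accepts longer names.

```python
def check_variable_names(names, max_length=8):
    """Return a list of (name, problem) pairs; an empty list means all names pass."""
    problems = []
    seen = set()
    for name in names:
        if not name or not name[0].isalpha():
            problems.append((name, "does not start with a letter"))
        if len(name) > max_length:
            problems.append((name, f"longer than {max_length} characters"))
        key = name.lower()  # treat WorkHrs and workhrs as the same name
        if key in seen:
            problems.append((name, "duplicate name"))
        seen.add(key)
    return problems

for name, problem in check_variable_names(["workhrs", "q001", "1stjob", "WorkHrs"]):
    print(name, "->", problem)
```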
I use both 
• Meaningful Abbreviations
– Less meaningful… question1 or q001
– More meaningful… q1reshall
• Avoid generic, be specific
– What can you expect to find for q1reshall?
• Names of residence halls (Nye, Lincoln, White Pine…)
• Codes for residence halls (1 = Nye, 2 = Lincoln…)
• Lives in a residence hall (0 = no, 1 = yes)
Meaningfulness is tied to your coding!!
• 1’s and 0’s
0 = no, does not have characteristic
1 = yes, has the characteristic
• sex vs. female
– what does a “0” mean?
– what does a “1” mean?
• race vs. white?
Coding Data
• Categorical Data
– Two Categories, use 0’s and 1’s; name the variable
after the group coded 1 (female, not sex)
– Three or more categories
• Nominal - No meaningful numerical difference between
categories
“Dummy code”: instead of one variable “race,” make five
variables
0 = not Asian, 1 = Asian
0 = not Black, 1 = Black
0 = not Hispanic, 1 = Hispanic
0 = not Native American, 1 = Native American
0 = not White, 1 = White
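Dummy coding as listed above turns one "race" entry into five 0/1 variables. A minimal sketch, assuming the entered value matches one of the five category labels exactly (the function name `dummy_code` and the choice to leave blanks as `None` are illustrative assumptions):

```python
RACE_CATEGORIES = ["Asian", "Black", "Hispanic", "Native American", "White"]

def dummy_code(value, categories=RACE_CATEGORIES):
    """Return one 0/1 variable per category for a single entered value."""
    if value is None or value.strip() == "":
        # Missing stays missing on every dummy variable.
        return {category: None for category in categories}
    if value not in categories:
        raise ValueError(f"unexpected category: {value!r}")
    return {category: int(value == category) for category in categories}

print(dummy_code("Hispanic"))
```

Raising an error on an unexpected label, rather than silently coding it as all zeros, doubles as a check on the codes.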
Coding Data (cont.)
• Categorical Data
– Three or more categories with a meaningful,
numerical difference between categories
– Academic Level (four possible numeric codings)
Level                          1-7    10-70    0-100    Years
Freshman                       1      10       0        13
Sophomore                      2      20       16       14
Junior                         3      30       33       15
Senior                         4      40       50       16.4
Second Degree                  5      50       66       17
Masters Student                6      60       83       18
PhD or Professional Medical    7      70       100      22
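One way to apply the academic-level codes is a lookup that returns both the 1 to 7 rank and the years value for a level. This sketch assumes the years values listed on the slide pair with the levels in order, and the variable names `acadlvl` and `educyrs` are hypothetical:

```python
ACADEMIC_LEVELS = [
    # (label, rank code, years value from the slide)
    ("Freshman", 1, 13),
    ("Sophomore", 2, 14),
    ("Junior", 3, 15),
    ("Senior", 4, 16.4),
    ("Second Degree", 5, 17),
    ("Masters Student", 6, 18),
    ("PhD or Professional Medical", 7, 22),
]

def code_academic_level(label):
    """Return both numeric codings for one academic-level label."""
    for name, rank, years in ACADEMIC_LEVELS:
        if name == label:
            return {"acadlvl": rank, "educyrs": years}
    raise ValueError(f"unknown academic level: {label!r}")

print(code_academic_level("Senior"))
```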
Coding Data
• Much of your coding is done when you create
your survey
– How committed are you to Nevada?
• 1 = not committed at all … 7 = Extremely committed
– Even if it is not perfect, enter that information in
your spreadsheet
• Then RECODE it into a NEW VARIABLE
• Never throw information out
• Always have a system to check your codes
– Enter “Race” Then Recode (Dummy Code)
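Recoding into a new variable, rather than overwriting the original, might look like the sketch below. The helper `add_recode`, the variable names `commit` and `commithi`, and the cut point of 5 on the 7-point scale are all hypothetical, chosen only to illustrate the principle.

```python
def add_recode(row, source, target, rule):
    """Return a copy of row with a NEW variable; the original value is kept."""
    if target in row:
        raise KeyError(f"{target!r} already exists; pick a new variable name")
    recoded = dict(row)  # copy, so the entered data is never changed
    value = row[source]
    recoded[target] = None if value is None else rule(value)
    return recoded

row = {"caseid": 4217, "commit": 6}
new_row = add_recode(row, "commit", "commithi", lambda v: int(v >= 5))
print(new_row)
```

Keeping `commit` alongside `commithi` means the recode can always be checked against, or redone from, the information as entered.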
Entering Data
• Enter it exactly
• Recode anything that can be “quantified” into
new variables
• Missing Data
– Leave it Blank
– If you must, use an extreme number, something
way out of range (-999)
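Whichever convention is used, missing values have to be set aside before any arithmetic, or an out-of-range code such as -999 will distort the result. A minimal sketch (the function names are hypothetical):

```python
MISSING_CODE = -999

def drop_missing(values):
    """Keep only real data: skip blanks, None, and the missing-data code."""
    return [v for v in values if v is not None and v != "" and v != MISSING_CODE]

def mean_of_valid(values):
    """Average the non-missing values; return None if nothing is left."""
    valid = drop_missing(values)
    return sum(valid) / len(valid) if valid else None

answers = [6, "", -999, 4, None]
print(mean_of_valid(answers))  # averages only 6 and 4
```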
Qualitative Data
• Content Analysis (Implicit Quantification)
• Identify themes, categories, patterns
• Start Broad
• Get multiple perspectives
• Narrow it down to a manageable number of
themes
• Count
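Once each open-ended response has been tagged with themes, the counting step can be sketched as follows. The theme labels are made up for illustration, and counting each theme at most once per response is an assumption about how the tally should work.

```python
from collections import Counter

def count_themes(coded_responses):
    """Count how many responses mention each theme (once per response)."""
    counts = Counter()
    for themes in coded_responses:
        counts.update(set(themes))  # set() ignores repeats within one response
    return counts

coded = [["cost", "housing"], ["cost"], ["housing", "cost", "cost"], []]
for theme, n in count_themes(coded).most_common():
    print(theme, n)
```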
Enough Talk
Let’s Do It