ECO 725-01: Data Methods in Economics - Fall 2014
Chris Swann
446 Bryan Building
email: chris_swann@uncg.edu
Office Hours: Open Door or By Appt.
Course meeting time: TTh 9:30-10:45
Location: Bryan 211
GA: Qing Shi – office hours TBA
email: q_shi@uncg.edu
Description
Econ 725 is a three-credit course in which students learn to work with large data sets using the SAS
programming language. In this course we will explore how to manipulate data (including reading,
writing, and combining data files), how to prepare data for research purposes (including variable
construction, sample selection, and issues related to missing data), and how to conduct basic data
analysis. We will pay attention to data quality and how to deal with so-called “dirty” data.
Student Learning Outcomes
On completion of this course, students will have:
1) learned practical procedures for working with data;
2) learned the basics of the SAS programming language; and
3) conducted descriptive research with a large data set.
Procedures
ECO 725 will meet twice per week, on Tuesday and Thursday from 9:30 to 10:45, for the entire
semester. We will typically meet in Bryan 211, though some days we may meet in Bryan 456. The
school prohibits food and drink in the computer classrooms. Students are expected to follow the
classroom discussion and exercises and to refrain from other activities, such as web-surfing, emailing, and game-playing, during class.
Your grade will be determined by a series of homework assignments (40% of grade), a midterm
(20% of grade) and a final project (40% of grade). Please note that assignments must be turned in
when they are due. Late assignments will not receive any credit, unless prior arrangements have
been made with the instructor.
Software
The primary software package for this class is SAS. SAS is installed in the UNCG computer labs.
SAS licenses for personal computers are available for UNCG students through ITS. To begin the
license process, connect to https://web.uncg.edu/research-access/secure/sas/sas.asp. We will also
occasionally use Stata and Excel though no specific knowledge of Stata is required, and if you have
an alternate preferred software package (e.g., SPSS) you should be able to use that as well.
Strongly Recommended Books
Delwiche, Lora D. and Susan J. Slaughter. 2012. The Little SAS® Book: A Primer, Fifth Edition,
Cary, NC: SAS Institute Inc. (LSB below)
Cody, Ron. 2008. Cody’s Data Cleaning Techniques Using SAS®, Second Edition. Cary, NC: SAS
Institute Inc. (C below)
Additional Books
You could amass a significant library of books about SAS. I list below a couple that may be handy
for this class.
Cody, Ron. 2007. Learning SAS by Example: A Programmer’s Guide. Cary, NC: SAS Institute Inc.
Comment: This is similar to The Little SAS Book.
DiIorio, Frank C., 1991. SAS Applications Programming: A Gentle Introduction, Duxbury Press.
Comment: This is an old one but still good for the basics.
More documentation than you can imagine is available at
http://support.sas.com/documentation/93/index.html. This is the link for SAS 9.3, which I
believe is the version available in the labs and from ITS.
Additional Readings
These will be made available on Blackboard. They include, but are not necessarily limited to:
Burns, S. 2013. “When Data and Reality Don’t Match.” in Q. McCallum (ed.) Bad Data Handbook.
O’Reilly.
Cody, R. 2011. “Longitudinal Data Techniques: Looking Across Observations.” Paper 265-2011.
Christen, P. 2012. “Data Pre-Processing.” in Data Matching: Concepts and Techniques for Record
Linkage, Entity Resolution, and Duplicate Detection. Springer-Verlag.
Dasu, T. 2012. “Data Glitches: Monsters in Your Data.” in S. Sadiq (ed.) Handbook of Data
Quality. Springer-Verlag.
Harrington, T. “An Introduction to SAS PROC SQL.” Paper 70-27.
Herzog, T., F. Scheuren, and W. Winkler. 2007. “Basic Data Quality Tools.” in Data Quality and
Record Linkage Techniques. Springer-Verlag.
Herzog, T., F. Scheuren, and W. Winkler. 2007. “Automatic Editing and Imputation of Sample
Survey Data.” in Data Quality and Record Linkage Techniques. Springer-Verlag.
Kalt, M. and C. Zender. 2011. “Introduction to ODS Graphics for the Non-Statistician.” Paper 294-2011.
Li, A. 2013. “Essentials of the Program Data Vector: Directing the Aim to Understanding the Data
Step.” Paper 125-2013.
Li, A. 2011. “The Essence of Data Step Programming.” Paper 269-2011.
Pool, G. 2012. “Common Sense SAS – Documenting and Structuring Your Code.”
Ronk, K. “Introduction to Proc SQL.” Paper 268-29.
Schwabish, J. 2013. “Subtle Sources of Bias and Error.” in Q. McCallum (ed.) Bad Data Handbook.
O’Reilly.
Tian, S. 2009. “LAG - the Very Powerful and Easily Misused SAS® Function.” Paper 55-2009.
Vaisman, M. 2013. “The Dark Side of Data Science.” in Q. McCallum (ed.) Bad Data Handbook.
O’Reilly.
Zender, C. 2013. “Macro Basics for New SAS® Users.” Paper 120-2013.
SAS Certification
A number of levels of SAS Certification are available. To earn the SAS Certified Base
Programmer for SAS 9 credential, you must pass an exam that covers many of the areas of
programming that we will use. Information on Base Programmer certification is available at
http://support.sas.com/certify/creds/bp.html. Because of the overlap in coverage, you are
encouraged to consider studying for and taking this exam. Note, however, that this is not a test prep
class, and we will cover some topics in more detail than may be necessary for the exam while others
included on the exam may not be covered at all. If you are thinking of taking the certification exam,
you should consider the prep guide:
SAS Publishing, SAS® Certification Prep Guide: Base Programming for SAS 9, Second Edition,
Cary, NC: SAS Institute Inc., 2009.
Research Integrity
Students are expected to be familiar with and abide by the University’s Academic Integrity policy
(see http://academicintegrity.uncg.edu/). In particular, students may be expected to work
independently on homework assignments and are expected to work independently on the project.
Assistance will be available from the instructor and teaching assistant.
Tentative Outline
Date
Aug 19
Topic
Introduction to data analysis
Aug 21
Introduction to SAS
Aug 26
Before you begin: understanding your data
Reading data into SAS and basic SAS procedures (e.g., proc contents, proc print, proc means, proc univariate, proc freq)
(Numeric) Variable construction: What do you want to create and how do you do it?
Character and date variables
Making your job easier: Introduction to Macros
Data verification (numeric, character, and dates)
Output Delivery System
Graphing data
Aug 28
Sept 2
Sept 4
Sept 9
Sept 11
Sept 16
Sept 18
Sept 23
Sept 25
Sept 30
Oct 2
Oct 7
Oct 9
Oct 16
Oct 21
Oct 23
Oct 28
Oct 30
Nov 4
Nov 6
Nov 11
Nov 13
Nov 18
Nov 24
TBA
Putting it together: understanding and characterizing your data
Midterm
Missing data: why it exists, how to find it, and what to do
Mechanics of the data step: what is actually going on?
Debugging your programs
Combining data sets
Getting data out of SAS (e.g., text files, spreadsheets, Stata files)
Repeated observations and longitudinal data
Estimation in SAS: linear and logistic regression
More estimation in SAS: probit, selection models, ordered models, and panel data
Proc SQL
Reading
“Data Glitches: Monsters in Your Data”
“Subtle Sources of Bias and Error”
“The Dark Side of Data Science”
LSB: Chapter 1
“Common Sense SAS – Documenting and Structuring Your Code”
“When Data and Reality Don’t Match”
Assignment
Hand Out HW1
LSB: Chapters 2, 4, 9
LSB: Chapter 3
Collect HW1
Hand Out HW2
LSB: Chapter 7
Collect HW2
Hand Out HW3
C: Chapters 1, 2, 4
“Basic Data Quality Tools”
LSB: Chapter 5
LSB: Chapter 8
“Introduction to ODS Graphics for the Non-Statistician”
TBA
C: Chapter 3
“Automatic Editing and Imputation of Sample Survey Data”
Collect HW3
Hand Out Project
Hand Out HW4
“Essentials of the Program Data Vector”
LSB: Chapter 11
“Data Pre-Processing”
LSB: Chapter 6
C: Chapter 6
LSB: Chapter 10
Collect HW4
Hand Out HW5
“Longitudinal Data Techniques”
“The Essence of Data Step Programming”
C: Chapter 5
LSB: 9.10
TBA
Collect HW5
“An Introduction to SAS PROC SQL”
“Introduction to Proc SQL”
Summary/Catch-up
Final Project Due