API-208: Stata Review Session Daniel Yew Mao Lim Harvard University Spring 2013

Roadmap
• Getting Started
• Importing Data
• Data Management
• Data Analysis
• Programming
Getting Started: Orientation
• REVIEW WINDOW: past commands appear here
• VARIABLES WINDOW: variable list shown here
• COMMAND WINDOW: commands typed here
• RESULTS WINDOW: results and commands displayed here
Getting Started: Syntax
Getting Started: Syntax Example
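A minimal sketch of the general command syntax as documented in Stata's help files. It uses the auto dataset that ships with Stata, which is an illustrative choice and not necessarily the data used in the session.

```stata
* General pattern (square brackets mark optional parts):
*   [by varlist:] command [varlist] [=exp] [if exp] [in range] [weight] [, options]

* Load an example dataset that ships with Stata
sysuse auto, clear

* command + varlist
summarize price

* command + varlist + if qualifier
summarize price if foreign == 1

* command + varlist + in qualifier (observations 1 to 10)
summarize price in 1/10

* command + varlist + option
summarize price, detail

* by prefix: repeat the command for each group
bysort foreign: summarize price mpg
```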
Getting Started: Useful Commands I
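• if: restrict a command to observations that satisfy a condition.
• by: repeat a command separately for each group of observations.
• help: display the documentation for a command.
• in: restrict a command to a range of observation numbers.
• sum: abbreviation of summarize (descriptive statistics).
• ssc install: install a user-written package from the SSC archive.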
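The commands above might be tried out as follows. This is a sketch on the auto dataset shipped with Stata; the dataset and the package name in the last line are illustrative assumptions.

```stata
sysuse auto, clear

* help: open the documentation for a command
help summarize

* sum (abbreviation of summarize) with if and in qualifiers
sum price if mpg > 25
sum price in 1/20

* by: repeat a command within groups (data must be sorted by the group variable)
sort foreign
by foreign: sum price

* ssc install: download a user-written package from the SSC archive
* (needs internet access, so it is left commented out here)
* ssc install psmatch2
```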
Getting Started: Useful Commands II
Arithmetic Operators
• “+” addition
• “-” subtraction
• “*” multiplication
• “/” division
• “^” power
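A short sketch of the arithmetic operators, first with display as a calculator and then inside generate. The auto dataset and the new variable names (weight_kg, gp100m) are illustrative assumptions.

```stata
* display works as a calculator
display 7 + 2
display 7 - 2
display 7 * 2
display 7 / 2
display 7 ^ 2

sysuse auto, clear

* weight is recorded in pounds; convert to kilograms
generate weight_kg = weight * 0.4536

* gallons needed per 100 miles, from miles per gallon
generate gp100m = 100 / mpg

summarize weight weight_kg mpg gp100m
```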
Getting Started: Useful Commands III
Relational Operators
• “>” Greater than
• “<” Less than
• “>=” Greater than or equal to
• “<=” Less than or equal to
• “==” Equal to
• “~=” Not equal to
• “!=” Not equal to
Getting Started: Useful Commands IV
Logical (Boolean) Operators
• “&” = and
– Example: A & B (true only when both A and B are true)
• “|” = or
– Example: A | B (true when A, B, or both are true)
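The relational and logical operators are mostly used inside if qualifiers. A minimal sketch on the auto dataset (an illustrative choice) follows. Note that a single “=” assigns a value, while “==” tests equality, and that Stata treats missing values as larger than any number.

```stata
sysuse auto, clear

* Relational operators
count if price > 10000
count if foreign == 1
count if foreign != 1
count if foreign ~= 1

* & : both conditions must hold
count if foreign == 1 & mpg >= 25

* | : at least one condition must hold
count if foreign == 1 | mpg >= 25

* Missing values count as larger than any number,
* so the first line below also counts cars with missing rep78
count if rep78 > 3
count if rep78 > 3 & !missing(rep78)
```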
Getting Started: Example
Getting Started: Worked Example
• Average share of ADB loans during first and second years on UNSC, between 1985 and 2004
• Average share of ADB loans during first and second years on UNSC, between 1985 and 2004, for each country
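One way the two questions above could be answered is sketched below. The session's dataset is not reproduced here, so the toy data and the variable names (country, year, unscyear, adbshare) are hypothetical; unscyear is assumed to hold 1 or 2 for a country's first or second year on the UNSC and 0 otherwise.

```stata
* Hypothetical toy data: names and values are made up for illustration
clear
input str3 country year unscyear adbshare
"AAA" 1990 1 4
"AAA" 1991 2 6
"AAA" 1992 0 2
"BBB" 1986 1 1
"BBB" 1987 2 3
"BBB" 2006 1 9
end

* Average over first and second UNSC years, 1985 to 2004
summarize adbshare if (unscyear == 1 | unscyear == 2) & year >= 1985 & year <= 2004

* Same average, separately for each country
bysort country: summarize adbshare if (unscyear == 1 | unscyear == 2) & year >= 1985 & year <= 2004
```

The only difference between the two commands is the bysort prefix, which repeats the same summarize within each country.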
Getting Started: Creating Do-files
• A text file containing all commands relevant to an analysis
• Useful for batch processing
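A minimal do-file skeleton might look like the sketch below. The file names (analysis.do, analysis.log) are hypothetical, and the version number assumes Stata 12; adjust it to the installed release.

```stata
* analysis.do (hypothetical file name)

* Interpret the commands below as Stata 12 would
version 12

* Start from a clean state and do not pause long output
clear all
set more off

* Record everything in a plain-text log file
capture log close
log using "analysis.log", replace text

* Analysis commands go here
sysuse auto, clear
summarize price mpg

log close
```

Saved as analysis.do, the whole file can then be run in one step by typing do analysis.do in the Command window.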
Getting Started: Commenting in Do-files
• * : ignore everything written on this line
• /* Text Here */ : ignore everything written in between
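The two comment styles above can be sketched in a do-file as follows. The // and /// forms are additions not listed on the slide; they are standard Stata but work only in do-files, not in the Command window.

```stata
* A line starting with an asterisk is ignored
sysuse auto, clear

/* Everything between the opening and closing markers
   is ignored, even across several lines */
summarize price

// Two slashes also start a comment
summarize mpg   // including at the end of a command line

* Three slashes continue a long command onto the next line
summarize price mpg weight ///
    if foreign == 1
```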
Importing Data: Data Types
• Stata data (.dta)
• Excel spreadsheets (.xls)
• Comma-separated values (.csv)
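A sketch of reading each file type. To stay self-contained it first writes three small files (demo.dta, demo.xls, demo.csv; hypothetical names) into the working directory. It assumes Stata 13 or later for import delimited and export delimited; on Stata 12 the equivalents are insheet and outsheet.

```stata
* Create small example files to import
sysuse auto, clear
keep make price mpg
save "demo.dta", replace
export excel using "demo.xls", firstrow(variables) replace
export delimited using "demo.csv", replace

* Stata data
use "demo.dta", clear

* Excel: treat the first row as variable names
import excel using "demo.xls", firstrow clear

* Comma-separated values
import delimited using "demo.csv", clear

describe
```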
Data Management: Data Structure
• Cross-sectional
• Time-series
• Panel
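The three structures can be illustrated with one toy panel, sketched below. The data and variable names (id, year, gdp) are hypothetical.

```stata
* Hypothetical toy panel: 2 units observed over 3 years
clear
input id year gdp
1 2000 10
1 2001 11
1 2002 12
2 2000 20
2 2001 22
2 2002 24
end

* Panel: declare the unit and time variables
xtset id year
xtdescribe

* Cross-section: many units in a single period
list if year == 2000

* Time series: a single unit over many periods
keep if id == 1
tsset year
generate gdp_lag = L.gdp
list
```

Declaring the structure with xtset or tsset is what makes time-series operators such as L. (lag) available.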
Data Management: Datasets
• merge: add variables across datasets.
• append: add observations across datasets.
• reshape: convert data from wide to long or from long to wide.
• rename: change the name of a variable.
• drop: eliminate variables or observations.
• keep: keep variables or observations.
• sort: arrange observations in ascending order.
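The commands above can be chained as in the sketch below. All data and names are hypothetical toy values. Because it uses temporary files stored in local macros, it should be run as one do-file rather than line by line.

```stata
* Wide income data for ids 1 to 3 (hypothetical)
clear
input id income2000 income2001
1 10 12
2 20 21
3 30 33
end
tempfile incomes
save `incomes'

* One extra observation to append later
clear
input id str5 region
4 "West"
end
tempfile extra
save `extra'

* Master data: region for ids 1 to 3
clear
input id str5 region
1 "North"
2 "South"
3 "East"
end

* append: add the observation for id 4
append using `extra'

* merge: add the income variables, matching on id
merge 1:1 id using `incomes'

* keep / drop: retain matched ids only, then remove the merge indicator
keep if _merge == 3
drop _merge

* reshape: one row per id (wide) becomes one row per id and year (long)
reshape long income, i(id) j(year)

rename income inc
sort id year
list
```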
Data Management: Missing Data
• Recode
• List-wise deletion
• Multiple imputation
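A rough sketch of the three approaches on the auto dataset, where rep78 has some missing values. The -99 code is created here purely for illustration, and the imputation model is a syntactic example only, not a recommendation for these data.

```stata
sysuse auto, clear

* Inspect missingness
misstable summarize
count if missing(rep78)

* Recode: turn a numeric missing-data code into Stata's missing value (.)
generate rep78_coded = rep78
replace rep78_coded = -99 if missing(rep78)
mvdecode rep78_coded, mv(-99)
drop rep78_coded

* List-wise deletion: estimation commands drop any observation
* with a missing value in one of the model variables
regress price mpg rep78

* Multiple imputation: impute rep78 five times, then combine estimates
mi set mlong
mi register imputed rep78
mi impute regress rep78 price mpg, add(5) rseed(12345)
mi estimate: regress price mpg rep78
```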
Data Management: Outliers
• Impossible values
• Extreme values
• Logarithmic function
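The checks above might be sketched as follows on the auto dataset (an illustrative choice). The 99th-percentile cutoff and the variable name lnprice are assumptions for the example.

```stata
sysuse auto, clear

* Impossible values: check that variables lie in their valid range
assert price > 0
count if mpg <= 0

* Extreme values: inspect percentiles, then list observations above the 99th
summarize price, detail
list make price if price > r(p99) & !missing(price)

* Logarithmic function: compress a right-skewed variable
generate lnprice = ln(price)
summarize price lnprice
```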
Data Management: Modifying Data
• generate: create a new variable.
• replace: replace old values.
• recode: change values by conditions.
• label define: defines value labels (or “dictionary”).
• label values: attaches value labels (or “dictionary”) to a variable.
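A sketch of the five commands working together on the auto dataset. The new names (price_k, rep3, rep3lbl) and the grouping of repair records are illustrative assumptions.

```stata
sysuse auto, clear

* generate: price in thousands of dollars
generate price_k = price / 1000

* replace: round the new variable to one decimal place
replace price_k = round(price_k, 0.1)

* recode: collapse the 5-point repair record into 3 groups, stored in a new variable
recode rep78 (1 2 = 1) (3 = 2) (4 5 = 3), generate(rep3)

* label define: create the dictionary
label define rep3lbl 1 "poor" 2 "average" 3 "good"

* label values: attach the dictionary to the variable
label values rep3 rep3lbl

tabulate rep3
```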
Data Analysis: Exploring Data
• summarize: descriptive statistics.
• codebook: display contents of variables.
• describe: display properties of variables.
• count: counts cases.
• list: show values.
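Each of these commands can be tried on the auto dataset, as in the sketch below; the choice of dataset and variables is illustrative.

```stata
sysuse auto, clear

* Properties of the variables: storage type, format, labels
describe

* Contents of a variable: range, unique values, missing values
codebook rep78

* Descriptive statistics
summarize price mpg weight

* Number of cases meeting a condition
count if foreign == 1

* Show values for the first five observations
list make price mpg in 1/5
```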
Data Analysis: Analyzing Data
• tabstat: tables with statistics.
• tabulate: one- or two-way frequency tables (related: tab1 and tab2).
• table: calculates and displays tables of statistics.
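A sketch of the three table commands on the auto dataset (an illustrative choice). The table command was redesigned in Stata 17, so the example runs it under version control with the older contents() syntax that was current at the time of this session.

```stata
sysuse auto, clear

* tabstat: chosen statistics, by group
tabstat price mpg, by(foreign) statistics(mean sd n)

* tabulate: one-way and two-way frequency tables
tabulate foreign
tabulate foreign rep78, row

* tab1: a one-way table for each listed variable
tab1 foreign rep78

* table: Stata 16 and earlier syntax, run under version control
* (Stata 17 and later would use: table foreign, statistic(mean price))
version 12: table foreign, contents(mean price n price)
```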
Data Analysis: Worked Example
Exercise 1: Create an aidsize variable with three
categories based on the amount of ADB loans received
(adbconstant): small (0 to 99), medium (100 to 999),
and large (1000 or more). Include labels.
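One possible solution to Exercise 1 is sketched below. The toy values of adbconstant are made up so the example runs on its own; with the session's data, only the commands from generate onward would be needed. The cutoffs use "< 100" and "< 1000" so that non-integer amounts such as 99.5 are also classified, which is an assumption about how the categories should extend.

```stata
* Hypothetical toy values for adbconstant
clear
input adbconstant
0
50
99
100
500
999
1000
2500
end

* Build the categories one condition at a time
generate aidsize = .
replace aidsize = 1 if adbconstant >= 0 & adbconstant < 100
replace aidsize = 2 if adbconstant >= 100 & adbconstant < 1000
replace aidsize = 3 if adbconstant >= 1000 & !missing(adbconstant)

* Define the dictionary and attach it
label define aidsizelbl 1 "small" 2 "medium" 3 "large"
label values aidsize aidsizelbl

tabulate aidsize
```

The !missing() check matters because Stata treats missing values as larger than any number, so without it missing loan amounts would be labeled "large".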
Data Analysis: MLE
• regress: standard OLS.
• probit / logit: binary dependent variable.
• oprobit: ordered probit regression.
• ologit: ordered logistic regression.
• xtreg: fixed-, between-, and random-effects, and population-averaged linear models.
• xtregar: fixed- and random-effects models with AR(1) disturbance.
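The estimation commands share the same basic form: command, dependent variable, then regressors. A sketch follows, using the auto dataset for the cross-sectional models and a small simulated panel for the xt commands. The model specifications, seed, and simulated variables are illustrative assumptions with no substantive meaning.

```stata
sysuse auto, clear

* Continuous outcome
regress price mpg weight

* Binary outcome
logit foreign mpg weight
probit foreign mpg weight

* Ordered outcome
ologit rep78 mpg weight
oprobit rep78 mpg weight

* Simulated panel: 20 units observed for 10 periods each
clear
set seed 12345
set obs 200
generate id = ceil(_n / 10)
bysort id: generate year = _n
generate x = rnormal()
generate y = 1 + 0.5 * x + rnormal()
xtset id year

* Fixed and random effects
xtreg y x, fe
xtreg y x, re

* Fixed effects with AR(1) disturbance
xtregar y x, fe
```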
Data Analysis: Matching
• psmatch2: propensity score matching.
• cem: coarsened exact matching.
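Both commands are user-written and must be installed before first use. The sketch below assumes internet access and that both packages are available from the SSC archive. Treating foreign as the "treatment" in the auto dataset is purely a syntactic illustration, not a meaningful causal design.

```stata
* Install once (requires internet access)
ssc install psmatch2, replace
ssc install cem, replace

sysuse auto, clear

* psmatch2: treatment indicator first, then covariates for the propensity score
psmatch2 foreign mpg weight, outcome(price) logit

* cem: coarsen the covariates and match exactly on the coarsened values
cem mpg weight, treatment(foreign)

* Use the weights created by cem in the outcome model
regress price foreign [iweight = cem_weights]
```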
Data Analysis: Interpreting Coefficients
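A brief sketch of how coefficients from different models might be read, using the auto dataset as an illustrative example. In OLS a coefficient is the change in the outcome for a one-unit change in the regressor, holding the others fixed; in logit it is a change in log-odds, which is usually easier to interpret as an odds ratio or a marginal effect.

```stata
sysuse auto, clear

* OLS: coefficients are in the units of the dependent variable
regress price mpg weight

* Logit: coefficients are changes in log-odds
logit foreign mpg weight

* Redisplay the same model as odds ratios
logit, or

* Average marginal effects on the predicted probability
margins, dydx(*)
```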
Programming
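A minimal sketch of two common programming tools in Stata, local macros and loops, on the auto dataset. The macro name (controls) and the variables chosen are illustrative assumptions. Because local macros exist only while a do-file runs, the block should be run as a whole.

```stata
sysuse auto, clear

* Local macro: store a list of variables once and reuse it
local controls mpg weight
regress price `controls'

* foreach: repeat a block for each variable in a list
foreach var of varlist price mpg weight {
    quietly summarize `var'
    display "`var': mean = " r(mean)
}

* forvalues: repeat a block over a range of numbers
forvalues i = 1/3 {
    display `i' ^ 2
}
```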
Conclusion
• Pattern recognition
• Self-learning
• Programming
Q&A
Thank you!