Data Analysis: Analyzing, Visualizing & Understanding Data

Year 2: Terms I & II (1.0 credit units)
Overview
This module aims to build skills in multivariate regression analysis using a
variety of modelling techniques: linear, limited dependent variable, panel,
time-series and longitudinal models. Students will become proficient users of
statistical software and be able to identify, analyze and interpret regression
output as well as present data visually. The module teaches skills that
students can apply across a range of jobs in the public, private and third
sectors.
Emphasis is placed on using real-world data, ‘hands-on’ lab sessions,
analysis, interpretation and visualisation.
Term 1
1. Review of linear regression
• Theory and practice of simple linear regression and how to interpret the
output.
• The components of a simple linear regression model.
2. R – your key to the world!
• The key features of the R statistical programming environment.
• The benefits of scripting your analysis.
3. Multiple regression 1
• Theory and implementation of an extension to the simple linear model by
adding multiple explanatory variables.
• The basic assumptions underlying the multiple linear regression model
and common problems such as collinearity, outliers/leverage and correlated
residuals.
4. Multiple regression 2
• Extending the multiple regression model further by including
explanatory dummy variables for nominal/ordinal categories.
• The lecture also covers interaction effects, where the effect of one term
is modified according to the level of another in the model.
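As a small illustration of the interaction idea above, the sketch below uses a linear predictor with one continuous variable x, one dummy variable d and their product. The coefficient values and the function names predict and slope_of_x are hypothetical, chosen only for this example, and Python is used purely for illustration.

```python
# Hypothetical coefficients for the model y = b0 + b1*x + b2*d + b3*(x*d).
# None of these values come from the module; they only illustrate the idea.

def predict(x, d, b0=1.0, b1=2.0, b2=0.5, b3=1.5):
    """Linear predictor with a dummy variable d and an x-by-d interaction."""
    return b0 + b1 * x + b2 * d + b3 * x * d


def slope_of_x(d, b1=2.0, b3=1.5):
    """Effect of a one-unit increase in x, which depends on the level of d."""
    return b1 + b3 * d


for d in (0, 1):
    print(f"d={d}: slope of x = {slope_of_x(d)}")
```

With these assumed values, the slope of x is b1 when d = 0 and b1 + b3 when d = 1, which is what it means for one term to be modified by the level of another.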
5. Multiple regression 3
• Heteroscedasticity
• Feasible generalised least squares (FGLS)
6. Why spatial data are special
• The features of spatial data.
• Examining the importance of location in data analysis
7. Data on the web – Panning for gold in the 21st century
• The possibilities and pitfalls of datastores
• APIs and web scraping
8. Data management (SQL, database queries)
• Processing and storing large datasets.
• Database basics.
9. Data visualization best practice
• What are the features of a good graphic?
• The power of maps
10. Sharing insights
• Communicating with a wider audience
• Explaining uncertainty
Term 2
11. Limited dependent variable models (1)
• The most important statistical theory underlying a broad range of models
• Maximum likelihood estimation
12. Limited dependent variable models (2)
• Logit/probit: Models for binary dependent variables
• Models explaining individual choices
13. Simulating uncertainty
• Creating uncertainty measures for predictions
• Simulation-based solutions to complex uncertainty problems
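One common way to create an uncertainty measure for a prediction by simulation is to draw many plausible coefficient values, compute the prediction for each draw, and report percentiles of the results. The sketch below assumes a deliberately simplified logit model with a single coefficient and no intercept. The estimate, its standard error and the function name simulate_prediction are hypothetical, and this is not necessarily the approach taught in the module.

```python
import math
import random


def simulate_prediction(beta_hat, se, x, n_draws=5000, seed=42):
    """Simulate a 95% interval for a predicted probability.

    Assumes a one-coefficient logit model, p = 1 / (1 + exp(-beta * x)),
    with beta drawn from a normal distribution centred on the estimate.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    draws = sorted(
        1.0 / (1.0 + math.exp(-rng.gauss(beta_hat, se) * x))
        for _ in range(n_draws)
    )
    lower = draws[int(0.025 * n_draws)]       # approx. 2.5th percentile
    median = draws[n_draws // 2]              # approx. 50th percentile
    upper = draws[int(0.975 * n_draws) - 1]   # approx. 97.5th percentile
    return lower, median, upper


# Hypothetical estimate and standard error, for illustration only.
lower, median, upper = simulate_prediction(beta_hat=0.8, se=0.2, x=1.0)
print(f"predicted probability: {median:.2f} (95% interval {lower:.2f} to {upper:.2f})")
```

The same pattern extends to models with several coefficients by drawing the whole coefficient vector at once, which requires the estimated covariance matrix rather than a single standard error.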
14. Limited dependent variable models (3)
• Choice models when an individual faces several options (example:
party choice)
• Multinomial logit
15. Repetition week
16. Multilevel modeling (1)
• Models accounting for complex data structure
• Partial pooling and the power of random effects
17. Multilevel modeling (2)
• Varying slopes
• Non-nested models
• Multilevel, generalised linear models
18. Longitudinal & panel data methods (1)
• Assumptions and violations
• Correcting standard errors, controlling for autocorrelation
19. Longitudinal & panel data methods (2)
• Fixed effects vs random effects
• Dynamic models with lagged dependent variables
20. Revision