Uploaded by Fredrik.larsson108

Bullet points

Hausman Test: "Assuming that time-invariant characteristics are unique to each entity (every entity is distinct), the entity error term and the constant should not be correlated with each other → to be assessed using the Hausman test." In practice, the Hausman test compares the fixed- and random-effects estimates: a significant difference means the random-effects assumption fails and fixed effects should be preferred.
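As a minimal sketch (the coefficient values below are invented for illustration, not taken from any dataset), the Hausman statistic can be computed directly from the fixed- and random-effects estimates and their covariance matrices:

```python
import numpy as np
from scipy import stats

def hausman(b_fe, b_re, cov_fe, cov_re):
    """Hausman test. H0: random effects is consistent.
    H = (b_FE - b_RE)' (V_FE - V_RE)^{-1} (b_FE - b_RE) ~ chi2(k)."""
    diff = b_fe - b_re
    v = cov_fe - cov_re
    stat = float(diff @ np.linalg.inv(v) @ diff)
    pval = stats.chi2.sf(stat, df=len(diff))
    return stat, pval

# Toy numbers standing in for estimates from FE and RE regressions.
b_fe = np.array([0.52, -0.11])
b_re = np.array([0.48, -0.09])
cov_fe = np.array([[0.010, 0.001], [0.001, 0.008]])
cov_re = np.array([[0.006, 0.000], [0.000, 0.005]])
stat, pval = hausman(b_fe, b_re, cov_fe, cov_re)
# A large statistic (small p-value) rejects random effects in favor of fixed effects.
```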
Moderation with Continuous Variables - the Need for Standardization:
Statistical Concepts and Techniques:
• Explore confounding factors and control variables (watch a YouTube video on this).
• Understand the linear regression assumptions (refer to a YouTube video or commit them to memory).
• Differentiate between parametric tests (e.g., Sobel) and non-parametric tests (e.g., bootstrapping).
• Examine moderating effects, such as X1:X2 interactions.
• Clarify the meaning of "parametric test."
• Grasp the power of a test, 2×2 matrices, etc. (Seminar 3 Q7).
• Delve into fixed effects, the bias-variance trade-off, accuracy, sensitivity, specificity, decision trees, and Gini impurity.
Machine Learning Insights:
• Acknowledge that the number of categories corresponds to the number of clusters.
Decomposing Time Series:
• Understand the two decomposition methods: additive (a linear model, for series whose variance does not change) and multiplicative (for series whose variance changes with the level of the series).
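The distinction above can be sketched on a synthetic monthly series (everything here is made-up illustration data; the trend estimate is a crude centered moving average):

```python
import numpy as np

# Synthetic monthly series: linear trend + seasonality, constant variance.
rng = np.random.default_rng(0)
t = np.arange(48)
season = 5 * np.sin(2 * np.pi * t / 12)
series = 0.5 * t + 10 + season + rng.normal(0, 0.5, t.size)

# Trend estimate: 12-month moving average (edges are distorted).
kernel = np.ones(12) / 12
trend = np.convolve(series, kernel, mode="same")

# Additive decomposition subtracts the trend; multiplicative divides by it.
detrended_add = series - trend   # use when variance is stable over time
detrended_mul = series / trend   # use when variance grows with the level
```

If statsmodels is available, `seasonal_decompose(series, model="additive")` or `model="multiplicative"` performs the same decomposition with seasonal components extracted as well.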
Assumptions in Statistical Techniques:
• Note that moderation follows the same principles as ordinary OLS, but standardization (centering and dividing by the standard deviation) is crucial.
• For mediation, follow OLS principles and ensure the indirect effect is normally distributed (important for the Sobel test, less so for the bootstrap).
• Treat time series uniformly.
• Acknowledge cross-sectional independence and autocorrelation in panel data.
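The moderation point can be sketched in a few lines; the variables and data below are invented for illustration, and the key step is standardizing before forming the interaction term:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(50, 10, n)    # raw predictors on arbitrary scales
x2 = rng.normal(200, 40, n)
y = 2 + 0.3 * x1 + 0.1 * x2 + 0.002 * x1 * x2 + rng.normal(0, 1, n)

# Standardize BEFORE forming the interaction (moderation) term.
z1 = (x1 - x1.mean()) / x1.std()
z2 = (x2 - x2.mean()) / x2.std()
X = np.column_stack([np.ones(n), z1, z2, z1 * z2])  # last column = interaction

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta[3] is the moderation effect on the standardized scale; without
# centering, the main effects and the interaction would be hard to interpret.
```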
Machine Learning Categories:
• Differentiate between unsupervised and supervised learning.
Panel Data:
• Recognize it as a linear model + time series with specific additions.
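One of those "specific additions" is the fixed-effects (within) estimator, which demeans each entity's data so that unobserved entity effects drop out. A sketch on an invented synthetic panel:

```python
import numpy as np

rng = np.random.default_rng(2)
n_entities, n_periods = 20, 10
entity = np.repeat(np.arange(n_entities), n_periods)
alpha = rng.normal(0, 3, n_entities)                 # unobserved entity effects
x = rng.normal(0, 1, entity.size) + alpha[entity]    # x correlated with alpha
y = 1.5 * x + alpha[entity] + rng.normal(0, 1, entity.size)

def demean_by(values, groups):
    """Within transformation: subtract each entity's own mean."""
    means = np.bincount(groups, weights=values) / np.bincount(groups)
    return values - means[groups]

x_w, y_w = demean_by(x, entity), demean_by(y, entity)
beta_fe = (x_w @ y_w) / (x_w @ x_w)   # within (fixed-effects) slope estimate
# Pooled OLS on the raw data would be biased here because x is
# correlated with the entity effects; the within estimator is not.
```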
Sobel Test and Moderating Effects:
• Familiarize yourself with the equations and the inclusion of fixed effects.
OLS Assumption and Cross-Sectional Dependence:
• Acknowledge the need for the residuals to have mean 0 in the OLS equation.
• Be cautious about cross-sectional dependence, particularly with long time series and serial correlation in the idiosyncratic errors.
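The residual property can be verified numerically on synthetic data: when the model includes an intercept, OLS residuals average to zero and are orthogonal to the regressors by construction.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(0, 1, 100)
y = 3 + 2 * x + rng.normal(0, 1, 100)

X = np.column_stack([np.ones(100), x])      # include an intercept column
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# With an intercept, the residuals sum to (numerically) zero and are
# uncorrelated with x, regardless of whether the model is "true".
```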
Machine Learning Algorithms:
• Understand the Sobel test and the bootstrap in the context of mediation analysis.
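Both approaches can be sketched on invented data: the Sobel test treats the indirect effect a·b as normally distributed, while the bootstrap makes no such assumption and takes a percentile interval over resamples.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 300
x = rng.normal(0, 1, n)
m = 0.5 * x + rng.normal(0, 1, n)           # mediator
y = 0.4 * m + 0.2 * x + rng.normal(0, 1, n)

def paths(x, m, y):
    """a: x -> m; b: m -> y controlling for x; plus their standard errors."""
    Xa = np.column_stack([np.ones(len(x)), x])
    a, *_ = np.linalg.lstsq(Xa, m, rcond=None)
    ra = m - Xa @ a
    se_a = np.sqrt(ra @ ra / (len(x) - 2) * np.linalg.inv(Xa.T @ Xa)[1, 1])
    Xb = np.column_stack([np.ones(len(x)), x, m])
    b, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    rb = y - Xb @ b
    se_b = np.sqrt(rb @ rb / (len(x) - 3) * np.linalg.inv(Xb.T @ Xb)[2, 2])
    return a[1], se_a, b[2], se_b

a, se_a, b, se_b = paths(x, m, y)

# Sobel (parametric): assumes a*b is normally distributed.
sobel_z = (a * b) / np.sqrt(a**2 * se_b**2 + b**2 * se_a**2)
sobel_p = 2 * stats.norm.sf(abs(sobel_z))

# Bootstrap (non-parametric): resample rows and take a percentile CI.
boots = []
for _ in range(500):
    idx = rng.integers(0, n, n)
    ab, _, bb, _ = paths(x[idx], m[idx], y[idx])
    boots.append(ab * bb)
ci = np.percentile(boots, [2.5, 97.5])   # indirect effect is "significant"
                                         # if the CI excludes 0
```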
Gini Impurity and Gini Coefficient:
• Grasp that Gini impurity measures the likelihood of incorrectly labeling a randomly chosen element of a set.
• Discuss how the algorithm works, how Gini is calculated, and explore local optima; consider mentioning random forests.
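The calculation itself is short: with class proportions p_k, Gini impurity is 1 − Σ p_k², i.e. the probability of mislabeling a random element if it is labeled according to the set's class distribution.

```python
import numpy as np

def gini_impurity(labels):
    """1 - sum(p_k^2): probability of mislabeling a randomly drawn element
    when labels are assigned according to the set's class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

pure = gini_impurity(["a", "a", "a", "a"])    # 0.0: perfectly pure node
mixed = gini_impurity(["a", "a", "b", "b"])   # 0.5: maximal for two classes
# A decision tree greedily picks the split that minimizes the weighted
# child impurity, which is why it can get stuck in local optima.
```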
Clustering and Optimal Number of Clusters:
• Recognize the potential disparity between the Silhouette and Elbow methods in determining the optimal number of clusters.
• Provide guidelines for reconciling or choosing between these methods, emphasizing context, cluster stability, external validation, and expert judgment.
• Highlight the importance of considering the specificities of the dataset and task when choosing an optimal number of clusters.
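A minimal sketch of the two methods side by side, assuming scikit-learn is available (the blob data is synthetic):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=0)

inertias, silhouettes = {}, {}
for k in range(2, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_                    # Elbow: look for the bend
    silhouettes[k] = silhouette_score(X, km.labels_)

best_k = max(silhouettes, key=silhouettes.get)   # Silhouette: a single argmax
# The elbow is read off the inertia curve by eye, so the two methods can
# disagree; when they do, weigh context, cluster stability, and external
# validation rather than trusting either number blindly.
```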