
Top Machine Learning Algorithms

SUPERVISED LEARNING

Linear Models

Linear Regression
Description: A simple algorithm that models a linear relationship between inputs and a continuous numerical output variable.
Use cases: stock price prediction; predicting housing prices; predicting customer lifetime value.
Advantages: explainable method; interpretable results via its output coefficients; faster to train than many other machine learning models.
Disadvantages: assumes linearity between inputs and output; sensitive to outliers; can underfit with small, high-dimensional data.
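
A minimal scikit-learn sketch of the idea (the tiny dataset is invented for illustration):

    import numpy as np
    from sklearn.linear_model import LinearRegression

    X = np.array([[1.0], [2.0], [3.0], [4.0]])  # one input feature
    y = np.array([2.1, 4.0, 6.2, 7.9])          # continuous numerical output

    model = LinearRegression().fit(X, y)
    print(model.coef_, model.intercept_)  # the interpretable output coefficients
    print(model.predict([[5.0]]))         # prediction for a new input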

Logistic Regression
Description: A simple algorithm that models a linear relationship between inputs and a categorical output (1 or 0).
Use cases: credit risk score prediction; customer churn prediction.
Advantages: interpretable and explainable; less prone to overfitting when using regularization; applicable for multi-class predictions.
Disadvantages: assumes linearity between inputs and output; can overfit with small, high-dimensional data.
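
A minimal scikit-learn sketch (the feature values are made up for illustration):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    X = np.array([[20.0], [35.0], [50.0], [65.0]])  # e.g. a single risk-related feature
    y = np.array([0, 0, 1, 1])                      # categorical output (1 or 0)

    clf = LogisticRegression().fit(X, y)
    print(clf.predict([[40.0]]))        # predicted class
    print(clf.predict_proba([[40.0]]))  # class probabilities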

Ridge Regression
Description: Part of the regression family; it penalizes features that have low predictive outcomes by shrinking their coefficients closer to zero. Can be used for classification or regression.
Use cases: predictive maintenance for automobiles; sales revenue prediction.
Advantages: less prone to overfitting; best suited where data suffer from multicollinearity; explainable and interpretable.
Disadvantages: all the predictors are kept in the final model; doesn't perform feature selection.
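
A minimal scikit-learn sketch; make_regression just generates synthetic data for illustration:

    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge

    X, y = make_regression(n_samples=100, n_features=10, noise=5.0, random_state=0)
    model = Ridge(alpha=1.0).fit(X, y)  # alpha sets the strength of the L2 penalty
    print(model.coef_)  # coefficients are shrunk toward, but not exactly to, zero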

Lasso Regression
Description: Part of the regression family; it penalizes features that have low predictive outcomes by shrinking their coefficients to zero. Can be used for classification or regression.
Use cases: predicting housing prices; predicting clinical outcomes based on health data.
Advantages: less prone to overfitting; can handle high-dimensional data; no need for feature selection.
Disadvantages: can lead to poor interpretability, as it can keep highly correlated variables.
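
A minimal scikit-learn sketch on synthetic data, showing how the L1 penalty zeroes out weak features:

    from sklearn.datasets import make_regression
    from sklearn.linear_model import Lasso

    X, y = make_regression(n_samples=100, n_features=10, n_informative=3,
                           noise=5.0, random_state=0)
    model = Lasso(alpha=1.0).fit(X, y)
    print(model.coef_)  # coefficients of uninformative features are driven to exactly zero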

Tree-Based Models

Decision Tree
Description: Decision tree models make decision rules on the features to produce predictions. Can be used for classification or regression.
Use cases: customer churn prediction; credit score modeling; disease prediction.
Advantages: explainable and interpretable; can handle missing values.
Disadvantages: prone to overfitting; sensitive to outliers.
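
A minimal scikit-learn sketch; the built-in iris dataset stands in for real data:

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    X, y = load_iris(return_X_y=True)
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
    print(export_text(tree))  # the learned decision rules in readable form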

Random Forests
Description: An ensemble learning method that combines the output of multiple decision trees.
Use cases: credit score modeling; predicting housing prices.
Advantages: reduces overfitting; higher accuracy compared to other models.
Disadvantages: training complexity can be high; not very interpretable.
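
A minimal scikit-learn sketch; the built-in breast cancer dataset is used only for illustration:

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    print(forest.score(X_te, y_te))  # accuracy of the combined trees on held-out data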

Gradient Boosting Regression
Description: Gradient boosting regression employs boosting to make predictive models from an ensemble of weak predictive learners.
Use cases: predicting car emissions; predicting ride-hailing fare amounts.
Advantages: better accuracy compared to other regression models; can handle multicollinearity; can handle non-linear relationships.
Disadvantages: sensitive to outliers and can therefore cause overfitting; computationally expensive and has high complexity.
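
A minimal scikit-learn sketch on synthetic data:

    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor

    X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
    gbr = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1,
                                    max_depth=3, random_state=0)
    gbr.fit(X, y)  # each shallow tree is a weak learner correcting the previous ones
    print(gbr.predict(X[:3]))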

XGBoost
Description: A gradient boosting algorithm that is efficient and flexible. Can be used for both classification and regression tasks.
Use cases: churn prediction; claims processing in insurance.
Advantages: provides accurate results; captures non-linear relationships.
Disadvantages: hyperparameter tuning can be complex; does not perform well on sparse datasets.
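
A minimal sketch assuming the xgboost package is installed (pip install xgboost); the data is synthetic:

    from sklearn.datasets import make_classification
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    clf = XGBClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)
    clf.fit(X, y)
    print(clf.predict(X[:5]))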

LightGBM Regressor
Description: A gradient boosting framework that is designed to be more efficient than other implementations.
Use cases: predicting flight times for airlines; predicting cholesterol levels based on health data.
Advantages: can handle large amounts of data; computationally efficient, with fast training speed; low memory usage.
Disadvantages: can overfit due to leaf-wise splitting and high sensitivity; hyperparameter tuning can be complex.
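
A minimal sketch assuming the lightgbm package is installed (pip install lightgbm); the data is synthetic:

    from sklearn.datasets import make_regression
    from lightgbm import LGBMRegressor

    X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)
    reg = LGBMRegressor(n_estimators=100, num_leaves=31).fit(X, y)  # num_leaves governs leaf-wise growth
    print(reg.predict(X[:3]))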

UNSUPERVISED LEARNING

Clustering

K-Means
Description: K-Means is the most widely used clustering approach; it determines K clusters based on Euclidean distances.
Use cases: customer segmentation; recommendation systems.
Advantages: scales to large datasets; simple to implement and interpret; results in tight clusters.
Disadvantages: requires the expected number of clusters from the beginning; has trouble with varying cluster sizes and densities.
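
A minimal scikit-learn sketch; make_blobs generates illustrative clustered data:

    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)  # K must be chosen up front
    print(km.cluster_centers_)  # centroids found by minimizing Euclidean distances
    print(km.labels_[:10])      # cluster assignment for the first ten points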

Hierarchical Clustering
Description: A "bottom-up" approach where each data point is treated as its own cluster, and then the closest two clusters are merged together iteratively.
Use cases: fraud detection; document clustering based on similarity.
Advantages: there is no need to specify the number of clusters; the resulting dendrogram is informative.
Disadvantages: doesn't always result in the best clustering; not suitable for large datasets due to high complexity.
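
A minimal scikit-learn sketch of agglomerative (bottom-up) clustering on synthetic data:

    from sklearn.cluster import AgglomerativeClustering
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=50, centers=3, random_state=0)
    agg = AgglomerativeClustering(n_clusters=3, linkage="ward").fit(X)  # or cut the merge tree by distance_threshold instead of n_clusters
    print(agg.labels_)  # each point's cluster after iteratively merging the closest pairs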

Gaussian Mixture Models
Description: A probabilistic model for modeling normally distributed clusters within a dataset.
Use cases: customer segmentation; recommendation systems.
Advantages: computes a probability for an observation belonging to a cluster; can identify overlapping clusters; more accurate results compared to K-Means.
Disadvantages: requires complex tuning; requires setting the number of expected mixture components or clusters.
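
A minimal scikit-learn sketch on synthetic data:

    from sklearn.datasets import make_blobs
    from sklearn.mixture import GaussianMixture

    X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
    gmm = GaussianMixture(n_components=3, random_state=0).fit(X)  # number of mixture components must be set
    print(gmm.predict_proba(X[:3]))  # soft probability of each point belonging to each cluster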

Association

Apriori Algorithm
Description: A rule-based approach that identifies the most frequent itemsets in a given dataset, where prior knowledge of frequent itemset properties is used.
Use cases: product placements; recommendation engines; promotion optimization.
Advantages: results are intuitive and interpretable; exhaustive approach, as it finds all rules based on confidence and support.
Disadvantages: generates many uninteresting itemsets; computationally and memory intensive; results in many overlapping itemsets.
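
A minimal sketch assuming the mlxtend package is installed (pip install mlxtend); the transactions are invented for illustration:

    import pandas as pd
    from mlxtend.preprocessing import TransactionEncoder
    from mlxtend.frequent_patterns import apriori, association_rules

    transactions = [["milk", "bread"], ["milk", "eggs"],
                    ["milk", "bread", "eggs"], ["bread"]]
    te = TransactionEncoder()
    onehot = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)

    itemsets = apriori(onehot, min_support=0.5, use_colnames=True)  # frequent itemsets
    rules = association_rules(itemsets, metric="confidence", min_threshold=0.6)
    print(rules[["antecedents", "consequents", "support", "confidence"]])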