Uploaded by Royi Rassin

Practical Machine Learning Course Syllabus

Last updated: 26/01/20
Course name and number: Practical topics in Machine Learning (Hebrew title: "Applications in Machine Learning")
Course type: Lecture
Hours: 2
Academic year:
Course website:
We explore use cases in which machine-learning algorithms are applied to real-life problems involving
large amounts of data in different formats. We put a special focus on medical problems.
After a brief introduction to machine learning, we will get familiar with some of the technologies
most commonly used in practice. We will get to know the libraries and platforms that provide
tools for processing and cleaning data of different types, including unstructured data such as
images and text. Our main coding language is Python. Among the libraries that we will work with
during the course are numpy, pandas, sklearn, XGBoost, LGBM, pytorch, and more. Every class is
divided into a preliminary part, in which we present the theory behind the technology in focus,
followed by a practice session that includes a presentation of code examples. During the course,
we will present a number of case studies based on recently published works and projects, and
study their machine-learning problem, data, and proposed technology.
We will cover various topics in machine learning, including decision trees, random forests,
neural nets, boosting, and more. However, our focus is on the practical side; therefore, a
background in machine learning is required, and the introductory machine-learning course is a
prerequisite for taking this course.
Class breakdown:
Background in machine learning: a quick reminder of linear/logistic regression and evaluation philosophy
Teaching material:
Slides 1
Slides 2
Notebook - numpy, pandas
Notebook - sklearn, bike sharing data
Notebook - linear regression
Notebook - logistic regression, ROC, AUC
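As an illustration of the logistic regression, ROC, and AUC topics listed above, here is a minimal sketch. It is not taken from the course notebooks: the synthetic dataset, the split ratio, and all variable names are assumptions chosen only to keep the example self-contained.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data (illustrative only, not course data).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Fit on the training split and score the held-out split.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]  # probability of the positive class

# ROC curve points (one per threshold) and the area under that curve.
fpr, tpr, thresholds = roc_curve(y_test, scores)
auc = roc_auc_score(y_test, scores)
print(f"held-out AUC: {auc:.3f}")
```

Evaluating on a held-out split rather than on the training data is the point of the sketch; the exact AUC value depends on the synthetic data.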
Feature handling, data exploration
Case study
Feature cross and non-linear regression, handling different feature types (numeric, ordinal, categorical, string), exploring a dataset (types of visualization: contingency tables, normal/scatter plots, box plots), data imputation
Teaching material:
Slides 1
Slides 2 (TBD: data exploration,
Notebook - one hot encoding
Notebook - feature cross
Notebook - feature cross on bike sharing
Case study (practical notebook):
Notebook (TBD) using the following
Topics that will be covered: encoding different types of features, data exploration (contingency tables, feature distributions, and building intuition about what to look at in the data), data cleansing (imputation)
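The feature-handling topics above (imputation, one-hot encoding, feature crosses, contingency tables) could look roughly like the following pandas sketch. The toy data frame and its column names are hypothetical; the course notebooks use a bike-sharing dataset that is not reproduced here.

```python
import numpy as np
import pandas as pd

# Toy data frame; column names and values are made up for illustration.
df = pd.DataFrame({
    "season": ["winter", "summer", "summer", "spring"],
    "temp": [3.0, np.nan, 29.0, 15.0],
    "hour": [8, 17, 17, 12],
})

# Imputation: fill the missing numeric value with the column median.
df["temp"] = df["temp"].fillna(df["temp"].median())

# One-hot encoding of the categorical column.
encoded = pd.get_dummies(df, columns=["season"], dtype=int)

# Feature cross: interaction between a one-hot column and a numeric column.
encoded["summer_x_hour"] = encoded["season_summer"] * encoded["hour"]

# Contingency table of two discrete features.
table = pd.crosstab(df["season"], df["hour"])
print(encoded)
print(table)
```

Median imputation is only one possible choice; which strategy fits depends on why the values are missing.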
Overfitting and
Variance/bias, feature selection, L1/L2
2 Case studies
Teaching material:
Slides 1
Notebook - regularization
Ex 1 (out)
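To make the L1/L2 and feature-selection topics concrete, the sketch below fits scikit-learn's Lasso (L1 penalty) and Ridge (L2 penalty) on synthetic data where only three of twenty features matter. The data, the alpha values, and the variable names are assumptions for illustration and are not taken from the regularization notebook.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic regression problem: 20 features, only the first 3 are informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
true_w = np.zeros(20)
true_w[:3] = [3.0, -2.0, 1.5]
y = X @ true_w + rng.normal(scale=0.5, size=200)

# L1 (Lasso) tends to drive uninformative coefficients exactly to zero;
# L2 (Ridge) shrinks coefficients but typically leaves them non-zero.
lasso = Lasso(alpha=0.1).fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)

n_zero_lasso = int(np.sum(lasso.coef_ == 0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0))
print("zero coefficients, Lasso:", n_zero_lasso)
print("zero coefficients, Ridge:", n_zero_ridge)
```

The sparsity of the L1 solution is what makes it usable as a feature-selection step; the strength of the effect depends on the chosen alpha.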
Case study (paper):
Development and validation of a predictive model for detection of colorectal cancer in primary care by analysis of complete blood counts: a binational retrospective study
Case study (practical notebook):
TBD - a notebook about tuning the regularization parameter (inspired by
classification, intro to neural networks
Basic image representation, convolution, max entropy (softmax) classifier, intro to feed-forward networks, conv nets, introduction to PyTorch, with examples
Ex 1 (in)
Ex 2 (out)
Teaching material:
Slides 1
Slides 2
Slides 3 (intro to feed-forward and conv nets - TBD)
Slides 4 - GPU vs. CPU
Notebook - image classification
Notebook - pytorch tutorial
Notebook - simple FF network
Notebook - CIFAR10 with conv net
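Two of the building blocks named above, convolution and the softmax classifier output, can be sketched in plain numpy. This is an illustrative sketch, not the course's PyTorch code; the function names `softmax` and `conv2d_valid` are hypothetical helpers, and the loop-based convolution favors readability over speed.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def conv2d_valid(image, kernel):
    """'Valid' 2D cross-correlation, the operation conv layers compute."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Elementwise product of the kernel with the current image patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 4x4 "image" whose values increase by 1 from left to right within a row.
image = np.arange(16, dtype=float).reshape(4, 4)
edge_kernel = np.array([[1.0, -1.0]])  # horizontal difference filter
features = conv2d_valid(image, edge_kernel)

# Class probabilities from three raw scores (logits).
probs = softmax(np.array([2.0, 1.0, 0.1]))
print(features)
print(probs)
```

Because each pixel is exactly 1 smaller than its right-hand neighbor, the difference filter responds with -1 at every position.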
Case studies
Presentation of two (or more) works,
using deep learning to predict medical
1. International evaluation of an AI system for breast cancer screening, Nature 2020
2. A clinically applicable approach to continuous prediction of future acute kidney injury, Nature 2019
Predicting with trees
Case study
Trees, random forest, bagging, boosting (AdaBoost, gradient boosting, XGBoost). Subtopics: optimization with grid search, input normalization
Teaching material:
Slides 1
Notebook - housing data with trees
Ex 2 (in)
Ex 3 (out)
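The combination of tree ensembles and grid search mentioned above might be sketched as follows with scikit-learn. The synthetic regression data and the small parameter grid are assumptions made for brevity; the course notebook uses housing data that is not reproduced here.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic regression data (illustrative only).
X, y = make_regression(
    n_samples=300, n_features=8, n_informative=4, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Small hyperparameter grid, evaluated with 3-fold cross-validation.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0), param_grid, cv=3, scoring="r2"
)
search.fit(X_train, y_train)

best_params = search.best_params_
test_r2 = search.score(X_test, y_test)  # R^2 of the refit best model
print(best_params, round(test_r2, 3))
```

The test split is touched only once, after the grid search has selected hyperparameters using cross-validation on the training split.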
Case study (paper):
Trees vs Neurons: Comparison between random forest and ANN for high-resolution prediction of building energy
Time series
Case study
Definition, univariate/multivariate, stochastic process, stationarity, seasonality, moving average, exponential smoothing, time-series clustering techniques (e.g. hierarchical clustering, DTW, Ward, self-organizing maps (SOM))
Ex 3 (in)
Ex 4 (out)
Teaching material:
Slides 1 (TBD)
Notebook (TBD, using the dataset: https://www.kaggle.com/c/dsghackathon/data)
Case study (paper):
The emotional arcs of stories are dominated by six basic shapes (EPJ Data Science, 2016)
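Three of the time-series topics in this class (moving average, exponential smoothing, and DTW as a distance for clustering) can be sketched in plain numpy. The function names are hypothetical, and the DTW variant shown (absolute-difference local cost, no window constraint) is one common formulation assumed here, not necessarily the one used in the course.

```python
import numpy as np

def moving_average(x, window):
    """Simple moving average over full windows only."""
    return np.convolve(x, np.ones(window) / window, mode="valid")

def exponential_smoothing(x, alpha):
    """Simple exponential smoothing: s[t] = alpha*x[t] + (1-alpha)*s[t-1]."""
    s = np.empty(len(x))
    s[0] = x[0]
    for t in range(1, len(x)):
        s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]
    return s

def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

series = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
ma = moving_average(series, window=3)
es = exponential_smoothing(series, alpha=0.5)
# The second sequence repeats one value; DTW can absorb that by warping.
d = dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
print(ma, es, d)
```

A pairwise DTW distance matrix computed this way could then be passed to a hierarchical clustering routine, which is how DTW and clustering typically connect.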
Text analysis
Text representation, embeddings, tagging
and classification with RNN
Teaching material:
Slides 1 - tf-idf
Slides 2 - LSA, embedding
Notebook - tweet classification with trees
Slides 3 - RNN (TBD)
Common deep
architectures and
their applications
Encoder-decoder: image captioning, seq2seq (translation). Attention models and transformers
Ex 4 (in)
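The attention mechanism mentioned in this class can be illustrated with scaled dot-product attention, the form used in transformers. The numpy sketch below is a single-head, unbatched, unmasked version; the shapes and random inputs are assumptions for illustration only.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Single-head attention: softmax(q k^T / sqrt(d_k)) v."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    # Subtract the row maximum before exponentiating, for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
q = rng.normal(size=(2, 4))  # 2 query positions, key/query dimension 4
k = rng.normal(size=(3, 4))  # 3 key positions
v = rng.normal(size=(3, 5))  # one value vector (dimension 5) per key
out, weights = scaled_dot_product_attention(q, k, v)
print(out.shape, weights.sum(axis=-1))
```

Each output row is a weighted average of the value vectors, with weights that sum to one per query.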
Teaching material:
Case study and
project proposals
Case study, paper:
On the Automatic Generation of Medical Imaging Reports, ACL 2018
Project proposals and discussion
Project ideas
Required prerequisites:
89511-Introduction to machine learning
Grade structure:
4 Home assignments - 40%
Final project - 60%