Introduction
● This session is not comprehensive and does not act as a roadmap.
● All information/knowledge in this session is based on my own personal experience.
● Becoming proficient in AI takes time and effort. Set your expectations accordingly.
● Most of my experience revolves around "applied" AI, not research.

What we'll cover
● NLP
● Machine Learning
● Computer Vision

Timeline
[Timeline figure: milestones in Machine Learning, Computer Vision, and NLP from 1957 to 2023, including the Perceptron, the multilayer perceptron, back-propagation, decision trees, SVMs, random forests, CNNs (LeNet-5), deep learning, AlexNet, VGG, ResNet, R-CNN, YOLO, FPNs, SAM, Word2Vec, Seq2Seq, "Attention Is All You Need", BERT, GPT-2, RoBERTa, T5, and GPT-3.]

LeNet-5
Data flow in LeNet: the input is a handwritten digit, the output is a probability over 10 possible outcomes.

Learning: Breadth vs. Depth
• Breadth: gaining a wide understanding of various machine learning concepts, algorithms, and techniques. It involves exploring different topics like supervised and unsupervised learning, neural networks, and reinforcement learning to build a broad knowledge base.
• Depth: diving deeply into specific areas or techniques in machine learning. This means mastering the details, underlying mathematics, and advanced applications of a particular topic, such as deep learning, natural language processing, or computer vision, to develop specialized expertise.

Machine Learning: Breadth
• Major concepts: supervised vs. unsupervised learning, L1/L2 loss, feature engineering (normalization, standardization), evaluation measures (precision, recall, ROC, accuracy, average precision), k-fold cross-validation, false positives vs. false negatives, interpolation and extrapolation, generative vs. discriminative models, the bias-variance trade-off
• Algorithms: linear regression, logistic regression, Naive Bayes, SVM, decision trees and random forests, bagging, boosting, k-means, KNN, dimensionality reduction (PCA vs. AutoEncoder)
• Probabilistic modeling: distributions (Gaussian), Bayes' theorem, Gaussian mixtures, Variational AutoEncoder

Machine Learning: Depth
• Breadth:
  • What is L2 regularization? Regularization techniques help prevent overfitting by adding a penalty on model complexity.
  • When to use L2 regularization? Use regularization when your model performs well on training data but poorly on validation or test data.
• Depth:
  • What are the different types of regularization? L1 (Lasso), L2 (Ridge), and Elastic Net.
  • How does L2 regularization work? It adds the squared magnitude of the coefficients as a penalty term to the loss function.
  • Write pseudocode for L2 regularization (a short sketch follows this slide).
  • Or deeper: when might L2 regularization fail? Why? What are the alternatives? Dropout, data augmentation, early stopping, etc.
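As a stab at the pseudocode question above, here is a minimal sketch in Python/NumPy. The function names, the toy gradient-descent loop, and the choice of mean squared error as the base loss are my own, for illustration only: L2 regularization adds a lam * ||w||^2 penalty to the loss, which shows up as an extra 2 * lam * w term in the gradient.

```python
import numpy as np

def l2_regularized_loss_and_grad(w, X, y, lam):
    """Mean squared error with an L2 (ridge) penalty on the weights.

    Loss = (1/n) * ||X w - y||^2 + lam * ||w||^2
    """
    n = X.shape[0]
    residual = X @ w - y
    loss = (residual @ residual) / n + lam * (w @ w)
    # Gradient of the data term plus the gradient of the penalty term.
    grad = (2.0 / n) * (X.T @ residual) + 2.0 * lam * w
    return loss, grad

def fit_ridge_gd(X, y, lam=0.1, lr=0.01, steps=1000):
    """Plain gradient descent; the only change vs. unregularized linear
    regression is the extra 2 * lam * w term in the gradient."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        _, grad = l2_regularized_loss_and_grad(w, X, y, lam)
        w -= lr * grad
    return w
```

With lam = 0 this reduces to unregularized linear regression; increasing lam shrinks the weights toward zero, which is exactly the overfitting control discussed in the breadth questions.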
Popular Topics in Deep Learning
• CNNs, pooling, depth-wise separable convolutions (depth: e.g., computing the number of parameters, translation invariance)
• Dropout, BatchNorm, which optimizer to use
• Activation functions (Sigmoid, Tanh, ReLU)
• Losses (Sigmoid, Softmax, L2, L1, Cross-Entropy, Contrastive)
• Vanishing gradients (what / why / how to solve / the LSTM case)
• Handling imbalanced datasets (e.g., focal loss, over-/under-sampling)
• Understanding the leaky properties of back-propagation
• Practical training tips for your neural networks
• Other possible topics: GANs, Transformers, LSTMs

Computer Vision (Classical)
Computer Vision: Algorithms and Applications, 2nd ed., Richard Szeliski. "A rough timeline of some of the most active topics of research in computer vision."

Computer Vision (Classical)
• Feature Extraction, Optical Flow
• 2D & 3D Projective Transformations
• Visual Tracking
• Segmentation and Grouping
• Camera Models & Calibration
• Epipolar Geometry, Stereovision
• Geometric Aberrations

Taxonomy: Classical Methods
Active Contours: utilize boundaries and evolve over time toward the final segmentation.
● Contour: construct an energy function; minimizing this function yields the solution.
● Energy: pull the contour toward the larger gradients in the image while keeping a smooth surface.

3Rs of Computer Vision (Recognition, Reconstruction, Reorganization)
Jitendra Malik lecture: https://inst.eecs.berkeley.edu/~cs280/sp15/lectures/1.pdf

Computer Vision Tasks

Computer Vision: Breadth
• Image Classification: AlexNet, ResNet, VGG
• Object Detection: the three R-CNN variants, one-stage networks (YOLO, CenterNet, SSD)
• Semantic Segmentation: e.g., FCN, U-Net, FPN, Mask R-CNN, DeepLab variants, Segment Anything (SAM)
• Video Classification (frames vs. clips), Action Classification, Action Localization, Tracking
• More tasks? (Reconstruction, Reorganization, Tracking, etc.)

Natural Language Processing
● Verbal: speech
● Textual
● Sometimes: visual (OCR)

Human language is hard:
● Symbolic
● Implicit
● Sparse
● Diverse

NLP Tasks
Text Classification, Information Retrieval & Extraction, Generation, Sentiment Analysis, Named Entity Recognition (NER), Intent Understanding, Summarization, QA, GPTs, Machine Translation (MT), Auto-completion, Recommender Systems

NLP: Breadth
• Tokenization and text representation, Bag-of-Words (BoW)
• Word2Vec: Efficient Estimation of Word Representations in Vector Space
• Attention Mechanism: Attention Is All You Need
• GloVe: Global Vectors for Word Representation
• Sequence models: LSTMs, Conv1D, language models
• Seq2Seq models: NMT, chatbots, QA
• Transfer learning in NLP: BERT, GPT-2, GPT-3, XLNet

NLP: Breadth
• Text Classification: BERT, RoBERTa, DistilBERT, mDeBERTa
• Text Generation: T5, GPT-2, GPT-3, GPT-4, BART
• Retrieval and Extraction: XLM-RoBERTa, BERT, ELMo, T5, SpanBERT

Conclusion
• Breadth and depth are equally important.
• Getting your hands dirty is key. Apply a lot (a short end-to-end example follows the references below).
• The ML/DL path is a bit longer and harder than a typical SWE path.
• Build good intuitions while building ML projects. Don't be blind.
• Root-cause analysis for the problems you face. Deep analysis is a missing and hard-to-build skill.

References
• CS224N: Natural Language Processing with Deep Learning
• Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville
• CS231n: Deep Learning for Computer Vision
• Cracking the Machine Learning Interview, Nitin Suri
• Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow
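To close with the "getting your hands dirty" point: below is a minimal end-to-end sketch that exercises several of the breadth topics above (train/validation split, logistic regression, precision, recall, ROC). It assumes scikit-learn is installed; the dataset and parameter choices are mine and are only a starting point, not a recommendation from this session.

```python
# A minimal "get your hands dirty" sketch: train a simple classifier and
# inspect the evaluation measures mentioned in the Machine Learning: Breadth slide.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score, roc_auc_score

# Arbitrary built-in dataset, held-out validation split.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)

pred = model.predict(X_val)
scores = model.predict_proba(X_val)[:, 1]
print("precision:", precision_score(y_val, pred))
print("recall:   ", recall_score(y_val, pred))
print("ROC AUC:  ", roc_auc_score(y_val, scores))
```

From here, the natural next step is the root-cause habit from the conclusion: look at the individual false positives and false negatives instead of stopping at the aggregate scores.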