1. Lane Detection using Sliding Window + IPM
Inverse Perspective Mapping (IPM): IPM converts a perspective image to a top-down (bird’s-eye) view. This simplifies lane detection by removing the perspective distortion that makes lane lines appear to converge. Mathematically, it is a homography that maps image points to ground-plane coordinates.
Sliding Window Technique: Start by building a column-wise histogram of the nonzero pixels in the lower half of the binary image. The histogram peaks mark the bases of the lane lines. From there, fixed-height windows are stacked vertically, with each next window re-centered on the mean x-position of the nonzero pixels detected in the current window. This continues upward, effectively tracking the lane lines.

2. HSV & CNN for Road Sign Detection
HSV Color Space: HSV separates image intensity (Value) from color information (Hue and Saturation), making it easier to isolate colors under different lighting. Thresholding in HSV detects regions of a specific color, which can then be passed to a classifier.
CNN (Convolutional Neural Network): CNNs are deep learning models specialized for grid-like data such as images. They include convolutional layers (feature extraction), pooling layers (downsampling), and fully connected layers (classification). A CNN learns hierarchical features: from edges and textures to object parts and full object representations.

3. Gamma Correction & Otsu Thresholding
Gamma Correction: Adjusts image brightness with a non-linear (power-law) transformation. It enhances underexposed or overexposed regions, which is especially useful under uneven lighting.
Otsu Thresholding: Otsu’s method automatically picks the threshold that minimizes intra-class variance (equivalently, maximizes between-class variance) of the grayscale histogram, making it suitable for converting an image to binary.

4. Motion Model using Encoder + IMU
Encoder: Measures wheel rotation; distance and velocity follow from the encoder’s counts per revolution (CPR).
IMU (Inertial Measurement Unit): Provides data from accelerometers and gyroscopes. When fused with encoder data (e.g., using a Kalman Filter), it improves the accuracy of position and orientation estimates (pose estimation).

5. Obstacle Map from Stereo Vision
Stereo Vision: Uses two cameras to calculate depth via triangulation. Disparity maps are computed from the difference in pixel locations of the same point between the two images.
Point Cloud: A collection of 3D coordinates (x, y, z) derived from stereo disparity. These points are used to identify obstacles and can be projected onto a 2D occupancy grid for path planning.

6. Cartographer SLAM
SLAM: Simultaneously builds a map and localizes the robot in it.
Cartographer: A SLAM library developed by Google. It uses sensor fusion (LIDAR, IMU) for 2D and 3D SLAM. It builds submaps and connects them via pose graphs. Loop closures are used to correct drift.

7. YOLO + Optical Flow for Object Tracking
YOLO: A real-time object detection model that divides the image into a grid and predicts bounding boxes and class probabilities simultaneously.
Optical Flow: Estimates the motion of objects by comparing changes in pixel positions across consecutive frames. Common algorithms: Lucas-Kanade (sparse) and Farneback (dense).

8. YOLOv5-OBB Enhancements
YOLOv5-OBB: An oriented bounding box (OBB) variant of YOLOv5 used for angle-aware detection.
Backbone Modifications: Replacing the Focus and SPP (Spatial Pyramid Pooling) layers with more efficient blocks.
Attention Mechanisms:
o Receptive Field Block (RFB): Expands the context captured at each layer.
o Channel Attention: Weighs feature maps by importance (e.g., using SENet).
Correlation Modeling: Adds spatial awareness and learns co-located object patterns.

9. RNN + Encoder-Decoder for Detection
RNNs: Suited for sequential data; they maintain a hidden state that updates with each timestep. Variants: LSTM, GRU.
Encoder-Decoder: Common in NLP and image captioning. The encoder processes the input to create a context vector. The decoder generates outputs from this context.
Transfer Learning: Leveraging models pretrained on large datasets (like COCO or ImageNet) and fine-tuning them on specific tasks to reduce data requirements.

10. MAML & Prototypical Networks
MAML: Learns an initialization of model parameters such that the model can adapt quickly to new tasks with few gradient steps.
Prototypical Networks: Compute the centroid (prototype) of each class in the embedding space and classify new samples based on proximity (usually Euclidean distance).

11. Sentence-BERT
BERT: Bidirectional Encoder Representations from Transformers; captures context from both directions.
Sentence-BERT: Modifies BERT with siamese or triplet networks to produce semantically meaningful sentence embeddings, efficient for clustering and similarity tasks.

12. Vector Search in RAG
Embeddings: Numerical representations of text. Similar texts have embeddings that are close in the vector space.
Vector Search: Finds the nearest embeddings using cosine similarity or Euclidean distance. Libraries: FAISS, Chroma.

13. RAG (Retrieval Augmented Generation)
Combines retrieval from a document store (vector database) with generative models (LLMs).
Example pipeline: Query → Retrieve relevant docs → Concatenate with query → LLM generates final answer.

14. LangChain RAG
LangChain: High-level Python framework for building RAG apps. Supports chaining, agents, memory, and interfacing with LLMs + retrieval systems like Chroma or Pinecone.

15. Streamlit Web App
Streamlit: Framework for building web apps in Python. Widgets (sliders, buttons) allow real-time interaction with backend ML models.

16. RL for Trading
DQN: Uses Q-learning with a neural net to estimate the optimal Q-values.
Experience Replay: Stores past transitions and samples them at random to break the correlation between consecutive samples.
MLP Policy: A feedforward neural net mapping state to action probabilities.
Benchmarking: Compare RL against time-series baselines (LSTM learns temporal trends; ARIMA models linear dependencies).

17. PPO & SAC
PPO: On-policy method; uses a clipped surrogate objective to avoid large policy updates.
SAC: Off-policy actor-critic with entropy maximization, encouraging exploration.

18. Classical Control Tasks
CartPole: Balancing problem; reward is 1 per timestep the pole stays up.
Ant (MuJoCo): Four-legged agent; high-dimensional continuous control.

19. JAX & Flax
JAX: Provides automatic differentiation + GPU/TPU support.
Flax: Deep learning library built on JAX for defining neural networks.

20. Morlet Transform for SHM
Morlet Wavelet: A complex sinusoid modulated by a Gaussian envelope, well suited to vibration signals.
SHM (Structural Health Monitoring): Detects changes in frequency/amplitude to identify damage or wear.

21. UART & RTC in STM32/ESP32
UART: Asynchronous communication; both ends must be configured with the same baud rate.
RTC: Real-time clock; helps manage timed events like waking from deep sleep.

22. Pipelined CPU Design
Pipeline Stages: IF (Instruction Fetch), ID (Instruction Decode), EX (Execute), MEM (Memory), WB (Write Back).
Hazards:
o Data: solved by forwarding or stalling.
o Control: solved via branch prediction or delay slots.
FSM: A controller that transitions between states based on inputs and the current state.

23. Mealy FSM & VHDL Simulation
Mealy Machine: Outputs depend on both state and input (unlike Moore, where outputs depend only on state).
VHDL Simulation: Used in Quartus/ModelSim to test logic before synthesis onto an FPGA.

24. Core ML Theory
Bias-Variance:
o High Bias: Underfitting
o High Variance: Overfitting
Supervised Learning: Uses labeled data to learn a function mapping input → output.
Unsupervised Learning: Finds structure in unlabeled data (clustering, dimensionality reduction).
Classification Metrics:
o Accuracy = (TP + TN) / total
o Precision = TP / (TP + FP)
o Recall = TP / (TP + FN)
o F1 = 2 * (Precision * Recall) / (Precision + Recall)
Regression Metrics: MAE, MSE, RMSE, R² score.
Regularization: Prevents overfitting. L1 encourages sparsity; L2 penalizes large weights.
Activations:
o ReLU: max(0, x); mitigates vanishing gradients.
o Sigmoid: output in (0, 1); saturates.
o Tanh: output in (-1, 1); zero-centered.
Optimizers:
o SGD: Simple, slow convergence.
o Adam: Adaptive moment estimation.
Losses:
o MSE: Mean Squared Error for regression.
o Cross-Entropy: For classification.

25. RL Basics
MDP: Tuple (S, A, P, R, gamma).
Q-function: Expected cumulative discounted reward of taking an action in a state and following the policy afterwards.
Policy: Function that outputs an action given a state.
Exploration/Exploitation: Balance discovering new strategies vs. using known ones.
TD Learning: Updates the value estimate using the current reward + the estimated future value.

CHEAT SHEET & MOCK INTERVIEW QUESTIONS FOR YAJAN AGARWAL

📄 CHEAT SHEET (1-PAGE SUMMARY)

🔍 Machine Learning Core
Regression: Linear, Lasso, Ridge, MLPRegressor
Classification: Logistic Regression, SVM, Decision Trees, CNN
Metrics:
o Classification: Accuracy, Precision, Recall, F1
o Regression: MSE, RMSE, R²
Regularization: L1 (sparse), L2 (smooth)
Optimization: SGD, Adam, LR Schedulers

📈 Deep Learning
CNNs: Conv → Pool → FC layers
RNNs: LSTM, GRU; used in sequence modeling
Transformers: Attention-based, used in NLP (BERT, SBERT)
Activation: ReLU, Sigmoid, Tanh
Loss: CrossEntropy, MSE

🤖 Reinforcement Learning
Key Algorithms: DQN (neural-network Q-function), PPO (clipped updates), SAC (exploration-focused)
Core Concepts: MDP, Q-function, Policy, TD-learning, Reward shaping

🔍 Few-shot & Meta Learning
MAML: Learn initial weights for rapid task adaptation
Prototypical Networks: Distance-based classification

Computer Vision
YOLOv5-OBB: Object detection + orientation
Sliding Window + IPM: Lane detection from top view
Optical Flow: Track motion between frames

🛠️ Robotics & Hardware
SLAM: Map + localize using Cartographer
Sensors: Encoders, IMUs for motion estimation
Communication: UART, RTC sleep/wake for IoT
Signal Analysis: Morlet Wavelet for vibrations

Tools & Frameworks
LangChain / Streamlit: LLM pipeline, front-end
JAX / Flax: Neural networks with high performance
ROS, Quartus, ModelSim: Robotics and HDL simulation
Vector Search: FAISS, Chroma for embedding-based retrieval

🎤 MOCK INTERVIEW QUESTIONS WITH ANSWERS

🔸 Machine Learning
1. What’s the difference between L1 and L2 regularization?
o L1 adds the sum of absolute weights to the loss (Lasso), leading to sparsity (feature selection). L2 adds the sum of squared weights (Ridge), encouraging small weights but not sparsity.
2. Explain bias-variance tradeoff with an example.
o Bias: error due to overly simplistic assumptions. Variance: sensitivity to the training data. Linear regression = high bias, low variance; deep decision trees = low bias, high variance.
3. How does a decision tree handle overfitting?
o By limiting growth with parameters like max depth or min samples per split, or by pruning after training.

🔸 Deep Learning
4. Walk me through the architecture of a CNN.
o Input → Convolution (with filters) → Activation (ReLU) → Pooling (downsampling) → Fully Connected → Output layer (Softmax for classification).
5. What are vanishing gradients and how do you avoid them?
o In deep networks, gradients can become too small during backprop, stalling learning in the early layers. Use ReLU, batch normalization, or residual connections to mitigate.
6. How does an RNN differ from a Transformer?
o RNNs process input sequentially, while Transformers process all tokens in parallel using attention. Transformers parallelize better during training and model long-range dependencies more directly.

🔸 Reinforcement Learning
7. What is a Markov Decision Process?
o A framework defined by (states S, actions A, transition probability P, reward R, discount factor γ) for modeling decision making.
8. How does PPO differ from DQN?
o PPO is a policy-based, on-policy algorithm using gradient ascent. DQN is value-based and off-policy, using Q-learning.
9. Why is entropy used in SAC?
o To encourage exploration by maximizing both the expected return and the entropy (randomness) of the policy.

🔸 Meta Learning / Few Shot
10. How does MAML enable few-shot learning?
o It trains a model’s initial weights so that only a few gradient steps are needed to adapt to new tasks.
11. What’s the intuition behind Prototypical Networks?
o For each class, compute the mean embedding of a few samples (prototype), then classify a new sample by its nearest prototype.

🔸 Vision Projects
12. How does the sliding window lane detection work?
o A histogram of the lower half of the binary image finds the lane start points. Sliding windows then move upward, window by window, to follow the lane pixels.
13. Why do we use HSV over RGB in detection tasks?
o HSV separates brightness from color, making color-based segmentation more robust to lighting variations.
14. How does Optical Flow help in opponent car tracking?
o It tracks pixel movement across frames to estimate object motion and relative velocity.

🔸 Robotics & SLAM
15. How does Cartographer SLAM perform loop closure?
o It detects when the robot revisits a location and optimizes the entire trajectory using pose graph optimization.
16. What’s the role of IMU and Encoder in motion estimation?
o Encoders measure wheel displacement; IMUs measure angular rate and acceleration, from which orientation is estimated. Together they form an odometry system for dead reckoning.
17. How is point cloud data converted to obstacle maps?
o 3D points from stereo/depth are projected onto 2D grids (occupancy maps) for path planning.

🔸 RAG + LangChain
18. What is Retrieval Augmented Generation?
o It retrieves relevant documents using vector search and adds them to the LLM prompt for more context-aware responses.
19. How does LangChain facilitate RAG pipelines?
o It offers chaining modules that integrate document loaders, embeddings, vector stores, and LLM prompts in a pipeline.
20. Compare OpenAI API vs Open-Source LLMs in performance and control.
o OpenAI offers high performance but limited control and a per-use cost. Open-source models are more customizable but may need their own infrastructure.

🔸 Hardware / Embedded
21. What is the difference between Mealy and Moore FSM?
o Mealy: output depends on current state + input. Moore: output depends only on state.
22. How do UART and RTC help in IoT power optimization?
o UART enables low-power serial communication. RTC manages sleep/wake schedules, reducing energy consumption.
23. Describe your VHDL-based pipelined CPU design.
o It had 5 stages (IF, ID, EX, MEM, WB), handled hazards via forwarding and stalling, and was tested using ModelSim simulations.
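
Worked sketch for section 3 (Otsu Thresholding): the notes describe Otsu’s method only in words, so the following is a minimal pure-Python sketch of the threshold search, not a production implementation. The function name `otsu_threshold` and the two-peak test histogram are illustrative assumptions. It maximizes between-class variance, which is equivalent to minimizing intra-class variance.

```python
def otsu_threshold(hist):
    """Return the gray level t that best splits a histogram into two classes.

    hist[i] is the number of pixels with gray level i. Pixels with level <= t
    form class 0, the rest form class 1.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels in class 0 so far
    sum0 = 0  # sum of gray levels in class 0 so far
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, no valid split here
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (the constant 1/total^2 is dropped).
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


# Hypothetical bimodal histogram: dark pixels at level 50, bright pixels at level 200.
hist = [0] * 256
hist[50] = 100
hist[200] = 100
threshold = otsu_threshold(hist)  # lands between the two modes
```

Any threshold between the two peaks separates them equally well here; the sketch returns the first such level it finds.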
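
Worked sketch for section 4 (Motion Model using Encoder + IMU): the notes say distance follows from encoder counts and CPR. The sketch below shows that conversion plus one dead-reckoning step. It assumes a differential-drive robot, which the notes do not state; `wheel_radius`, `wheel_base`, and both function names are hypothetical. In a fused system the heading change would typically come from, or be corrected by, the IMU gyroscope.

```python
import math


def encoder_distance(counts, cpr, wheel_radius):
    """Distance rolled by one wheel: revolutions times wheel circumference."""
    return 2.0 * math.pi * wheel_radius * counts / cpr


def update_pose(x, y, theta, d_left, d_right, wheel_base):
    """One dead-reckoning step for an assumed differential-drive robot."""
    d_center = (d_left + d_right) / 2.0        # distance moved by the robot center
    d_theta = (d_right - d_left) / wheel_base  # heading change in radians
    # Integrate along the average heading over the step.
    x += d_center * math.cos(theta + d_theta / 2.0)
    y += d_center * math.sin(theta + d_theta / 2.0)
    theta += d_theta
    return x, y, theta


# Hypothetical numbers: 1000-CPR encoder, 5 cm wheel radius, 30 cm wheel base.
d_l = encoder_distance(500, 1000, 0.05)
d_r = encoder_distance(500, 1000, 0.05)
pose = update_pose(0.0, 0.0, 0.0, d_l, d_r, 0.30)
```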
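
Worked sketch for section 10 and question 11 (Prototypical Networks): a minimal illustration of the prototype idea, assuming the embeddings have already been produced by some encoder. The 2-D vectors and the labels "cat" and "dog" are made up for the example.

```python
import math


def class_prototypes(support):
    """Map each label to the mean (centroid) of its support embeddings."""
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Hypothetical 2-shot, 2-class episode with 2-D embeddings.
support = {
    "cat": [[0.0, 0.0], [0.0, 2.0]],
    "dog": [[10.0, 10.0], [12.0, 10.0]],
}
protos = class_prototypes(support)
label = classify([1.0, 1.0], protos)
```

In a real few-shot model the encoder is trained so that this nearest-prototype rule works well; the sketch covers only the classification step.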
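
Worked sketch for section 12 (Vector Search in RAG): libraries such as FAISS or Chroma do this at scale with approximate indexes. The brute-force version below only illustrates the underlying cosine-similarity ranking. The document IDs and embeddings are invented, and the code assumes no embedding is the zero vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, emb)) for doc_id, emb in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical index of three documents with 2-D embeddings.
index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = top_k([1.0, 0.1], index, k=2)
```

In a RAG pipeline, the texts behind the returned IDs would be concatenated with the query before calling the LLM.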
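
Worked sketch for section 17 (PPO): the "clipped surrogate objective" can be made concrete with a few lines. This is a sketch of the objective value only (no gradients, no value or entropy terms), assuming probability ratios and advantages have already been computed. The default `epsilon=0.2` is a commonly used value, not one taken from the notes.

```python
def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Mean clipped surrogate objective over a batch.

    ratios[i] is pi_new(a|s) / pi_old(a|s) for sample i,
    advantages[i] is the advantage estimate for the same sample.
    """
    total = 0.0
    for r, adv in zip(ratios, advantages):
        clipped_r = max(1.0 - epsilon, min(r, 1.0 + epsilon))
        # Taking the minimum removes any incentive to push the ratio
        # outside [1 - epsilon, 1 + epsilon].
        total += min(r * adv, clipped_r * adv)
    return total / len(ratios)


# A ratio of 1.5 with positive advantage is capped at 1.2 * advantage.
value = ppo_clip_objective([1.5], [1.0])
```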
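
Worked sketch for section 23 and question 21 (Mealy vs. Moore): the difference is easiest to see on a small example. The task below, flagging two consecutive 1 bits, is a hypothetical choice and not the design from the notes, and it is written as a Python simulation, not VHDL. The Mealy version computes its output from state and input together and needs two states; the Moore version derives its output from the state alone and needs a third state.

```python
def mealy_detect_11(bits):
    """Mealy machine: output is a function of (current state, current input)."""
    state = "S0"  # S0: previous bit was 0 (or start), S1: previous bit was 1
    outputs = []
    for b in bits:
        out = 1 if (state == "S1" and b == 1) else 0
        state = "S1" if b == 1 else "S0"
        outputs.append(out)
    return outputs


def moore_detect_11(bits):
    """Moore machine: output is a function of the state only."""
    state = "S0"  # S0: no run, S1: one 1 seen, S2: at least two 1s in a row
    outputs = []
    for b in bits:
        if b == 0:
            state = "S0"
        elif state == "S0":
            state = "S1"
        else:
            state = "S2"
        outputs.append(1 if state == "S2" else 0)
    return outputs


bits = [1, 1, 0, 1, 1, 1]
mealy_out = mealy_detect_11(bits)
moore_out = moore_detect_11(bits)
```

In this software simulation the Moore output is sampled after each transition, so both lists match. In clocked hardware the Moore output would typically appear one clock edge later than the Mealy output.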
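
Worked sketch for section 24 (Classification Metrics): the four formulas can be checked on a small confusion matrix. The counts below are invented, and the function assumes the denominators are non-zero.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical imbalanced test set: 10 positives, 90 negatives.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=88)
```

With these counts accuracy is 0.96 while precision, recall, and F1 are each 0.8, which shows why accuracy alone can look flattering on imbalanced data.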
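
Worked sketch for section 25 (TD Learning): "current reward + estimated future value" corresponds to the TD target. Below is a minimal tabular Q-learning update, assuming a dictionary-of-dictionaries Q-table. The state and action names and the `alpha` and `gamma` values are illustrative. DQN replaces the table with a neural network but uses the same target.

```python
def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Apply one Q-learning update in place and return the TD error."""
    best_next = max(q[next_state].values())  # value of the greedy next action
    td_target = reward + gamma * best_next   # current reward + discounted future value
    td_error = td_target - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


# Hypothetical two-state Q-table.
q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 1.0, "right": 2.0},
}
error = q_learning_update(q, "s0", "right", reward=1.0, next_state="s1",
                          alpha=0.5, gamma=0.9)
```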