Assignment Report

1 Introduction

Goal: design and implement a reliable machine learning pipeline for the MNIST dataset, integrating key components such as model storage, test cases, and an API interface. This report describes the structure, components, and functionality of the developed pipeline.

2 Project Structure

Root directory: Lunit_assignment
1. nets:
   1. nn.py
2. utils:
   1. data_loader.py
   2. util.py
3. Core implementation files:
   1. trainer.py
   2. evaluater.py
   3. server.py
   4. client.py

3 nets: ==> nn.py

• nn.py: Defines the architecture of the neural network used for MNIST digit classification.
• Convolutional layers:
  • conv1: The first convolutional layer takes a single-channel (grayscale) input and produces 32 output channels using a 3x3 kernel.
  • conv2: The second convolutional layer takes the 32 channels produced by conv1 as input and produces 64 output channels using a 3x3 kernel.
• Dropout layers:
  • dropout1: Regularizes the model by randomly setting 25% of its input units to 0 during training, which helps prevent overfitting.
  • dropout2: A more aggressive dropout layer that sets 50% of its input units to 0 during training.
• Fully connected (linear) layers:
  • fc1: The first linear layer has an input feature size of 9216 and an output size of 128.
  • fc2: The second linear layer reduces the feature size from 128 to 10, corresponding to the ten digit classes (0-9) of the MNIST dataset.

4 utils: ==> data_loader.py && util.py

• These modules handle data loading and configuration handling for the MNIST classification task.
• Transform (get_transform()): Converts images to PyTorch tensors with transforms.ToTensor().
• Data-loading function: Loads the MNIST data.
  • Arguments:
    • use_cuda: A flag indicating whether CUDA should be used.
    • batch_size: The size of the batches in which data should be loaded.
  • Logic:
    • Based on the use_cuda flag, the function sets specific parameters (num_workers, pin_memory, shuffle) to optimize data loading when using a GPU.
    • Uses the transformation returned by get_transform() to preprocess the data.
    • Loads both the training and test datasets from the ./dataset directory.
    • Uses DataLoader to prepare batches of data for both datasets.

5 trainer.py:

• Implements the model training process for MNIST classification and integrates with MLflow for experiment tracking.

Key Components:
• Initialization:
  • Determines the computation device.
  • Loads the training and test datasets.
  • Initializes the neural network and optimizer.
• Training Loop:
  • Processes data in batches, computes the loss, and updates the model weights.
  • Provides real-time feedback using a progress bar.
  • Evaluates model performance after each epoch.
• Model Logging:
  • The model is saved as "mnist_model.pt".
  • Model weights and training artifacts are logged to MLflow for tracking.
• To start training, run main.py.

6 evaluater.py:

Functionality:
• Evaluation Mode:
  • The model is set to evaluation mode using model.eval(), which switches layers such as dropout and batch normalization to their inference behavior (for example, dropout is disabled).
• Loss & Accuracy Computation:
  • Processes the test data and computes the loss and predictions without tracking gradients (torch.no_grad()).
  • Aggregates the loss and determines accuracy by comparing predictions to the actual targets.
• Results Display:
  • Outputs the average loss and the accuracy (as a percentage) for the test set.
• MLflow Logging:
  • Logs the test loss and accuracy metrics to MLflow.

7 server.py:

• server.py implements a FastAPI server providing an API interface for training and prediction tasks using the MNIST model.

Key Components:
• Initialization:
  • load_model(model_path): Loads the MNIST model from the specified path and sets it to evaluation mode.
  • On server startup, the model is pre-loaded so that it is ready for inference.
• Endpoints:
  • Training (/train/): Accepts an uploaded configuration file to initiate model training and saves the configuration for reproducibility. Returns a success message upon completion.
  • Prediction (/predict/): Receives an image file, preprocesses it, and returns the model's digit prediction.
• Server Launch: When executed directly, the script uses uvicorn to run the server on all interfaces at port 8000.

8 client.py:

• client.py serves as a client interface to the FastAPI backend for MNIST model operations, specifically training and inference.

Key Components:
• Inference Request - request_inference():
  • Data Preparation: Loads a test image from the MNIST dataset.
  • API Interaction: Sends the image to the /predict endpoint of the server to obtain a digit prediction.
  • Response Handling: Displays the model's prediction and any other associated metadata received from the server.
• Training Request - request_train(args):
  • Configuration File Handling: Reads the configuration file provided through command-line arguments.
  • API Interaction: Sends the configuration file to the /train endpoint of the server to initiate the model training process.
  • Response Handling: Outputs the server's response, typically an acknowledgment that training has completed.
• Command-Line Interface:
  • Uses the argparse library to let users specify whether they want to initiate a training or a testing request. Users can also provide a specific configuration file for the training process.
• Error Handling && Execution Mechanism

9 tests:

• The tests folder contains checks that ensure parts of the MNIST project work as intended.

Key tests (functions):
• test_config_loading.py: Config Loading
  • Checks that the configuration file loads correctly.
  • Confirms that essential settings such as max_epochs, lr, and batch_size are present.
• test_data_loader.py: Data Loading
  • Makes sure data is loaded correctly for training.
  • Verifies the shape and size of the loaded data batches.
• test_model.py: Model Verification
  • Initializes the neural network model.
  • Ensures the model has the expected number of layers or modules.
• test_tester.py: Evaluation Test
  • Validates the model's testing process using sample data.
  • Uses a basic model and a sample dataset to ensure smooth evaluation.

To run the tests, run pytest from the terminal in the project directory (Lunit_assignment).

10 Result (Train with new config file && Inference)

A new configuration file (.yaml) can be supplied either in the code, as the default argument, or from the terminal. The train option defaults to True, and training can also be requested explicitly from the terminal:

    python client.py --train --config initial_experiment2.yaml

Note: For training with a new configuration:
• Use the command above or, alternatively, set the desired configuration file as the default argument within the code.
• All configuration files are stored in the configs folder in the project's main directory.
• They are organized by name for easy reference.

Executing Inference:

    python client.py --test
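
The training and inference commands in this section imply a small argparse interface. The sketch below is only an assumption about how such a parser could look: the flag names --train, --test, and --config come from the commands in this report, while build_parser, the help strings, and the None default for --config are hypothetical. For simplicity both switches default to False here, which differs from the report's setup where train defaults to True.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in this report's commands."""
    parser = argparse.ArgumentParser(description="MNIST client (sketch)")
    # Boolean switches: present on the command line -> True, absent -> False.
    parser.add_argument("--train", action="store_true",
                        help="send a training request to the server")
    parser.add_argument("--test", action="store_true",
                        help="send an inference request to the server")
    # Optional path to a YAML configuration file used for training.
    parser.add_argument("--config", default=None,
                        help="YAML configuration file for training")
    return parser


# Parse an explicit argument list instead of sys.argv so the demo is repeatable.
demo_args = build_parser().parse_args(
    ["--train", "--config", "initial_experiment2.yaml"]
)
print(demo_args.train, demo_args.test, demo_args.config)
```

In a real client, the parsed flags would decide whether request_train(args) or request_inference() is called.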
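
As a check on the nn.py layer sizes, the fc1 input size of 9216 can be reproduced with a little arithmetic. The sketch assumes 28x28 MNIST inputs, stride 1 and no padding in both convolutions, and a 2x2 max-pooling step after conv2. The pooling step is not listed in this report; it is assumed here because it makes the numbers agree (64 x 12 x 12 = 9216). conv_out is a hypothetical helper name.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


side = 28                                   # MNIST images are 28x28 pixels
side = conv_out(side, kernel=3)             # conv1 (3x3): 28 -> 26
side = conv_out(side, kernel=3)             # conv2 (3x3): 26 -> 24
side = conv_out(side, kernel=2, stride=2)   # assumed 2x2 max pool: 24 -> 12

channels = 64                               # output channels of conv2
flat_features = channels * side * side      # 64 * 12 * 12
print(flat_features)                        # 9216, the input size of fc1
```

If the actual network uses padding or a different pooling size, the flattened size changes and fc1 would need a different input dimension.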