Uploaded by Shohruh Rakhmatov

shohruh presentation lunit assignment

Assignment Report
Introduction
Goal: design and implement a reliable machine learning pipeline for the MNIST dataset, integrating key
components such as model storage, test cases, and an API interface. This report describes the structure,
components, and functionality of the developed pipeline.
Project Structure
Root Directory: Lunit_assignment
1. nets:
   1. nn.py
2. utils:
   1. data_loader.py
   2. util.py
3. Core implementation files:
   1. trainer.py
   2. evaluater.py
   3. server.py
   4. client.py
nets: ==> nn.py
• nn.py: Defines the architecture of the neural network used for MNIST data classification.
• Convolutional Layers:
  • conv1: The first convolutional layer takes a single-channel (grayscale) input and produces 32 output channels using a 3x3 kernel.
  • conv2: The second convolutional layer takes the 32 channels produced by conv1 as input and produces 64 output channels using a 3x3 kernel.
• Dropout Layers:
  • dropout1: Regularizes the model by randomly setting a fraction (25%) of the input units to 0 during training, helping to prevent overfitting.
  • dropout2: A more aggressive dropout layer that sets 50% of its input units to 0 during training.
• Fully Connected (Linear) Layers:
  • fc1: The first linear layer has an input feature size of 9216 and an output size of 128.
  • fc2: This linear layer reduces the feature size from 128 to 10, corresponding to the ten possible digit classes (0-9) of the MNIST dataset.
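The input size of fc1 can be checked with a little arithmetic. The sketch below relies on details the slides do not state and that are assumed here: 28x28 MNIST inputs, stride 1 and no padding in both convolutions, and a 2x2 max pooling step after conv2. Under those assumptions the flattened feature size comes out to 9216. The function names are illustrative and not taken from nn.py.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution (standard output-size formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(image_size=28, kernel=3, pool=2, out_channels=64):
    """Number of features reaching fc1 under the assumptions named above."""
    after_conv1 = conv_output_size(image_size, kernel)   # 28 -> 26
    after_conv2 = conv_output_size(after_conv1, kernel)  # 26 -> 24
    after_pool = after_conv2 // pool                     # 24 -> 12 (assumed 2x2 max pool)
    return out_channels * after_pool * after_pool        # 64 * 12 * 12


print(flattened_features())  # 9216
```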
utils: ==> data_loader.py && util.py
• The primary focus of these modules is to facilitate data loading and configuration handling for the MNIST classification task.
• Transform: converts images to PyTorch tensors (transforms.ToTensor()).
• Function: facilitates the loading of MNIST data.
  • Arguments:
    • use_cuda: A flag indicating whether CUDA should be used.
    • batch_size: The size of the batches in which data should be loaded.
  • Logic:
    • Based on the use_cuda flag, the function sets specific parameters (num_workers, pin_memory, shuffle) to optimize data loading when using a GPU.
    • Uses the transformation returned by get_transform() to preprocess the data.
    • Loads both the training and testing datasets from the ./dataset directory.
    • Uses DataLoader to prepare batches of data for both the training and testing sets.
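The flag-dependent setup described above can be sketched as a small helper that builds the keyword arguments for DataLoader. This is only an illustration: the helper name build_loader_kwargs and the concrete values (num_workers=1, pin_memory=True, shuffle=True) are assumptions, since the slides name the parameters but not their settings.

```python
def build_loader_kwargs(use_cuda, batch_size):
    """Return keyword arguments intended for torch.utils.data.DataLoader."""
    kwargs = {"batch_size": batch_size}
    if use_cuda:
        # Assumed values: the report lists these parameters without their settings.
        kwargs.update({"num_workers": 1, "pin_memory": True, "shuffle": True})
    return kwargs


cpu_kwargs = build_loader_kwargs(use_cuda=False, batch_size=64)
gpu_kwargs = build_loader_kwargs(use_cuda=True, batch_size=64)
print(cpu_kwargs)
print(gpu_kwargs)
```

The resulting dictionary would then be unpacked into the loader, for example DataLoader(dataset, **gpu_kwargs).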
trainer.py:
• trainer.py implements the model training process for MNIST classification and integrates with MLflow for experiment tracking.
Key Components:
• Initialization:
  • Determines the computation device.
  • Loads the training and test datasets.
  • Initializes the neural network and optimizer.
• Training Loop:
  • Processes data in batches, computes the loss, and updates the model weights.
  • Provides real-time feedback using a progress bar.
  • Evaluates model performance after each epoch.
• Model Logging:
  • The model is saved as "mnist_model.pt".
  • Model weights and training artifacts are logged to MLflow for tracking.
• To start training, run main.py.
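The shape of the training loop (iterate over batches, compute a loss gradient, update the weights, evaluate after each epoch) can be shown without any framework. The sketch below is a stand-in, not the code in trainer.py: it fits a one-weight linear model with mini-batch gradient descent on toy data, and all function names and values are illustrative. Only the parameter names max_epochs, lr, and batch_size come from the report.

```python
def batches(samples, batch_size):
    """Yield consecutive mini-batches from a list of (x, y) pairs."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def mean_squared_error(weight, samples):
    return sum((weight * x - y) ** 2 for x, y in samples) / len(samples)


def train(train_set, test_set, max_epochs=20, lr=0.05, batch_size=2):
    weight = 0.0
    history = []
    for _ in range(max_epochs):
        for batch in batches(train_set, batch_size):
            # Gradient of the batch mean squared error with respect to the weight.
            grad = sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)
            weight -= lr * grad  # weight update step
        # Evaluate on held-out data after each epoch.
        history.append(mean_squared_error(weight, test_set))
    return weight, history


train_set = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]  # toy data: y = 2x
test_set = [(5.0, 10.0), (6.0, 12.0)]
weight, history = train(train_set, test_set)
print(round(weight, 4), history[-1] < history[0])
```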
evaluater.py:
Functionality:
• Evaluation Mode:
  • The model is set to evaluation mode using model.eval(), which disables dropout and makes any batch normalization layers use their running statistics instead of batch statistics.
• Loss & Accuracy Computation:
  • Processes the test data and computes the loss and predictions without tracking gradients (torch.no_grad()).
  • The loss is aggregated over the test set, and accuracy is determined by comparing predictions to the actual targets.
• Results Display:
  • Outputs the average loss and the accuracy percentage for the test set.
• MLflow Logging:
  • Logs the test loss and accuracy metrics to MLflow.
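The aggregation step can be illustrated in plain Python. The sketch assumes each batch contributes a summed loss together with its predictions and targets; the function name summarize_evaluation and the sample numbers are made up for illustration and do not come from evaluater.py.

```python
def summarize_evaluation(batches):
    """batches: iterable of (summed_loss, predictions, targets), one entry per batch."""
    total_loss = 0.0
    correct = 0
    count = 0
    for summed_loss, predictions, targets in batches:
        total_loss += summed_loss
        correct += sum(1 for p, t in zip(predictions, targets) if p == t)
        count += len(targets)
    # Average loss per sample and accuracy as a percentage.
    return total_loss / count, 100.0 * correct / count


sample_batches = [
    (1.5, [7, 2, 1], [7, 2, 1]),  # all three predictions correct
    (2.5, [0, 4, 9], [0, 4, 1]),  # last prediction wrong
]
avg_loss, accuracy = summarize_evaluation(sample_batches)
print(f"Average loss: {avg_loss:.4f}, Accuracy: {accuracy:.1f}%")
```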
server.py:
• server.py implements a FastAPI server providing an API interface for training and prediction tasks using the MNIST model.
Key Components:
• Initialization:
  • load_model(model_path): Loads the MNIST model from a specified path and sets it to evaluation mode.
  • On server startup, the model is pre-loaded so that it is ready for inference.
• Endpoints:
  • Training (/train/): Accepts an uploaded configuration file to initiate model training, saving the configuration for reproducibility. Returns a success message upon completion.
  • Prediction (/predict/): Receives an image file, preprocesses it, and returns the model's digit prediction.
• Server Launch: When executed directly, the script uses uvicorn to run the server on all interfaces at port 8000.
client.py:
• client.py serves as a client interface to the FastAPI backend for MNIST model operations, specifically training and inference.
Key Components:
• Inference Request - request_inference():
  • Data Preparation: Loads a test image from the MNIST dataset.
  • API Interaction: Sends the image to the /predict endpoint of the server to obtain a digit prediction.
  • Response Handling: Displays the model's prediction and any other associated metadata received from the server.
• Training Request - request_train(args):
  • Configuration File Handling: Reads the configuration file provided through command-line arguments.
  • API Interaction: Sends the configuration file to the /train endpoint of the server to initiate the model training process.
  • Response Handling: Outputs the server's response, typically an acknowledgment that training has completed.
• Command-Line Interface:
  • Uses the argparse library to let users specify whether they want to initiate a training or a testing request. Users can also provide a specific configuration file for the training process.
• Error Handling && Execution Mechanism
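A command-line interface of this kind could look like the following sketch. The flags --train, --test, and --config match the commands shown in this report. The default config path, the help texts, and the helper choose_action are hypothetical, and the defaults in the actual client.py may differ.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Client for the MNIST FastAPI server")
    parser.add_argument("--train", action="store_true", help="send a training request")
    parser.add_argument("--test", action="store_true", help="send an inference request")
    parser.add_argument(
        "--config",
        default="configs/default.yaml",  # hypothetical default path
        help="configuration file used for training",
    )
    return parser


def choose_action(args):
    """Decide which request to send based on the parsed flags."""
    if args.train:
        return "train"
    if args.test:
        return "test"
    return "none"


args = build_parser().parse_args(["--train", "--config", "initial_experiment2.yaml"])
print(choose_action(args), args.config)
```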
tests:
• The tests folder contains checks that ensure parts of the MNIST project work as intended.
Key tests (functions):
• Test_config_loading.py: Config Loading
  • Checks if the configuration file loads correctly.
  • Confirms essential settings like max_epochs, lr, and batch_size are present.
• Test_data_loader.py: Data Loading
  • Makes sure data is loaded correctly for training.
  • Verifies the shape and size of the loaded data batches.
• test_model.py: Model Verification
  • Initializes the neural network model.
  • Ensures the model has the expected number of layers or modules.
• test_tester.py: Evaluation Test
  • Validates the model's testing process using sample data.
  • Uses a basic model and sample dataset to ensure smooth evaluation.
To run the tests: run pytest from the terminal in the project directory (Lunit_assignment).
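A config-loading check in the spirit of the first test file might look like the sketch below. It uses a plain dictionary in place of a loaded YAML file so that it has no dependencies. The helper missing_keys, the test names, and the sample values are illustrative; only the key names max_epochs, lr, and batch_size come from the report.

```python
REQUIRED_KEYS = ("max_epochs", "lr", "batch_size")


def missing_keys(config):
    """Return the required settings that are absent from a loaded config."""
    return [key for key in REQUIRED_KEYS if key not in config]


def test_config_has_required_keys():
    config = {"max_epochs": 5, "lr": 0.01, "batch_size": 64}  # illustrative values
    assert missing_keys(config) == []


def test_missing_key_is_reported():
    assert missing_keys({"max_epochs": 5, "lr": 0.01}) == ["batch_size"]


# pytest would collect the test_ functions automatically; calling them directly also works.
test_config_has_required_keys()
test_missing_key_is_reported()
```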
Result (Train with new config file && Inference)
Training with a new configuration:
1. Provide the new config file (.yaml) either as the default argument in the code or on the command line.
2. The train option defaults to True in the code; it can also be passed on the command line:
python client.py --train --config initial_experiment2.yaml
Note:
• All configuration files are stored in the configs folder in the project's main directory.
• They are organized by name for easy reference.
Executing Inference:
python client.py --test