Group Members:
Uriel Anjelo A. Macaspac
Franz Leann U. Ferry
Julliane Pierre L. Oplado
Methodology
This section describes the dataset chosen for the research, the design of the detection model,
and the image processing techniques employed, all following the research methodology of
Dutta et al. (2024). Dataset selection is crucial because it forms the empirical basis of the
research and determines the relevance and accuracy of the data used. The design of the detection
model also draws on the methodology of Dutta et al. (2024) to improve its efficacy, in line with
the broader scope of disorders or illnesses that will be subject to detection.
The specifications of the machine that will be used for the research are shown
in Table _._. The programming language that will be used is Python, and the development
environments used for the algorithm are PyCharm and Jupyter Notebook.
Table _._ Machine Specifications
Component          Specification
Processor          Intel® Core™ i7-12700H
RAM                16.0 GB
Graphics Card      NVIDIA RTX 3060
System Type        x64-based processor
Operating System   Windows 11 Home
Conceptual Framework
The conceptual framework uses the MBTD (Medical Brain Tumor Detection) dataset and follows
a systematic approach to developing and evaluating a brain tumor detection model. The
framework proceeds in a defined sequence, beginning with image processing using Convolutional
Neural Networks (CNNs) to extract relevant features from the MRI scans. The dataset is then
divided into training and testing sets, so that the model is trained on one portion of the data
while the rest is reserved for testing and validation. To ensure the robustness of the model,
K-fold cross-validation is employed, so that the detection model is tested on different subsets
of the dataset.
The next step is training the brain tumor detection model using the CNN architecture,
building on the features produced in the image processing stage. Once the model has been
trained, it is exported for further use in a Jupyter Notebook environment. In the testing and
validation phase, the model's performance is evaluated using datasets from various sources,
including BraTS, Figshare, and Kaggle. These external datasets provide a real-world assessment
of the model's generalizability and effectiveness in detecting brain tumors. Following this
conceptual framework systematically allows a robust brain tumor detection model to be developed
and validated, contributing to advancements in medical imaging and healthcare applications.
Detection Model targeting 5 diseases/disorders visible in a brain MRI scan.
Data Gathering
The study proposes the use of three publicly available datasets to verify the efficacy of the
model. The researchers also propose to draw these datasets from websites that host publicly
available data, as the format or style of the MRI scans is the same. This not only makes data
gathering more efficient but also gives the study a more diverse dataset.
Based on the research by Dutta et al. (2024), the study proposes the use of the MBTD
and BraTS datasets. Other publicly available datasets, such as those from Kaggle and
Figshare, will be used for testing to further verify the model. The dataset will have 5
brain tumor classifications, which will be determined after further research.
Detection Model
The study proposes the use of 5-fold and 10-fold cross-validation to produce the
training-testing splits used for training the detection model. The models trained under each
scheme will then be compared to determine which performs better on the processed
dataset. Data augmentation techniques will also be used to deal with overfitting, as was
done in the research by Dutta et al. (2024). The augmentation techniques will be specified after
further research.
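The 5-fold and 10-fold splitting described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the study's final implementation. The function name k_fold_indices, the seed, and the sample count of 23 are hypothetical placeholders, and the sketch does not stratify by tumor class.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Hypothetical usage: 23 placeholder scans split under both proposed schemes.
for k in (5, 10):
    folds = list(k_fold_indices(n_samples=23, k=k))
    print(k, "folds, test-set sizes:", [len(test) for _, test in folds])
```

Each scan appears in exactly one test fold, so every image is used for testing once and for training k - 1 times. A library routine could replace this sketch once the tooling for the study is fixed.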
For image preprocessing, the study proposes the use of a CNN. The researchers have not yet
settled on an exact model but are leaning toward ResNet, which is designed to address the
gradient issues that arise in deep image processing networks. This part will be updated after
further research on CNNs. For the detection model, the researchers also plan to use a CNN.
Two models will therefore be created, one for image preprocessing and one for
detection. Exporting them as separate models will let the researchers process images from
different datasets and test the detection model in parallel. The researchers will also study
other related literature to ensure that the methodology is founded on verified methods and
recommendations from other research.
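The idea behind ResNet that motivates this choice is the skip connection: a block outputs F(x) + x rather than F(x) alone, so the input always has a direct path through the block. The toy sketch below shows only that arithmetic on a plain list of numbers. It is not a trainable network, and the names residual_block and toy_transform are hypothetical stand-ins for the learned layers.

```python
def residual_block(x, transform):
    """Return transform(x) + x element-wise (the skip connection)."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("transform must preserve the input length")
    return [f + xi for f, xi in zip(fx, x)]


def toy_transform(x):
    # Hypothetical stand-in for the learned layers F(x): scale by 0.5, then ReLU.
    return [max(0.0, 0.5 * v) for v in x]


features = [1.0, -2.0, 3.0]
output = residual_block(features, toy_transform)
print(output)
```

If the transform contributes nothing (all zeros), the block reduces to the identity, which is why stacking many such blocks does not by itself degrade the signal.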
Testing and Validation
For testing and validation, the researchers plan to compute all four cells of the 2x2
confusion matrix: true positives, true negatives, false positives, and false negatives.
Precision and accuracy will then be computed from these counts. The model will also be tested
and validated using different datasets, including the dataset used by Dutta et al. (2024), to
determine whether it performs better, worse, or the same.
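The confusion-matrix counts and the two metrics named above can be computed as in the following sketch. It assumes binary labels (1 for tumor, 0 for no tumor), and the label lists are made-up placeholders rather than study results. Applying it to the five planned classes would require a one-vs-rest treatment per class, which is an assumption here and not something the study has specified.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn) for binary labels."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def precision(tp, fp):
    """Fraction of positive predictions that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Placeholder labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("TP, TN, FP, FN:", tp, tn, fp, fn)
print("accuracy:", accuracy(tp, tn, fp, fn))
print("precision:", precision(tp, fp))
```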