Uploaded by guru610datta

Breast cancer ppt

BREAST CANCER PREDICTION USING MACHINE LEARNING
BACHELOR OF TECHNOLOGY
IN
ELECTRONICS AND COMMUNICATION ENGINEERING
SUBMITTED BY
A. SAI KRISHNA (320106512001)
A. GURU DATTA (320106512002)
M. YAKSHITH (320106512030)
UNDER THE GUIDANCE OF
PROF. G SASI BHUSHANA RAO
DEPARTMENT OF ELECTRONICS AND
COMMUNICATION ENGINEERING
ANDHRA UNIVERSITY COLLEGE OF
ENGINEERING
ANDHRA UNIVERSITY
VISAKHAPATNAM-530003
2023-2024
CONTENTS :
Abstract
Introduction
Tumor types and differences
Project Methodology
Machine Learning Algorithms
Feature Extraction
Result
Conclusion
References
ABSTRACT :
• Breast cancer is a common cause of female mortality in
developing countries. Early detection and treatment are
crucial for successful outcomes.
• In this project, we propose the adoption of logistic
regression as an alternative to k-nearest neighbors (KNN)
for classification tasks.
• We present a comparative analysis of the two methods
using real-world datasets, evaluating their performance
metrics such as accuracy, precision, recall, and F1-score.
• Our findings demonstrate the potential of logistic
regression as a powerful alternative to KNN, providing
insights for practitioners seeking to improve
classification performance in their applications.
INTRODUCTION :
• Breast cancer is one of the most prevalent forms of
cancer affecting women globally, making early detection
crucial for successful treatment and improved survival
rates.
• Manual cancer identification using microscopic biopsy
images is subjective; findings can vary from expert to
expert depending on their experience and other factors.
• With the advancements in technology, machine learning
(ML) has emerged as a powerful tool in healthcare for
predicting and diagnosing various diseases, including
breast cancer.
• The automated identification of malignant tissue by
extracting features from microscopic biopsy images using
Machine Learning helps to alleviate the problems outlined
above and gives improved outcomes.
Basic types of tumors :
• Benign and malignant tumors are two fundamental
classifications of tumors, and understanding the
distinction between them is crucial in the context of
breast cancer diagnosis and prognosis.
• Both benign and malignant tumors arise from abnormal
cell growth, but they exhibit key differences in terms
of behavior, impact on surrounding tissues, and the
potential for spreading to other parts of the body.
• Malignant cells are considered cancerous. Malignant
breast cells have the potential to grow uncontrollably,
invade surrounding tissues, and spread to other parts
of the body, leading to the formation of new tumors there.
BENIGN:
• Slowly growing.
• Regular surface, capsulated.
• No spread or metastasis.
• Not attached to deep structures.
• Slight pressure effect in neighboring organ.
MALIGNANT:
• Rapidly growing.
• Irregular surface, noncapsulated.
• Spread or metastasis.
• Attached to deep structures.
• Remarkable pressure effect in neighboring organ.
Symptoms :
• A change in the size, shape or
contour of your
breast.
• A mass or lump, which may feel
as small as a pea.
• A lump or thickening in or near
your breast or in your underarm
that persists through
your menstrual cycle.
• A change in the look or feel of
your skin on your breast or
nipple.
• A marble-like hardened area
under your skin.
PROJECT METHODOLOGY:
The project follows these steps in order:
• Collection of microscopic biopsy images
• Feature extraction
• Data processing
• Train and evaluation split
• Machine learning model
• Prediction
• Classification: Suffering from Breast Cancer / Not Suffering from Breast Cancer
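The train and evaluation split and the scoring of predictions can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual code: the function names `split_train_eval` and `classification_metrics`, the 25% evaluation fraction, and the label convention (1 = malignant, 0 = benign) are all assumptions made for the example.

```python
import random


def split_train_eval(rows, labels, eval_fraction=0.25, seed=0):
    """Shuffle the sample indices and split them into train and evaluation parts.

    Hypothetical helper: the 25% evaluation fraction is an assumed default.
    """
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_eval = max(1, int(round(len(rows) * eval_fraction)))
    eval_idx, train_idx = indices[:n_eval], indices[n_eval:]
    train_X = [rows[i] for i in train_idx]
    eval_X = [rows[i] for i in eval_idx]
    train_y = [labels[i] for i in train_idx]
    eval_y = [labels[i] for i in eval_idx]
    return train_X, eval_X, train_y, eval_y


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1-score for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

In a medical setting recall on the malignant class is usually worth watching closely, since a missed malignant case (a false negative) is typically more costly than a false alarm.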
MACHINE LEARNING :
• Machine learning (ML) is a subfield of artificial intelligence
that uses statistical, probabilistic, and optimization
techniques to help computers learn from past examples and find
patterns in data sets.
• In essence, it's about teaching machines to recognize patterns
and make decisions based on data rather than being explicitly
programmed to do so.
• Machine learning
can Learning:
be broadlyAlgorithms
categorized
intofrom
several
Supervised
learn
labeled
types:
data, making predictions or decisions based on
input-output pairs provided during training.
 Unsupervised Learning: Algorithms learn from
unlabeled data to discover patterns or structures
within it, without explicit guidance on what to look
for.
EXISTING METHOD FOR BREAST CANCER DETECTION :
K-Nearest Neighbours:
• The k-nearest neighbors (KNN) algorithm is a non-parametric,
supervised learning classifier, which uses proximity to make
classifications or predictions about the grouping of an
individual data point.
• The KNN algorithm uses 'feature similarity' to predict the
values of any new data points. This means that the new point is
assigned a value based on how closely it resembles the points
in the training set.
Minkowski Distance :
Minkowski distance is a mathematical
measure of the distance between two
points in a multidimensional space,
controlled by an order parameter p.
For the k-nearest neighbors classifier,
p = 2 (Euclidean distance) is generally used.
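As a rough sketch of how KNN uses this distance, the example below computes the Minkowski distance of order p, that is (sum over i of |a_i - b_i|^p)^(1/p), and classifies a query point by majority vote among its k nearest training points. The function names, k = 3, and the toy data are assumptions for illustration only.

```python
from collections import Counter


def minkowski_distance(a, b, p=2):
    """Minkowski distance of order p; p=2 is Euclidean, p=1 is Manhattan."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_X, train_y, query, k=3, p=2):
    """Label a query point by majority vote among its k nearest neighbours."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: minkowski_distance(pair[0], query, p),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy two-feature data (made up): label 0 = benign, 1 = malignant.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_y = [0, 0, 0, 1, 1, 1]
prediction = knn_predict(train_X, train_y, (1.1, 1.0), k=3)
```

Note that every prediction scans the whole training set, which is why KNN becomes expensive as the number of samples and features grows.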
PROPOSED METHOD FOR BREAST CANCER DETECTION :
Logistic Regression:
• Logistic regression is a statistical method used for binary
classification tasks, where the target variable has two
possible outcomes (e.g., true/false, yes/no, 0/1).
• Despite its name, logistic regression is used as a
classification algorithm rather than a regression algorithm.
It models the probability that a given
input belongs to a particular category using a logistic
function (also known as the sigmoid function).
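To make the idea concrete, here is a minimal sketch of logistic regression trained by batch gradient descent on the log-loss. It is an illustration under assumptions rather than the project's implementation: the learning rate, epoch count, 0.5 decision threshold, function names, and the one-feature toy data are all chosen for the example.

```python
import math


def sigmoid(z):
    """Logistic (sigmoid) function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit weights w and bias b by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample: p(x) - y.
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Estimated probability that x belongs to the positive class."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def predict(w, b, x, threshold=0.5):
    """Turn the probability into a class label using a threshold."""
    return 1 if predict_proba(w, b, x) >= threshold else 0


# Toy one-feature data (made up): small values are class 0, large are class 1.
X = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
```

The learned weights can be read directly: a positive weight means that larger values of that feature push the predicted probability towards the positive class.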
SIGMOID FUNCTION
The logistic function is of the form:
$$p(x) = \frac{1}{1 + e^{-(x - \mu)/s}}$$
where μ is a location parameter
(the midpoint of the curve, where
p(μ) = 1/2) and s is a scale
parameter.
This expression may be rewritten
as:
$$p(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}$$
where β0 = −μ/s is known as the intercept and
β1 = 1/s is the rate parameter.
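The equivalence of the two forms can be checked numerically. The sketch below assumes the standard definitions p(x) = 1 / (1 + e^(-(x - μ)/s)) for the location-scale form and p(x) = 1 / (1 + e^(-(β0 + β1·x))) for the intercept-rate form, with β0 = -μ/s and β1 = 1/s. The function names and the sample values μ = 2.0 and s = 0.7 are illustrative assumptions.

```python
import math


def logistic(x, mu=0.0, s=1.0):
    """Location-scale form: midpoint at mu, steepness set by the scale s."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / s))


def logistic_linear(x, beta0, beta1):
    """Intercept-rate form used by logistic regression."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


def to_intercept_rate(mu, s):
    """Convert (mu, s) into the intercept beta0 and the rate beta1."""
    return -mu / s, 1.0 / s


mu, s = 2.0, 0.7  # assumed sample values
beta0, beta1 = to_intercept_rate(mu, s)
midpoint_value = logistic(mu, mu, s)  # the curve passes through 1/2 at x = mu
```

A smaller scale s gives a larger rate β1 and therefore a sharper transition between the two classes around the midpoint.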
Feature extraction:
Method used : Gray Level Co-occurrence Matrix
• GLCM stands for Gray-Level Co-occurrence Matrix. It's a
technique used in image processing to understand the texture
of an image by looking at how pairs of pixel intensities
occur next to each other.
• GLCM organizes this information into a matrix. Each cell in
the matrix represents how often two gray levels appear
together at a certain distance and in a certain direction in
the image.
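The construction of the matrix can be sketched as follows for a small image whose pixels are already quantized to a few gray levels. This is a plain-Python illustration, not the code used in the project: the function name `glcm`, its parameters, and the 4x4 sample image are assumptions. The offset (dx, dy) encodes both the distance and the direction of the pixel pair.

```python
def glcm(image, dx=1, dy=0, levels=4, symmetric=False, normed=False):
    """Count how often gray level i occurs with gray level j at offset (dx, dy).

    image: list of rows of integer gray levels in range(levels).
    symmetric: also count each pair in the reverse direction.
    normed: divide by the total count so the entries sum to 1.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    n_rows, n_cols = len(image), len(image[0])
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1
    if normed:
        total = sum(sum(row) for row in counts)
        counts = [[v / total for v in row] for row in counts]
    return counts


# Small sample image with 4 gray levels (made up for illustration).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
horizontal = glcm(image, dx=1, dy=0, levels=4)  # each pixel with its right neighbour
```

Here `horizontal[0][1]` counts how many times a pixel of level 0 has a pixel of level 1 immediately to its right. Real biopsy images would normally be quantized to a fixed number of levels before this step.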
Features Extracted :
• Contrast
• Correlation
• Dissimilarity
• Homogeneity
• Angular Second Moment
• Energy
Contrast :
Contrast in GLCM (Gray-Level Co-occurrence Matrix) refers to how much
the gray levels in an image differ from each other in neighboring
pixels.
Correlation :
Correlation in GLCM (Gray-Level Co-occurrence Matrix) is a measure of
how much the gray levels in an image are related or vary together in a
particular direction.
Dissimilarity :
Dissimilarity in GLCM (Gray-Level Co-occurrence Matrix) is a measure of
how different neighboring pixels are from each other in an image.
Homogeneity :
Homogeneity in GLCM (Gray-Level Co-occurrence Matrix) is a measure of
how uniform or smooth the texture of an image appears.
Angular Second Moment:
ASM is a measure of orderliness in an image. It is the sum of the
squared GLCM entries, so it is high when a few gray-level pairs
dominate and low when many different pairs occur equally often.
Energy :
Energy is the square root of the Angular Second Moment, computed from
the probabilities of gray-level pairs in the GLCM.
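The six features can be computed from a normalized GLCM P (entries summing to 1) as sketched below. The formulas follow the commonly used definitions, which are assumed here rather than taken from the slides: contrast = Σ P(i,j)(i-j)², dissimilarity = Σ P(i,j)|i-j|, homogeneity = Σ P(i,j)/(1+(i-j)²), ASM = Σ P(i,j)², energy = √ASM, and correlation = Σ P(i,j)(i-μi)(j-μj)/(σi·σj). The function name and the sample matrix are illustrative.

```python
import math


def glcm_features(P):
    """Compute texture features from a normalized GLCM given as a list of rows."""
    n = len(P)
    cells = [(i, j, P[i][j]) for i in range(n) for j in range(n)]
    contrast = sum(p * (i - j) ** 2 for i, j, p in cells)
    dissimilarity = sum(p * abs(i - j) for i, j, p in cells)
    homogeneity = sum(p / (1.0 + (i - j) ** 2) for i, j, p in cells)
    asm = sum(p * p for _, _, p in cells)
    energy = math.sqrt(asm)
    # Means and standard deviations of the row and column gray levels.
    mu_i = sum(p * i for i, _, p in cells)
    mu_j = sum(p * j for _, j, p in cells)
    std_i = math.sqrt(sum(p * (i - mu_i) ** 2 for i, _, p in cells))
    std_j = math.sqrt(sum(p * (j - mu_j) ** 2 for _, j, p in cells))
    if std_i * std_j < 1e-15:
        correlation = 1.0  # assumed convention for a perfectly uniform image
    else:
        cov = sum(p * (i - mu_i) * (j - mu_j) for i, j, p in cells)
        correlation = cov / (std_i * std_j)
    return {
        "contrast": contrast, "dissimilarity": dissimilarity,
        "homogeneity": homogeneity, "asm": asm,
        "energy": energy, "correlation": correlation,
    }


# Sample normalized GLCM (made up): 12 pixel pairs over 4 gray levels.
counts = [[2, 2, 1, 0], [0, 2, 0, 0], [0, 0, 3, 1], [0, 0, 0, 1]]
P = [[v / 12.0 for v in row] for row in counts]
features = glcm_features(P)
```

The resulting dictionary is one feature vector per image (per offset), which is what a classifier such as KNN or logistic regression would take as input.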
Results :
Conclusion:
• In conclusion, our comparative analysis demonstrates that
logistic regression exhibits higher accuracy in breast cancer
detection compared to k-nearest neighbors (KNN).
• The logistic regression model consistently outperforms
k-nearest neighbors (KNN) across various evaluation metrics, benefiting
from its ability to model linear relationships and provide
interpretable results.
• KNN suffers from computational complexity, especially in
high-dimensional spaces. The superior performance of logistic
regression suggests its potential utility as a predictive
tool in clinical practice, aiding in early diagnosis and
treatment planning.
• This underscores the importance of selecting appropriate
machine learning algorithms tailored to specific healthcare
tasks, with logistic regression proving to be a reliable
choice for breast cancer detection.
Future Scope:
Personalized Medicine:
• Machine learning models can be tailored to individual patient
characteristics, such as genetic markers, medical history, and
lifestyle factors.
• This personalized approach can lead to more accurate risk
assessment and treatment recommendations, improving patient
outcomes and reducing unnecessary interventions.
Early Detection and Prevention:
• Machine learning algorithms can analyse large-scale datasets to
identify subtle patterns and biomarkers associated with
early-stage breast cancer. By detecting cancer at an earlier stage,
patients can receive timely interventions, leading to better
prognosis and survival rates.
Explainable AI (XAI):
• As machine learning models become more complex, there is a
growing need for transparency and interpretability.
• Explainable AI techniques can help clinicians and researchers
understand how models make predictions, enabling them to trust
and validate the results and identify potential biases or
limitations.
References:
• “Comparing Logistic Regression to the K-nearest
Neighbors (KNN) technique, A Novel Pattern Discovery
Based Human Activity Recognition” by S. Ritesh Reddy,
Devi T.
• “Comparison of machine learning models for breast
cancer diagnosis” by Rania R. Kadhim, Mohammed Y.
Kamil
• VolcashDB: Volcanic ash particle image and
classification database January 2023
THANK YOU