
Bean Crop Disease Detection Thesis

ADMAS UNIVERSITY
POSTGRADUATE SCHOOL
MSC PROGRAM
BEAN CROP DISEASE DETECTION USING A MACHINE LEARNING
ALGORITHM
A Thesis Submitted to
the Department of Computer Science for the Partial Fulfillment of the
Requirements for the Degree of Master of Science in Computer Science
By
AGMAS GETNET
AUGUST 2020
ADDIS ABABA, ETHIOPIA
Declaration
I, Agmas Getnet, the undersigned, declare that this thesis entitled “BEAN CROP DISEASE
DETECTION USING A MACHINE LEARNING ALGORITHM” is my original work. I have
undertaken the research work independently with the guidance and support of the research
advisor. This study has not been submitted for any degree or diploma program in this or any
other institution, and all sources of material used for the thesis have been duly
acknowledged.
Declared by:
Name____________________________
Signature: ________________________
Department: ______________________
Date: _____________________________
Certificate of Approval of Thesis
School of Postgraduate Studies
Admas University
This is to certify that the thesis prepared by Agmas Getnet, entitled “BEAN CROP DISEASE
DETECTION USING A MACHINE LEARNING ALGORITHM” and submitted in partial
fulfilment of the requirements for the Degree of Master of Science in Computer Science, complies
with the regulations of the University and meets the accepted standards with respect to
originality and quality.
Name of Candidate: ___________________: Signature: _____________Date: _____________.
Name of Advisor: _____________________: Signature: ______________Date: ____________.
Signatures of the Board of Examiners:
External examiner: ____________________Signature: ____________Date: ____________.
Internal examiner: ____________________Signature: ____________Date: ____________.
Dean, SGS: __________________________Signature: ____________Date: ____________.
ABSTRACT
Bean is one of the most widely grown crops in the world. The crop is prone to various diseases
such as rust, bacterial blight, angular leaf spot, Alternaria leaf spot, web blight and root rots.
Among these, bacterial blight, caused by Xanthomonas axonopodis pv. phaseoli, is the most
dangerous and widely occurring disease. Nowadays farmers and agricultural experts identify
disease symptoms by eye, but they cannot differentiate the types of disease at the earliest stage
of development. To identify the type of disease, farmers need guidance from an expert, which
helps to minimize time and cost and to identify the disease type correctly. To limit the damage,
a disease should be detected at its primary stage of development so that pesticide can be sprayed
on the diseased plants. Once a disease progresses beyond its earliest stage of development, it
cannot be controlled easily. To solve these problems, a novel automated computer-based system
is proposed for the classification and early detection of diseases on bean crops, using image
processing with an image segmentation algorithm, K-means clustering, and a machine learning
algorithm, the SVM classifier. To identify bean crop diseases, we followed the steps of image
acquisition, image preprocessing, segmentation, feature extraction and classification. First, we
collected images as input data. Second, in the preprocessing step, noise was removed and the
images were resized. Third, in the segmentation step, the images were divided into three clusters
(1, 2 and 3). Fourth, features such as texture were extracted from the images. Lastly, the images
were classified into the disease group to which they belong using the support vector machine
classifier, with an accuracy of 96.77%.
Keywords: Bean, Bacterial Blight, Image processing, K-means clustering, SVM.
ACKNOWLEDGEMENT
First and foremost, thanks to God, the almighty, for his blessings throughout my research work.
I would like to express my gratitude to my advisor Dr. Henok Mulugeta for his invaluable
guidance, sincerity and motivation to accomplish the research.
I am highly indebted to the Admas University Postgraduate School for its guidance and constant
supervision and for providing necessary information regarding this research, and I thank the
Ethiopian Institute of Agricultural Research, Debre Zeyt branch, for the information it provided.
I would like to thank my parents. Thank you, my mother, Mastewal Wondmnew, for urging me
to start the master’s program and for your unremitting motivation, and thanks to my sister
Yealemmebrat Getnet.
I would like to thank my friends Addis Tsega, Addisu Gizachew, Meron Kassa and Shewakena
Getnet for their support, inspiration, stimulating discussion and impetus.
TABLE OF CONTENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS
CHAPTER ONE
INTRODUCTION
1.1. Background
1.2. Statement of problem
1.3. Objectives
1.3.1. General objective
1.3.2. Specific objectives
1.4. Scope and limitation of the research
1.4.1. Scope of the research
1.4.2. Limitation of the research
1.5. Significance of the study
1.6. Research organization
CHAPTER TWO
LITERATURE REVIEW
2.1. Introduction
2.2. Bean Diseases
2.3. Digital Image Processing
2.3.1. Image processing methods
2.3.2. Fundamental Steps of Digital Image Processing
2.3.3. Types of image processing
2.4. Related sources
CHAPTER THREE
RESEARCH METHODOLOGY
3.1. Experimentation Tools
3.2. Algorithm
3.3. Analysis and Design
3.4. Data Collection and Dataset Preparation
3.4.1. Data Collection
3.4.2. Dataset Preparation
3.5. Sampling techniques
3.6. Materials and methods
3.7. Evaluation Technique
CHAPTER FOUR
PROPOSED SYSTEM MODEL
4.1. System Architecture
4.2. Tasks of Image processing
4.2.1. Image Acquisition
4.2.2. Image Preprocessing
4.2.3. Image segmentation
4.2.4. Feature extraction
4.2.5. Classification
CHAPTER FIVE
RESULTS AND DISCUSSIONS
5.1. Introduction
5.2. Data Set
5.3. Testing Techniques on MATLAB
5.4. Implementation
5.4.1. Stage One: Image acquisition
5.4.2. Stage Two: Image Preprocessing
5.4.3. Stage Three: Image segmentation
5.4.4. Stage Four: Feature extraction
5.4.5. Stage Five: Classification
CHAPTER SIX
CONCLUSION AND RECOMMENDATION
6.1. Conclusion
6.2. Recommendation
References
LIST OF FIGURES
Figure 2-1: Phases of plant disease detection system
Figure 2-2: Block diagram for Image Processing
Figure 2-3: Framework of image processing operation
Figure 2-4: Low level Image processing
Figure 2-5: Middle level image processing
Figure 4-1: Proposed system architecture
Figure 4-2: Flow chart to classify images
Figure 5-1: Conversion of RGB2HSI
Figure 5-2: Preprocessing images
Figure 5-3: Conversion of image to R, G and B and Histogram of the R, G and B
Figure 5-4: Histogram equalization
Figure 5-5: Conversion of RGB to L*a*b color
Figure 5-6: K-means clustering
Figure 5-7: Bean Crop disease detection GUI
Figure 5-8: Accuracy and Error rate detection of diseases
LIST OF TABLES
Table 4-1: Summary of different segmentation techniques
Table 4-2: Summary of different color techniques
Table 4-3: Summary of different texture feature extraction techniques
Table 4-4: Summary of different classifiers
Table 5-1: Accuracy value for each disease detection (%)
ABBREVIATIONS
SVM: SUPPORT VECTOR MACHINE
LAC: LATIN AMERICA AND THE CARIBBEAN
BCMV: BEAN COMMON MOSAIC VIRUS
BGMV: BEAN GOLDEN MOSAIC VIRUS
CBB: COMMON BACTERIAL BLIGHT
KG HA-1: KILOGRAM PER HECTARE
GSM: GLOBAL SYSTEM FOR MOBILE
GPRS: GENERAL PACKET RADIO SERVICES
KNN: K-NEAREST NEIGHBOUR
ANN: ARTIFICIAL NEURAL NETWORK
RGB: RED, GREEN, BLUE
HSI: HUE, SATURATION, INTENSITY
MRMR: MINIMUM REDUNDANCY MAXIMUM RELEVANCE
SGDM: SPATIAL GRAY-LEVEL DEPENDENCE MATRICES
FCM: FUZZY C-MEANS
IEEE: INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS
PNN: PROBABILISTIC NEURAL NETWORK
RBF: RADIAL BASIS FUNCTION
EIAR: ETHIOPIAN INSTITUTE OF AGRICULTURAL RESEARCH
TP: TRUE POSITIVE
TN: TRUE NEGATIVE
FN: FALSE NEGATIVE
FP: FALSE POSITIVE
GLCM: GREY LEVEL CO-OCCURRENCE MATRIX
GDP: GROSS DOMESTIC PRODUCT
JPG: JOINT PHOTOGRAPHIC EXPERTS GROUP
CHAPTER ONE
INTRODUCTION
1.1. Background
Agriculture provides employment to a large share of the world's population, and its products are
a key need of every citizen. The study of agriculture, known as the agricultural sciences, has led
to many developments in the field, such as scientific farming and the use of high-end technology
in growing crops [1]. Bean is one of the most widely grown crops in the world. In 2010, global
bean production was approximately 23,816,123 tons, with 24.4% and 17.7% of world production
in LAC and Africa, respectively [2]. Bean is an important source of nutrients for about 500
million people in parts of Africa and Latin America, representing 65% of total protein consumed
and 32% of energy [3]. Minerals and nutrients such as iron, phosphorus, magnesium, potassium,
calcium, zinc and folate (a B vitamin) are found in beans and contribute to a balanced, healthy
diet [4]. It is believed that faba bean was introduced to Ethiopia in the 16th century by the
Portuguese [1]. The economic significance of bean in Ethiopia is considerable, since it is one of
the major food and cash crops. Small-scale farmers often grow it as a cash crop, and it is a major
food legume in many parts of the country, where it is consumed in different types of traditional
dishes [5].
Even though bean is susceptible to various diseases and pests such as Alternaria leaf spot,
bacterial blight, Cercospora yellow spot and red spider mite [6], it is still a source of nutrients
for about 500 million people in parts of Africa and Latin America, representing 65% of total
protein consumed and 32% of energy [5]. Yearly global bean production is approximately 12
million metric tons, with 5.5 and 2.5 million metric tons in Latin America and the Caribbean
(LAC) and Africa, respectively [7]. Faba bean is grown throughout Ethiopia and is an
increasingly important commodity in the cropping systems of smallholder producers (the average
farm size for smallholder farmers is between 0.25 and 0.5 hectares) for food security and income.
It also has health benefits because it is rich in protein (about 23% for dried shelled beans and
about 6% for green beans) and is a good source of iron and zinc (both of which are key elements
for mental development). According to a report of the Central Statistical Agency, in 2016 the
area covered by bean production in Ethiopia was 113,249.95 ha for white bean and 244,049.94 ha
for red bean, a total area of 357,299.89 ha, with total production of about 540,238.94 tons and a
national average yield of 1600 kg/ha. Bean is mainly grown in the eastern, southern,
south-western and Rift Valley areas of Ethiopia [5]. The production constraints reported in the
literature for beans are poor agronomic practices, soil infertility, lack of improved cultivars,
moisture stress, weed competition, and damage caused by pests and diseases [8][9].
Rust (Uromyces appendiculatus (Pers.) Unger), anthracnose (Colletotrichum lindemuthianum
(Sacc.) Magnus), angular leaf spot (Phaeoisariopsis griseola (Sacc.) Ferr.), web blight (Rhizoctonia
solani pv. phaseoli (Kuhn)), root rots (Fusarium solani pv. phaseoli (Mart.) Sacc.), bean common
mosaic virus (BCMV) and bean golden mosaic virus (BGMV) are also major diseases that have
been identified and that cause considerable yield reduction in Ethiopia. Bacterial blight (BB) is a
significant seed-borne disease of bean, caused by the gram-negative bacterial pathogen
Xanthomonas axonopodis pv. phaseoli (Xap) and its fuscans variant Xanthomonas fuscans subsp.
fuscans (Xff) [7]. Bacterial blight affects the foliage, pods and seeds of faba bean and is
considered a major problem in most faba bean production areas of the world. During extended
periods of warm and humid weather, the disease can be highly destructive and causes losses in
both yield and seed quality of bean in many production areas of Ethiopia [5].
Bean bacterial blight is reported as one of the main obstacles to faba bean production throughout
the country. However, its prevalence varies with growing area and season. For instance, in
Hararghe, eastern Ethiopia, each one percent increase in bacterial blight severity at physiological
maturity of the crop caused seed yield losses of about 5.2 kilograms per hectare (kg ha-1) in
broadcast cropping and 9.1 kg ha-1 in mixed intercropping. At flowering, each one percent
increase in bacterial blight severity caused yield reductions of 38.8 kg ha-1 in pure stands and
71.1 kg ha-1 in row intercropping in this area [7]. Therefore, given the usefulness of the bean
crop to the country, the severity of bacterial blight attacks, the difficulty of disease detection and
the effort farmers expend on detecting bean diseases, we propose a solution to these problems:
detecting bean crop diseases using image processing and a support vector machine classifier.
1.2. Statement of problem
In Ethiopia, crop productivity is declining because of crop diseases [10]. Many crop diseases can
reduce agricultural production, causing tremendous losses for farmers. It is difficult to detect
these diseases with the human eye. At present there is no better way to detect crop diseases than
observation, or guessing the disease from symptoms seen previously in other crops or areas.
Therefore, rather than losing money, energy and time trying to identify bean crop diseases with
the naked eye, a solution is to detect the disease type using machine learning and to notify
farmers of the result. Even if some diseases are visible to humans, detecting and classifying bean
crop diseases is not an easy task, as it requires continuous monitoring, which can be exhausting
and expensive. Moreover, it is not advisable to wait until the symptoms are visible before taking
action to treat them; by then it might be too late to act.
The problems described above motivated us to use image processing techniques to resolve these
issues. We undertook this research because of the bean crop disease factors stated above, in the
belief that there must be a solution that detects the disease early with an automated system that
identifies the disease level of the crop leaves. These obstacles also prompted us to investigate a
method that could be used to detect bacterial blight, Alternaria leaf spot and halo blight of bean
crops. We also add one additional class covering different bean crop diseases and detect healthy
bean crops as healthy, unlike the work done in the research paper “A Novel Approach to Classify
and Detect Bean Diseases based on Image Processing” [11]. Since beans are vital to the existence
of both humans and animals, farmers should be supplied with the best modern technologies.
These technologies should be capable of identifying and classifying a wide variety of bean
diseases in a short time. Detecting bean diseases at early stages can reduce crop losses
significantly [11]. Image processing is necessary in these cases. The algorithms used in image
processing make it possible to detect bean crop diseases automatically.
Research question
The research answers the following question.
• How can a machine learning algorithm be used to classify bean crop diseases?
1.3. Objectives
1.3.1. General objective
The main objective of this research is to increase healthy crop production in Ethiopia by
detecting bean crop diseases using machine learning algorithms.
1.3.2. Specific objectives
• To design a model to detect diseases of bean crops
• To analyze the images and obtain accurate results in detecting bean crop diseases
• To evaluate the performance and results of the selected models
• To report the results of the study and recommend future research work
• To inform and sensitize farmers and/or small-scale seed enterprises to possible diseases
1.4. Scope and limitation of the research
1.4.1. Scope of the research
The thesis focuses on the design and development of a model for detecting bacterial blight,
Alternaria leaf spot and halo blight on bean crops.
1.4.2. Limitation of the research
The research applies specifically to the detection of leaf diseases of Ethiopian bean crops. It
would be desirable to make the system available to every farmer over GSM and GPRS networks,
but the research was carried out only in the laboratory. It also requires an expert trained in every
bean disease type and in image processing.
1.5. Significance of the study
Identifying bean crop diseases with image processing, combined with machine learning and
computer vision, is a promising technology. Because bean plants are numerous on a farm,
detecting the diseases that affect them with the naked eye takes a lot of time and effort, is less
accurate, and can be applied only to a limited area. In contrast, automatic disease identification
techniques that can be deployed digitally take less time and effort, are more accurate, and cover a
large area.
This research will help agricultural experts appreciate the value and importance of computer
vision in agriculture. It will also help increase harvests, because diseases are detected early
without the need to find agricultural experts, and reduce the cost of experts for the continuous
care of crops on very large farms. The outcome of this research will also help the authorities take
proper measures in situations where bacterial blight, halo blight or Alternaria leaf spot is
present. Finally, this thesis will serve as reference material for researchers working in computer
vision, especially on crop disease identification.
1.6. Research organization
The rest of the thesis is organized as follows. The second chapter reviews related literature,
which provides additional input and strengthens the ideas proposed here. The third chapter
covers the research methodology, describing the methods and techniques followed to carry out
the work. The fourth chapter presents the proposed model and describes the system architecture,
showing the flow of actions by which bean crop disease detection has been implemented. The
fifth chapter presents the implementation of the research in light of the problems identified, the
objectives and the proposed methodology, and evaluates the algorithms selected to identify the
diseases. Finally, the thesis closes with conclusions and recommendations based on the overall
outcome of the research and the researcher's thoughts on future work.
CHAPTER TWO
LITERATURE REVIEW
2.1. Introduction
In digital image processing, a machine vision system starts with image acquisition. After the
images are captured, the system follows a number of processes to reach its goal. Researchers
have used machine learning algorithms (such as SVM, KNN and ANN) to detect and classify
plant diseases. SVM is a discriminative classifier formally defined by a separating hyperplane;
it finds the separator with the maximum margin to improve the performance of the classifier
[12]. The K-means algorithm has been used for segmenting images of Alternaria leaf spot,
bacterial blight and Cercospora yellow spot only; diseases with distinctive marks, such as the
webbed spots of spider mite, cannot be segmented using the K-means algorithm [13]. K-means is
a kind of self-adaptive search algorithm based on partitioning. It can segment an image by color,
dividing the different parts into different clusters. With the K-means clustering algorithm, the
different parts can then be handled conveniently [14].
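The partition-based clustering described above can be sketched in a few lines. This is an illustrative example only, not the thesis implementation (the thesis names MATLAB as its tool): the function names `sq_dist` and `kmeans`, the farthest-point initialisation, the sample pixel values and the choice of clustering raw RGB triples rather than a converted color space are all assumptions made for the sketch.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two color triples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(pixels, k, iterations=10):
    """Cluster color triples into k groups; returns (centroids, labels)."""
    # Assumption: deterministic farthest-point initialisation, so the
    # sketch needs no random seed. Real toolboxes usually pick randomly.
    centroids = [pixels[0]]
    while len(centroids) < k:
        centroids.append(
            max(pixels, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in pixels
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centroids, labels


# Hypothetical pixels: two leaf-green, two brown (lesion-like), two yellow.
pixels = [
    (40, 160, 50), (45, 170, 55),
    (120, 80, 30), (125, 85, 35),
    (230, 220, 90), (235, 225, 95),
]
centroids, labels = kmeans(pixels, k=3)
print(labels)
```

With three clusters, pixels of similar color end up with the same label, so each cluster can then be inspected or processed separately.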
Several research papers on plant disease detection are briefly reviewed here; they can be
classified into two main categories. The first category focuses on detecting specific diseases on a
certain plant or group of plants, together with some of the algorithms needed to implement each
step of image processing. The second category describes the main steps needed for detecting
plant diseases. A description of several techniques currently used for detecting plant diseases is
provided in the research paper "Detection and classification of plant leaf diseases using image
processing techniques: A review" [15]. The implemented system determines whether the plant is
healthy, the disease name, and the percentage of infected area in the leaf. The authors did not
focus on a certain plant or disease; their main aim was to increase the accuracy of plant disease
detection. To do so, they used a non-linear classifier, SVM. The author M. T. [16] demonstrates
the image processing steps that should be used for detecting plant diseases. These steps include
image acquisition, image preprocessing, image segmentation, feature extraction, and
classification of diseases. An explanation of different algorithms for implementing image
segmentation, feature extraction and classification is also given. These algorithms include
k-means clustering, color co-occurrence, and neural networks.
The research paper titled “Agricultural plant leaf disease detection using image processing” [17]
explains that choosing the best classification technique for detecting plant diseases is a problem,
because each classifier gives different results for different types of input data. Several
classification techniques are therefore explained in detail, along with their advantages and
disadvantages. From the researchers' point of view, although k-nearest neighbor is the simplest
classifier of all, it takes a long time to make predictions and can be affected by irrelevant
parameters. A survey of different approaches to detecting plant diseases is given in the paper “A
novel approach for the detection of plant diseases”, IJCSMC, 2016 [18]. The researchers' purpose
was to explain the algorithms used for each step of image processing. The segmentation
algorithms mentioned were K-means clustering, the Otsu method, and conversion of the RGB
image to the HSI model, while co-occurrence and mRMR methods were used for feature
extraction. The authors mentioned k-nearest neighbor, ANN, fuzzy logic and SVM as techniques
used for classifying the detected disease.
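The two weaknesses of k-nearest neighbor noted above (slow prediction and sensitivity to irrelevant parameters) can be seen in a minimal sketch. The function name `knn_predict`, the feature values and the labels are hypothetical and chosen only for illustration; they are not taken from the thesis dataset.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train is a list of (feature_tuple, label) pairs.
    """
    # Every prediction measures the distance to all training samples,
    # which is why k-NN becomes slow as the training set grows.
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical samples: first feature is informative (for example, the
# fraction of discolored pixels); second feature is irrelevant noise on a
# much larger scale.
train = [
    ((0.10, 50.0), "healthy"),
    ((0.15, 10.0), "healthy"),
    ((0.20, 90.0), "healthy"),
    ((0.80, 0.0), "diseased"),
    ((0.85, 100.0), "diseased"),
    ((0.90, 55.0), "diseased"),
]

# Using only the informative feature, the query is classified correctly.
relevant_only = [((features[0],), label) for features, label in train]
clean_prediction = knn_predict(relevant_only, (0.85,))

# With the irrelevant feature included, it dominates the distance and
# the same query is misclassified.
noisy_prediction = knn_predict(train, (0.85, 50.0))

print(clean_prediction, noisy_prediction)
```

In this toy data the unscaled noise feature outweighs the informative one, which is one reason feature selection or scaling is usually applied before k-NN.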
A general procedure for detecting leaf diseases is provided in the paper "Plant disease detection
using image processing” [19]. The image should be preprocessed to extract the useful
information from it. Color transformation is therefore needed to convert the RGB image, which is
a color generator, into HSI, a color descriptor. Next, the green pixels are masked and removed,
and then the image is segmented. After segmentation, the Spatial Gray-level Dependence
Matrices (SGDM) method is applied to extract the texture features of the leaf before passing
them to a classifier. The research paper written by M. T. [16] did not specify an approach for a
specific type of plant or disease, whereas we do. The authors of "Detection and classification of
plant leaf diseases using image processing techniques: A review” [15] did not focus on a certain
plant or disease, but discussed several diseases and plants. These researchers did not focus on the
dangerous bean crop diseases bacterial blight, Alternaria leaf spot and halo blight, which attack
beans and reduce the productivity of Ethiopia; we focus on them. Here we therefore investigate a
solution for bacterial blight, halo blight and Alternaria leaf spot, which have not previously been
considered for bean crops by other local or international researchers. We focus on detecting the
disease type and calculating the percentage of the bean leaf that is infected. The research
classifies bean crops into two groups, diseased (bacterial blight, halo blight and Alternaria leaf
spot) and healthy, while increasing the accuracy of the classifier.
2.2. Bean Diseases
Bacteria infect the seed either by passing through the vascular system of the pedicel of the pod,
the vascular tissues of the pod, and then the funiculus, or by growing from an external infection
of the pod through the parenchyma, and into the conducting tissues. Infections of the seed in
either case are usually not deep, but primarily in the surface to subsurface regions of the
cotyledons.
Bacterial Blight
It is one of the most severe diseases of pomegranate and bean and is caused by bacteria. It
shows up to 100% severity in some orchards. The symptoms are initially found on the stem
and gradually spread to the leaves and later to the fruits. On fruits, brown-black spots appear on
the pericarp, with cracks passing through those spots. Because the bacteria survive on the tree and
on diseased fallen leaves, the disease spreads to healthy plants in the area through wind-splashed
rain and infected cuttings. High temperature and relative humidity favor the disease[20]. It is also a
typical leaf spot and leaf blight disease. At first, the lesions on leaves are small, translucent,
water-soaked spots which later develop dry brown centers and narrow yellow halos. Lesions
coalesce into irregularly-shaped areas which may include the whole leaflet. Lesions on stems and
pods tend to be more restricted and sunken. Those on the pods frequently turn from red to brown
with age. With continued development, the vascular system may also turn brown and surface
cankers form on the stem. This bean disease cannot be diagnosed, with certainty, in the field.
This disease must be distinguished from four other bacterial diseases of bean which have
overlapping symptoms. This problem of differentiating different diseases from overlapping
symptoms is commonly encountered in the positive diagnosis of plant diseases.
Common bacterial blight symptoms include a scalded appearance of the leaf tissue and
water-soaked spots. These small, irregularly shaped lesions often enlarge to 1 inch or more and
form dark brown lesions along the edge of the leaflet. A narrow lemon-yellow margin often
surrounds these lesions, and large portions of the foliage can be infected. Infected pods exhibit
circular, water-soaked areas that often produce yellow masses of bacterial ooze. Later, the spots dry
and appear as reddish-brown lesions. Pod infection often causes discoloration, shriveling and
bacterial contamination of seeds; however, some seeds may appear healthy[21]. Bacterial blight is
characterized by small, pale green spots or streaks that appear water-soaked. The lesions
expand and then appear as dry dead spots[22], and may extend along the full length of the leaf.
Bacterial blights, caused by various species of bacteria, occur in most of the bean-growing areas of
the world[23]. Under favorable weather conditions, these bacteria can spread rapidly through a field,
causing defoliation and pod damage.
Bacterial blight (BB) is a significant seed-borne disease of Faba bean, caused by the gram-negative
bacterial pathogen Xanthomonas axonopodis pv. phaseoli (Xap) and its fuscans variant
Xanthomonas fuscans subsp. fuscans (Xff). Both strains cause identical symptoms, but
Xanthomonas phaseoli var. fuscans has been reported to be more aggressive[24]. In Ethiopia, it
is ranked among the most important and widespread diseases of Faba bean. It is also reported as
one of the main constraints to Faba bean production throughout the country. In pomegranate,
bacterial blight is caused by the bacterium Xanthomonas axonopodis pv. punicae[25]. It shows its
presence on leaves, stems and fruits and reduces yield by nearly 65% to 75%. The disease is not
easily curable by antibiotics or other chemicals and needs periodic day-to-day observation. Only
this helps in finding the infection in its early stages, so that the farmer can take precautionary
measures to help the plants overcome the infection.
Halo blight
The researcher K.W. described the bean diseases that frequently attack the crop as
halo blight, fuscous blight and bacterial brown spot[23]. The first one, halo blight, is caused
by Pseudomonas phaseolicola and characteristically exhibits a halo. The disease, however, cannot be
distinguished from common blight on the basis of its large halo, because the halo is not produced if
there have been periods of very warm weather. Halo blight symptoms first appear as small,
angular, water-soaked spots (almost resembling little pin pricks) on the undersurfaces of leaves.
As these spots grow and turn brown, a characteristic light green to yellow halo appears around
the spots. This halo is due to the action of a toxin produced by the bacteria and is a diagnostic
symptom of the disease.
Fuscous blight
This is the second disease stated by [23]. It can be distinguished only by the brown pigment
that the causal organism, Xanthomonas phaseoli var. fuscans, produces on certain media such as
PDA.
Bacterial Brown Spot
This disease, caused by Pseudomonas syringae pv. syringae as described by the researcher K.M [23],
is more common on lima beans than on other bean types. Small, water-soaked spots on leaves
become red-brown in color. Spot centers dry out, turn grey, and may fall away. Veins on the
underside of the leaves may turn red-brown. Spots on stems and pods are more elongated than
those on leaves.
2.3. Digital Image Processing
Crop disease detection using image processing is a useful method for reducing crop diseases.
Multiple image processing methods are used to detect the diseases. The authors of
“Crops Disease Diagnosing Using Image-Based Deep Learning Mechanism” [26] in 2018 used
an approach based on convolutional neural networks to classify the diseases of strawberry plants.
The system makes use of deep learning to diagnose the disease. The researchers O. Min and N.
Chi Htun[27] in the same year used image processing techniques to detect and classify four types
of plant diseases: Rust, Cercospora Leaf Spot, Bacterial Blight and Powdery Mildew.
The research “Disease detection in crops using remote sensing image” in 2017 [28] used remote
sensing images for early detection of crop diseases, applying Canny edge detection and histogram
matching. In 2016, [29] classified six different diseases of tomato plants using image processing
techniques. These image processing techniques extract features from images of healthy and
diseased plants.
Digital image processing is the use of computer algorithms to perform image processing on digital
images[30]. It permits a far wider range of algorithms to be applied to the input data and
can avoid issues such as the build-up of noise and signal distortion during processing. Digital
image processing has a very important role in the agriculture field. It is widely adopted to detect
crop disease with high accuracy. Detection and recognition of diseases in plants using
digital image processing is very effective in revealing the characteristic symptoms of diseases at
their early stages. Plant pathologists analyze digital images using digital
image processing for the diagnosis of crop diseases.
According to the research paper “A Brief Review on Plant Disease Detection using in Image
Processing”[30], computer systems are developed for agricultural applications such as
detection of leaf diseases, fruit diseases, etc. In all these techniques, digital images are
collected using a camera, and image processing techniques are applied to these images to
extract valuable information that is essential for analysis. These diseases occur mostly on the
leaves and stem of the plant. The diseases are viral, bacterial or fungal, or are due to insects,
rust, nematodes, etc. It is an important task for farmers to find these diseases as early as possible.
Image processing is a form of signal processing for which the input is an image and the output
may be either an image or a set of characteristics or parameters related to the image
[31]. Most image-processing techniques treat the image as a two-dimensional signal. Image
processing is computer imaging where the application involves a human being in the visual loop. In
other words, the images are to be examined and acted upon by people.
The research papers “SVM Classifier Based Grape Leaf Disease Detection”, “Detection of Leaf
Diseases and Classification using Digital Image Processing”, “A Survey on Detection and
Classification of Rice Plant Diseases”, “Image based Plant Disease Detection in Pomegranate
Plant for Bacterial Blight”, “Image based Plant Disease Detection in Pomegranate Plant” and
“leaf disease detection and fertilizer suggestion” developed systems that detect plant diseases
in two phases, a training phase and a test phase, which include image acquisition, image
preprocessing, feature extraction, segmentation, classification and calculation of the percentage
of infection. Accordingly, they identified speed and accuracy as the main characteristics that crop
disease detection using machine learning algorithms must
achieve[32][33][34][35][25][36].
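The percentage of infection mentioned above can be illustrated with a minimal sketch. It assumes two binary masks are already available from segmentation, one marking leaf pixels and one marking diseased pixels. The function name, the mask layout and the toy values are hypothetical and are written in Python for illustration only; they are not taken from the cited papers or from the MATLAB prototype.

```python
def infection_percentage(leaf_mask, disease_mask):
    """Return the share of leaf pixels flagged as diseased, in percent."""
    leaf_pixels = 0
    diseased_pixels = 0
    for leaf_row, disease_row in zip(leaf_mask, disease_mask):
        for is_leaf, is_diseased in zip(leaf_row, disease_row):
            if is_leaf:
                leaf_pixels += 1
                if is_diseased:
                    diseased_pixels += 1
    if leaf_pixels == 0:
        return 0.0  # no leaf found in the image
    return 100.0 * diseased_pixels / leaf_pixels


# Hypothetical 3x4 masks: 1 marks a leaf pixel / a diseased pixel.
leaf_mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
disease_mask = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(infection_percentage(leaf_mask, disease_mask))  # 37.5
```

Here 3 of the 8 leaf pixels are diseased, so the infected share is 37.5%. Background pixels are excluded from the denominator so that the result does not depend on how much of the frame the leaf fills.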
2.3.1. Image processing methods
Two methods are used to process images, as stated in the research paper
“Overview of Image Processing”[31].
A. Analog Image Processing
Analog image processing is an image processing task conducted on two-dimensional analog
signals. It alters the image through electrical means, as in the television image, and is used for
hard copies. In creating images using analog photography, the
image is burned into a film using a chemical reaction activated by controlled exposure to light.
Analog images are processed in a darkroom, using special chemicals to create the actual image.
B. Digital Image Processing
Digital image processing is the use of computer algorithms to perform image processing on
digital images. Digital image processing gives us consistently high image quality,
a low cost of processing and the ability to manipulate all aspects of the process, and
the image is stored as a computer file. The stored file is translated using photographic software to
generate an actual image. The advantages of digital image processing methods are their versatility,
repeatability and the preservation of original data precision.
2.3.2. Fundamental Steps of Digital Image Processing
The research written by S. D et al.[37] discussed the main steps of image processing used to
detect and classify disease in plants. These steps are image acquisition, image
preprocessing, image segmentation, feature extraction and classification. For segmentation, they
used Otsu’s method, conversion of the RGB image into the HSI model, and k-means clustering.
According to them, the k-means clustering method gives accurate results. After that, feature
extraction is carried out on features such as color, texture, morphology and edges. Among these,
morphology feature extraction gives better results. After feature extraction, classification is done
using classification methods such as Artificial Neural Network and Back Propagation Neural Network.
The researchers M. C. Ghulam and G. Vikrant[38] introduced an advanced system for detection of
plant disease. The researchers aimed at the design and development of image processing-based
software for automatic classification and detection of disease in plants. In this research paper,
detection is done on two distinct classes of disease, scorch and spot.
Algorithms are designed for segmentation, feature extraction, classification and detection of
disease. One of the drawbacks of the technique used in this paper is that it can be implemented
only under controlled laboratory conditions. It has good adaptability to different color spaces, but it
yields poor segmentation results on the tested images.
The research paper “Image Processing System for Plant Disease Identification by Using
FCM-Clustering Technique” implemented a method for plant disease identification in which
segmentation is done using the FCM (fuzzy C-means) clustering
technique [39]. Features are extracted from the affected regions and passed to the SVM (support
vector machine) classifier for classification. The combination of classifier techniques used in
this paper can classify diseases efficiently, but takes more processing time. Hence, the main
drawback of this paper is that early detection of disease is not possible. The research paper “Petiole
detection algorithm based on leaf image” [40] implemented a method for detecting the unhealthy
region of plant leaves using image processing and a genetic algorithm. A genetic algorithm is an
iterative evolutionary algorithm for generating solutions to analytical problems. The
algorithm begins with a set of solutions called a population. Solutions from one population are
chosen and used to form a new population. This method can extract features of the disease from
the segmented part efficiently, but it takes more time to handle multiple iterations of the sample
input. It has low execution speed since it takes more training time.
The research “Plant Disease Detection Techniques: A Review”[41] states that the process of a plant
disease detection system basically involves four phases, as shown in Fig 2-1. The first phase
involves acquisition of images, either through a digital camera or mobile phone or from the web. The
second phase segments the image into a number of clusters, for which different techniques
can be applied. The next phase contains feature extraction methods, and the last phase is the
classification of diseases.
Figure 2-1: Phases of plant disease detection system
The research titled “Detection and classification of plant leaf diseases using image processing
techniques: A review” [42] proposed the steps of image processing starting from image
preprocessing, as shown below in Fig 2-2. It omits the image acquisition step, although researchers
must obtain the images through an image acquisition step.
Figure 2-2: Block diagram for Image Processing
We propose that it is mandatory to include the image acquisition step when classifying
images with respect to the diseases that attacked them, with the steps shown below in Fig 2-3.
Figure 2-3: Framework of image processing operation
1. Image acquisition
The images of the bean crop leaves are captured with a camera[25]. The captured images
are in RGB form. Scaling and color transformation of an image, if required, are
done in image pre-processing. Image acquisition in image processing can be broadly defined as
the action of retrieving an image from some source, usually a hardware-based source, so it can be
passed through whatever processes need to occur afterward[43]. Image acquisition
is always the first step in the workflow sequence because, without an image,
no processing is possible. The acquired image is completely unprocessed and is the result of
whatever hardware was used to generate it, which can be very important in some fields for
having a consistent baseline from which to work. One of the ultimate goals of this process is to
have a source of input that operates within such controlled and measured guidelines that the
same image can, if necessary, be nearly perfectly reproduced under the same conditions, so
anomalous factors are easier to locate and eliminate.
2. Image Pre-processing
Pre-processing is a technique used to analyze real-time problems in images[44]. Inspection is
the only way to investigate the disease present within a fruit. To accomplish this, an image of the
leaf can be captured and then analyzed using pre-processing techniques. One pre-processing technique
used for this purpose is converting RGB to a different color space. Different
preprocessing techniques, such as image cropping, resizing, color transformation, contrast
enhancement and filtering, are applied to remove noise and enhance the images in the dataset[45]. For a
better outcome in the segmentation step, we concentrate on enhancing the image of the bean crop leaf
in order to improve the image colors or intensities, which helps emphasize the texture and
disease color. This is useful for a segmentation step that uses color intensity as an attribute. In
this study, preprocessing resizes the image to minimize its size and reduce
memory use in the processing system. Histogram equalization is used to adjust image
intensities and enhance contrast. By doing this, we obtained an image with clearer edges of the
leaf and of the diseased areas.
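The two preprocessing operations named above, resizing and histogram equalization, can be sketched as follows. This is an illustration in plain Python on a small grayscale image stored as a list of rows. The function names are hypothetical, the resize is a simple keep-every-n-th-pixel reduction chosen for brevity, and neither is claimed to match what the MATLAB prototype does internally.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a grayscale image given as a list of rows of ints."""
    pixels = [value for row in image for value in row]
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    # Cumulative distribution function (CDF) of the intensities.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if cdf_min == total:
        return [row[:] for row in image]  # constant image: nothing to stretch

    # Map each intensity through the normalized CDF onto the full range.
    lookup = [
        max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[lookup[value] for value in row] for row in image]


def downsample(image, factor):
    """Shrink an image by keeping every `factor`-th pixel in each direction."""
    return [row[::factor] for row in image[::factor]]


low_contrast = [[100, 101], [102, 103]]
print(equalize_histogram(low_contrast))  # [[0, 85], [170, 255]]
```

The four nearly identical intensities are spread over the whole 0 to 255 range, which is the contrast enhancement effect the text relies on to make leaf edges and diseased areas clearer.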
3. Image segmentation
In this research, the method used for segmentation is the k-means clustering algorithm. K-means
clustering is an unsupervised machine learning algorithm used to classify or categorize
datasets into groups. It is an iterative, data-partitioning algorithm that assigns n
observations to exactly one of k clusters defined by centroids[46]. Image segmentation also aims
at simplifying the representation of an image so that it becomes more meaningful and easier to
analyze [47]. As the premise of feature extraction, this phase is also a fundamental approach of
image processing. There are various methods by which images can be segmented, such as k-means
clustering, Otsu’s algorithm and thresholding. K-means clustering classifies
objects or pixels into K classes based on a set of features. The classification is done
by minimizing the sum of squares of distances between the objects and their corresponding
cluster centroids [48].
Image segmentation plays a vital role in plant disease detection[34]. Image
segmentation means dividing the image into particular regions or homogeneous objects.
According to the research titled “A Survey on Detection and Classification of Rice Plant
Diseases”, the primary aim of segmentation is to analyze the image data so that useful
features can be extracted from it. There are two ways to carry out image segmentation: (1) based
on discontinuities and (2) based on similarities. In the first way, an image is partitioned based on
sudden changes in intensity values, e.g., via edge detection. In the second way,
images are partitioned based on specific predefined criteria, e.g., thresholding using
Otsu’s method. The work in [49] took a number of crop types, namely fruit crops, vegetable crops,
cereal crops and commercial crops, to detect fungal diseases on plant leaves. Different methods were
adopted for each type of crop.
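As an illustration of similarity-based segmentation, the following is a minimal sketch of Otsu's thresholding written in plain Python. It assumes 8-bit grayscale intensities supplied as a flat list. The function name and the sample values are hypothetical and are not taken from the cited works.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with intensity <= t form one class, the remaining pixels the other.
    """
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(i * count for i, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(levels):
        weight_low += histogram[t]
        sum_low += t * histogram[t]
        if weight_low == 0:
            continue  # no pixels in the lower class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break  # every pixel is already in the lower class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, t
    return best_threshold


# Hypothetical intensities: three dark pixels and three bright pixels.
sample = [10, 12, 11, 200, 205, 198]
print(otsu_threshold(sample))  # 12
```

Every threshold from 12 to 197 splits this sample the same way, so the sketch returns the first one it meets. Pixels at or below the threshold would then be treated as one region and the rest as the other.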
The research paper titled “Detection of Plant Disease Using Threshold, K- Mean Cluster and
ANN Algorithm”, IEEE, 2017 [42] states the following points for each type of crop.
• For fruit crops, k-means clustering is the segmentation method used; texture features
were the focus and were classified using ANN and nearest neighbor algorithms,
achieving an overall average accuracy of 90.723%.
• For vegetable crops, the Chan-Vese method was used for segmentation, local binary patterns for
texture feature extraction, and SVM and k-nearest neighbor algorithms for classification,
achieving an overall average accuracy of 87.825%.
• The commercial crops were segmented using the grab-cut algorithm. Wavelet-based feature
extraction was adopted, with Mahalanobis distance and PNN as classifiers, giving an overall
average accuracy of 84.825%.
• The cereal crops were segmented using k-means clustering and the Canny edge
detector. Color, shape, texture, color texture and random transform features were
extracted. SVM and nearest neighbor classifiers were used to get an overall average accuracy
of 83.72%.
Building on the research paper titled
“Detection of Plant Disease Using Threshold, K- Mean Cluster and ANN Algorithm”[42], we
investigated how to increase the classification accuracy using texture feature extraction.
For this research we selected K-means clustering to segment the bean crop leaf
images. K-means clustering is used for segmenting an image into three groups [50]. The clusters
contain the diseased part of the leaf. Before clustering, the ‘a’ component is extracted from the
L*a*b color space. The properties of the K-means algorithm and its process are given below:
1) Properties of K-Means Algorithm
a) There are always K clusters.
b) There is at least one item in each cluster.
c) The clusters never overlap with each other.
d) Each member of a cluster is nearer to its own cluster than to any other cluster.
2) The Process of K-Means Algorithm
a) First divide the dataset into K clusters and assign the data points randomly to
the clusters.
b) Then, for each data point, calculate the Euclidean distance from the data point to every
cluster. The Euclidean distance is the straight-line distance between two pixels and is given
as follows:
$$\text{Euclidean Distance} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \qquad (1)$$
Where (x1, y1) and (x2, y2) are two pixel points (or two data points).
c) If the data point is closest to its own cluster, leave it where it is.
d) If the data point is not closest to its own cluster, shift it into the nearest cluster.
e) Repeat the above steps until a complete pass through all the data points results in no data
point moving from one cluster to another.
f) At this point the clusters are stable and the clustering process ends.
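The process above can be sketched in a few lines of Python. This is an illustrative version only, not the MATLAB implementation used in this research. It works on 2-D points, uses the Euclidean distance of Equation (1), and, as an assumption made for reproducibility, seeds the centroids with the first k points instead of a random assignment. The function names and sample points are hypothetical.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two 2-D points, as in Equation (1)."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


def k_means(points, k, max_iterations=100):
    # Assumption: seed the centroids with the first k points so the run is
    # deterministic; the text above starts from a random assignment instead.
    centroids = [points[i] for i in range(k)]
    labels = [None] * len(points)
    for _ in range(max_iterations):
        # Assign every point to the cluster with the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # a full pass moved no point: the clusters are stable
        labels = new_labels
        # Recompute each centroid as the mean of its current members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centroids


points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
labels, centroids = k_means(points, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

For leaf segmentation the same loop would run on pixel features (for example the ‘a’ channel value of each pixel) rather than on 2-D coordinates, with k set to the number of regions wanted.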
4. Feature extraction
Feature extraction is the process of determining common features, from which groups or clusters are
formed and particular values can be extracted to reduce complexity. In image processing, the SVD
approach is commonly used for feature extraction [44]. According to [51], a web-based
tool has been developed to identify fruit diseases by uploading a fruit image to the system.
Feature extraction was done using parameters such as color, morphology and CCV (color
coherence vector). Clustering was done using the k-means algorithm. SVM was used for
classification as infected or non-infected. The researchers' work achieved an accuracy of 82% in
identifying pomegranate disease. The feature extraction aspect of image analysis focuses on
detecting essential characteristics or features of objects present within an image [34]. These
features can be used to describe the object. Generally, features in the following three categories
are extracted: color, shape, and texture. The researchers assumed that color is an important
feature because it can differentiate one disease from another.
Furthermore, each disease may have a different shape; thus, the system can differentiate diseases
using shape features. Some shape features are area, axis, and angle. Texture means how color patterns
are scattered in the image. The features of the diseased sections are extracted in the
categories color, texture and shape. Since each type of disease presents different
color and shape properties, the color and shape features are further used in classification.
Feature extraction is used to extract the information that can be used to determine the significance
of a given sample. The main types of features, which are mostly used in image processing
techniques, are shape, color and texture. We extracted the texture features (statistical-based
feature extraction) of the bean crop leaf to get full features prepared for classification with
less ambiguity.
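One common family of statistical texture features is derived from a gray-level co-occurrence matrix, the idea behind the SGDM method mentioned in the related works. The sketch below is an assumption about what such features can look like. The text does not state which statistics were computed in this research, and the function names and the small 4-level test image are hypothetical.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for the pixel offset (dx, dy).

    Assumes the image has at least one valid pixel pair for the offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def texture_features(p):
    """Contrast, energy and homogeneity of a normalized co-occurrence matrix."""
    size = range(len(p))
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i in size for j in size),
        "energy": sum(p[i][j] ** 2 for i in size for j in size),
        "homogeneity": sum(p[i][j] / (1 + abs(i - j)) for i in size for j in size),
    }


# Hypothetical 4x4 image quantized to 4 gray levels.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = texture_features(cooccurrence_matrix(image, levels=4))
print(features)
```

The resulting values would form part of the feature vector passed to the classifier. Smooth regions give low contrast and high homogeneity, while spotted or blighted regions tend to raise the contrast.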
5. Classification
Classification is the process of finding the features of images and grouping them into specific
classes. We used the Support Vector Machine (SVM), which is a supervised machine learning algorithm
[52][53]. It is capable of classifying effectively in high-dimensional spaces, working on small
datasets and dealing with non-separable data by means of a technique called the kernel function.
Using a kernel function, we mapped the feature space to a high-dimensional feature space
where the data vectors become linearly separable, so that we can then find the hyperplane that
separates the dataset in the high-dimensional feature space. Kernel functions such as the polynomial,
radial basis function (RBF) and sigmoid functions are usually used with SVM, which is a well-known
machine learning algorithm. SVM is also used to classify data sets into specified
categories[54]. It is a discriminative classifier formally defined by a separating hyperplane. This
method finds separators with maximum margin to improve the performance of the classifier.
The kernel function used in SVM is defined as the mathematical formula used to transform an
inseparable data set from the input space to a lower- or higher-dimensional space that results in
separable data [46]. For classification, [55] used the SVM classifier to identify the
classes, which are closely connected to the known and trained classes. The support vector
machine creates the optimal separating hyperplane between the classes using the training data.
Many classifiers have been used in the past few years by researchers, such as k-nearest neighbor
(KNN), support vector machines (SVM), artificial neural network (ANN), back propagation
neural network (BPNN), Naïve Bayes and decision tree classifiers. According to [42], the
most commonly used classifier is SVM. Though every classifier has its advantages and
disadvantages, SVM is a simple-to-use and robust technique.
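The effect of a kernel mapping can be illustrated with a small Python sketch. It does not train an SVM: the separating hyperplane is picked by hand, and the data, the feature map and all names are hypothetical. The sketch shows only how data that no single threshold separates in one dimension become linearly separable after mapping to a higher-dimensional feature space, and how a kernel gives the inner product in that space without building the mapping explicitly.

```python
import math


def feature_map(x):
    """Explicit map phi(x) = (x, x^2) from 1-D input space to 2-D feature space."""
    return (x, x * x)


def polynomial_kernel(x, z):
    """Inner product <phi(x), phi(z)> computed without building phi."""
    return x * z + (x * z) ** 2


def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel exp(-gamma * (x - z)^2) for 1-D inputs."""
    return math.exp(-gamma * (x - z) ** 2)


# Toy data: the -1 class sits between the +1 samples, so no single threshold
# on x separates the two classes in the original 1-D space.
samples = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [1, 1, -1, -1, -1, 1, 1]

# Hand-picked (not learned) hyperplane w . phi(x) + b = 0 in feature space.
w, b = (0.0, 1.0), -2.5


def classify(x):
    phi = feature_map(x)
    score = w[0] * phi[0] + w[1] * phi[1] + b
    return 1 if score > 0 else -1


print([classify(x) for x in samples])  # [1, 1, -1, -1, -1, 1, 1]
```

An actual SVM would learn the hyperplane from the training data by maximizing the margin, and would evaluate only kernel values such as `polynomial_kernel` or `rbf_kernel` rather than the explicit map.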
Image processing techniques can be applied to various applications as follows [30]:
1. To detect plant leaf, stem, crop and fruit diseases.
2. To quantify affected area by disease.
3. To find the boundaries of the affected area.
4. To determine the color of the affected area.
5. To determine size, texture and shape of fruits.
2.3.3. Types of image processing
The research paper “Automatic Flower Disease Identification Using Image
processing”[43] classifies image processing into low-level, mid-level and high-level
image processing, described below.
Low-level
Low-level processes involve primitive operations such as image pre-processing to reduce noise,
contrast enhancement, and image sharpening, as shown in Figure 2-4. It is characterized by
the fact that both its inputs and outputs are images.
Figure 2-4: Low level Image processing
Mid-level
The tasks included in mid-level processing of images are segmentation (partitioning an image
into regions or objects), description of those objects to reduce them to a form suitable for
computer processing, and classification (recognition) of individual objects. The input of this step
may be images from first-level processing or images that are directly captured. This level is
characterized by the fact that its inputs are generally images, whereas its outputs are attributes
extracted from those images (e.g., edges, contours, and the identity of individual objects).
Figure 2-5: Middle level image processing
High-level
Higher-level processing involves “making sense” of an ensemble of recognized objects, as in
image analysis, and, at the far end of the continuum, performing the cognitive functions normally
associated with vision. Digital image processing, in addition, encompasses processes that extract
attributes from images, up to and including the recognition of individual objects. For example, the
processes of acquiring an image of an area containing text, pre-processing that image, extracting
(segmenting) the individual characters, describing the characters in a form suitable for computer
processing, and recognizing those individual characters are in the scope of what we call digital
image processing. Today, there is almost no area of technical endeavor that is not affected in some
way by digital image processing.
The application areas of digital image processing are diverse. One of the simplest ways to
develop an understanding of the extent of image processing applications is to categorize images
according to their source. The principal energy source for images in use today is the
electromagnetic energy spectrum. Other important sources of energy include acoustic, ultrasonic,
and electronic (in the form of electron beams used in electron microscopy). Imaging
techniques based on the electromagnetic spectrum include gamma-ray imaging (nuclear medicine and
astronomical observation), X-ray imaging (medical diagnosis and astronomy), imaging in the
ultraviolet band (lithography, industrial inspection, microscopy, lasers, biological imaging and
astronomical observation), imaging in the visible and infrared bands (light microscopy,
astronomy, remote sensing, industry, and law enforcement), imaging in the microwave
band (radar systems) and imaging in the radio band (medicine and astronomy).
2.4. Related sources
In order to accomplish the objectives of the research, literature on contemporary developments in
machine learning algorithms related to cereal, plant and fruit classification is reviewed. The
literature reviewed is concerned with image processing using different machine learning
algorithms such as SVM, KNN, ANN and many others. All of the reviewed works are based on
image processing, deep learning and machine learning to detect and classify crops, plants,
flowers and vegetables. We used the sources from different research papers as input for our work,
citing them accordingly.
CHAPTER THREE
RESEARCH METHODOLOGY
This chapter describes the method or approach used and the required resources. In
order to make this research successful with respect to its objective, literature on contemporary
developments in image processing related to plant, cereal or fruit classification was reviewed.
From this review, the image processing techniques using machine learning and the tools
that were employed for variety identification of agricultural products and that were pertinent to
this work were selected. The methodology for detecting bean crop leaf diseases involves several
tasks: image acquisition, image preprocessing, image segmentation, feature extraction
and leaf disease classification. Firstly, in image acquisition, the images of the various
bean crop leaves that are to be classified are taken using a digital camera. Secondly, image
preprocessing is applied to remove noise. In the third phase, segmentation is performed to
discover the actual segments of the leaf in the image. In the fourth phase, feature extraction for
the infected part of the leaf is completed based on specific properties among pixels in the image
or their texture, and alongside it certain statistical analysis tasks are carried out to choose the
best features that represent the given image, thus minimizing feature redundancy. Finally,
classification is completed using a support vector machine.
3.1. Experimentation Tools
Several things should be considered to make sure the development stage of the
system can run successfully, such as the software and hardware specification. We selected
MATLAB version R2015b, software developed by MathWorks, to implement the
prototype of the system, together with other libraries that are compatible with the simulator.
Visio 2016 was also used for designing the system architecture, the algorithms and the SVM
classifier. In addition to the above software, we used Adobe Photoshop CS4 for image formatting.
A mobile phone, model Techno W5, with a 13-megapixel digital camera was used to collect
bean images.
3.2. Algorithm
The research used the K-means clustering algorithm for segmentation and the Support Vector
Machine for classification, both implemented in MATLAB, to detect bean crop diseases. The
MATLAB code consists of two major functions, which generate the training model and the test
data.
3.3. Analysis and Design
Analysis is the process of determining the needs or conditions to be met by a new or altered
system. Design is the process of problem solving and planning for a software solution. It
includes low-level component and algorithm implementation issues as well as the architectural
view. There is a growing demand for image processing in diverse application areas, such as
multimedia computing, secured image data communication, biomedical imaging, biometrics, remote
sensing, texture understanding, pattern recognition, content-based image retrieval, compression
and so on. For our design, we analyzed the data collected from different sources with respect to
the diseases that we wanted to identify, in accordance with the aim of the research.
3.4. Data Collection and Dataset Preparation
3.4.1. Data Collection
We collected bean crop leaf images from the Ethiopian Institute of Agricultural Research
(EIAR), Debre Zeit center, through observation and from existing resources. In addition to the
images from the institute, other samples (healthy and infected leaf images) were collected from
the Web. This lets the model train on images with different imaging properties and conditions.
3.4.2. Dataset Preparation
Data preparation is required to train and test the model. The collected images were manually
classified and labeled to form the training set, while randomly selected, unclassified and
unlabeled images form the testing set. The images in the testing set are different from the
images in the training set. Of the 100 images collected in total, 80 (80%) were used for training
and 20 (20%) for testing.
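The 80/20 partition described above can be illustrated with a short sketch. It is written in Python rather than the MATLAB used in this research, it assumes a simple seeded random shuffle (the thesis labels the training images manually), and the file names are hypothetical placeholders.

```python
import random

def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names standing in for the 100 collected leaf images.
images = [f"leaf_{i:03d}.jpg" for i in range(100)]
train, test = split_dataset(images)
print(len(train), len(test))  # 80 20
```

Because the two lists are slices of one shuffled list, no image can appear in both the training and the testing set.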
3.5. Sampling techniques
Sampling is one of the core procedures in classifying and detecting disease. For sampling, we
selected the faba bean and took sample images of it. We chose faba bean because of its
availability and importance for the country, Ethiopia, and because it is more heavily attacked by
disease. Most of our samples were used for training, and the smaller remaining portion was used
for testing. The training sample is composed of images of healthy faba beans and of faba beans
diseased with bacterial blight, halo blight and Alternaria leaf spot.
Figure 3-1: Methodology of sampling (flow chart stages: Literature review, Sample collection,
Image acquisition, Image preprocessing, Image segmentation, Feature extraction, Image
classification)
3.6. Materials and methods
When the images were taken, the camera was mounted on a stand that provides easy vertical
movement and stable support for the camera. Samples were arranged on a table with a black
background during image recording. The diseased beans were scattered on the table, with none
touching another. This separation was kept to make image segmentation easier. To obtain uniform
lighting (balanced illumination), a 100 W incandescent lamp with a rated voltage of 220 V was
used in all experiments. The lighting system was switched on about 5 minutes before acquiring
any images so that it could stabilize. To reduce the influence of surrounding light, we took the
samples in a controlled room. The images were taken at a resolution of 2818 x 1826 pixels and
resized to 256 x 256.
3.7. Evaluation Technique
The research model was assessed by running a test dataset through the classifier trained on the
training dataset. The classifier's performance was returned as an output containing the
percentage accuracy for each class. In this research, the classifier accuracy and the total
infected part of the images were calculated. The system also derives the error rate of the
classifier, indicating the correct or incorrect allotment of samples to their respective classes.
Figure 3-2: Evaluation metric samples
\[ \text{Accuracy (\%)} = \frac{TP + TN}{TP + FP + TN + FN} \times 100 \tag{2} \]
\[ \text{Error rate (\%)} = 100 - \text{Accuracy (\%)} \tag{3} \]
where TP is True Positive, TN is True Negative, FP is False Positive and FN is False Negative.
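Equations (2) and (3) can be checked with a small Python sketch. The error rate is taken here as the complement of the accuracy on the percentage scale, and the confusion counts are hypothetical values for a 20-image test set, not results from this research.

```python
def accuracy_percent(tp, tn, fp, fn):
    """Equation (2): share of correctly classified samples, in percent."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return 100.0 * (tp + tn) / total

def error_rate_percent(tp, tn, fp, fn):
    """Equation (3): complement of the accuracy, in percent."""
    return 100.0 - accuracy_percent(tp, tn, fp, fn)

# Hypothetical counts: 13 infected and 4 healthy leaves classified correctly,
# 1 healthy leaf flagged as infected, 2 infected leaves missed.
tp, tn, fp, fn = 13, 4, 1, 2
print(accuracy_percent(tp, tn, fp, fn))    # 85.0
print(error_rate_percent(tp, tn, fp, fn))  # 15.0
```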
In this research case, these representations can be interpreted as:
• TP: number of infected crop leaves sorted out as ‘INFECTED’
• TN: number of healthy crop leaves sorted out as ‘HEALTHY’
• FP: number of healthy crop leaves sorted out as ‘INFECTED’
• FN: number of infected crop leaves sorted out as ‘HEALTHY’
• Total: total number of samples (crop leaf images)
In this research, the efficiency of the proposed methodology in detecting faba bean crop
diseases is tested and evaluated. To evaluate the classification accuracy and detect those
diseases, five steps are implemented:
1. Image acquisition is the first step in processing an image. In this step, the available images
from the digital camera or the internet are gathered for preprocessing.
2. Preprocessing scales the collected image and applies a min-max linear contrast stretch to
improve the quality of the original image. The stretch linearly expands the original data
values into a new distribution. A transformation structure is then built and applied to
create the enhanced image.
3. Segmentation is carried out using K-means clustering with Euclidean distance to extract the
region of interest from the image.
4. Feature extraction is performed after dividing the image into its homogeneous parts. The Grey
Level Co-occurrence Matrix (GLCM) is used for this purpose. Thirteen parameters are
extracted from the testing and training images.
5. Classification is the last stage, in which the bean crop leaf is classified. A linear
classifier is used, the machine learning algorithm called Support Vector Machine (SVM). It
was chosen among other classifiers because it has high prediction accuracy and works even
when there are errors in the training samples. This classifier can be used for many
classification types, including texture classification. The SVM input is usually not linearly
separable; however, it is mapped into a high-dimensional space in which the data become
linearly separable, which results in good classification. The classifier works with only two
classes, divided by the hyperplane whose distance from the support vectors is as large as
possible. However, this does not mean that multiclass classification cannot be implemented
using this classifier [10].
CHAPTER FOUR
PROPOSED SYSTEM MODEL
The aim of this chapter is to discuss the approach and framework for the project. It covers the
methods, techniques and approaches used in designing and implementing the thesis.
4.1. System Architecture
The figure below depicts the architecture of the system. Each main task has additional
sub-tasks, which are explained in detail in the following sections; the figure gives an overview
of the system, following the research paper [43].
Figure 4-1: Proposed system architecture
The architecture shown above describes the overall process followed to classify an input image
into one of two classes. According to the architecture, the training and testing phases are
performed independently. The training phase begins by importing a number of images, which are
processed one after another. The testing phase is carried out only after the training phase has
finished and the model has been trained for accurate classification. In the second phase, the
testing phase, a single image is imported. After import, images in both phases pass through the
same processes, which serve the same purpose: the preprocessing, segmentation and feature
extraction functions are the same for both phases. After feature extraction, the two phases
follow different paths. The training phase provides a feature vector with a label as input for
the model to train on, and the result is stored in the knowledge base. The testing phase
provides a feature vector to the model and expects a label in return; the classifier returns
that label from the previously trained knowledge base.
The proposed methodology for bean crop image classification has five vital stages. The initial
stage is image acquisition, through which the real-world sample is recorded in digital form. In
the next stage, the images pass to preprocessing, in which the size and complexity of the image
are reduced. The resulting digital information is then subjected to segmentation and feature
extraction, which separate the rotten portion of the leaf samples. Finally, the area of the
segmented part is calculated, and the sample is classified into its category using the machine
learning algorithm, SVM.
Figure 4-2: Flow chart to classify images
4.2. Tasks of Image processing
To detect faba bean crop disease from an image, our research followed the tasks below in order.
4.2.1. Image Acquisition
According to the research paper “A Novel Approach to Classify and Detect Bean Diseases based
on Image Processing” [11], the initial process is to collect data from the source selected by
the researcher. We captured images with our camera and used them as input for further
processing. We used the most popular image format, .jpg, for the input to be processed. The
output of this process is a number of captured images of faba bean in the format produced by
the capturing device.
Figure 4-3: Healthy bean and with bacterial blight
4.2.2. Image Preprocessing
Images acquired from the field and the web may contain noise. Preprocessing is therefore
performed to eliminate the noise in the image, adjust the pixel values and change the image
background to black. It enhances the quality of the image. To remove noise from the images,
image filtering and segmentation techniques can be used. The output of this phase is segmented
images containing the leaves from the images of the first phase (image acquisition). Different
preprocessing techniques can be considered for removing noise. In this research, image cropping
and image enhancement are used: cropping the leaf image isolates the region of interest, and
enhancement increases the image contrast. The Red, Green and Blue (RGB) images are also
converted into grey images using color conversion by the following formula:
\[ F(x) = 0.2989\,R + 0.5870\,G + 0.1140\,B \tag{4} \]
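A minimal Python sketch of the grey-level conversion in equation (4) is given here for illustration; it is not the MATLAB code used in this research. It assumes that the 0.5870 weight applies to the green channel, as in the standard luminance weighting, and that pixels are (R, G, B) tuples with values from 0 to 255.

```python
def rgb_to_grey(pixel):
    """Weighted sum of the red, green and blue components of one pixel."""
    r, g, b = pixel
    return 0.2989 * r + 0.5870 * g + 0.1140 * b

def image_to_grey(image):
    """Convert a 2-D list of (R, G, B) tuples to a 2-D list of grey levels."""
    return [[rgb_to_grey(p) for p in row] for row in image]

# A 2 x 2 example image: pure red, pure green, pure blue and white.
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]
grey = image_to_grey(image)
```

Green contributes most to the grey level, which is useful here because leaf greenness is the main cue for separating healthy from infected tissue.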
The input images, originally thousands of pixels in each dimension, are resized to 256x256
pixels for the next process and are cropped to leave only the diseased area of the leaf. By
doing so, the computation time and memory use are reduced, since only a small portion of the
bean crop leaf is processed.
Techniques of image preprocessing
To clean the noise from images collected from many sources, it is important to apply the
following techniques [43].
1) Image Scaling
Image scaling is needed because the sizes of the training and testing images do not match.
Some of these images are very large, which can cause problems in the implementation, including
running out of memory. Reducing the image size also shortens the processing time. Therefore,
all image sizes were set to [256,256]. We selected this process as a preprocessing step in
removing noise.
2) Min-Max Linear Contrast Stretch
The input images may have low variance, which can affect the detection process. A min-max
linear contrast stretch is therefore used to improve the quality. It maps the lowest and the
highest values of the data onto a new set of values that uses the full range of available
intensity values. For example, suppose the lowest intensity value of an image is 45 and the
highest is 205. The values from 0 to 44 and from 206 to 255 are then unused. The min-max linear
stretch maps the lowest value to 0 and the highest value to 255.
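The stretch in the example above can be sketched in Python as follows. This is an illustration under the assumption of an 8-bit grey image stored as a 2-D list; the four pixel values are made up to match the 45 to 205 range of the example.

```python
def min_max_stretch(image, new_min=0, new_max=255):
    """Linearly map the darkest pixel to new_min and the brightest to new_max."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                      # flat image: nothing to stretch
        return [[new_min for _ in row] for row in image]
    scale = (new_max - new_min) / (hi - lo)
    return [[int(round((v - lo) * scale + new_min)) for v in row]
            for row in image]

image = [[45, 85], [145, 205]]
stretched = min_max_stretch(image)
print(stretched)  # [[0, 64], [159, 255]]
```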
4.2.3. Image segmentation
Segmentation is a strategy that divides an image into different regions based on appearance
that can be observed in the image, such as color, texture, boundaries and many more [56]. It is
the third step in our proposed method. In this research, the images are clustered into
different segments using the k-means clustering algorithm. Segmentation can be done using
various methods, such as the Otsu method, k-means clustering, converting the RGB image into
the HSI model, and grey-level thresholding of the RGB image [56]. We selected K-means
clustering; before the images are clustered, the RGB image is transformed into a
contrast-enhanced image. The purpose of this transformation is to make the segmented images
easier to cluster.
K-means Clustering Algorithm
In k-means clustering, each point of the given dataset is repeatedly assigned to the centroid
at the minimum distance [43]. In this research, the distance between two points is calculated
using Euclidean distance, because the distance between any two objects is not affected when new
objects are added to the analysis.
The K-means clustering algorithm proceeds as follows:
1. Choose K cluster centers, either randomly or based on some heuristic.
2. Assign each pixel in the image to the cluster that minimizes the distance between the pixel
and the cluster center.
3. Recompute each cluster center as the mean of the pixels in the cluster. Repeat steps 2 and 3
until convergence is achieved.
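The three steps above can be sketched in plain Python as shown below. This is an illustrative implementation, not the MATLAB routine used in this research; the pixel colors and the choice of initial centers are hypothetical, and squared Euclidean distance is used because it selects the same nearest center as the Euclidean distance itself.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, k, initial_centers=None, max_iter=100, seed=0):
    """Cluster points into k groups; returns (centers, labels)."""
    # Step 1: choose K centers, randomly unless a heuristic choice is given.
    if initial_centers is None:
        centers = random.Random(seed).sample(points, k)
    else:
        centers = list(initial_centers)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each pixel to its nearest center.
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                      for p in points]
        if new_labels == labels:      # assignments stable: converged
            break
        labels = new_labels
        # Step 3: recompute each center as the mean of its member pixels.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centers, labels

# Hypothetical RGB pixels: two green (healthy), two brown (lesion), two dark (background).
pixels = [(30, 160, 40), (35, 150, 45), (120, 80, 30),
          (125, 85, 35), (5, 5, 5), (0, 10, 0)]
centers, labels = k_means(pixels, 3,
                          initial_centers=[pixels[0], pixels[2], pixels[4]])
print(labels)  # [0, 0, 1, 1, 2, 2]
```

With random initial centers the result can depend on the seed, which is why the example fixes the starting centers by hand.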
Otsu Threshold Algorithm
Thresholding creates binary images from grey-level images by setting all pixels below some
threshold to zero and all pixels above that threshold to one. The Otsu algorithm defined in [5]
is as follows:
i) Separate the pixels into two clusters according to the threshold.
ii) Find the mean of each cluster.
iii) Square the difference between the means.
iv) Multiply by the number of pixels in one cluster times the number in the other.
The threshold that maximizes this product, which is proportional to the between-class variance,
is selected.
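As a hedged illustration of steps i) to iv), the following Python sketch tries every candidate threshold of an 8-bit image and keeps the one with the largest product. The exhaustive search and the sample grey levels are assumptions made for this example; pixels below the threshold form one cluster and the rest form the other.

```python
def otsu_threshold(pixels):
    """Return the grey level t that best separates pixels < t from pixels >= t."""
    best_t, best_score = 0, -1.0
    for t in range(1, 256):
        low = [p for p in pixels if p < t]        # step i: two clusters
        high = [p for p in pixels if p >= t]
        if not low or not high:
            continue
        mean_low = sum(low) / len(low)            # step ii: cluster means
        mean_high = sum(high) / len(high)
        # steps iii and iv: squared mean difference times the cluster sizes
        score = len(low) * len(high) * (mean_low - mean_high) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t

# Hypothetical grey levels: three dark pixels and three bright pixels.
pixels = [10, 12, 14, 200, 205, 210]
t = otsu_threshold(pixels)
binary = [1 if p >= t else 0 for p in pixels]
print(binary)  # [0, 0, 0, 1, 1, 1]
```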
The infected leaf shows the symptoms of the disease through a change in leaf color. Hence the
greenness of the leaves can be used to detect the infected portion of the leaf. The R, G and B
components are extracted from the image. The threshold is calculated using Otsu’s method. Then
the green pixels are masked and removed if their intensities are less than the computed
threshold. The authors of “A survey on detection of disease and fruit grading” describe
different segmentation techniques, as summarized in Table 4-1 [58].
Table 4-1: Summary of different segmentation techniques

Thresholding Method
  Description: It is the simplest approach to image segmentation, dividing the image pixels
    based on their intensity level. The threshold value can be computed depending on the peak
    of the image histogram.
  Benefits: No prior information about the image is required. Fast, simple and computationally
    inexpensive. Easily applicable and suitable for real-life applications.
  Drawbacks: It does not work well for images with broad, flat valleys or without any peak.
    Spatial information may be ignored, and the resultant image cannot guarantee that the
    segmented regions are contiguous. Threshold selection is very crucial. Extremely noise
    sensitive.

Region Based Method
  Description: In this method the segmentation region is constructed by associating and
    dissociating neighboring pixels. It works on the principle of homogeneity: adjacent pixels
    inside a specific region share related characteristics and are unrelated to the pixels in
    the other region.
  Benefits: It is flexible enough to choose between interactive and automatic techniques for
    image segmentation. Clearer object boundaries through the flow from the inner point to the
    outer region. Compared to other methods it gives more accurate results.
  Drawbacks: Requires more computation time and memory and is sequential in nature. Noisy seed
    selection by the user leads to faulty segmentation. Because of the splitting scheme in
    region splitting, segments seem square.

Clustering Method
  Description: In this method pixels having similar characteristics in the image are segmented
    into the same clusters, dividing the image into different parts based on the features of
    the image. The k-means algorithm is commonly used for this method.
  Benefits: Homogeneous regions can be easily obtained. Computationally faster. K-means works
    faster for smaller values of K.
  Drawbacks: Poor worst-case behaviour. It requires clusters of similar size, so that the
    assignment to the adjacent cluster center is the correct assignment.

Edge Based Method
  Description: In this method all edges are detected first; then, to segment the required
    region, the edges are connected to form the object boundaries. It is based on discontinuity
    detection in edges.
  Benefits: Works well for images with better contrast between regions.
  Drawbacks: Does not work well for images having more edges. Selection of the right object
    edge is difficult.

Partial Differential Equation Based Segmentation Method
  Description: These methods are fast and appropriate for time-critical applications. They are
    based on the working of differential equations.
  Benefits: Fastest method.
  Drawbacks: Computational complexity is higher.
4.2.4. Feature extraction
Feature extraction is an important step in accurately predicting the infected region. Here
shape and textural feature extraction is done following the research paper “Plant disease
detection and its solution using image classification” [57]. Shape-oriented features such as
area, color axis length, eccentricity, solidity and perimeter are calculated. Similarly,
texture-oriented features such as contrast, correlation, energy, homogeneity and mean are
calculated. The leaf image is captured and processed to determine the health of each plant. The
output of this phase is a number of feature vectors corresponding to the segmented images
produced by phase 3. Image features usually include color, shape and texture features.
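To make the texture features concrete, the following Python sketch builds a grey level co-occurrence matrix for horizontally adjacent pixels and derives contrast, correlation, energy and homogeneity from it. It is an illustration only: the 4-level image is a made-up example, the single (1, 0) offset is an assumption, and the formulas are the common textbook definitions rather than the exact MATLAB routines used in this research.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]

def texture_features(p):
    """Contrast, correlation, energy and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    energy = sum(v * v for _, _, v in cells)
    homogeneity = sum(v / (1 + abs(i - j)) for i, j, v in cells)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sd_i = math.sqrt(sum(v * (i - mu_i) ** 2 for i, _, v in cells))
    sd_j = math.sqrt(sum(v * (j - mu_j) ** 2 for _, j, v in cells))
    if sd_i > 0 and sd_j > 0:
        correlation = sum(v * (i - mu_i) * (j - mu_j) for i, j, v in cells) / (sd_i * sd_j)
    else:
        correlation = 0.0             # constant image: correlation is undefined
    return {"contrast": contrast, "correlation": correlation,
            "energy": energy, "homogeneity": homogeneity}

# Made-up 4 x 4 image quantized to 4 grey levels.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
features = texture_features(glcm(image, levels=4))
```

For this image the contrast is 7/12 and the energy is 1/6; a smoother image would give lower contrast and higher homogeneity.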
Table 4-2: Summary of different color techniques

L*a*b [58]
  Description: a) This color space consists of one channel for luminance and two other
    channels, a and b, known as chromaticity layers. b) The space consists of dimension L for
    lightness and a and b for the color adversary dimensions.
  Merits: a) Color and intensity are managed individually. b) It can measure small color
    differences.
  Demerits: a) Problem of singularity, as with other nonlinear transformations.

HSV Histogram [59]
  Description: a) HSV can be represented as a hexagon in three dimensions in which intensity is
    represented as the central vertical axis. b) It is hue, saturation, value. c) Colors are
    described in terms of shades and brightness.
  Merits: a) Accuracy is higher. b) Applicable for real-time applications.
  Demerits: a) Sensitivity to lighting variations is less.

RGB [58]
  Description: a) It is a color space based on the RGB model. b) Consists of three independent
    image planes, one for each primary color: red, green and blue. c) It is an additive model.
  Merits: a) Suitable for display.
  Demerits: a) It is highly correlative, so it is not good for color image processing.

YUV [58]
  Description: a) The main channel, luminance, describes the light intensity, like the rod
    cells of the retina. b) The chrominance components U and V carry the color information.
    c) The black and white information is separated from the color information.
  Merits: a) Overcomes the correlation of RGB to some extent and requires less computation
    time.
  Demerits: a) Correlation exists but is less than in RGB.
According to the researchers A.A et al there are different texture feature extraction
techniques [60].
Table 4-3: Summary of different texture feature extraction techniques

Grey Level Co-occurrence Matrices
  Description: a) The grey level co-occurrence matrix is a statistical method used to examine
    texture, which considers the spatial relationship of pixels.
  Merits: a) Feature vector length is small. b) Can be applied to different color spaces for a
    color co-occurrence matrix.
  Demerits: a) Many matrices are required to be computed. b) It is not invariant to rotation
    and scaling.

Wavelets Transform
  Description: a) It works better on the frequency domain than on the spatial domain.
  Merits: a) Best features with higher accuracy can be produced.
  Demerits: a) It is quite complex and slower.

Independent Component Analysis
  Description: a) It is a computational method for splitting a multivariate signal into
    additive subcomponents. b) It separates a mixed signal into a set of independent signals.
  Merits: a) Small order statistics can be easily obtained.
  Demerits: a) It is a rarely used method.

Gabor filter
  Description: a) It is used to analyze specific frequency content in the image in specific
    directions in a localized region around the region of interest.
  Merits: a) It is a multi-resolution and multi-scale filter. b) It is used for orientation,
    spectral bandwidth and spatial extent.
  Demerits: a) So many filters are used in application that the overall computational cost is
    high.

Figure 4-4: Conversion of images to R, G and B images
4.2.5. Classification
The linear Support Vector Machine (SVM) algorithm is used to perform binary classification of
whether or not an input image is infected with one of the diseases bacterial blight, Alternaria
leaf spot and halo blight. SVM separates healthy bean leaves from diseased ones by looking for
the hyperplane that makes the margin between the nearest healthy and diseased samples the
largest. According to the researchers A.A et al [60], different classifiers can be compared as
summarized in Table 4-4.
Table 4-4: Summary of different classifiers

Naive Bayes Classifier
  Description: a) It is a probabilistic classifier. b) Strong independence assumption theorem.
    c) The value of a particular feature is independent of the value of any other feature.
  Merits: a) A small amount of training data is required for classification.
  Demerits: a) Interaction between features cannot be learnt because of the independence among
    the features.

K-nearest neighbor
  Description: a) It is a statistical and non-parametric classifier. b) Weights can be assigned
    to the contributions of the neighbors, so that nearer neighbors contribute more to the
    average than distant neighbors. c) A distance metric is calculated for the samples, which
    are classified based on this distance. d) It uses Euclidean distance to calculate distance.
  Merits: a) Implementation is simple. b) Does not require classes to be linearly separable.
  Demerits: a) Very sensitive to noisy or irrelevant data. b) The testing process is more
    time-consuming because it requires calculation of the distance to all known instances.

Support Vector Machine
  Description: a) It is based on decision planes that define decision boundaries. b) There are
    two stages of its working: 1) off-line process, 2) online process. c) A multi-class support
    vector machine, as a set of binary vector machines, is used for training and classification.
  Merits: a) It is effective in high dimensional spaces. b) In comparison with other
    classification techniques, classification accuracy is high. c) SVM is robust enough, even
    though training samples have some distortion.
  Demerits: a) Training time is very high with large data sets. b) For mapping original data
    into high dimension data, selection of the kernel function and kernel parameters is
    difficult.

Decision Tree
  Description: a) It repetitively divides the working area into small sub parts by identifying
    its attributes. b) Leaves present the class labels and branches present the features that
    lead to those classes.
  Merits: a) Small sized trees can be easily interpreted. b) For many simple data sets accuracy
    is comparable with other classifications.
  Demerits: a) For some datasets it is observed to over fit with noisy classification tasks.

Artificial Neural Network
  Description: a) It is derived from the concept of the human biological neuron system. b) It
    consists of two datasets, one for training and one for testing.
  Merits: a) It is robust and can handle noisy data. b) Well suited to analyze complex numbers.
  Demerits: a) Requires more training time. b) Requires large training samples. c) Requires
    more processing time.
SVM is usually used to recognize an object and label it based on the information obtained
during the feature extraction phase [61][62]. The descriptors of new images are compared with
the descriptors of the images already in the database to categorize them. It is a supervised
machine learning algorithm based on the concept of decision planes, in which linearly separable
classes can be identified using a hyperplane. Although this classifier takes time to train, it
still performs well even if the training sample is limited and has some bias. The algorithm is a
binary classifier but can also be applied to multiple classes. SVMs were extremely popular
around the time they were developed in the 1990s and continue to be a go-to method for a
high-performing algorithm with little tuning [63]. In machine learning, SVMs are a set of
supervised learning models with associated learning algorithms that analyze data for
classification and regression analysis. Supervised learning is possible only if the data is
labeled. An SVM constructs a hyperplane or a set of hyperplanes in a high- or
infinite-dimensional space, which can also be used for other tasks such as outlier detection.
The support vector machine is based on finding the hyperplane that gives the largest minimum
distance to the training samples. It has the following advantages and disadvantages.
The advantages of support vector machines are:
• Effective in high dimensional spaces.
• Still effective where the number of dimensions is larger than the number of samples.
• Memory efficient.
• Versatile.
The disadvantages of support vector machines are:
• If the number of features is much greater than the number of samples, over-fitting must be
avoided.
• SVMs do not directly provide probability estimates.
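The maximum-margin idea behind the linear SVM can be sketched in plain Python with sub-gradient descent on the regularized hinge loss. This is an illustrative training loop, not the MATLAB classifier used in this research: the two-dimensional feature vectors, the learning rate, the regularization strength and the number of epochs are all hypothetical choices, and labels are +1 for infected and -1 for healthy.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(samples, labels, lam=0.01, lr=0.05, epochs=300):
    """Sub-gradient descent on the L2-regularized hinge loss; returns (w, b)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (dot(w, x) + b)
            w = [wi * (1 - lr * lam) for wi in w]   # shrink w (regularization)
            if margin < 1:                          # inside the margin: push away
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Return +1 (infected) or -1 (healthy) for feature vector x."""
    return 1 if dot(w, x) + b >= 0 else -1

# Hypothetical, already normalized two-feature vectors (for example contrast and energy).
healthy = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.15)]
infected = [(0.80, 0.90), (0.90, 0.80), (0.85, 0.85)]
samples = healthy + infected
labels = [-1] * len(healthy) + [1] * len(infected)

w, b = train_linear_svm(samples, labels)
predictions = [predict(w, b, x) for x in samples]
```

For the four classes of this research, one such binary classifier per class (one-versus-rest) would be one possible extension.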
The proposed methodology for detecting crop disease through image processing is as follows.
Step 1: Acquire images of bean crop leaves.
Step 2: Preprocess the image to reduce noise in the leaf image.
Step 3: Segment the image using K-means clustering to separate the affected portion of the
leaf from the unaffected portion.
Step 4: Select the affected region of interest, if there is one, from the segmented image.
Step 5: Extract features by computing the statistical parameters skewness, standard deviation,
homogeneity, contrast, smoothness, correlation, kurtosis, energy, entropy, mean, variance, RMS
and IDM.
Step 6: Use the Support Vector Machine to detect the leaf type (diseased or healthy).
Step 7: Confirm the disease type and assess the percentage of that crop leaf affected by
disease.
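Several of the statistical parameters named in Step 5 can be computed directly from the grey levels of the selected region, as in the Python sketch below. It covers only mean, variance, standard deviation, RMS, skewness, kurtosis and entropy, it assumes population (divide-by-n) statistics, and the sample values are made up for illustration.

```python
import math

def statistical_features(values):
    """First-order statistics of a flat list of grey levels."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n   # population variance
    std = math.sqrt(variance)
    rms = math.sqrt(sum(v * v for v in values) / n)
    if std > 0:
        skewness = sum(((v - mean) / std) ** 3 for v in values) / n
        kurtosis = sum(((v - mean) / std) ** 4 for v in values) / n
    else:
        skewness = kurtosis = 0.0     # constant region: moments are undefined
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"mean": mean, "variance": variance, "std": std, "rms": rms,
            "skewness": skewness, "kurtosis": kurtosis, "entropy": entropy}

# Made-up grey levels of a small segmented region.
region = [2, 4, 4, 4, 5, 5, 7, 9]
stats = statistical_features(region)
print(stats["mean"], stats["std"])  # 5.0 2.0
```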
CHAPTER FIVE
RESULTS AND DISCUSSIONS
5.1. Introduction
This chapter reports the experimental results obtained in testing the effectiveness of our
research. The type of classifier, the data set used and the results attained in the
classification process are discussed. In addition, the discriminative power of color, size and
shape is tested, evaluated and compared across the algorithms used in each processing step.
5.2. Data Set
A total of 100 faba bean crop leaf images were prepared to test the proposed model. The samples
were separated into 4 classes based on their characteristics, so there are 4 outputs, one for
each class. The data were partitioned into bacterial blight, Alternaria leaf spot, halo blight
and healthy. Of the total data set, 20 images are of bacterial blight, 20 of Alternaria leaf
spot, 20 of halo blight and 20 of healthy leaves, and the remaining 20 are set aside for
testing. A black background was used to make the actual leaf easier to identify. The samples of
bean plants were positioned directly under the camera for image acquisition. The SVM classifier
used 80% of the data for training and the remaining 20% for testing.
The main objective of this research is to train the machine to predict the disease. Bean crop
leaf disease is identified by observing different patterns on parts of the crop leaf. The
research design consists of the following sub-activities: image collection, image
preprocessing, image segmentation, feature extraction and classification using SVM. Detection
of faba bean diseases starts with the acquisition of images, followed by preprocessing and
enhancing the image and then segmenting it using the inverse difference method. The extracted
texture features of the bean leaf are then passed to the SVM classifier to identify the disease.
5.3. Testing Techniques on MATLAB
Before testing the proposed classification technique on real images, which are often complex,
we considered it necessary to test the technique on predictable, noise-free synthetic images.
The main purpose of using MATLAB-generated images was to determine quickly whether the outcome
of the test is correct or erroneous, based on inputs with known outputs. The first test was
performed on a MATLAB-generated noise-free color image with three colored areas: red, blue and
green.
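The idea of testing on a synthetic image with a known answer can be sketched in Python as follows. The image size, the band layout and the dominant-channel rule are assumptions made for this illustration; they stand in for the MATLAB-generated test image and the classification technique of this research.

```python
def make_test_image(width=30, height=10):
    """Noise-free RGB image with three equal vertical bands: red, green, blue."""
    band = width // 3
    colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
    return [[colors[min(x // band, 2)] for x in range(width)]
            for _ in range(height)]

def dominant_channel(pixel):
    """Index of the strongest channel: 0 = red, 1 = green, 2 = blue."""
    return max(range(3), key=lambda c: pixel[c])

image = make_test_image()
labels = [[dominant_channel(p) for p in row] for row in image]

# The expected labeling is known in advance, so any mismatch signals an error.
expected_row = [0] * 10 + [1] * 10 + [2] * 10
print(all(row == expected_row for row in labels))  # True
```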
5.4. Implementation
MATLAB was selected to implement the algorithm. MATLAB has an image processing toolbox, which
contains functions for analyzing images, such as reading, enhancement, conversion from one
image type to another, segmentation, labeling and more. The research was implemented with
MATLAB R2015b on an HP personal computer with a Core i7 processor and 8 GB of RAM. The
technique was tested on the different sets of data collected from the sources stated in the
third chapter. First, MATLAB-generated artificial color images that are easy to classify
manually were considered. The technique was then tested on healthy and diseased bean images
taken from different sites and an institution. At each step of image processing, we used
different algorithms and techniques and tested them with respect to detecting the classified
bean leaf diseases. First, we load the RGB images by clicking “LOAD IMAGES” and select the
image that we want to preprocess. Next, we click the “ENHANCE CONTRAST” button to increase the
contrast of the input image. The preprocessed image is then clustered into a fixed number of
clusters, and we select the one cluster that includes the diseased part of the leaf (among
cluster 1, cluster 2 and cluster 3) to see its features. In our case, the clustered image has
13 parameters used to classify it into its type (bacterial blight, Alternaria leaf spot, halo
blight or healthy). On clicking “CLASSIFICATION RESULT”, we obtain the type of disease detected
on the leaf, or the message “healthy”, together with the percentage of the region affected (if
it is affected). Finally, we check the accuracy of the classification of the input images by
clicking the “Accuracy” button.
5.4.1. Stage One: Image acquisition
Sample images of diseased leaves are collected and used to train the system. Both diseased leaf
images and some healthy leaf images are used for training and testing. The images are stored in
their captured or preprocessed format. In this research, we used images available on the internet
of leaves infected by Alternaria leaf spot, halo blight and bacterial blight.
Image Conversion
The system reads each image with the imread function (a = imread(path);) and converts it to
grayscale with the rgb2gray function (b = rgb2gray(a);) so that the different color features of the
images used as training input can be examined. The input images were also separated into their
red, green and blue color channels. The RGB images were then converted to HSI, which makes them
easier to inspect and exposes additional features, as shown in Figure 5-1. The images must also
be resized and have their background color changed.
Figure 5-1: Conversion of RGB to HSI
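As an illustration only, the two color conversions described in this stage can be sketched for a single pixel in Python (this is not the MATLAB code used in this work). The grayscale weights are the commonly documented luminance weights for rgb2gray. The HSI formulas are the standard textbook definitions and are an assumption, since the text does not state which HSI definition was used; the function names are hypothetical.

```python
import math

def rgb_to_gray(r, g, b):
    """Luminance-weighted gray value of one RGB pixel (channels in [0, 1])."""
    return 0.2989 * r + 0.5870 * g + 0.1140 * b

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (channels in [0, 1]) to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        return 0.0, 0.0, 0.0                  # black: hue and saturation undefined, use 0
    saturation = 1.0 - min(r, g, b) / intensity
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        return 0.0, saturation, intensity     # gray pixel: hue undefined, use 0
    # Clamp guards against tiny floating-point overshoot outside [-1, 1].
    theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    hue = theta if b <= g else 360.0 - theta
    return hue, saturation, intensity
```

Under these definitions, pure red, green and blue map to hues of 0, 120 and 240 degrees respectively, each with saturation 1.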
5.4.2. Stage Two: Image Preprocessing
Image preprocessing is essential for real-world data, which are frequently noisy and irregular [64].
In this phase, the image is transformed into another image whose quality is better suited for
analysis. Properties such as boundaries and edges are best viewed in binary (black-and-white)
images; statistical properties related to intensity are observed in grayscale format; and color
information is seen best in RGB, HSI and other color formats of the image. In this system, the
images are resized to 256x256 and thresholded using Otsu's method, which converts the intensity
image to a binary image. The RGB image can also be converted to a grayscale image. The histogram
of the input image is used to compute the mean of the distribution, which is then scaled to a
normalized value between 0 and 1. Figure 5-2 shows the preprocessing result for a leaf detected
with bacterial blight.
Figure 5-2: Preprocessing images
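To make the thresholding step concrete, the following is a minimal Python sketch of Otsu's method on a flat list of gray levels. It is an illustration under stated assumptions, not the MATLAB routine used in this work: the function names are hypothetical, the image is assumed to hold integer gray levels in 0 to 255, and when several levels give the same between-class variance the lowest one is kept.

```python
def otsu_threshold(pixels, levels=256):
    """Return the Otsu threshold as a normalized value in [0, 1]."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_level, best_variance = 0, -1.0
    count_low, sum_low = 0, 0
    for level in range(levels):
        count_low += hist[level]           # pixels at or below this level
        sum_low += level * hist[level]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue                       # one class is empty, no valid split
        mean_low = sum_low / count_low
        mean_high = (sum_all - sum_low) / count_high
        # Between-class variance (up to a constant factor of total ** 2).
        between = count_low * count_high * (mean_low - mean_high) ** 2
        if between > best_variance:
            best_level, best_variance = level, between
    return best_level / (levels - 1)

def binarize(pixels, threshold, levels=256):
    """Map each gray level to 1 if it lies above the normalized threshold, else 0."""
    return [1 if p / (levels - 1) > threshold else 0 for p in pixels]
```

For an image with two well-separated intensity groups, the threshold falls between the groups and binarize assigns the darker group to 0 and the brighter group to 1.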
Histogram equalization is applied to the image because it usually increases the global contrast of
the image being processed, and it is especially useful for images that are too bright or too dark.
Histogram equalization is a traditional approach to image enhancement through contrast
adjustment; the result is shown in Figure 5-4. A histogram is a graph showing the number of
pixels in an image at each intensity level.
Figure 5-3: Conversion of image to R, G and B and histogram of the R, G and B channels
Figure 5-4: Histogram equalization
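The effect of histogram equalization can be sketched in a few lines of Python. This is an illustrative version of the textbook cumulative-histogram mapping, assuming integer gray levels in 0 to 255; the function name is hypothetical and the rounding details may differ from the MATLAB routine used in this work.

```python
def equalize_histogram(pixels, levels=256):
    """Spread the gray levels of a flat pixel list over the full intensity range."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative histogram: number of pixels at or below each level.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)   # cumulative count at the darkest level present
    if total == cdf_min:
        return list(pixels)                   # constant image: nothing to stretch

    # Lookup table mapping each old level to its equalized level.
    lut = [max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1))) for c in cdf]
    return [lut[p] for p in pixels]
```

A low-contrast input such as [50, 50, 60, 70, 80] is stretched so that its darkest level maps to 0 and its brightest to 255, which is the increase in global contrast described above.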
5.4.3. Stage Three: Image segmentation
Here, the given image is partitioned into regions that are similar with respect to their features.
With a clustering approach, a larger data set is grouped into clusters of smaller, similar data
sets. We used the K-means clustering algorithm to segment the given image into three clusters, one
of which contains the diseased part of the leaf. Since all of the colors have to be considered for
segmentation, intensity is set aside for this step and only the color information is taken into
consideration. The bean crop leaf images are segmented with the K-means clustering algorithm as
follows.
1. Divide the given data set into K clusters, assigning the data points randomly.
2. For each data point, compute the distance from the data point to each cluster center using the
Euclidean distance, which is the distance between two pixel points and is given as follows:
Euclidean Distance = √((x1 - x2)² + (y1 - y2)²), where (x1, y1) and (x2, y2) are two pixel points
(or two data points).
3. A data point that is nearest to the cluster to which it already belongs is left as it is.
4. A data point that is not nearest to the cluster to which it belongs is moved to the nearest
cluster.
5. Repeat the above steps for all data points.
6. Once the clusters no longer change, the clustering process stops.
Each cluster has its own structure, which is identified and measured in order to classify the
image into the appropriate type with respect to the disease that affects the leaf.
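The clustering steps listed above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this work: the function names are hypothetical, each data point is assumed to be a pair of color values (for example the a* and b* channels), and the random start picks K of the points as initial centers, which is a common variant of the random initial assignment in step 1.

```python
import math
import random

def euclidean(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster points into k groups; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)        # random start (assumed variant of step 1)
    labels = None
    for _ in range(max_iter):
        # Steps 2 to 4: move every point to its nearest cluster center.
        new_labels = [min(range(k), key=lambda j: euclidean(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break                          # step 6: clusters are constant, stop
        labels = new_labels
        # Recompute each center as the mean of its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers
```

With K = 3, as in this work, the returned labels split the pixels into cluster 1, 2 and 3, from which the cluster containing the diseased region is then selected by hand.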
Figure 5-5: Conversion of RGB to L*a*b* color
Figure 5-6: K-means clustering
5.4.4. Stage Four: Feature extraction
The features of the input images must then be extracted. Instead of using the full set of pixels,
we choose only those that are necessary and sufficient to describe the whole segment. The
segmented image is first selected manually. The affected area of the image is found by
calculating the area of the connected components. First, the connected components with
6-neighborhood pixels are found. Then the basic region properties of the input binary image are
computed. Only the area is of interest here, and from it the affected area is obtained. The
percentage of area covered in this segment indicates the quality of the result. The histogram of
an entity or image provides information about how frequently each value occurs in the whole of
the data or image. It is an important tool for frequency analysis. The co-occurrence matrix takes
this analysis to the next level: the intensities of two pixels occurring together are recorded in
the matrix, which makes co-occurrence a powerful tool for analysis. From the gray-level
co-occurrence matrix (GLCM), features such as contrast, correlation, energy and homogeneity are
extracted. The features standard deviation (SD), mean, entropy, RMS, variance, smoothness,
kurtosis, skewness, IDM, contrast, correlation, energy and homogeneity were calculated and used
as input to classify the images as healthy or diseased bean plants according to their values.
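As a hedged illustration of the co-occurrence idea, the Python sketch below builds a normalized GLCM for horizontally adjacent pixels and derives contrast, correlation, energy and homogeneity using their usual definitions. The function names, the choice of offset and the small number of gray levels are assumptions made for the example; they are not taken from the MATLAB code used in this work.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs separated by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Contrast, correlation, energy and homogeneity of a normalized GLCM."""
    idx = range(len(p))
    cells = [(i, j, p[i][j]) for i in idx for j in idx]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    if var_i > 0 and var_j > 0:
        correlation = sum((i - mu_i) * (j - mu_j) * v
                          for i, j, v in cells) / math.sqrt(var_i * var_j)
    else:
        correlation = float("nan")         # undefined for a constant image
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "correlation": correlation,
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }
```

For a perfectly uniform region the contrast is 0 and both energy and homogeneity are 1; textured or diseased regions move away from these values, which is what makes the features useful for classification.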
The training data are processed according to the following procedure:
1. Start with images whose classes are known.
2. Calculate the feature set for each of them and label it.
3. Take the next image as input and calculate its features as a new input.
4. Extend the binary SVM to a multiclass SVM procedure.
5. Train the SVM using the kernel function of choice. The output contains the SVM structure and
information such as the support vectors and the bias value.
6. Determine the class of the input image.
7. Depending on the resulting class, assign the label to the next image and add its feature set to
the database.
8. Repeat steps 3 to 7 for all the images that are to be used as the database.
9. The testing procedure consists of steps 3 to 6 of the training procedure. The resulting class is
the class of the input image.
10. To find the accuracy of the system (the SVM, in this case), random sets of inputs are chosen
from the database for training and testing.
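Step 10 of the procedure above can be illustrated with a small Python sketch that randomly splits a labeled database into training and testing sets and scores the predictions. The function names are hypothetical and the 70/30 split ratio is an assumption, since the proportion used in this work is not stated; the classifier itself is left out.

```python
import random

def split_train_test(samples, train_fraction=0.7, seed=0):
    """Randomly partition samples into (training set, testing set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)      # fixed seed keeps the split repeatable
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def accuracy(true_labels, predicted_labels):
    """Percentage of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)
```

In use, the classifier would be trained on the first returned set, asked to predict the labels of the second, and accuracy would compare those predictions with the known labels.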
GLCM texture features are extracted from the segmented image. These features form a feature
vector that, together with the training labels, serves as the input for training the
classification model.
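Alongside the GLCM features, several of the thirteen features named in this section are first-order statistics of the pixel intensities. The Python sketch below shows one way to compute them for a flat list of gray levels. It is illustrative only: the function name is hypothetical, population (divide by N) moments are an assumed convention, and smoothness and IDM are omitted because their exact definitions are not given in the text.

```python
import math

def first_order_features(pixels, levels=256):
    """Mean, SD, variance, RMS, skewness, kurtosis and entropy of gray levels."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n    # population variance (assumed)
    sd = math.sqrt(variance)
    rms = math.sqrt(sum(p * p for p in pixels) / n)
    if sd > 0:
        skewness = sum((p - mean) ** 3 for p in pixels) / (n * sd ** 3)
        kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * sd ** 4)
    else:
        skewness = kurtosis = 0.0                          # constant image
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Shannon entropy (in bits) of the gray-level distribution.
    entropy = -sum((h / n) * math.log2(h / n) for h in hist if h > 0)
    return {"mean": mean, "sd": sd, "variance": variance, "rms": rms,
            "skewness": skewness, "kurtosis": kurtosis, "entropy": entropy}
```

The resulting values, concatenated with the GLCM features, would give one feature vector per image for the classifier.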
5.4.5. Stage Five: Classification
The SVM classifier uses a hyperplane as the decision boundary between two classes. SVMs are
important in pattern recognition problems such as texture classification. An SVM maps nonlinear
input data into a high-dimensional space in which the data become linearly separable, which
provides good classification. The SVM maximizes the margin between the different classes, and
different kernels can be used to separate the classes. It is basically a binary classifier that
determines the hyperplane dividing two classes, with the margin between the hyperplane and the
two classes maximized. The support vectors are the samples nearest to the margin, and they are the
ones selected to determine the hyperplane. Multiclass classification is also possible using either
the one-to-one or the one-to-many approach; the class with the highest output function is taken as
the target class. The system classifies the disease type of the bean crop by displaying the type of
disease and calculating the size of the region affected by the identified disease. Finally, the
system allows the accuracy of the SVM classification to be calculated.
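The one-to-many rule described above, in which the class with the highest output function wins, can be sketched in Python for linear decision functions of the form f(x) = w·x + b. The function names are hypothetical, and the weights in the usage example are made-up numbers chosen only to show the rule; they are not trained values from this work.

```python
def decision_value(weights, bias, features):
    """Output of one binary linear SVM: w . x + b."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def predict_one_vs_rest(classifiers, features):
    """Return the class whose binary classifier gives the highest output.

    classifiers maps each class name to a (weights, bias) pair.
    """
    return max(classifiers,
               key=lambda name: decision_value(*classifiers[name], features))

# Hypothetical two-feature example with illustrative (untrained) hyperplanes.
example_classifiers = {
    "healthy": ([1.0, 0.0], 0.0),
    "halo blight": ([0.0, 1.0], 0.0),
    "bacterial blight": ([-1.0, -1.0], 0.0),
}
```

For the feature vector [2, 1] the three outputs are 2, 1 and -3, so the sample is assigned to the "healthy" class under these illustrative weights.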
Figure 5-7: Bean crop disease detection GUI
Using the SVM classifier, the system determines the status of bean crop leaves based on the
parameters calculated during feature extraction. The results fall into four classes: bacterial
blight, Alternaria leaf spot (Alternaria alternata), halo blight and healthy leaves. The
classifier selected its inputs from the data sets of each diseased class and of the healthy
leaves. Of the total of 100 images, about 96 were correctly classified into their classes
(bacterial blight, Alternaria leaf spot, halo blight and healthy). The classification process has
an average accuracy of 96.77% with an error rate of 3.23%. From the total images we partitioned
the data into sets of 20 images for each of the four classes and correctly detected 19, 18, 19 and
18 images of Alternaria leaf spot, bacterial blight, halo blight and healthy leaves respectively.
The average detection rate of the diseases is 92.5%.
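The per-class arithmetic behind these figures can be checked with a short Python sketch. It uses only the counts quoted in the paragraph above (19, 18, 19 and 18 correct detections) and assumes 20 test images per class; the function and variable names are hypothetical.

```python
def per_class_rates(correct_counts, images_per_class):
    """Map each class to its (accuracy %, error rate %) pair."""
    rates = {}
    for name, correct in correct_counts.items():
        acc = 100.0 * correct / images_per_class
        rates[name] = (acc, 100.0 - acc)
    return rates

# Correct detections per class as quoted in the text, in the order given there.
correct_counts = {
    "Alternaria leaf spot": 19,
    "Bacterial blight": 18,
    "Halo blight": 19,
    "Healthy": 18,
}
rates = per_class_rates(correct_counts, images_per_class=20)
average_accuracy = sum(acc for acc, _ in rates.values()) / len(rates)
```

A class with 19 of 20 images detected has an accuracy of 95% and an error rate of 5%, and averaging the four per-class accuracies gives the 92.5% detection rate stated above.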
Table 5-1: Accuracy value for each disease detection (%)
No.   Types of leaf                         Accuracy (%)   Error rate (%)
1.    Detected with Alternaria leaf spot    95             5
2.    Detected with halo blight             90             10
3.    Detected with bacterial blight        95             5
4.    Healthy                               90             10
[Bar chart "Performance Evaluation": accuracy (%) and error rate (%) by bean crop status
(Alternaria leaf spot, bacterial blight, halo blight, healthy)]
Figure 5-8: Accuracy and error rate detection of diseases
The general procedure of the bean crop leaf disease detection and classification system is as
follows:
1. Read the input image.
2. Resize the image from step 1.
3. Enhance the contrast of the resized image.
4. Apply K-means clustering.
5. Segment the image into three clusters (cluster 1, 2 and 3).
6. Select the disease-affected cluster from step 5.
7. Filter the image using a median filter.
8. Extract features from the image using the gray-level co-occurrence matrix (GLCM).
9. Compute skewness, standard deviation, homogeneity, contrast, smoothness, correlation,
kurtosis, energy, entropy, mean, variance, RMS and IDM.
10. Classify the disease type using the support vector machine.
11. Compute the accuracy.
12. Display the calculated accuracy.
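Most steps of this procedure are illustrated in the earlier stages; the median filter of step 7 can be sketched in Python as follows. This is a minimal illustration with a hypothetical function name. The 3x3 window size and the border handling (the window simply shrinks at the edges) are assumptions and may differ from the MATLAB filter used in this work.

```python
from statistics import median_low

def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighborhood."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the neighbors that fall inside the image bounds.
            window = [image[rr][cc]
                      for rr in range(max(0, r - half), min(rows, r + half + 1))
                      for cc in range(max(0, c - half), min(cols, c + half + 1))]
            out[r][c] = median_low(window)   # always an actual pixel value
    return out
```

An isolated bright pixel in an otherwise uniform region is replaced by the surrounding value, which is why the median filter is suited to removing impulse noise before feature extraction.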
CHAPTER SIX
CONCLUSION AND RECOMMENDATION
6.1. Conclusion
Agriculture is the foundation of Ethiopia's economy, accounting for half of gross domestic
product (GDP), 83.9% of exports and 80% of total employment. Ethiopia's agriculture is plagued by
disease, periodic drought, soil degradation caused by overgrazing, deforestation, high levels of
taxation and poor infrastructure (which makes it difficult and expensive to get goods to market).
Despite these obstacles, agriculture is the country's most promising resource. Crops are what
allow the country to build up its economy through production, and among the crops, bean is one
that is rich in protein. Production is overwhelmingly of a subsistence nature, and a large part of
commodity exports is provided by the small agricultural cash-crop sector. Principal crops include
coffee, pulses (e.g., beans), oilseeds, cereals, potatoes, sugarcane and vegetables. However, the
crops are affected by diseases that reduce production and make it unhealthy. We selected the bean
crop in order to contribute to increasing the country's production by detecting its diseases with
image processing and the machine learning algorithm called SVM.
Biological pest control has the great advantage of ensuring the safety of workers and protecting
the environment, and processing images also reduces cost while increasing quality. To apply a
biological pest control mechanism, the disease has to be identified at an early stage. There is
therefore no doubt about the value of developing an automatic system that identifies bean
diseases at an early stage. Accordingly, to identify different bean diseases, we chose digital
image processing, a recent research area in computer science. Digital image processing is the
processing of digital images by means of a digital computer. Every digital image processing
application follows some fundamental steps, such as image acquisition, preprocessing,
segmentation, feature extraction and classification. For this research, we used digital image
processing to develop a method for the automatic identification of bean crop diseases in two
phases: a training phase and a test phase.
In the first phase, bean crop leaf images are captured. The images are then preprocessed to
remove noise, lighting effects and other artifacts. After preprocessing, the images are segmented
using Otsu's method to identify the region of interest, and useful features are extracted. For
this research we selected the texture feature of the median-filtered leaf image and extracted it
using Gabor feature extraction. We extracted the texture feature of the image and represented it
using thirteen different statistical data representation techniques. Finally, those thirteen
texture features are used to create the knowledge base that is used for training. In the testing
phase, bean crop leaf images that are different from the images used in the training phase are
captured. Then, as in the first phase, the images are preprocessed and segmented, and useful
texture features are extracted from them using the aforementioned technique. To test the
classification accuracy of the system, an independent data set was used. The data set contains
texture features of normal and diseased bean crop leaf images extracted using GLCM. The
experimental results show that texture features are effective for classifying a bean disease into
its class. In general, bean crop diseases can be identified automatically using an image
processing technique. Using the test data, the three disease classes (bacterial blight, halo
blight and Alternaria leaf spot) and the healthy class were identified with accuracies of 95%,
90%, 95% and 90% respectively, and the overall performance of the classifier is 96.77%.
6.2. Recommendation
In Ethiopia, no research has been conducted on identifying diseases of the bean crop in support
of the agricultural sector. Hence, this research work may encourage other researchers to work in
this area. Image analysis for the identification of bean crop diseases can be investigated
further. The work can also be studied in more depth using different structures of the bean crop
image.
The following recommendations are made for further research and improvement.
✓ In this research we have built a system that identifies the type of disease that attacks the
bean crop. However, the system does not estimate the severity of the disease it identifies.
Therefore, automatically estimating the severity of the identified disease is one research
direction.
✓ After the crop disease is identified, it is good to recommend an appropriate treatment.
Automatically recommending the appropriate handling technique for the identified disease is
therefore another research direction.
✓ Extend the database to cover more bean crop diseases by using a larger number of images for
training the classifier.
✓ Identify bean crop diseases with machine learning algorithms other than SVM.
✓ Improve on the disease detection and classification accuracy that we have achieved with the
classifier.
References
[1] N. Sneha, S. Thota and R. M. C, "A Comparative Study on Agricultural Crop Disease
Detection System," IJTSRD, vol. 2, 2018.
[2] FAO, "Statistics of dry bean," 2014.
[3] M. Blair, L. Gonzales, P. Kiman and L. Butare, "Genetic diversity, inter-gene pool
introgression and nutritional quality of common beans (Phaseolus vulgaris L.) from central
Africa," pp. 237-248, 2010.
[4] A. Cortes, F. Monserrate, J. Ramírez-Villegas, S. Madriñán and M. Blair, "Drought
tolerance in wild plant populations: The case of common beans (Phaseolus vulgaris L.),"
2013.
[5] T. Belete and K. Bastas, "Common Bacterial Blight (Xanthomonas axonopodis pv. phaseoli)
of Beans with Special Focus on Ethiopian Condition," Journal of Plant Pathology &
Microbiology, 2017.
[6] P. Devaraj, P. A. Megha and P. V. B, "Early detection of leaf diseases in Beans crop using
Image Processing and Mobile Computing techniques," Advances in Computational
Sciences and Technology, vol. 10, 2017.
[7] B. Solomon, M. Firew, K. Gemechu and A. Birhanu, "Genetic Progress for Yield and Yield
Components and Reaction to Bean Anthracnose (Colletotrichum lindemuthianum) of
Large-Seeded Food Type Common Bean (Phaseolus vulgaris) Varieties," East
African Journal of Science, vol. 1, pp. 15-26, 2019.
[8] M. M. T. Kajumula, "Evaluation of common bean (Phaseolus vulgaris L.) genotypes for
adaptation to low phosphorus," ISRN Agronomy, 2012.
[9] D. L. J. B. Rodríguez, "Major constraints and trends for common bean production and
commercialization: Establishing priorities for future research," Agron Colomb, pp. 423-431, 2014.
[10] W. Mekuria and M. Ashenafi, "Evaluation of Faba Bean (Vicia faba L.) Varieties for
Chocolate Spot (Botrytis fabae L.) Disease Resistance at Bale Zone, Southeastern Ethiopia,"
Agricultural Research and Technology, 2018.
[11] E. A, A. Sa’ed and Anwar, "A Novel Approach to Classify and Detect Bean Diseases based
on Image Processing," Computer Engineering Department, Kuwait University, Kuwait,
IEEE, 2018.
[12] A. Z and R. H, "Detecting diseases in Chilli Plants Using K-Means Segmented Support
Vector Machine," Third International Conference on Imaging, Signal Processing and
Communication, 2019.
[13] M. A. Devaraj and V. P, "Early detection of leaf diseases in Beans crop using Image
Processing and Mobile Computing techniques," Advances in Computational Sciences and
Technology, vol. 10, 2017.
[14] Y.-y. L, Z. Shi-yu and S. Jia-hui, "Detection of Ginseng Leaf Cicatrices Base on Kmeans
Clustering Algorithm," 10th International Congress on Image and Signal Processing, Bio
Medical Engineering and Informatics, 2017.
[15] S. A, G. N and Parul, "Detection and classification of plant leaf diseases using image
processing techniques: A review," International Journal of Recent Advances in Engineering
& Technology (IJRAET), vol. 2, no. 3, pp. 1-7, 2014.
[16] K. J, C. R, S. T and K. R, "A review paper on plant disease detection using image
processing and neural network approach," Int. Journal of Engineering Sciences & Research
Technology (IJESRT), pp. 758-763, 2016.
[17] B. Sanjay, N. D and K. P, "Agricultural plant leaf disease detection using image
processing," International Journal of Advanced Research in Electrical, Electronics and
Instrumentation Engineering, vol. 2, no. 1, pp. 599-602, 2013.
[18] V. Sujeet and D. Tarun, "A novel approach for the detection of plant diseases," IJCSMC,
vol. 5, no. 7, pp. 44-54, 2016.
[19] D. Sachin, A. Khirade and P. B, "Plant disease detection using image processing,"
International Conference on Computing Communication Control and Automation, pp. 768-771, 2015.
[20] D. Mrunmayee and A. Ingole, "Diagnosis of Pomegranate Plant Diseases using Neural
Network," IEEE, 2015.
[21] Washington State University, "Common Bacterial Blight and Halo Blight, Two
Bacterial Diseases of Phytosanitary Significance for Bean Crops in Washington State,"
Washington State University Extension Fact Sheet, FS038E.
[22] G. Saradhambal, R. Dhivya and S. R. Latha, "PLANT DISEASE DETECTION AND ITS
SOLUTION USING IMAGE CLASSIFICATION," International Journal of Pure and
Applied Mathematics, vol. 119, pp. 879-884, 2018.
[23] K. W, "Bean Diseases," Cooperative Extension Service University of Kentucky College of
Agriculture, Food and Environment.
[24] T. Belete and K. Bastas, "Common Bacterial Blight (Xanthomonas axonopodis pv.
phaseoli) of Beans with Special Focus on Ethiopian Condition," Journal of Plant Pathology
& Microbiology, 2017.
[25] D. Sharath, S. Akhilesh, K. Arun, M. Rohan and C. Prathap, "Image based Plant Disease
Detection in Pomegranate Plant for Bacterial Blight," International Conference on
Communication and Signal Processing, April 4-6, 2019.
[26] P. H, J. E and H. K. S, "Crops Disease Diagnosing Using Image-Based Deep Learning
Mechanism," International Conference on Computing and Network Communications
(CoCoNet), 2018.
[27] O. Min and N. Chi Htun, "Plant Leaf Disease Detection and Classification," International
Journal of Research and Engineering, vol. 5(9), pp. 516-523, 2018.
[28] S. L, A. Adline, A. L, A. N and K. G, "Disease detection in crops using remote sensing
image," IEEE Technological Innovations in ICT for Agriculture and Rural Development,
IEEE, 2017.
[29] S. H and S. K, "Tomato plant disease classification in digital images using classification
tree," 2016 International Conference on Communication and Signal Processing (ICCSP),
2016.
[30] K. Rajneet, "A Brief Review on Plant Disease Detection using in Image Processing,"
IJCSMC, vol. 6, pp. 101-106, 2017.
[31] D. C, A. L and Y. Sandeep, "Overview of Image Processing," International Journal for
Research in Applied Science & Engineering Technology, vol. 2, 2014.
[32] B. Pranjali and A. Anjali, "SVM Classifier Based Grape Leaf Disease Detection,"
Conference on Advances in Signal Processing (CASP) Cummins College of Engineering for
Women, IEEE, 2016.
[33] M. R, Prakash, P. G, G. Saraswathy, K. Ramalakshmi, M. H and K. T, "Detection of Leaf
Diseases and Classification using Digital Image Processing," International Conference on
Innovations in Information, Embedded and Communication Systems (ICIIECS), IEEE, 2017.
[34] P. Jitesh, B. Harshadkumar and K. Vipul, "A Survey on Detection and Classification of Rice
Plant Diseases," IEEE, 2016.
[35] D. M. Sharath, Akhilesh, A. K. S, M. Rohan and P. C, "Image based Plant Disease
Detection in Pomegranate Plant for Bacterial Blight," International Conference on
Communication and Signal Processing, IEEE, 2019.
[36] R. Indumathi, S. V. Thejuswini and R. Swarnareka, "LEAF DISEASE DETECTION AND
FERTILIZER SUGGESTION," Proceeding of International Conference on System
Computation Automation and Networking, IEEE, 2019.
[37] S. D, K. A and P. B, "Plant Disease Detection Using Image Processing," International
Conference on Computing Communication Control and Automation, IEEE, pp. 768-771,
2015.
[38] M. C. Ghulam and G. Vikrant, "Advance in Image Processing for Detection of Plant
Diseases," International Journal of Advanced Research in Computer Science and Software
Engineering, vol. 5, pp. 1090-1093, 2015.
[39] S. Megha, C. R. Niveditha, N. SowmyaShree and K. Vidhya, "Image Processing System for
Plant Disease Identification by Using FCM-Clustering Technique," International Journal of
Advance Research, Ideas and Innovations in Technology, vol. 3, pp. 445-449, 2017.
[40] W. Zhaobin, Z. Xu, S. Xiaoguang, W. Hao, Z. Ying, L. Jianpeg and M. Yide, "Petiole
detection algorithm based on leaf image," IEEE 28th Canadian Conference on Electrical and
Computer Science Engineering, Halifax, Canada, pp. 1430-1434, 2015.
[41] G. K and R. K, "Plant Disease Detection Techniques: A Review," International Conference
on Automation, Computational and Technology Management (ICACTM) Amity University,
IEEE, 2019.
[42] N. Trimi and K. Sushma, "Detection of Plant Disease Using Threshold, K-Mean Cluster
and ANN Algorithm," 2nd International Conference for Convergence in Technology
(I2CT), IEEE, 2017.
[43] T. Getahun, "Automatic Flower Disease Identification Using Image Processing," Thesis
submitted to the School of Graduate Studies of Addis Ababa University in partial
fulfillment for the degree of Master of Science in Computer Science, 2015.
[44] K. K and M. Chetan, "Analysis of Diseases in Fruits using Image Processing
Techniques," International Conference on Trends in Electronics and Informatics (ICEI),
IEEE, 2017.
[45] R. Namrata and R. V, "Diseases Detection of Cotton Leaf Spot using Image Processing and
SVM Classifier," Proceedings of the Second International Conference on Intelligent
Computing and Control Systems (ICICCS), IEEE, 2018.
[46] A. H. B. A and R. Z., "Detecting diseases in Chilli Plants Using K-Means Segmented
Support Vector Machine," 3rd International Conference on Imaging, Signal Processing and
Communication, IEEE, 2019.
[47] S. Gharge and P. Singh, "Image Processing for Soybean Disease Classification and Severity
Estimation," Emerging Research in Computing, Information, Communication and
Applications, IEEE, pp. 493-500, 2016.
[48] J. Singh and H. Kaur, "A Review on: Various Techniques of Plant Leaf Disease Detection,"
Proceedings of the Second International Conference on Inventive Systems and
Control, IEEE, vol. 6, pp. 232-238, 2018.
[49] Khirade, D. Sachin and B. P. A, "Plant Disease Detection Using Image Processing,"
Computing Communication Control and Automation (ICCUBEA), International
Conference, IEEE, 2015.
[50] B. Pranjali and A. Anjali, "SVM Classifier Based Grape Leaf Disease Detection,"
Conference on Advances in Signal Processing (CASP) Cummins College of Engineering for
Women, IEEE, 2016.
[51] M. Bhange and H. Hingoliwala, "Smart Farming: Pomegranate Disease Detection Using
Image Processing," Second International Symposium on Computer Vision and the Internet,
vol. 58, pp. 280-288, 2015.
[52] N. Sneha, S. Thota and R. C, "A Comparative Study on Agricultural Crop Disease
Detection Systems," IJTSRD, vol. 2, 2018.
[53] A. D, K. R, S. J and I. K, "Identification of Plant Disease using Image Processing
Technique," International Conference on Communication and Signal Processing, IEEE,
2019.
[54] P. S. P. Patil and R. S. Zambre, "Classification of Cotton Leaf Spot Disease Using
Support Vector Machine," Int. Journal of Engineering Research and Applications (IJERA),
vol. 4, no. 5(1), pp. 92-97, 2014.
[55] G. Nikita, J. Dhruv and A. Sinha, "Prediction Model for Automated Leaf Disease," IEEE,
2018.
[56] T. N. T. and S. Kamlu, "Detection of plant disease using threshold, k-mean cluster and ANN
algorithm," 2nd International Conference for Convergence in Technology (I2CT),
December 2017.
[57] D. L. R. Saradhambal.G, "PLANT DISEASE DETECTION AND ITS SOLUTION USING
IMAGE CLASSIFICATION," International Journal of Pure and Applied Mathematics,
vol. 119, pp. 879-884, 2018.
[58] U. K. J. D. G. T. U. Solanki, "A survey on detection of disease and fruit grading,"
International Journal of Innovative and Emerging, vol. 2, no. 2, 2015.
[59] G. Nilay and P. Atul, "A Survey on Diseases Detection and Classification of Agriculture
Products using Image Processing and Machine Learning," International Journal of
Computer Applications, vol. 180, January 2018.
[60] T. V, P. a and P. P, "Cucumber disease detection using artificial neural network,"
International Conference on Inventive Computation Technologies (ICICT), January 2017.
[61] V. T. and P. P. Pooja Pawar, "Cucumber disease detection using artificial neural network,"
International Conference on Inventive Computation Technologies (ICICT), January 2017.
[62] A. A. S. and V. Pawar, "Machine learning regression technique for cotton leaf disease
detection and controlling using IoT," International Conference of Electronics,
Communication and Aerospace Technology (ICECA), April 2017.
[63] D. Aarju and N. Sumit, "Wheat Leaf Disease Detection Using Machine Learning Method- A
Review," International Journal of Computer Science and Mobile Computing, vol. 7, no. 5,
pp. 124-129, May 2018.
[64] T. Suman and T. Dhruvakumar, "Classification of paddy leaf diseases using shape and color
features," JEEE, vol. 07, no. 01, pp. 339-250, 2015.
[65] T. Getahun, "Automatic Flower Disease Identification Using Image Processing," Thesis
submitted to the School of Graduate Studies of Addis Ababa University in partial
fulfillment for the degree of Master of Science in Computer Science, 2015.